Recap
- plot: action histogram (bars)
- plot: action histogram (polygons)
- plot: market bound with the optimal strategy
- plot: split the train/eval plot along the y axis
- experiment: train with the same `T / num_steps` ratio for the same number of iterations
- experiment: train with the full price history in the observation space
- experiment: train with a sliding window of the price history (see the first sketch below this list)
- experiment: change the reward type from delta wealth to delta bankroll (see the second sketch below this list)
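A minimal sketch of the sliding-window observation, assuming the env keeps the full price path in a NumPy array and `t` is the current step index (both names are assumptions, not from the notes):

```python
import numpy as np

def window_observation(prices: np.ndarray, t: int, window: int = 32) -> np.ndarray:
    """Return the last `window` prices up to and including step t,
    left-padded with the initial price so the observation shape
    stays fixed at the start of the episode."""
    start = max(0, t - window + 1)
    obs = prices[start : t + 1]
    if len(obs) < window:
        pad = np.full(window - len(obs), prices[0], dtype=prices.dtype)
        obs = np.concatenate([pad, obs])
    return obs.astype(np.float32)
```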
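And a sketch of the reward change, under the assumption that wealth is mark-to-market (`cash + position * price`) while bankroll means the cash account alone; both definitions are guesses at the intended semantics, not taken from the notes:

```python
def delta_wealth(cash, position, price, prev_cash, prev_position, prev_price):
    # change in mark-to-market wealth over one step
    return (cash + position * price) - (prev_cash + prev_position * prev_price)

def delta_bankroll(cash, prev_cash):
    # change in the cash account only; unrealized PnL is ignored
    return cash - prev_cash
```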
TODO
- insert action histogram plotting at the end of the training pipeline, after the train/eval plot
- make the asymptotic plot without the market bound and put the trained policy eval on it
- make an asymptotic plot for a single Hurst exponent
- make an FBM env with a stochastic Hurst exponent (see the sketch after this list)
- fix `train_lstm.py`
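A minimal sketch of the stochastic-Hurst env, assuming an existing gymnasium-style `FBMEnv` that takes `hurst` in its constructor, stores it on `self.hurst`, and regenerates the price path in `reset()` (the import path and these attribute names are assumptions):

```python
import numpy as np
from fbm_env import FBMEnv  # hypothetical import path

class StochasticHurstFBMEnv(FBMEnv):
    """FBM env that draws a fresh Hurst exponent for every episode."""

    def __init__(self, hurst_low: float = 0.1, hurst_high: float = 0.9, **kwargs):
        self.hurst_low = hurst_low
        self.hurst_high = hurst_high
        super().__init__(hurst=0.5, **kwargs)  # placeholder H, resampled on reset

    def reset(self, *, seed=None, options=None):
        # resample H before the parent regenerates the price path
        self.hurst = np.random.uniform(self.hurst_low, self.hurst_high)
        return super().reset(seed=seed, options=options)
```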
Remaining from last week
- show a side-by-side of H=0.1, T=1024 with and without a clipped action space, i.e. `[-10, 10]` vs `[-1_000, 1_000]`
- if the training works for small `T`, then we can try to incrementally extend the time horizon
- when running parallel envs, try to shuffle the starting points to cover more of the episode length
- train the model with a big `num_steps_eval` value (5_000)
- fix training for
- continue the liquidation test env
- make a grid search with three axes: `clip_coef`, `learning_rate`, and `H = 0.1, 0.7` (see the first sketch after this list)
- make an env that is a child of FBMEnv whose price process is a linear function with a random slope (see the second sketch after this list)
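A minimal sketch of the three-axis sweep, assuming training can be launched through a `train(...)` function taking these keyword arguments (the entry point and the `clip_coef`/`learning_rate` candidate values are assumptions; the two Hurst values are from the notes):

```python
from itertools import product

from train_lstm import train  # hypothetical entry point

clip_coefs = [0.1, 0.2, 0.3]         # assumed candidates
learning_rates = [1e-4, 3e-4, 1e-3]  # assumed candidates
hursts = [0.1, 0.7]                  # from the notes

for clip_coef, learning_rate, hurst in product(clip_coefs, learning_rates, hursts):
    run_name = f"clip{clip_coef}_lr{learning_rate}_H{hurst}"
    train(clip_coef=clip_coef, learning_rate=learning_rate,
          hurst=hurst, run_name=run_name)
```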
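And a sketch of the linear-price child env, again assuming `FBMEnv` stores the generated path in `self.prices` and builds it during `reset()` (names are assumptions):

```python
import numpy as np
from fbm_env import FBMEnv  # hypothetical import path

class LinearPriceEnv(FBMEnv):
    """FBMEnv child whose price path is a straight line with a slope
    drawn uniformly at random each episode; useful as a sanity check."""

    def __init__(self, slope_low: float = -1.0, slope_high: float = 1.0, **kwargs):
        self.slope_low = slope_low
        self.slope_high = slope_high
        super().__init__(**kwargs)

    def reset(self, *, seed=None, options=None):
        obs, info = super().reset(seed=seed, options=options)
        slope = np.random.uniform(self.slope_low, self.slope_high)
        steps = np.arange(len(self.prices))
        # overwrite the FBM path with p_t = p_0 + slope * t
        self.prices = self.prices[0] + slope * steps
        # if the first observation is derived from the path, it may
        # need recomputing here as well
        return obs, info
```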