Recap
- plot: added action histogram plotting during training
- infra: added stochastic-Hurst FBM environment
- experiment: tested FBM environment
- infra: fixed the LSTM training code; it now works properly
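
For reference, a minimal sketch of how a fractional Brownian motion (FBM) path with a randomly chosen Hurst exponent could be sampled. This is not the environment code from the repo: the function name `sample_fbm_path`, the Cholesky method, and the reading of "stochastic Hurst" as "H is redrawn per episode" are all assumptions made for illustration.

```python
import numpy as np


def sample_fbm_path(n_steps, hurst, rng, dt=1.0):
    """Sample an FBM path at times 0, dt, ..., n_steps*dt (hypothetical helper).

    Uses the exact FBM covariance
        Cov(B_s, B_t) = 0.5 * (s^(2H) + t^(2H) - |t - s|^(2H))
    and a Cholesky factorisation, which is O(n^3) and only suitable for short paths.
    """
    t = dt * np.arange(1, n_steps + 1)
    s, u = np.meshgrid(t, t, indexing="ij")
    cov = 0.5 * (s ** (2 * hurst) + u ** (2 * hurst) - np.abs(s - u) ** (2 * hurst))
    chol = np.linalg.cholesky(cov)
    path = chol @ rng.standard_normal(n_steps)
    # FBM starts at zero by definition.
    return np.concatenate(([0.0], path))


rng = np.random.default_rng(0)
# Assumption: the Hurst exponent is drawn once per episode from the two values in these notes.
hurst = rng.choice([0.1, 0.7])
path = sample_fbm_path(64, hurst, rng)
```

The Cholesky approach is exact but slow for long horizons such as T1024 with many parallel envs; a faster generator may be preferable there.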
TODO
- add forced linear liquidation
- pretrain model on then continue training on
Remaining from last week
- make asymptotic plot without market bound and put the trained policy eval on it
- make asymptotic plot for a single Hurst
- show a side-by-side comparison of H0.1 T1024 with and without clipped action space, that is [-10, 10] and [-1_000, 1_000]
- if the training works for small T then we can try to incrementally extend the time horizon
- when running parallel envs try to shuffle the starting point to cover more of the episode length
- train model on with with big num_steps_eval number (5_000)
- fix training for
- continue work on the liquidation test env
- make a grid search over three axes: clip_coef, learning_rate, and H = 0.1, 0.7
- make an env that is a child of FBMEnv and has a price process of a linear function with random slope
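
A possible way to enumerate the grid search item above, as a sketch only. The H values come from these notes; the clip_coef and learning_rate values are hypothetical placeholders, and how each config is passed to the training script is left open.

```python
from itertools import product

# Axes of the grid. Only the H values are taken from the notes;
# the clip_coef and learning_rate values are hypothetical placeholders.
grid = {
    "clip_coef": [0.1, 0.2],
    "learning_rate": [1e-4, 3e-4],
    "H": [0.1, 0.7],
}

# Cartesian product of all axes, one dict per run.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]

for cfg in configs:
    print(cfg)
```

With two values per axis this gives 8 runs; the count grows multiplicatively with every value added to an axis.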