Recap
- plot: added action hist plotting during training
- infra: added stochastic Hurst FBM environment
- experiment: tested FBM environment
- infra: fixed up LSTM training code, now it works properly
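For reference, a minimal sketch of how a path for the stochastic Hurst FBM environment could be sampled. It assumes "stochastic Hurst" means one H is drawn per episode, uniformly from a range; the range (0.1, 0.7) and all function names are illustrative only, not the actual env code. It uses the standard fBm covariance 0.5 * (s^2H + t^2H - |t - s|^2H) with a Cholesky factor, written in pure stdlib so it runs anywhere.

```python
import math
import random


def fbm_covariance(n_steps, hurst, dt=1.0):
    """Covariance matrix of fBm at times t_i = (i + 1) * dt, i = 0..n_steps-1."""
    h2 = 2.0 * hurst
    times = [(i + 1) * dt for i in range(n_steps)]
    return [
        [0.5 * (s ** h2 + t ** h2 - abs(t - s) ** h2) for t in times]
        for s in times
    ]


def cholesky(matrix):
    """Lower-triangular L with L @ L^T == matrix (needs a positive definite input)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - acc)
            else:
                lower[i][j] = (matrix[i][j] - acc) / lower[j][j]
    return lower


def sample_fbm_path(n_steps, hurst_range=(0.1, 0.7), rng=None):
    """Draw H uniformly from hurst_range (assumption), then sample one path from 0."""
    rng = rng or random.Random()
    hurst = rng.uniform(*hurst_range)
    lower = cholesky(fbm_covariance(n_steps, hurst))
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]
    # Correlate the i.i.d. normals with the Cholesky factor; prepend B_H(0) = 0.
    path = [0.0] + [
        sum(lower[i][k] * noise[k] for k in range(i + 1)) for i in range(n_steps)
    ]
    return hurst, path
```

The Cholesky route is O(n^3), so it is only practical for short horizons; for T1024 and longer a faster exact method such as Davies-Harte would probably be needed.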
TODO
- add forced linear liquidation
- pretrain model on then continue training on
Remaining from last week
- make asymptotic plot without market bound and put the trained policy eval on it
- make asymptotic plot for a single Hurst
- show side by side of H0.1 T1024 with and without clipped action space, that is [-10, 10] and [-1_000, 1_000]
- if the training works for small T then we can try to incrementally extend the time horizon
- when running parallel envs try to shuffle the starting point to cover more of the episode length
- train model on with with big num_steps_eval number (5_000)
- fix training for
- Liquidation test env continue
- make a grid search where there are three axes: clip_coef, learning_rate, H = 0.1, 0.7
- make an env that is a child of FBMEnv and has a price process of a linear function with random slope
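A possible way to enumerate the three-axis grid search from the list above. Only the axis names and H = 0.1, 0.7 come from these notes; the clip_coef and learning_rate values, and the names iter_grid, run_grid and train_fn, are placeholders for illustration.

```python
import itertools

# Placeholder candidate values: only H = 0.1, 0.7 is taken from the notes.
GRID = {
    "clip_coef": [0.1, 0.2, 0.3],
    "learning_rate": [1e-4, 3e-4, 1e-3],
    "H": [0.1, 0.7],
}


def iter_grid(grid):
    """Yield one config dict per point of the Cartesian product of the axes."""
    names = list(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def run_grid(train_fn, grid=GRID):
    """Call train_fn(**config) for every grid point, collect (config, result)."""
    return [(config, train_fn(**config)) for config in iter_grid(grid)]
```

With these placeholder values the grid has 3 * 3 * 2 = 18 runs; train_fn stands in for whatever function launches one training run and returns its eval metric.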