Recap

  • plot: added action histogram plotting during training
  • infra: added stochastic-Hurst FBM environment
  • experiment: tested FBM environment
  • infra: fixed the LSTM training code; it now works properly

TODO

  • add forced linear liquidation
  • pretrain model on then continue training on
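
Sketch for the forced linear liquidation item above. This is only one possible reading: from some step onward the agent's action is overridden and the remaining inventory is sold in equal slices until the horizon. The names forced_trade, forced_from and horizon are hypothetical, not taken from the existing env code.

```python
from typing import Optional


def forced_trade(inventory: float, step: int, horizon: int,
                 forced_from: int) -> Optional[float]:
    """Return the trade to force at `step`, or None if the agent acts freely.

    Assumption: a negative trade means selling, and the episode has
    `horizon` steps indexed 0 .. horizon - 1.
    """
    if step < forced_from:
        return None  # agent still chooses its own action
    steps_left = horizon - step
    if steps_left <= 0:
        raise ValueError("step must be smaller than horizon")
    # Equal slices over the remaining steps give a linear inventory path.
    return -inventory / steps_left


def simulate(inventory: float, horizon: int, forced_from: int) -> list:
    """Apply only the forced trades and return the inventory after each step."""
    path = []
    for step in range(horizon):
        trade = forced_trade(inventory, step, horizon, forced_from)
        if trade is not None:
            inventory += trade
        path.append(inventory)
    return path
```

With inventory 100, horizon 10 and forced_from 6, the last four steps each sell 25 and the episode ends flat. Whether the override should replace or be added to the policy action is left open here.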

Remaining from last week

  • make asymptotic plot without market bound and put the trained policy eval on it
  • make asymptotic plot for a single Hurst value
  • show H = 0.1, T = 1024 side by side with and without the clipped action space, i.e. [-10, 10] vs. [-1_000, 1_000]
  • if the training works for small T then we can try to incrementally extend the time horizon
  • when running parallel envs, try shuffling the starting points to cover more of the episode length
  • train model on with big num_steps_eval number (5_000)
  • fix training for
    • Liquidation test env continue
    • make a grid search with three axes: clip_coef, learning_rate, and H (0.1, 0.7)
    • make an env that is a child of FBMEnv and has a price process of a linear function with random slope
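
Sketch for the grid search item above. Only the H values (0.1, 0.7) come from the notes; the clip_coef and learning_rate values are placeholders, and train_fn stands in for whatever training entry point ends up being used (hypothetical signature).

```python
from itertools import product

# Placeholder axis values, to be replaced with the real candidates.
CLIP_COEFS = (0.1, 0.2, 0.3)
LEARNING_RATES = (1e-4, 3e-4, 1e-3)
# The two Hurst values named in the notes.
HURSTS = (0.1, 0.7)


def build_grid(clip_coefs=CLIP_COEFS, learning_rates=LEARNING_RATES,
               hursts=HURSTS):
    """Return one config dict per point of the three-axis grid."""
    return [
        {"clip_coef": c, "learning_rate": lr, "hurst": h}
        for c, lr, h in product(clip_coefs, learning_rates, hursts)
    ]


def run_grid(train_fn, grid):
    """Call train_fn(**config) for every config and pair it with its result."""
    return [(config, train_fn(**config)) for config in grid]


if __name__ == "__main__":
    # Dummy train function so the sketch runs without the real training code.
    def dummy_train(clip_coef, learning_rate, hurst):
        return 0.0

    for config, score in run_grid(dummy_train, build_grid()):
        print(config, score)
```

With the placeholder axes this gives 3 x 3 x 2 = 18 runs; each run is independent, so they could be dispatched in parallel.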