Where are we now?

Where are we at with the code

it works?

Where are we at with the project

The training algorithm is in Python
The environment is in C
The config is in .ini
Custom fBm generation is in C; this is probably the biggest pain point

Experiments

DOING

  • generate a static dataset of fBm trajectories with Python, then sample from this dataset in the C env
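A minimal sketch of the dataset step, assuming exact Cholesky sampling of the fBm covariance and a made-up binary layout (two little-endian uint32 values for the path count and path length, then row-major float64 samples) that a C reader could load with plain fread. All function names, the file layout and the parameter values are hypothetical, not the project's actual code. It uses only the standard library; for longer paths NumPy and an FFT-based method would likely be a better fit.

```python
import math
import random
import struct


def fbm_cholesky_factor(n_steps, hurst, horizon=1.0):
    """Lower-triangular Cholesky factor of the fBm covariance on a uniform grid.

    Cov(B_s, B_t) = 0.5 * (s^(2H) + t^(2H) - |t - s|^(2H)) for s, t > 0.
    """
    times = [horizon * (i + 1) / n_steps for i in range(n_steps)]
    two_h = 2.0 * hurst
    cov = [[0.5 * (s ** two_h + t ** two_h - abs(t - s) ** two_h) for s in times]
           for t in times]
    chol = [[0.0] * n_steps for _ in range(n_steps)]
    for i in range(n_steps):
        for j in range(i + 1):
            acc = cov[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(acc) if i == j else acc / chol[j][j]
    return chol


def sample_fbm_path(chol, rng):
    """One path on the grid, with the starting value B_0 = 0 prepended."""
    z = [rng.gauss(0.0, 1.0) for _ in chol]
    path = [0.0]
    for i, row in enumerate(chol):
        path.append(sum(row[k] * z[k] for k in range(i + 1)))
    return path


def build_dataset(n_paths, n_steps, hurst, seed):
    """Generate n_paths independent paths of length n_steps + 1."""
    rng = random.Random(seed)
    chol = fbm_cholesky_factor(n_steps, hurst)
    return [sample_fbm_path(chol, rng) for _ in range(n_paths)]


def write_dataset(filename, paths):
    """Write header (n_paths, path_len as uint32) followed by float64 rows."""
    n_paths, path_len = len(paths), len(paths[0])
    with open(filename, "wb") as f:
        f.write(struct.pack("<II", n_paths, path_len))
        for path in paths:
            f.write(struct.pack(f"<{path_len}d", *path))


def read_dataset(filename):
    """Read the file back; mirrors what a C loader would do with fread."""
    with open(filename, "rb") as f:
        n_paths, path_len = struct.unpack("<II", f.read(8))
        return [list(struct.unpack(f"<{path_len}d", f.read(8 * path_len)))
                for _ in range(n_paths)]
```

With a fixed-size header like this, the C env can seek directly to a randomly chosen row (offset 8 + row * path_len * 8 bytes) instead of loading the whole file, which may be enough to sidestep the custom generator while the experiments run.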

TODO

  • increase action space from -1, 0, 1 to

  • instead of the delta riskless reward, use a terminal riskless reward and see if it helps with liquidation

  • instead of returning the pre-terminal reward in the last step, try returning the terminal reward

  • if something useful works, run a sweep on it

  • rerun the experiments that returned positive results with higher total_steps, because it was unclear whether they had flattened out
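For the delta versus terminal riskless reward item above, a small sketch of one possible reading: "delta" is taken to mean the per-step change in the riskless (cash) account, and "terminal" to mean a single reward at the last step equal to the total change. Both definitions, the function names and the numbers are assumptions for illustration, not the env's actual reward code.

```python
def delta_rewards(riskless):
    """Dense reward: change in the riskless account between consecutive steps."""
    return [b - a for a, b in zip(riskless, riskless[1:])]


def terminal_rewards(riskless):
    """Sparse reward: zero everywhere except the last step, which pays the total change."""
    n_steps = len(riskless) - 1
    return [0.0] * (n_steps - 1) + [riskless[-1] - riskless[0]]


def discounted_return(rewards, gamma):
    """Sum of gamma^t * r_t over the episode."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))


# Hypothetical riskless account values over a 3-step episode.
riskless = [100.0, 98.0, 101.5, 103.0]
dense = delta_rewards(riskless)
sparse = terminal_rewards(riskless)
```

Under these assumptions the two schemes give the same undiscounted return (the deltas telescope), so any difference in learned behaviour would come from credit assignment and discounting rather than from the total reward itself.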