Where are we now?
Where are we at with the code?
it works?
Where are we at with the project?
The training algorithm is in Python
The environment is in C
The config is in .ini
Custom fbm generation is in C; this is probably the biggest pain point
Experiments
- fbm worked with forced contrarian reward: https://wandb.ai/leonardotoffalini-e-tv-s-lor-nd-university/pufferlib/runs/s2zmd7tl?nw=nwuserleonardotoffalini
- fbm worked with delta riskless reward: https://wandb.ai/leonardotoffalini-e-tv-s-lor-nd-university/pufferlib/runs/2gpqre2a?nw=nwuserleonardotoffalini
- there is something funky with liquidation which I don't understand:
DOING
- generate a static dataset of fbm trajectories with Python, then sample from this dataset in the C env
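A minimal sketch of the dataset step above, under several assumptions that are not in these notes: the paths are built with the Cholesky method on the fractional Gaussian noise covariance, the file layout is a hypothetical header of two int32 values (number of paths, points per path) followed by row-major float32 values, and the function names are made up for illustration. It uses only the standard library so it runs anywhere; a NumPy version would likely be faster for long paths.

```python
import array
import math
import random
import struct


def fgn_cholesky(n_steps, hurst):
    """Lower-triangular Cholesky factor of the fGn covariance matrix."""
    def gamma(k):
        # Autocovariance of unit-variance fractional Gaussian noise at lag k.
        k = abs(k)
        h2 = 2.0 * hurst
        return 0.5 * ((k + 1) ** h2 - 2.0 * k ** h2 + abs(k - 1) ** h2)

    chol = [[0.0] * n_steps for _ in range(n_steps)]
    for i in range(n_steps):
        for j in range(i + 1):
            s = gamma(i - j) - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    return chol


def sample_fbm(chol, hurst, rng, dt=1.0):
    """One fbm path with len(chol) + 1 points, starting at 0."""
    n = len(chol)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = dt ** hurst  # self-similarity scaling of the increments
    path = [0.0]
    for i in range(n):
        incr = scale * sum(chol[i][k] * z[k] for k in range(i + 1))
        path.append(path[-1] + incr)
    return path


def write_dataset(filename, n_paths, n_steps, hurst, seed=0):
    """Write the hypothetical layout: int32 n_paths, int32 path_len, float32 rows."""
    rng = random.Random(seed)
    chol = fgn_cholesky(n_steps, hurst)  # factor once, reuse for every path
    with open(filename, "wb") as f:
        f.write(struct.pack("=ii", n_paths, n_steps + 1))
        for _ in range(n_paths):
            array.array("f", sample_fbm(chol, hurst, rng)).tofile(f)
```

For example, `write_dataset("fbm_paths.bin", 1000, 256, 0.7)` (file name and sizes are placeholders) would produce a file that the C env could load once with `fread`, reading the two-integer header first and then picking a random row on each reset. Values are written in native byte order, so this assumes the file is generated and read on the same machine type.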
TODO
- increase action space from -1, 0, 1 to
- instead of the delta riskless reward, use a terminal riskless reward and see if it helps with liquidation
- instead of returning the pre-terminal reward in the last step, try returning the terminal reward
- if something useful works, run a sweep on it
- rerun the experiments that gave positive results with a higher total_steps, because it was unclear whether they had flattened out