Last week
- sampled from random uniform, like Hurst
- prioritized experience replay
- instead of
100, ..., 550
for do - each model inferred on all other ‘s
- finish rewrite in C
- sweep hyperparams in C
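
For reference on the prioritized experience replay item, here is a minimal sketch of the proportional variant in Python. It is not the code used here: the class name `PrioritizedReplay`, the defaults `alpha=0.6` and `beta=0.4`, and the plain-list storage (O(N) sampling, no sum-tree) are all assumptions made for illustration.

```python
import random


class PrioritizedReplay:
    """Proportional prioritized replay, minimal version (hypothetical helper)."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha          # 0 = uniform sampling, 1 = fully prioritized
        self.items = []
        self.priorities = []
        self.pos = 0                # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions get the current max priority so they are likely to be replayed.
        p = max(self.priorities, default=1.0)
        if len(self.items) < self.capacity:
            self.items.append(transition)
            self.priorities.append(p)
        else:
            self.items[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # P(i) = p_i^alpha / sum_k p_k^alpha
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idx = random.choices(range(len(self.items)), weights=probs, k=batch_size)
        # Importance-sampling weights (N * P(i))^-beta, normalized by the batch maximum.
        n = len(self.items)
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idx, [self.items[i] for i in idx], weights

    def update(self, idx, td_errors, eps=1e-6):
        # Priority is |TD error| plus a small constant so no transition starves.
        for i, err in zip(idx, td_errors):
            self.priorities[i] = abs(err) + eps
```

After each training step, `update` would be called with the sampled indices and their new TD errors.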
Notes
- checkers
- asteroids
- hardmaze
- double cart pole (wip)
TODO
- try to evaluate the optimal strategy on C env
- write interfacing Python program to get fBm realization from C to Python
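
One possible shape for the C-to-Python interface, as a sketch only: it assumes the C program dumps the realization as raw native-endian doubles (e.g. with `fwrite(x, sizeof(double), n, fp)`). The function name `load_fbm_realization` and the file-based handoff are hypothetical; a ctypes or pipe-based route would work as well.

```python
import array
import os


def load_fbm_realization(path):
    """Read a raw dump of native-endian C doubles into a list of floats.

    Assumes the file contains nothing but the samples (no header).
    """
    itemsize = array.array("d").itemsize
    size = os.path.getsize(path)
    if size % itemsize != 0:
        raise ValueError(f"{path}: size {size} is not a multiple of {itemsize} bytes")
    values = array.array("d")
    with open(path, "rb") as f:
        values.fromfile(f, size // itemsize)
    return values.tolist()
```

This only works when the C and Python sides run on the same machine architecture, since no byte-order or size conversion is done.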