Last week

  • sampled from a uniform random distribution, like Hurst
  • prioritized experience replay
  • instead of 100, ..., 550 for do
  • each model inferred on all other ‘s
  • finish rewrite in C
  • sweep hyperparams in C
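
A minimal sketch of the prioritized experience replay item above, assuming the standard proportional variant (sampling probability proportional to priority^alpha, with importance-sampling weights). The class name, method names, and the default alpha, beta, and eps values are illustrative assumptions, not taken from the actual implementation.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (hypothetical sketch, list-based, O(n) sampling)."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha          # how strongly priorities skew sampling (0 = uniform)
        self.data = []
        self.priorities = []
        self.pos = 0                # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions get the current max priority so they are likely to be replayed soon.
        max_p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(max_p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # P(i) = p_i^alpha / sum_k p_k^alpha
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = random.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights w_i = (N * P(i))^(-beta), normalized by the batch max.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors, eps=1e-6):
        # Priority is |TD error| plus a small constant so no transition has zero probability.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + eps
```

The list-based sampling is linear in the buffer size; a sum tree would be the usual replacement if this shows up in profiling, and the same structure should carry over to the C rewrite.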

Notes

  • checkers
  • asteroids
  • hardmaze
  • double cart pole (wip)

TODO

  • try to evaluate the optimal strategy on the C env
  • write an interfacing Python program to get the fBm realization from C into Python
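
One possible shape for the interfacing program, as a sketch only: it assumes the C side dumps the realization as raw native doubles (for example with `fwrite(x, sizeof(double), n, fp)`), which is an assumption about the C code rather than something it is known to do. The function name `read_fbm_realization` is hypothetical.

```python
import array
import os


def read_fbm_realization(path):
    """Read a raw dump of native C doubles into a list of Python floats.

    Assumes the file contains nothing but doubles written on the same
    machine (same endianness and sizeof(double)) with no header.
    """
    values = array.array("d")                       # "d" = C double
    n_values = os.path.getsize(path) // values.itemsize
    with open(path, "rb") as f:
        values.fromfile(f, n_values)
    return values.tolist()
```

Going through a file keeps the two sides decoupled; loading the C code as a shared library with `ctypes` would avoid the intermediate file but requires building the C env as a shared object first.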