TODO

  • sampled from a uniform random distribution, like Hurst
  • prioritized experience replay
  • instead of 100, ..., 550 for do
  • each model inferred on all the others
  • finish rewrite in C
  • sweep hyperparams in C