• log constant optimal performance values to compare against
  • sweep on new scaled FFT
  • sample T from (128, 1024)
  • plot the analytical optimal terminal riskless for the other values as well, as is already done for 256
  • sort the values into buckets based on episode T, then log terminal riskless per bucket
  • to log the Sharpe ratio, probably have to log the std as well, then compute it as terminal riskless / std
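
Rough sketch of the bucketing and ratio logging from the last two items. Everything named here is an assumption for illustration: the bucket edges, the `(T, terminal_riskless)` episode tuples, and the helper names `bucket_label` and `summarize` are hypothetical, and T is assumed to lie between 128 and 1024. The ratio is the expression from the note (mean terminal riskless / std), which may differ from a textbook Sharpe ratio that subtracts a risk-free return.

```python
import statistics
from collections import defaultdict

# Hypothetical bucket edges covering the assumed range of T (128 to 1024).
BUCKET_EDGES = [128, 256, 512, 1024]


def bucket_label(T, edges=BUCKET_EDGES):
    """Return a label such as '128-256' for the bucket containing T.

    Buckets are half-open [lo, hi), except the last one, which also
    includes its upper edge so that T == 1024 is not dropped.
    """
    for lo, hi in zip(edges[:-1], edges[1:]):
        if lo <= T < hi:
            return f"{lo}-{hi}"
    if T == edges[-1]:
        return f"{edges[-2]}-{edges[-1]}"
    raise ValueError(f"T={T} is outside the bucket range")


def summarize(episodes):
    """Group (T, terminal_riskless) pairs by bucket and compute stats.

    Returns {label: {"n", "mean", "std", "ratio"}} where ratio is
    mean / std, or NaN when the std is zero (e.g. a single episode).
    """
    buckets = defaultdict(list)
    for T, terminal in episodes:
        buckets[bucket_label(T)].append(terminal)

    summary = {}
    for label, values in buckets.items():
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)  # population std; 0.0 for one value
        ratio = mean / std if std > 0 else float("nan")
        summary[label] = {"n": len(values), "mean": mean, "std": std, "ratio": ratio}
    return summary


if __name__ == "__main__":
    # Made-up episodes purely to exercise the code, not real results.
    demo = [(128, 1.0), (200, 3.0), (256, 2.0), (1024, 4.0)]
    for label, stats in sorted(summarize(demo).items()):
        print(label, stats)
```

The per-bucket dictionaries could be passed to whatever logger the sweep already uses; logging `std` alongside `mean` keeps the ratio recomputable afterwards.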