Parameters
Simulation Speed
External Force
Exploration Rate
Step Size for Q Update
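
How the exploration rate and the Q-update step size listed above typically enter a learner can be sketched as follows. This is a minimal illustration, not the demo's actual code: it assumes a tabular Q-learning scheme with epsilon-greedy action selection, and the names `choose_action`, `q_update`, and the `discount` parameter are hypothetical (the page does not mention a discount factor).

```python
import random


def choose_action(q_row, exploration_rate, rng=random):
    """Epsilon-greedy choice over the Q-values of one state.

    With probability `exploration_rate` a random action is taken;
    otherwise the action with the highest Q-value is chosen.
    """
    if rng.random() < exploration_rate:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def q_update(q, state, action, reward, next_state, step_size, discount=0.9):
    """One tabular Q-learning update (assumed form); modifies q in place.

    `step_size` controls how far the stored value moves toward the
    new target; `discount` is an assumed parameter, not from the page.
    """
    target = reward + discount * max(q[next_state])
    q[state][action] += step_size * (target - q[state][action])
    return q[state][action]
```

Under these assumptions, a larger step size makes each update overwrite more of the old estimate, and a larger exploration rate makes the controller try non-greedy pushes more often.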
red axis: angle, black axis: not used, blue axis: value

left diagram: expected reward function for pushing to the left

middle diagram: difference between the expected rewards for pushing to the left and for pushing to the right

right diagram: expected reward function for pushing to the right
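
The three diagrams described above could be derived from a learned value table roughly as in the sketch below. It rests on assumptions not stated on the page: the angle is discretized into bins, each bin stores one value for pushing left and one for pushing right, and the middle diagram shows left minus right (the sign convention is a guess). The name `diagram_curves` is hypothetical.

```python
def diagram_curves(q):
    """Split a value table into the data for the three diagrams.

    q: list of [value_left, value_right] rows, one row per angle bin
       (assumed layout).
    Returns (left, diff, right), where diff is left minus right.
    """
    left = [row[0] for row in q]
    right = [row[1] for row in q]
    diff = [l - r for l, r in zip(left, right)]
    return left, diff, right
```

With this convention, a positive value in the middle curve at a given angle would mean that pushing to the left is expected to pay off more there, and a negative value would favor pushing to the right.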