r_{t+1} + \gamma \displaystyle\max_a Q(S_{t+1}, a) - Q(S_t, a_t)
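
The expression above has the form of the temporal-difference error used in Q-learning: the reward, plus the discounted best action value in the next state, minus the current estimate. As a minimal sketch, it can be evaluated on a small tabular Q function. The function name `td_error`, the list-of-lists table `q`, and all numeric values below are illustrative assumptions, not taken from the source.

```python
def td_error(q, state, action, reward, next_state, gamma):
    """Return reward + gamma * max_a q[next_state][a] - q[state][action].

    q is assumed to be a table indexed as q[state][action].
    """
    # Greedy value of the next state: max over all actions a of Q(S_{t+1}, a).
    best_next = max(q[next_state])
    # Bootstrapped target minus the current estimate Q(S_t, a_t).
    return reward + gamma * best_next - q[state][action]


# Hypothetical table with two states and two actions.
q = [
    [0.0, 1.0],  # action values in state 0
    [2.0, 4.0],  # action values in state 1
]

# Transition: state 0, action 1, reward 1.0, next state 1, discount 0.5.
delta = td_error(q, state=0, action=1, reward=1.0, next_state=1, gamma=0.5)
print(delta)  # 1.0 + 0.5 * 4.0 - 1.0 = 2.0
```

A positive value means the bootstrapped target exceeds the current estimate of Q(S_t, a_t); a negative value means the estimate is too high.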