inverse model - AGI

inverse model (Japanese: 逆モデル)

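Estimates the action taken from the current state $s_t$ and the next state $s_{t+1}$: $\hat{a}_t = g(s_t, s_{t+1})$

$g$ is learned by backpropagating the action prediction error, i.e. the mismatch between the estimate $\hat{a}_t$ and the action $a_t$ that was actually taken.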
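A minimal sketch of this training loop, under assumptions that are not in the note: a hypothetical 1-D toy environment with dynamics $s_{t+1} = s_t + a_t$, a linear $g$ with parameters `w1`, `w2`, `b`, and a squared action prediction error. For a linear $g$, backpropagation reduces to the hand-written gradients below; a neural-network $g$ would propagate the same error through its layers.

```python
import random


def make_transitions(n, seed=0):
    """Collect (s_t, s_next, a_t) tuples from an assumed toy environment."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        s_t = rng.uniform(-1.0, 1.0)
        a_t = rng.uniform(-1.0, 1.0)
        s_next = s_t + a_t  # assumed dynamics, for illustration only
        data.append((s_t, s_next, a_t))
    return data


def g(params, s_t, s_next):
    """Inverse model: estimate the action from the state pair."""
    w1, w2, b = params
    return w1 * s_t + w2 * s_next + b


def train(data, lr=0.5, epochs=500):
    """Full-batch gradient descent on 0.5 * mean((a_hat - a_t)^2)."""
    params = [0.0, 0.0, 0.0]
    n = len(data)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for s_t, s_next, a_t in data:
            err = g(params, s_t, s_next) - a_t  # action prediction error
            grad[0] += err * s_t / n      # d(loss)/d(w1)
            grad[1] += err * s_next / n   # d(loss)/d(w2)
            grad[2] += err / n            # d(loss)/d(b)
        params = [p - lr * g_i for p, g_i in zip(params, grad)]
    return params


params = train(make_transitions(200))
a_hat = g(params, 0.2, 0.7)  # the true action for this pair is 0.7 - 0.2
```

In this toy setting the learned parameters should approach `w1 = -1`, `w2 = 1`, `b = 0`, so that $g$ recovers the state difference as the action estimate.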