Reinforcement Learning, Fast and Slow

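Deep reinforcement learning has been criticized for requiring far more training data than human learning. The causes are incremental parameter updates and weak inductive biases. Recently, however, both are being addressed: the former by methods that use near-nonparametric episodic memory, and the latter by meta-learning that RNNs implement implicitly https://t.co/2QqVzZQtwt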

— Daisuke Okanohara (@hillbig) May 8, 2019

The role of "slow learning" (weight-based incremental learning) in "fast learning" (sample-efficient learning)

In Meta-RL: establish inductive biases that can guide inference and thus support rapid adaptation to new tasks

In Episodic RL: Episodic RL inherently depends on judgments concerning resemblances between situations or states. Slow learning shapes the way that states are internally represented and thus puts in place a set of inductive biases concerning which states are most closely related.
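As a rough illustration of the episodic idea, here is a minimal sketch of a nonparametric value memory. It is not the paper's implementation: the class name `EpisodicMemory`, the inverse-distance weighting, and the parameter `k` are assumptions made for this example. The embeddings passed in stand for the output of a slowly learned representation; in this sketch they are just raw coordinates.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class EpisodicMemory:
    """Hypothetical episodic store: (state embedding, action) -> observed return."""

    def __init__(self, k=3):
        self.k = k          # number of neighbours used per estimate (assumed)
        self.entries = {}   # action -> list of (embedding, return)

    def write(self, embedding, action, ret):
        # Fast learning: a single experience is stored as-is, no gradient step.
        self.entries.setdefault(action, []).append((tuple(embedding), float(ret)))

    def estimate(self, embedding, action):
        # Value estimate = inverse-distance-weighted mean over the k nearest memories.
        stored = self.entries.get(action, [])
        if not stored:
            return 0.0
        nearest = sorted(stored, key=lambda e: distance(e[0], embedding))[: self.k]
        weights = [1.0 / (distance(e, embedding) + 1e-3) for e, _ in nearest]
        total = sum(w * r for w, (_, r) in zip(weights, nearest))
        return total / sum(weights)

    def act(self, embedding, actions):
        # Greedy choice over the estimated values.
        return max(actions, key=lambda a: self.estimate(embedding, a))


memory = EpisodicMemory(k=1)
memory.write((0.0, 0.0), "left", 1.0)
memory.write((0.0, 0.0), "right", 0.0)
memory.write((1.0, 1.0), "left", 0.0)
memory.write((1.0, 1.0), "right", 1.0)
choice_near_origin = memory.act((0.1, 0.0), ["left", "right"])
choice_near_corner = memory.act((0.9, 1.0), ["left", "right"])
```

Which memories count as "nearest" depends entirely on the embedding space, which is where slow learning enters: changing the representation changes which past situations are treated as similar.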

Episodic Meta-RL

Evolution can be seen as an even slower form of learning

gradually sculpting architectural biases and algorithmic biases that allow faster lifetime learning.

Meta-RL can also play a role analogous to that of evolution in biological organisms

It selects algorithms that exploit the regularities of the surrounding environment, rather than a truly general-purpose learning algorithm

Reinforcement Learning, Fast and Slow

Matthew M. Botvinick, Sam Ritter, Jane X. Wang, Zeb Kurth-Nelson, Charles Blundell, Demis Hassabis

DeepMind

Trends in Cognitive Sciences

REVIEW | VOLUME 23, ISSUE 5, P408-422, MAY 01, 2019

Published: April 16, 2019

DOI: https://doi.org/10.1016/j.tics.2019.02.006