Exporting the trained RoBERTa model

Save the model to the Colab filesystem with trainer.save_model

Mount Google Drive and move the artifact files there (cp or mv)

Download from Google Drive (selecting a folder downloads it as a zip file)

unzip foo.zip -d .
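The copy and unzip steps above can also be scripted. The following is a minimal sketch using only the standard library; the helper names `copy_artifacts` and `extract_zip` are made up for illustration. In Colab, Drive is usually mounted first with `from google.colab import drive; drive.mount("/content/drive")`, and the destination path under the mount point is whatever folder you choose.

```python
import shutil
import zipfile
from pathlib import Path


def copy_artifacts(src_dir, dst_dir):
    """Copy a saved model directory, e.g. into a mounted Drive folder."""
    src, dst = Path(src_dir), Path(dst_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"no such directory: {src}")
    # dirs_exist_ok=True (Python 3.8+) allows re-running after a partial copy.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sorted(p.name for p in dst.iterdir())


def extract_zip(zip_path, dest_dir="."):
    """Rough Python equivalent of `unzip foo.zip -d .`."""
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest_dir)
        # Return the archive member names so the caller can see what was unpacked.
        return zf.namelist()
```

For example, `extract_zip("foo.zip")` unpacks the downloaded archive into the current directory, the same as the unzip command above.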

Setting up the local environment

Python 3.9.4

pip install transformers torch

torch==1.11.0

transformers==4.18.0
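To reproduce these exact versions, the pins can be passed to pip directly (`pip install torch==1.11.0 transformers==4.18.0`). The sketch below checks that the installed packages match the versions recorded here. `check_versions` is a hypothetical helper, not part of any library. It ignores a local build suffix such as `+cpu`, which some torch wheels append to the version string.

```python
from importlib import metadata

# Versions recorded in these notes.
PINNED = {"torch": "1.11.0", "transformers": "4.18.0"}


def check_versions(pinned):
    """Return {package: (expected, installed)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            # Drop a local build suffix like "+cpu" before comparing.
            installed = metadata.version(name).split("+")[0]
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means the environment matches the pins; anything printed is a package to reinstall.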

code:load.py

>>> from pprint import pprint

>>> from transformers import pipeline

>>> fill_mask = pipeline("fill-mask", model="KantaiBERT", tokenizer="KantaiBERT")

>>> predictions = fill_mask("Human thinking involves<mask>.")

>>> pprint(predictions)

[{'score': 0.02367042936384678,

'sequence': 'Human thinking involves reason.',

'token': 393,

'token_str': ' reason'},

{'score': 0.015469103120267391,

'sequence': 'Human thinking involves it.',

'token': 306,

'token_str': ' it'},

{'score': 0.012228213250637054,

'sequence': 'Human thinking involves conceptions.',

'token': 605,

'token_str': ' conceptions'},

{'score': 0.011805328540503979,

'sequence': 'Human thinking involves experience.',

'token': 531,

'token_str': ' experience'},

{'score': 0.009144463576376438,

'sequence': 'Human thinking involves them.',

'token': 508,

'token_str': ' them'}]

The local run reproduced the same results as the Colab notebook!

The score values differ from the Colab ones after the first several decimal places (not investigating further)
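One way to make "same results, slightly different decimals" checkable is to compare the two runs with a tolerance. This is a minimal sketch: `predictions_match` is a hypothetical helper, the tolerance of 1e-4 is an arbitrary choice, and since the Colab scores are not recorded in these notes, the `colab_like` list is a made-up stand-in derived from the local values.

```python
import math


def predictions_match(a, b, rel_tol=1e-4):
    """True if both runs rank the same tokens and scores agree within rel_tol."""
    if len(a) != len(b):
        return False
    return all(
        x["token"] == y["token"]
        and math.isclose(x["score"], y["score"], rel_tol=rel_tol)
        for x, y in zip(a, b)
    )


# First two local predictions from the output above.
local = [
    {"score": 0.02367042936384678, "token": 393},
    {"score": 0.015469103120267391, "token": 306},
]
# HYPOTHETICAL stand-in for the Colab run: the local scores nudged by 1e-6
# (relative). These are not the real Colab numbers.
colab_like = [dict(p, score=p["score"] * (1 + 1e-6)) for p in local]

print(predictions_match(local, colab_like))                # True
print(predictions_match(local, colab_like, rel_tol=1e-9))  # False
```

In practice, the list from the Colab notebook would be pasted in place of `colab_like`.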