Running run_glue.py, the transformers text classification example, on the imdb dataset
The following example fine-tunes BERT on the imdb dataset hosted on our hub:
imdb dataset on the hub
Data format
code:imdb_row.json
{
"label": 0,
"text": "Goodbye world2\n"
}
text: a string feature.
label: a classification label, with possible values including neg (0), pos (1).
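A minimal sketch of reading one row in the format above. The mapping neg (0), pos (1) comes from the dataset description; the names `ID2LABEL`, `LABEL2ID`, and `describe_row` are hypothetical helpers made up for this note and are not part of transformers or datasets.

```python
# Label ids as described in the dataset card: neg (0), pos (1).
ID2LABEL = {0: "neg", 1: "pos"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def describe_row(row: dict) -> str:
    """Return a short summary of one imdb-style row (hypothetical helper)."""
    label_name = ID2LABEL[row["label"]]
    # Rows may end with a newline, as in the sample above, so strip it.
    text = row["text"].strip()
    return f"[{label_name}] {text}"


# Same row as in imdb_row.json.
row = {"label": 0, "text": "Goodbye world2\n"}
summary = describe_row(row)
print(summary)
```

Keeping the mapping in one dictionary and deriving the inverse from it avoids the two directions drifting apart.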
code:train.ipynb
!cd transformers/examples/pytorch/text-classification/ && \
python run_glue.py \
--model_name_or_path bert-base-cased \
--dataset_name imdb \
--do_train \
--do_predict \
--max_seq_length 128 \
--per_device_train_batch_size 32 \
--learning_rate 2e-5 \
--num_train_epochs 3 \
--output_dir /tmp/imdb/
This looks like it will take several hours.
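A rough sketch for estimating the run length from the flags above. Assumptions not stated in this note: the imdb train split has 25,000 examples, training runs on a single device, and there is no gradient accumulation. The function names are hypothetical, and `seconds_per_step` is a placeholder that should be replaced with a value measured from the actual progress bar.

```python
import math


def total_training_steps(num_examples: int, batch_size: int, num_epochs: int) -> int:
    """Optimizer steps, assuming one device and no gradient accumulation."""
    # The last, partially filled batch of each epoch still counts as a step.
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * num_epochs


def estimate_hours(total_steps: int, seconds_per_step: float) -> float:
    """Convert a step count and a measured per-step time into hours."""
    return total_steps * seconds_per_step / 3600


# Values taken from the command: batch size 32, 3 epochs.
# 25_000 is the assumed size of the imdb train split.
steps = total_training_steps(num_examples=25_000, batch_size=32, num_epochs=3)
print(steps)

# Placeholder per-step time; measure it on your own hardware.
print(estimate_hours(steps, seconds_per_step=3.0))
```

This covers the training steps only; the extra time for `--do_predict` on the test split is not included.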