Running run_glue.py, the transformers text classification example, on the imdb dataset

The following example fine-tunes BERT on the imdb dataset hosted on our hub:

imdb dataset on the hub

Data format

code:imdb_row.json

 {
   "label": 0,
   "text": "Goodbye world2\n"
 }

text: a string feature.

label: a classification label, with possible values including neg (0), pos (1).
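A minimal sketch of reading one row in this format and turning the integer label back into its name. The `LABEL_NAMES` list and the `describe_row` helper are names made up for this note, not part of any library; the index order (0 = neg, 1 = pos) comes from the feature description above.

```python
import json

# Label names in index order, taken from the feature description:
# neg (0), pos (1). LABEL_NAMES is a name chosen for this sketch.
LABEL_NAMES = ["neg", "pos"]


def describe_row(raw: str) -> str:
    """Parse one JSON row and return a short 'label_name: text' summary."""
    row = json.loads(raw)
    label_name = LABEL_NAMES[row["label"]]
    # Strip the trailing newline so the summary stays on one line.
    return f"{label_name}: {row['text'].strip()}"


# The example row from imdb_row.json.
example = '{"label": 0, "text": "Goodbye world2\\n"}'
print(describe_row(example))
```

The same lookup would apply to any row, as long as the label is stored as an integer index rather than a string.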

code:train.ipynb

 !cd transformers/examples/pytorch/text-classification/ && \
 python run_glue.py \
   --model_name_or_path bert-base-cased \
   --dataset_name imdb \
   --do_train \
   --do_predict \
   --max_seq_length 128 \
   --per_device_train_batch_size 32 \
   --learning_rate 2e-5 \
   --num_train_epochs 3 \
   --output_dir /tmp/imdb/

Looks like this will take several hours.
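A rough back-of-the-envelope check on the running time, as a sketch only. It assumes the imdb train split has 25,000 examples, a single device, and no gradient accumulation; the batch size and epoch count are the flags from the command above. `SECONDS_PER_STEP` is a hypothetical placeholder, not a measurement, and should be replaced with the speed the progress bar actually reports.

```python
import math

TRAIN_EXAMPLES = 25_000  # size of the imdb train split (assumed here)
BATCH_SIZE = 32          # --per_device_train_batch_size
EPOCHS = 3               # --num_train_epochs


def total_steps(n_examples: int, batch_size: int, epochs: int) -> int:
    """Optimizer steps on one device, keeping the last partial batch."""
    steps_per_epoch = math.ceil(n_examples / batch_size)
    return steps_per_epoch * epochs


def estimated_hours(steps: int, seconds_per_step: float) -> float:
    """Convert a step count and a per-step time into hours."""
    return steps * seconds_per_step / 3600


steps = total_steps(TRAIN_EXAMPLES, BATCH_SIZE, EPOCHS)
print(steps)  # 782 steps per epoch * 3 epochs = 2346

# Hypothetical per-step time; replace with the observed value.
SECONDS_PER_STEP = 5.0
print(f"{estimated_hours(steps, SECONDS_PER_STEP):.2f} hours")
```

With more than one device the per-step batch grows and the step count shrinks accordingly, so the figure above is only an upper-end guide for a single-device run.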