transformers.TrainingArguments
TrainingArguments is the subset of the arguments we use in our example scripts which relate to the training loop itself.
Using HfArgumentParser we can turn this class into argparse arguments that can be specified on the command line.
82 parameters!?
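A minimal instantiation covering the first few arguments below; all values other than output_dir are illustrative:

```python
from transformers import TrainingArguments

# Minimal configuration; values are illustrative, not recommendations.
args = TrainingArguments(
    output_dir="./outputs",        # where checkpoints and predictions are written
    overwrite_output_dir=True,     # reuse an existing output directory
    num_train_epochs=3.0,          # note: a float, not an int
    per_device_train_batch_size=8,
)
```

The same fields can instead be supplied on the command line via HfArgumentParser(TrainingArguments), as the doc intro notes.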
output_dir (str) — The output directory where the model predictions and checkpoints will be written.
overwrite_output_dir (bool, optional, defaults to False) — If True, overwrite the content of the output directory. Use this to continue training if output_dir points to a checkpoint directory.
Specify this when output_dir points to a checkpoint directory.
num_train_epochs(float, optional, defaults to 3.0) — Total number of training epochs to perform (if not an integer, will perform the decimal part percents of the last epoch before stopping training).
It's a float!
The "Total optimization steps" shown in the progress output is the overall step count (steps per epoch × num_train_epochs).
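The note above can be checked with quick arithmetic. This sketch (independent of transformers; single device, no gradient accumulation assumed) mirrors how the displayed total is derived, including fractional epochs:

```python
import math

def total_optimization_steps(num_examples, per_device_batch_size, num_train_epochs):
    """Approximate the 'Total optimization steps' the Trainer reports
    (single device, no gradient accumulation)."""
    steps_per_epoch = math.ceil(num_examples / per_device_batch_size)
    return math.ceil(steps_per_epoch * num_train_epochs)

# 1000 examples, batch size 8 -> 125 steps per epoch
print(total_optimization_steps(1000, 8, 3.0))  # 375
print(total_optimization_steps(1000, 8, 2.5))  # 313: the decimal part is a partial final epoch
```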
per_device_train_batch_size (int, optional, defaults to 8) — The batch size per GPU/TPU core/CPU for training.
save_steps (int, optional, defaults to 500) — Number of update steps between two checkpoint saves if save_strategy="steps".
It worked even with a save_steps value that is never reached within a single epoch.
Checkpoints are saved every save_steps steps across the total optimization steps.
save_strategy (str or IntervalStrategy, optional, defaults to "steps") — The checkpoint save strategy to adopt during training. Possible values are: "no", "epoch", "steps".
save_total_limit (int, optional) — If a value is passed, will limit the total amount of checkpoints. Deletes the older checkpoints in output_dir.
Keeps checkpoints from piling up indefinitely in output_dir.
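A checkpointing setup combining the three save options above might look like this (values illustrative):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./outputs",
    save_strategy="steps",  # save by step count rather than once per epoch
    save_steps=500,         # one checkpoint every 500 optimization steps
    save_total_limit=2,     # keep only the 2 most recent checkpoints in output_dir
)
```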
logging_steps (int, optional, defaults to 500) — Number of update steps between two logs if logging_strategy="steps".
Notes related to EarlyStoppingCallback
load_best_model_at_end (bool, optional, defaults to False) — Whether or not to load the best model found during training at the end of training.
metric_for_best_model (str, optional) — Use in conjunction with load_best_model_at_end to specify the metric to use to compare two different models.
Must be the name of a metric returned by the evaluation with or without the prefix "eval_". Will default to "loss" if unspecified and load_best_model_at_end=True (to use the evaluation loss).
If you set this value, greater_is_better will default to True. Don’t forget to set it to False if your metric is better when lower.
The metric defaults to the evaluation loss; when setting metric_for_best_model to a loss explicitly, greater_is_better=False needs to be specified.
evaluation_strategy (str or IntervalStrategy, optional, defaults to "no") — The evaluation strategy to adopt during training. Possible values are: "no", "epoch", "steps".
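Putting these notes together, a sketch of an early-stopping setup; model, datasets, and patience value are placeholders, and save/eval strategies are kept matching as load_best_model_at_end requires:

```python
from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

args = TrainingArguments(
    output_dir="./outputs",
    evaluation_strategy="steps",   # evaluation must actually run for early stopping
    eval_steps=500,
    save_strategy="steps",
    save_steps=500,                # must line up with the evaluation strategy
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,       # lower loss is better
)

trainer = Trainer(
    model=model,                   # placeholder: your model
    args=args,
    train_dataset=train_ds,        # placeholder datasets
    eval_dataset=eval_ds,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
```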
log_level (str, optional, defaults to passive)
Logger log level to use on the main process. Possible choices are the log levels as strings: ‘debug’, ‘info’, ‘warning’, ‘error’ and ‘critical’, plus a ‘passive’ level which doesn’t set anything and keeps the current log level for the Transformers library (which will be "warning" by default).
There is also log_level_replica (for replicas in distributed training).
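A sketch of raising the main-process verbosity while keeping replicas quiet (values illustrative):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./outputs",
    log_level="info",             # main process: info and above
    log_level_replica="warning",  # replicas in distributed training: warnings only
)
```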
hub_model_id
Can the username be omitted? (Perhaps it is inferred from the token?)
Can be set via set_push_to_hub.
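A sketch of setting the Hub target with set_push_to_hub; the repo name is illustrative, and without a user prefix the namespace tied to your token should apply:

```python
from transformers import TrainingArguments

args = TrainingArguments(output_dir="./outputs")
# set_push_to_hub enables push_to_hub and sets hub_model_id in one call
args.set_push_to_hub("my-finetuned-model")  # illustrative repo name
```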