get linear schedule with warmup
https://gyazo.com/f17ba9630fd083a98af77d1eca3a0756
In BERT, i.e. the TensorFlow reference implementation, this lives around create_optimizer
polynomial_decay
The schedule is implemented using this
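As a rough illustration of the decay that polynomial_decay computes, here is a minimal pure-Python sketch. It assumes the non-cycling form of the formula, and the defaults power=1.0 and end_lr=0.0 are assumptions chosen because they reduce it to a straight linear decay to zero. The function below is a hypothetical stand-in, not the TensorFlow API itself.

```python
def polynomial_decay(init_lr, step, decay_steps, end_lr=0.0, power=1.0):
    """Non-cycling polynomial decay from init_lr to end_lr over decay_steps.

    Hypothetical helper for illustration; with power=1.0 the decay is linear.
    """
    # Clamp the step so the rate stays at end_lr once decay_steps is reached.
    step = min(step, decay_steps)
    remaining = 1.0 - step / decay_steps
    return (init_lr - end_lr) * remaining ** power + end_lr


# Print the decayed rate at a few steps for an illustrative base rate.
for step in (0, 250, 500, 1000):
    print(step, polynomial_decay(1e-4, step, decay_steps=1000))
```

Warmup is not part of this formula; it has to be applied separately on top of the decayed rate.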
For PyTorch it is implemented in transformers, like this
get_linear_schedule_with_warmup
code:py
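 from torch.optim.lr_scheduler import LambdaLR
 def get_linear_schedule_with_warmup(optimizer, num_warmup_steps, num_training_steps, last_epoch=-1):
     def lr_lambda(current_step):
         # Linear decay factor: 1 at step 0, 0 at num_training_steps
         learning_rate = max(0.0, 1. - (float(current_step) / float(num_training_steps)))
         # Linear warmup factor: 0 at step 0, 1 from num_warmup_steps on (max(1, ...) avoids division by zero)
         learning_rate *= min(1.0, float(current_step) / float(max(1, num_warmup_steps)))
         return learning_rate
     # LambdaLR multiplies the optimizer's base learning rate by lr_lambda(step)
     return LambdaLR(optimizer, lr_lambda, last_epoch)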
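To see what shape this schedule produces, the multiplier can be evaluated on its own without PyTorch. The sketch below copies the same formula into a standalone function; the name lr_multiplier, the base rate and the step counts are illustrative assumptions, and the max(1, ...) guard against zero warmup steps is an addition of this sketch.

```python
def lr_multiplier(current_step, num_warmup_steps, num_training_steps):
    """Factor applied to the base learning rate at current_step (hypothetical helper)."""
    # Linear decay factor: 1 at step 0, 0 at num_training_steps.
    decay = max(0.0, 1.0 - current_step / num_training_steps)
    # Linear warmup factor: 0 at step 0, capped at 1 after num_warmup_steps.
    warmup = min(1.0, current_step / max(1, num_warmup_steps))
    return decay * warmup


base_lr = 1e-4  # assumed base learning rate, for illustration only
for step in (0, 50, 100, 550, 1000):
    factor = lr_multiplier(step, num_warmup_steps=100, num_training_steps=1000)
    print(step, base_lr * factor)
```

Because the warmup and decay factors are multiplied, the factor at the end of warmup is 0.9 rather than 1.0 with these numbers, so the base learning rate is never fully reached.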