I downloaded the model from https://huggingface.co/microsoft/Multilingual-MiniLM-L12-H384/tree/main and am using it.
Transformers version: 4.11.3
I wrote the following code:
import wandb
import torch
import transformers as tr

wandb.login()
%env WANDB_LOG_MODEL=true

# assuming a CUDA-capable GPU; `device` was not defined in the original snippet
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = tr.BertForSequenceClassification.from_pretrained("/home/pc/minilm_model", num_labels=2)
model.to(device)
print("hello")
training_args = tr.TrainingArguments(
    report_to='wandb',
    output_dir='/home/pc/proj/results2',   # output directory
    num_train_epochs=10,                   # total number of training epochs
    per_device_train_batch_size=16,        # batch size per device during training
    per_device_eval_batch_size=32,         # batch size for evaluation
    learning_rate=2e-5,
    warmup_steps=1000,                     # number of warmup steps for learning rate scheduler
    weight_decay=0.01,                     # strength of weight decay
    logging_dir='./logs',                  # directory for storing logs
    logging_steps=1000,
    evaluation_strategy="epoch",
    save_strategy="no"
)
print("hello")
trainer = tr.Trainer(
    model=model,                   # the instantiated Transformers model to be trained
    args=training_args,            # training arguments, defined above
    train_dataset=train_data,      # training dataset
    eval_dataset=val_data,         # evaluation dataset
    compute_metrics=compute_metrics
)

trainer.train()   # start training; the run stalls after the log output shown below
After running this, the model just gets stuck at this point:
***** Running training *****
Num examples = 12981
Num Epochs = 20
Instantaneous batch size per device = 16
Total train batch size (w. parallel, distributed & accumulation) = 32
Gradient Accumulation steps = 1
Total optimization steps = 8120
Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"
What could be a possible solution?
Posted on 2022-01-04 12:12:34
I don't know why this stops training.
If you post on the HF forums, someone there may be able to help you: https://discuss.huggingface.co
I work at W&B, so if you think this is related to using W&B, or if you have any other questions, I can help you here or on our forum: http://community.wandb.ai
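As a quick way to rule the W&B integration in or out, you could rerun the same setup with W&B logging turned off, either via the environment variable mentioned in the training log above or by setting report_to to "none" in TrainingArguments. A minimal sketch, reusing the same paths and placeholder variables from the question:

import os
import transformers as tr

# Disable the W&B integration before the Trainer is created
# (as suggested in the log output above). If training then proceeds,
# the hang is likely related to W&B; if not, the cause is elsewhere.
os.environ["WANDB_DISABLED"] = "true"

training_args = tr.TrainingArguments(
    report_to="none",                       # alternatively, turn off all logging integrations here
    output_dir='/home/pc/proj/results2',
    num_train_epochs=10,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    evaluation_strategy="epoch",
    save_strategy="no"
)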
https://stackoverflow.com/questions/70573652