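I am trying to fine-tune a transformer model for text classification, but I am having trouble training it. I have tried many things, and none of them seem to work. I have also tried the solutions suggested in other questions, without success. I am using the 'microsoft/deberta-v3-base' model for fine-tuning. Here is my code: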
train_dataset = Dataset.from_pandas(df_tr[['text', 'label']]).class_encode_column("label")
val_dataset = Dataset.from_pandas(df_tes[['text', 'label']]).class_encode_column("label")
train_tok_dataset = train_dataset.map(tokenizer_func, batched=True, remove_columns=('text'))
val_tok_dataset = val_dataset.map(tokenizer_func, batched=True, remove_columns=('text'))
from transformers import TFAutoModelForSequenceClassification
model = TFAutoModelForSequenceClassification.from_pretrained(config.model_name, num_labels=3)
transformer_model = TFAutoModelForSequenceClassification.from_pretrained(config.model_name, output_hidden_states=True)
input_ids = tf.keras.Input(shape=(config.max_len, ),dtype='int32')
attention_mask = tf.keras.Input(shape=(config.max_len, ), dtype='int32')
transformer = transformer_model([input_ids, attention_mask])
hidden_states = transformer[1] # get output_hidden_states
#print(hidden_states)
hidden_states_size = 4 # count of the last states
hiddes_states_ind = list(range(-hidden_states_size, 0, 1))
selected_hiddes_states = tf.keras.layers.concatenate(tuple([hidden_states[i] for i in hiddes_states_ind]))
# Now we can use selected_hiddes_states as we want
output = tf.keras.layers.Dense(128, activation='relu')(selected_hiddes_states)
output=tf.keras.layers.Flatten()(output)
output = tf.keras.layers.Dense(3, activation='softmax')(output)
model = tf.keras.models.Model(inputs = [input_ids, attention_mask], outputs = output)
from transformers import create_optimizer
import tensorflow as tf
batch_size = 8
num_epochs = config.epochs
#batches_per_epoch = len(tokenized_tweets["train"]) // batch_size
total_train_steps = int(num_steps * num_epochs)
optimizer, schedule = create_optimizer(init_lr=2e-5, num_warmup_steps=0, num_train_steps=num_steps/2)
model.compile(optimizer=optimizer)
with tf.device('GPU:0'):
    model.fit(x=[np.array(train_tok_dataset["input_ids"]),np.array(train_tok_dataset["attention_mask"])],
              y=tf.keras.utils.to_categorical(y_train,num_classes=3),
              validation_data=([np.array(val_tok_dataset["input_ids"]),np.array(val_tok_dataset["attention_mask"])],tf.keras.utils.to_categorical(y_test,num_classes=3)),
              epochs=config.epochs,class_weight={0:0.57,1:0.18,2:0.39})
This seems like a small problem, but I am new to TensorFlow and transformers, so I cannot solve it on my own.
Answered 2022-08-15 11:15:11
I would say this is probably because you did not pass a loss to compile, so the gradients cannot be computed:
model.compile(optimizer=optimizer)
^^^^^^^^^^^^^^^^^^^^---- no "loss = tf.keras.losses...
Answered 2022-08-15 10:19:07
Maybe you are just missing an = right after validation_data.
model.fit(
x=[np.array(...),np.array(...)],
y=tf.keras.utils.to_categorical(...),
validation_data=([np.array(...), np.array(...)], tf.keras.utils.to_categorical(...)),
...
)
https://stackoverflow.com/questions/73358850
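Taken together, the two answers above suggest passing a loss to compile() and supplying validation_data as a keyword argument. The following is only a minimal sketch of that pattern, not the asker's actual setup. It assumes a tiny toy network in place of the DeBERTa backbone, random placeholder data in place of the tokenized dataset, and a plain Adam optimizer in place of create_optimizer. The toy sizes and the make_split helper are hypothetical and do not come from the question.

```python
import numpy as np
import tensorflow as tf

# Toy sizes chosen only for illustration (assumptions, not from the question).
max_len, vocab_size, num_classes = 16, 100, 3

# Toy stand-in for the DeBERTa backbone so the sketch runs without a download.
input_ids = tf.keras.Input(shape=(max_len,), dtype="int32")
attention_mask = tf.keras.Input(shape=(max_len,), dtype="int32")
token_emb = tf.keras.layers.Embedding(vocab_size, 8)(input_ids)
mask_emb = tf.keras.layers.Embedding(2, 8)(attention_mask)
x = tf.keras.layers.Concatenate()([token_emb, mask_emb])
x = tf.keras.layers.GlobalAveragePooling1D()(x)
output = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
model = tf.keras.Model(inputs=[input_ids, attention_mask], outputs=output)

# The loss is passed to compile(). Softmax outputs with one-hot labels pair
# with CategoricalCrossentropy (from_logits defaults to False).
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
    loss=tf.keras.losses.CategoricalCrossentropy(),
    metrics=["accuracy"],
)

rng = np.random.default_rng(0)


def make_split(n):
    """Hypothetical helper: random ids, an all-ones mask, one-hot labels."""
    ids = rng.integers(0, vocab_size, size=(n, max_len)).astype("int32")
    mask = np.ones((n, max_len), dtype="int32")
    labels = tf.keras.utils.to_categorical(
        rng.integers(0, num_classes, size=n), num_classes=num_classes
    )
    return [ids, mask], labels


x_train, y_train = make_split(32)
x_val, y_val = make_split(8)

history = model.fit(
    x=x_train,
    y=y_train,
    validation_data=(x_val, y_val),  # keyword argument: an (inputs, labels) tuple
    batch_size=8,
    epochs=1,
    verbose=0,
)
print(sorted(history.history.keys()))
```

If the labels were kept as integer class ids instead of being one-hot encoded with to_categorical, SparseCategoricalCrossentropy would be the matching loss.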