I recently used a BERT-based sentence transformer to analyze text data, and it worked well. Inspired by the book by Kulkarni et al. (2022), my code looks like this:
# Import SentenceTransformer
from sentence_transformers import SentenceTransformer
# use paraphrase-MiniLM-L12-v2 pre trained model
sbert_model = SentenceTransformer('paraphrase-MiniLM-L12-v2')
# My text
x = 'The cat caught the mouse'
# get the embedding for the sentence
sentence_embeddings_BERT = sbert_model.encode(x)
I would like to do the same thing with the DeBERTa model, but I can't get it to run. I managed to load the model, but how do I apply it?
import transformers
from transformers import DebertaTokenizer, AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-base")
model = AutoModel.from_pretrained("microsoft/deberta-v3-base")
sentence_embeddings_deBERTa = model(x)
The last line does not run. The error message is:
AttributeError: 'str' object has no attribute 'size'
Are there any experienced DeBERTa users out there?
Thanks, Pat
Posted on 2022-04-15 08:45:16
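When you call the encode() method, it tokenizes the input, converts the tokens into the tensors the transformer model expects, and then passes them through the model architecture. When you use transformers directly, you have to perform these steps yourself.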
from transformers import DebertaTokenizer, DebertaModel
import torch
# downloading the models
tokenizer = DebertaTokenizer.from_pretrained("microsoft/deberta-base")
model = DebertaModel.from_pretrained("microsoft/deberta-base")
# tokenizing the input text and converting it into pytorch tensors
inputs = tokenizer(["The cat caught the mouse", "This is the second sentence"], return_tensors="pt", padding=True)
# pass through the model
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
Finally, you have to decide which part of the output you want to use.
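One common way to turn per-token outputs into a single vector per sentence is mean pooling: average the token vectors of each sentence while ignoring padding positions. The sketch below is only one possible choice, not necessarily the one intended above. The helper name mean_pool is hypothetical, and the small tensors are synthetic stand-ins for outputs.last_hidden_state and inputs["attention_mask"] so that the example runs without downloading a model.

```python
import torch

def mean_pool(last_hidden_state, attention_mask):
    """Average the token vectors of each sentence, ignoring padding positions.

    Hypothetical helper for illustration; not part of transformers.
    """
    # (batch, seq_len) -> (batch, seq_len, 1) so the mask broadcasts over the hidden dimension
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    # Zero out padded positions, then sum over the sequence dimension
    summed = (last_hidden_state * mask).sum(dim=1)
    # Number of real tokens per sentence (clamped to avoid division by zero)
    counts = mask.sum(dim=1).clamp(min=1e-9)
    return summed / counts

# Synthetic stand-ins: 2 sentences, 3 token positions, hidden size 2.
# The third position of the first sentence is padding (mask value 0).
hidden = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],
                       [[5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]])
mask = torch.tensor([[1, 1, 0],
                     [1, 1, 1]])

sentence_embeddings = mean_pool(hidden, mask)
print(sentence_embeddings.shape)
```

With the variables from the code above, the call would be mean_pool(outputs.last_hidden_state, inputs["attention_mask"]), which yields one vector per input sentence. Note that a plain pretrained DeBERTa checkpoint has not been fine-tuned for sentence similarity, so the resulting embeddings may behave differently from those of a sentence-transformers model.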
https://stackoverflow.com/questions/71878447