T5Tokenizer requires the SentencePiece library but it was not found in your environment

Stack Overflow user
Asked on 2020-12-25 05:54:18
Answers: 2, Views: 6.3K, Followers: 0, Votes: 4

I am trying to explore T5.

Here is the code:

!pip install transformers
from transformers import T5Tokenizer, T5ForConditionalGeneration
qa_input = """question: What is the capital of Syria? context: The name "Syria" historically referred to a wider region,
 broadly synonymous with the Levant, and known in Arabic as al-Sham. The modern state encompasses the sites of several ancient 
 kingdoms and empires, including the Eblan civilization of the 3rd millennium BC. Aleppo and the capital city Damascus are 
 among the oldest continuously inhabited cities in the world."""
tokenizer = T5Tokenizer.from_pretrained('t5-small')
model = T5ForConditionalGeneration.from_pretrained('t5-small')
input_ids = tokenizer.encode(qa_input, return_tensors="pt")  # Batch size 1
outputs = model.generate(input_ids)
output_str = tokenizer.decode(outputs.reshape(-1))

I got an error:

---------------------------------------------------------------------------

ImportError                               Traceback (most recent call last)

<ipython-input-2-8d24c6a196e4> in <module>()
      5  kingdoms and empires, including the Eblan civilization of the 3rd millennium BC. Aleppo and the capital city Damascus are
      6  among the oldest continuously inhabited cities in the world."""
----> 7 tokenizer = T5Tokenizer.from_pretrained('t5-small')
      8 model = T5ForConditionalGeneration.from_pretrained('t5-small')
      9 input_ids = tokenizer.encode(qa_input, return_tensors="pt")  # Batch size 1

1 frames

/usr/local/lib/python3.6/dist-packages/transformers/file_utils.py in requires_sentencepiece(obj)
    521     name = obj.__name__ if hasattr(obj, "__name__") else obj.__class__.__name__
    522     if not is_sentencepiece_available():
--> 523         raise ImportError(SENTENCEPIECE_IMPORT_ERROR.format(name))
    524 
    525 

ImportError: 
T5Tokenizer requires the SentencePiece library but it was not found in your environment. Checkout the instructions on the
installation page of its repo: https://github.com/google/sentencepiece#installation and follow the ones
that match your environment.


--------------------------------------------------------------------------

After that, I installed the sentencepiece library as the message suggested:

!pip install transformers
!pip install sentencepiece

from transformers import T5Tokenizer, T5ForConditionalGeneration
qa_input = """question: What is the capital of Syria? context: The name "Syria" historically referred to a wider region,
 broadly synonymous with the Levant, and known in Arabic as al-Sham. The modern state encompasses the sites of several ancient 
 kingdoms and empires, including the Eblan civilization of the 3rd millennium BC. Aleppo and the capital city Damascus are 
 among the oldest continuously inhabited cities in the world."""
tokenizer = T5Tokenizer.from_pretrained('t5-small')
model = T5ForConditionalGeneration.from_pretrained('t5-small')
input_ids = tokenizer.encode(qa_input, return_tensors="pt")  # Batch size 1
outputs = model.generate(input_ids)
output_str = tokenizer.decode(outputs.reshape(-1))

But then I got another problem:

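Some weights of the model checkpoint at t5-small were not used when initializing T5ForConditionalGeneration: ['decoder.block.0.layer.1.EncDecAttention.relative_attention_bias.weight']

  • This IS expected if you are initializing T5ForConditionalGeneration from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing T5ForConditionalGeneration from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).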

So I don't understand what is going on. Is there any explanation?

2 Answers

Stack Overflow user

Answered on 2022-05-05 09:46:04

I used these two commands and they worked fine for me!

!pip install datasets transformers[sentencepiece]
!pip install sentencepiece
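
To confirm that the install actually took effect before loading the tokenizer, a small pre-flight check can help. The following is only a minimal sketch: the helper name `missing_packages` is hypothetical and not part of transformers, and it relies solely on the standard-library `importlib.util.find_spec`. It only accepts top-level package names.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names from `names` that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical pre-flight check before calling T5Tokenizer.from_pretrained.
missing = missing_packages(["sentencepiece"])
if missing:
    print("Install first: pip install " + " ".join(missing))
else:
    print("sentencepiece is available")
```

In a notebook, if transformers was already imported before sentencepiece was installed, restarting the runtime after the install may be needed for the library to pick it up.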
Votes: 2

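Stack Overflow user

Answered on 2021-04-17 13:19:14

This is not a problem. I also observed the second output. It is just a warning displayed by the library. You have already solved your actual problem, so don't worry about the warning.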
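
If the warning is distracting, the library's log level can be raised so that only errors are shown. This is a minimal sketch, not something the answer above prescribes: the wrapper `quiet_transformers_warnings` is a hypothetical helper, while `transformers.logging.set_verbosity_error()` is the library's own logging utility. Note that it hides all transformers warnings, not just this one.

```python
import importlib.util


def quiet_transformers_warnings() -> bool:
    """Raise the transformers log threshold to ERROR.

    Returns True if transformers is installed and the level was changed,
    False if the library is not available in this environment.
    """
    if importlib.util.find_spec("transformers") is None:
        return False
    from transformers import logging as hf_logging

    # Only messages at ERROR level or above are shown after this call.
    hf_logging.set_verbosity_error()
    return True


# Call this before from_pretrained so the checkpoint warning is not printed.
quiet_transformers_warnings()
```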

Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/65445651
