
How to resolve coreference with AllenNLP and coref-spanbert-large without internet access?

Stack Overflow user
Asked on 2021-05-13 17:09:08
1 answer · 264 views · 1 follower · 0 votes

I want to use AllenNLP with the coref-spanbert-large model to resolve coreferences without internet access. I tried to do it the way described here: https://demo.allennlp.org/coreference-resolution

My code:

from allennlp.predictors.predictor import Predictor
import allennlp_models.tagging

predictor = Predictor.from_path(r"C:\Users\aap\Desktop\coref-spanbert-large-2021.03.10.tar.gz")
example = 'Paul Allen was born on January 21, 1953, in Seattle, Washington, to Kenneth Sam Allen and Edna Faye Allen.Allen attended Lakeside School, a private school in Seattle, where he befriended Bill Gates, two years younger, with whom he shared an enthusiasm for computers.'
pred = predictor.predict(document=example)
coref_res = predictor.coref_resolved(example)
print(pred)
print(coref_res)

The code works fine when I have internet access. But when I don't, I get the following error:

Traceback (most recent call last):
  File "C:/Users/aap/Desktop/CoreNLP/Coref_AllenNLP.py", line 14, in <module>
    predictor = Predictor.from_path(r"C:\Users\aap\Desktop\coref-spanbert-large-2021.03.10.tar.gz")
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\predictors\predictor.py", line 361, in from_path
    load_archive(archive_path, cuda_device=cuda_device, overrides=overrides),
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\models\archival.py", line 206, in load_archive
    config.duplicate(), serialization_dir
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\models\archival.py", line 232, in _load_dataset_readers
    dataset_reader_params, serialization_dir=serialization_dir
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\common\from_params.py", line 604, in from_params
    **extras,
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\common\from_params.py", line 632, in from_params
    kwargs = create_kwargs(constructor_to_inspect, cls, params, **extras)
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\common\from_params.py", line 200, in create_kwargs
    cls.__name__, param_name, annotation, param.default, params, **extras
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\common\from_params.py", line 307, in pop_and_construct_arg
    return construct_arg(class_name, name, popped_params, annotation, default, **extras)
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\common\from_params.py", line 391, in construct_arg
    **extras,
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\common\from_params.py", line 341, in construct_arg
    return annotation.from_params(params=popped_params, **subextras)
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\common\from_params.py", line 604, in from_params
    **extras,
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\common\from_params.py", line 634, in from_params
    return constructor_to_call(**kwargs)  # type: ignore
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\data\token_indexers\pretrained_transformer_mismatched_indexer.py", line 63, in __init__
    **kwargs,
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\data\token_indexers\pretrained_transformer_indexer.py", line 58, in __init__
    model_name, tokenizer_kwargs=tokenizer_kwargs
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\data\tokenizers\pretrained_transformer_tokenizer.py", line 71, in __init__
    model_name, add_special_tokens=False, **tokenizer_kwargs
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\common\cached_transformers.py", line 110, in get_tokenizer
    **kwargs,
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 362, in from_pretrained
    config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\transformers\models\auto\configuration_auto.py", line 368, in from_pretrained
    config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\transformers\configuration_utils.py", line 424, in get_config_dict
    use_auth_token=use_auth_token,
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\transformers\file_utils.py", line 1087, in cached_path
    local_files_only=local_files_only,
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\transformers\file_utils.py", line 1268, in get_from_cache
    "Connection error, and we cannot find the requested files in the cached path."
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.

Process finished with exit code 1

Please tell me what I need to do to make my code work without internet access.

1 Answer

Stack Overflow user

Answered on 2021-05-15 00:58:14

You need a local copy of the transformer model's configuration file and vocabulary, so that the tokenizer and token indexer don't have to download them:

from transformers import AutoConfig, AutoTokenizer

# transformer_model_name: the Hugging Face model name the archive uses
# (for coref-spanbert-large this should be "SpanBERT/spanbert-large-cased")
tokenizer = AutoTokenizer.from_pretrained(transformer_model_name)
config = AutoConfig.from_pretrained(transformer_model_name)

# local_config_path: a local directory to save the tokenizer files and config into
tokenizer.save_pretrained(local_config_path)
config.to_json_file(local_config_path + "/config.json")

Then you need to override the transformer model name in the archive's configuration so that it points at the local directory (local_config_path) where you saved those files:

predictor = Predictor.from_path(
    r"C:\Users\aap\Desktop\coref-spanbert-large-2021.03.10.tar.gz",
    overrides={
        "dataset_reader.token_indexers.tokens.model_name": local_config_path,
        "validation_dataset_reader.token_indexers.tokens.model_name": local_config_path,
        "model.text_field_embedder.tokens.model_name": local_config_path,
    },
)
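Note that some AllenNLP releases expect `overrides` as a JSON-encoded string rather than a dict. If you hit a type error, you can serialize the same mapping first. A minimal sketch, assuming a hypothetical `local_config_path` directory holding the saved tokenizer files and config:

```python
import json

# Hypothetical local directory with the saved tokenizer files and config.json
local_config_path = r"C:\Users\aap\Desktop\spanbert_local"

# The same three keys from the overrides above, serialized as a JSON string
overrides = json.dumps({
    "dataset_reader.token_indexers.tokens.model_name": local_config_path,
    "validation_dataset_reader.token_indexers.tokens.model_name": local_config_path,
    "model.text_field_embedder.tokens.model_name": local_config_path,
})

# Then pass the string instead of the dict:
# predictor = Predictor.from_path(
#     r"C:\Users\aap\Desktop\coref-spanbert-large-2021.03.10.tar.gz",
#     overrides=overrides,
# )
print(overrides)
```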
0 votes
Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/67516665