Stanford CoreNLP BasicPipelineExample not working

Stack Overflow user
Asked 2017-07-04 16:49:52
2 answers · 2.1K views · 3 votes

I'm trying to get started with Stanford CoreNLP and can't even get through the first simple example.

https://stanfordnlp.github.io/CoreNLP/api.html

Here is my code:

package stanford.corenlp;

import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import com.google.common.io.Files;

import edu.stanford.nlp.dcoref.CorefChain;
import edu.stanford.nlp.dcoref.CorefCoreAnnotations.CorefChainAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.NamedEntityTagAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.PartOfSpeechAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TextAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.semgraph.SemanticGraph;
import edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations.CollapsedCCProcessedDependenciesAnnotation;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.TreeCoreAnnotations.TreeAnnotation;
import edu.stanford.nlp.util.CoreMap;
import java.util.logging.Level;
import java.util.logging.Logger;

public class StanfordNLP {

    private void test2() {
        // creates a StanfordCoreNLP object, with POS tagging, lemmatization, NER, parsing, and coreference resolution
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        // read some text in the text variable
        String text = "Now is the time for all good men to come to the aid of their country.";

        // create an empty Annotation just with the given text
        Annotation document = new Annotation(text);

        // run all Annotators on this text
        pipeline.annotate(document);

    }

  public static void main(String[] args) throws IOException {
      StanfordNLP nlp = new StanfordNLP();
      nlp.test2();
  }

}

Here is the stack trace:

Adding annotator tokenize
No tokenizer type provided. Defaulting to PTBTokenizer.
Adding annotator ssplit
Adding annotator pos
Exception in thread "main" edu.stanford.nlp.io.RuntimeIOException: Error while loading a tagger model (probably missing model file)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:791)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:312)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:265)
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(POSTaggerAnnotator.java:85)
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.<init>(POSTaggerAnnotator.java:73)
    at edu.stanford.nlp.pipeline.AnnotatorImplementations.posTagger(AnnotatorImplementations.java:55)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getNamedAnnotators$42(StanfordCoreNLP.java:496)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getDefaultAnnotatorPool$65(StanfordCoreNLP.java:533)
    at edu.stanford.nlp.util.Lazy$3.compute(Lazy.java:118)
    at edu.stanford.nlp.util.Lazy.get(Lazy.java:31)
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:146)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:447)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:150)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:146)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:133)
    at stanford.corenlp.StanfordNLP.test2(StanfordNLP.java:93)
    at stanford.corenlp.StanfordNLP.main(StanfordNLP.java:108)
Caused by: java.io.IOException: Unable to open "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger" as class path, filename or URL
    at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:480)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:789)
    ... 16 more
C:\Users\Greg\AppData\Local\NetBeans\Cache\8.2\executor-snippets\run.xml:53: Java returned: 1
BUILD FAILED (total time: 0 seconds)

What am I missing?

Answer from a Stack Overflow user, posted 2019-12-31 23:05:23

Added 2019-12-31 for clarity/reference. Note: the commands below were run in a Linux terminal.

Download CoreNLP from https://stanfordnlp.github.io/CoreNLP/download.html and unzip it.

pwd; ls -l
  /mnt/Vancouver/apps/CoreNLP/src-local/zzz
  -rw-r--r-- 1 victoria victoria 393239982 Dec 31 14:13 stanford-corenlp-full-2018-10-05.zip

unzip stanford-corenlp-full-2018-10-05.zip
  # ...

ls -l
  drwxrwxr-x 5 victoria victoria      4096 Oct  8  2018 stanford-corenlp-full-2018-10-05
  -rw-r--r-- 1 victoria victoria 393239982 Dec 31 14:13 stanford-corenlp-full-2018-10-05.zip 

Save the BasicPipelineExample code in a file named BasicPipelineExample.java:

/mnt/Vancouver/apps/CoreNLP/src-local/zzz/BasicPipelineExample.java

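Compile it. The command assumes stanford-corenlp-3.9.2.jar is in the current directory; the jars ship inside the unzipped stanford-corenlp-full-2018-10-05 directory, so either work from there or adjust the -cp path: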

pwd     ## "sanity check"
  /mnt/Vancouver/apps/CoreNLP/src-local/zzz/

javac -cp stanford-corenlp-3.9.2.jar  BasicPipelineExample.java -Xdiags:verbose

This produces the class file BasicPipelineExample.class. Run it from the same directory:

java -cp ".:*" BasicPipelineExample   # quotes keep the shell from expanding *; on Windows the separator is ";" instead of ":"
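
The stack trace in the question fails on "Unable to open ... english-left3words-distsim.tagger as class path, filename or URL", which usually means the jar holding the models (stanford-corenlp-3.9.2-models.jar in the 2018-10-05 download) is not on the classpath. Putting every jar in the directory on the classpath, as the run command above does, picks it up. The following is a minimal sketch for checking this yourself; the class name ModelClasspathCheck and its methods are hypothetical helpers, not part of CoreNLP, and the resource path is copied from the stack trace.

```java
import java.net.URL;

/** Hypothetical helper (not part of CoreNLP): checks whether a resource is visible on the classpath. */
public class ModelClasspathCheck {

    // Resource path copied from the "Unable to open ..." line of the stack trace in the question.
    public static final String POS_MODEL =
        "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger";

    /** Returns the URL the resource would be loaded from, or null if it is not on the classpath. */
    public static URL locate(String resourcePath) {
        return ClassLoader.getSystemResource(resourcePath);
    }

    /** True if the resource can be found by the system class loader. */
    public static boolean isOnClasspath(String resourcePath) {
        return locate(resourcePath) != null;
    }

    public static void main(String[] args) {
        URL url = locate(POS_MODEL);
        if (url == null) {
            // Print the effective classpath so a missing models jar is easy to spot.
            System.out.println("POS model NOT found on classpath: " + POS_MODEL);
            System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        } else {
            System.out.println("POS model found at: " + url);
        }
    }
}
```

Compile it with javac ModelClasspathCheck.java and run it with the same -cp value you use for BasicPipelineExample. If it reports the model as missing, the printed java.class.path shows which jars the JVM actually saw.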

Addendum

The steps above access CoreNLP from a Java environment, as described here: https://stanfordnlp.github.io/CoreNLP/api.html#quickstart-with-convenience-wrappers

For those who prefer Python (myself included), Stanford provides essentially the same functionality in a Python environment, as described here: client.html

For example:

import stanfordnlp
from stanfordnlp.server import CoreNLPClient
# JSON output [default]:
client = CoreNLPClient(annotators=['tokenize','ssplit','pos','lemma','ner', \
    'parse', 'depparse','coref'], timeout=30000, memory='16G')
# Plain-text output (much more compact); this second assignment replaces the client above:
client = CoreNLPClient(annotators='tokenize, ssplit, pos, lemma, ner, parse, \
    depparse, coref', output_format='text',  timeout=30000, memory='16G')
text = 'Breast cancer susceptibility gene 1 (BRCA1) is a tumor suppressor protein.'
# This auto-starts the client() instance:
ann = client.annotate(text)
  # ....
sentence = ann.sentence[0]
print(sentence)
  # ... copious output ...
print(ann)
  # ... more succinct ...

Note: if you pass the output_format='text' argument, annotate() returns a plain string, so this works:

print(ann)

but this does not:

sentence = ann.sentence[0]
print(sentence)
  Traceback (most recent call last):
    File "<console>", line 1, in <module>
  AttributeError: 'str' object has no attribute 'sentence'

With the stanfordnlp package you can also set up a pipeline, as described here: https://stanfordnlp.github.io/stanfordnlp/

For example:

import stanfordnlp
stanfordnlp.download('en')
nlp = stanfordnlp.Pipeline()
text = 'Bananas are an excellent source of potassium.'
text_nlp = nlp(text)
text_nlp.sentences[0].print_dependencies()

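Finally, you can get somewhat similar results by accessing CoreNLP through spaCy, although I find the functionality limited compared with Stanford's own CoreNLP libraries: https://github.com/explosion/spacy-stanfordnlp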

import stanfordnlp
from spacy_stanfordnlp import StanfordNLPLanguage
snlp = stanfordnlp.Pipeline(lang="en")
nlp = StanfordNLPLanguage(snlp)
doc = nlp("Barack Obama was born in Hawaii. He was elected president in 2008.")
for token in doc:
    print(token.text, token.lemma_, token.pos_, token.dep_)
  # ...
Original page content provided by Stack Overflow. Original link: https://stackoverflow.com/questions/44910934
