I'm trying to get started with Stanford CoreNLP and can't even get past the first simple example.
Here is my code:
package stanford.corenlp;
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import com.google.common.io.Files;
import edu.stanford.nlp.dcoref.CorefChain;
import edu.stanford.nlp.dcoref.CorefCoreAnnotations.CorefChainAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.NamedEntityTagAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.PartOfSpeechAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TextAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.semgraph.SemanticGraph;
import edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations.CollapsedCCProcessedDependenciesAnnotation;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.TreeCoreAnnotations.TreeAnnotation;
import edu.stanford.nlp.util.CoreMap;
import java.util.logging.Level;
import java.util.logging.Logger;
public class StanfordNLP {
private void test2() {
// creates a StanfordCoreNLP object, with POS tagging, lemmatization, NER, parsing, and coreference resolution
Properties props = new Properties();
props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
// read some text in the text variable
String text = "Now is the time for all good men to come to the aid of their country.";
// create an empty Annotation just with the given text
Annotation document = new Annotation(text);
// run all Annotators on this text
pipeline.annotate(document);
}
public static void main(String[] args) throws IOException {
StanfordNLP nlp = new StanfordNLP();
nlp.test2();
}
}
Here is the stack trace:
Adding annotator tokenize
No tokenizer type provided. Defaulting to PTBTokenizer.
Adding annotator ssplit
Adding annotator pos
Exception in thread "main" edu.stanford.nlp.io.RuntimeIOException: Error while loading a tagger model (probably missing model file)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:791)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:312)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:265)
at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(POSTaggerAnnotator.java:85)
at edu.stanford.nlp.pipeline.POSTaggerAnnotator.<init>(POSTaggerAnnotator.java:73)
at edu.stanford.nlp.pipeline.AnnotatorImplementations.posTagger(AnnotatorImplementations.java:55)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getNamedAnnotators$42(StanfordCoreNLP.java:496)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getDefaultAnnotatorPool$65(StanfordCoreNLP.java:533)
at edu.stanford.nlp.util.Lazy$3.compute(Lazy.java:118)
at edu.stanford.nlp.util.Lazy.get(Lazy.java:31)
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:146)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:447)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:150)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:146)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:133)
at stanford.corenlp.StanfordNLP.test2(StanfordNLP.java:93)
at stanford.corenlp.StanfordNLP.main(StanfordNLP.java:108)
Caused by: java.io.IOException: Unable to open "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger" as class path, filename or URL
at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:480)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:789)
... 16 more
C:\Users\Greg\AppData\Local\NetBeans\Cache\8.2\executor-snippets\run.xml:53: Java returned: 1
BUILD FAILED (total time: 0 seconds)
What am I missing?

Answer (posted 2019-12-31 23:05:23)

Updated 2019-12-31 for clarity/reference. Note: the commands below were run in a Linux terminal.
Download CoreNLP from https://stanfordnlp.github.io/CoreNLP/download.html and unzip it:
pwd; ls -l
/mnt/Vancouver/apps/CoreNLP/src-local/zzz
-rw-r--r-- 1 victoria victoria 393239982 Dec 31 14:13 stanford-corenlp-full-2018-10-05.zip
unzip stanford-corenlp-full-2018-10-05.zip
# ...
ls -l
drwxrwxr-x 5 victoria victoria 4096 Oct 8 2018 stanford-corenlp-full-2018-10-05
-rw-r--r-- 1 victoria victoria 393239982 Dec 31 14:13 stanford-corenlp-full-2018-10-05.zip
保存"BasicPipelineExample.java“代码
在一个名为BasicPipelineExample.java
的文件中:
/mnt/Vancouver/apps/CoreNLP/src-local/zzz/BasicPipelineExample.java
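The contents of that file are not reproduced here; if you just want something that compiles, a minimal sketch along the following lines will do. This is not the verbatim BasicPipelineExample from the CoreNLP docs, only a stripped-down example that uses the same Annotation API as the code in the question and compiles against stanford-corenlp-3.9.2.jar:
import java.util.List;
import java.util.Properties;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;
public class BasicPipelineExample {
    public static void main(String[] args) {
        // build a pipeline with the basic annotators (no parse/coref, to keep it small)
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        // annotate a sample sentence
        Annotation document = new Annotation("Joe Smith was born in California.");
        pipeline.annotate(document);
        // print each token with its part-of-speech and named-entity tag
        List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
        for (CoreMap sentence : sentences) {
            for (CoreLabel token : sentence.get(CoreAnnotations.TokensAnnotation.class)) {
                String word = token.get(CoreAnnotations.TextAnnotation.class);
                String pos = token.get(CoreAnnotations.PartOfSpeechAnnotation.class);
                String ner = token.get(CoreAnnotations.NamedEntityTagAnnotation.class);
                System.out.println(word + "\t" + pos + "\t" + ner);
            }
        }
    }
}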
Compile it:
pwd ## "sanity check"
/mnt/Vancouver/apps/CoreNLP/src-local/zzz/
javac -cp stanford-corenlp-3.9.2.jar BasicPipelineExample.java -Xdiags:verbose
which produces the class file BasicPipelineExample.class. Then run it from that directory; the .:* classpath wildcard puts every jar in that directory on the classpath, including the CoreNLP models jar (which contains the POS tagger model the stack trace above could not find):
java -cp .:* BasicPipelineExample
Addendum
The code above uses CoreNLP from a Java environment, as described here: https://stanfordnlp.github.io/CoreNLP/api.html#quickstart-with-convenience-wrappers
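For reference, the convenience-wrapper API that page refers to looks roughly like this. This is a sketch, assuming CoreNLP 3.9.x (where CoreDocument and CoreSentence live in edu.stanford.nlp.pipeline); WrapperExample is just an illustrative class name:
import java.util.Properties;
import edu.stanford.nlp.pipeline.CoreDocument;
import edu.stanford.nlp.pipeline.CoreSentence;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
public class WrapperExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        // CoreDocument / CoreSentence wrap the lower-level Annotation / CoreMap objects
        CoreDocument document = new CoreDocument("Joe Smith was born in California.");
        pipeline.annotate(document);
        for (CoreSentence sentence : document.sentences()) {
            System.out.println(sentence.text());
            System.out.println(sentence.posTags());   // one POS tag per token
        }
    }
}
document.sentences() returns one CoreSentence per sentence, each wrapping the underlying CoreMap.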
For those (myself included) who prefer Python, Stanford provides essentially the same functionality in a Python environment, as described in the corresponding client documentation (client.html).
For example,
import stanfordnlp
from stanfordnlp.server import CoreNLPClient
# JSON output [default]:
client = CoreNLPClient(annotators=['tokenize','ssplit','pos','lemma','ner', \
'parse', 'depparse','coref'], timeout=30000, memory='16G')
# Plain-text output (much more compact):
client = CoreNLPClient(annotators='tokenize, ssplit, pos, lemma, ner, parse, \
depparse, coref', output_format='text', timeout=30000, memory='16G')
text = 'Breast cancer susceptibility gene 1 (BRCA1) is a tumor suppressor protein.'
# This auto-starts the client() instance:
ann = client.annotate(text)
# ....
sentence = ann.sentence[0]
print(sentence)
# ... copious output ...
print(ann)
# ... more succinct ...
Note: if you use the output_format='text' parameter, you can do this
print(ann)
but not this
sentence = ann.sentence[0]
print(sentence)
Traceback (most recent call last):
File "<console>", line 1, in <module>
AttributeError: 'str' object has no attribute 'sentence'
Using the stanfordnlp package, you can also set up a pipeline, as described here: https://stanfordnlp.github.io/stanfordnlp/
For example,
import stanfordnlp
stanfordnlp.download('en')
nlp = stanfordnlp.Pipeline()
text = 'Bananas are an excellent source of potassium.'
text_nlp = nlp(text)
text_nlp.sentences[0].print_dependencies()
Lastly, although I find the functionality somewhat limited (compare the Stanford authors' own CoreNLP library), some similar results can be obtained by accessing CoreNLP through spaCy: https://github.com/explosion/spacy-stanfordnlp
import stanfordnlp
from spacy_stanfordnlp import StanfordNLPLanguage
snlp = stanfordnlp.Pipeline(lang="en")
nlp = StanfordNLPLanguage(snlp)
doc = nlp("Barack Obama was born in Hawaii. He was elected president in 2008.")
for token in doc:
print(token.text, token.lemma_, token.pos_, token.dep_)
# ...
https://stackoverflow.com/questions/44910934