AllenNLP Series, Part 4: Coreference Resolution

sparkexpert

Published 2019-05-27 19:54:59


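Coreference resolution is one of the major tasks in natural language processing, and it is an indispensable part of information extraction. In information extraction, the events and entity relations a user cares about are often scattered across different positions in a text, and the entities involved can usually be expressed in several different ways; for example, an entity in a semantic relation may appear as a pronoun. To extract the relevant information accurately and without omissions, the coreference phenomena in the text have to be resolved. Coreference resolution matters not only for information extraction but also for applications such as machine translation, text summarization, and question answering.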

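Take the first sentence of this article as an example: "Coreference resolution is one of the major tasks in natural language processing, and it is an indispensable part of information extraction." Here the pronoun "it" refers back to "coreference resolution".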

One nice thing about AllenNLP is that it provides coreference resolution out of the box. Its own description reads as follows:

Coreference Resolution

Coreference resolution is the task of finding all expressions that refer to the same entity in a text. It is an important step for many higher level NLP tasks that involve natural language understanding, such as document summarization, question answering and information extraction. Our implementation is based on End-to-End Coreference Resolution (Lee et al., 2017)--a neural model which considers all possible spans in the document as potential mentions and learns distributions over possible antecedents for each span. This approach achieved state-of-the-art results on the Ontonotes 5.0 dataset in early 2017. The AllenNLP implementation achieves 63.0% F1 on the CoNLL test set. Please note that this model does not include speaker features (impractical for general use), variational dropout (currently difficult to implement in PyTorch) or data augmentation and considers 100 antecedents rather than 250 due to memory constraints.
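To make the description above concrete, here is a minimal sketch of calling a pretrained coreference model through AllenNLP's `Predictor` interface and turning its output into readable mentions. Several things here are assumptions rather than part of the original article: `model_archive` is a placeholder for a model archive path or URL that you supply yourself, the output is assumed to contain a `document` token list and `clusters` of inclusive `[start, end]` token spans, and `example_output` is a hand-written stand-in, not real model output.

```python
from typing import Dict, List


def predict_coref(model_archive: str, text: str) -> Dict:
    """Run a pretrained AllenNLP coreference model on raw text.

    `model_archive` is a placeholder: pass the path or URL of a coreference
    model archive you have obtained. AllenNLP is imported lazily so the rest
    of this example works without it installed.
    """
    from allennlp.predictors.predictor import Predictor

    predictor = Predictor.from_path(model_archive)
    return predictor.predict(document=text)


def cluster_mentions(output: Dict) -> List[List[str]]:
    """Turn token-index clusters into readable mention strings.

    Assumes output["document"] is a list of tokens and output["clusters"]
    is a list of clusters, each a list of inclusive [start, end] spans.
    """
    tokens = output["document"]
    return [
        [" ".join(tokens[start:end + 1]) for start, end in cluster]
        for cluster in output["clusters"]
    ]


# Hand-written stand-in for a predictor result (not real model output).
example_output = {
    "document": ["Paul", "Allen", "founded", "Microsoft", ".",
                 "He", "was", "born", "in", "Seattle", "."],
    "clusters": [[[0, 1], [5, 5]]],
}
print(cluster_mentions(example_output))
```

With a real model, `cluster_mentions(predict_coref(model_archive, text))` would be the equivalent call, provided the output has the assumed shape.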

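The basic principle behind coreference resolution is covered in Lecture 15 of Stanford's CS224n course: find all the mentions in a text, pair them up two by two, and score each pair, as illustrated in the course slides: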

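Because the machine does not know in advance which mentions will end up in the same coreference cluster, every pair of mentions has to be formed and then scored.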

Clustering on the pairwise scores gives the result below, which is how the coreferences get resolved.
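The pair-then-score-then-cluster idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the CS224n or AllenNLP implementation: `toy_score` and its lookup table are hypothetical stand-ins for a learned pairwise scorer, the 0.5 threshold is arbitrary, and the mentions are taken from the opening sentence of this article. Pairs scoring above the threshold are merged with a small union-find structure.

```python
from itertools import combinations
from typing import Callable, Dict, List


def cluster_by_pair_scores(
    mentions: List[str],
    score: Callable[[str, str], float],
    threshold: float = 0.5,
) -> List[List[str]]:
    """Score every mention pair and merge pairs scoring above `threshold`."""
    parent = list(range(len(mentions)))  # union-find forest over mention indices

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Every pair (i, j) with i < j is scored: O(n^2) pairs for n mentions.
    for i, j in combinations(range(len(mentions)), 2):
        if score(mentions[i], mentions[j]) > threshold:
            parent[find(j)] = find(i)

    # Group mentions by the root of their union-find tree.
    clusters: Dict[int, List[str]] = {}
    for idx, mention in enumerate(mentions):
        clusters.setdefault(find(idx), []).append(mention)
    return list(clusters.values())


# Hypothetical scores standing in for a trained pairwise model.
toy_scores = {("coreference resolution", "it"): 0.9}


def toy_score(a: str, b: str) -> float:
    return toy_scores.get((a, b), 0.1)


mentions = [
    "coreference resolution",
    "natural language processing",
    "it",
    "information extraction",
]
print(cluster_by_pair_scores(mentions, toy_score))
```

Only "coreference resolution" and "it" end up in a shared cluster; the other two mentions remain singletons.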

1. Principles of the paper

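AllenNLP integrates the coreference resolution algorithm from the EMNLP 2017 paper End-to-end Neural Coreference Resolution. The problem the paper targets is the one described above: the number of candidate pairs grows very quickly with document length (quartically, once every span is treated as a potential mention). It therefore uses pruning strategies to cut down the number of pairs and improve speed, while also gaining some accuracy.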

Scoring all span pairs in our end-to-end model is impractical, since the complexity would be quartic in the document length. Therefore we factor the model over unary mention scores and pairwise antecedent scores, both of which are simple functions of the learned span embedding. The unary mention scores are used to prune the space of spans and antecedents, to aggressively reduce the number of pairwise computations.
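The effect of this factoring can be illustrated with a simplified sketch. In the paper the pairwise score decomposes as s(i, j) = s_m(i) + s_m(j) + s_a(i, j), so the cheap unary score s_m can be used to discard most spans before any pair is scored. The code below is an assumption-laden illustration, not the paper's or AllenNLP's code: `toy_mention_score` is a hypothetical stand-in for the learned s_m, the function names are made up for this example, and other details of the paper's pruning are omitted.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token indices


def enumerate_spans(num_tokens: int, max_width: int) -> List[Span]:
    """All spans containing at most `max_width` tokens."""
    return [
        (start, end)
        for start in range(num_tokens)
        for end in range(start, min(start + max_width, num_tokens))
    ]


def prune_spans(
    spans: List[Span],
    mention_score: Callable[[Span], float],
    num_tokens: int,
    keep_ratio: float = 0.4,
) -> List[Span]:
    """Keep the top keep_ratio * num_tokens spans by unary mention score."""
    keep = max(1, int(keep_ratio * num_tokens))
    top = sorted(spans, key=mention_score, reverse=True)[:keep]
    return sorted(top)  # restore document order


def candidate_pairs(spans: List[Span], max_antecedents: int) -> List[Tuple[Span, Span]]:
    """Pair each kept span with at most `max_antecedents` preceding kept spans."""
    pairs = []
    for i, span in enumerate(spans):
        for antecedent in spans[max(0, i - max_antecedents):i]:
            pairs.append((antecedent, span))
    return pairs


def toy_mention_score(span: Span) -> float:
    """Hypothetical unary score: simply prefers shorter spans."""
    return -float(span[1] - span[0])


num_tokens = 50
all_spans = enumerate_spans(num_tokens, max_width=num_tokens)
short_spans = enumerate_spans(num_tokens, max_width=10)
kept = prune_spans(short_spans, toy_mention_score, num_tokens)
pairs = candidate_pairs(kept, max_antecedents=5)

print(len(all_spans), len(all_spans) * (len(all_spans) - 1) // 2)
print(len(short_spans), len(kept), len(pairs))
```

For a 50-token document there are 1275 unrestricted spans and 812175 span pairs. Limiting width to 10 leaves 455 spans, pruning keeps 20 of them, and capping antecedents at 5 leaves 85 pairs to score. The values `max_width=10` and `keep_ratio=0.4` follow the settings reported in the paper; the antecedent cap of 5 is chosen only to keep the toy example small.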

Its architecture is shown below: