Natural Language Processing arXiv Daily Digest [7.12]

WeChat public account: arXiv Daily Academic Digest (arXiv每日学术速递)

Published 2021-07-27 10:47:54

Collected in the column: arXiv Daily Academic Digest

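Visit www.arxivdaily.com for the digest with abstracts, covering CS, physics, mathematics, economics, statistics, finance, biology, and electrical engineering, with search, bookmarking, and posting features.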

cs.CL: 12 papers today

QA | VQA | Question Answering | Dialogue (1 paper)

【1】 Joint Models for Answer Verification in Question Answering Systems

Authors: Zeyu Zhang, Thuy Vu, Alessandro Moschitti
Affiliations: School of Information, The University of Arizona, Tucson, AZ, USA; Amazon Alexa AI, Manhattan Beach, CA, USA
Link: https://arxiv.org/abs/2107.04217
Abstract: This paper studies joint models for selecting correct answer sentences among the top $k$ provided by answer sentence selection (AS2) modules, which are core components of retrieval-based Question Answering (QA) systems. Our work shows that a critical step to effectively exploit an answer set regards modeling the interrelated information between pair of answers. For this purpose, we build a three-way multi-classifier, which decides if an answer supports, refutes, or is neutral with respect to another one. More specifically, our neural architecture integrates a state-of-the-art AS2 model with the multi-classifier, and a joint layer connecting all components. We tested our models on WikiQA, TREC-QA, and a real-world dataset. The results show that our models obtain the new state of the art in AS2.

Machine Translation (2 papers)

【1】 Using Machine Translation to Localize Task Oriented NLG Output

Authors: Scott Roy, Cliff Brunk, Kyu-Young Kim, Justin Zhao, Markus Freitag, Mihir Kale, Gagan Bansal, Sidharth Mudgal, Chris Varano
Affiliations: Google, Inc.
Comments: 12 pages, 10 figures
Link: https://arxiv.org/abs/2107.04512
Abstract: One of the challenges in a task oriented natural language application like the Google Assistant, Siri, or Alexa is to localize the output to many languages. This paper explores doing this by applying machine translation to the English output. Using machine translation is very scalable, as it can work with any English output and can handle dynamic text, but otherwise the problem is a poor fit. The required quality bar is close to perfection, the range of sentences is extremely narrow, and the sentences are often very different than the ones in the machine translation training data. This combination of requirements is novel in the field of domain adaptation for machine translation. We are able to reach the required quality bar by building on existing ideas and adding new ones: finetuning on in-domain translations, adding sentences from the Web, adding semantic annotations, and using automatic error detection. The paper shares our approach and results, together with a distillation model to serve the translation models at scale.

【2】 A Survey on Low-Resource Neural Machine Translation

Authors: Rui Wang, Xu Tan, Renqian Luo, Tao Qin, Tie-Yan Liu
Affiliations: Microsoft Research Asia
Comments: A short version has been submitted to IJCAI2021 Survey Track on Feb. 26th, 2021, accepted on Apr. 16th, 2021. 14 pages, 4 figures
Link: https://arxiv.org/abs/2107.04239
Abstract: Neural approaches have achieved state-of-the-art accuracy on machine translation but suffer from the high cost of collecting large scale parallel data. Thus, a lot of research has been conducted for neural machine translation (NMT) with very limited parallel data, i.e., the low-resource setting. In this paper, we provide a survey for low-resource NMT and classify related works into three categories according to the auxiliary data they used: (1) exploiting monolingual data of source and/or target languages, (2) exploiting data from auxiliary languages, and (3) exploiting multi-modal data. We hope that our survey can help researchers to better understand this field and inspire them to design better algorithms, and help industry practitioners to choose appropriate algorithms for their applications.

Graph | Knowledge Graph | Knowledge (2 papers)

【1】 Learning Syntactic Dense Embedding with Correlation Graph for Automatic Readability Assessment

Authors: Xinying Qiu, Yuan Chen, Hanwu Chen, Jian-Yun Nie, Yuming Shen, Dawei Lu
Affiliations: School of Information Science and Technology, Guangdong University of Foreign Studies, China; Department of Computer Science and Operations Research, University of Montreal, Canada; School of Liberal Arts, Renmin University of China
Comments: Accepted to the 59th Annual Meeting of the Association for Computational Linguistics (ACL 2021)
Link: https://arxiv.org/abs/2107.04268
Abstract: Deep learning models for automatic readability assessment generally discard linguistic features traditionally used in machine learning models for the task. We propose to incorporate linguistic features into neural network models by learning syntactic dense embeddings based on linguistic features. To cope with the relationships between the features, we form a correlation graph among features and use it to learn their embeddings so that similar features will be represented by similar embeddings. Experiments with six data sets of two proficiency levels demonstrate that our proposed methodology can complement BERT-only model to achieve significantly better performances for automatic readability assessment.
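The correlation graph mentioned in this abstract can be illustrated with a minimal sketch. It assumes Pearson correlation and a fixed edge threshold, neither of which the abstract specifies, and the feature names and values are hypothetical toy data. Learning embeddings from the graph is not shown.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlation_graph(features, threshold=0.7):
    """Connect two features when |r| >= threshold (threshold is an assumption).

    features maps a feature name to its values, one value per document.
    Returns an adjacency dict: name -> {neighbor: correlation}.
    """
    names = sorted(features)
    graph = {name: {} for name in names}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(features[first], features[second])
            if abs(r) >= threshold:
                graph[first][second] = r
                graph[second][first] = r
    return graph


# Hypothetical linguistic features measured on four toy documents.
toy_features = {
    "avg_sentence_length": [10.0, 14.0, 18.0, 22.0],
    "avg_parse_tree_depth": [3.0, 4.0, 5.0, 6.5],
    "type_token_ratio": [0.6, 0.4, 0.7, 0.5],
}
graph = correlation_graph(toy_features)
for name, neighbors in graph.items():
    print(name, "->", sorted(neighbors))
```

In this toy data, the two syntactic features end up connected while the lexical feature stays isolated, which is the kind of neighborhood structure an embedding learner could then exploit.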

【2】 Levi Graph AMR Parser using Heterogeneous Attention

Authors: Han He, Jinho D. Choi
Affiliations: Computer Science, Emory University, Atlanta, GA, USA
Comments: Accepted in IWPT 2021: The 17th International Conference on Parsing Technologies
Link: https://arxiv.org/abs/2107.04152
Abstract: Coupled with biaffine decoders, transformers have been effectively adapted to text-to-graph transduction and achieved state-of-the-art performance on AMR parsing. Many prior works, however, rely on the biaffine decoder for either or both arc and label predictions although most features used by the decoder may be learned by the transformer already. This paper presents a novel approach to AMR parsing by combining heterogeneous data (tokens, concepts, labels) as one input to a transformer to learn attention, and use only attention matrices from the transformer to predict all elements in AMR graphs (concepts, arcs, labels). Although our models use significantly fewer parameters than the previous state-of-the-art graph parser, they show similar or better accuracy on AMR 2.0 and 3.0.
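For readers unfamiliar with the term in the title, a Levi graph turns every labeled edge into a node of its own, so concepts and labels live in one node set. The sketch below shows only that generic transformation on a toy AMR-like graph; the function name, the `label#index` naming scheme, and the example graph are assumptions for illustration, not the paper's parser.

```python
def to_levi_graph(nodes, labeled_edges):
    """Convert a graph with labeled edges into a Levi graph.

    Each edge (source, label, target) becomes a new label node plus two
    unlabeled edges: source -> label node -> target.
    """
    levi_nodes = list(nodes)
    levi_edges = []
    for index, (source, label, target) in enumerate(labeled_edges):
        # Suffix the label with the edge index so repeated labels
        # (for example two :ARG0 edges) stay distinct nodes.
        label_node = f"{label}#{index}"
        levi_nodes.append(label_node)
        levi_edges.append((source, label_node))
        levi_edges.append((label_node, target))
    return levi_nodes, levi_edges


# Toy AMR-like graph for "The boy wants to go" (frame names are illustrative).
amr_nodes = ["want-01", "boy", "go-01"]
amr_edges = [
    ("want-01", ":ARG0", "boy"),
    ("want-01", ":ARG1", "go-01"),
    ("go-01", ":ARG0", "boy"),
]

levi_nodes, levi_edges = to_levi_graph(amr_nodes, amr_edges)
print(levi_nodes)
print(levi_edges)
```

After the conversion, arcs carry no labels, so a model only has to predict nodes and unlabeled connections between them.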

Semi-/Weakly/Unsupervised | Uncertainty (2 papers)

【1】 Measuring and Improving Model-Moderator Collaboration using Uncertainty Estimation

Authors: Ian D. Kivlichan, Zi Lin, Jeremiah Liu, Lucy Vasserman
Affiliations: Jigsaw, Google Research
Comments: WOAH 2021
Link: https://arxiv.org/abs/2107.04212
Abstract: Content moderation is often performed by a collaboration between humans and machine learning models. However, it is not well understood how to design the collaborative process so as to maximize the combined moderator-model system performance. This work presents a rigorous study of this problem, focusing on an approach that incorporates model uncertainty into the collaborative process. First, we introduce principled metrics to describe the performance of the collaborative system under capacity constraints on the human moderator, quantifying how efficiently the combined system utilizes human decisions. Using these metrics, we conduct a large benchmark study evaluating the performance of state-of-the-art uncertainty models under different collaborative review strategies. We find that an uncertainty-based strategy consistently outperforms the widely used strategy based on toxicity scores, and moreover that the choice of review strategy drastically changes the overall system performance. Our results demonstrate the importance of rigorous metrics for understanding and developing effective moderator-model systems for content moderation, as well as the utility of uncertainty estimation in this domain.
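The contrast between the two review strategies in this abstract can be made concrete with a small sketch. It rests on strong simplifying assumptions that are not taken from the paper: uncertainty is approximated by how close a predicted probability is to 0.5, the human moderator is always correct, and the scores and labels are invented toy values. The function names are hypothetical.

```python
def choose_for_review(scores, capacity, strategy):
    """Pick which items go to the human moderator under a capacity limit."""
    indices = range(len(scores))
    if strategy == "uncertainty":
        # Most uncertain first: predicted probability closest to 0.5.
        ranked = sorted(indices, key=lambda i: abs(scores[i] - 0.5))
    elif strategy == "toxicity":
        # Highest toxicity score first.
        ranked = sorted(indices, key=lambda i: -scores[i])
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return set(ranked[:capacity])


def combined_accuracy(scores, labels, capacity, strategy):
    """Accuracy of the model-plus-moderator system (moderator assumed perfect)."""
    reviewed = choose_for_review(scores, capacity, strategy)
    correct = 0
    for i, (score, label) in enumerate(zip(scores, labels)):
        prediction = label if i in reviewed else int(score >= 0.5)
        correct += int(prediction == label)
    return correct / len(labels)


# Invented toy data: model toxicity probabilities and true labels (1 = toxic).
scores = [0.95, 0.90, 0.55, 0.45, 0.10, 0.52]
labels = [1, 1, 0, 1, 0, 0]

for strategy in ("uncertainty", "toxicity"):
    print(strategy, combined_accuracy(scores, labels, capacity=3, strategy=strategy))
```

The toy data is built so that the model's mistakes sit near the decision boundary, which is the situation where spending limited review capacity on uncertain items pays off; it illustrates the mechanism and does not reproduce the paper's results.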

【2】 Improved Language Identification Through Cross-Lingual Self-Supervised Learning

Authors: Andros Tjandra, Diptanu Gon Choudhury, Frank Zhang, Kritika Singh, Alexei Baevski, Assaf Sela, Yatharth Saraf, Michael Auli
Affiliations: Facebook AI, USA
Comments: Submitted to ASRU 2021
Link: https://arxiv.org/abs/2107.04082
Abstract: Language identification greatly impacts the success of downstream tasks such as automatic speech recognition. Recently, self-supervised speech representations learned by wav2vec 2.0 have been shown to be very effective for a range of speech tasks. We extend previous self-supervised work on language identification by experimenting with pre-trained models which were learned on real-world unconstrained speech in multiple languages and not just on English. We show that models pre-trained on many languages perform better and enable language identification systems that require very little labeled data to perform well. Results on a 25 languages setup show that with only 10 minutes of labeled data per language, a cross-lingually pre-trained model can achieve over 93% accuracy.

Detection (1 paper)

【1】 A Robust Deep Ensemble Classifier for Figurative Language Detection

Authors: Rolandos Alexandros Potamias, Georgios Siolas, Andreas-Georgios Stafylopatis
Affiliations: School of Electrical and Computer Engineering, National Technical University of Athens
Comments: Published in Engineering Applications of Neural Networks (EANN)-2019
Link: https://arxiv.org/abs/2107.04372
Abstract: Recognition and classification of Figurative Language (FL) is an open problem of Sentiment Analysis in the broader field of Natural Language Processing (NLP) due to the contradictory meaning contained in phrases with metaphorical content. The problem itself contains three interrelated FL recognition tasks: sarcasm, irony and metaphor which, in the present paper, are dealt with advanced Deep Learning (DL) techniques. First, we introduce a data prepossessing framework towards efficient data representation formats so that to optimize the respective inputs to the DL models. In addition, special features are extracted in order to characterize the syntactic, expressive, emotional and temper content reflected in the respective social media text references. These features aim to capture aspects of the social network user's writing method. Finally, features are fed to a robust, Deep Ensemble Soft Classifier (DESC) which is based on the combination of different DL techniques. Using three different benchmark datasets (one of them containing various FL forms) we conclude that the DESC model achieves a very good performance, worthy of comparison with relevant methodologies and state-of-the-art technologies in the challenging field of FL recognition.

Word2Vec | Text | Words (1 paper)

【1】 A Systematic Survey of Text Worlds as Embodied Natural Language Environments

Authors: Peter A Jansen
Affiliations: School of Information, University of Arizona
Comments: 18 pages
Link: https://arxiv.org/abs/2107.04132
Abstract: Text Worlds are virtual environments for embodied agents that, unlike 2D or 3D environments, are rendered exclusively using textual descriptions. These environments offer an alternative to higher-fidelity 3D environments due to their low barrier to entry, providing the ability to study semantics, compositional inference, and other high-level tasks with rich high-level action spaces while controlling for perceptual input. This systematic survey outlines recent developments in tooling, environments, and agent modeling for Text Worlds, while examining recent trends in knowledge graphs, common sense reasoning, transfer learning of Text World performance to higher-fidelity environments, as well as near-term development targets that, once achieved, make Text Worlds an attractive general research paradigm for natural language processing.

Other Neural Networks | Deep Learning | Models | Modeling (1 paper)

【1】 Can Deep Neural Networks Predict Data Correlations from Column Names?

Authors: Immanuel Trummer
Affiliations: Cornell Database Group, Ithaca, NY, USA
Link: https://arxiv.org/abs/2107.04553
Abstract: For humans, it is often possible to predict data correlations from column names. We conduct experiments to find out whether deep neural networks can learn to do the same. If so, e.g., it would open up the possibility of tuning tools that use NLP analysis on schema elements to prioritize their efforts for correlation detection. We analyze correlations for around 120,000 column pairs, taken from around 4,000 data sets. We try to predict correlations, based on column names alone. For predictions, we exploit pre-trained language models, based on the recently proposed Transformer architecture. We consider different types of correlations, multiple prediction methods, and various prediction scenarios. We study the impact of factors such as column name length or the amount of training data on prediction accuracy. Altogether, we find that deep neural networks can predict correlations with a relatively high accuracy in many scenarios (e.g., with an accuracy of 95% for long column names).
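The setup described in this abstract implies a labeling step: each pair of column names needs a ground-truth "correlated or not" label computed from the data before any model can be trained on names alone. The sketch below shows one way such labels could be produced. Pearson correlation, the 0.9 threshold, the function names, and the toy table are all assumptions for illustration; the abstract says only that several correlation types were considered.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def label_column_pairs(table, threshold=0.9):
    """Build (name_a, name_b, is_correlated) examples from a column-oriented table.

    The column names form the model input; the boolean is the target label.
    """
    examples = []
    for name_a, name_b in combinations(sorted(table), 2):
        r = pearson(table[name_a], table[name_b])
        examples.append((name_a, name_b, abs(r) >= threshold))
    return examples


# Hypothetical toy data set with three numeric columns.
toy_table = {
    "height_cm": [150.0, 160.0, 170.0, 180.0],
    "height_inches": [59.1, 63.0, 66.9, 70.9],
    "zip_code": [90210.0, 10001.0, 60601.0, 30301.0],
}
examples = label_column_pairs(toy_table)
for example in examples:
    print(example)
```

A classifier built on a pre-trained language model would then see only the two names, for example `height_cm` and `height_inches`, and be trained to predict the boolean label; that modeling step is not shown here.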

Other (2 papers)

【1】 Benchmarking for Biomedical Natural Language Processing Tasks with a Domain Specific ALBERT

Authors: Usman Naseem, Adam G. Dunn, Matloob Khushi, Jinman Kim
Affiliations: School of Computer Science, The University of Sydney, Sydney, Australia; School of Medical Science, The University of Sydney, Sydney, Australia
Link: https://arxiv.org/abs/2107.04374
Abstract: The availability of biomedical text data and advances in natural language processing (NLP) have made new applications in biomedical NLP possible. Language models trained or fine tuned using domain specific corpora can outperform general models, but work to date in biomedical NLP has been limited in terms of corpora and tasks. We present BioALBERT, a domain-specific adaptation of A Lite Bidirectional Encoder Representations from Transformers (ALBERT), trained on biomedical (PubMed and PubMed Central) and clinical (MIMIC-III) corpora and fine tuned for 6 different tasks across 20 benchmark datasets. Experiments show that BioALBERT outperforms the state of the art on named entity recognition (+11.09% BLURB score improvement), relation extraction (+0.80% BLURB score), sentence similarity (+1.05% BLURB score), document classification (+0.62% F1-score), and question answering (+2.83% BLURB score). It represents a new state of the art in 17 out of 20 benchmark datasets. By making BioALBERT models and data available, our aim is to help the biomedical NLP community avoid computational costs of training and establish a new set of baselines for future efforts across a broad range of biomedical NLP tasks.

【2】 UniRE: A Unified Label Space for Entity Relation Extraction

Authors: Yijun Wang, Changzhi Sun, Yuanbin Wu, Hao Zhou, Lei Li, Junchi Yan
Affiliations: Department of Computer Science and Engineering, Shanghai Jiao Tong University; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University; School of Computer Science and Technology, East China Normal University; ByteDance AI Lab
Comments: ACL2021
Link: https://arxiv.org/abs/2107.04292
Abstract: Many joint entity relation extraction models setup two separated label spaces for the two sub-tasks (i.e., entity detection and relation classification). We argue that this setting may hinder the information interaction between entities and relations. In this work, we propose to eliminate the different treatment on the two sub-tasks' label spaces. The input of our model is a table containing all word pairs from a sentence. Entities and relations are represented by squares and rectangles in the table. We apply a unified classifier to predict each cell's label, which unifies the learning of two sub-tasks. For testing, an effective (yet fast) approximate decoder is proposed for finding squares and rectangles from tables. Experiments on three benchmarks (ACE04, ACE05, SciERC) show that, using only half the number of parameters, our model achieves competitive accuracy with the best extractor, and is faster.
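The table representation described in this abstract, with entities as squares and relations as rectangles over word pairs, can be pictured with a short sketch. The function name, the `O` label for empty cells, the choice to fill only the head-to-tail direction, and the example sentence are assumptions for illustration; the abstract does not give these details, and the classifier and decoder are not shown.

```python
def build_label_table(num_words, entities, relations, empty="O"):
    """Fill a word-pair table with entity squares and relation rectangles.

    entities:  list of (start, end, entity_type), end exclusive.
    relations: list of (head_index, tail_index, relation_type), where the
               indices point into the entities list.
    """
    table = [[empty] * num_words for _ in range(num_words)]
    # An entity span occupies a square on the diagonal.
    for start, end, entity_type in entities:
        for row in range(start, end):
            for col in range(start, end):
                table[row][col] = entity_type
    # A relation occupies the rectangle: head-span rows x tail-span columns.
    for head_index, tail_index, relation_type in relations:
        head_start, head_end, _ = entities[head_index]
        tail_start, tail_end, _ = entities[tail_index]
        for row in range(head_start, head_end):
            for col in range(tail_start, tail_end):
                table[row][col] = relation_type
    return table


# Hypothetical toy sentence with ACE-style labels.
words = ["David", "Perkins", "lives", "in", "California"]
entities = [(0, 2, "PER"), (4, 5, "GPE")]
relations = [(0, 1, "PHYS")]

table = build_label_table(len(words), entities, relations)
for word, row in zip(words, table):
    print(f"{word:<12}", " ".join(f"{label:<5}" for label in row))
```

Because entity and relation labels share one table, a single classifier over cells can cover both sub-tasks, which is the unification the abstract refers to.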


Originally published: 2021-07-12
