
Natural Language Processing Academic Digest [12.23]

Author: 公众号 arXiv每日学术速递 (WeChat public account)
Originally published: 2021-12-23; posted to Tencent Cloud Community: 2021-12-27

cs.CL: 34 papers today

Transformer (4 papers)

【1】 Trees in transformers: a theoretical analysis of the Transformer's ability to represent trees
Link: https://arxiv.org/abs/2112.11913

Authors: Qi He, João Sedoc, Jordan Rodu
Affiliations: 1 New York University, 2 New York University, 3 University of Virginia
Abstract: Transformer networks are the de facto standard architecture in natural language processing. To date, there are no theoretical analyses of the Transformer's ability to capture tree structures. We focus on the ability of Transformer networks to learn tree structures that are important for tree transduction problems. We first analyze the theoretical capability of the standard Transformer architecture to learn tree structures given enumeration of all possible tree backbones, which we define as trees without labels. We then prove that two linear layers with ReLU activation function can recover any tree backbone from any two nonzero, linearly independent starting backbones. This implies that a Transformer can learn tree structures well in theory. We conduct experiments with synthetic data and find that the standard Transformer achieves similar accuracy compared to a Transformer where tree position information is explicitly encoded, albeit with slower convergence. This confirms empirically that Transformers can learn tree structures.

【2】 Domain Adaptation with Pre-trained Transformers for Query Focused Abstractive Text Summarization
Link: https://arxiv.org/abs/2112.11670

Authors: Md Tahmid Rahman Laskar, Enamul Hoque, Jimmy Xiangji Huang
Affiliations: Dialpad Canada Inc.; Information Retrieval & Knowledge Management Research Lab, York University; School of Information Technology, York University, Toronto, Ontario, Canada
Note: The final version will be published in the Computational Linguistics journal
Abstract: The Query Focused Text Summarization (QFTS) task aims at building systems that generate the summary of the text document(s) based on the given query. A key challenge in addressing this task is the lack of large labeled data for training the summarization model. In this paper, we address this challenge by exploring a series of domain adaptation techniques. Given the recent success of pre-trained transformer models in a wide range of natural language processing tasks, we utilize such models to generate abstractive summaries for the QFTS task for both single-document and multi-document scenarios. For domain adaptation, we apply a variety of techniques using pre-trained transformer-based summarization models, including transfer learning, weakly supervised learning, and distant supervision. Extensive experiments on six datasets show that our proposed approach is very effective in generating abstractive summaries for the QFTS task while setting new state-of-the-art results on several datasets across a set of automatic and human evaluation metrics.
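
As a concrete illustration of this setting, the sketch below shows one common way to condition a pre-trained seq2seq summarizer on a query: prepend the query to the document. The checkpoint name, separator and generation settings are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch: query-conditioned abstractive summarization with a generic
# pre-trained seq2seq model. Checkpoint and separator are assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "facebook/bart-large-cnn"  # assumed summarization checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

query = "What caused the outage?"
document = ("The service went down for six hours on Monday. Engineers traced "
            "the failure to an expired certificate on the load balancer.")

# Prepend the query so the decoder is conditioned on it.
inputs = tokenizer(f"{query} </s> {document}", return_tensors="pt",
                   truncation=True, max_length=1024)
summary_ids = model.generate(**inputs, max_length=60, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```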

【3】 Diformer: Directional Transformer for Neural Machine Translation
Link: https://arxiv.org/abs/2112.11632

Authors: Minghan Wang, Jiaxin Guo, Yuxia Wang, Daimeng Wei, Hengchao Shang, Chang Su, Yimeng Chen, Yinglu Li, Min Zhang, Shimin Tao, Hao Yang
Affiliations: Huawei Translation Services Center, Beijing, China; The University of Melbourne, Melbourne, Australia
Abstract: Autoregressive (AR) and non-autoregressive (NAR) models have their own superiority in performance and latency; combining them into one model may take advantage of both. Current combination frameworks focus more on the integration of multiple decoding paradigms with a unified generative model, e.g. a Masked Language Model. However, the generalization can be harmful to the performance due to the gap between training objective and inference. In this paper, we aim to close the gap by preserving the original objectives of AR and NAR under a unified framework. Specifically, we propose the Directional Transformer (Diformer) by jointly modelling AR and NAR into three generation directions (left-to-right, right-to-left and straight) with a newly introduced direction variable, which works by controlling the prediction of each token to have specific dependencies under that direction. The unification achieved by direction successfully preserves the original dependency assumptions used in AR and NAR, retaining both generalization and performance. Experiments on 4 WMT benchmarks demonstrate that Diformer outperforms current unified-modelling works by more than 1.5 BLEU points for both AR and NAR decoding, and is also competitive with state-of-the-art independent AR and NAR models.
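
The direction variable essentially decides which positions each target token may depend on. Below is a minimal reconstruction of the three attention masks implied by the abstract (left-to-right, right-to-left, and "straight", i.e. no cross-token dependency); this is my reading of the idea, not the authors' code.

```python
import torch

def direction_mask(n: int, direction: str) -> torch.Tensor:
    """Boolean mask where entry (i, j) = True means token i may attend to j.
    A reconstruction of the three Diformer directions, not the authors' code."""
    if direction == "l2r":       # each token sees itself and the left context
        return torch.tril(torch.ones(n, n, dtype=torch.bool))
    if direction == "r2l":       # each token sees itself and the right context
        return torch.triu(torch.ones(n, n, dtype=torch.bool))
    if direction == "straight":  # no cross-token dependency (NAR-style)
        return torch.eye(n, dtype=torch.bool)
    raise ValueError(f"unknown direction: {direction}")

print(direction_mask(4, "l2r").int())
```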

【4】 Mixed Precision Quantization of Transformer Language Models for Speech Recognition
Link: https://arxiv.org/abs/2112.11540

Authors: Junhao Xu, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Meng
Affiliations: The Chinese University of Hong Kong, Hong Kong SAR, China
Note: arXiv admin note: substantial text overlap with arXiv:2112.11438, arXiv:2111.14479
Abstract: State-of-the-art neural language models represented by Transformers are becoming increasingly complex and expensive for practical applications. Low-bit deep neural network quantization techniques provide a powerful solution to dramatically reduce their model size. Current low-bit quantization methods are based on uniform precision and fail to account for the varying performance sensitivity of different parts of the system to quantization errors. To this end, novel mixed precision DNN quantization methods are proposed in this paper. The optimal local precision settings are automatically learned using two techniques. The first is based on a quantization sensitivity metric in the form of Hessian trace weighted quantization perturbation. The second is based on mixed precision Transformer architecture search. Alternating direction methods of multipliers (ADMM) are used to efficiently train mixed precision quantized DNN systems. Experiments conducted on Penn Treebank (PTB) and a Switchboard-corpus-trained LF-MMI TDNN system suggest that the proposed mixed precision Transformer quantization techniques achieved model size compression ratios of up to 16 times over the full precision baseline with no recognition performance degradation. When used to compress a larger full precision Transformer LM with more layers, overall word error rate (WER) reductions of up to 1.7% absolute (18% relative) were obtained.
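
The first sensitivity metric can be sketched generically: estimate the Hessian trace with Hutchinson's method and weight it by the perturbation a quantizer introduces. A minimal sketch assuming a uniform symmetric quantizer; the paper's exact formulation may differ.

```python
import torch

def hessian_trace(loss, params, n_samples=16):
    """Hutchinson estimator: E[v^T H v] = tr(H) for Rademacher vectors v."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    est = 0.0
    for _ in range(n_samples):
        vs = [torch.randint(0, 2, p.shape, device=p.device).float() * 2 - 1
              for p in params]
        hv = torch.autograd.grad(grads, params, grad_outputs=vs,
                                 retain_graph=True)
        est += sum((h * v).sum() for h, v in zip(hv, vs)).item()
    return est / n_samples

def sensitivity(trace, weights, n_bits):
    """Trace-weighted quantization perturbation: tr(H) * ||Q(w) - w||^2,
    assuming a uniform symmetric quantizer (my assumption)."""
    scale = weights.abs().max() / (2 ** (n_bits - 1) - 1)
    q = torch.round(weights / scale) * scale
    return trace * (q - weights).pow(2).sum().item()

w = torch.nn.Parameter(torch.randn(8))
loss = (w ** 2).sum()                    # toy loss; Hessian = 2I, tr(H) = 16
print(hessian_trace(loss, [w]))          # should be close to 16
print(sensitivity(hessian_trace(loss, [w]), w.detach(), n_bits=4))
```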

QA | VQA | Question Answering | Dialogue (1 paper)

【1】 Few-shot Multi-hop Question Answering over Knowledge Base
Link: https://arxiv.org/abs/2112.11909

Author: Fan Meihao
Abstract: Previous work on Chinese Knowledge Base Question Answering has been restricted by the lack of complex Chinese semantic parsing datasets and the exponential growth of the search space with the length of relation paths. This paper proposes an efficient pipeline method equipped with a pre-trained language model and a strategy to construct artificial training samples, which needs only a small amount of data but performs well on the open-domain complex Chinese Question Answering task. Besides, by adopting a beam search algorithm in which a language model scores candidate query tuples, we slow the growth of relation paths when generating multi-hop query paths. Finally, we evaluate our model on the CCKS2019 Complex Question Answering via Knowledge Base task and achieve an F1-score of 62.55% on the test dataset. Moreover, when training with only 10% of the data, our model still achieves an F1-score of 58.54%. The results show the capability of our model to handle the KBQA task and its advantage in few-shot learning.
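
The beam search over relation paths can be sketched independently of any concrete knowledge base: keep only the best-scoring partial paths at each hop. `get_relations` and `score_path` (an LM-based scorer in the paper) are hypothetical placeholders, and the toy KB is invented.

```python
import heapq

def beam_search_paths(entity, get_relations, score_path, max_hops=2, beam=2):
    """Expand multi-hop relation paths, pruning to `beam` candidates per hop."""
    beams = [((), 0.0)]                     # (relation path, score)
    for _ in range(max_hops):
        candidates = []
        for path, _ in beams:
            for rel in get_relations(entity, path):
                new_path = path + (rel,)
                candidates.append((new_path, score_path(entity, new_path)))
        beams = heapq.nlargest(beam, candidates, key=lambda c: c[1])
    return beams

# Invented toy KB: maps (entity, partial path) to outgoing relations.
KB = {("q",): ["born_in", "works_at"],
      ("q", "born_in"): ["capital_of"],
      ("q", "works_at"): ["located_in"]}

print(beam_search_paths(
    "q",
    get_relations=lambda e, p: KB.get((e,) + p, []),
    score_path=lambda e, p: -0.1 * len(p)))  # stand-in for an LM score
```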

Machine Translation (3 papers)

【1】 Joint-training on Symbiosis Networks for Deep Nueral Machine Translation models
Link: https://arxiv.org/abs/2112.11642

Authors: Zhengzhe Yu, Jiaxin Guo, Minghan Wang, Daimeng Wei, Hengchao Shang, Zongyao Li, Zhanglin Wu, Yuxia Wang, Yimeng Chen, Chang Su, Min Zhang, Lizhi Lei, Shimin Tao, Hao Yang
Affiliations: Huawei Translation Services Center, Beijing, China; The University of Melbourne, Melbourne, Australia
Abstract: Deep encoders have been proven to be effective in improving neural machine translation (NMT) systems, but translation quality reaches an upper bound when the number of encoder layers exceeds 18. Worse still, deeper networks consume a lot of memory, making them impossible to train efficiently. In this paper, we present Symbiosis Networks, which include a full network as the Symbiosis Main Network (M-Net) and another shared sub-network with the same structure but fewer layers as the Symbiotic Sub Network (S-Net). We adopt Symbiosis Networks on the Transformer-deep (m-n) architecture and define a particular regularization loss $\mathcal{L}_{\tau}$ between the M-Net and S-Net in NMT. We apply joint-training on the Symbiosis Networks and aim to improve the M-Net performance. Our proposed training strategy improves Transformer-deep (12-6) by 0.61, 0.49 and 0.69 BLEU over the baselines under classic training on the WMT'14 EN->DE, DE->EN and EN->FR tasks. Furthermore, our Transformer-deep (12-6) even outperforms the classic Transformer-deep (18-6).
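
A minimal sketch of the joint objective: cross-entropy for both the M-Net and the S-Net plus a regularizer $\mathcal{L}_{\tau}$ pulling their output distributions together. The abstract does not specify the form of $\mathcal{L}_{\tau}$; the KL term below is my assumption.

```python
import torch
import torch.nn.functional as F

def symbiosis_loss(logits_m, logits_s, target, tau=1.0):
    """Joint-training loss sketch: CE for M-Net and S-Net, plus an assumed
    KL regularizer aligning the S-Net distribution with the M-Net's."""
    ce_m = F.cross_entropy(logits_m, target)
    ce_s = F.cross_entropy(logits_s, target)
    l_tau = F.kl_div(F.log_softmax(logits_s, dim=-1),
                     F.softmax(logits_m, dim=-1), reduction="batchmean")
    return ce_m + ce_s + tau * l_tau

logits_m, logits_s = torch.randn(4, 10), torch.randn(4, 10)
print(symbiosis_loss(logits_m, logits_s, torch.randint(0, 10, (4,))))
```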

【2】 Self-Distillation Mixup Training for Non-autoregressive Neural Machine Translation
Link: https://arxiv.org/abs/2112.11640

Authors: Jiaxin Guo, Minghan Wang, Daimeng Wei, Hengchao Shang, Yuxia Wang, Zongyao Li, Zhengzhe Yu, Zhanglin Wu, Yimeng Chen, Chang Su, Min Zhang, Lizhi Lei, Shimin Tao, Hao Yang
Affiliations: Huawei Translation Services Center, Beijing, China; The University of Melbourne, Melbourne, Australia
Abstract: Recently, non-autoregressive (NAT) models, which predict outputs in parallel, have achieved substantial improvements in generation speed compared to autoregressive (AT) models. While performing worse on raw data, most NAT models are trained as student models on distilled data generated by AT teacher models, which is known as sequence-level Knowledge Distillation. An effective training strategy to improve the performance of AT models is Self-Distillation Mixup (SDM) Training, which pre-trains a model on raw data, generates distilled data with the pre-trained model itself, and finally re-trains a model on the combination of raw data and distilled data. In this work, we aim to apply SDM to NAT models, but find that directly adopting SDM for NAT models yields no improvement in translation quality. Through careful analysis, we observe that the ineffectiveness is correlated with the modeling diversity and confirmation bias between the AT teacher model and the NAT student models. Based on these findings, we propose an enhanced strategy named SDMRT that adds two stages to classic SDM: one is Pre-Rerank on self-distilled data, the other is Fine-Tune on filtered teacher-distilled data. Our results outperform baselines by 0.6 to 1.2 BLEU on multiple NAT models. As another bonus, for iterative refinement NAT models, our methods can outperform baselines within half the number of iterations, which means a 2X acceleration.

【3】 English2Gbe: A multilingual machine translation model for {Fon/Ewe} Gbe
Link: https://arxiv.org/abs/2112.11482

Author: Gilles Hacheme
Affiliations: Masakhane NLP; Ai4Innov
Abstract: Language is an essential factor of emancipation. Unfortunately, most of the more than 2,000 African languages are low-resourced. The community has recently used machine translation to revive and strengthen several African languages. However, the trained models are often bilingual, resulting in a potentially exponential number of models to train and maintain to cover all possible translation directions. Additionally, bilingual models do not leverage the similarity between some of the languages. Consequently, multilingual neural machine translation (NMT) is gaining considerable interest, especially for low-resourced languages. Nevertheless, its adoption by the community is still limited. This paper introduces English2Gbe, a multilingual NMT model capable of translating from English to Ewe or Fon. Using BLEU, CHRF, and TER scores computed with the Sacrebleu (Post, 2018) package for reproducibility, we show that English2Gbe outperforms bilingual models (English to Ewe and English to Fon) and gives state-of-the-art results on the JW300 benchmark for Fon established by Nekoto et al. (2020). We hope this work will contribute to the massive adoption of multilingual models within the community. Our code is accessible on GitHub.
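
Since the paper stresses reproducible scoring with the sacrebleu package, this is what computing the three reported metrics looks like; the sentences are placeholders, not data from the paper.

```python
import sacrebleu

hyps = ["the cat sat on the mat"]
refs = [["the cat is on the mat"]]  # one reference stream

print(sacrebleu.corpus_bleu(hyps, refs).score)
print(sacrebleu.corpus_chrf(hyps, refs).score)
print(sacrebleu.corpus_ter(hyps, refs).score)
```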

Semantic Analysis (1 paper)

【1】 Hierarchical Cross-Modality Semantic Correlation Learning Model for Multimodal Summarization
Link: https://arxiv.org/abs/2112.12072

Authors: Litian Zhang, Xiaoming Zhang, Junshu Pan, Feiran Huang
Affiliations: School of Cyber Science and Technology, Beihang University, Beijing, China; College of Cyber Security, Jinan University, Guangzhou, China
Note: Accepted by AAAI 2022
Abstract: Multimodal summarization with multimodal output (MSMO) generates a summary with both textual and visual content. Multimodal news reports contain heterogeneous content, which makes MSMO nontrivial. Moreover, it is observed that different modalities of data in a news report correlate hierarchically. Traditional MSMO methods handle the different modalities indistinguishably by learning a representation for the whole data, which is not directly adaptable to the heterogeneous content and hierarchical correlation. In this paper, we propose a hierarchical cross-modality semantic correlation learning model (HCSCL) to learn the intra- and inter-modal correlation existing in multimodal data. HCSCL adopts a graph network to encode the intra-modal correlation. Then, a hierarchical fusion framework is proposed to learn the hierarchical correlation between text and images. Furthermore, we construct a new dataset with relevant image annotations and image object label information to provide supervision for the learning procedure. Extensive experiments on the dataset show that HCSCL significantly outperforms the baseline methods in automatic summarization metrics and fine-grained diversity tests.

Reasoning | Analysis | Understanding | Explanation (3 papers)

【1】 Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding
Link: https://arxiv.org/abs/2112.11953

Authors: Xiao Xu, Libo Qin, Kaiji Chen, Guoxing Wu, Linlin Li, Wanxiang Che
Affiliations: Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology, China; Huawei Technologies Co., Ltd.
Note: Accepted by AAAI 2022
Abstract: Current research on spoken language understanding (SLU) is largely limited to a simple setting: plain text-based SLU, which takes the user utterance as input and generates its corresponding semantic frames (e.g., intent and slots). Unfortunately, such a simple setting may fail in complex real-world scenarios when an utterance is semantically ambiguous, which text-based SLU models cannot resolve. In this paper, we first introduce a new and important task, Profile-based Spoken Language Understanding (ProSLU), which requires a model to rely not only on the plain text but also on supporting profile information to predict the correct intents and slots. To this end, we further introduce a large-scale human-annotated Chinese dataset with over 5K utterances and their corresponding supporting profile information (Knowledge Graph (KG), User Profile (UP), Context Awareness (CA)). In addition, we evaluate several state-of-the-art baseline models and explore a multi-level knowledge adapter to effectively incorporate profile information. Experimental results reveal that all existing text-based SLU models fail to work when the utterances are semantically ambiguous, and our proposed framework can effectively fuse the supporting information for sentence-level intent detection and token-level slot filling. Finally, we summarize key challenges and provide new directions for future research, which we hope will facilitate further work in this area.

【2】 CRASS: A Novel Data Set and Benchmark to Test Counterfactual Reasoning of Large Language Models
Link: https://arxiv.org/abs/2112.11941

Authors: Jörg Frohberg, Frank Binder
Affiliations: apergo UG; Institute for Applied Informatics at Leipzig University (InfAI)
Note: 8 pages including references, plus 3-page appendix
Abstract: We introduce the CRASS (counterfactual reasoning assessment) data set and benchmark utilizing questionized counterfactual conditionals as a novel and powerful tool to evaluate large language models. We present the data set design and benchmark as well as the accompanying API that supports scoring against a crowd-validated human baseline. We test six state-of-the-art models against our benchmark. Our results show that it poses a valid challenge for these models and opens up considerable room for their improvement.

【3】 Multimodal Analysis of memes for sentiment extraction
Link: https://arxiv.org/abs/2112.11850

Authors: Nayan Varma Alluri, Neeli Dheeraj Krishna
Affiliations: Department of Computer Science and Engineering, PES University, Bangalore, India
Note: 5 pages
Abstract: Memes are one of the most ubiquitous forms of social media communication. The study and processing of memes, which are intrinsically multimedia, is a popular topic right now. The study presented in this research is based on the Memotion dataset, which involves categorising memes based on irony, comedy, motivation, and overall sentiment. Three separate innovative transformer-based techniques have been developed, and their outcomes have been thoroughly reviewed. The best algorithm achieved a macro F1 score of 0.633 for humour classification, 0.55 for motivation classification, 0.61 for sarcasm classification, and 0.575 for the overall sentiment of the meme across all our techniques.

GAN | Adversarial | Attacks | Generation (5 papers)

【1】 A Label Dependence-aware Sequence Generation Model for Multi-level Implicit Discourse Relation Recognition
Link: https://arxiv.org/abs/2112.11740

Authors: Changxing Wu, Liuwen Cao, Yubin Ge, Yang Liu, Min Zhang, Jinsong Su
Affiliations: School of Software, East China Jiaotong University, Nanchang, China; University of Illinois at Urbana-Champaign, Urbana, IL, USA; Tsinghua University, Beijing, China; Soochow University, Soochow, China; Xiamen University, Xiamen, China
Note: Accepted at AAAI 2022
Abstract: Implicit discourse relation recognition (IDRR) is a challenging but crucial task in discourse analysis. Most existing methods train multiple models to predict multi-level labels independently, while ignoring the dependence between hierarchically structured labels. In this paper, we consider multi-level IDRR as a conditional label sequence generation task and propose a Label Dependence-aware Sequence Generation Model (LDSGM) for it. Specifically, we first design a label attentive encoder to learn the global representation of an input instance and its level-specific contexts, where the label dependence is integrated to obtain better label embeddings. Then, we employ a label sequence decoder to output the predicted labels in a top-down manner, where the predicted higher-level labels are directly used to guide the label prediction at the current level. We further develop a mutual learning enhanced training method to exploit the label dependence in the bottom-up direction, which is captured by an auxiliary decoder introduced during training. Experimental results on the PDTB dataset show that our model achieves state-of-the-art performance on multi-level IDRR. We will release our code at https://github.com/nlpersECJTU/LDSGM.
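
The top-down decoding idea can be sketched as follows: the embedding of the label predicted at one level conditions the classifier at the next. The layer sizes, fusion by concatenation, and greedy argmax below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class TopDownLabelDecoder(nn.Module):
    """Sketch: the predicted higher-level label guides the next level."""
    def __init__(self, hidden=256, level_sizes=(4, 16, 100), emb=64):
        super().__init__()
        self.label_emb = nn.ModuleList(nn.Embedding(n, emb) for n in level_sizes)
        self.heads = nn.ModuleList(
            nn.Linear(hidden + (emb if i > 0 else 0), n)
            for i, n in enumerate(level_sizes))

    def forward(self, h):                  # h: (batch, hidden) instance encoding
        preds, prev = [], None
        for i, head in enumerate(self.heads):
            x = h if prev is None else torch.cat(
                [h, self.label_emb[i - 1](prev)], dim=-1)
            logits = head(x)
            prev = logits.argmax(dim=-1)   # predicted label conditions next level
            preds.append(logits)
        return preds

outputs = TopDownLabelDecoder()(torch.randn(2, 256))
print([o.shape for o in outputs])          # logits for each of the 3 levels
```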

【2】 A Survey of Natural Language Generation
Link: https://arxiv.org/abs/2112.11739

Authors: Chenhe Dong, Yinghui Li, Haifan Gong, Miaoxin Chen, Junxin Li, Ying Shen, Min Yang
Affiliations: Sun Yat-Sen University; Tsinghua University
Note: 36 pages, 4 tables; under review
Abstract: This paper offers a comprehensive review of the research on Natural Language Generation (NLG) over the past two decades, especially in relation to data-to-text generation and text-to-text generation deep learning methods, as well as new applications of NLG technology. This survey aims to (a) give the latest synthesis of deep learning research on the NLG core tasks, as well as the architectures adopted in the field; (b) detail meticulously and comprehensively various NLG tasks and datasets, and draw attention to the challenges in NLG evaluation, focusing on different evaluation methods and their relationships; (c) highlight some future emphases and relatively recent research issues that arise due to the increasing synergy between NLG and other artificial intelligence areas, such as computer vision, text and computational creativity.

【3】 How Should Pre-Trained Language Models Be Fine-Tuned Towards Adversarial Robustness?
Link: https://arxiv.org/abs/2112.11668

Authors: Xinhsuai Dong, Luu Anh Tuan, Min Lin, Shuicheng Yan, Hanwang Zhang
Affiliations: Nanyang Technological University; Sea AI Lab
Note: Accepted by NeurIPS 2021
Abstract: The fine-tuning of pre-trained language models has achieved great success in many NLP fields. Yet, it is strikingly vulnerable to adversarial examples; e.g., word substitution attacks using only synonyms can easily fool a BERT-based sentiment analysis model. In this paper, we demonstrate that adversarial training, the prevalent defense technique, does not directly fit the conventional fine-tuning scenario, because it suffers severely from catastrophic forgetting: failing to retain the generic and robust linguistic features that have already been captured by the pre-trained model. In this light, we propose Robust Informative Fine-Tuning (RIFT), a novel adversarial fine-tuning method from an information-theoretical perspective. In particular, RIFT encourages an objective model to retain the features learned from the pre-trained model throughout the entire fine-tuning process, whereas a conventional one only uses the pre-trained weights for initialization. Experimental results show that RIFT consistently outperforms the state of the art on two popular NLP tasks: sentiment analysis and natural language inference, under different attacks across various pre-trained language models.

【4】 BACON: Deep-Learning Powered AI for Poetry Generation with Author Linguistic Style Transfer
Link: https://arxiv.org/abs/2112.11483

Author: Alejandro Rodriguez Pascual
Affiliation: Campolindo High School, Moraga, CA. This paper was presented at the California Science & Engineering Fair, Los Angeles, CA; it is the result of research conducted by the author while in high school.
Note: 9 pages, 7 figures, independent high school research project
Abstract: This paper describes BACON, a basic prototype of an automatic poetry generator with author linguistic style transfer. It combines concepts and techniques from finite state machinery, probabilistic models, artificial neural networks and deep learning to write original poetry with rich aesthetic qualities in the style of any given author. Extrinsic evaluation of the output generated by BACON shows that participants were unable to tell the difference between human- and AI-generated poems in any statistically significant way.

【5】 Translating Human Mobility Forecasting through Natural Language Generation
Link: https://arxiv.org/abs/2112.11481

Authors: Hao Xue, Flora D. Salim, Yongli Ren, Charles L. A. Clarke
Affiliations: School of Computing Technologies, RMIT University, Melbourne, Victoria, Australia; University of Waterloo, Waterloo, Ontario, Canada
Note: Accepted at WSDM 2022
Abstract: Existing human mobility forecasting models follow the standard design of time-series prediction models, which take a series of numerical values as input and generate a numerical value as the prediction. Although treating this as a regression problem seems straightforward, incorporating various contextual information, such as the semantic category information of each Place-of-Interest (POI), is a necessary step, and often the bottleneck, in designing an effective mobility prediction model. As opposed to the typical approach, we treat forecasting as a translation problem and propose novel forecasting through a language generation pipeline. The paper aims to address the human mobility forecasting problem as a language translation task in a sequence-to-sequence manner. A mobility-to-language template is first introduced to describe the numerical mobility data as natural language sentences. The core intuition of the human mobility forecasting translation task is to convert the input mobility description sentences into a future mobility description from which the prediction target can be obtained. Under this pipeline, a two-branch network, SHIFT (Translating Human Mobility Forecasting), is designed. Specifically, it consists of one main branch for language generation and one auxiliary branch to directly learn mobility patterns. During training, we develop a momentum mode for better connecting and training the two branches. Extensive experiments on three real-world datasets demonstrate that the proposed SHIFT is effective and presents a new revolutionary approach to forecasting human mobility.
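
A toy version of the mobility-to-language step: numeric visit counts are verbalized as a sentence, and the model's generated "future" sentence would be parsed back into a prediction. The wording below is my own illustration, not the paper's template.

```python
def mobility_to_sentence(place, counts):
    """Verbalize a week of visit counts as a forecasting prompt (toy template)."""
    history = ", ".join(str(c) for c in counts)
    return (f"From Monday to Sunday, there were {history} visits to {place}. "
            f"How many visits will there be to {place} next Monday?")

print(mobility_to_sentence("the coffee shop", [12, 15, 9, 14, 20, 31, 28]))
```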

Detection (1 paper)

【1】 AtteSTNet -- An attention and subword tokenization based approach for code-switched Hindi-English hate speech detection
Link: https://arxiv.org/abs/2112.11479

Authors: Vedangi Wagh, Geet Shingi
Affiliations: Pune Institute of Computer Technology, Maharashtra, India
Abstract: Recent advancements in technology have led to a boost in social media usage, which has ultimately led to large amounts of user-generated data, including hateful and offensive speech. The language used on social media is often a combination of English and the native language of the region. In India, Hindi is used predominantly and is often code-switched with English, giving rise to the Hinglish (Hindi+English) language. Various approaches have been taken in the past to classify code-mixed Hinglish hate speech using different machine learning and deep learning-based techniques. However, these techniques make use of recurrence or convolution mechanisms, which are computationally expensive and have high memory requirements. Past techniques also make use of complex data processing, making the existing approaches very complex and hard to sustain as the data changes. We propose a much simpler approach which is not only on par with these complex networks but also exceeds their performance with the use of subword tokenization algorithms like BPE and Unigram along with a multi-head attention-based technique, giving an accuracy of 87.41% and an F1 score of 0.851 on standard datasets. Efficient use of the BPE and Unigram algorithms helps handle the non-conventional Hinglish vocabulary, making our technique simple, efficient and sustainable for real-world use.
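
A minimal sketch of the subword-tokenization ingredient using the HuggingFace tokenizers library: training a small BPE model on code-switched text so unconventional Hinglish spellings decompose into known subwords. The corpus lines are invented and the settings are not the paper's.

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.trainers import BpeTrainer
from tokenizers.pre_tokenizers import Whitespace

corpus = ["yeh movie bahut acchi hai",
          "kya scene hai bhai",
          "this weekend ka plan kya hai"]

tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()
trainer = BpeTrainer(vocab_size=200, special_tokens=["[UNK]"])
tokenizer.train_from_iterator(corpus, trainer)   # learn merges from the corpus

print(tokenizer.encode("acchi plan bhai").tokens)
```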

Recognition / Classification (2 papers)

【1】 ALP: Data Augmentation using Lexicalized PCFGs for Few-Shot Text Classification
Link: https://arxiv.org/abs/2112.11916

Authors: Hazel Kim, Daecheol Woo, Seong Joon Oh, Jeong-Won Cha, Yo-Sub Han
Affiliations: Yonsei University, Seoul, Republic of Korea; NAVER AI Lab; Changwon National University, Changwon, Republic of Korea
Note: Accepted to AAAI 2022
Abstract: Data augmentation has been an important ingredient for boosting the performance of learned models. Prior data augmentation methods for few-shot text classification have led to great performance boosts. However, they have not been designed to capture the intricate compositional structure of natural language. As a result, they fail to generate samples with plausible and diverse sentence structures. Motivated by this, we present data Augmentation using Lexicalized Probabilistic context-free grammars (ALP), which generates augmented samples with diverse syntactic structures and plausible grammar. The lexicalized PCFG parse trees consider both the constituents and dependencies to produce a syntactic frame that maximizes the variety of word choices in a syntactically preservable manner, without requiring specific domain experts. Experiments on few-shot text classification tasks demonstrate that ALP enhances many state-of-the-art classification methods. As a second contribution, we delve into the train-val splitting methodologies when a data augmentation method comes into play. We argue empirically that the traditional splitting of training and validation sets is sub-optimal compared to our novel augmentation-based splitting strategies, which further expand the training split with the same number of labeled data. Taken together, our contributions on the data augmentation strategies yield a strong training recipe for few-shot text classification tasks.
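
The augmentation idea can be illustrated with a toy PCFG sampler. Note that ALP induces lexicalized PCFGs from parse trees of the actual training data; this stripped-down sketch hand-writes a grammar only to show how sampling weighted rules yields varied but grammatical sentences.

```python
import random

# Hand-written toy grammar: symbol -> list of (right-hand side, probability).
RULES = {
    "S":  [(("NP", "VP"), 1.0)],
    "NP": [(("the", "N"), 0.7), (("a", "N"), 0.3)],
    "VP": [(("V", "NP"), 0.6), (("V",), 0.4)],
    "N":  [(("movie",), 0.5), (("review",), 0.5)],
    "V":  [(("impressed",), 0.5), (("bored",), 0.5)],
}

def sample(symbol="S"):
    """Recursively expand a symbol by sampling one weighted production."""
    if symbol not in RULES:                      # terminal word
        return [symbol]
    rhss, probs = zip(*RULES[symbol])
    rhs = random.choices(rhss, weights=probs)[0]
    return [tok for s in rhs for tok in sample(s)]

print(" ".join(sample()))                        # e.g. "the movie bored a review"
```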

【2】 Hybrid Curriculum Learning for Emotion Recognition in Conversation
Link: https://arxiv.org/abs/2112.11718

Authors: Lin Yang, Yi Shen, Yue Mao, Longjun Cai
Affiliations: Alibaba Group, Beijing, China
Note: Accepted by AAAI 2022
Abstract: Emotion recognition in conversation (ERC) aims to detect the emotion label for each utterance. Motivated by recent studies which have proven that feeding training examples in a meaningful order rather than considering them randomly can boost the performance of models, we propose an ERC-oriented hybrid curriculum learning framework. Our framework consists of two curricula: (1) conversation-level curriculum (CC); and (2) utterance-level curriculum (UC). In CC, we construct a difficulty measurer based on the "emotion shift" frequency within a conversation; the conversations are then scheduled in an "easy to hard" schema according to the difficulty score returned by the difficulty measurer. UC is implemented from an emotion-similarity perspective, which progressively strengthens the model's ability to identify confusing emotions. With the proposed model-agnostic hybrid curriculum learning strategy, we observe significant performance boosts over a wide range of existing ERC models, and we are able to achieve new state-of-the-art results on four public ERC datasets.
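
The conversation-level difficulty measurer is easy to sketch: score each conversation by the fraction of adjacent utterance pairs whose emotion labels differ, then order conversations easy-to-hard. A minimal sketch of that idea:

```python
def emotion_shift_score(labels):
    """Fraction of adjacent utterance pairs with differing emotion labels."""
    if len(labels) < 2:
        return 0.0
    shifts = sum(a != b for a, b in zip(labels, labels[1:]))
    return shifts / (len(labels) - 1)

conversations = [["joy", "joy", "joy", "sad"],
                 ["joy", "anger", "joy", "sad"],
                 ["neutral", "neutral"]]
# Schedule conversations from easy (few shifts) to hard (many shifts).
for conv in sorted(conversations, key=emotion_shift_score):
    print(f"{emotion_shift_score(conv):.2f}  {conv}")
```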

Word2Vec | Text | Words (2 papers)

【1】 Assisted Text Annotation Using Active Learning to Achieve High Quality with Little Effort
Link: https://arxiv.org/abs/2112.11914

Authors: Franziska Weeber, Felix Hamborg, Karsten Donnay, Bela Gipp
Affiliations: University of Konstanz, Germany; Heidelberg Academy of Sciences and Humanities, Germany; University of Zurich, Switzerland; University of Wuppertal, Germany
Abstract: Large amounts of annotated data have become more important than ever, especially since the rise of deep learning techniques. However, manual annotations are costly. We propose a tool that enables researchers to create large, high-quality, annotated datasets with only a few manual annotations, thus strongly reducing annotation cost and effort. For this purpose, we combine an active learning (AL) approach with a pre-trained language model to semi-automatically identify annotation categories in the given text documents. To highlight our research direction's potential, we evaluate the approach on the task of identifying frames in news articles. Our preliminary results show that employing AL strongly reduces the number of annotations needed for correct classification of even these complex and subtle frames. On the framing dataset, the AL approach needs only 16.3% of the annotations to reach the same performance as a model trained on the full dataset.
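
A generic uncertainty-sampling loop captures the core AL mechanism; the paper pairs AL with a pre-trained language model, whereas this sketch uses a toy classifier and synthetic data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
# Seed set: five "annotated" examples per class.
labeled = np.concatenate([np.where(y == 0)[0][:5],
                          np.where(y == 1)[0][:5]]).tolist()
pool = [i for i in range(len(X)) if i not in labeled]

for _ in range(20):
    clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    confidence = clf.predict_proba(X[pool]).max(axis=1)
    query = pool[int(np.argmin(confidence))]   # least-confident example
    labeled.append(query)                      # "annotate" it (oracle = y)
    pool.remove(query)

print(f"labeled {len(labeled)} examples, accuracy {clf.score(X, y):.2f}")
```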

【2】 STEREO: Scientific Text Reuse in Open Access Publications
Link: https://arxiv.org/abs/2112.11800

Authors: Lukas Gienapp, Wolfgang Kircheis, Bjarne Sievers, Benno Stein, Martin Potthast
Affiliations: Leipzig University, Text Mining and Retrieval Group, Leipzig, Germany; Bauhaus-Universität Weimar, Web Technology and Information Systems Group, Weimar, Germany
Note: 10 pages, 2 figures, 4 tables
Abstract: We present the Webis-STEREO-21 dataset, a massive collection of Scientific Text Reuse in Open-access publications. It contains more than 91 million cases of reused text passages found in 4.2 million unique open-access publications. Featuring high coverage of scientific disciplines and varieties of reuse, as well as comprehensive metadata to contextualize each case, our dataset addresses the most salient shortcomings of previous ones on scientific writing. Webis-STEREO-21 allows for tackling a wide range of research questions from different scientific backgrounds, facilitating both qualitative and quantitative analysis of the phenomenon as well as a first-time grounding of the base rate of text reuse in scientific publications.

Other Neural Networks | Deep Learning | Models | Modeling (1 paper)

【1】 On the Compression of Natural Language Models
Link: https://arxiv.org/abs/2112.11480

Author: Saeed Damadi
Affiliation: Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County
Abstract: Deep neural networks are effective feature extractors, but they are prohibitively large for deployment scenarios. Due to the huge number of parameters, the interpretability of parameters in different layers is not straightforward. This is why neural networks are sometimes considered black boxes. Although simpler models are easier to explain, finding them is not easy. If found, a sparse network that can fit the data from scratch would help to interpret the parameters of a neural network. To this end, the lottery ticket hypothesis states that typical dense neural networks contain a small sparse sub-network that can be trained to reach a similar test accuracy in an equal number of steps. The goal of this work is to assess whether such a trainable subnetwork exists for natural language models (NLMs). To achieve this goal we will review state-of-the-art compression techniques such as quantization, knowledge distillation, and pruning.
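
Of the techniques to be reviewed, pruning connects most directly to the lottery ticket hypothesis. Below is a minimal sketch of one-shot global magnitude pruning; lottery-ticket experiments apply this iteratively and rewind the surviving weights to their initial values.

```python
import torch
import torch.nn as nn

def magnitude_prune(model, sparsity=0.8):
    """Zero out the globally smallest-magnitude weights (one pruning round)."""
    all_weights = torch.cat([p.detach().abs().flatten()
                             for p in model.parameters() if p.dim() > 1])
    threshold = torch.quantile(all_weights, sparsity)
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() > 1:
            masks[name] = (p.detach().abs() > threshold).float()
            p.data *= masks[name]              # apply the sparsity mask
    return masks

model = nn.Sequential(nn.Linear(100, 50), nn.ReLU(), nn.Linear(50, 10))
masks = magnitude_prune(model, sparsity=0.8)
kept = (sum(m.sum() for m in masks.values())
        / sum(m.numel() for m in masks.values())).item()
print(f"kept {kept:.1%} of the prunable weights")
```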

Other (11 papers)

【1】 VoiceMoji: A Novel On-Device Pipeline for Seamless Emoji Insertion in Dictation
Link: https://arxiv.org/abs/2112.12028

Authors: Sumit Kumar, Harichandana B S S, Himanshu Arora
Affiliations: Samsung R&D Institute, Bangalore, India
Note: Accepted at IEEE INDICON 2021, 19-21 December 2021, India
Abstract: Most speech recognition systems recover only the words in speech and fail to capture emotions. Users have to manually add emoji(s) to text to add tone and make communication fun. Though much work has been done on punctuation addition for transcribed speech, emotion addition remains untouched. In this paper, we propose a novel on-device pipeline to enrich the voice input experience. Given a blob of transcribed text, it intelligently processes and identifies the structure where emoji insertion makes sense. Moreover, it includes semantic text analysis to predict an emoji for each of the sub-parts, for which we propose a novel Attention-based Char Aware (ACA) LSTM architecture that also handles Out-Of-Vocabulary (OOV) words. All these tasks are executed completely on-device and hence can aid on-device dictation systems. To the best of our knowledge, this is the first work that shows how to add emoji(s) to transcribed text. We demonstrate that our components achieve results comparable to previous neural approaches for punctuation addition and emoji prediction with 80% fewer parameters. Overall, our proposed model has a very small memory footprint of a mere 4 MB to suit on-device deployment.

【2】 Quantifying Gender Biases Towards Politicians on Reddit
Link: https://arxiv.org/abs/2112.12014

Authors: Sara Marjanovic, Karolina Stańczak, Isabelle Augenstein
Affiliations: Copenhagen Center for Social Science Data, University of Copenhagen, Copenhagen, Denmark; Department of Computer Science, University of Copenhagen, Copenhagen, Denmark
Abstract: Despite attempts to increase gender parity in politics, global efforts have struggled to ensure equal female representation. This is likely tied to implicit gender biases against women in authority. In this work, we present a comprehensive study of gender biases that appear in online political discussion. To this end, we collect 10 million comments on Reddit in conversations about male and female politicians, which enables an exhaustive study of automatic gender bias detection. We address not only misogynistic language, but also benevolent sexism in the form of seemingly positive attitudes, examining both sentiment and dominance attributed to female politicians. Finally, we conduct a multi-faceted study of gender bias towards politicians, investigating both linguistic and extra-linguistic cues. We assess 5 different types of gender bias, evaluating coverage, combinatorial, nominal, sentimental and lexical biases extant in social media language and discourse. Overall, we find that, contrary to previous research, coverage and sentiment biases suggest equal public interest in female politicians. However, the results of the nominal and lexical analyses suggest this interest is not as professional or respectful as that expressed about male politicians. Female politicians are often named by their first names and are described in relation to their body, clothing, or family; this is a treatment that is not similarly extended to men. On the now-banned far-right subreddits, this disparity is greatest, though differences in gender biases still appear in right- and left-leaning subreddits. We release the curated dataset to the public for future studies.

【3】 Toward Educator-focused Automated Scoring Systems for Reading and Writing
Link: https://arxiv.org/abs/2112.11973

Author: Mike Hardy
Affiliation: University of California, Berkeley
Note: 10 pages, 8 figures
Abstract: This paper presents methods for improving automated essay scoring with techniques that address the computational trade-offs of self-attention and document length. To make Automated Essay Scoring (AES) more useful to practitioners, researchers must overcome the challenges of data and label availability, authentic and extended writing, domain scoring, prompt and source variety, and transfer learning. This paper addresses these challenges using neural network models, employing techniques that preserve essay length as an important feature without increasing model training costs. It introduces techniques for minimizing classification loss on ordinal labels using multi-objective learning; capturing semantic information across the entire essay by using sentence embeddings to apply transformer architectures to arbitrarily long documents; using such models for transfer learning; automated hyperparameter generation based on prompt-corpus metadata; and, most importantly, using semantic information to provide meaningful insights into student reading through analysis of passage-dependent writing, resulting in state-of-the-art results for various essay tasks.

【4】 Automatic Product Copywriting for E-Commerce
Link: https://arxiv.org/abs/2112.11915

Authors: Xueying Zhang, Yanyan Zou, Hainan Zhang, Jing Zhou, Shiliang Diao, Jiajia Chen, Zhuoye Ding, Zhen He, Xueqi He, Yun Xiao, Bo Long, Han Yu, Lingfei Wu
Affiliations: JD.COM Silicon Valley Research Center; Nanyang Technological University
Note: Accepted by AAAI 2022/IAAI 2022 under the track "Highly Innovative Applications of AI"
Abstract: Product copywriting is a critical component of e-commerce recommendation platforms. It aims to attract users' interest and improve user experience by highlighting product characteristics with textual descriptions. In this paper, we report our experience deploying the proposed Automatic Product Copywriting Generation (APCG) system into the JD.com e-commerce product recommendation platform. It consists of two main components: 1) natural language generation, which is built from a transformer-pointer network and a pre-trained sequence-to-sequence model based on millions of training samples from our in-house platform; and 2) copywriting quality control, which is based on both automatic evaluation and human screening. For selected domains, the models are trained and updated daily with the updated training data. In addition, the model is also used as a real-time writing assistant tool on our live broadcast platform. The APCG system has been deployed at JD.com since February 2021. By September 2021, it had generated 2.53 million product descriptions and improved the overall averaged click-through rate (CTR) and conversion rate (CVR) by 4.22% and 3.61% year-on-year compared to baselines, respectively. The accumulated Gross Merchandise Volume (GMV) made by our system is improved by 213.42% compared to the number in February 2021.

【5】 Towards Interactive Language Modeling
Link: https://arxiv.org/abs/2112.11911

Authors: Maartje ter Hoeve, Evgeny Kharitonov, Dieuwke Hupkes, Emmanuel Dupoux
Affiliations: University of Amsterdam; Meta AI Labs; EHESS
Abstract: Interaction between caregivers and children plays a critical role in human language acquisition and development. Given this observation, it is remarkable that explicit interaction plays little to no role in artificial language modeling -- which also targets the acquisition of human language, yet by artificial models. Moreover, an interactive approach to language modeling has the potential to make language models substantially more versatile and to considerably impact downstream applications. Motivated by these considerations, we pioneer the space of interactive language modeling. As a first contribution we present a road map in which we detail the steps that need to be taken towards interactive language modeling. We then lead by example and take the first steps on this road map, showing the initial feasibility of our approach. As such, this work aims to be the start of a larger research agenda on interactive language modeling.

【6】 The Importance of the Current Input in Sequence Modeling
Link: https://arxiv.org/abs/2112.11776

Authors: Christian Oliva, Luis F. Lago-Fernández
Affiliations: Departamento de Ingeniería Informática, Universidad Autónoma de Madrid, Madrid, Spain
Note: 11 pages, 2 appendix pages
Abstract: The latest advances in sequence modeling are mainly based on deep learning approaches. The current state of the art involves the use of variations of the standard LSTM architecture, combined with several tricks that improve the final prediction rates of the trained neural networks. However, in some cases, these adaptations might be too finely tuned to the particular problems being addressed. In this article, we show that a very simple idea, adding a direct connection between the input and the output that skips the recurrent module, leads to an increase in prediction accuracy in sequence modeling problems related to natural language processing. Experiments carried out on different problems show that adding this kind of connection to a recurrent network always improves the results, regardless of the architecture and training-specific details. When this idea is introduced into the models that lead the field, the resulting networks achieve a new state-of-the-art perplexity on language modeling problems.
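
The proposed idea is simple enough to sketch directly: give the output layer access to the current input as well as the recurrent state. Combining them by concatenation is one natural choice; the paper's exact variant may differ.

```python
import torch
import torch.nn as nn

class SkipLSTM(nn.Module):
    """LSTM block with a direct input-to-output connection that bypasses
    the recurrent module (a sketch of the article's idea)."""
    def __init__(self, vocab, emb=128, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden + emb, vocab)  # sees h_t AND the input x_t

    def forward(self, tokens):
        x = self.embed(tokens)
        h, _ = self.lstm(x)
        return self.out(torch.cat([h, x], dim=-1))  # the skip connection

logits = SkipLSTM(vocab=1000)(torch.randint(0, 1000, (2, 7)))
print(logits.shape)  # torch.Size([2, 7, 1000])
```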

【7】 On Theoretical Complexity and Boolean Satisfiability
Link: https://arxiv.org/abs/2112.11769

Authors: Mohamed Ghanem, Dauod Siniora
Note: Undergraduate math survey thesis; supervisor: Dr. Daoud Siniora
Abstract: Theoretical complexity is a vital subfield of computer science that enables us to mathematically investigate computation and answer many interesting queries about the nature of computational problems. It provides theoretical tools to assess time and space requirements of computations along with assessing the difficulty of problems, classifying them accordingly. It also houses at its core one of the most important problems in mathematics, namely, the $\textbf{P vs. NP}$ millennium problem. In essence, this problem asks whether solution and verification reside on two different levels of difficulty. In this thesis, we introduce some of the most central concepts in the Theory of Computing, giving an overview of how computation can be abstracted using Turing machines. Further, we introduce the two most famous problem complexity classes, $\textbf{P}$ and $\textbf{NP}$, along with the relationship between them. In addition, we explicate the concept of problem reduction and how it is an essential tool for making hardness comparisons between different problems. Later, we present the problem of Boolean Satisfiability (SAT), which lies at the center of NP-complete problems. We then explore some of its tractable as well as intractable variants, such as Horn-SAT and 3-SAT, respectively. Last but not least, we establish polynomial-time reductions from 3-SAT to some of the famous NP-complete graph problems, namely, Clique Finding, Hamiltonian Cycle Finding, and 3-Coloring.
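
As a small illustration of the thesis' central object, here is a brute-force satisfiability check for a CNF formula: verifying one assignment is cheap, but exhaustive search grows as 2^n, which is the intuition behind SAT's hardness.

```python
from itertools import product

def satisfiable(cnf, n_vars):
    """Brute-force SAT check. Literals are DIMACS-style integers:
    k means variable k, -k its negation."""
    for bits in product([False, True], repeat=n_vars):
        assignment = {i + 1: b for i, b in enumerate(bits)}
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in cnf):
            return assignment
    return None

# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
print(satisfiable([[1, -2], [2, 3], [-1, -3]], n_vars=3))
```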

【8】 Consistency and Coherence from Points of Contextual Similarity
Link: https://arxiv.org/abs/2112.11638

Authors: Oleg Vasilyev, John Bohannon
Affiliations: Primer Technologies Inc., San Francisco, California
Note: 9 pages, 7 figures, 1 table
Abstract: Factual consistency is one of the important summary evaluation dimensions, especially as summary generation becomes more fluent and coherent. The ESTIME measure, recently proposed specifically for factual consistency, achieves high correlations with human expert scores for both consistency and fluency, while in principle being restricted to evaluating text-summary pairs that have high dictionary overlap. This is not a problem for current styles of summarization, but it may become an obstacle for future summarization systems, or for evaluating arbitrary claims against the text. In this work we generalize the method, making it applicable to any text-summary pairs. As ESTIME uses points of contextual similarity, it provides insights into the usefulness of information taken from different BERT layers. We observe that useful information exists in almost all of the layers except the several lowest ones. For consistency and fluency - qualities focused on local text details - the most useful layers are close to the top (but not at the top); for coherence and relevance we found a more complicated and interesting picture.
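
A simplified reading of the "points of contextual similarity" idea, not the authors' implementation: embed text and summary tokens with BERT and, for each summary token, take its best-matching text token at a chosen hidden layer.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

def layer_embeddings(sentence, layer=9):
    """Normalized token embeddings from one BERT hidden layer."""
    ids = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**ids).hidden_states[layer][0]   # (tokens, dim)
    return torch.nn.functional.normalize(hidden, dim=-1)

text = layer_embeddings("The meeting was moved to Friday because of the storm.")
summ = layer_embeddings("The storm delayed the meeting.")
sims = summ @ text.T                   # cosine similarity for every token pair
print(sims.max(dim=1).values)          # best contextual match per summary token
```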

【9】 Sentence Embeddings and High-speed Similarity Search for Fast Computer Assisted Annotation of Legal Documents
Link: https://arxiv.org/abs/2112.11494

Authors: Hannes Westermann, Jaromir Savelka, Vern R. Walker, Kevin D. Ashley, Karim Benyekhlef
Affiliations: Cyberjustice Laboratory, Faculté de droit, Université de Montréal; School of Computer Science, Carnegie Mellon University; LLT Lab, Maurice A. Deane School of Law, Hofstra University; School of Computing and Information, University of Pittsburgh
Abstract: Human-performed annotation of sentences in legal documents is an important prerequisite for many machine learning based systems supporting legal tasks. Typically, the annotation is done sequentially, sentence by sentence, which is often time consuming and, hence, expensive. In this paper, we introduce a proof-of-concept system for annotating sentences "laterally." The approach is based on the observation that sentences that are similar in meaning often have the same label in terms of a particular type system. We use this observation to allow annotators to quickly view and annotate sentences that are semantically similar to a given sentence, across an entire corpus of documents. Here, we present the interface of the system and empirically evaluate the approach. The experiments show that lateral annotation has the potential to make the annotation process quicker and more consistent.
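
The retrieval step behind lateral annotation can be sketched with sentence embeddings and cosine similarity; the model name is an illustrative choice and the sentences are invented.

```python
from sentence_transformers import SentenceTransformer, util

sentences = ["The court finds the claim untimely.",
             "The claim is barred as untimely.",
             "The witness testified on Monday."]
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(sentences, convert_to_tensor=True)

query = 0                                  # sentence currently being annotated
scores = util.cos_sim(embeddings[query], embeddings)[0]
for idx in scores.argsort(descending=True)[1:].tolist():  # skip the query itself
    print(f"{scores[idx].item():.2f}  {sentences[idx]}")
```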

【10】 LSH methods for data deduplication in a Wikipedia artificial dataset
Link: https://arxiv.org/abs/2112.11478

Authors: Juan Ciro, Daniel Galvez, Tim Schlippe, David Kanter
Affiliations: Factored; NVIDIA; IU University of Applied Sciences; MLCommons
Abstract: This paper illustrates locality sensitive hashing (LSH) models for the identification and removal of nearly redundant data in a text dataset. To evaluate the different models, we create an artificial dataset for data deduplication using English Wikipedia articles. Area-Under-Curve (AUC) scores over 0.9 were observed for most models, with the best model reaching 0.96. Deduplication enables more effective model training by preventing the model from learning a distribution that differs from the real one as a result of the repeated data.
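
One standard LSH construction for text deduplication is MinHash LSH, sketched below with the datasketch package; the threshold and permutation count are illustrative, since the abstract does not specify the models' parameters.

```python
from datasketch import MinHash, MinHashLSH

def minhash(text, num_perm=128):
    """MinHash signature over the set of lowercased word tokens."""
    m = MinHash(num_perm=num_perm)
    for token in set(text.lower().split()):
        m.update(token.encode("utf8"))
    return m

docs = {"a": "the quick brown fox jumps over the lazy dog",
        "b": "the quick brown fox jumped over a lazy dog",
        "c": "completely unrelated sentence about wikipedia"}

lsh = MinHashLSH(threshold=0.5, num_perm=128)
for key, text in docs.items():
    lsh.insert(key, minhash(text))

print(lsh.query(minhash(docs["a"])))  # "a" and, likely, its near-duplicate "b"
```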

【11】 Towards a Science of Human-AI Decision Making: A Survey of Empirical Studies
Link: https://arxiv.org/abs/2112.11471

Authors: Vivian Lai, Chacha Chen, Q. Vera Liao, Alison Smith-Renner, Chenhao Tan
Affiliations: University of Colorado Boulder; University of Chicago
Note: 36 pages, 2 figures
Abstract: As AI systems demonstrate increasingly strong predictive performance, their adoption has grown in numerous domains. However, in high-stakes domains such as criminal justice and healthcare, full automation is often not desirable due to safety, ethical, and legal concerns, yet fully manual approaches can be inaccurate and time consuming. As a result, there is growing interest in the research community to augment human decision making with AI assistance. Besides developing AI technologies for this purpose, the emerging field of human-AI decision making must embrace empirical approaches to form a foundational understanding of how humans interact and work with AI to make decisions. To invite and help structure research efforts towards a science of understanding and improving human-AI decision making, we survey recent literature of empirical human-subject studies on this topic. We summarize the study design choices made in over 100 papers in three important aspects: (1) decision tasks, (2) AI models and AI assistance elements, and (3) evaluation metrics. For each aspect, we summarize current trends, discuss gaps in current practices of the field, and make a list of recommendations for future research. Our survey highlights the need to develop common frameworks to account for the design and research spaces of human-AI decision making, so that researchers can make rigorous choices in study design, and the research community can build on each other's work and produce generalizable scientific knowledge. We also hope this survey will serve as a bridge for HCI and AI communities to work together to mutually shape the empirical science and computational technologies for human-AI decision making.

