
Natural Language Processing arXiv Digest [7.23]

Author: WeChat official account "arXiv每日学术速递" (arXiv Daily Digest)
Published: 2021-07-27 11:15:21
Column: arXiv每日学术速递 (arXiv Daily Digest)

Visit www.arxivdaily.com for the full digest with abstracts, covering CS, Physics, Mathematics, Economics, Statistics, Finance, Biology, and Electrical Engineering, plus search, bookmarking, and posting features. Click "Read the original" (阅读原文) to access it.

cs.CL: 22 papers today.

Transformer (1 paper)

[1] Multi-Stream Transformers

Authors: Mikhail Burtsev, Anna Rumshisky
Affiliations: Artificial Intelligence Research Institute, Moscow, Russia; Moscow Institute of Physics and Technology, Dolgoprudny, Russia; Univ. of Massachusetts Lowell, Lowell MA
Link: https://arxiv.org/abs/2107.10342
Abstract: Transformer-based encoder-decoder models produce a fused token-wise representation after every encoder layer. We investigate the effects of allowing the encoder to preserve and explore alternative hypotheses, combined at the end of the encoding process. To that end, we design and examine a Multi-stream Transformer architecture and find that splitting the Transformer encoder into multiple encoder streams and allowing the model to merge multiple representational hypotheses improves performance, with further improvement obtained by adding a skip connection between the first and the final encoder layer.
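
To make the stream-splitting idea concrete, here is a minimal PyTorch sketch (my own illustration, not the authors' code): the encoder is split into parallel stacks whose outputs are merged, with a residual path carrying the early representation to the output. The stream count, concatenation-based merge, and layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class MultiStreamEncoder(nn.Module):
    """Toy multi-stream encoder: parallel Transformer stacks merged at the end."""
    def __init__(self, d_model=512, nhead=8, n_streams=2, layers_per_stream=3):
        super().__init__()
        def make_stream():
            layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            return nn.TransformerEncoder(layer, num_layers=layers_per_stream)
        # Each stream keeps its own representational hypothesis.
        self.streams = nn.ModuleList([make_stream() for _ in range(n_streams)])
        # Merge the concatenated stream outputs back to d_model.
        self.merge = nn.Linear(n_streams * d_model, d_model)

    def forward(self, x):
        early = x  # skip path from before the streams to the final output
        merged = torch.cat([stream(x) for stream in self.streams], dim=-1)
        return self.merge(merged) + early

enc = MultiStreamEncoder()
out = enc(torch.randn(2, 10, 512))   # (batch, sequence length, d_model)
print(out.shape)                     # torch.Size([2, 10, 512])
```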

Machine Translation (2 papers)

[1] To Ship or Not to Ship: An Extensive Evaluation of Automatic Metrics for Machine Translation

Authors: Tom Kocmi, Christian Federmann, Roman Grundkiewicz, Marcin Junczys-Dowmunt, Hitokazu Matsushita, Arul Menezes
Link: https://arxiv.org/abs/2107.10821
Abstract: Automatic metrics are commonly used as the exclusive tool for declaring the superiority of one machine translation system's quality over another. The community choice of automatic metric guides research directions and industrial developments by deciding which models are deemed better. Evaluating metrics correlations has been limited to a small collection of human judgements. In this paper, we corroborate how reliable metrics are in contrast to human judgements on - to the best of our knowledge - the largest collection of human judgements. We investigate which metrics have the highest accuracy to make system-level quality rankings for pairs of systems, taking human judgement as a gold standard, which is the closest scenario to the real metric usage. Furthermore, we evaluate the performance of various metrics across different language pairs and domains. Lastly, we show that the sole use of BLEU negatively affected the past development of improved models. We release the collection of human judgements of 4380 systems, and 2.3M annotated sentences for further analysis and replication of our work.
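
The system-level pairwise accuracy described here can be illustrated in a few lines (toy scores, not data from the paper): a metric is credited whenever it orders a pair of systems the same way as human judgement.

```python
from itertools import combinations

human  = {"sysA": 72.1, "sysB": 68.4, "sysC": 70.0}   # human judgement (gold standard)
metric = {"sysA": 34.2, "sysB": 33.9, "sysC": 35.0}   # some automatic metric (made-up values)

pairs = list(combinations(human, 2))
concordant = sum((human[a] - human[b]) * (metric[a] - metric[b]) > 0 for a, b in pairs)
print(f"pairwise accuracy: {concordant / len(pairs):.2f}")   # 0.67 for these toy numbers
```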

[2] Confidence-Aware Scheduled Sampling for Neural Machine Translation

Authors: Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
Affiliations: Beijing Jiaotong University, China; Pattern Recognition Center, WeChat AI, Tencent Inc, China
Note: Findings of ACL-2021, code at this https URL
Link: https://arxiv.org/abs/2107.10427
Abstract: Scheduled sampling is an effective method to alleviate the exposure bias problem of neural machine translation. It simulates the inference scene by randomly replacing ground-truth target input tokens with predicted ones during training. Despite its success, its critical schedule strategies are merely based on training steps, ignoring the real-time model competence, which limits its potential performance and convergence speed. To address this issue, we propose confidence-aware scheduled sampling. Specifically, we quantify real-time model competence by the confidence of model predictions, based on which we design fine-grained schedule strategies. In this way, the model is exactly exposed to predicted tokens for high-confidence positions and still ground-truth tokens for low-confidence positions. Moreover, we observe vanilla scheduled sampling suffers from degenerating into the original teacher forcing mode since most predicted tokens are the same as ground-truth tokens. Therefore, under the above confidence-aware strategy, we further expose more noisy tokens (e.g., wordy and incorrect word order) instead of predicted ones for high-confidence token positions. We evaluate our approach on the Transformer and conduct experiments on large-scale WMT 2014 English-German, WMT 2014 English-French, and WMT 2019 Chinese-English. Results show that our approach significantly outperforms the Transformer and vanilla scheduled sampling on both translation quality and convergence speed.
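
A minimal sketch of the core mixing step, assuming a fixed confidence threshold (the paper designs finer-grained schedules, and this is not the released code):

```python
import torch

def confidence_aware_inputs(gold_tokens, logits, threshold=0.9):
    """Feed the model's own prediction where it is confident, else the gold token."""
    probs = torch.softmax(logits, dim=-1)        # (batch, seq_len, vocab)
    confidence, predicted = probs.max(dim=-1)    # per-position confidence and argmax token
    use_prediction = confidence >= threshold     # expose predictions only where confident
    return torch.where(use_prediction, predicted, gold_tokens)

logits = torch.randn(1, 5, 100)                  # toy decoder logits: batch 1, 5 positions
gold = torch.randint(0, 100, (1, 5))             # toy ground-truth target tokens
print(confidence_aware_inputs(gold, logits))
```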

Graph | Knowledge Graph | Knowledge (1 paper)

[1] DEAP-FAKED: Knowledge Graph based Approach for Fake News Detection

Authors: Mohit Mayank, Shakshi Sharma, Rajesh Sharma
Note: 8
Link: https://arxiv.org/abs/2107.10648
Abstract: Fake News on social media platforms has attracted a lot of attention in recent times, primarily for events related to politics (2016 US Presidential elections), healthcare (infodemic during COVID-19), to name a few. Various methods have been proposed for detecting Fake News. The approaches span from exploiting techniques related to network analysis, Natural Language Processing (NLP), and the usage of Graph Neural Networks (GNNs). In this work, we propose DEAP-FAKED, a knowleDgE grAPh FAKe nEws Detection framework for identifying Fake News. Our approach is a combination of the NLP -- where we encode the news content, and the GNN technique -- where we encode the Knowledge Graph (KG). A variety of these encodings provides a complementary advantage to our detector. We evaluate our framework using two publicly available datasets containing articles from domains such as politics, business, technology, and healthcare. As part of dataset pre-processing, we also remove the bias, such as the source of the articles, which could impact the performance of the models. DEAP-FAKED obtains an F1-score of 88% and 78% for the two datasets, which is an improvement of 21%, and 3% respectively, which shows the effectiveness of the approach.
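
As a rough, self-contained illustration of the fusion idea (not the DEAP-FAKED code), one can concatenate a text encoding with an entity-level encoding and train a classifier; here TF-IDF stands in for the NLP encoder and random vectors stand in for KG entity embeddings, both of which are assumptions for illustration only.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

articles = ["vaccine causes illness says anonymous post",
            "health agency publishes vaccination statistics",
            "miracle cure suppressed by doctors",
            "university releases peer-reviewed trial results"]
labels = [1, 0, 1, 0]                                        # 1 = fake, 0 = genuine (toy labels)

text_features = TfidfVectorizer().fit_transform(articles)    # stands in for the NLP encoder
rng = np.random.default_rng(0)
kg_features = csr_matrix(rng.normal(size=(len(articles), 16)))  # stands in for KG/GNN embeddings

fused = hstack([text_features, kg_features]).tocsr()         # combine the two views
clf = LogisticRegression(max_iter=1000).fit(fused, labels)
print(clf.predict(fused))
```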

Reasoning | Analysis | Understanding | Interpretation (1 paper)

[1] iReason: Multimodal Commonsense Reasoning using Videos and Natural Language with Interpretability

Authors: Aman Chadha, Vinija Jain
Affiliations: Department of Computer Science, Stanford University
Note: 12 pages, 1 figure, 7 tables
Link: https://arxiv.org/abs/2107.10300
Abstract: Causality knowledge is vital to building robust AI systems. Deep learning models often perform poorly on tasks that require causal reasoning, which is often derived using some form of commonsense knowledge not immediately available in the input but implicitly inferred by humans. Prior work has unraveled spurious observational biases that models fall prey to in the absence of causality. While language representation models preserve contextual knowledge within learned embeddings, they do not factor in causal relationships during training. By blending causal relationships with the input features to an existing model that performs visual cognition tasks (such as scene understanding, video captioning, video question-answering, etc.), better performance can be achieved owing to the insight causal relationships bring about. Recently, several models have been proposed that have tackled the task of mining causal data from either the visual or textual modality. However, there does not exist widespread research that mines causal relationships by juxtaposing the visual and language modalities. While images offer a rich and easy-to-process resource for us to mine causality knowledge from, videos are denser and consist of naturally time-ordered events. Also, textual information offers details that could be implicit in videos. We propose iReason, a framework that infers visual-semantic commonsense knowledge using both videos and natural language captions. Furthermore, iReason's architecture integrates a causal rationalization module to aid the process of interpretability, error analysis and bias detection. We demonstrate the effectiveness of iReason using a two-pronged comparative analysis with language representation learning models (BERT, GPT-2) as well as current state-of-the-art multimodal causality models.

Recognition / Classification (5 papers)

[1] A Systematic Literature Review of Automated ICD Coding and Classification Systems using Discharge Summaries

Authors: Rajvir Kaur, Jeewani Anupama Ginige, Oliver Obst
Note: 33 pages, 1 figure. Under review in the Journal of Artificial Intelligence in Medicine
Link: https://arxiv.org/abs/2107.10652
Abstract: Codification of free-text clinical narratives have long been recognised to be beneficial for secondary uses such as funding, insurance claim processing and research. The current scenario of assigning codes is a manual process which is very expensive, time-consuming and error prone. In recent years, many researchers have studied the use of Natural Language Processing (NLP), related Machine Learning (ML) and Deep Learning (DL) methods and techniques to resolve the problem of manual coding of clinical narratives and to assist human coders to assign clinical codes more accurately and efficiently. This systematic literature review provides a comprehensive overview of automated clinical coding systems that utilises appropriate NLP, ML and DL methods and techniques to assign ICD codes to discharge summaries. We have followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and conducted a comprehensive search of publications from January 2010 to December 2020 in four academic databases - PubMed, ScienceDirect, Association for Computing Machinery (ACM) Digital Library, and the Association for Computational Linguistics (ACL) Anthology. We reviewed 7,556 publications; 38 met the inclusion criteria. This review identified: datasets having discharge summaries; NLP techniques along with some other data extraction processes, different feature extraction and embedding techniques. To measure the performance of classification methods, different evaluation metrics are used. Lastly, future research directions are provided to scholars who are interested in automated ICD code assignment. Efforts are still required to improve ICD code prediction accuracy, availability of large-scale de-identified clinical corpora with the latest version of the classification system. This can be a platform to guide and share knowledge with the less experienced coders and researchers.

[2] Target-Oriented Fine-tuning for Zero-Resource Named Entity Recognition

Authors: Ying Zhang, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
Affiliations: Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, China; Pattern Recognition Center, WeChat AI, Tencent Inc, China
Note: 9 pages, ACL21 Findings
Link: https://arxiv.org/abs/2107.10523
Abstract: Zero-resource named entity recognition (NER) severely suffers from data scarcity in a specific domain or language. Most studies on zero-resource NER transfer knowledge from various data by fine-tuning on different auxiliary tasks. However, how to properly select training data and fine-tuning tasks is still an open problem. In this paper, we tackle the problem by transferring knowledge from three aspects, i.e., domain, language and task, and strengthening connections among them. Specifically, we propose four practical guidelines to guide knowledge transfer and task fine-tuning. Based on these guidelines, we design a target-oriented fine-tuning (TOF) framework to exploit various data from three aspects in a unified training manner. Experimental results on six benchmarks show that our method yields consistent improvements over baselines in both cross-domain and cross-lingual scenarios. Particularly, we achieve new state-of-the-art performance on five benchmarks.

[3] Back-Translated Task Adaptive Pretraining: Improving Accuracy and Robustness on Text Classification

Authors: Junghoon Lee, Jounghee Kim, Pilsung Kang
Affiliations: Korea University, Seoul, Republic of Korea
Link: https://arxiv.org/abs/2107.10474
Abstract: Language models (LMs) pretrained on a large text corpus and fine-tuned on a downstream task have become a de facto training strategy for several natural language processing (NLP) tasks. Recently, an adaptive pretraining method retraining the pretrained language model with task-relevant data has shown significant performance improvements. However, current adaptive pretraining methods suffer from underfitting on the task distribution owing to a relatively small amount of data to re-pretrain the LM. To completely use the concept of adaptive pretraining, we propose a back-translated task-adaptive pretraining (BT-TAPT) method that increases the amount of task-specific data for LM re-pretraining by augmenting the task data using back-translation to generalize the LM to the target task domain. The experimental results show that the proposed BT-TAPT yields improved classification accuracy on both low- and high-resource data and better robustness to noise than the conventional adaptive pretraining method.
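
Back-translation itself is easy to sketch with off-the-shelf translation models; the snippet below (my illustration, not the authors' pipeline) round-trips English text through German using the public MarianMT checkpoints, which are downloaded on first run. The pivot language and model names are assumptions, not necessarily the paper's setup.

```python
from transformers import MarianMTModel, MarianTokenizer

def translate(texts, model_name):
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)

task_texts = ["The battery drains far too quickly on this phone."]
pivot = translate(task_texts, "Helsinki-NLP/opus-mt-en-de")   # English -> German
paraphrases = translate(pivot, "Helsinki-NLP/opus-mt-de-en")  # German -> English paraphrase
print(paraphrases)  # augmented text to add to the task-adaptive pretraining corpus
```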

[4] Small-text: Active Learning for Text Classification in Python

Authors: Christopher Schröder, Lydia Müller, Andreas Niekler, Martin Potthast
Affiliations: Leipzig University, Germany; Institute for Applied Informatics (InfAI), Leipzig, Germany
Note: preprint
Link: https://arxiv.org/abs/2107.10314
Abstract: We present small-text, a simple modular active learning library, which offers pool-based active learning for text classification in Python. It comes with various pre-implemented state-of-the-art query strategies, including some which can leverage the GPU. Clearly defined interfaces allow to combine a multitude of such query strategies with different classifiers, thereby facilitating a quick mix and match, and enabling a rapid development of both active learning experiments and applications. To make various classifiers accessible in a consistent way, it integrates several well-known machine learning libraries, namely, scikit-learn, PyTorch, and huggingface transformers -- for which the latter integrations are available as optionally installable extensions. The library is available under the MIT License at https://github.com/webis-de/small-text.
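
For readers unfamiliar with pool-based active learning, the loop below shows the general pattern such libraries package. This is a plain scikit-learn illustration of uncertainty sampling, not the small-text API; the synthetic dataset, query strategy, and batch size are arbitrary choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=20, random_state=0)  # stand-in dataset
rng = np.random.default_rng(0)
labeled = list(rng.choice(len(y), size=10, replace=False))   # small seed set of labels
pool = [i for i in range(len(y)) if i not in labeled]        # unlabeled pool

for step in range(5):
    clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    probs = clf.predict_proba(X[pool])
    uncertainty = 1.0 - probs.max(axis=1)                    # least-confident query strategy
    query = [pool[i] for i in np.argsort(-uncertainty)[:10]] # ask the "oracle" for 10 labels
    labeled += query
    pool = [i for i in pool if i not in query]
    print(f"step {step}: {len(labeled)} labels, pool accuracy {clf.score(X[pool], y[pool]):.3f}")
```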

[5] A baseline model for computationally inexpensive speech recognition for Kazakh using the Coqui STT framework

Authors: Ilnar Salimzianov
Affiliations: Taruen
Note: 4 pages, 2 tables
Link: https://arxiv.org/abs/2107.10637
Abstract: Mobile devices are transforming the way people interact with computers, and speech interfaces to applications are ever more important. Automatic Speech Recognition systems recently published are very accurate, but often require powerful machinery (specialised Graphical Processing Units) for inference, which makes them impractical to run on commodity devices, especially in streaming mode. Impressed by the accuracy of, but dissatisfied with the inference times of the baseline Kazakh ASR model of (Khassanov et al., 2021) when not using a GPU, we trained a new baseline acoustic model (on the same dataset as the aforementioned paper) and three language models for use with the Coqui STT framework. Results look promising, but further epochs of training and parameter sweeping or, alternatively, limiting the vocabulary that the ASR system must support, is needed to reach a production-level accuracy.

Word2Vec | Text | Words (4 papers)

[1] Lumen: A Machine Learning Framework to Expose Influence Cues in Text

Authors: Hanyu Shi, Mirela Silva, Daniel Capecci, Luiz Giovanini, Lauren Czech, Juliana Fernandes, Daniela Oliveira
Affiliations: Fernandes is with the Department of Advertising
Link: https://arxiv.org/abs/2107.10655
Abstract: Phishing and disinformation are popular social engineering attacks with attackers invariably applying influence cues in texts to make them more appealing to users. We introduce Lumen, a learning-based framework that exposes influence cues in text: (i) persuasion, (ii) framing, (iii) emotion, (iv) objectivity/subjectivity, (v) guilt/blame, and (vi) use of emphasis. Lumen was trained with a newly developed dataset of 3K texts comprised of disinformation, phishing, hyperpartisan news, and mainstream news. Evaluation of Lumen in comparison to other learning models showed that Lumen and LSTM presented the best F1-micro score, but Lumen yielded better interpretability. Our results highlight the promise of ML to expose influence cues in text, towards the goal of application in automatic labeling tools to improve the accuracy of human-based detection and reduce the likelihood of users falling for deceptive online content.

[2] Theoretical foundations and limits of word embeddings: what types of meaning can they capture?

Authors: Alina Arseniev-Koehler
Affiliations: Department of Sociology, University of California, Los Angeles
Link: https://arxiv.org/abs/2107.10413
Abstract: Measuring meaning is a central problem in cultural sociology and word embeddings may offer powerful new tools to do so. But like any tool, they build on and exert theoretical assumptions. In this paper I theorize the ways in which word embeddings model three core premises of a structural linguistic theory of meaning: that meaning is relational, coherent, and may be analyzed as a static system. In certain ways, word embedding methods are vulnerable to the same, enduring critiques of these premises. In other ways, they offer novel solutions to these critiques. More broadly, formalizing the study of meaning with word embeddings offers theoretical opportunities to clarify core concepts and debates in cultural sociology, such as the coherence of meaning. Just as network analysis specified the once vague notion of social relations (Borgatti et al. 2009), formalizing meaning with embedding methods can push us to specify and reimagine meaning itself.

[3] COfEE: A Comprehensive Ontology for Event Extraction from text, with an online annotation tool

Authors: Ali Balali, Masoud Asadpour, Seyed Hossein Jafari
Affiliations: School of ECE, College of Engineering, University of Tehran, Tehran, Iran
Link: https://arxiv.org/abs/2107.10326
Abstract: Data is published on the web over time in great volumes, but majority of the data is unstructured, making it hard to understand and difficult to interpret. Information Extraction (IE) methods extract structured information from unstructured data. One of the challenging IE tasks is Event Extraction (EE) which seeks to derive information about specific incidents and their actors from the text. EE is useful in many domains such as building a knowledge base, information retrieval, summarization and online monitoring systems. In the past decades, some event ontologies like ACE, CAMEO and ICEWS were developed to define event forms, actors and dimensions of events observed in the text. These event ontologies still have some shortcomings such as covering only a few topics like political events, having inflexible structure in defining argument roles, lack of analytical dimensions, and complexity in choosing event sub-types. To address these concerns, we propose an event ontology, namely COfEE, that incorporates both expert domain knowledge, previous ontologies and a data-driven approach for identifying events from text. COfEE consists of two hierarchy levels (event types and event sub-types) that include new categories relating to environmental issues, cyberspace, criminal activity and natural disasters which need to be monitored instantly. Also, dynamic roles according to each event sub-type are defined to capture various dimensions of events. In a follow-up experiment, the proposed ontology is evaluated on Wikipedia events, and it is shown to be general and comprehensive. Moreover, in order to facilitate the preparation of gold-standard data for event extraction, a language-independent online tool is presented based on COfEE.

[4] Digital Einstein Experience: Fast Text-to-Speech for Conversational AI

Authors: Joanna Rownicka, Kilian Sprenkamp, Antonio Tripiana, Volodymyr Gromoglasov, Timo P Kunz
Affiliations: Aflorithmic Labs Ltd.
Note: accepted at Interspeech 2021
Link: https://arxiv.org/abs/2107.10658
Abstract: We describe our approach to create and deliver a custom voice for a conversational AI use-case. More specifically, we provide a voice for a Digital Einstein character, to enable human-computer interaction within the digital conversation experience. To create the voice which fits the context well, we first design a voice character and we produce the recordings which correspond to the desired speech attributes. We then model the voice. Our solution utilizes Fastspeech 2 for log-scaled mel-spectrogram prediction from phonemes and Parallel WaveGAN to generate the waveforms. The system supports a character input and gives a speech waveform at the output. We use a custom dictionary for selected words to ensure their proper pronunciation. Our proposed cloud architecture enables for fast voice delivery, making it possible to talk to the digital version of Albert Einstein in real-time.

Other Neural Networks | Deep Learning | Models | Modeling (3 papers)

[1] TagRec: Automated Tagging of Questions with Hierarchical Learning Taxonomy

Authors: Venktesh V, Mukesh Mohania, Vikram Goyal
Affiliations: Indraprastha Institute of Information Technology, Delhi
Note: 16 pages, accepted at ECML-PKDD 2021
Link: https://arxiv.org/abs/2107.10649
Abstract: Online educational platforms organize academic questions based on a hierarchical learning taxonomy (subject-chapter-topic). Automatically tagging new questions with existing taxonomy will help organize these questions into different classes of hierarchical taxonomy so that they can be searched based on the facets like chapter. This task can be formulated as a flat multi-class classification problem. Usually, flat classification based methods ignore the semantic relatedness between the terms in the hierarchical taxonomy and the questions. Some traditional methods also suffer from the class imbalance issues as they consider only the leaf nodes ignoring the hierarchy. Hence, we formulate the problem as a similarity-based retrieval task where we optimize the semantic relatedness between the taxonomy and the questions. We demonstrate that our method helps to handle the unseen labels and hence can be used for taxonomy tagging in the wild. In this method, we augment the question with its corresponding answer to capture more semantic information and then align the question-answer pair's contextualized embedding with the corresponding label (taxonomy) vector representations. The representations are aligned by fine-tuning a transformer based model with a loss function that is a combination of the cosine similarity and hinge rank loss. The loss function maximizes the similarity between the question-answer pair and the correct label representations and minimizes the similarity to unrelated labels. Finally, we perform experiments on two real-world datasets. We show that the proposed learning method outperforms representations learned using the multi-class classification method and other state of the art methods by 6% as measured by Recall@k. We also demonstrate the performance of the proposed method on unseen but related learning content like the learning objectives without re-training the network.
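
The alignment objective described above can be sketched as follows; the margin value and the exact way the cosine and hinge-rank terms are combined are my assumptions, not the paper's reported settings.

```python
import torch
import torch.nn.functional as F

def tagging_loss(qa_emb, pos_label_emb, neg_label_emb, margin=0.1):
    """Pull a question-answer embedding toward its correct taxonomy-label embedding
    and push it away from an unrelated label (cosine similarity plus a hinge rank term)."""
    pos_sim = F.cosine_similarity(qa_emb, pos_label_emb, dim=-1)
    neg_sim = F.cosine_similarity(qa_emb, neg_label_emb, dim=-1)
    hinge = torch.clamp(margin + neg_sim - pos_sim, min=0.0)   # rank the correct label higher
    return ((1.0 - pos_sim) + hinge).mean()

qa = torch.randn(4, 768)       # contextualized question-answer embeddings (batch of 4)
pos = torch.randn(4, 768)      # correct taxonomy-label embeddings
neg = torch.randn(4, 768)      # unrelated label embeddings
print(tagging_loss(qa, pos, neg))
```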

[2] Spinning Sequence-to-Sequence Models with Meta-Backdoors

Authors: Eugene Bagdasaryan, Vitaly Shmatikov
Affiliations: Cornell Tech
Link: https://arxiv.org/abs/2107.10443
Abstract: We investigate a new threat to neural sequence-to-sequence (seq2seq) models: training-time attacks that cause models to "spin" their output and support a certain sentiment when the input contains adversary-chosen trigger words. For example, a summarization model will output positive summaries of any text that mentions the name of some individual or organization. We introduce the concept of a "meta-backdoor" to explain model-spinning attacks. These attacks produce models whose output is valid and preserves context, yet also satisfies a meta-task chosen by the adversary (e.g., positive sentiment). Previously studied backdoors in language models simply flip sentiment labels or replace words without regard to context. Their outputs are incorrect on inputs with the trigger. Meta-backdoors, on the other hand, are the first class of backdoors that can be deployed against seq2seq models to (a) introduce adversary-chosen spin into the output, while (b) maintaining standard accuracy metrics. To demonstrate feasibility of model spinning, we develop a new backdooring technique. It stacks the adversarial meta-task (e.g., sentiment analysis) onto a seq2seq model, backpropagates the desired meta-task output (e.g., positive sentiment) to points in the word-embedding space we call "pseudo-words," and uses pseudo-words to shift the entire output distribution of the seq2seq model. Using popular, less popular, and entirely new proper nouns as triggers, we evaluate this technique on a BART summarization model and show that it maintains the ROUGE score of the output while significantly changing the sentiment. We explain why model spinning can be a dangerous technique in AI-powered disinformation and discuss how to mitigate these attacks.

[3] Machine learning for assessing quality of service in the hospitality sector based on customer reviews

Authors: Vladimir Vargas-Calderón, Andreina Moros Ochoa, Gilmer Yovani Castro Nieto, Jorge E. Camargo
Affiliations: Pontificia Universidad Javeriana; Fundación Universitaria Konrad Lorenz
Note: 29 pages, 6 figures
Link: https://arxiv.org/abs/2107.10328
Abstract: The increasing use of online hospitality platforms provides firsthand information about clients preferences, which are essential to improve hotel services and increase the quality of service perception. Customer reviews can be used to automatically extract the most relevant aspects of the quality of service for hospitality clientele. This paper proposes a framework for the assessment of the quality of service in the hospitality sector based on the exploitation of customer reviews through natural language processing and machine learning methods. The proposed framework automatically discovers the quality of service aspects relevant to hotel customers. Hotel reviews from Bogotá and Madrid are automatically scrapped from Booking.com. Semantic information is inferred through Latent Dirichlet Allocation and FastText, which allow representing text reviews as vectors. A dimensionality reduction technique is applied to visualise and interpret large amounts of customer reviews. Visualisations of the most important quality of service aspects are generated, allowing to qualitatively and quantitatively assess the quality of service. Results show that it is possible to automatically extract the main quality of service aspects perceived by customers from large customer review datasets. These findings could be used by hospitality managers to understand clients better and to improve the quality of service.

Other (5 papers)

[1] Semiparametric Latent Topic Modeling on Consumer-Generated Corpora

Authors: Dominic B. Dayta, Erniel B. Barrios
Affiliations: School of Statistics, University of the Philippines Diliman
Link: https://arxiv.org/abs/2107.10651
Abstract: Legacy procedures for topic modelling have generally suffered problems of overfitting and a weakness towards reconstructing sparse topic structures. With motivation from a consumer-generated corpora, this paper proposes semiparametric topic model, a two-step approach utilizing nonnegative matrix factorization and semiparametric regression in topic modeling. The model enables the reconstruction of sparse topic structures in the corpus and provides a generative model for predicting topics in new documents entering the corpus. Assuming the presence of auxiliary information related to the topics, this approach exhibits better performance in discovering underlying topic structures in cases where the corpora are small and limited in vocabulary. In an actual consumer feedback corpus, the model also demonstrably provides interpretable and useful topic definitions comparable with those produced by other methods.
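
The first, matrix-factorization step can be illustrated with scikit-learn's NMF on a toy corpus; the corpus, topic count, and the follow-up semiparametric regression are placeholders rather than the paper's setup.

```python
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

reviews = ["battery life is great but the screen is dim",
           "fast delivery and friendly customer service",
           "screen quality is poor and the battery drains quickly"]

tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(reviews)                 # documents x terms
nmf = NMF(n_components=2, random_state=0)
doc_topics = nmf.fit_transform(X)                # documents x topics (nonnegative weights)
topic_terms = nmf.components_                    # topics x terms
print(doc_topics.round(2))
# The semiparametric step would then regress these topic weights on auxiliary
# document information to predict topics for documents entering the corpus.
```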

[2] Read, Attend, and Code: Pushing the Limits of Medical Codes Prediction from Clinical Notes by Machines

Authors: Byung-Hak Kim, Varun Ganapathi
Note: To appear in Proceedings of Machine Learning Research, Volume 149: Machine Learning for Healthcare Conference (MLHC), Virtual, August 6-7, 2021
Link: https://arxiv.org/abs/2107.10650
Abstract: Prediction of medical codes from clinical notes is both a practical and essential need for every healthcare delivery organization within current medical systems. Automating annotation will save significant time and excessive effort spent by human coders today. However, the biggest challenge is directly identifying appropriate medical codes out of several thousands of high-dimensional codes from unstructured free-text clinical notes. In the past three years, with Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks, there have been vast improvements in tackling the most challenging benchmark of the MIMIC-III-full-label inpatient clinical notes dataset. This progress raises the fundamental question of how far automated machine learning (ML) systems are from human coders' working performance. We assessed the baseline of human coders' performance on the same subsampled testing set. We also present our Read, Attend, and Code (RAC) model for learning the medical code assignment mappings. By connecting convolved embeddings with self-attention and code-title guided attention modules, combined with sentence permutation-based data augmentations and stochastic weight averaging training, RAC establishes a new state of the art (SOTA), considerably outperforming the current best Macro-F1 by 18.7%, and reaches past the human-level coding baseline. This new milestone marks a meaningful step toward fully autonomous medical coding (AMC) in machines reaching parity with human coders' performance in medical code prediction.

[3] Evaluation of contextual embeddings on less-resourced languages

Authors: Matej Ulčar, Aleš Žagar, Carlos S. Armendariz, Andraž Repar, Senja Pollak, Matthew Purver, Marko Robnik-Šikonja
Affiliations: University of Ljubljana, Večna pot, Ljubljana, Slovenia; Queen Mary University of London, Cognitive Science Research Group, Mile End Road, London, United Kingdom; Jožef Stefan Institute, Jamova, Ljubljana, Slovenia
Note: 45 pages
Link: https://arxiv.org/abs/2107.10614
Abstract: The current dominance of deep neural networks in natural language processing is based on contextual embeddings such as ELMo, BERT, and BERT derivatives. Most existing work focuses on English; in contrast, we present here the first multilingual empirical comparison of two ELMo and several monolingual and multilingual BERT models using 14 tasks in nine languages. In monolingual settings, our analysis shows that monolingual BERT models generally dominate, with a few exceptions such as the dependency parsing task, where they are not competitive with ELMo models trained on large corpora. In cross-lingual settings, BERT models trained on only a few languages mostly do best, closely followed by massively multilingual BERT models.

[4] Impacts Towards a comprehensive assessment of the book impact by integrating multiple evaluation sources

Authors: Qingqing Zhou, Chengzhi Zhang
Affiliations: Department of Network and New Media, Nanjing Normal University, Nanjing, China; Department of Information Management, Nanjing University of Science and Technology
Link: https://arxiv.org/abs/2107.10434
Abstract: The surge in the number of books published makes the manual evaluation methods difficult to efficiently evaluate books. The use of books' citations and alternative evaluation metrics can assist manual evaluation and reduce the cost of evaluation. However, most existing evaluation research was based on a single evaluation source with coarse-grained analysis, which may obtain incomprehensive or one-sided evaluation results of book impact. Meanwhile, relying on a single resource for book assessment may lead to the risk that the evaluation results cannot be obtained due to the lack of the evaluation data, especially for newly published books. Hence, this paper measured book impact based on an evaluation system constructed by integrating multiple evaluation sources. Specifically, we conducted finer-grained mining on the multiple evaluation sources, including books' internal evaluation resources and external evaluation resources. Various technologies (e.g. topic extraction, sentiment analysis, text classification) were used to extract corresponding evaluation metrics from the internal and external evaluation resources. Then, Expert evaluation combined with analytic hierarchy process was used to integrate the evaluation metrics and construct a book impact evaluation system. Finally, the reliability of the evaluation system was verified by comparing with the results of expert evaluation, detailed and diversified evaluation results were then obtained. The experimental results reveal that differential evaluation resources can measure the books' impacts from different dimensions, and the integration of multiple evaluation data can assess books more comprehensively. Meanwhile, the book impact evaluation system can provide personalized evaluation results according to the users' evaluation purposes. In addition, the disciplinary differences should be considered for assessing books' impacts.

[5] Evaluation of In-Person Counseling Strategies To Develop Physical Activity Chatbot for Women

Authors: Kai-Hui Liang, Patrick Lange, Yoo Jung Oh, Jingwen Zhang, Yoshimi Fukuoka, Zhou Yu
Affiliations: Columbia University, University of California, Davis, San Francisco
Note: Accepted by SIGDIAL 2021 as a long paper
Link: https://arxiv.org/abs/2107.10410
Abstract: Artificial intelligence chatbots are the vanguard in technology-based intervention to change people's behavior. To develop intervention chatbots, the first step is to understand natural language conversation strategies in human conversation. This work introduces an intervention conversation dataset collected from a real-world physical activity intervention program for women. We designed comprehensive annotation schemes in four dimensions (domain, strategy, social exchange, and task-focused exchange) and annotated a subset of dialogs. We built a strategy classifier with context information to detect strategies from both trainers and participants based on the annotation. To understand how human intervention induces effective behavior changes, we analyzed the relationships between the intervention strategies and the participants' changes in the barrier and social support for physical activity. We also analyzed how participant's baseline weight correlates to the amount of occurrence of the corresponding strategy. This work lays the foundation for developing a personalized physical activity intervention bot. The dataset and code are available at https://github.com/KaihuiLiang/physical-activity-counseling
