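The authors use a rule-based discriminator: at each iteration it keeps a list of the most frequently generated responses and uses a simple binary function to decide whether two responses are similar. This similarity score is then used to update the weights of the training data. Experiments on the Persona dataset (Zhang 2018) show that the model's BLEU is close to that of existing models and that ROUGE is sometimes worse, but diversity metrics, such as the number of distinct n-grams, improve markedly.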
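The diversity metric mentioned above, the number of distinct n-grams, can be sketched as follows. This is a minimal illustration, assuming whitespace-tokenized responses; the function name `distinct_ngrams` and the toy responses are hypothetical and not taken from the paper.

```python
def distinct_ngrams(responses, n):
    """Return (distinct n-gram count, total n-gram count) over tokenized responses."""
    seen = set()
    total = 0
    for tokens in responses:
        # Slide a window of length n over each response.
        for i in range(len(tokens) - n + 1):
            seen.add(tuple(tokens[i:i + n]))
            total += 1
    return len(seen), total


# Toy generated responses (illustrative only): a repeated generic reply
# contributes no new n-grams, so it lowers the distinct/total ratio.
responses = [
    "i do not know".split(),
    "i do not know".split(),
    "i like hiking on weekends".split(),
]

for n in (1, 2):
    distinct, total = distinct_ngrams(responses, n)
    print(n, distinct, total, round(distinct / total, 3))
```

A model that keeps producing the same generic reply scores low on this count even when its BLEU is acceptable, which is why the two kinds of metric are reported side by side.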
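Paper 2: Do Neural Dialog Systems Use the Conversation History Effectively? An Empirical Study
This paper adds temporal and spatial features to tackle response selection in dialogue systems. The method has two steps: first, soft alignment captures the association between the context and the response; second, the attention maps are aggregated along the temporal dimension, and 3D convolution and pooling extract the matching information. The model consists of a representation module and a matching block, as shown in Figure 1. The representation module uses a Bi-GRU, and the matching block uses a deep 3D convolutional network (Ji 2013).
Figure 1
Spatio-temporal matching works as follows. Spatial associations between sentences are built with an attention mechanism. Temporal associations are built by extending the 3D features from different time steps into a 4D "cube". Just as a 2D kernel is applied to 3D data, a 3D kernel is applied here to the 4D data, and pooling is performed in 3D. A final softmax layer performs the classification.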
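The pipeline described above can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the dot-product similarity, the two channels, the fixed averaging kernel, and all function names are assumptions made for illustration. In a real model the kernels are learned, a nonlinearity follows each convolution, and a deep learning framework would be used.

```python
def similarity_map(utterance, response):
    """Soft-alignment scores: dot product of every utterance word vector
    with every response word vector -> (utt_len, resp_len) matrix."""
    return [[sum(a * b for a, b in zip(u, r)) for r in response]
            for u in utterance]


def conv3d(volume, kernel):
    """Valid (no padding, stride 1) 3D cross-correlation of one channel."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    return [[[sum(volume[t + dt][h + dh][w + dw] * kernel[dt][dh][dw]
                  for dt in range(kt)
                  for dh in range(kh)
                  for dw in range(kw))
              for w in range(W - kw + 1)]
             for h in range(H - kh + 1)]
            for t in range(T - kt + 1)]


def conv3d_channels(cube, kernels):
    """Apply one 3D kernel per channel of a 4D cube and sum over channels."""
    maps = [conv3d(volume, kernel) for volume, kernel in zip(cube, kernels)]
    first = maps[0]
    return [[[sum(m[t][h][w] for m in maps)
              for w in range(len(first[0][0]))]
             for h in range(len(first[0]))]
            for t in range(len(first))]


def max_pool3d(volume, size):
    """Non-overlapping 3D max pooling with a cubic window of side `size`."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    return [[[max(volume[t + dt][h + dh][w + dw]
                  for dt in range(size)
                  for dh in range(size)
                  for dw in range(size))
              for w in range(0, W - size + 1, size)]
             for h in range(0, H - size + 1, size)]
            for t in range(0, T - size + 1, size)]


# Toy input: 3 context turns, 3 words each, 2-dimensional word vectors.
context = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 1.0], [0.5, 0.0]],
]
response = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Channel 0: similarity maps stacked over turns -> (turns, utt_len, resp_len).
channel0 = [similarity_map(utt, response) for utt in context]
# Channel 1: hypothetical second representation (here just a scaled copy).
channel1 = [[[0.5 * v for v in row] for row in plane] for plane in channel0]
cube = [channel0, channel1]  # 4D: (channels, turns, utt_len, resp_len)

averaging_kernel = [[[0.125] * 2 for _ in range(2)] for _ in range(2)]
features = conv3d_channels(cube, [averaging_kernel, averaging_kernel])
pooled = max_pool3d(features, 2)
print(len(features), len(features[0]), len(features[0][0]))  # 2 2 2
print(len(pooled), len(pooled[0]), len(pooled[0][0]))        # 1 1 1
```

The key point is that the kernel slides along the turn axis as well as the two word axes, so a single filter can pick up matching patterns that span adjacent turns.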
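Paper 6: One Time of Interaction May Not Be Enough: Go Deep with an Interaction-over-Interaction Network for Response Selection in Dialogues
Authors: Chongyang Tao, Wei Wu, Can Xu, Wenpeng Hu, Dongyan Zhao, Rui Yan
Paper link: https://www.aclweb.org/anthology/P19-1001
This paper proposes a retrieval-based deep interaction dialogue model to address the shallow use of interaction information in existing models. The problem is defined as follows: the dialogue data consist of triples D = {(y_i, c_i, r_i)}, where c_i is the dialogue query, r_i is a response, and y_i is a label indicating whether r_i is a response to c_i. The model must compute a matching score between c_i and r_i that indicates whether the two are related.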
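The focus of this paper is not the model but the way the dataset is collected and organized. The problem it addresses is how to use different persuasion strategies in dialogue to persuade people to donate to a charity. The approach is to design a data collection procedure, analyze and categorize the persuasion strategies that appear in the data, and then train a classifier on the categorized results. Data collection is the core of the paper. The authors first designed an online task on the Amazon Mechanical Turk platform. The task has four parts: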
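An example of a high-entropy utterance is "what did you do today", which admits many different replies, whereas "what is the color of sky" has low entropy because the reply is unambiguous. When entropy is computed, the source and target of each exchange are distinguished (the source is the side that initiates the exchange, and the target is the side that responds). Given a dataset D, the target and source entropies are defined as follows: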
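As a concrete illustration, the sketch below computes utterance-level entropies under one assumption: the target entropy of a source utterance is the Shannon entropy (base 2) of the empirical distribution of targets paired with it in D, and the source entropy of a target is defined symmetrically. The function names and toy pairs are hypothetical.

```python
import math
from collections import Counter, defaultdict


def target_entropy(pairs):
    """Map each source utterance to the entropy of its observed targets."""
    targets_by_source = defaultdict(Counter)
    for source, target in pairs:
        targets_by_source[source][target] += 1
    entropies = {}
    for source, counts in targets_by_source.items():
        total = sum(counts.values())
        entropies[source] = -sum(
            (c / total) * math.log2(c / total) for c in counts.values()
        )
    return entropies


def source_entropy(pairs):
    """Map each target utterance to the entropy of its observed sources."""
    return target_entropy([(target, source) for source, target in pairs])


# Toy (source, target) pairs, illustrative only.
pairs = [
    ("what did you do today", "i went hiking"),
    ("what did you do today", "nothing much"),
    ("what did you do today", "i worked"),
    ("what did you do today", "i slept"),
    ("what is the color of sky", "blue"),
    ("what is the color of sky", "blue"),
    ("what color is the ocean", "blue"),
]

h_target = target_entropy(pairs)
h_source = source_entropy(pairs)
for utterance, value in h_target.items():
    print(f"{utterance!r}: target entropy = {value:.3f}")
```

In this toy data the open-ended question has four equally likely replies and therefore a target entropy of 2 bits, while the sky question always receives the same reply and has entropy 0.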
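In addition, clustering the utterances first also affects the experimental results. Clustering reveals whether the replies to a query are semantically diverse. For example, "how old are you" has many possible answers, but they are all semantically close. A sentence may have low entropy on its own, but if the cluster it belongs to has high entropy, that cluster is still removed from the dataset. The target entropy of a source cluster is defined as follows: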
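The cluster-level variant can be sketched in the same way. The assumption here is that every utterance has already been assigned a cluster id by some clustering step, and that the target entropy of a source cluster is the entropy of the distribution over target clusters it is paired with. The dictionaries `source_cluster` and `target_cluster`, the cluster names, and the threshold are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def cluster_target_entropy(pairs, source_cluster, target_cluster):
    """Map each source cluster to the entropy of its paired target clusters."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source_cluster[source]][target_cluster[target]] += 1
    entropies = {}
    for cluster_id, target_counts in counts.items():
        total = sum(target_counts.values())
        entropies[cluster_id] = -sum(
            (n / total) * math.log2(n / total) for n in target_counts.values()
        )
    return entropies


# Toy data: the age questions have different surface replies that all fall
# into one semantic cluster, so their cluster-level entropy is zero.
pairs = [
    ("how old are you", "i am 20"),
    ("how old are you", "i am 25"),
    ("what is your age", "i'm 30"),
    ("what did you do today", "i went hiking"),
    ("what did you do today", "i watched a movie"),
]
source_cluster = {
    "how old are you": "ask_age",
    "what is your age": "ask_age",
    "what did you do today": "ask_day",
}
target_cluster = {
    "i am 20": "age",
    "i am 25": "age",
    "i'm 30": "age",
    "i went hiking": "outdoors",
    "i watched a movie": "indoors",
}

entropies = cluster_target_entropy(pairs, source_cluster, target_cluster)

# Drop every pair whose source cluster exceeds a (hypothetical) threshold.
THRESHOLD = 0.5
kept = [p for p in pairs if entropies[source_cluster[p[0]]] <= THRESHOLD]
print(entropies)
print(len(kept), "of", len(pairs), "pairs kept")
```

At the utterance level "how old are you" would look diverse because its replies differ in wording, but at the cluster level it is fully predictable and survives the filter.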
Mohammad Norouzi, Samy Bengio, Navdeep Jaitly, Mike Schuster, Yonghui Wu, Dale Schuurmans, et al. 2016. Reward augmented maximum likelihood for neural structured prediction. In Advances in Neural Information Processing Systems, pages 1723–1731.
Jiwei Li, Will Monroe, Tianlin Shi, Sebastien Jean, Alan Ritter, and Dan Jurafsky. 2017b. Adversarial learning for neural dialogue generation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2157–2169.
Saizheng Zhang, Emily Dinan, Jack Urbanek, Arthur Szlam, Douwe Kiela, and Jason Weston. 2018. Personalizing dialogue agents: I have a dog, do you have pets too? In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), volume 1, pages 2204–2213.
Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In Proceedings of the International Conference on Learning Representations (ICLR 2015).
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems, pages 5998–6008.
Shuiwang Ji, Wei Xu, Ming Yang, and Kai Yu. 2013. 3d convolutional neural networks for human action recognition. IEEE transactions on pattern analysis and machine intelligence, 35(1):221–231.
Ryan Lowe, Nissan Pow, Iulian V. Serban, and Joelle Pineau. 2015b. The ubuntu dialogue corpus: A large dataset for research in unstructured multi-turn dialogue systems. Proceedings of the SIGDIAL 2015 Conference, pages 285–294.
Yu Wu, Wei Wu, Chen Xing, Zhoujun Li, and Ming Zhou. 2017. Sequential matching network: A new architecture for multi-turn response selection in retrieval-based chatbots. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pages 496–505.
Xiangyang Zhou, Lu Li, Daxiang Dong, Yi Liu, Ying Chen, Wayne Xin Zhao, Dianhai Yu, and Hua Wu. 2018. Multi-turn response selection for chatbots with deep attention matching network. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, pages 1–10.
Kangyan Zhou, Shrimai Prabhumoye, and Alan W Black. 2018. A dataset for document grounded conversations. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 708–713.
Iulian V Serban, Alessandro Sordoni, Yoshua Bengio, Aaron Courville, and Joelle Pineau. 2016. Building end-to-end dialogue systems using generative hierarchical neural network models. In Thirtieth AAAI Conference on Artificial Intelligence.
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pages 1097–1105.
Sainbayar Sukhbaatar, Jason Weston, Rob Fergus, et al. 2015. End-to-end memory networks. In NIPS.
Antoine Bordes, Y-Lan Boureau, and Jason Weston. 2016. Learning end-to-end goal-oriented dialog. arXiv preprint arXiv:1605.07683.
Rich Caruana. 1993. Multitask learning: A knowledge-based source of inductive bias. In ICML, pages 41–48.
Zichao Yang, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola, and Eduard Hovy. 2016. Hierarchical attention networks for document classification. In NAACL.
Seungwhan Moon, Leonard Neves, and Vitor Carvalho. 2018a. Multimodal named entity recognition for short social media posts. NAACL.
Seungwhan Moon, Leonard Neves, and Vitor Carvalho. 2018b. Zeroshot multimodal named entity disambiguation for noisy social media posts. ACL.
Seungwhan Moon and Jaime Carbonell. 2017. Completely heterogeneous transfer learning with attention: What and what not to transfer. IJCAI.
Ilya Sutskever, Oriol Vinyals, and Quoc V Le. 2014. Sequence to sequence learning with neural networks. In NIPS.
Tom Young, Erik Cambria, Iti Chaturvedi, Minlie Huang, Hao Zhou, and Subham Biswas. 2018. Augmenting end-to-end dialog systems with commonsense knowledge. AAAI.
Prasanna Parthasarathi and Joelle Pineau. 2018. Extending neural generative conversational model using external knowledge sources. EMNLP.
Puyang Xu and Qi Hu. 2018. An end-to-end approach for handling unknown slot values in dialogue state tracking. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1448–1457. Association for Computational Linguistics.
Junyoung Chung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio. 2014. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555.
Bryan McCann, Nitish Shirish Keskar, Caiming Xiong, and Richard Socher. 2018. The natural language decathlon: Multitask learning as question answering. arXiv preprint arXiv:1806.08730.
Osman Ramadan, Paweł Budzianowski, and Milica Gasic. 2018. Large-scale multi-domain belief tracking with knowledge sharing. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 432–437. Association for Computational Linguistics.
Victor Zhong, Caiming Xiong, and Richard Socher. 2018. Global-locally self-attentive encoder for dialogue state tracking. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1458–1467. Association for Computational Linguistics.
Elnaz Nouri and Ehsan Hosseini-Asl. 2018. Toward scalable neural dialogue state tracking model. In Advances in neural information processing systems (NeurIPS), 2nd Conversational AI workshop. https://arxiv.org/abs/1812.00899.
Paweł Budzianowski, Tsung-Hsien Wen, Bo-Hsiang Tseng, Inigo Casanueva, Stefan Ultes, Osman Ramadan, and Milica Gasic. 2018. Multiwoz - a large-scale multi-domain wizard-of-oz dataset for task-oriented dialogue modelling. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 5016–5026.
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778.
Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E Hinton. 2016. Layer normalization. arXiv preprint arXiv:1607.06450.
Koichiro Yoshino, Chiori Hori, Julien Perez, Luis Fernando D'Haro, Lazaros Polymenakos, Chulaka Gunasekara, Walter S. Lasecki, Jonathan Kummerfeld, Michael Galley, Chris Brockett, Jianfeng Gao, Bill Dolan, Sean Gao, Tim K. Marks, Devi Parikh, and Dhruv Batra. 2018. The 7th dialog system technology challenge. arXiv preprint.
Gunnar A Sigurdsson, Gul Varol, Xiaolong Wang, Ali Farhadi, Ivan Laptev, and Abhinav Gupta. 2016. Hollywood in homes: Crowdsourcing data collection for activity understanding. In European Conference on Computer Vision, pages 510–526. Springer.
Satwik Kottur, Jose MF Moura, Devi Parikh, Dhruv Batra, and Marcus Rohrbach. 2018. Visual coreference resolution in visual dialog using neural module networks. In Proceedings of the European Conference on Computer Vision (ECCV), pages 153–169.
Abhishek Das, Satwik Kottur, Khushi Gupta, Avi Singh, Deshraj Yadav, Jose MF Moura, Devi Parikh, and Dhruv Batra. 2017a. Visual dialog. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, volume 2.
Klaus Krippendorff. 2004. Reliability in content analysis: Some common misconceptions and recommendations. Human communication research, 30(3):411–433.
Keinosuke Fukunaga and Larry Hostetler. 1975. The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Transactions on information theory, 21(1):32–40.
William R Miller and Stephen Rollnick. 2012. Motivational interviewing: Helping people change. Guilford press.
William R Miller, Theresa B Moyers, Denise Ernst, and Paul Amrhein. 2003. Manual for the motivational interviewing skill code (misc). Unpublished manuscript. Albuquerque: Center on Alcoholism, Substance Abuse and Addictions, University of New Mexico.
Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollar. 2017. Focal loss for dense object detection. In Proceedings of the IEEE international conference on computer vision, pages 2980–2988.