[Paper Recommendations] Six Recent Machine Translation Papers — Self-Attentive Residual Decoder, SGNMT, Cascade Approach, Neural Sequence Prediction, Benchmarking, Human Parity

[Overview] The Zhuanzhi content team has compiled six recent papers on machine translation. Brief introductions follow — enjoy!

1. Self-Attentive Residual Decoder for Neural Machine Translation


Authors: Lesly Miculicich Werlen, Nikolaos Pappas, Dhananjay Ram, Andrei Popescu-Belis

Abstract: Neural sequence-to-sequence networks with attention have achieved remarkable performance for machine translation. One of the reasons for their effectiveness is their ability to capture relevant source-side contextual information at each time-step prediction through an attention mechanism. However, the target-side context is solely based on the sequence model which, in practice, is prone to a recency bias and lacks the ability to effectively capture non-sequential dependencies among words. To address this limitation, we propose a target-side-attentive residual recurrent network for decoding, where attention over previous words contributes directly to the prediction of the next word. The residual learning facilitates the flow of information from the distant past and is able to emphasize any of the previously translated words, and hence gains access to a wider context. The proposed model outperforms a neural MT baseline as well as a memory and self-attention network on three language pairs. The analysis of the attention learned by the decoder confirms that it emphasizes a wider context, and that it captures syntactic-like structures.
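The core mechanism described above, attention over previously translated target words combined with a residual connection, can be sketched in a few lines of NumPy. This is a hypothetical simplification for illustration only; the paper's model is a full residual recurrent network, and the function names here are not from the paper:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def target_side_attention(prev_states, query):
    """Attend over all previously generated target-word states.

    prev_states: (t, d) hidden states for the t words already translated
    query:       (d,)   current decoder state
    Returns the residual-augmented context (query + attention summary),
    so any distant word can contribute directly to the next prediction.
    """
    scores = prev_states @ query      # (t,) dot-product relevance
    weights = softmax(scores)         # attention over past target words
    context = weights @ prev_states   # (d,) weighted summary of the past
    return query + context            # residual connection

rng = np.random.default_rng(0)
states = rng.normal(size=(5, 8))  # five previously translated words
q = rng.normal(size=8)            # current decoder state
out = target_side_attention(states, q)
```

Because the attention weights are recomputed at every step, the decoder can re-emphasize a word translated long ago, which is exactly the access to wider context the abstract claims.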

Published: arXiv, March 23, 2018

URL:

http://www.zhuanzhi.ai/document/dbc697ffcb43eb2a4b64fc55d86a543f

2. Why not be Versatile? Applications of the SGNMT Decoder for Machine Translation


Authors: Felix Stahlberg, Danielle Saunders, Gonzalo Iglesias, Bill Byrne

Affiliation: University of Cambridge

Abstract: SGNMT is a decoding platform for machine translation which allows pairing various modern neural models of translation with different kinds of constraints and symbolic models. In this paper, we describe three use cases in which SGNMT is currently playing an active role: (1) teaching, as SGNMT is being used for course work and student theses in the MPhil in Machine Learning, Speech and Language Technology at the University of Cambridge, (2) research, as most of the research work of the Cambridge MT group is based on SGNMT, and (3) technology transfer, as we show how SGNMT is helping to transfer research findings from the laboratory to industry, e.g. into a product of SDL plc.

Published: arXiv, March 20, 2018

URL:

http://www.zhuanzhi.ai/document/3e0197f22fd5e19520c2cdae69b4b726

3. English-Catalan Neural Machine Translation in the Biomedical Domain through the Cascade Approach


Authors: Marta R. Costa-jussà, Noe Casas, Maite Melero

Affiliation: Universitat Politècnica de Catalunya

Abstract: This paper describes the methodology followed to build a neural machine translation system in the biomedical domain for the English-Catalan language pair. This task can be considered a low-resource task from the point of view of both the domain and the language pair. To face this task, this paper reports experiments on a cascade pivot strategy through Spanish for neural machine translation, using the English-Spanish SciELO and Spanish-Catalan El Periódico databases. To test the final performance of the system, we have created a new test data set for English-Catalan in the biomedical domain which is freely available on request.
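The cascade pivot strategy simply chains two independently trained systems through the pivot language. A minimal sketch, with toy dictionary "models" standing in for the English-Spanish and Spanish-Catalan networks (the names below are illustrative, not from the paper):

```python
def cascade_translate(text, src_to_pivot, pivot_to_tgt):
    """Pivot (cascade) translation: chain two MT systems.

    src_to_pivot and pivot_to_tgt are any callables that translate a
    sentence; in the paper these would be the English->Spanish and
    Spanish->Catalan neural systems.
    """
    pivot = src_to_pivot(text)   # English -> Spanish (pivot language)
    return pivot_to_tgt(pivot)   # Spanish -> Catalan (target language)

# Toy stand-ins for the two neural systems, for illustration only:
en_es = {"the cell": "la célula"}
es_ca = {"la célula": "la cèl·lula"}
result = cascade_translate("the cell", en_es.get, es_ca.get)
# result == "la cèl·lula"
```

The appeal of the cascade is that each leg (English-Spanish, Spanish-Catalan) has far more training data than the direct English-Catalan pair; the cost is that errors in the first leg propagate into the second.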

Published: arXiv, March 20, 2018

URL:

http://www.zhuanzhi.ai/document/2fbed977773fddbd8da6233e0824d330

4. Generative Bridging Network in Neural Sequence Prediction


Authors: Wenhu Chen, Guanlin Li, Shuo Ren, Shujie Liu, Zhirui Zhang, Mu Li, Ming Zhou

Affiliations: University of Science and Technology, Beijing University of Aeronautics and Astronautics, University of California

Abstract: In order to alleviate data sparsity and overfitting problems in maximum likelihood estimation (MLE) for sequence prediction tasks, we propose the Generative Bridging Network (GBN), in which a novel bridge module is introduced to assist the training of the sequence prediction model (the generator network). Unlike MLE, which directly maximizes the conditional likelihood, the bridge extends the point-wise ground truth to a bridge distribution conditioned on it, and the generator is optimized to minimize their KL divergence. Three different GBNs, namely uniform GBN, language-model GBN and coaching GBN, are proposed to penalize confidence, enhance language smoothness and relieve learning burden, respectively. Experiments conducted on two recognized sequence prediction tasks (machine translation and abstractive text summarization) show that our proposed GBNs can yield significant improvements over strong baselines. Furthermore, by analyzing samples drawn from different bridges, expected influences on the generator are verified.
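The uniform GBN in particular can be sketched concretely: the point-wise ground truth is replaced by a smoothed bridge distribution, and the generator is trained to minimize the KL divergence to it. This is a simplified illustration in the spirit of label smoothing, not the paper's exact bridge module:

```python
import numpy as np

def uniform_bridge(one_hot, eps=0.1):
    """Uniform-bridge distribution: spread eps of the probability mass
    uniformly over the vocabulary, keeping 1-eps on the ground truth.
    This penalizes over-confident generator predictions."""
    vocab_size = one_hot.shape[-1]
    return (1.0 - eps) * one_hot + eps / vocab_size

def kl_divergence(p, q):
    """KL(p || q), skipping zero-probability entries of p."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

V = 5
truth = np.eye(V)[2]                       # ground-truth word at index 2
bridge = uniform_bridge(truth, eps=0.1)    # [0.02, 0.02, 0.92, 0.02, 0.02]
pred = np.array([0.05, 0.05, 0.80, 0.05, 0.05])  # generator's softmax output
loss = kl_divergence(bridge, pred)         # training objective: minimize this
```

With a one-hot target, MLE only rewards mass on the correct word; the bridge distribution additionally penalizes the generator for being more peaked than the bridge, which is how the uniform GBN "penalizes confidence".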

Published: arXiv, March 18, 2018

URL:

http://www.zhuanzhi.ai/document/33ef54e6acf3422c70b73e5bfa1c4787

5. TBD: Benchmarking and Analyzing Deep Neural Network Training


Authors: Hongyu Zhu, Mohamed Akrout, Bojian Zheng, Andrew Pelegris, Amar Phanishayee, Bianca Schroeder, Gennady Pekhimenko

Affiliations: University of Toronto, Microsoft Research

摘要:The recent popularity of deep neural networks (DNNs) has generated a lot of research interest in performing DNN-related computation efficiently. However, the primary focus is usually very narrow and limited to (i) inference -- i.e. how to efficiently execute already trained models and (ii) image classification networks as the primary benchmark for evaluation. Our primary goal in this work is to break this myopic view by (i) proposing a new benchmark for DNN training, called TBD (TBD is short for Training Benchmark for DNNs), that uses a representative set of DNN models that cover a wide range of machine learning applications: image classification, machine translation, speech recognition, object detection, adversarial networks, reinforcement learning, and (ii) by performing an extensive performance analysis of training these different applications on three major deep learning frameworks (TensorFlow, MXNet, CNTK) across different hardware configurations (single-GPU, multi-GPU, and multi-machine). TBD currently covers six major application domains and eight different state-of-the-art models. We present a new toolchain for performance analysis for these models that combines the targeted usage of existing performance analysis tools, careful selection of new and existing metrics and methodologies to analyze the results, and utilization of domain specific characteristics of DNN training. We also build a new set of tools for memory profiling in all three major frameworks; much needed tools that can finally shed some light on precisely how much memory is consumed by different data structures (weights, activations, gradients, workspace) in DNN training. By using our tools and methodologies, we make several important observations and recommendations on where the future research and optimization of DNN training should be focused.
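The memory-profiling question the abstract raises, how much memory the four classes of data structures consume, can be illustrated with a back-of-the-envelope accounting. This is a rough sketch under stated assumptions (fp32, an Adam-style optimizer with two extra slots per parameter), not the paper's toolchain:

```python
def training_memory_mb(num_params, activation_elems, batch_size,
                       bytes_per_elem=4, optimizer_slots=2):
    """Rough memory accounting for one training step.

    Covers the four data structures the TBD paper profiles:
    weights, gradients, optimizer workspace, and activations
    stored for the backward pass. All figures are estimates.
    """
    weights = num_params * bytes_per_elem
    grads = num_params * bytes_per_elem                    # one per weight
    optim = num_params * bytes_per_elem * optimizer_slots  # e.g. Adam moments
    acts = activation_elems * batch_size * bytes_per_elem  # scales with batch
    return (weights + grads + optim + acts) / 2**20

# A hypothetical 10M-parameter model with 500K activation elements
# per example, trained at batch size 32:
mb = training_memory_mb(num_params=10_000_000,
                        activation_elems=500_000, batch_size=32)
```

The sketch makes the paper's point visible: only the activation term grows with batch size, so activation memory, not the weights, usually dominates training-time consumption on large batches.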

Published: arXiv, March 16, 2018

URL:

http://www.zhuanzhi.ai/document/505ba67620ace41b240b8d9a310e61be

6. Achieving Human Parity on Automatic Chinese to English News Translation


Authors: Hany Hassan, Anthony Aue, Chang Chen, Vishal Chowdhary, Jonathan Clark, Christian Federmann, Xuedong Huang, Marcin Junczys-Dowmunt, William Lewis, Mu Li, Shujie Liu, Tie-Yan Liu, Renqian Luo, Arul Menezes, Tao Qin, Frank Seide, Xu Tan, Fei Tian, Lijun Wu, Shuangzhi Wu, Yingce Xia, Dongdong Zhang, Zhirui Zhang, Ming Zhou

Abstract: Machine translation has made rapid advances in recent years. Millions of people are using it today in online translation systems and mobile applications in order to communicate across language barriers. The question naturally arises whether such systems can approach or achieve parity with human translations. In this paper, we first address the problem of how to define and accurately measure human parity in translation. We then describe Microsoft's machine translation system and measure the quality of its translations on the widely used WMT 2017 news translation task from Chinese to English. We find that our latest neural machine translation system has reached a new state of the art, and that the translation quality is at human parity when compared to professional human translations. We also find that it significantly exceeds the quality of crowd-sourced non-professional translations.

Published: arXiv, March 15, 2018

URL:

http://www.zhuanzhi.ai/document/7586f0034a3b659e8c7148e63b02c0c2

Originally published on the WeChat official account Zhuanzhi (Quan_Zhuanzhi)

Original publication date: 2018-03-27
