A Sentence Representation Model That Generalizes: GenSen Evaluation Experiments

Author: sparkexpert
Published: 2019-05-26 14:10:24

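The paper Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning (https://arxiv.org/abs/1804.00079), published at ICLR 2018, proposes a model whose sentence representations generalize across a wide range of tasks. It is trained with a one-to-many multi-task learning framework. The paper states its main contribution as follows.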

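"The primary contribution of our work is to combine the benefits of diverse sentence-representation learning objectives into a single multi-task framework." The paper's experiments also indicate that syntactic properties are learned better once a multilingual neural machine translation task is added, that sentence length and word order are learned through a parsing task, and that training on natural language inference encodes syntactic information.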

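2. Experiments

(1) The reproduction was done in a Python 3 environment, so parts of the GenSen code had to be modified, mainly in two places:

1) Part of glove2h5 was changed. Under Python 3 the float(val) conversion fails on some lines, so a try/except block is needed to handle them.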

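import gc

import numpy as np

print('change vectors')
w2v_vector = []
for line in glove_vectors:
    try:
        # Append one row per word so the final array is 2-D (words x dims);
        # a line with a non-numeric token raises ValueError and is skipped.
        w2v_vector.append([float(val) for val in line[1:]])
    except ValueError:
        print("error on line", line)

# Release the raw token lists before building the array to limit peak memory.
del glove_vectors
gc.collect()
vectors = np.array(w2v_vector, dtype=np.float32)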

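Note: the conversion consumes a large amount of memory, so the intermediate data is deleted and garbage-collected explicitly; otherwise the script can easily run out of memory.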
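One possible cause of the float(val) failure is that a few GloVe tokens themselves contain spaces, so a plain whitespace split leaves non-numeric pieces among the values. This is an assumption, since the post does not show the failing lines. The sketch below splits each line from the right using a known embedding dimension, and drops a malformed line from both the vocabulary and the vectors so the two stay aligned. The names parse_glove_line and load_glove are hypothetical helpers, not part of GenSen.

```python
def parse_glove_line(line, dim):
    """Split one GloVe text line into (word, list of `dim` floats).

    Splitting from the right keeps any spaces inside the word itself.
    Returns None when the line does not hold `dim` numeric values.
    """
    parts = line.rstrip('\n').rsplit(' ', dim)
    if len(parts) != dim + 1:
        return None
    try:
        return parts[0], [float(val) for val in parts[1:]]
    except ValueError:
        return None


def load_glove(lines, dim):
    """Return (vocab, rows) where vocab[i] is the word for rows[i]."""
    vocab, rows = [], []
    for line in lines:
        parsed = parse_glove_line(line, dim)
        if parsed is None:
            print('skipping malformed line:', line[:40].rstrip())
            continue
        word, values = parsed
        vocab.append(word)
        rows.append(values)
    return vocab, rows


# Tiny made-up input with dimension 3 (real GloVe files use e.g. 300).
demo_lines = [
    'hello 0.1 0.2 0.3\n',
    '. . . 0.4 0.5 0.6\n',    # the token ". . ." contains spaces
    'broken 0.7 oops 0.9\n',  # non-numeric value, so the line is skipped
]
vocab, rows = load_glove(demo_lines, dim=3)
print(vocab)
print(rows)
```

The resulting rows can then be turned into a float32 array in the same way as in the modified script above.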

2) Opening the vocab file under Python 3 runs into an encoding mismatch (gensen.py).

def _load_params(self):
    """Load pretrained params."""
    # Read vocab pickle files
    filename = os.path.join(
        self.model_folder,
        '%s_vocab.pkl' % (self.filename_prefix)
    )
    print(filename)
    with open(filename, 'rb') as f:
        # The vocab pickle was written by Python 2; decode its byte
        # strings as latin1 instead of the default ASCII.
        model_vocab = pickle.load(f, encoding='latin1')
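To illustrate why the encoding matters, here is a minimal, self-contained sketch. It uses a hand-written protocol-0 pickle of a Python 2 byte string as a stand-in for the real vocab file (the actual GenSen vocab pickle is not reproduced here). Loading it with the default settings fails on the non-ASCII byte, while encoding='latin1' succeeds.

```python
import pickle

# Hand-written protocol-0 pickle of the Python 2 byte string 'caf\xe9'.
# This is a stand-in for an entry in a vocab file saved under Python 2.
py2_pickle = b"S'caf\\xe9'\n."

try:
    pickle.loads(py2_pickle)  # default encoding is ASCII
    default_ok = True
except UnicodeDecodeError as err:
    default_ok = False
    print('default load failed:', err)

# latin1 maps every byte to a code point, so decoding cannot fail.
word = pickle.loads(py2_pickle, encoding='latin1')
print(word)
```

Because latin1 accepts every byte value, it is a common choice when the original encoding of a Python 2 pickle is unknown; strings that were really UTF-8 would still need to be re-decoded afterwards.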

(2) Evaluation

1) The example bundled with GenSen

sentences = [
    'hello world .',
    'the quick brown fox jumped over the lazy dog .',
    'this is a sentence .'
]
vocab = [
    'the', 'quick', 'brown', 'fox', 'jumped', 'over', 'lazy', 'dog',
    'hello', 'world', '.', 'this', 'is', 'a', 'sentence',
    '<s>', '</s>', '<pad>', '<unk>'
]
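As a rough illustration of what such a vocabulary is typically used for, the sketch below maps the example sentences to padded id sequences using the special tokens in the list. This is a generic preprocessing sketch under that assumption, not GenSen's actual implementation; word2id and encode are hypothetical names.

```python
sentences = [
    'hello world .',
    'the quick brown fox jumped over the lazy dog .',
    'this is a sentence .'
]
vocab = [
    'the', 'quick', 'brown', 'fox', 'jumped', 'over', 'lazy', 'dog',
    'hello', 'world', '.', 'this', 'is', 'a', 'sentence',
    '<s>', '</s>', '<pad>', '<unk>'
]

# Position in the list serves as the word id.
word2id = {word: idx for idx, word in enumerate(vocab)}


def encode(sentence, max_len):
    """Wrap a sentence in <s>/</s>, map words to ids, pad to max_len."""
    tokens = ['<s>'] + sentence.split() + ['</s>']
    ids = [word2id.get(tok, word2id['<unk>']) for tok in tokens]
    return ids + [word2id['<pad>']] * (max_len - len(ids))


# Longest sentence plus the two boundary tokens.
max_len = max(len(s.split()) for s in sentences) + 2
batch = [encode(s, max_len) for s in sentences]
for row in batch:
    print(row)
```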

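2) Tests integrated in SentEval

The SentEval results are shown below. The original post describes them as 17 tasks; the output itself lists 16 downstream tasks and 10 probing tasks.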

{
 'STS12': {
  'MSRpar': {'pearson': (0.4242749254520813, 3.973321856075198e-34), 'spearman': SpearmanrResult(correlation=0.43689783218545136, pvalue=2.623847109207459e-36), 'nsamples': 750},
  'MSRvid': {'pearson': (0.8431200046048173, 9.954996055278301e-204), 'spearman': SpearmanrResult(correlation=0.8434445060271232, pvalue=4.899452803862567e-204), 'nsamples': 750},
  'SMTeuroparl': {'pearson': (0.5085791335463655, 1.4543565654958856e-31), 'spearman': SpearmanrResult(correlation=0.5910758372570859, pvalue=1.3966783465806513e-44), 'nsamples': 459},
  'surprise.OnWN': {'pearson': (0.6924773496538905, 3.609130779999958e-108), 'spearman': SpearmanrResult(correlation=0.6831386989584722, pvalue=3.338887773358492e-104), 'nsamples': 750},
  'surprise.SMTnews': {'pearson': (0.5699883750430004, 9.450042347374515e-36), 'spearman': SpearmanrResult(correlation=0.4924898524588661, pvalue=9.093432952648339e-26), 'nsamples': 399},
  'all': {'pearson': {'mean': 0.607687957660031, 'wmean': 0.6212250301554153}, 'spearman': {'mean': 0.6094093453773997, 'wmean': 0.6243301281564914}}},
 'STS13': {
  'FNWN': {'pearson': (0.44812079835627416, 1.0069228531863214e-10), 'spearman': SpearmanrResult(correlation=0.46648892903400746, pvalue=1.3294443717066755e-11), 'nsamples': 189},
  'headlines': {'pearson': (0.7039260583060535, 3.0407269106065143e-113), 'spearman': SpearmanrResult(correlation=0.6938053503920689, pvalue=9.57272215561543e-109), 'nsamples': 750},
  'OnWN': {'pearson': (0.4673906945667383, 8.638108134571737e-32), 'spearman': SpearmanrResult(correlation=0.4912206669354746, pvalue=2.062109639692091e-35), 'nsamples': 561},
  'all': {'pearson': {'mean': 0.5398125170763554, 'wmean': 0.5832303695138774}, 'spearman': {'mean': 0.550504982120517, 'wmean': 0.5893968096881869}}},
 'STS14': {
  'deft-forum': {'pearson': (0.3569903021625723, 5.687789477438835e-15), 'spearman': SpearmanrResult(correlation=0.35030097553307676, pvalue=1.9451695508522155e-14), 'nsamples': 450},
  'deft-news': {'pearson': (0.6982006599148092, 3.6759088740801256e-45), 'spearman': SpearmanrResult(correlation=0.6737747591797518, pvalue=4.777495360379787e-41), 'nsamples': 300},
  'headlines': {'pearson': (0.6607384123277869, 2.8622323715880622e-95), 'spearman': SpearmanrResult(correlation=0.6358071122283795, pvalue=3.423245972757599e-86), 'nsamples': 750},
  'images': {'pearson': (0.8203867295657831, 9.425906365562756e-184), 'spearman': SpearmanrResult(correlation=0.7886886906152988, pvalue=3.4410673031736275e-160), 'nsamples': 750},
  'OnWN': {'pearson': (0.6293394184088853, 5.683036182764806e-84), 'spearman': SpearmanrResult(correlation=0.670056168045285, pvalue=6.855093979408371e-99), 'nsamples': 750},
  'tweet-news': {'pearson': (0.7373854337595948, 1.3901773295687528e-129), 'spearman': SpearmanrResult(correlation=0.6985061104075353, pvalue=8.2287377776831e-111), 'nsamples': 750},
  'all': {'pearson': {'mean': 0.6505068260232385, 'wmean': 0.6682648878651034}, 'spearman': {'mean': 0.6361889693348879, 'wmean': 0.6545497140576491}}},
 'STS15': {
  'answers-forums': {'pearson': (0.6554249972891065, 2.1271261100008417e-47), 'spearman': SpearmanrResult(correlation=0.6601587975606593, pvalue=2.726124243082021e-48), 'nsamples': 375},
  'answers-students': {'pearson': (0.7573938306575574, 1.316509038401903e-140), 'spearman': SpearmanrResult(correlation=0.7580757377086599, pvalue=5.307184802456452e-141), 'nsamples': 750},
  'belief': {'pearson': (0.6938360601407095, 3.874701942825776e-55), 'spearman': SpearmanrResult(correlation=0.70217697353048, pvalue=5.543214866718992e-57), 'nsamples': 375},
  'headlines': {'pearson': (0.7074291257643678, 7.606360102914599e-115), 'spearman': SpearmanrResult(correlation=0.7048600919337434, pvalue=1.1433940132491299e-113), 'nsamples': 750},
  'images': {'pearson': (0.8668311693111709, 2.796222690292546e-228), 'spearman': SpearmanrResult(correlation=0.8666790008905915, pvalue=4.158163553853086e-228), 'nsamples': 750},
  'all': {'pearson': {'mean': 0.7361830366325823, 'wmean': 0.751571163612001}, 'spearman': {'mean': 0.7383901203248268, 'wmean': 0.7526956790196411}}},
 'STS16': {
  'answer-answer': {'pearson': (0.6390744533171796, 1.4684840675437832e-30), 'spearman': SpearmanrResult(correlation=0.6344016536328181, pvalue=5.220136063860679e-30), 'nsamples': 254},
  'headlines': {'pearson': (0.7236185236906961, 1.1773489013100314e-41), 'spearman': SpearmanrResult(correlation=0.7177487166715745, pvalue=1.0437695211803904e-40), 'nsamples': 249},
  'plagiarism': {'pearson': (0.8068349082371395, 5.0035304651968556e-54), 'spearman': SpearmanrResult(correlation=0.8217365702171775, pvalue=1.328353052825811e-57), 'nsamples': 230},
  'postediting': {'pearson': (0.8460786512088675, 4.5804231431872625e-68), 'spearman': SpearmanrResult(correlation=0.8507624405425562, pvalue=1.4715854949019706e-69), 'nsamples': 244},
  'question-question': {'pearson': (0.3250647177568497, 1.5686673016894581e-06), 'spearman': SpearmanrResult(correlation=0.3367988573603013, pvalue=6.154958595925993e-07), 'nsamples': 209},
  'all': {'pearson': {'mean': 0.6681342508421466, 'wmean': 0.6766101765111587}, 'spearman': {'mean': 0.6722896476848855, 'wmean': 0.6802983628200636}}},
 'MR': {'devacc': 80.18, 'acc': 80.9, 'ndev': 10662, 'ntest': 10662},
 'CR': {'devacc': 85.11, 'acc': 86.54, 'ndev': 3775, 'ntest': 3775},
 'MPQA': {'devacc': 90.49, 'acc': 90.48, 'ndev': 10606, 'ntest': 10606},
 'SUBJ': {'devacc': 93.56, 'acc': 92.96, 'ndev': 10000, 'ntest': 10000},
 'SST2': {'devacc': 85.32, 'acc': 81.6, 'ndev': 872, 'ntest': 1821},
 'SST5': {'devacc': 45.14, 'acc': 44.16, 'ndev': 1101, 'ntest': 2210},
 'TREC': {'devacc': 87.93, 'acc': 92.2, 'ndev': 5452, 'ntest': 500},
 'MRPC': {'devacc': 77.48, 'acc': 77.86, 'f1': 83.56, 'ndev': 4076, 'ntest': 1725},
 'SICKEntailment': {'devacc': 84.8, 'acc': 87.21, 'ndev': 500, 'ntest': 4927},
 'SICKRelatedness': {'devpearson': 0.8888073586069731, 'pearson': 0.8871502442441512, 'spearman': 0.83463204343083, 'mse': 0.22029426612841826, 'yhat': array([3.22583999, 4.16475753, 1.29015625, ..., 2.99803383, 4.42068648, 4.87265243]), 'ndev': 500, 'ntest': 4927},
 'STSBenchmark': {'devpearson': 0.8086089078977258, 'pearson': 0.7825342758470275, 'spearman': 0.7858373058266386, 'mse': 1.1126025740886758, 'yhat': array([2.35551956, 2.33857733, 1.54570818, ..., 4.20836965, 4.17153144, 3.38133943]), 'ndev': 1500, 'ntest': 1379},
 'Length': {'devacc': 93.08, 'acc': 93.45, 'ndev': 9996, 'ntest': 9996},
 'WordContent': {'devacc': 93.96, 'acc': 93.95, 'ndev': 10000, 'ntest': 10000},
 'Depth': {'devacc': 40.69, 'acc': 40.85, 'ndev': 10000, 'ntest': 10000},
 'TopConstituents': {'devacc': 85.45, 'acc': 85.43, 'ndev': 10000, 'ntest': 10000},
 'BigramShift': {'devacc': 74.37, 'acc': 74.63, 'ndev': 10000, 'ntest': 10000},
 'Tense': {'devacc': 90.78, 'acc': 89.99, 'ndev': 10000, 'ntest': 10000},
 'SubjNumber': {'devacc': 91.32, 'acc': 89.65, 'ndev': 10000, 'ntest': 10000},
 'ObjNumber': {'devacc': 89.64, 'acc': 89.81, 'ndev': 10000, 'ntest': 10000},
 'OddManOut': {'devacc': 52.28, 'acc': 51.98, 'ndev': 10000, 'ntest': 10000},
 'CoordinationInversion': {'devacc': 68.9, 'acc': 67.96, 'ndev': 10002, 'ntest': 10002}
}
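A result dictionary of this shape is easier to read once reduced to one number per task. The sketch below is one possible way to do that, assuming the key layout visible in the output above: 'acc' for classification and probing tasks, 'all' with a weighted-mean Pearson for STS12 to STS16, and a plain 'pearson' for SICKRelatedness and STSBenchmark. The function name headline_metric is hypothetical, and the sample contains only three entries copied from the results above.

```python
def headline_metric(task_result):
    """Pick one representative number from a single task's result dict."""
    if 'acc' in task_result:
        # Classification and probing tasks report test accuracy.
        return task_result['acc']
    if 'all' in task_result:
        # STS12-16 aggregate their subsets under the 'all' key.
        return task_result['all']['pearson']['wmean']
    # SICKRelatedness and STSBenchmark report a single Pearson value.
    return task_result['pearson']


# Three entries copied from the output above, trimmed to the keys used here.
results = {
    'STS12': {'all': {'pearson': {'mean': 0.607687957660031,
                                  'wmean': 0.6212250301554153}}},
    'MR': {'devacc': 80.18, 'acc': 80.9, 'ndev': 10662, 'ntest': 10662},
    'STSBenchmark': {'devpearson': 0.8086089078977258,
                     'pearson': 0.7825342758470275},
}

summary = {task: round(headline_metric(res), 4)
           for task, res in results.items()}
for task, value in summary.items():
    print('%-14s %8.4f' % (task, value))
```

Note that accuracies are on a 0 to 100 scale while correlations are on a 0 to 1 scale, so the two kinds of numbers should not be averaged together.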

Originally published on 2019-02-21 on the author's personal site/blog.
