GitHub Project Recommendation | ParlAI, a Multi-Task Intelligent Dialogue Platform

机器学习之禅
Published 2022-07-11 15:17:51

To make AI better at conversation, Facebook has open-sourced another project. Let's start with ParlAI's three main features:

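  • A large collection of public datasets, covering everything from open-domain chitchat to specialized visual question answering;
  • A wide range of reference models;
  • Seamless integration with Amazon Mechanical Turk for data collection, training, and human evaluation.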

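The project currently provides more than 100 popular datasets, all accessible through the same API. These include PersonaChat, DailyDialog, Wizard of Wikipedia, Empathetic Dialogues, SQuAD, MS MARCO, QuAC, HotpotQA, QACNN and QADailyMail, CBT, BookTest, the bAbI Dialogue tasks, Ubuntu Dialogue, OpenSubtitles, Image Chat, VQA, VisDial, CLEVR, and more.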

For background on ParlAI, see the accompanying paper:

"ParlAI: A Dialog Research Software Platform", arXiv:1705.06476.

Installing ParlAI

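ParlAI currently requires Python 3.7+ and PyTorch 1.6 or higher. The dependencies of the core modules are listed in requirements.txt, and some of the bundled models (in parlai/agents) have additional requirements. Installing ParlAI inside a venv or conda environment is strongly recommended.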
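
Before installing, it can help to confirm that the environment meets the stated minimums. The following is a minimal sketch, not part of ParlAI: the helper names `parse_version`, `meets_minimum`, and `check_environment` are made up for illustration, and the only facts taken from the article are the Python 3.7+ and PyTorch 1.6+ requirements.

```python
import importlib.util
import re
import sys


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version_text, minimum):
    """True if version_text is at least `minimum`, e.g. (1, 6)."""
    return parse_version(version_text) >= minimum


def check_environment():
    """Collect human-readable problems; an empty list means every check passed."""
    problems = []
    if sys.version_info[:2] < (3, 7):
        problems.append("Python 3.7+ is required")
    if importlib.util.find_spec("torch") is None:
        problems.append("PyTorch is not installed")
    else:
        import torch  # imported lazily so the check also works without PyTorch

        if not meets_minimum(torch.__version__, (1, 6)):
            problems.append(f"PyTorch 1.6+ is required, found {torch.__version__}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Local build suffixes such as `+cu117` are ignored by the parser, so a string like `1.13.1+cu117` is compared as `(1, 13, 1)`.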

First, create a working directory:

mkdir parlAI
cd parlAI

Next, download and install ParlAI:

git clone https://github.com/facebookresearch/ParlAI.git
cd ParlAI; python setup.py develop

You should then see output like the following:

Cloning into 'ParlAI'...
remote: Enumerating objects: 47, done.
remote: Counting objects: 100% (47/47), done.
remote: Compressing objects: 100% (44/44), done.
remote: Total 24850 (delta 18), reused 10 (delta 3), pack-reused 24803
Receiving objects: 100% (24850/24850), 28.80 MiB | 1.36 MiB/s, done.
Resolving deltas: 100% (17462/17462), done.

As the output shows, the download finishes quickly.

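Then, from inside the ParlAI directory, run the test script. It displays 10 sample examples from task 1 of the bAbI tasks, using the variant with 1k training samples: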

python examples/display_data.py -t babi:task1k:1
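
The value passed to -t packs several pieces of information into one colon-separated string: here the dataset (babi), the 1k-training-sample variant (task1k), and the task number (1). As a rough illustration of that structure, the sketch below splits such a string into its fields. `split_task_spec` is a hypothetical helper written for this article, not ParlAI's own parser, and treating a comma as the separator between several tasks is an assumption about ParlAI's multi-task syntax that should be checked against the installed version.

```python
def split_task_spec(spec):
    """Split a ParlAI-style task string into one list of fields per task.

    Assumption: tasks are separated by commas, and fields within a task by colons.
    """
    return [task.split(":") for task in spec.split(",") if task]


print(split_task_spec("babi:task1k:1"))        # [['babi', 'task1k', '1']]
print(split_task_spec("babi:task1k:1,squad"))  # [['babi', 'task1k', '1'], ['squad']]
```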

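You may need to install a number of additional dependencies along the way. They are not covered one by one here, so resolve them as they come up.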

Here is what a successful run looks like:

[ optional arguments: ] 
[  display_ignore_fields: agent_reply ]
[  max_display_len: 1000 ]
[  num_examples: 10 ]
[ Main ParlAI Arguments: ] 
[  batchsize: 1 ]
[  datapath: /home/xxx/parlAI/ParlAI/data ]
[  datatype: train:stream ]
[  download_path: /home/xxx/parlAI/ParlAI/downloads ]
[  hide_labels: False ]
[  image_mode: raw ]
[  init_opt: None ]
[  multitask_weights: [1] ]
[  numthreads: 1 ]
[  show_advanced_args: False ]
[  task: babi:task1k:1 ]
[ ParlAI Model Arguments: ] 
[  dict_class: None ]
[  init_model: None ]
[  model: None ]
[  model_file: None ]
[ PytorchData Arguments: ] 
[  batch_length_range: 5 ]
[  batch_sort_cache_type: pop ]
[  batch_sort_field: text ]
[  numworkers: 4 ]
[  pytorch_context_length: -1 ]
[  pytorch_datapath: None ]
[  pytorch_include_labels: True ]
[  pytorch_preprocess: False ]
[  pytorch_teacher_batch_sort: False ]
[  pytorch_teacher_dataset: None ]
[  pytorch_teacher_task: None ]
[  shuffle: False ]
[ ParlAI Image Preprocessing Arguments: ] 
[  image_cropsize: 224 ]
[  image_size: 256 ]
[ Current ParlAI commit: e8d0a75d291c7bb4b4e5565d60a899f794c10963 ]
[creating task(s): babi:task1k:1]
[building data: /home/xxx/parlAI/ParlAI/data/bAbI]
[ downloading: http://parl.ai/downloads/babi/babi.tar.gz to /home/xxx/parlAI/ParlAI/data/bAbI/babi.tar.gz ]
Downloading babi.tar.gz: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████| 19.2M/19.2M [00:05<00:00, 3.29MB/s]
unpacking babi.tar.gz
[loading fbdialog data:/home/xxx/parlAI/ParlAI/data/bAbI/tasks_1-20_v1-2/en-valid-nosf/qa1_train.txt]
[loading fbdialog data:/home/xxx/parlAI/ParlAI/data/bAbI/tasks_1-20_v1-2/en-valid-nosf/qa1_train.txt]
[babi:task1k:1]: Mary moved to the bathroom.
John went to the hallway.
Where is Mary?
[labels: bathroom]
[label_candidates: office|hallway|kitchen|bathroom|bedroom|...and 1 more]
~~
[babi:task1k:1]: Daniel went back to the hallway.
Sandra moved to the garden.
Where is Daniel?
[labels: hallway]
[label_candidates: office|hallway|kitchen|bathroom|bedroom|...and 1 more]
~~
[babi:task1k:1]: John moved to the office.
Sandra journeyed to the bathroom.
Where is Daniel?
[labels: hallway]
[label_candidates: office|hallway|kitchen|bathroom|bedroom|...and 1 more]
~~
[babi:task1k:1]: Mary moved to the hallway.
Daniel travelled to the office.
Where is Daniel?
[labels: office]
[label_candidates: office|hallway|kitchen|bathroom|bedroom|...and 1 more]
~~
[babi:task1k:1]: John went back to the garden.
John moved to the bedroom.
Where is Sandra?
[labels: bathroom]
[label_candidates: office|hallway|kitchen|bathroom|bedroom|...and 1 more]
- - - - - - - - - - - - - - - - - - - - -
~~
[babi:task1k:1]: Sandra travelled to the office.
Sandra went to the bathroom.
Where is Sandra?
[labels: bathroom]
[label_candidates: office|hallway|kitchen|bathroom|bedroom|...and 1 more]
~~
[babi:task1k:1]: Mary went to the bedroom.
Daniel moved to the hallway.
Where is Sandra?
[labels: bathroom]
[label_candidates: office|hallway|kitchen|bathroom|bedroom|...and 1 more]
~~
[babi:task1k:1]: John went to the garden.
John travelled to the office.
Where is Sandra?
[labels: bathroom]
[label_candidates: office|hallway|kitchen|bathroom|bedroom|...and 1 more]
~~
[babi:task1k:1]: Daniel journeyed to the bedroom.
Daniel travelled to the hallway.
Where is John?
[labels: office]
[label_candidates: office|hallway|kitchen|bathroom|bedroom|...and 1 more]
~~
[babi:task1k:1]: John went to the bedroom.
John travelled to the office.
Where is Daniel?
[labels: hallway]
[label_candidates: office|hallway|kitchen|bathroom|bedroom|...and 1 more]
- - - - - - - - - - - - - - - - - - - - -
~~
[ loaded 180 episodes with a total of 900 examples ]
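
The closing line reports 180 episodes and 900 examples, i.e. five questions per episode, which matches the dashed separator that appears after every fifth example above. To work with this printed output programmatically, a small parser along the following lines could be used. This is a sketch based only on the layout visible in the log: `parse_display_output` is a hypothetical helper, and reading `~~` as the end of an example, the dashed line as the end of an episode, and `|` as the separator between multiple labels are all assumptions inferred from the output rather than documented behavior.

```python
SAMPLE = """\
[babi:task1k:1]: Mary moved to the hallway.
Daniel travelled to the office.
Where is Daniel?
[labels: office]
[label_candidates: office|hallway|kitchen|bathroom|bedroom|...and 1 more]
~~
[babi:task1k:1]: John went back to the garden.
John moved to the bedroom.
Where is Sandra?
[labels: bathroom]
[label_candidates: office|hallway|kitchen|bathroom|bedroom|...and 1 more]
- - - - - - - - - - - - - - - - - - - - -
~~
"""


def parse_display_output(lines, task="babi:task1k:1"):
    """Turn printed examples into dicts with text lines, labels, and an episode flag."""
    prefix = f"[{task}]: "
    label_prefix = "[labels: "
    examples, current = [], None
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(prefix):
            # A new example starts with the task name in square brackets.
            current = {"text": [line[len(prefix):]], "labels": [], "episode_done": False}
        elif current is None or not line:
            continue  # argument dump, download log, or blank line
        elif line.startswith(label_prefix) and line.endswith("]"):
            current["labels"] = line[len(label_prefix):-1].split("|")
        elif line.startswith("[label_candidates: "):
            continue  # the candidate list is truncated in the log, so skip it
        elif set(line) <= {"-", " "}:
            current["episode_done"] = True  # dashed rule closes the episode
        elif line == "~~":
            examples.append(current)
            current = None
        else:
            current["text"].append(line)
    return examples


if __name__ == "__main__":
    for example in parse_display_output(SAMPLE.splitlines()):
        print(example["text"][-1], "->", example["labels"], example["episode_done"])
```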

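This test is in English, of course. You can follow the same approach to put together a Chinese version, which is sufficient for typical small tasks.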
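
One way to start on a Chinese version is to write a handful of bAbI-style episodes into a plain text file. The sketch below is illustrative only: the sample sentences, `EPISODES`, `to_parlai_line`, and `build_lines` are all invented for this example. The line layout (tab-separated `text:`, `labels:`, and `episode_done:True` fields) follows what ParlAI's documentation describes as its dialog text format, as far as I understand it; check the exact format and how to load such a file against the version you have installed.

```python
# Hypothetical mini-dataset: each episode is a list of (context and question, answer) turns.
EPISODES = [
    [
        ("小明去了厨房。小红去了花园。小明在哪里?", "厨房"),
        ("小红回到了办公室。小红在哪里?", "办公室"),
    ],
    [
        ("小李去了卧室。小李在哪里?", "卧室"),
    ],
]


def to_parlai_line(text, label, episode_done):
    """Format one turn as tab-separated key:value fields."""
    fields = [f"text:{text}", f"labels:{label}"]
    if episode_done:
        fields.append("episode_done:True")  # marks the last turn of an episode
    return "\t".join(fields)


def build_lines(episodes):
    """Flatten episodes into one formatted line per turn."""
    lines = []
    for episode in episodes:
        for index, (text, label) in enumerate(episode):
            is_last = index == len(episode) - 1
            lines.append(to_parlai_line(text, label, is_last))
    return lines


if __name__ == "__main__":
    # Redirect this output to a file to obtain a small training set.
    print("\n".join(build_lines(EPISODES)))
```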

Finally, here is the GitHub project: https://github.com/facebookresearch/ParlAI. If you are interested, go take a look.

Originally published on 2021-06-21 on the 机器学习之禅 WeChat official account.
