前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >torch.distributed.init_process_group()

torch.distributed.init_process_group()

作者头像
狼啸风云
修改2022-09-02 22:27:21
6.6K0
修改2022-09-02 22:27:21
举报

torch.distributed.init_process_group(backend, init_method=None, timeout=datetime.timedelta(0, 1800), world_size=-1, rank=-1, store=None, group_name='')[source]

Initializes the default distributed process group, and this will also initialize the distributed package.

There are 2 main ways to initialize a process group:

  1. Specify store, rank, and world_size explicitly.
  2. Specify init_method (a URL string) which indicates where/how to discover peers. Optionally specify rank and world_size, or encode all required parameters in the URL and omit them.

If neither is specified, init_method is assumed to be “env://”.

Parameters:

  • backend (str or Backend) – The backend to use. Depending on build-time configurations, valid values include mpi, gloo, and nccl. This field should be given as a lowercase string (e.g., "gloo"), which can also be accessed via Backend attributes (e.g., Backend.GLOO). If using multiple processes per machine with nccl backend, each process must have exclusive access to every GPU it uses, as sharing GPUs between processes can result in deadlocks.
  • init_method (str, optional) – URL specifying how to initialize the process group. Default is “env://” if no init_method or store is specified. Mutually exclusive with store.
  • world_size (int, optional) – Number of processes participating in the job. Required if store is specified.
  • rank (int, optional) – Rank of the current process. Required if store is specified.
  • store (Store, optional) – Key/value store accessible to all workers, used to exchange connection/address information. Mutually exclusive with init_method.
  • timeout (timedelta, optional) – Timeout for operations executed against the process group. Default value equals 30 minutes. This is applicable for the gloo backend. For nccl, this is applicable only if the environment variable NCCL_BLOCKING_WAIT is set to 1.
  • group_name (str, optional, deprecated) – Group name.

To enable backend == Backend.MPI, PyTorch needs to built from source on a system that supports MPI.

本文参与 腾讯云自媒体分享计划,分享自作者个人站点/博客。
原始发表:2020-05-07 ,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 作者个人站点/博客 前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体分享计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档