PyTorch TPU: using TPUs with PyTorch, and running a TPU-saved PyTorch model on a CPU - Tencent Cloud Developer Community

I followed this to start my PyTorch Lightning project on a Google Colab TPU. So I installed !pip install cloud-tpu-client==0.10 https://storage.googleapis.com/tpu-pytorch/wheels/torch_xla-1.9-cp37-cp37m-linux_x86_64.whl then !pip install pytorch-lightning then !pip install torch torchvision torchaudio !pip install -r requirements.txt While installing

Views: 9, Asked: 2021-11-27, Votes: 1

3 answers

Colab PyTorch ImportError:

On Google Colab, I have tried all three runtimes: CPU, GPU, and TPU. All give the same error. The cell: # NB: Only run in TPU environment !pip install cloud-tpu-client==0.10 https://storage.googleapis.com/tpu-pytorch/wheels/torch_xla-1.8-cp37-cp37m-linux_x86_64.whl !pip -q install pytorch-lightning==1.2.7 transformers torchmetrics awscli mlflow boto3 pyc

Views: 8, Asked: 2021-08-19, Votes: 0

Answer accepted

2 answers

How to use torchaudio with torch_xla on Google Colab

I am trying to run a PyTorch script that uses torchaudio on Google Colab. To do so, I use this code cell to load XLA: !pip install torchaudio import os assert os.environ['COLAB_TPU_ADDR'], 'Make sure to select TPU from Edit > Notebook settings > Hardware accelerator' VERSION = "20200220" #@param ["20200220"

Views: 1, Asked: 2020-03-17, Votes: 2

Answer accepted

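3 answers

Using TPUs with PyTorch

I am trying to use Google Cloud's TPUs from Colab. I completed this tutorial using TensorFlow. Does anyone know whether TPUs can be used with PyTorch? If so, how do I do it? Do you have any examples?

Views: 0, Asked: 2018-10-04, Votes: 14

Answer accepted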

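1 answer

Can a PyTorch-trained model be transferred between GPU and TPU?

After training a PyTorch model on a GPU, can I use the saved weights to continue training the model on a TPU?

Views: 6, Asked: 2021-09-25, Votes: 0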

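1 answer

TPU and VM instances: usage

I have just started learning to use Google Cloud and am confused about TPU instances (or TPU resources/TPUs) and VM instances. I followed the instructions and created a TPU, where I cloned my GitHub repository, created a conda environment, and installed the additional packages needed for training. Just when I thought my setup was ready, I saw various tutorials discussing how to create a VM instance and link the created TPU instance to it. But I cannot find more details about this in the Google Cloud documentation. It would be great if someone could explain how we should use TPU and VM instances, together or separately. What is the connection between the two (from a workflow perspective)? Background, if needed: I will run PyTorch code on the TPU using XLA. Many thanks!

Views: 8, Asked: 2022-08-01, Votes: 0

Answer accepted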

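1 answer

How would you set up a TensorFlow cluster of multiple TPU v2-8 (TPUv2) devices?

I have two TPUs (v2-8) running on GCE with software version tpu-vm-tf-2.8.0. I want to run distributed deep learning with TensorFlow using both VMs, that is, 2 x 8 = 16 cores in total. For distributed training on 8 cores, I set up the strategy as: resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='local') tf.config.experimental_connect_to_cluster(resolver) tf.tpu.experimental.initialize_tpu_system(resolve

Views: 14, Asked: 2022-05-17, Votes: 1

Answer accepted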

1 answer

Error importing PyTorch XLA on Google Colab

I am trying to run some code on a Google Colab TPU. I use the following line to install PyTorch XLA: !pip install cloud-tpu-client==0.10 https://storage.googleapis.com/tpu-pytorch/wheels/torch_xla-1.9-cp37-cp37m-linux_x86_64.whl When I try to import torch_xla, I get the error ImportError: /usr/local/lib/python3.7/dist-packages/_XLAC.cpython-37m-x86_64-linux-gnu.so: undefined

Views: 3, Asked: 2021-09-10, Votes: 0

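1 answer

Google Cloud TPU: gcloud compute tpus create fails with permission denied

I am trying to follow the official Google Cloud TPU tutorial for training a model. This is the tutorial: In the "Launch the Cloud TPU resource" step, I run the following :~$ gcloud compute tpus create train-bert-one \ > --zone=europe-west4-a \ > --network=default \ > --version=pytorch-1.6 \ > --accelerator-type=v3-8 exactly as in the tutorial; I only adjusted the zone. The command fails with the following error ERROR: (gcloud.compute.tpus.create) PERMISSION_DENIE

Views: 0, Asked: 2020-10-06, Votes: 1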

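1 answer

Why is the TPU on Google Colab not detected in PyTorch?

I am using Google Colab with PyTorch. I set the hardware accelerator to TPU. This line of code shows that no CUDA device is detected: device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu') print(device)

Views: 10, Asked: 2022-11-03, Votes: 0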
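
The check in this question only looks for CUDA, so it cannot report a TPU. A minimal sketch of a broader check is given below. It assumes that the presence of the torch_xla package is a reasonable signal for a TPU runtime, which is only a heuristic: an installed package does not prove that a TPU is attached. The helper name pick_backend is hypothetical.

```python
import importlib.util


def pick_backend() -> str:
    """Return "xla", "cuda", or "cpu" based on what this environment offers.

    Heuristic sketch: torch.cuda.is_available() only reports NVIDIA GPUs,
    so a TPU runtime is looked for separately via the torch_xla package.
    """
    # Assumption: torch_xla being importable hints at a TPU runtime.
    if importlib.util.find_spec("torch_xla") is not None:
        return "xla"
    # Only import torch if it is actually installed.
    if importlib.util.find_spec("torch") is not None:
        import torch

        if torch.cuda.is_available():
            return "cuda"
    return "cpu"


print(pick_backend())
```

When the result is "xla", the device itself would be obtained through torch_xla.core.xla_model.xla_device(), as in the PyTorch/XLA snippets quoted elsewhere on this page, rather than through torch.device('cuda:0').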

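4 answers

Setting up MLflow on Google Colab

I often use Google Colab to train TF/PyTorch models, because Colab provides me with GPU/TPU runtimes. I also like to use MLflow to store and compare trained models, track progress, share results, and so on. What solutions are available for using MLflow in Google Colab?

Views: 12, Asked: 2020-05-05, Votes: 3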

1 answer

Running a PyTorch model saved from a TPU on a CPU

I found an interesting model, a question generator, but cannot run it. I get an error: Traceback (most recent call last): File "qg.py", line 5, in <module> model = AutoModelWithLMHead.from_pretrained("/home/user/ml-experiments/gamesgen/t5-base-finetuned-question-generation-ap/") File "/home/user/.virtualenvs/hugging/li

Views: 24, Asked: 2020-08-15, Votes: 1

1 answer

Unable to create a PyTorch CPU image on GCP

I am trying to create a preemptible TPU instance on GCP with the following command gcloud compute instances create tpu-1-vm \ --image-project=deep-learning-platform-release \ --image-family=pytorch-latest-cpu\ --preemptible \ --zone=us-central1-f The command fails with the following error: ERROR: (gcloud.compute.instances.create) Could not fetch resource: - Required '

Views: 12, Asked: 2019-07-27, Votes: 0

Answer accepted

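1 answer

RPC failed with status = "Unavailable: Socket closed" when training FairSeq RoBERTa on Cloud TPU with PyTorch

Following the tutorial "Pre-training FairSeq RoBERTa on Cloud TPU using Pytorch", I set up a preemptible (v2-8) TPU environment and trained my RoBERTa model. As the documentation instructs, the PyTorch environment is based on torch-xla-1.6. However, it does not output any training logs as it normally does on a GPU, and it threw the RPC failure warning twice within 2-3 days (12 hours apart) (see below; the network endpoint is removed here). My training runs for 161,529 steps. According to the documentation, with my configuration a v2-8 should take 80 hours for 5 epochs. However, my job seems to be hanging. Any suggestions? W 4566 tensorflow/cor

Views: 125, Asked: 2020-09-09, Votes: 1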

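2 answers

Cannot import name '_TPU_AVAILABLE' from 'pytorch_lightning.utilities'

I am trying to import the aitextgen package to work on a GPT-2 solution. But I get an error: ImportError: cannot import name '_TPU_AVAILABLE' from 'pytorch_lightning.utilities' (/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/__init__.py) I tried downgrading PyTorch to 1.11.0, but that did not help either. Please help!

Views: 24, Asked: 2022-11-04, Votes: 0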

1 answer

Colab session crashes when importing torch_geometric.data

My Colab session always crashes when I try to import the torch_geometric.data module. For reference, the code I am writing is as follows: import torch def format_pytorch_version(version): return version.split('+')[0] TORCH_version = torch.__version__ TORCH = format_pytorch_version(TORCH_version) def format_cuda_version(version): return 'cu' + version.r

Views: 17, Asked: 2022-04-29, Votes: 0

Answer accepted

1 answer

How to use torchaudio and torch_xla on a Google Colab TPU?

I use Google Colab (with GPU enabled) to train my automatic speech recognition model, which is based on PyTorch and torchaudio. But when I try to use a Google Colab TPU, I get the following error while training my model: ImportError: /usr/local/lib/python3.6/dist-packages/_torch_sox.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe28TypeMeta21_typeMetaDataInstanceISt7complexIfEEEPKNS_6detail12TypeMetaDa

Views: 226, Asked: 2020-07-09, Votes: 1

Answer accepted

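1 answer

Does running HuggingFace Transformers with different sequence lengths on a TPU cause XLA to recompile every time?

Does running HuggingFace Transformers with different sequence lengths on a TPU produce a new computation graph every time, and therefore an XLA recompilation every time? Also, during training, does this mean that all batches should be padded to the overall maximum length in the whole dataset? If I use it, will it do this for me automatically? Or can PyTorch/XLA's ParallelLoader do this?

Views: 7, Asked: 2020-05-18, Votes: 0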
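
One commonly suggested middle ground between padding every batch to the global maximum and allowing arbitrary lengths is to pad each batch up to the nearest of a few fixed "bucket" lengths, so that only a small number of distinct input shapes ever reach the compiler. The sketch below is plain Python and only illustrates the idea; the bucket sizes, the pad id of 0, and the helper names are assumptions, not part of any library API.

```python
from bisect import bisect_left

# Hypothetical bucket sizes; in practice they would be tuned to the data.
BUCKETS = (32, 64, 128, 256, 512)


def bucket_length(n, buckets=BUCKETS):
    """Return the smallest bucket size that is >= n."""
    i = bisect_left(buckets, n)
    if i == len(buckets):
        raise ValueError(f"sequence length {n} exceeds largest bucket {buckets[-1]}")
    return buckets[i]


def pad_batch(batch, pad_id=0, buckets=BUCKETS):
    """Pad every sequence in the batch to a shared bucketed length."""
    target = bucket_length(max(len(seq) for seq in batch), buckets)
    return [list(seq) + [pad_id] * (target - len(seq)) for seq in batch]


padded = pad_batch([[5, 6, 7], [8, 9]])
print([len(row) for row in padded])
```

With five buckets, at most five distinct sequence lengths are produced, at the cost of some wasted padding inside each bucket.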

1 answer

OSError: libmkl_intel_lp64.so.1: cannot open shared object file: No such file or directory

I am trying to run a model on the TPU provided in a Colab notebook. The model used to work fine, but today I cannot run it. I install pytorch-xla with the following code. VERSION = "nightly" #@param ["1.5" , "20200325", "nightly"] !curl https://raw.githubusercontent.com/pytorch/xla/master/contrib/scripts/env-setup.py -o pytorch-xla-env-setup.py !python pytor

Views: 588, Asked: 2021-04-26, Votes: 2

Answer accepted

2 answers

How to use a TPU in PyTorch?

I am trying to use a TPU with pytorch_xla, but it shows an import error in _XLAC. !curl https://raw.githubusercontent.com/pytorch/xla/master/contrib/scripts/env-setup.py -o pytorch-xla-env-setup.py !python pytorch-xla-env-setup.py --version $VERSION import torch_xla import torch_xla.core.xla_model as xm ImportError

Views: 19, Asked: 2020-05-17, Votes: 6

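1 answer

How to train a PyTorch model on a TPU with a custom dataset?

Since I am a student, I only have the initial $300 credit, so I need to keep the trial-and-error phase as short as possible. I have a PyTorch-based model that currently runs on a local GPU, with a frame dataset of about 100 (unit unclear) in my local storage. I am looking for a guide that shows how to set up a machine to train and test my model on a TPU with the dataset in my Google (?) (or any other recommended cloud storage). The guides I have found do not match my description: most of them either run on a GPU, or run on a TPU with a dataset that is included in a dataset library, and I do not want to waste time and budget trying to piece the puzzle together from these fragments.

Views: 8, Asked: 2022-02-21, Votes: 1

Answer accepted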

1 answer

PyTorch / loss.backward() -> Missing XLA configuration

The loss is computed from a target model created with PyTorch (not TensorFlow). When backpropagating, I run the code below and get the following error message. loss.backward() (Forward propagation can be computed without problems.) terminate called after throwing an instance of 'std::runtime_error' what(): tensorflow/compiler/xla/xla_client/computation_client.cc:280 : Missing XLA configuration Aborted -pytorch(1.12.0+cu102)

Views: 36, Asked: 2022-10-12, Votes: 1

1 answer

Why does Google Colab give an "unknown device" error when I try to convert a Keras model?

I am trying to train a simple MLP model on Google Colab using a TPU. However, when I try to convert the model to from tensorflow.keras.models import Sequential from tensorflow.keras.layers import Dense from keras.constraints import NonNeg model = Sequential() model.add(Dense(57,input_shape=(57,))) model.add(Dense(60,kernel_constraint=NonNeg(),activation="relu")

Views: 0, Asked: 2019-03-19, Votes: 2

Answer accepted

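2 answers

Missing XLA configuration when running pytorch/xla

I am trying to use PyTorch/XLA with a GCP TPU. I use a VM with the debian-9-torch-xla-v20200818 image. I start the TPU and check that it is running with ctpu status, which shows that both the CPU and the TPU are running. I then activate the torch-xla-nightly environment, but when I try to call the following simple code: import torch import torch_xla import torch_xla.core.xla_model as xm dev = xm.xla_device() t1 = torch.ones(3, 3, device = dev) print(t1) the following error appears: Trace

Views: 88, Asked: 2020-08-19, Votes: 7

Answer accepted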

1 answer

TPU with TensorFlow v1

Could anyone give me code to run a TPU with TensorFlow v1? I am trying this code, but it only works with TensorFlow 2.0: try: # TPU detection. No parameters necessary if TPU_NAME environment variable is # set: this is always the case on Kaggle. tpu = tf.distribute.cluster_resolver.TPUClusterResolver() print('Running on TPU ', tpu.m

Views: 14, Asked: 2020-07-03, Votes: 2

1 answer

Google Colab TPU version

How do I print the TPU version I am using in Google Colab, and how much memory the TPU has? I get the following output with tpu = tf.distribute.cluster_resolver.TPUClusterResolver() tf.config.experimental_connect_to_cluster(tpu) tf.tpu.experimental.initialize_tpu_system(tpu) tpu_strategy = tf.distribute.experimental.TPUStrategy(tpu) Output INFO:tensorflow:Initializing the TPU syst

Views: 5, Asked: 2020-11-06, Votes: 2

Answer accepted

2 answers

ModuleNotFoundError: No module named 'tensorflow.compiler'

~\AppData\Roaming\Python\Python36\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py in <module>() 38 from tensorflow.contrib.tpu.python.tpu import tpu_config 39 from tensorflow.contrib.tpu.python.tpu import tpu_context ---> 40 from tensorflow.contrib.tpu.python.tpu

Views: 2, Asked: 2018-08-18, Votes: 1

1 answer

Cannot connect to a TPU via ssh on GCP

I am following a tutorial. I created a TPU instance and tried to connect to it with the gcloud compute ssh command. Then this error appeared. AppData\Local\Google\Cloud SDK>gcloud compute ssh node-1 --zone=asia-east1-c PythonERROR: (gcloud.compute.ssh) Could not fetch resource: - The resource 'projects/project-masker/zones/asia-east1-c/instances/node-1' was not f

Views: 8, Asked: 2021-07-03, Votes: 0

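2 answers

Running the Cloud TPU profiler in a Google Colab environment

I am running a Google Colab notebook and trying to capture TPU profiling data for use in TensorBoard, but I cannot get capture_tpu_profile to run in the background while the code is running. So far I have tried running the capture process in the background with: !capture_tpu_profile --logdir=gs://<my_logdir> --tpu=$COLAB_TPU_ADDR & and !bg capture_tpu_profile --logdir=gs://<my_logdir> --tpu=$COLAB_TPU_ADDR

Views: 0, Asked: 2018-11-11, Votes: 1

Answer accepted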

2 answers

Using TPU v3 in Google Colab Pro

Is there a way to use TPU v3 instead of TPU v2 in Google Colab Pro? Unfortunately, with TPU v2 I get the error message Compilation failure: Ran out of memory in memory space hbm. Used 8.29G of 7.48G hbm. Exceeded hbm capacity by 825.60M. which I would no longer get with TPU v3, because TPU v3 has more memory. Does anyone know of a possibility or option? This is how I start the TPU try: tpu = tf.distribute.cluster_resolver.TPUClusterRes

Views: 4, Asked: 2020-11-08, Votes: 2

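1 answer

Does a `DataLoader` created from a `ConcatDataset` create batches from different files, or from a single file?

I am working with multiple files, with multiple training samples in each file. I will use ConcatDataset as described here: In addition to my true samples, I need negative samples, and I need to pick my negative samples at random from all of the training data files. So I would like to know whether the returned batch samples are a random contiguous chunk from a single file, or whether a batch spans multiple random indices across all data files. If more detail is needed about what exactly I am trying to do: I am trying to train with PyTorch XLA on a TPU. Normally, for negative samples, I would just use a second Dataset and DataLoader; however, I am trying to train on TPUs with PyTorch XLA (the alpha was released a few days ago), and to do that I need to pass my D

Views: 2, Asked: 2019-10-13, Votes: 2

Answer accepted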
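
To make the question above concrete: as far as I know, torch.utils.data.ConcatDataset keeps cumulative dataset sizes and maps a global index to a (dataset, local index) pair with a bisect lookup. The pure-Python sketch below imitates that mapping without importing torch. The locate helper and the example sizes are hypothetical, and the sketch is not the library's actual code.

```python
import bisect
import itertools
import random


def locate(global_index, sizes):
    """Map an index into the concatenation to (file index, local index)."""
    cumulative = list(itertools.accumulate(sizes))
    if not 0 <= global_index < cumulative[-1]:
        raise IndexError(global_index)
    file_idx = bisect.bisect_right(cumulative, global_index)
    offset = 0 if file_idx == 0 else cumulative[file_idx - 1]
    return file_idx, global_index - offset


sizes = [3, 5, 2]  # hypothetical number of samples in each file

# A shuffling sampler draws indices from the whole concatenated range,
# so a single batch can mix samples from several files.
indices = list(range(sum(sizes)))
random.Random(0).shuffle(indices)
print([locate(i, sizes) for i in indices[:4]])

# Without shuffling, indices are consecutive, so a batch is a contiguous
# run that only crosses a file boundary where one file ends.
print([locate(i, sizes) for i in range(4)])
```

Under this reading, whether a batch spans files depends on the sampler (shuffled or sequential), not on ConcatDataset itself.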

2 answers

Why is the Google Colab TPU as slow as my PC?

Because I have a large dataset and my PC is not very powerful, I thought it would be a good idea to use a TPU on Google Colab. Here is my TPU configuration: try: tpu = tf.distribute.cluster_resolver.TPUClusterResolver() print('Running on TPU ', tpu.master()) except ValueError: tpu = None if tpu: tf.config.experimental_connect_to_cluster(tpu) tf.tpu.experi

Views: 90, Asked: 2020-12-15, Votes: 1

1 answer

UnimplementedError when using TensorBoard with a TPU

I am currently training my model on a TPU. Unfortunately, I get an UnimplementedError when using TensorBoard with the TPU. If I use only the TPU, everything works. If I use a GPU with TensorBoard, everything works. I use Google Colab. %tensorflow_version 2.x import tensorflow as tf print("Tensorflow version " + tf.__version__) TPU_ACTIVATED = True if TPU_ACTIVATED: try: tpu = tf.distribute.cluster_res

Views: 7, Asked: 2020-11-17, Votes: 0

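1 answer

Jupyter notebook with TPU pod v2-32

I use the code below to train my model on a v3-8 TPU, which is a single TPU device, and it works well, but the same code does not work on a TPU v2-32. As far as I know, a v2-32 is a cluster of TPU devices connected to each other over a dedicated high-speed network, so how do I adapt the code to make it work on a v2-32? tpu = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="tpu-name", zone="us-central1-a", project="myproject") tf.config.experimental_connect_to_cluste

Views: 3, Asked: 2021-01-29, Votes: 1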

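1 answer

File system scheme '[local]' not implemented (file: './logs')

I ran into this problem while training a BERT model with the HuggingFace library on Colab with a TPU runtime. I have set up the TPU correctly and checked that it works. The training arguments for the BERT model are as follows: from transformers import TFTrainer, TFTrainingArguments training_args = TFTrainingArguments( output_dir='./results', # output directory num_train_epochs=5, # total number of tr

Views: 17, Asked: 2021-06-17, Votes: 0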

1 answer

Cloud TPU tools not generating the profile

I have followed the instructions. Everything seems to work as expected, except for step 4, where I had to change --tpu_name to --tpu. What fails is the generation of the "Profile" tab. I ran capture_tpu_profile --tpu_name=$TPU_NAME --logdir=${model_dir} which produced Welcome to the Cloud TPU Profiler v1.6.0 Starting to profile TPU traces for 2000 ms. Remaining attempt(s): 3 Limiting the number of trace events t

Views: 0, Asked: 2018-05-16, Votes: 1

2 answers

Google Colab KeyError: 'COLAB_TPU_ADDR'

I am trying to run a simple MNIST classifier on Google Colab using the TPU option. After creating the model with Keras, I try to convert it to a TPU model with: import tensorflow as tf import os tpu_model = tf.contrib.tpu.keras_to_tpu_model( model, strategy=tf.contrib.tpu.TPUDistributionStrategy( tf.contrib.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.envi

Views: 0, Asked: 2018-11-04, Votes: 2

Answer accepted
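
A KeyError on os.environ['COLAB_TPU_ADDR'] simply means the variable is not set in the current runtime, which is what one would expect when the notebook is not attached to a TPU runtime. The sketch below shows a defensive lookup that returns None instead of raising. The helper name and the example address are hypothetical.

```python
import os


def colab_tpu_grpc_address(environ=os.environ):
    """Return the 'grpc://host:port' TPU address, or None if it is not set."""
    addr = environ.get("COLAB_TPU_ADDR")
    if not addr:
        # No TPU address exported: the caller can fall back to CPU/GPU
        # or report a clear message instead of failing with KeyError.
        return None
    return "grpc://" + addr


# Example with a made-up address, and with an empty environment.
print(colab_tpu_grpc_address({"COLAB_TPU_ADDR": "10.0.0.2:8470"}))
print(colab_tpu_grpc_address({}))
```

Passing the environment as a parameter keeps the function easy to exercise outside Colab; in a notebook it would be called with no arguments.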

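1 answer

What are the requirements for allocating a TPU Pod under the VM architecture?

When allocating a TPU under the TPU architecture, versions such as tpu-vm-tf-2.6.2-pod are available as the TPU software version. When a pod version is selected as the software version, the TPU cannot be found when following the instructions with jax.device_count(). Is selecting a pod version enough to allocate a TPU Pod, or are there other steps or requirements? How do I choose which TPU VM runs under the pod?

Views: 20, Asked: 2022-07-24, Votes: 1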

1 answer

Google Cloud TPU -- TPU not being used

I am trying to run a simple program on a TPU: import tensorflow as tf tpu = tf.distribute.cluster_resolver.TPUClusterResolver() print("Device:", tpu.master()) tf.config.experimental_connect_to_cluster(tpu) tf.tpu.experimental.initialize_tpu_system(tpu) strategy = tf.distribute.experimental.TPUStrategy(tpu) a = tf.

Views: 0, Asked: 2021-01-06, Votes: 1

1 answer

InvalidArgumentError when initializing the TPU

I am actually running into this with the tf-2.3.0 stable build while initializing the TPU in Kaggle with the following code: try: tpu_name = os.getenv('TPU_NAME') tpu = tf.distribute.cluster_resolver.TPUClusterResolver(tpu_name) print("running on tpu: ", tpu.master()) except ValueError: tpu = None if tpu: tf.config.experimental_connect_to_cluster(tpu

Views: 0, Asked: 2020-07-31, Votes: 1

1 answer

capture_tpu_profile failed to connect to all addresses

I am trying to run this command. capture_tpu_profile --tpu=[my-tpu-name] --monitoring_level=2 --tpu_zone=[my-tpu-zone] It produces the following error 2022-08-07 08:42:22.253271: I tensorflow/core/tpu/tpu_initializer_helper.cc:66] libtpu.so already in used by another process. Not attempting to load libtpu.so in this process. WARNING: Lo

Views: 8, Asked: 2022-08-07, Votes: 0

2 answers

How to convert a Keras model to a TPU model

I am trying to convert my Keras model to a TPU model in the Google Cloud console. Unfortunately, I get an error, shown below. Here is my minimal example: import keras from keras.models import Sequential from keras.layers import Dense, Activation import tensorflow as tf import os model = Sequential() model.add(Dense(32, input_dim=784)) model.add(Dense(32)) model.add(Activation('relu

Views: 1, Asked: 2019-02-05, Votes: 2

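1 answer

Is the TPU slower than the GPU?

I just tried using a TPU in Google Colab, and I wanted to see how much faster the TPU is than the GPU. Surprisingly, I got the opposite result. Here is the neural network. random_image = tf.random_normal((100, 100, 100, 3)) result = tf.layers.conv2d(random_image, 32, 7) result = tf.reduce_sum(result) Performance results: CPU: 8s GPU: 0.18s TPU: 0.50s I wonder why... The full code for the TPU is as follows: def calc(): random_image = tf.random_normal((10

Views: 0, Asked: 2018-09-30, Votes: 5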
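
One possible factor in comparisons like this, offered as a general benchmarking remark and not as a diagnosis of this particular question, is that a single timed call can include one-off costs such as graph compilation and device setup. The sketch below times a callable only after a few warm-up calls and averages over several repeats. The helper name and its parameters are hypothetical, and the workload is a stand-in for a real training or inference step.

```python
import time


def time_after_warmup(fn, warmup=1, repeats=5):
    """Average wall-clock seconds per call, excluding the warm-up calls."""
    for _ in range(warmup):
        fn()  # absorbs one-off costs such as compilation or caching
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


calls = []
avg = time_after_warmup(lambda: calls.append(sum(range(1000))), warmup=2, repeats=3)
print(f"{len(calls)} calls in total, {avg:.6f} s per timed call")
```

For an accelerator, fn would also need to block until the result is actually available, otherwise asynchronous dispatch makes the measurement meaningless.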

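1 answer

How to use a TPU v3 on Google Cloud Platform?

I found a tutorial about how to run a Jupyter notebook on Google Cloud Platform. I would also like to use a TPU v3. I also read the documentation, but unfortunately it did not help me much. My questions now are: How do I create a TPU v3 and use it in a Jupyter notebook? How do I read and write data when TPU v3 is selected as the runtime? In Google Colab, I did the following to use the TPU: %tensorflow_version 2.x import tensorflow as tf print("Tensorflow version " + tf.__version__) try: tpu = tf.distribute.cluste

Views: 0, Asked: 2020-11-17, Votes: 1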

1 answer

TPUEstimator error -- AttributeError: module 'tensorflow.contrib.tpu.python.ops.tpu_ops' has no attribute 'cross_replica_sum'

I have written TensorFlow code using TPUEstimator, but I have trouble running it in use_tpu=False mode. I want to run it on my local machine to make sure all the operations are TPU compatible. The code works fine with a normal Estimator. Here is my main code: import logging from tensorflow.contrib.tpu.python.tpu import tpu_config, tpu_estimator, tpu_optimizer from tensorflow.contrib.cluster_resolver import TPUClusterResolver from cap

Views: 0, Asked: 2018-07-17, Votes: 0

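2 answers

Cannot open a file in Google Storage from Google Colab

I am trying to open a file stored in a Google Storage bucket from a Google Colab notebook using the TPU runtime. However, I always run into this error: FileNotFoundError: [Errno 2] No such file or directory: 'gs://vocab_jb/merges.txt' My question is simple: how do I make a Google Storage bucket readable from Google Colab? I have tried everything: making the bucket public with IAM; assigning a special email address to the owner entry; making the files public through the ACL option; following several different guides. Each time I tried to call the bucket through "gs://bucket" or "", but none of the options worked. What is even more

Views: 4, Asked: 2021-01-26, Votes: 0

Answer accepted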

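1 answer

How do I find out more about the Cloud TPU device my program is running on?

Whether we use Google Colab or access Cloud TPU directly, the program below gives only limited information about the underlying TPU: import os import tensorflow as tf tpu_address = 'grpc://' + os.environ['COLAB_TPU_ADDR'] print ('TPU address is', tpu_address) def printTPUDevices(): with tf.Session(tpu_address) as session: devices = session.list_devi

Views: 0, Asked: 2018-11-13, Votes: 3

Answer accepted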

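1 answer

How to train TensorFlow with a Cloud TPU from a local machine?

I am trying to do this tutorial, but using a TPU that I provision myself. However, when I try to switch to my own TPU, I cannot resolve the TPU cluster; that is, when I run the cell, I get a timeout: tpu_addr = f"{MY_TPU_IP}:8470" # os.environ['COLAB_TPU_ADDR'], if running colab's TPU resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu=f'grpc://{tpu_addr}') tf.config.experime

Views: 2, Asked: 2020-01-29, Votes: 0

Answer accepted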

1 answer

Using a TPU in Google Colab

I am currently training a neural network with the help of a TPU. I changed the runtime type and initialized the TPU. I have a feeling it is still not fast. I used https://www.tensorflow.org/guide/tpu. Am I doing something wrong? # TPU initialization resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR']) tf.config.experimental_connect_to_cluster(resolver) #

Views: 16, Asked: 2020-11-05, Votes: 1

Answer accepted

2 answers

Cannot access the TPU from a VM in GCP

Trying to run this code import os import tensorflow as tf from tensorflow.contrib import tpu from tensorflow.contrib.cluster_resolver import TPUClusterResolver def axy_computation(a, x, y): return a * x + y inputs = [ 3.0, tf.ones([3, 3], tf.float32), tf.ones([3, 3], tf.float32), ] tpu_computat

Views: 1, Asked: 2018-10-23, Votes: 2

Answer accepted