
Question: Out of memory when training Rasa/LaBSE

Stack Overflow user

Asked 2022-06-05 05:46:29

Answers: 3, views: 223, followers: 0, score: 1

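I want to train with the LanguageModelFeaturizer using the rasa/LaBSE weights. I followed the steps in the documentation and did not change the default training data.

My config file looks like this: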

# The config recipe.
# https://rasa.com/docs/rasa/model-configuration/
recipe: default.v1

# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: en

pipeline:
# # No configuration for the NLU pipeline was provided. The following default pipeline was used to train your model.
# # If you'd like to customize it, uncomment and adjust the pipeline.
# # See https://rasa.com/docs/rasa/tuning-your-model for more information.
   - name: WhitespaceTokenizer
#   - name: RegexFeaturizer
#   - name: LexicalSyntacticFeaturizer
   - name: LanguageModelFeaturizer
     # Name of the language model to use
     model_name: "bert"
     # Pre-Trained weights to be loaded
     model_weights: "rasa/LaBSE"
     cache_dir: null
   - name: CountVectorsFeaturizer
   - name: CountVectorsFeaturizer
     analyzer: char_wb
     min_ngram: 1
     max_ngram: 4
   - name: DIETClassifier
     epochs: 100
     constrain_similarities: true
     batch_size: 8
   - name: EntitySynonymMapper
   - name: ResponseSelector
     epochs: 100
     constrain_similarities: true
   - name: FallbackClassifier
     threshold: 0.3
     ambiguity_threshold: 0.1

After running rasa train, I get:

tensorflow.python.framework.errors_impl.ResourceExhaustedError: failed to allocate memory [Op:AddV2]

I am using a GTX 1660 Ti with 6 GB of memory. My system specs are:

Rasa
----------------------
rasa                    3.0.8
rasa-sdk                3.0.5

System
----------------------
OS: Ubuntu 18.04.6 LTS x86_64
Kernel: 5.4.0-113-generic
CUDA Version: 11.4
Driver Version: 470.57.02

Tensorflow
----------------------
tensorboard             2.8.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit  1.8.1
tensorflow              2.6.1
tensorflow-addons       0.14.0
tensorflow-estimator    2.6.0
tensorflow-hub          0.12.0
tensorflow-probability  0.13.0
tensorflow-text         2.6.0

Regular training works fine, and I can run the resulting model. I tried reducing batch_size, but the error persists.

Tags: python, tensorflow, rasa, rasa-nlu, rasa-sdk

3 Answers

Stack Overflow user

Accepted answer

Posted 2022-06-26 04:19:36

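Running the same code on a Google-hosted GPU with 16 GB of GPU memory works fine. The model uses 6.5-7 GB of memory, which is more than the 6 GB available on the GTX 1660 Ti.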
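The arithmetic behind this answer can be written out as a small check. This is only an illustrative sketch: the function name `memory_shortfall_gb` is hypothetical, and the numbers are the ones reported in the question (6 GB card) and in this answer (6.5-7 GB used by the model).

```python
def memory_shortfall_gb(required_gb: float, available_gb: float) -> float:
    """Return how many GB are missing, or 0.0 if the requirement fits."""
    return max(0.0, required_gb - available_gb)


GPU_MEMORY_GB = 6.0          # GTX 1660 Ti, as stated in the question
MODEL_USAGE_GB = (6.5, 7.0)  # range reported in this answer

for usage in MODEL_USAGE_GB:
    missing = memory_shortfall_gb(usage, GPU_MEMORY_GB)
    print(f"model needs {usage} GB, card has {GPU_MEMORY_GB} GB, short by {missing} GB")
```

With these figures the card falls short by roughly 0.5 to 1 GB, which would be consistent with the allocation failure even at a small batch_size.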

Score: 0

Stack Overflow user

Posted 2022-06-07 16:45:41

You can create swap memory if your RAM fills up at some point during training.
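Before adding swap, it may help to confirm that system RAM is really what runs out. Swap extends system RAM only; it does not add GPU memory. The sketch below assumes a Linux system (the question mentions Ubuntu) where /proc/meminfo is available; the helper names `parse_meminfo` and `read_meminfo` are hypothetical.

```python
import os


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are reported in kB; a few (such as HugePages_Total) are
    plain counts. Only the leading number on each line is kept.
    """
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def read_meminfo(path: str = "/proc/meminfo") -> dict:
    """Read and parse the kernel's memory report (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    report = read_meminfo()
    for key in ("MemTotal", "MemAvailable", "SwapTotal", "SwapFree"):
        # Values are in kB, so dividing by 1024**2 gives GiB.
        print(f"{key}: {report.get(key, 0) / 1024**2:.1f} GiB")
```

If MemAvailable stays comfortably above zero while training fails, system RAM is probably not the limiting resource and swap is unlikely to help.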

Score: 0

Stack Overflow user

Posted 2022-09-02 06:50:05

I assume the OOM is caused by the DIETClassifier.

Try reducing some of these parameters. I will list the default values below:

- name: DIETClassifier
  epochs: 100
  batch_size: [16, 32]
  number_of_transformer_layers: 2
  embedding_dimension: 20
  hidden_layer_sizes:
    text: [256, 128]
  ...
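As a rough illustration of "reducing some of these parameters", the sketch below halves the size-related values listed in this answer. It is not a recommended configuration: the factor of two is arbitrary, the helper names `halve` and `reduced_diet_config` are hypothetical, the entries hidden behind "..." are left out, and the transformer-layer key is written as `number_of_transformer_layers`, which is the name Rasa's DIETClassifier documentation uses.

```python
import copy

# Values as listed in the answer above; the "..." entries are omitted.
LISTED = {
    "name": "DIETClassifier",
    "epochs": 100,
    "batch_size": [16, 32],
    "number_of_transformer_layers": 2,
    "embedding_dimension": 20,
    "hidden_layer_sizes": {"text": [256, 128]},
}


def halve(value):
    """Halve an int (never below 1), or every int in a list."""
    if isinstance(value, list):
        return [halve(item) for item in value]
    return max(1, value // 2)


def reduced_diet_config(config: dict) -> dict:
    """Return a copy of config with the size-related values halved."""
    smaller = copy.deepcopy(config)
    for key in ("batch_size", "number_of_transformer_layers", "embedding_dimension"):
        smaller[key] = halve(config[key])
    smaller["hidden_layer_sizes"]["text"] = halve(config["hidden_layer_sizes"]["text"])
    return smaller


for key, value in reduced_diet_config(LISTED).items():
    print(f"{key}: {value}")
```

The printed values can be copied into the DIETClassifier entry of the pipeline in the config file. Smaller layers may cost some accuracy, so it is worth reducing one parameter at a time and checking whether training then fits in memory.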

Score: 0

Original page content provided by Stack Overflow.

Original link: https://stackoverflow.com/questions/72505074
