
Problem creating a dataset with TensorFlow

Stack Overflow user
Asked on 2021-02-26 15:09:12
2 answers · 0 views · 0 followers · 0 votes

I want to create a dataset for my Python notebook from zipped images I uploaded to GitHub. I followed the steps I found, but when I run the build command it throws an error.

This is the command I am running:

!tfds build human_dataset

This is the error I get:

2021-02-26 15:02:11.312748: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
INFO[build.py]: Loading dataset human_dataset from path: /content/drive/MyDrive/DeepPerceptionLearning/human_dataset/human_dataset.py
2021-02-26 15:02:13.692513: I tensorflow/core/platform/cloud/google_auth_provider.cc:180] Attempting an empty bearer token since no token was retrieved from files, and GCE metadata check was skipped.
2021-02-26 15:02:13.880000: I tensorflow/core/platform/cloud/google_auth_provider.cc:180] Attempting an empty bearer token since no token was retrieved from files, and GCE metadata check was skipped.
INFO[build.py]: download_and_prepare for dataset human_dataset/1.0.0...
INFO[dataset_builder.py]: Generating dataset human_dataset (/root/tensorflow_datasets/human_dataset/1.0.0)
Downloading and preparing dataset Unknown size (download: Unknown size, generated: Unknown size, total: Unknown size) to /root/tensorflow_datasets/human_dataset/1.0.0...
2021-02-26 15:02:14.132647: I tensorflow/core/platform/cloud/google_auth_provider.cc:180] Attempting an empty bearer token since no token was retrieved from files, and GCE metadata check was skipped.
2021-02-26 15:02:14.329230: I tensorflow/core/platform/cloud/google_auth_provider.cc:180] Attempting an empty bearer token since no token was retrieved from files, and GCE metadata check was skipped.
Dl Completed...: 0 url [00:00, ? url/s]
Dl Size...: 0 MiB [00:00, ? MiB/s]
INFO[download_manager.py]: Skipping download of https://github.com/egjlmn1/DeepPerceptionLearning/archive/master/humans.zip: File cached in /root/tensorflow_datasets/downloads/egjlmn1_DeepPercepti_archive_master_humansqcMvXg2sLJ3_V9MFSR8_gIF2ERgzGdA9jeVBzyg_kvY.zip
INFO[download_manager.py]: Reusing extraction of /root/tensorflow_datasets/downloads/egjlmn1_DeepPercepti_archive_master_humansqcMvXg2sLJ3_V9MFSR8_gIF2ERgzGdA9jeVBzyg_kvY.zip at /root/tensorflow_datasets/downloads/extracted/ZIP.egjlmn1_DeepPercepti_archive_master_humansqcMvXg2sLJ3_V9MFSR8_gIF2ERgzGdA9jeVBzyg_kvY.zip.
Extraction completed...: 0 file [00:00, ? file/s]
Generating splits...:   0% 0/2 [00:00
Traceback (most recent call last):
    sys.exit(launch_cli())
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_datasets/scripts/cli/main.py", line 126, in launch_cli
    app.run(main, flags_parser=_parse_flags)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_datasets/scripts/cli/main.py", line 121, in main
    args.subparser_fn(args)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_datasets/scripts/cli/build.py", line 199, in _build_datasets
    _download_and_prepare(args, builder)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_datasets/scripts/cli/build.py", line 357, in _download_and_prepare
    download_config=dl_config,
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_datasets/core/dataset_builder.py", line 452, in download_and_prepare
    download_config=download_config,
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_datasets/core/dataset_builder.py", line 1187, in _download_and_prepare
    leave=False,
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_datasets/core/dataset_builder.py", line 1182, in 
    for split_name, generator
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_datasets/core/split_builder.py", line 295, in submit_split_generation
    return self._build_from_generator(**build_kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_datasets/core/split_builder.py", line 366, in _build_from_generator
    shard_lengths, total_size = writer.finalize()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_datasets/core/tfrecords_writer.py", line 222, in finalize
    self._shuffler.bucket_lengths, self._path)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_datasets/core/tfrecords_writer.py", line 95, in _get_shard_specs
    shard_boundaries = _get_shard_boundaries(num_examples, num_shards)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_datasets/core/tfrecords_writer.py", line 118, in _get_shard_boundaries
    raise AssertionError("No examples were yielded.")
AssertionError: No examples were yielded.

What does the error mean? It says no examples were generated, but the zip does contain the images I am extracting. Also, it says it is skipping the download because the data is already cached; maybe that is the problem, but how can I force it to re-download the data?
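On the re-download question: the simplest workaround I know of is to delete the cached archives so the next `tfds build` fetches them again. A minimal sketch, assuming the default data dir `~/tensorflow_datasets` that appears in the log above:

```python
# Clear the TFDS download cache so the next `tfds build` re-downloads.
# Assumes the default data dir ~/tensorflow_datasets shown in the log.
import pathlib
import shutil

downloads = pathlib.Path.home() / "tensorflow_datasets" / "downloads"
shutil.rmtree(downloads, ignore_errors=True)  # no-op if the dir is absent
```

If I recall the Python API correctly, passing `tfds.download.DownloadConfig(download_mode=tfds.download.GenerateMode.FORCE_REDOWNLOAD)` to `download_and_prepare` achieves the same thing without touching the cache by hand.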

2 Answers

Stack Overflow user
Answered on 2021-03-01 09:25:59

Fixed it. What I did was put the zip file in the same directory where the dataset is created, and instead of fetching the dataset from a URL I just extract it from the local path.
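In builder terms, this fix means handing `_split_generators` a local archive; `dl_manager.extract` should accept a local path as well as a URL (check your tfds version). The stdlib-only sketch below, with hypothetical file names, shows the same extract-then-walk flow without tfds:

```python
# Stdlib-only sketch of the accepted fix: extract a zip from a local
# path, then walk the extracted tree the way _generate_examples would.
# File and directory names here are hypothetical.
import os
import zipfile

def extract_local_zip(zip_path, out_dir):
    """Extract an archive that sits next to the builder script."""
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out_dir)
    return out_dir

def iter_image_examples(image_dir):
    """Yield (key, example) pairs over the extracted images."""
    for root, _, files in os.walk(image_dir):
        for name in sorted(files):
            if name.endswith((".png", ".jpg")):
                yield name, {"image": os.path.join(root, name)}
```

If the walk yields nothing, the builder raises exactly the `AssertionError: No examples were yielded.` seen above, so checking this loop standalone is a quick diagnostic.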

Votes: 0

Stack Overflow user
Answered on 2022-02-17 11:36:21

Hi, I am also using tfds to create my own dataset, and I ran into the same problem as you.

My dataset sits in a local folder as png files, with no zip.

Could that be causing the problem?

import os

import tensorflow as tf
import tensorflow_datasets as tfds

# Placeholders so the module is self-contained; the original snippet
# referenced these without defining them.
_DESCRIPTION = "Paired low/high-resolution grass images."
_CITATION = ""


class Grass(tfds.core.GeneratorBasedBuilder):
  """DatasetBuilder for face_grass dataset."""

  VERSION = tfds.core.Version('1.0.0')
  RELEASE_NOTES = {
      '1.0.0': 'Initial release.',
  }

  def _info(self) -> tfds.core.DatasetInfo:
    """Returns the dataset metadata."""
    return tfds.core.DatasetInfo(
        builder=self,
        description=_DESCRIPTION,
        features=tfds.features.FeaturesDict({
            "lr": tfds.features.Image(),
            "hr": tfds.features.Image(),
        }),
        # (input, target) tuple used when `as_supervised=True`
        # in `builder.as_dataset`.
        supervised_keys=("lr", "hr"),
        homepage='https://dataset-homepage/',
        citation=_CITATION,
    )

  def _split_generators(self, dl_manager: tfds.download.DownloadManager):
    """Returns SplitGenerators."""
    # The data already lives on local disk, so no download step:
    # path = dl_manager.download_and_extract('https://todo-data-url')
    return [
        tfds.core.SplitGenerator(
            name=tfds.Split.TRAIN,
            gen_kwargs={
                "lr_path": "../data1/project/grass/train/LR",
                "hr_path": "../data1/project/grass/train/HR",
            }),
        tfds.core.SplitGenerator(
            name=tfds.Split.VALIDATION,
            gen_kwargs={
                "lr_path": "../data1/project/grass/valid/LR",
                "hr_path": "../data1/project/grass/valid/HR",
            }),
    ]

  def _generate_examples(self, lr_path, hr_path):
    """Yields (key, example) tuples, pairing each LR png with the
    same-named HR png."""
    for root, _, files in tf.io.gfile.walk(lr_path):
      for file_path in files:
        # Select only png files.
        if file_path.endswith(".png"):
          yield file_path, {
              "lr": os.path.join(root, file_path),
              "hr": os.path.join(hr_path, file_path),
          }
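One common cause of `No examples were yielded` with local folders is a relative path such as `../data1/...` being resolved against the wrong working directory when the builder runs. A quick sanity check, independent of tfds (the path you pass in is your own), is to run the same walk standalone and count the matches:

```python
# Standalone sanity check for the walk in _generate_examples above:
# count the .png files the loop would actually see. A zero count would
# explain "No examples were yielded", e.g. when a relative path like
# "../data1/..." is resolved against the wrong working directory.
import os

def count_pngs(lr_path):
    """Count the .png files that a walk over lr_path finds."""
    total = 0
    for root, _, files in os.walk(lr_path):
        for name in files:
            if name.endswith(".png"):
                total += 1
    return total
```

`tf.io.gfile.walk` behaves like `os.walk` on local disk, so a zero count here means the builder's loop yields nothing too.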
Votes: 0
Original content provided by Stack Overflow; translation supported by Tencent Cloud's translation engine.
Original link:

https://stackoverflow.com/questions/66388210
