Q: Importing data from GitHub using Python in a Jupyter Notebook

Stack Overflow user

Asked 2020-02-10 22:00:50

2 answers, 1.2K views, 0 followers, 0 votes

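I am using the book "Hands-On Machine Learning with Scikit-Learn and TensorFlow" by Aurélien Géron.

This is my first time using Jupyter and Python.

I am trying to follow the code below.

My problem is that when I run the cell containing the following code: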

import os
import tarfile
import urllib
DOWNLOAD_ROOT = "https://raw.githubusercontent.com/ageron/handson-ml2/master/"
HOUSING_PATH = os.path.join("datasets", "housing")
HOUSING_URL = DOWNLOAD_ROOT + "datasets/housing/housing.tgz"
def fetch_housing_data(housing_url=HOUSING_URL, housing_path=HOUSING_PATH):
    os.makedirs(housing_path, exist_ok=True)
    tgz_path = os.path.join(housing_path, "housing.tgz")
    urllib.request.urlretrieve(housing_url, tgz_path)
    housing_tgz = tarfile.open(tgz_path)
    housing_tgz.extractall(path=housing_path)
    housing_tgz.close()

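the cell evaluation never finishes: In[*]: never changes to something like In[1]:.

So I assumed the problem was the original URL, because it shows an error when I open it in my web browser.

I therefore changed it to DOWNLOAD_ROOT = "https://github.com/ageron/handson-ml2/tree/master/".

Now I get In[1]:. However, when I run fetch_housing_data(), I get: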

---------------------------------------------------------------------------
ReadError                                 Traceback (most recent call last)
<ipython-input-6-bd66b1fe6daf> in <module>
----> 1 fetch_housing_data()

<ipython-input-5-ef3c39b342d8> in fetch_housing_data(housing_url, housing_path)
      9     tgz_path = os.path.join(housing_path, "housing.tgz")
     10     urllib.request.urlretrieve(housing_url, tgz_path)
---> 11     housing_tgz = tarfile.open(tgz_path)
     12     housing_tgz.extractall(path=housing_path)
     13     housing_tgz.close()

~\Anaconda3\lib\tarfile.py in open(cls, name, mode, fileobj, bufsize, **kwargs)
   1576                         fileobj.seek(saved_pos)
   1577                     continue
-> 1578             raise ReadError("file could not be opened successfully")
   1579 
   1580         elif ":" in mode:

ReadError: file could not be opened successfully

Why does this happen, and how can I fix it?

jupyter-notebook

python

import

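2 Answers

Stack Overflow user

Accepted answer

Posted 2020-02-11 18:20:12

Did you restart the kernel and try running it again?

What you are seeing is not reproducible.

The first code block you pasted above works as written. There is no need to modify it.

I just ran the code below, and then fetch_housing_data() worked when I ran it in another cell: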

import os
import tarfile
import urllib
DOWNLOAD_ROOT = "https://raw.githubusercontent.com/ageron/handson-ml2/master/"
HOUSING_PATH = os.path.join("datasets", "housing")
HOUSING_URL = DOWNLOAD_ROOT + "datasets/housing/housing.tgz"
def fetch_housing_data(housing_url=HOUSING_URL, housing_path=HOUSING_PATH):
    os.makedirs(housing_path, exist_ok=True)
    tgz_path = os.path.join(housing_path, "housing.tgz")
    urllib.request.urlretrieve(housing_url, tgz_path)
    housing_tgz = tarfile.open(tgz_path)
    housing_tgz.extractall(path=housing_path)
    housing_tgz.close()

Are you sure this isn't just a display artifact, where the cell has finished but you are not seeing it?
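If the cell really does hang on the download, one way to tell a stalled connection apart from a display artifact is to give the request a timeout, so that a stall raises an exception instead of leaving In[*]: on screen. The helper below is only a sketch and is not from the book: the name download_with_timeout and the 30-second default are assumptions chosen for illustration.

```python
import shutil
import urllib.request


def download_with_timeout(url, dest_path, timeout=30):
    """Download url to dest_path (hypothetical helper, not from the book).

    timeout is passed to urlopen and limits each blocking network
    operation (connecting, reading), not the total download time.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(dest_path, "wb") as out_file:
        # Stream the response body to disk in chunks.
        shutil.copyfileobj(response, out_file)
    return dest_path
```

Inside fetch_housing_data, this could stand in for the urllib.request.urlretrieve(housing_url, tgz_path) call; the rest of the function would stay the same.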

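If you want to verify this independently, you can run it somewhere else, as I did. I tested it by going here and pressing the launch binder link at the bottom. Then I pasted your code into a cell there. After running both cells, I had a directory at /home/jovyan/scripts/datasets/housing containing housing.csv and housing.tgz.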
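The same check can be done from a notebook cell rather than by browsing the file system. The sketch below assumes the default relative path datasets/housing from the question and the two file names reported above; missing_housing_files is a hypothetical helper name.

```python
import os


def missing_housing_files(housing_path=os.path.join("datasets", "housing"),
                          expected=("housing.csv", "housing.tgz")):
    """Return the expected file names that are not present in housing_path."""
    if os.path.isdir(housing_path):
        present = set(os.listdir(housing_path))
    else:
        # Directory was never created, so nothing has been downloaded yet.
        present = set()
    return [name for name in expected if name not in present]
```

An empty list means both the archive and the extracted CSV are in place. If only housing.csv is reported missing, the download ran but the extraction did not.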

Votes: 1

Stack Overflow user

Posted 2021-07-24 00:32:48

https://raw.githubusercontent.com/ageron/handson-ml2/master/

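I don't know what kind of link this is. Maybe someone can explain. When I enter it in a browser, the page cannot be reached. However, it does retrieve the data, as I explain in the next paragraph. If I use the actual GitHub link mentioned above, https://github.com/ageron/handson-ml2/tree/master/, the code fails to extract the data.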
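A likely explanation for the difference: raw.githubusercontent.com serves the raw bytes of files in the repository, while github.com/.../tree/... URLs are pages meant for browsing in a web browser. If the file saved as housing.tgz is a web page rather than an archive, tarfile.open raises the ReadError shown in the question. The sketch below inspects what was actually downloaded; describe_download is a hypothetical helper name, and the diagnosis itself is an assumption, since the saved file is not shown in the thread.

```python
import tarfile


def describe_download(path):
    """Report whether path is a tar archive, else preview its first bytes."""
    if tarfile.is_tarfile(path):
        return "tar archive"
    with open(path, "rb") as f:
        head = f.read(60)
    # An HTML page typically begins with something like b"<!DOCTYPE html>".
    return f"not a tar archive; starts with {head!r}"
```

Calling describe_download(os.path.join("datasets", "housing", "housing.tgz")) after the failed run would show whether the saved file begins with HTML markup.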

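By adding one more line to the imports, I was able to extract the CSV file from the link using the steps in the book. I added "import urllib.request". This seems to work for me in Google Colab. You might assume that importing urllib also imports urllib.request, but that is not the case. I can't explain why it works, but one of the examples in the urllib documentation uses "import urllib.request", and that is where I got the idea.

Votes: -2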
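The reason is general Python behavior: importing a package does not automatically import its submodules, so urllib.request exists as an attribute only after something has imported it. In a notebook, another library may already have done so, which would explain why a bare import urllib works in some environments and not in others. The sketch below probes a fresh interpreter to see which case applies on a given machine; the result depends on the environment, so no output is shown. The helper name bare_import_exposes_request is hypothetical.

```python
import subprocess
import sys


def bare_import_exposes_request():
    """Return True if a bare `import urllib` makes urllib.request available
    in a freshly started interpreter on this machine."""
    probe = ("import urllib, sys; "
             "sys.exit(0 if hasattr(urllib, 'request') else 1)")
    result = subprocess.run([sys.executable, "-c", probe])
    return result.returncode == 0


# The explicit submodule import works regardless of what else is loaded.
import urllib.request

print(bare_import_exposes_request())
```

Whatever the probe reports, writing import urllib.request at the top of the cell removes the dependence on what other libraries happen to have imported.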

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/60159092
