I'm running the code below to scrape from Google, but when I try to run it from the terminal I get this error:
File "Coordinate-Scraper.py", line 33, in <module>
for loc in locations_array:
TypeError: iteration over a 0-d array
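I've been trying to work out what is causing this. Any ideas? I added a section to the code to account for Google breaking after a couple hundred observations, but it still refuses to run.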
import pandas as pd
import numpy as np
import requests
import csv
import sys
import json
from bs4 import BeautifulSoup
Locations_file1 = 'Locations_file1.csv'
Locations_Sheet = 'Sheet1'
Col_name = 'Locations'
try:
    if Locations_file1.split(".")[1] == 'csv':
        locations_df = pd.read_excel(Locations_file1, sheetname=Locations_Sheet)
        locations_array = np.asarray(locations_df[Col_name])
    elif Locations_file1.split(".")[1] == 'csv':
        locations_df = pd.read_csv(Locations_file1)
        locations_array = locations_df[Col_name]
except:
    locations_array = np.asarray(Locations_file1)
features = ['Location', 'Latitude', 'Longitude']
Complete_df = pd.DataFrame(columns=features)
c = 0
for loc in locations_array:
    if c < 223:
        c+=1
        continue
    desired_location = loc
    search_url = 'https://www.google.com/search?q='
    url = search_url + str(desired_location.replace(' ','+'))
    r = requests.get(url)
    # print(json.dumps(r.text))
    content = r.text
    # content = r.content
    soup = BeautifulSoup(content, features="html.parser")
    body = soup.find('body')
    # print(body)
    # break
    map_class = body.find('a',href=lambda href: href and "maps" in href)
    # map_class = body.find('a',,{'class' :'VGHMXd'})
    # print(map_class)
    map_url = map_class.get('href')
    r_map = requests.get(map_url)
    content_map = r_map.text
    soup_map = BeautifulSoup(content_map, features="html.parser")
    head = soup_map.find('head')
    url_long_lat = head.find_all('meta')[8].get('content')
    Lat, Long = url_long_lat[url_long_lat.find('center=')+len('center='):url_long_lat.rfind('&zoom')].split('%2C')
    location_info = pd.DataFrame([[desired_location,Lat,Long]])
    location_info.columns = features
    Complete_df = Complete_df.append(location_info, ignore_index = True)
    print(Complete_df)
Complete_df.to_csv('Locations_Latitude_Longitude.csv')
Posted on 2020-11-16 13:46:40
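You have two problems.
First, the condition
if Locations_file1.split(".")[1] == 'csv':
should check for 'xlsx' or another spreadsheet format, since that branch calls pd.read_excel.
Second, you got an exception in the try block. In the except block you convert the file name Locations_file1 to a NumPy array, so locations_array holds only the string value of Locations_file1. If you print locations_array, you will see Locations_file1.csv on your console. Interestingly, locations_array is an object whose __str__() returns the value of the string, but the variable itself is a 0-dimensional array: it wraps a single scalar and has no axis to iterate over, which is why the for loop raises the TypeError.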
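Put simply, this code:
import numpy as np
strs = "Hello world"
arr = np.asarray(strs)  # a plain string becomes a 0-d array
print(arr.__str__())
print(arr)
print(arr.__repr__())
for i in arr:  # raises TypeError: a 0-d array has no axis to loop over
    print(i)
gives the following output: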
Hello world
Hello world
array('Hello world', dtype='<U11')
Traceback (most recent call last):
File "/home/user/tests.py", line 73, in <module>
for i in arr:
TypeError: iteration over a 0-d array
Process finished with exit code 1
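As a minimal sketch of how to spot this situation before the loop runs, the snippet below checks the array's number of dimensions. It uses the file name from the question as a stand-in value. np.atleast_1d is shown only to illustrate the difference between a 0-d and a 1-d array; it would not fix the scraper, because the loop would then search for the file name itself.

```python
import numpy as np

# A plain string turns into a 0-d array: one scalar, no axes.
arr = np.asarray("Locations_file1.csv")
print(arr.ndim)    # number of dimensions
print(arr.shape)   # empty tuple for a 0-d array
print(arr.item())  # the single Python value stored inside

# Wrapping the scalar in one axis makes it iterable, with exactly one element.
wrapped = np.atleast_1d(arr)
print(wrapped.shape)
for value in wrapped:
    print(value)
```

Checking arr.ndim == 0 right after the try/except block is a cheap way to confirm that the fallback branch ran instead of the file being read.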
Update:
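Now, about the try:...except: block: first of all, you catch every possible exception there and ignore it, assuming that if reading the file did not work, it must already have been read. However, nothing in your code actually reads the data in the failure case. I suggest changing the except block to: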
except Exception as e:
    print(e)
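This will tell you exactly what went wrong. I would guess there are two possible cases:
Your file is bad and you failed to read it into a pandas.DataFrame.
The DataFrame has no column matching the Col_name value "Locations".
Posted on 2020-11-16 13:27:33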
I think the error is in this part of the code:
try:
    if Locations_file1.split(".")[1] == 'csv':
        locations_df = pd.read_excel(Locations_file1, sheetname=Locations_Sheet)
        locations_array = np.asarray(locations_df[Col_name])
    elif Locations_file1.split(".")[1] == 'csv':
        locations_df = pd.read_csv(Locations_file1)
        locations_array = locations_df[Col_name]
except:
    locations_array = np.asarray(Locations_file1)
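Both the if and the elif statements test the same condition, Locations_file1.split(".")[1] == 'csv', so the elif branch can never run. There is also no else statement, so if Locations_file1.split(".")[1] is not equal to 'csv', locations_array is never populated. You might expect the except part of the code to catch all other cases, but it may not trigger, because it is theoretically possible for the condition to be false without any exception being raised. Another error may be hiding behind this one, but I think this is the root cause.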
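The points raised in both answers can be combined into one loading routine. The following is only a sketch under stated assumptions: load_locations is a hypothetical helper name, the supported extensions are a guess at what the script needs, and it uses the sheet_name keyword of pd.read_excel, which is the spelling in current pandas versions (the question's sheetname is an older spelling). It fails loudly instead of falling back to the file name.

```python
import os

import pandas as pd


def load_locations(path, col_name, sheet_name=0):
    """Hypothetical helper: return one column of a CSV or Excel file as a 1-d array."""
    # splitext handles file names with several dots, unlike split(".")[1].
    ext = os.path.splitext(path)[1].lower()
    if ext == ".csv":
        df = pd.read_csv(path)
    elif ext in (".xlsx", ".xls"):
        df = pd.read_excel(path, sheet_name=sheet_name)
    else:
        # Explicit else branch: unknown formats raise instead of passing silently.
        raise ValueError(f"Unsupported file extension: {ext!r}")
    if col_name not in df.columns:
        raise KeyError(f"Column {col_name!r} not found; available: {list(df.columns)}")
    return df[col_name].to_numpy()
```

With the names from the question, the call would look like locations_array = load_locations(Locations_file1, Col_name, sheet_name=Locations_Sheet). Any failure then surfaces as a readable exception at load time rather than as a TypeError inside the scraping loop.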
https://stackoverflow.com/questions/64789076