Getting YouTube video comments using a JSON file
import simplejson as json
from urllib.request import urlopen
import sys
import time
import csv
import os
import io
os.chdir(r'C:\Users\adity\Desktop\data science')
csvFile =open('test1.csv',"w")
#csvFile =open('test.tsv',"w")
#writer = csv.writer(csvFile,delimiter=',')
#writer.writerow('Comments')
csvFile.write("comments\n")
STAGGER_TIME = 1
# open the url and the screen name
# (The screen name is the screen name of the user for whom to return results for)
url = "https://www.googleapis.com/youtube/v3/commentThreads?key=AIzaSyCYkTUjKgFGcKDnkNQMgSBbb4obnqIzUEM&textFormat=plainText&part=snippet&videoId=Ye8mB6VsUHw&maxResults=100"
# This takes a Python object and dumps it to a string, which is the JSON representation of that object.
url1=urlopen(url)
#data = json.load(urllib2.urlopen(url))
result = json.load(url1)
# print the result
itemList= result.get("items")
length=len(itemList)
for i in range(0,length):
    results= (result["items"][i].get('snippet').get("topLevelComment").get('snippet').get("textDisplay")).encode("utf-8")
    print(results)
    results=results.replace(",", "")
    #print (result["items"][i].get('snippet').get("topLevelComment").get('snippet').get("textDisplay")).encode("utf-8")
    #writer.writerow((result["items"][i].get('snippet').get("topLevelComment").get('snippet').get("textDisplay")).encode("utf-8"))
    csvFile.write(results)
    csvFile.write('\n')
    time.sleep(STAGGER_TIME)
csvFile.close()
Getting this error: TypeError: a bytes-like object is required, not 'str'
TypeError Traceback (most recent call last)
<ipython-input-112-a5225431e178> in <module>()
32 results= (result["items"][i].get('snippet').get("topLevelComment").get('snippet').get("textDisplay")).encode("utf-8")
33 print(results)
---> 34 results=results.replace(",", "")
35 #print (result["items"][i].get('snippet').get("topLevelComment").get('snippet').get("textDisplay")).encode("utf-8")
36 #writer.writerow((result["items"][i].get('snippet').get("topLevelComment").get('snippet').get("textDisplay")).encode("utf-8"))
TypeError: a bytes-like object is required, not 'str'
Posted on 2018-06-11 05:53:32
results= (result["items"][i].get('snippet').get("topLevelComment").get('snippet').get("textDisplay")).encode("utf-8")
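The culprit here is the final part, .encode("utf-8"), which converts the string to bytes. That is fine until you call replace with ordinary str arguments: in Python 3, bytes.replace only accepts bytes-like arguments, hence the TypeError. Suggestions (pick whichever suits you best):
Option 1: if you can, simply remove that part from the line: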
results = result["items"][i].get('snippet').get("topLevelComment").get('snippet').get("textDisplay")
Option 2: add decode before calling replace:
results = results.decode().replace(",", "")
Option 3: call replace with bytes arguments:
results = results.replace(b",", b"")
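To compare the three options side by side, here is a minimal sketch that reproduces the error and then applies each fix. The sample comment text is made up for illustration and does not come from the API, and io.StringIO is used only as a stand-in for the file opened with mode "w".

```python
import io

# Hypothetical sample standing in for one "textDisplay" value from the API response.
text_display = "Great video, thanks!"

# What the question's code does: encode first, so `encoded` is a bytes object.
encoded = text_display.encode("utf-8")

# Reproduce the error: bytes.replace() rejects str arguments in Python 3.
try:
    encoded.replace(",", "")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Option 1: never encode, so str.replace() works directly.
option1 = text_display.replace(",", "")

# Option 2: decode back to str before replacing.
option2 = encoded.decode("utf-8").replace(",", "")

# Option 3: stay in bytes and pass bytes arguments.
option3 = encoded.replace(b",", b"")

# A text stream accepts str but rejects bytes, just like a file opened with "w".
text_stream = io.StringIO()
text_stream.write(option1 + "\n")
try:
    text_stream.write(option3)
    write_error = None
except TypeError as exc:
    write_error = type(exc).__name__

print(error_message)
print(type(option1).__name__, type(option2).__name__, type(option3).__name__)
print(write_error)
```

Options 1 and 2 both end with a str, which the rest of the question's code can write as is. Option 3 ends with bytes, so it only fits if the output file is opened in binary mode.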
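Option 1 is the ideal choice, because it is simpler and more compatible with the rest of the code: there is no need to convert to bytes in the first place, and I don't see what purpose it serves. Note that Option 3 leaves results as bytes, so the later csvFile.write(results) would then fail, since the file is opened in text mode.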
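As a rough sketch of how the loop might look with Option 1 applied, the snippet below runs without a network call. The `result` dictionary is a hypothetical stand-in for the parsed API response and contains only the keys the question's loop reads. The helper name `write_comments` is also made up for illustration.

```python
import io

# Hypothetical stand-in for the parsed API response (json.load(url1) in the question).
result = {
    "items": [
        {"snippet": {"topLevelComment": {"snippet": {"textDisplay": "Nice, clear video"}}}},
        {"snippet": {"topLevelComment": {"snippet": {"textDisplay": "Thanks!"}}}},
    ]
}


def write_comments(result, out):
    """Write a header line, then one comma-free comment per line, to a text stream."""
    out.write("comments\n")
    for item in result.get("items", []):
        # Option 1: keep textDisplay as str, so str.replace() works.
        text = item["snippet"]["topLevelComment"]["snippet"]["textDisplay"]
        out.write(text.replace(",", "") + "\n")


# io.StringIO stands in for the file opened with open('test1.csv', "w").
buffer = io.StringIO()
write_comments(result, buffer)
print(buffer.getvalue())
```

If the .encode("utf-8") call was meant to control the encoding of the output file, passing encoding="utf-8" to open() is one way to get that while still writing str values.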
https://stackoverflow.com/questions/50788282