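I've been working on this for hours and need some help. It mostly works: I can connect to Twitter, pull the JSON data, and store it in MongoDB, but not all of the data I see from the print(tweet) line shows up in MongoDB. Specifically, I don't see the screen_name field (or name, for that matter). I really only need these fields: "id", "text", "created_at", "screen_name", "retweet_count", "favourites_count", "lang", and I get all of them except the name. I'm not sure why it isn't inserted into the database along with the other fields. Any help would be greatly appreciated!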
from twython import Twython
from pymongo import MongoClient

ConsumerKey = "XXXXX"
ConsumerSecret = "XXXXX"
AccessToken = "XXXXX-XXXXX"
AccessTokenSecret = "XXXXX"

twitter = Twython(ConsumerKey,
                  ConsumerSecret,
                  AccessToken,
                  AccessTokenSecret)

result = twitter.search(q="drexel", count='100')
result1 = result['statuses']

for tweet in result1:
    print(tweet)  # prints tweets so I know I got data

client = MongoClient('mongodb://localhost:27017/')
db = client.twitterdb
tweet_collection = db.twitter_search

# Fields I need: ["id", "text", "created_at", "screen_name", "retweet_count", "favourites_count", "lang"]
for tweet in result1:
    try:
        # insert_one replaces the deprecated Collection.insert
        tweet_collection.insert_one(tweet)
    except Exception as exc:
        # report the failure instead of silently swallowing it
        print("insert failed:", exc)

print("The number of tweets in English: ")
# the filter must be passed as a query document, not as a keyword argument
print(tweet_collection.count_documents({"lang": "en"}))
Posted on 2018-06-04 08:30:01
You can use something like this:
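def get_document(post):
    # Flatten the tweet into only the fields that are needed.
    return {
        'id': post['id_str'],
        'text': post['text'],
        'created_at': post['created_at'],
        'retweet_count': post['retweet_count'],
        # favourites_count and screen_name are nested under 'user';
        # the tweet's own like count is the top-level 'favorite_count'.
        'favourites_count': post['user']['favourites_count'],
        'lang': post['lang'],
        'screen_name': post['user']['screen_name']
    }

for tweet in result1:
    try:
        # insert_one replaces the deprecated Collection.insert
        tweet_collection.insert_one(get_document(tweet))
    except Exception as exc:
        # report the failure instead of silently swallowing it
        print("insert failed:", exc)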
That should work.
Posted on 2018-06-04 07:39:16
The "screen_name" field is nested inside the "user" part of the tweet metadata. Make sure you are drilling down far enough.
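To illustrate the nesting, here is a minimal sketch that needs neither Twitter nor MongoDB. The sample_tweet dictionary and the get_screen_name helper are made up for illustration; a real search result carries many more keys, but the position of screen_name under "user" is the same.

```python
# Hypothetical, trimmed-down tweet; real search results carry many more keys.
sample_tweet = {
    "id_str": "1",
    "text": "hello from drexel",
    "lang": "en",
    "user": {"screen_name": "example_user", "name": "Example User"},
}


def get_screen_name(tweet):
    """Return the author's screen_name, or None if the user block is missing."""
    return tweet.get("user", {}).get("screen_name")


print(sample_tweet.get("screen_name"))  # top-level lookup: prints None
print(get_screen_name(sample_tweet))    # nested lookup: prints example_user
```

Using .get with a default avoids a KeyError on results that lack a "user" block. Once the whole tweet is stored in MongoDB, the same field is addressed with dot notation, for example "user.screen_name" in a query filter.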
https://stackoverflow.com/questions/50671834