I'm trying to get the username from an Instagram page. I need to use part of the data obtained after "data = soup.find_all('script')[3]", which looks like this:
{"config":{"csrf_token":"hIuZDxW17bTXz5EDLY25ftqivOOrLEeZ","viewer":null,"viewerId":null},"supports_es6":false,"country_code":"RU","language_code":"en","locale":"en_US","entry_data":{"PostPage":[{"graphql":{"shortcode_media":{"__typename":"GraphImage","id":"1968747493659350883","shortcode":"BtSZWokAZdj","dimensions":{"height":640,"width":640},"gating_info":null,"display_url":"https://instagram.fhel3-1.fna.fbcdn.net/vp/68311f4b09669fd75609e9fcabbf1ae0/5D0517DE/t51.2885-15/e35/49907137_294327238101721_6745007497573009307_n.jpg?_nc_ht=instagram.fhel3-1.fna.fbcdn.net","display_resources":[{"src":"https://instagram.fhel3-1.fna.fbcdn.net/vp/68311f4b09669fd75609e9fcabbf1ae0/5D0517DE/t51.2885-15/e35/49907137_294327238101721_6745007497573009307_n.jpg?_nc_ht=instagram.fhel3-1.fna.fbcdn.net","config_width":640,"config_height":640},{"src":"https://instagram.fhel3-1.fna.fbcdn.net/vp/68311f4b09669fd75609e9fcabbf1ae0/5D0517DE/t51.2885-15/e35/49907137_294327238101721_6745007497573009307_n.jpg?_nc_ht=instagram.fhel3-1.fna.fbcdn.net","config_width":750,"config_height":750},{"src":"https://instagram.fhel3-1.fna.fbcdn.net/vp/68311f4b09669fd75609e9fcabbf1ae0/5D0517DE/t51.2885-15/e35/49907137_294327238101721_6745007497573009307_n.jpg?_nc_ht=instagram.fhel3-1.fna.fbcdn.net","config_width":1080,"config_height":1080}],"accessibility_caption":"Image may contain: one or more people and closeup","is_video":false,"should_log_client_event":false,"edge_media_to_tagged_user":{"edges":[]},"edge_media_to_caption":{"edges":[{"node":{"text":"\u2022\nScars show your story. \nYour pain. \nYour hate.\nYour sadness and despair. \nThey make you who you are, and one of a kind with every different mark. \nSome stay, some go.\nSome brighter, some lighter.\nSome bigger, some smaller.\nSome deeper, some one the surface. \nBut they are really all the same, you see?\nThey are all scars, just telling different points of our life, our story. \nOur souvenir throughout our whole life, that shows us how much we've grown. \nHow much we have overcome. How strong we've become.\nHow brave and courageous we've become from the hardest and darkest times of our life. \u2022\n\u2022\n\u2022\n\u2022\n#poem #cuts #selfharm #tatoo #dark #pain #sad #lonely #anxiety #depressed"}}]},"caption_is_edited":true,"has_ranked_comments":false,"edge_media_to_comment":{"count":1,"page_info":{"has_next_page":false,"end_cursor":null},"edges":[]},"comments_disabled":false,"taken_at_timestamp":1548913011,"edge_media_preview_like":{"count":17,"edges":[]},"edge_media_to_sponsor_user":{"edges":[]},"location":null,"viewer_has_liked":false,"viewer_has_saved":false,"viewer_has_saved_to_collection":false,"viewer_in_photo_of_you":false,"viewer_can_reshare":true,"owner":{"id":"10173498181","is_verified":false,"profile_pic_url":"https://instagram.fhel3-1.fna.fbcdn.net/vp/9a17134e8d0a36efec53f1da5cac1f38/5D14BC0F/t51.2885-19/s150x150/47690762_475199173011446_4764198224049209344_n.jpg?_nc_ht=instagram.fhel3-1.fna.fbcdn.net","username":"devils..tea.","blocked_by_viewer":false,"followed_by_viewer":false,"full_name":"devils..tea\ud83e\udd40","has_blocked_viewer":false,"is_private":false,"is_unpublished":false,"requested_by_viewer":false}......
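There is a "username" part (near the end of the blockquote). I thought it was a string, but I can't grab it. So it's not a string, but then what is it? Is it a class? Which method should I use to retrieve the username "username":"devils..tea."? Thanks in advance for any help.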
....
req = requests.get(url)
soup = BeautifulSoup(req.text, "lxml")
data = soup.find_all('script')[3]
username = data.find_all_next(string="username")
print(username)
Answered 2019-03-12 04:07:41
Alternatively, for those of us who don't like regular expressions, you could try this:
# data must be a str holding the script text quoted in the question
data = '''[your quote above]'''
data_list = data.split(",")
for i in data_list:
    if 'username' in i:
        print(i)
Output:
"username":"devils..tea."
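The loop above prints the whole "username":"devils..tea." fragment. If only the value is wanted, one possible sketch follows. The helper name `find_username` and the shortened `sample` string are made up for illustration, and splitting on commas assumes that the username itself contains no comma.

```python
def find_username(text):
    """Return the value of the first "username" field in text, or None."""
    marker = '"username":"'
    for piece in text.split(","):
        if marker in piece:
            # Keep what follows the marker, then cut at the closing quote.
            after_marker = piece.split(marker, 1)[1]
            return after_marker.split('"', 1)[0]
    return None


# Shortened stand-in for the script text quoted in the question.
sample = '"is_verified":false,"username":"devils..tea.","blocked_by_viewer":false'
print(find_username(sample))
```

This is still plain string handling, so it shares the fragility of the split approach: a change in key order or spacing in the page would break it.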
Answered 2019-03-12 02:08:57
You can use a regex:
import re
data = '''
<script type="text/javascript">window._sharedData = {"config":{"csrf_token":"hIuZDxW17bTXz5EDLY25ftqivOOrLEeZ","viewer":null,"viewerId":null},"supports_es6":false,"country_code":"RU","language_code":"en","locale":"en_US","entry_data":{"PostPage":[{"graphql":{"shortcode_media":{"__typename":"GraphImage","id":"1968747493659350883","shortcode":"BtSZWokAZdj","dimensions":{"height":640,"width":640},"gating_info":null,"media_preview":"ACoq5miitSxxIGTHPXPGcd8ZFAGXRXSSWypFsAAZ/lzjpn/Csm5sjAu7Ib8MUAUaKU0lABVq0lMUqsPUA/Q8VVpynBB9CKAOtuOFB9CD+uP5Gq19HuiOPTP5Ul1exhdgy7kdF7fU/wCGatJiRPqv5ZFIZybnP4UynOpUlT1HFNpiClDFeRSUUATLcSJ904+lPF5MvR2H41WooAc7lzuY5J702iigD//Z","display_url":"https://instagram.fhel3-1.fna.fbcdn.net/vp/68311f4b09669fd75609e9fcabbf1ae0/5D0517DE/t51.2885-15/e35/49907137_294327238101721_6745007497573009307_n.jpg?_nc_ht=instagram.fhel3-1.fna.fbcdn.net","display_resources":[{"src":"https://instagram.fhel3-1.fna.fbcdn.net/vp/68311f4b09669fd75609e9fcabbf1ae0/5D0517DE/t51.2885-15/e35/49907137_294327238101721_6745007497573009307_n.jpg?_nc_ht=instagram.fhel3-1.fna.fbcdn.net","config_width":640,"config_height":640},{"src":"https://instagram.fhel3-1.fna.fbcdn.net/vp/68311f4b09669fd75609e9fcabbf1ae0/5D0517DE/t51.2885-15/e35/49907137_294327238101721_6745007497573009307_n.jpg?_nc_ht=instagram.fhel3-1.fna.fbcdn.net","config_width":750,"config_height":750},{"src":"https://instagram.fhel3-1.fna.fbcdn.net/vp/68311f4b09669fd75609e9fcabbf1ae0/5D0517DE/t51.2885-15/e35/49907137_294327238101721_6745007497573009307_n.jpg?_nc_ht=instagram.fhel3-1.fna.fbcdn.net","config_width":1080,"config_height":1080}],"accessibility_caption":"Image may contain: one or more people and 
closeup","is_video":false,"should_log_client_event":false,"tracking_token":"eyJ2ZXJzaW9uIjo1LCJwYXlsb2FkIjp7ImlzX2FuYWx5dGljc190cmFja2VkIjp0cnVlLCJ1dWlkIjoiN2Q1Yjg2NmY5OGIwNDVhNWIxMmRhNjEwZTA3NDY1MmYxOTY4NzQ3NDkzNjU5MzUwODgzIn0sInNpZ25hdHVyZSI6IiJ9","edge_media_to_tagged_user":{"edges":[]},"edge_media_to_caption":{"edges":[{"node":{"text":"\u2022\nScars show your story. \nYour pain. \nYour hate.\nYour sadness and despair. \nThey make you who you are, and one of a kind with every different mark. \nSome stay, some go.\nSome brighter, some lighter.\nSome bigger, some smaller.\nSome deeper, some one the surface. \nBut they are really all the same, you see?\nThey are all scars, just telling different points of our life, our story. \nOur souvenir throughout our whole life, that shows us how much we've grown. \nHow much we have overcome. How strong we've become.\nHow brave and courageous we've become from the hardest and darkest times of our life. \u2022\n\u2022\n\u2022\n\u2022\n#poem #cuts #selfharm #tatoo #dark #pain #sad #lonely #anxiety 
#depressed"}}]},"caption_is_edited":true,"has_ranked_comments":false,"edge_media_to_comment":{"count":1,"page_info":{"has_next_page":false,"end_cursor":null},"edges":[]},"comments_disabled":false,"taken_at_timestamp":1548913011,"edge_media_preview_like":{"count":17,"edges":[]},"edge_media_to_sponsor_user":{"edges":[]},"location":null,"viewer_has_liked":false,"viewer_has_saved":false,"viewer_has_saved_to_collection":false,"viewer_in_photo_of_you":false,"viewer_can_reshare":true,"owner":{"id":"10173498181","is_verified":false,"profile_pic_url":"https://instagram.fhel3-1.fna.fbcdn.net/vp/9a17134e8d0a36efec53f1da5cac1f38/5D14BC0F/t51.2885-19/s150x150/47690762_475199173011446_4764198224049209344_n.jpg?_nc_ht=instagram.fhel3-1.fna.fbcdn.net","username":"devils..tea.","blocked_by_viewer":false,"followed_by_viewer":false,"full_name":"depressed\ud83e\udd40","has_blocked_viewer":false,"is_private":false,"is_unpublished":false,"requested_by_viewer":false}......
'''
# Non-greedy capture, so the match stops at the first ","blocked after the username
r = re.compile(r'username":"(.*?)(?=","blocked)')
print(r.findall(data))
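Since the text after "window._sharedData = " is JSON, another option is to parse it instead of pattern-matching. Below is a minimal sketch, assuming the full, untruncated script text is available (the excerpt quoted above ends in "......" and would not parse as is). The function name `extract_username` and the trimmed `sample` are hypothetical; the key path is the one visible in the quoted data.

```python
import json

MARKER = "window._sharedData = "


def extract_username(script_text):
    """Parse the JSON object that follows MARKER and return the owner's username."""
    start = script_text.index(MARKER) + len(MARKER)
    # raw_decode parses one JSON value starting at `start` and ignores
    # whatever follows it (for example a trailing ";</script>").
    shared_data, _ = json.JSONDecoder().raw_decode(script_text, start)
    media = shared_data["entry_data"]["PostPage"][0]["graphql"]["shortcode_media"]
    return media["owner"]["username"]


# Trimmed stand-in for the real script text; only the keys on the path are kept.
sample = (
    '<script type="text/javascript">window._sharedData = '
    '{"entry_data":{"PostPage":[{"graphql":{"shortcode_media":'
    '{"owner":{"id":"10173498181","username":"devils..tea."}}}}]}};</script>'
)
print(extract_username(sample))
```

With the question's code, `script_text` would presumably be `data.string`: `soup.find_all('script')[3]` returns a BeautifulSoup Tag rather than a str, and `find_all_next(string="username")` looks for a text node equal to "username", not for a substring, which is likely why it came back empty.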
https://stackoverflow.com/questions/55107715