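Python / BeautifulSoup: HTML path of the current element_Unable to find an HTML element with BeautifulSoup in Python_How to find a specific HTML element with BeautifulSoup in Python - Tencent Cloud Developer Community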

python, visual-studio-code, interpreter

import requests from bs4 import BeautifulSoup import openai #write each line of nuclear.txt to a list with open('nuclear.txt', 'r') as f: lines = f.readlines() #remove the newline character from each line lines = [line.rstrip() for line in lines] #gather the text from each we

23 views, asked 2022-11-04, 0 votes

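2 answers

Spyder3 crashes after installing Jupyter Notebook

ubuntu-16.04, spyder

On my laptop I was using Spyder3 without any problems until I installed Jupyter Notebook. When I run spyder3 from the command line, the following message appears: File "/usr/lib/python2.7/dist-packages/bs4/builder/_html5lib.py", line 70, in class TreeBuilderForHtml5lib(html5lib.treebuilders._base.TreeBuilder): AttributeError: 'module' object has no attribute '_base' After some searching, I tried the following suggested solution: Try: sudo pip install --upgrade beautifulsou

0 views, asked 2018-09-22, 1 vote

Answer accepted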

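2 answers

How do I get a tree of all XPaths in a website using Python?

python, html, selenium, xpath, tree

Method 1: While trying to get a hierarchical tree of all XPaths in a website () using Python, I first tried to get the XPath of the branch /html/body with the following approach: from selenium import webdriver url = 'https://startpagina.nl' driver = webdriver.Firefox() driver.get(url) test = driver.find_elements_by_xpath('//*') print(len(test)) driver.close() According to @Prophet's answer, this produces a list of all the elements on a website

12 views, asked 2022-11-24, 0 votes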

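2 answers

Selenium Python returns empty after the page updates

javascript, python, selenium, scrape

I am using Selenium Python and BeautifulSoup to scrape data. I need the site's HTML after the 'Live' button is clicked. I am able to find and click the button, but the new HTML is not returned to me. I thought the HTML was being returned too quickly after the button click, so I added a sleep. But even so, it only returns an empty div with the class 'Collapsible__contentInner'. from bs4 import BeautifulSoup from selenium import webdriver from selenium.webdriver.common.by import By from selenium.webdriver.s

0 views, asked 2020-08-24, 0 votes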

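1 answer

Error "'Service' object has no attribute 'process'" when extracting with Python Beautiful Soup and Selenium

python, selenium, selenium-webdriver, beautifulsoup

I am using this code to scrape some data from a link. Because the actual script with the tags I want to extract only loads after about 15 seconds, someone suggested that I introduce a delay in my code. So I used the following code: #!/usr/bin/python import urllib import time from selenium import webdriver from selenium.webdriver.support.ui import WebDriverWait from bs4 import BeautifulSoup from dateutil.parser import parse from datetime import timedel

1 view, asked 2016-11-28, 0 votes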

1 answer

pygooglevoice parse_sms Beautiful Soup error

python, python-2.7, beautifulsoup, google-voice

I am trying to run the pygooglevoice example script parse_sms.py to download the content of SMS messages with Python, and I get the following error: Traceback (most recent call last): File "C:\Python27\pygooglevoice-0.5-extras\examples\parse_sms.py", line 39, in <module> for msg in extractsms(voice.sms.html): File "C:\Python27\pygooglevoice-0.5-extras\examp

2 views, asked 2013-12-27, 1 vote

Answer accepted

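1 answer

How do I import BeautifulSoup from bs4 in Python 3 on OS X 10.12.5?

python-3.x, beautifulsoup

I am trying to solve a common problem with importing modules in Python 3. I am running OS X 10.12.5 with Python 3 installed on my MacBook Air, and I use Sublime Text to edit and run my code. When I try this import: from bs4 import BeautifulSoup ...I get the following error: Traceback (most recent call last): File "/Users/<myname>/Python/code-python3/Pgm#001", line 5, in <module> from bs4 import Beaut

0 views, asked 2017-07-04, 1 vote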

1 answer

I am trying to parse cricket scores from a website and also want an alert when the score changes. What is wrong with my code?

python, automation, beautifulsoup, html-parsing

I parsed the cricket score of one team and stored it in a text file. Finally, I want to compare it with the web content I parse. But nothing happens. Here is my code: import urllib.request from bs4 import BeautifulSoup import re import time url = "http://www.cricbuzz.com/live-cricket-scores/15788/ind-vs-pak-19th-match-super-10-group-2-icc-world-t20-2016" def hello(): fine = urllib.request.

0 views, asked 2016-03-20, 1 vote

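1 answer

Scheduler runs the Python script but does not produce the CSV file

python, python-3.x, windows, windows-task-scheduler

I wrote a script in Python to scrape some links from a web page and write them to a CSV file. When run from the IDE, my script does this correctly. When I run the same script using Windows Task Scheduler, I can see a command prompt pop up, and the script runs there and prints the results, but when the task finishes I do not get any CSV file. Am I missing something? What should I change to get the CSV file when the script is run from a .bat file through Windows Task Scheduler? The .bat file contains: @echo off "C:\Users\WCS\AppData\Local\Programs\Python\

0 views, asked 2018-12-02, 1 vote

Answer accepted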

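1 answer

Are there file naming rules for Python?

python, python-3.x, python-module

Environment: Python version: 3.4.2, OS version: OS X Mavericks. Hi, I wanted to try some web scraping examples with Python, so I created a script file and named it "html.py" (in my project directory). However, when I run it with python3, it produces an error like this: Traceback (most recent call last): File "html.py", line 1, in <module> from bs4 import BeautifulSoup File "/Library/Frameworks/Python.framework/Versio

6 views, asked 2014-11-17, 0 votes

Answer accepted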
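The traceback above is cut off, but a plausible cause is that a script named html.py shadows the standard-library `html` package, which bs4 itself imports. The sketch below shows one way to check a project directory for such name clashes. The helper name `shadowing_files` and the demo file names are hypothetical, and the fallback set of module names is only an illustrative subset for Python versions older than 3.10.

```python
import sys
import tempfile
from pathlib import Path

# sys.stdlib_module_names exists on Python 3.10+; the fallback is an
# illustrative subset (assumption), not a complete list.
STDLIB_NAMES = getattr(
    sys, "stdlib_module_names", frozenset({"html", "copy", "json", "email"})
)


def shadowing_files(directory):
    """Return the .py files in `directory` whose name matches a stdlib module."""
    return sorted(
        path.name
        for path in Path(directory).glob("*.py")
        if path.stem in STDLIB_NAMES
    )


# Demo with a throwaway project directory (hypothetical file names).
with tempfile.TemporaryDirectory() as project:
    for name in ("html.py", "scrape_demo.py"):
        Path(project, name).write_text("# placeholder\n", encoding="utf-8")
    demo_conflicts = shadowing_files(project)

print(demo_conflicts)
```

Renaming any file the check reports (and deleting stale `__pycache__` entries for it) is usually enough to make the standard-library module importable again.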

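4 answers

BeautifulSoup import does not work in VS Code even though it is installed on my computer

python, visual-studio-code, beautifulsoup

My current problem is that importing Beautiful Soup does not work, even though it is installed on my computer. I keep getting the error "No module named 'bs4'". I am currently using VS Code, but I also launched Python IDLE and it did not work there either. If anyone knows what is going on, that would be a great help. from pip._vendor import requests from bs4 import BeautifulSoup url = 'https://someonerandomwebsite' r = requests.get(url) b_soup = BeautifulSoup(r.con

3 views, asked 2020-12-08, 1 vote

Answer accepted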

4 answers

ImportError: No module named html.entities

python

I am trying to get this module working on a server, but I get the error in the title. My script: from bs4 import BeautifulSoup When I run it: aclark@tycho ~ % python test.py Traceback (most recent call last): File "test.py", line 1, in <module> from bs4 import BeautifulSoup File "/usr/lib/python2.7/site-packages/bs4/__init__.py", line

0 views, asked 2014-12-09, 7 votes

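1 answer

Cannot use BeautifulSoup

python, beautifulsoup

I am using Python 3.6 and have installed beautifulsoup4 with pip install beautifulsoup4. But if I enter from bs4 import BeautifulSoup in a Python 3 environment, I get the following traceback. I have updated Beautiful Soup and html5 as suggested in some similar posts, but that has not solved the problem. {'results': [], 'status': 'ZERO_RESULTS'} AttributeError: module 'copy' has no attribute 'deepcopy' T

0 views, asked 2017-09-23, 2 votes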

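2 answers

Print the contents of all HTML files in a directory using BeautifulSoup

python, python-3.x, web-scraping, beautifulsoup

I used BeautifulSoup to open a directory containing 200 HTML files, but when I try to print the contents of all of them with print(soup.prettify()), it only shows the contents of one HTML file. The same happens if I try soup.find('title'): it only loads the title of the same HTML file as before. Can you tell me why? Python does not show any error, and I cannot work out where the mistake in my code is. import os from bs4 import BeautifulSoup import glob import errno dir_path = '/directory/path/to

23 views, asked 2019-05-04, 1 vote

Answer accepted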
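The asker's code is truncated, so the actual bug is not visible. A common cause of this symptom is building the soup inside a loop but using it only after the loop ends, which leaves just the last file. The sketch below does the per-file work inside the loop; the function name `titles_in_dir` and the demo files are hypothetical.

```python
import tempfile
from pathlib import Path

from bs4 import BeautifulSoup


def titles_in_dir(dir_path):
    """Return {file name: <title> text} for every .html file in dir_path."""
    titles = {}
    for path in sorted(Path(dir_path).glob("*.html")):
        with open(path, encoding="utf-8") as handle:
            soup = BeautifulSoup(handle, "html.parser")
        # Use each soup here, inside the loop, before it is overwritten.
        titles[path.name] = str(soup.title.string) if soup.title else None
    return titles


# Demo with two throwaway files (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a", "b"):
        Path(tmp, f"{name}.html").write_text(
            f"<html><head><title>Page {name}</title></head></html>",
            encoding="utf-8",
        )
    demo_titles = titles_in_dir(tmp)

print(demo_titles)
```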

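3 answers

What is the difference between BeautifulSoup and bs4?

python, xml, python-3.x, beautifulsoup, bs4

I am new to Python and am trying to parse some XML files in order to add some new tags and store the new XML files. python-beautifulsoup looks like a suitable package. Searching the web for tutorials on how to add new tags to XML parsed by BeautifulSoup, I found that the python-bs4 package is used. Looking at the package descriptions, both packages have the same title: python-bs4 - error-tolerant HTML parser for Python python-beautifulsoup - error-tolerant HTML parser for Python So my question is: what is the difference?

2 views, asked 2015-03-27, 25 votes

Answer accepted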

1 answer

Why does BeautifulSoup .children contain unnamed elements as well as the expected tags?

python, html-parsing, beautifulsoup

Code: #!/usr/bin/env python3 from bs4 import BeautifulSoup test="""<!DOCTYPE html> <html> <head> <meta content="text/html; charset=UTF-8" http-equiv="Content-Type"/> <title>Test</title> </head> <body> <table> <tbody

0 views, asked 2013-08-17, 2 votes

Answer accepted
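The "unnamed" children are typically the whitespace between tags, which Beautiful Soup keeps as NavigableString nodes whose `.name` is None. A minimal sketch with hypothetical markup (not the asker's document) shows the effect and two ways to keep only tags:

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical markup; the newlines between tags become text nodes.
html = "<ul>\n<li>one</li>\n<li>two</li>\n</ul>"
soup = BeautifulSoup(html, "html.parser")
ul = soup.ul

# .children yields every direct child, including whitespace strings.
all_children = list(ul.children)
child_names = [child.name for child in all_children]

# Option 1: filter on the node type.
tag_children = [child for child in ul.children if isinstance(child, Tag)]

# Option 2: find_all(True, recursive=False) returns direct child tags only.
direct_tags = ul.find_all(True, recursive=False)

print(child_names)
print([tag.name for tag in tag_children])
```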

1 answer

ModuleNotFoundError: No module named 'bs4', even after installing and reinstalling

python, python-3.x, beautifulsoup

I am running my Python file (py name.py). from bs4 import BeautifulSoup as BS File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\bs4\__init__.py", line 29, in <module> from .builder import builder_registry File "C:\Users\Administrator\AppData\Local

0 views, asked 2019-04-28, 1 vote

Answer accepted

2 answers

Downloading images with BeautifulSoup when the full image link is not shown unless hovering over the src tag

python, web-scraping, beautifulsoup

I am trying to download images from a page. I wrote the following Python script: import requests import subprocess from bs4 import BeautifulSoup request = requests.get("http://ottofrello.dk/malerierstor.htm") content = request.content soup = BeautifulSoup(content, "html.parser") element = soup.find_all("img") for img in eleme

2 views, asked 2018-05-20, 1 vote

Answer accepted

1 answer

A jQuery script chokes BeautifulSoup; is there a known workaround?

html, xml, screen-scraping, html-parsing, beautifulsoup

I feed BeautifulSoup an HTML document by constructing a BeautifulSoup object instance with the full HTML, and it seems to choke on the following line of a jQuery script embedded in the HTML: var txt = "Logged in as: <a href=\"http://somedomain.com/the-blah/\">" + uname + "</a> <small>(<a href=\"http://somedomain.com/the-blah/\">The Blah<

0 views, asked 2010-11-15, 0 votes

Answer accepted

4 answers

Extract the title with BeautifulSoup

python-3.x, beautifulsoup

I have this: from urllib import request url = "http://www.bbc.co.uk/news/election-us-2016-35791008" html = request.urlopen(url).read().decode('utf8') html[:60] from bs4 import BeautifulSoup raw = BeautifulSoup(html, 'html.parser').get_text() raw.find_all('title', limit=1) pr

5 views, asked 2016-03-12, 20 votes

Answer accepted
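In the snippet above, `get_text()` returns a plain string, so `find_all` is no longer available on `raw`. A minimal sketch, using a hypothetical inline document instead of the BBC page, keeps the soup object and reads the title from it:

```python
from bs4 import BeautifulSoup

# Hypothetical document standing in for the downloaded page.
html = (
    "<html><head><title>Sample headline</title></head>"
    "<body><p>Body text</p></body></html>"
)

soup = BeautifulSoup(html, "html.parser")

# Query the soup object, not the result of get_text().
title_tag = soup.find("title")
title_text = soup.title.string

# get_text() flattens the whole tree into one str, which has no find_all.
raw = soup.get_text()

print(title_text)
```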

1 answer

How to get Python libraries to work with Spark on YARN

python, apache-spark, pyspark

If I want to use a Python library such as NLTK or BeautifulSoup for a specific task, I can do it with Spark on my local machine, but the same thing does not work with Spark on YARN. Here is sample code: from pyspark.sql.functions import udf from pyspark.sql.types import StringType def html_parsing(x): """ Cleans the text from Data Frame text column""" textcleaned

8 views, asked 2017-02-22, 1 vote

1 answer

Scrape text from a URL and display it on a Pi

python, raspberry-pi

I am fetching text from a web server and trying to display the current song on a Raspberry Pi screen with Python, using a 16x2 LCD. #!/usr/bin/python # Example using a character LCD connected to a Raspberry Pi or BeagleBone Black. import math import time import urllib2 from BeautifulSoup import BeautifulSoup import Adafruit_CharLCD as LCD page = urllib2.urlopen(&#

5 views, asked 2014-07-24, 0 votes

Answer accepted

1 answer

Getting None when web scraping with Beautiful Soup

python, html, web-scraping, beautifulsoup

What I want: to scrape the committing users with Beautiful Soup in Python. My problem: I get None as the result of my script. My code: from bs4 import BeautifulSoup import requests html = requests.get('https://github.com/pnp/cli-microsoft365').text soup = BeautifulSoup(html, 'html.parser') commits = soup.select_one('svg.octicon.octicon-history + span stron

1 view, asked 2021-10-16, 1 vote

Answer accepted

19 answers

How to find elements by class

python, html, web-scraping, beautifulsoup

I am having trouble parsing HTML elements with the "class" attribute using BeautifulSoup. The code looks like this: soup = BeautifulSoup(sdata) mydivs = soup.findAll('div') for div in mydivs: if (div["class"] == "stylelistrow"): print div After the script finishes, I get an error on the same line. File "./beautifulcoding.py", line 130, in getlanguage

6 views, asked 2011-02-18, 532 votes

Answer accepted
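The traceback is truncated, but one likely failure is that `div["class"]` raises KeyError for a div with no class attribute. A minimal sketch in bs4 and Python 3, with hypothetical markup standing in for `sdata`, lets the library do the class filtering instead:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the asker's `sdata`.
sdata = """
<div class="stylelistrow">first</div>
<div>no class attribute</div>
<div class="stylelistrow highlighted">second</div>
"""

soup = BeautifulSoup(sdata, "html.parser")

# class_ matches any tag whose class list contains the given name.
rows = soup.find_all("div", class_="stylelistrow")

# The same query written as a CSS selector.
rows_css = soup.select("div.stylelistrow")

# .get("class") returns None for a missing attribute instead of raising.
classes = [div.get("class") for div in soup.find_all("div")]

print([row.get_text() for row in rows])
```

The keyword is spelled `class_` because `class` is a reserved word in Python.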

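1 answer

Trying to use XPath for web scraping in code that uses BeautifulSoup

python, python-2.7, xpath, web-scraping, beautifulsoup

This is a question about web scraping. I am able to scrape sites with BeautifulSoup, but I would like to use XPaths because Chrome's "Copy XPath" feature makes it very easy. My understanding is that XPath is easier because, to use BeautifulSoup, we need to work through the HTML by hand. For example, below I get the title, but I had to build the "find" part manually. With XPath, my understanding is that I could just "Copy XPath" from the Chrome 'Inspect' window. import requests from bs4 import BeautifulSoup url = "http://www.indeed.com/jobs?q=hardwar

2 views, asked 2016-01-04, 1 vote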

4 answers

BeautifulSoup HTMLParseError

python, web-scraping, beautifulsoup

New to Python, with a simple scenario: trying to parse a series of pages with BeautifulSoup. from bs4 import BeautifulSoup import urllib.request BeautifulSoup(urllib.request.urlopen('http://bit.ly/')) Traceback... html.parser.HTMLParseError: expected name token at '<!=KN\x01... Working on 64-bit Windows 7 with Python 3.2. Do I need mechanize? (That would require Python 2.X)

0 views, asked 2012-03-23, 5 votes

Answer accepted

1 answer

How do I use BeautifulSoup to match text in a <div></div> that has an embedded <a></a>?

python, html, beautifulsoup

I have the following BeautifulSoup code in test.py. #!/usr/bin/env python # vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1: from bs4 import BeautifulSoup import sys soup = BeautifulSoup(sys.stdin.read(), 'html.parser', from_encoding='utf-8') import re from pprint import pprint pprint(soup.f

1 view, asked 2016-01-03, 1 vote

3 answers

find() does not work after replaceWith() (using BeautifulSoup)

python, find, beautifulsoup

Consider the following Python session: >>> from BeautifulSoup import BeautifulSoup >>> s = BeautifulSoup("<p>This <i>is</i> a <i>test</i>.</p>"); myi = s.find("i") >>> myi.replaceWith(BeautifulSoup("was")) >>> s.find("i"

0 views, asked 2013-03-17, 6 votes

Answer accepted

1 answer

Python: retrying after an exception leaves the line that caused the exception

python, beautifulsoup

I am new to Python. I am using the BeautifulSoup Python module. I have to find and get the text of any id (such as MathJax-Element-1, MathJax-Element-2, MathJax-Element-3, MathJax-Element-4, ...) if it exists. My code is: from bs4 import BeautifulSoup soup = BeautifulSoup(html_doc, 'html.parser') attempts = 0 a=-1 while attempts < 100: try: a+=1

5 views, asked 2015-09-16, 1 vote

Answer accepted

1 answer

<script> tag and HTMLParseError

python, beautifulsoup

I am trying to parse HTML with BeautifulSoup and get a strange error. Below is minimal code that reproduces the problem (Windows 7 32-bit, ActivePython 2.7). from bs4 import BeautifulSoup s = """ <html> <script> var pstr = "<li><font color='blue'>1</font></li>"; for(var lc=0;lc<o.length;lc++){} </script

4 views, asked 2012-05-05, 1 vote

Answer accepted

1 answer

Using BeautifulSoup 4 (lxml parser), how do I extract the inner HTML from a tag (decode_contents does not work)?

python, python-3.x, beautifulsoup, innerhtml

I am using BeautifulSoup 4 and Python 3.7. I want to extract the inner HTML from the article I found. I have this: soup = BeautifulSoup(html, features="lxml") ... article_elt = top_article_elt.select('div[class*="outer"]')[0] article = article_elt.decode_contents() ... print("article: " + str(article) + " score:" + str(

6 views, asked 2019-12-08, 0 votes

Answer accepted

1 answer

ImportError: cannot import name 'HTMLAwareEntitySubstitution'

beautifulsoup, python-3.5

I just installed beautifulsoup4-4.1.0 and upgraded pip to version 9.0.1. When I write this: from bs4 import BeautifulSoup the error occurs: Traceback (most recent call last): File "<pyshell#4>", line 1, in <module> from bs4 import BeautifulSoup File "D:\Program Files (x86)\Python35-32\lib\site-packages\bs4\__init__.py", l

6 views, asked 2016-11-29, 7 votes

Answer accepted

1 answer

BeautifulSoup, TypeError: 'NoneType' object is not callable. From: Web Scraping with Python

python, beautifulsoup, nonetype

This exercise comes from the book Web Scraping with Python by Ryan Mitchell, page 23 of the Chinese edition. I could not find anyone else with a similar problem. Who can tell me how to fix it? Thank you in advance. I posted a picture. The code is as follows: from urllib.request import urlopen from bs4 import BeautifulSoup import re html = urlopen("http://www.pythonscraping.com/pages/page3.html") bsObj = BeautifulSoup(html,"h

2 views, asked 2016-08-08, 0 votes

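3 answers

Installing BeautifulSoup without easy_install, or an alternative

python, beautifulsoup, windows-7-x64

I want to write a program in Python to scrape a website. Since there is no built-in way to do this, I decided to try the BeautifulSoup module. Unfortunately, I ran into problems using pip and ez_install, because I am using 64-bit Windows 7 and Python 3.3. Is there a way to get the BeautifulSoup module for my Python 3.3 installation on 64-bit Windows 7 without ez_install or easy_install, since I have had too many problems with them, or is there an alternative module that is easy to install?

2 views, asked 2013-09-09, 0 votes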

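3 answers

HTML parsing in Python using Gecko/Firefox or WebKit

python, html, parsing

I use BeautifulSoup and urllib2 to download and parse HTML pages. The problem is malformed HTML pages. Although BeautifulSoup is good at handling malformed HTML, it is still not as good as Firefox. Given that Firefox and WebKit are more up to date and more resilient at handling HTML, I think it would be ideal to use them to construct and normalize the page's DOM tree and then manipulate it from Python. However, I cannot find any Python bindings for this. Can anyone give me a suggestion? I have come across some solutions that run a headless Firefox process and drive it from Python, but is there a more Pythonic solution?

1 view, asked 2009-04-22, 6 votes

Answer accepted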

1 answer

ModuleNotFoundError: No module named 'bs4', even though I have installed BeautifulSoup4 and pip3 correctly (Windows)

python, python-3.x, beautifulsoup, pip, windows-10

Python version: 3.9.5 pip version: 21.1.1 BeautifulSoup4 version: 4.9.3 from bs4 import BeautifulSoup with open('home.html', 'r') as html_file: content = html_file.read() print(content) I have been trying to use the BeautifulSoup4 library, but it just does not work. In VS Code, when I Ctrl+click on bs4 in my code, it shows that bs4 is there. But it still gives ModuleNotFoundError:

197 views, asked 2021-06-24, 0 votes

Answer accepted

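1 answer

Insert HTML into an HTML file using BeautifulSoup

python, html, beautifulsoup

I have a file that contains links to different pages. I want to insert them into an HTML file under the div with id="links". To be clear, the div already exists, so I do not want to create a new tag anywhere. My HTML and Python attempt are shown below: <html> <head> </head> <body> <div id="links"> </div> </body> </html> from bs4 import BeautifulSoup soup = BeautifulSoup(ope

13 views, asked 2020-04-14, 1 vote

Answer accepted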
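A minimal sketch of one way to do this: find the existing div by id and append a new `<a>` tag per link. The URLs are hypothetical, and the page is an inline string rather than the asker's file, whose name is cut off above.

```python
from bs4 import BeautifulSoup

# Inline stand-in for the asker's HTML file.
page = '<html><head></head><body><div id="links"></div></body></html>'

# Hypothetical links; in the question they come from a separate file.
urls = ["https://example.com/a", "https://example.com/b"]

soup = BeautifulSoup(page, "html.parser")

# Reuse the existing container rather than creating a new div.
container = soup.find("div", id="links")

for url in urls:
    anchor = soup.new_tag("a", href=url)
    anchor.string = url
    container.append(anchor)

result = str(soup)
print(result)
```

Writing `result` back to disk with `open(path, "w")` would then update the file in place.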

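3 answers

Unable to collect an attribute from a span element using BeautifulSoup

python, html, beautifulsoup

This is an image of the source code that I want to parse from the site below () using BeautifulSoup. I want to extract the attribute in the <span class='print'> element: the htm link. My Python code looks like this: import urllib.request try:

10 views, asked 2017-08-01, 0 votes

Answer accepted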

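1 answer

Why doesn't iterating over a bs4.element.ResultSet copy the original?

python-3.x, beautifulsoup, iteration

I am a bit confused by the iteration behavior of Beautiful Soup ResultSets. In general, in Python, I expect iteration to yield a copy of each element; a list cannot be modified by assigning a new value to the iteration variable. l1 = [1,2,3] for elem in l1: elem = elem + 10 does not modify the original list l1. But if I do from bs4 import BeautifulSoup soup = BeautifulSoup(html_doc, 'html.parser') for elem in soup('body'): elem.unwrap() then the original "soup" element is modified! This

1 view, asked 2016-05-24, 1 vote

Answer accepted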
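Python iteration never copies elements; the loop variable is just another name for the same object. Rebinding that name leaves the list alone, while calling a mutating method changes the shared object. A minimal sketch with hypothetical data shows that plain lists and soup tags behave the same way:

```python
from bs4 import BeautifulSoup

# Rebinding: `n = n + 10` points the name n at a new int; the list is untouched.
numbers = [1, 2, 3]
for n in numbers:
    n = n + 10

# Mutation: each `row` is the same list object that `rows` holds.
rows = [[1], [2]]
for row in rows:
    row.append(0)

# Tags work the same way: unwrap() mutates the tree the tag belongs to.
soup = BeautifulSoup("<body><p>hello</p></body>", "html.parser")
for elem in soup("body"):
    elem.unwrap()

print(numbers, rows, str(soup))
```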

2 answers

"class" attribute returns a list while other attributes return a value

python, beautifulsoup, html-parsing

I am fairly new to HTML parsing in Python, and the result of the code below confuses me. from bs4 import BeautifulSoup tr =""" <table> <tr class="passed" id="row1"><td>t1</td></tr> <tr class="failed" id="row2"><td>t2</td></tr> </table> "&#

5 views, asked 2016-07-26, 1 vote

Answer accepted
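Beautiful Soup treats `class` as a multi-valued attribute, because an HTML element can carry several classes, so it is returned as a list even when there is only one value. A minimal sketch based on the table from the question:

```python
from bs4 import BeautifulSoup

tr = """
<table>
<tr class="passed" id="row1"><td>t1</td></tr>
<tr class="failed" id="row2"><td>t2</td></tr>
</table>
"""

soup = BeautifulSoup(tr, "html.parser")
row = soup.find("tr")

row_class = row["class"]  # multi-valued attribute: a list of strings
row_id = row["id"]        # single-valued attribute: a plain string

# Join the list when a single string is needed.
class_text = " ".join(row["class"])

print(row_class, row_id, class_text)
```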

1 answer

How do I get the next div when I use BeautifulSoup .findAll?

python, beautifulsoup

I am using BeautifulSoup with Python 2.7, and I have code like this: html = "<div>\ <div>\ <div>\ <div>one</div>\ <div>\ <div>two</div>\ <div>three</div>\ <div>four</div>\ <

5 views, asked 2016-01-28, 3 votes

1 answer

How do I get data from a specific cell in an HTML table in Python?

python, html, parsing, beautifulsoup

I am trying to use BeautifulSoup in Python. I am very new to BeautifulSoup and HTML. Here is my attempt at solving the problem. soup = BeautifulSoup(open('BBS_student_grads.php')) data = [] table = soup.find('table') rows = table.find_all('tr') #array of rows in table for x,row in enumerate(rows[1:]):# skips first row cols = row.f

5 views, asked 2015-03-07, 1 vote

Answer accepted
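A minimal sketch of the row-and-column approach the asker starts on, using a hypothetical inline table instead of BBS_student_grads.php: collect each data row as a list of cell texts, then index into the result.

```python
from bs4 import BeautifulSoup

# Hypothetical table standing in for the asker's page.
html = """
<table>
<tr><th>Name</th><th>Year</th></tr>
<tr><td>Alice</td><td>2014</td></tr>
<tr><td>Bob</td><td>2015</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table")

data = []
for row in table.find_all("tr")[1:]:  # skip the header row
    cols = [td.get_text(strip=True) for td in row.find_all("td")]
    data.append(cols)

# data[row_index][column_index] addresses one specific cell.
cell = data[1][0]
print(cell)
```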

1 answer

Modify BeautifulSoup .string with line breaks

python, html, beautifulsoup

I am trying to change the content of an HTML file with BeautifulSoup. The content will come from Python-based text, so it will have \n newlines. newContent = """This is my content \n with a line break.""" newContent = newContent.replace("\n", "<br>") htmlFile.find_all("div", "product").p.string = newContent When I do this, the HTML file <

9 views, asked 2015-02-07, 2 votes

Answer accepted
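The question is truncated, but assigning text that contains `<br>` to `.string` stores it as text, so the angle brackets are escaped on output. A minimal sketch of an alternative, assuming a hypothetical one-product document: split on newlines and append real `<br>` tags between the pieces.

```python
from bs4 import BeautifulSoup

# Hypothetical document with a single product block.
html = '<div class="product"><p>old</p></div>'
new_content = "This is my content\nwith a line break."

soup = BeautifulSoup(html, "html.parser")

# find() returns one tag; find_all() returns a list, which has no .p attribute.
p = soup.find("div", class_="product").p

p.clear()
for index, line in enumerate(new_content.split("\n")):
    if index:
        p.append(soup.new_tag("br"))  # a real tag, not escaped text
    p.append(line)

result = str(p)
print(result)
```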

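2 answers

Removing span tags in Python

python, python-3.x

I am a newbie and am having trouble removing span tags after scraping HTML from a page with BeautifulSoup. I tried using "del links['span']", but it returned the same results. A few attempts with getText() also failed. Clearly I am doing something wrong that should be easy. Help? from bs4 import BeautifulSoup import urllib.request import re url = urllib.request.urlopen("http://www.python.org") content = url.read() soup = BeautifulSoup(content) for l

2 views, asked 2013-06-12, 1 vote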

3 answers

bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: html-parser. Do you need to install a parser library?

python, python-3.x, beautifulsoup, html-parser, html-treebuilder

I am trying to scrape the web with the following code: from bs4 import BeautifulSoup import requests import pandas as pd page = requests.get('https://www.google.com/search?q=phagwara+weather') soup = BeautifulSoup(page.content, 'html-parser') day = soup.find(id='wob_wc') print(day.find_all('span')) But I often

6 views, asked 2020-03-29, 0 votes

Answer accepted
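The built-in parser is named `html.parser`, with a dot; `html-parser` with a hyphen is not a registered builder, which matches the error in the title. A minimal sketch with hypothetical markup reproduces the failure and the fix without any network access:

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical markup; the asker's page comes from a live request.
html = "<span id='wob_tm'>21</span>"

# The hyphenated name is not a known parser, so bs4 raises FeatureNotFound.
try:
    BeautifulSoup(html, "html-parser")
    raised = False
except FeatureNotFound:
    raised = True

# The standard-library parser is spelled with a dot.
soup = BeautifulSoup(html, "html.parser")

print(raised, soup.span.get_text())
```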

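1 answer

bs4: the second <!-- --> comment is missing

python, web-scraping, beautifulsoup

I am using BeautifulSoup for Python Challenge level 9. url = "". bs4 version == '4.3.2'. The page source contains two comments. The soup output should be as follows. However, when applying BeautifulSoup, the second comment is missing. It seems a bit strange. Any hints? Thanks! import requests from bs4 import BeautifulSoup url = "http://www.pythonchallenge.com/pc/return/good.html" page = requests.get(url, auth = ("huge",

1 view, asked 2014-10-25, 0 votes

Answer accepted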

1 answer

BeautifulSoup4 does not recognize CSS selectors

python, beautifulsoup

I am using Python 3.6.9 and have the following packages installed in my environment: beautifulsoup4 (4.10.0) certifi (2021.10.8) charset-normalizer (2.0.9) idna (3.3) pip (9.0.1) pkg-resources (0.0.0) requests (2.26.0) setuptools (39.0.1) soupsieve (2.3.1) urllib3 (1.26.7) I am trying to write a simple web application that fetches data from a website: import requests import datetime from bs4 im

1 view, asked 2022-01-02, 0 votes

Answer accepted

1 answer

How to get all li tags inside a div tag

python, web-scraping, beautifulsoup

I am scraping a website for company and product details. It has a div tag containing li tags, and I want to get all the li tags inside the div tag. I am using Python 3.5.1 and BeautifulSoup. My code: from bs4 import BeautifulSoup import urllib.request import re r = urllib.request.urlopen('http://i.cantonfair.org.cn/en/ExpExhibitorList.aspx?k=glassware') soup = BeautifulSoup(r, "html.parser"

4 views, asked 2016-02-26, 0 votes

Answer accepted
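A minimal sketch of two equivalent ways to collect the li tags nested in a div: search from the div itself, or use a descendant CSS selector. The markup and the id `exhibitors` are hypothetical, not taken from the Canton Fair page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; one list sits inside the div and one outside it.
html = """
<div id="exhibitors"><ul><li>Company A</li><li>Company B</li></ul></div>
<ul><li>outside</li></ul>
"""

soup = BeautifulSoup(html, "html.parser")

# Option 1: locate the div, then search only within it.
div = soup.find("div", id="exhibitors")
items = [li.get_text(strip=True) for li in div.find_all("li")]

# Option 2: a descendant selector does the same in one call.
items_css = [li.get_text(strip=True) for li in soup.select("div#exhibitors li")]

print(items)
```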

1 answer

BeautifulSoup prints None even though the content exists

python-3.x, web, web-scraping, beautifulsoup

I am trying to build a Hacker News scraper, but when I run my code import requests from bs4 import BeautifulSoup res = requests.get("https://news.ycombinator.com/") soup = BeautifulSoup(res.text,'html.parser') print(soup.find(id="score_23174015")) I do not understand why Beautiful Soup keeps returning None. I am still learning, so I am also new to Python 3.

25 views, asked 2020-05-14, 0 votes

Answer accepted

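2 answers

soup.find with different classes

python, beautifulsoup

I have a question about Python. I just want to scrape a page using different attribute classes and loop over them, so this is the HTML I need: '': "class: a" 'div': "class: b" 'h1': "class: c" Only one of them is present on a page, so I tried using "else if" and "try" statements, but I still cannot figure it out. This code works for only one class: #!/usr/bin/env python import csv import requests from bs4 import BeautifulSoup urls = csv.reader(open('link.csv')) for

2 views, asked 2017-01-29, 1 vote

Answer accepted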
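Since `find` returns None rather than raising when nothing matches, a plain loop over candidate (tag, class) pairs can replace the chain of "else if" and "try" statements. In this sketch the first pair's tag name is a guess, because it is missing from the question, and the helper name `first_match` and sample pages are hypothetical.

```python
from bs4 import BeautifulSoup

# Candidate (tag name, class) pairs, tried in order.
# "div" for class "a" is an assumption; the question omits that tag name.
CANDIDATES = [("div", "a"), ("div", "b"), ("h1", "c")]


def first_match(soup):
    """Return the first tag matching any candidate pair, or None."""
    for name, css_class in CANDIDATES:
        tag = soup.find(name, class_=css_class)
        if tag is not None:
            return tag
    return None


# Hypothetical pages, each containing at most one of the candidates.
page_b = BeautifulSoup('<div class="b">from b</div>', "html.parser")
page_c = BeautifulSoup('<h1 class="c">from c</h1>', "html.parser")
page_none = BeautifulSoup("<p>nothing here</p>", "html.parser")

print(first_match(page_b).get_text(), first_match(page_c).get_text())
```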