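Extracting the text in a div tag itself with BeautifulSoup_Extracting text from a div tag with BeautifulSoup 4 in Python_BeautifulSoup: finding a div tag by the text inside it - Tencent Cloud Developer Community

python, python-3.x, web-scraping, beautifulsoup

I want to extract data from a tag so that I can simply retrieve the text. Unfortunately, I can't extract just the text; I always get the link along with it. Is it possible to remove all <img> and <a href> tags from my text? <div class="xxx" data-handler="xxx">its a good day <a class="link" href="https://" title="text">https:// link</a></div> I only want to get this: its a good day, ignoring &

Views: 16 | Asked: 2022-11-27 | Votes: 0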
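One way to approach this question is to drop the unwanted child tags and then read what is left. The following is a minimal sketch rather than the accepted answer: it reuses the HTML from the question and assumes the built-in `html.parser` backend.

```python
from bs4 import BeautifulSoup

# HTML taken from the question above.
html = ('<div class="xxx" data-handler="xxx">its a good day '
        '<a class="link" href="https://" title="text">https:// link</a></div>')
soup = BeautifulSoup(html, "html.parser")

div = soup.find("div", class_="xxx")

# Remove every <a> and <img> inside the div (this modifies the parsed tree).
for unwanted in div.find_all(["a", "img"]):
    unwanted.decompose()

# What remains is the div's own text; strip=True trims surrounding whitespace.
text = div.get_text(strip=True)
print(text)
```

Note that `decompose()` changes the tree in place, so parse the document again if the links are needed later.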

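2 answers

Unable to extract text from an HTML page in Python

python, beautifulsoup, html-parsing

I'm very new to web scraping. I read about BeautifulSoup and tried to use it, but I can't extract the text with the given class name "company-desc-and-sort-container". I can't even extract the title from the HTML page. Here is the code I tried: from BeautifulSoup import BeautifulSoup import requests url= 'http://fortune.com/best-companies/' r = requests.get(url) soup = BeautifulSoup(r.text) #print soup.prettify()[0:10

Views: 5 | Asked: 2016-12-20 | Votes: 1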

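Answer accepted

1 answer

Extracting text from multiple tags (such as h1 and p tags with classes) using BeautifulSoup and Python

python, beautifulsoup

I've figured out how to extract text from an itemprop, but I can't extract the text from <div clas="someclass">Extract This Text Here!</div>. I've only pasted the part of my code that doesn't work, but I can paste the whole thing if needed. I've set up a variable with BeautifulSoup and Python to fetch the page, but it won't grab just the text. Edit: some of the text is wrapped in an h1 tag, and some is wrapped in a p tag with multiple spans. Edit 2: so some of the data is in <div class=“someclass”><h1>There’s th

Views: 1 | Asked: 2018-08-22 | Votes: 0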

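Answer accepted

1 answer

Web scraping can't find the right tags

pandas, beautifulsoup

I'm trying to extract the text of this page using bs4 and pandas. I started with the following: src=requests.get(url).content soup = BeautifulSoup(src,'xml') and saw that the text I'm interested in is wrapped in p tags, but when I run soup.find_all('p'), the only thing returned is the closing paragraph. How do I extract the paragraph text inside? What am I missing? Here is the paragraph I'm trying to extract: I also tried using selenium: chrome_options = webdriver.ChromeOptions() chrome_options.add

Views: 4 | Asked: 2021-02-17 | Votes: 2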

1 answer

Parsing an HTML file with BeautifulSoup

python, html, parsing, beautifulsoup

I'm trying to parse a local HTML file with BeautifulSoup, but I'm having trouble navigating the tree. The file has the following format: <div class="contents"> <h1> USERNAME </h1> <div> <div class="thread"> N1, N2 <div class="message"> <div class="message_header"> <span cl

Views: 2 | Asked: 2015-07-31 | Votes: 0

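1 answer

BeautifulSoup: how to extract text after a <br> tag

python, html, beautifulsoup

I don't know how to reach the paragraph below with BeautifulSoup or how to extract the specific text I want, since I'm new to Python and BS4. My HTML is as follows: <div class="inner-content"> <div class="bred"></div> <div class="clrbth"></div> <h1></h1> <h4></h4> ... ... ... <p></p>

Views: 5 | Asked: 2015-08-12 | Votes: 4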

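Answer accepted

3 answers

I'm trying to extract the text in a span id, but I get blank output with Python Beautiful Soup.

python-3.x, beautifulsoup

I'm trying to extract the text inside a span tag with an id, but I get a blank output screen. I also tried using the text of the parent div element, but couldn't extract it. Can anyone please help? Here is my code. import requests from bs4 import BeautifulSoup r = requests.get('https://www.paperplatemakingmachines.com/') soup = BeautifulSoup(r.text,'lxml') mob = soup.find('span',{"id":"tollfree"})

Views: 1 | Asked: 2019-04-18 | Votes: 0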

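Answer accepted

2 answers

Extracting heading and strong tags with Beautiful Soup

python, html, web-scraping, beautifulsoup

I want to extract the text string from a heading inside a div, and extract the text from the <strong> tag, using BeautifulSoup. I can get the heading with soup.h1, but I want specifically the h1 located in the div <div class="site-content">. HTML: <div class="site-content"><h1>Here is the title<strong>( And a bit more! )</strong></h1></div> So I want Here is the tit

Views: 2 | Asked: 2014-01-29 | Votes: 1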
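A possible sketch for this question: first narrow the search to the `site-content` div, then read the h1's own text and the nested `<strong>` text separately. The HTML is the one quoted in the question; the variable names `title` and `extra` and the choice of `html.parser` are assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# HTML taken from the question above.
html = ('<div class="site-content"><h1>Here is the title'
        '<strong>( And a bit more! )</strong></h1></div>')
soup = BeautifulSoup(html, "html.parser")

# Scope the lookup to the div first, then take the h1 inside it.
h1 = soup.find("div", class_="site-content").find("h1")

# Text that sits directly in the h1, ignoring nested tags such as <strong>.
title = str(h1.find(string=True, recursive=False))

# Text of the nested <strong> tag.
extra = h1.strong.get_text()

print(title)
print(extra)
```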

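1 answer

How to extract a div from HTML with BeautifulSoup

python, python-3.x, string, web-scraping, beautifulsoup

I'm trying to use Beautiful Soup to extract the "meaning" section of a dictionary entry from an HTML file, but it's giving me some trouble. Here is a summary of what I've tried so far: I right-clicked the dictionary entry page below and saved the web page to my directory as 'aufmachen.html'. In the page's source code, the part I'm trying to extract starts at line 1042. I wrote the expressions below, but neither tags nor Bedeutungen contains any search results. import requests import pandas as pd import urllib.request from bs4 import BeautifulSoup with open("aufmachen.htm

Views: 2 | Asked: 2022-06-24 | Votes: 0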

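Answer accepted

4 answers

Extracting the src attribute from an img tag with BeautifulSoup

python, regex, beautifulsoup

<div class="someClass"> <a href="href"> <img alt="some" src="some"/> </a> </div> I want to extract the source (i.e. src) attribute from the image (i.e. img) tag using BeautifulSoup. I'm using bs4, and I can't get the src with a.attrs['src'], although I can get the href. What should I do?

Views: 1 | Asked: 2017-05-15 | Votes: 50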
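The likely cause here is that `src` is an attribute of the `<img>` tag, not of the `<a>` tag that wraps it, so it has to be read from the `img` element. A minimal sketch using the HTML from the question, with `html.parser` assumed as the backend:

```python
from bs4 import BeautifulSoup

# HTML taken from the question above.
html = '<div class="someClass"> <a href="href"> <img alt="some" src="some"/> </a> </div>'
soup = BeautifulSoup(html, "html.parser")

div = soup.find("div", class_="someClass")

# href lives on the <a> tag ...
href = div.a["href"]

# ... while src lives on the nested <img> tag.
img = div.find("img")
src = img["src"]

print(href, src)
```

If the attribute may be missing, `img.get("src")` returns `None` instead of raising `KeyError`.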

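3 answers

How do I extract the text inside a tag along with its tags?

python, beautifulsoup

I want to parse an HTML page with beautifulsoup. I'd like to extract the text in a tag without removing the HTML tags. For example, sample input: <a class="fl" href="https://stackoverflow.com/questio..."> Angular2 <b>Router link not working</b> </a> Sample output: 'Angular2 <b>Router link not working</b>' I tried this: from bs4 import

Views: 10 | Asked: 2019-10-11 | Votes: 2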
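One option for keeping the inner markup is `Tag.decode_contents()`, which renders a tag's children as a string without the enclosing tag itself. The sketch below reuses the sample input from the question (including its truncated URL, left as-is) and assumes `html.parser`.

```python
from bs4 import BeautifulSoup

# Sample input from the question; the href is truncated in the source.
html = ('<a class="fl" href="https://stackoverflow.com/questio..."> '
        'Angular2 <b>Router link not working</b> </a>')
soup = BeautifulSoup(html, "html.parser")

link = soup.find("a", class_="fl")

# decode_contents() returns the inner HTML, so the <b> tag is preserved.
inner_html = link.decode_contents().strip()
print(inner_html)
```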

Answer accepted

2 answers

Extracting text within a CSS class

python, pandas, beautifulsoup

Trying to extract data from a web page into a table. For example: Block Number XXX Building Name YYY Street Name zzz Pin Code 123456789 I tried this code to get all of the company's details in table form. html_doc='https://s3.amazonaws.com/todel162/test.html' from urllib.request import urlopen from bs4 import BeautifulSoup soup = BeautifulSoup(urlopen(html_doc), 'html.parser

Views: 2 | Asked: 2018-04-01 | Votes: 1

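Answer accepted

2 answers

Python 3.5 BeautifulSoup4: getting text from a p inside a div

html, python-3.x, beautifulsoup, python-requests

I'm trying to extract all the text from the div class "caselawcontent searchable-content". This code only prints the HTML, not the text from the web page. What am I missing to get the text? The following links are in the "filteredcasesdoc.txt" file: import requests from bs4 import BeautifulSoup with open('filteredcasesdoc.txt', 'r') as openfile1: for line in openfile1: rulingpage = requests.get(line).

Views: 7 | Asked: 2017-05-16 | Votes: 6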

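Answer accepted

1 answer

How to exclude elements with BeautifulSoup (Python)

python, beautifulsoup

I'm trying to extract the article text from this article and exclude the legal container at the bottom. The text part seems simple, but I can't seem to get rid of the container. For ease of use, I've separated it out into a legal variable. My code so far: import requests from bs4 import BeautifulSoup base_url = 'https://www.vanityfair.com/style/society/2014/06/monica-lewinsky-humiliation-culture' r = requests.get(base_url) r_html = r.text soup = BeautifulSoup(

Views: 9 | Asked: 2017-10-12 | Votes: 3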

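1 answer

How do I extract a single text from an <a> tag with Beautiful Soup?

python, web-scraping, beautifulsoup

So there are 3 texts in one tag, but I only need to extract a single one. Below is the code I wrote: import requests from bs4 import BeautifulSoup source= requests.get('eg.com') soup =BeautifulSoup(source,'lxml') article= soup.find('div',class_='content') b = article.li.a.text It returns all the text inside the tag. Output: Apple 2 itea

Views: 0 | Asked: 2021-04-15 | Votes: 0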

Answer accepted

4 answers

Python BeautifulSoup: 'list_iterator' object is not subscriptable

python, beautifulsoup

I'm trying to extract the inner text from the HTML structure below: <div class="account-age"> <label></label> <div> <div> <span>Text to extract</span> </div> </div> </div> I have the following Beautiful Soup code to do it: from bs4 import BeautifulSoup as bs

Views: 25 | Asked: 2018-06-05 | Votes: 2

Answer accepted

1 answer

Python 3 web scraper can't extract the text from every <a> tag on a site

html, python-3.x, web-scraping, beautifulsoup

I'm trying to write a Python 3 web scraper that extracts the text inside the <a> tags on a site. I'm using the bs4 library and the following code: from bs4 import BeautifulSoup import requests req = requests.get(mainUrl).text soup = BeautifulSoup(req, 'html.parser') for div in soup.find_all('div', 'turbolink_scroller'): for a in div.find_all('a', href

Views: 3 | Asked: 2020-12-08 | Votes: 0

Answer accepted

4 answers

Python - printing items with an empty separator

python, beautifulsoup

Given this HTML: <div id="catwrap" class="categories"> <a href="http://blahblahblahscience.com/category/electronic/" style="background-color:#006666">Electronic</a> <a href="http://blahblahblahscience.com/category/track-reviews/" style=&

Views: 10 | Asked: 2017-02-25 | Votes: 0

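Answer accepted

1 answer

Transferring text from one HTML document to another with BeautifulSoup

python, html, beautifulsoup

I'm trying to extract category names and question/answer text from a web page and insert them into my own HTML document using Python. I've been able to extract the clue text with soup.find_all("td", class_="clue_text"), and in theory I know how to extract the other data, but I don't know how to insert that data into my own HTML document, especially given that BeautifulSoup outputs a list and my text is formatted differently from the source file. For example, I want the clue text to replace "Category 2 Question 5" in the following HTML: <table id="4_1" cellpadding="0"

Views: 1 | Asked: 2018-03-21 | Votes: 0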

2 answers

Extracting text from a td tag that contains br tags

python, python-3.x, beautifulsoup

I want to extract the text from a td tag that contains br tags. from bs4 import BeautifulSoup html = "<td class=\"text\">This is <br/>a breakline<br/><br/></td>" soup = BeautifulSoup(html, 'html.parser') print(soup.td.string) Actual output: None Expected output: This is a breakline

Views: 0 | Asked: 2018-03-14 | Votes: 1
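The `None` result is expected behavior: `.string` only returns a value when a tag has exactly one child, and this `<td>` contains two strings and three `<br/>` tags. A sketch of one alternative, using the HTML and parser from the question, is `get_text()` with a separator:

```python
from bs4 import BeautifulSoup

# HTML and parser taken from the question above.
html = "<td class=\"text\">This is <br/>a breakline<br/><br/></td>"
soup = BeautifulSoup(html, "html.parser")

# .string is None because the td has more than one child node.
print(soup.td.string)

# get_text() collects every string under the tag; the separator joins the
# pieces and strip=True trims each piece and drops empty ones.
text = soup.td.get_text(" ", strip=True)
print(text)
```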

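Answer accepted

3 answers

Finding all terminal nodes that contain text with BeautifulSoup4.

python, python-3.x, beautifulsoup

I'm new to Python and BeautifulSoup4. I'm trying to extract (only) the text content of all tags that are 'div', 'p', or 'li', taking only the text in the node itself and not in its children, hence the two options text=True, recursive=False. Here are my attempts: content = soup.find_all("b", "div", "p", text=True, recursive=False) and tags = ["div", "p", "li"] content = soup.

Views: 1 | Asked: 2019-01-19 | Votes: 5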

Answer accepted

1 answer

Extracting text with BeautifulSoup

python, beautifulsoup

I'm trying to extract text from an old web page and running into trouble. Inspecting the source of the page, the text begins: > </div></div><span class="displaytext"><b>PARTICIPANTS:</b><br>Former Secretary of State > Hillary Clinton (D) and<br>Businessman Donald Trump > (R)<p><b>MODERATOR:</b><br>C

Views: 2 | Asked: 2017-11-25 | Votes: 0

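Answer accepted

2 answers

How do I extract text from the following HTML code?

python, html, web-scraping, beautifulsoup

I'm doing web scraping for a DS project and using BeautifulSoup for it. But I can't extract the duration from the "tbody" tag in the "table" class. Here is the HTML code: <div class="table-responsive"> <table class="table"> <thead> <tr> <th>Start Date</th> <th>Dura

Views: 7 | Asked: 2020-05-26 | Votes: 0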

Answer accepted

1 answer

<> changed to &lt; &gt;, and find_all("a") can't extract links with Beautiful Soup in Python

python, beautifulsoup

I'm trying to use BeautifulSoup to extract some links. Here is the Python code I'm using. resp = urlopen("http://target-page.com").read().decode("utf-8") soup = BeautifulSoup(resp, "html.parser") all_links = soup.find_all("a") for link in all_links: print(link["href"]) Here is what I get with print("soup")

Views: 12 | Asked: 2017-03-05 | Votes: 1

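1 answer

How to capture inner text and inner tags in BeautifulSoup

python, html, beautifulsoup, screen-scraping

I'm parsing a document that is a list of div tags, but sometimes it also has plain text inline. I need to know how to extract the content from all of it. Say I have the following: <div> <div>1</div> <div>2</div> 3 <div>4</div> </div> I need to extract all of the text above so that it reads 1234. I have the code below, which gets all the div tags but doesn't get the standalone text. from ghost import Ghost from BeautifulSoup import BeautifulSoup def tagfilt

Views: 0 | Asked: 2014-02-28 | Votes: 1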
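For the sample in this question, calling `get_text()` on the outer div may be enough, since it walks every string under the tag, including text that is not wrapped in a child div. This sketch uses the modern `bs4` package rather than the legacy `BeautifulSoup` import shown in the question, and assumes `html.parser`.

```python
from bs4 import BeautifulSoup

# HTML taken from the question above.
html = "<div> <div>1</div> <div>2</div> 3 <div>4</div> </div>"
soup = BeautifulSoup(html, "html.parser")

outer = soup.div  # the first (outermost) div

# strip=True trims each string and skips whitespace-only ones; with the
# default empty separator the pieces are concatenated directly.
combined = outer.get_text(strip=True)
print(combined)
```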

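Answer accepted

4 answers

Extracting readable text from HTML with Python?

python, html, text-extraction

I know of tools like html2text, BeautifulSoup, and so on, but the problem is that they also extract the JavaScript and add it to the text, which makes it hard to separate them. htmlDom = BeautifulSoup(webPage) htmlDom.findAll(text=True) Or, from stripogram import html2text extract = html2text(webPage) Both of these extract all the JavaScript on the page, which is not what I want. I just want to extract the readable text that you could copy from a browser.

Views: 2 | Asked: 2010-07-04 | Votes: 4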
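A common way to keep JavaScript out of the result is to delete `<script>` and `<style>` elements before reading the text. The sketch below illustrates that idea with a small made-up page (the `html` string is an assumption, not taken from the question) and the modern `bs4` API.

```python
from bs4 import BeautifulSoup

# Hypothetical page used only for illustration.
html = """<html><head><title>Demo</title>
<style>p { color: red; }</style>
<script>var x = 1;</script></head>
<body><p>Visible text</p>
<script>console.log('hidden');</script></body></html>"""
soup = BeautifulSoup(html, "html.parser")

# Calling the soup like a function is shorthand for find_all();
# remove every script and style element from the tree.
for tag in soup(["script", "style"]):
    tag.decompose()

# Read only what is left inside <body>.
text = soup.body.get_text(" ", strip=True)
print(text)
```

This only removes markup-level noise; text that a browser hides through CSS would still be included.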

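Answer accepted

3 answers

Beautiful Soup: getting only the strings directly inside a tag

python, web-scraping, beautifulsoup

I have a BeautifulSoup in this format: <div class='text'> <h3> text </h3> <p> some more text </p> "text here <b> is </b> important" </div> How do I extract only the string "text here is important", omitting the h3 and p elements while keeping the bold tag's text in the output? Thanks a ton.

Views: 4 | Asked: 2021-03-14 | Votes: 2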
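One possible approach is to walk the div's direct children, skip the tags that should be left out, and keep everything else. In this sketch the HTML follows the question, except that the literal quotation marks around the text are dropped (an assumption about what they meant); the `SKIP` set is a hypothetical name introduced for illustration.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# HTML adapted from the question above (quotation marks omitted).
html = ("<div class='text'> <h3> text </h3> <p> some more text </p> "
        "text here <b> is </b> important </div>")
soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", class_="text")

SKIP = {"h3", "p"}  # child tags whose text should be left out

parts = []
for child in div.children:
    if isinstance(child, Tag):
        if child.name in SKIP:
            continue
        parts.append(child.get_text())  # e.g. the <b> tag's text
    else:
        parts.append(str(child))        # a bare string inside the div

# Collapse runs of whitespace into single spaces.
direct_text = " ".join("".join(parts).split())
print(direct_text)
```

Unlike `decompose()`, this leaves the parsed tree untouched.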

Answer accepted

1 answer

Scraping dynamic content created by JavaScript with Python

python, arrays, python-3.x, web-scraping, beautifulsoup

I want to scrape the content of a DIV created by a JavaScript function using a Python script. I tried BS4, but I can't get the dynamic data; it only shows the source code. Sample code: import requests from bs4 import BeautifulSoup URL = "https://rawgit.com/skysoft999/tableauJS/master/example.html" r = requests.get(URL) soup = BeautifulSoup(r.content, 'html5lib') for row in soup.findAll(

Views: 0 | Asked: 2018-04-20 | Votes: 3

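Answer accepted

1 answer

Extracting text between dynamic HTML tags with Python Beautiful Soup

python-3.x, beautifulsoup

I have a requirement to extract text between HTML tags. I use BeautifulSoup to extract the data and store the text in a variable for further processing. I later found that the text I need to extract comes in two different tags. Note, however, that I need to extract the text and store it in the same variable. My earlier code and sample HTML text are provided. Please help me get my final result, i.e. the expected output. Sample HTML text: <DIV CLASS="c0"><P CLASS="c1"><SPAN CLASS="c2">1 of 80 DOCUMENTS</SPAN></P> <D

Views: 0 | Asked: 2016-12-26 | Votes: 0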

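Answer accepted

3 answers

Extracting information from a website with BeautifulSoup and Python

python, web-scraping, beautifulsoup

I'm trying to extract information from a site. No matter how hard I try, I can't get the text in the three fields marked in the image (the green, blue, and red rectangles). With the following function, I thought I could get all the text on the page, but it didn't work: from bs4 import BeautifulSoup import requests def get_text_from_maagarim_page(url: str): html_text = requests.get(url).text soup = BeautifulSoup(html_text, "html.parser") res = soup.find_

Views: 6 | Asked: 2022-04-29 | Votes: 0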

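Answer accepted

3 answers

How to find the text of the first anchor tag with BeautifulSoup

python, beautifulsoup

I have an HTML structure like this <p class="title"> <a href="abc.com"> Story </a> <span class="domain"> <a href="xyz.com">comments</a> </span> </p> I want to extract the text of the first anchor tag, i.e. Story. Here is how I use BeautifulSoup to extract text from the anchor tag: soup = BeautifulSoup(htm

Views: 1 | Asked: 2016-04-28 | Votes: 1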
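Because `find()` returns the first match in document order, asking for `a` inside the `title` paragraph should yield the `Story` link rather than the `comments` link. A minimal sketch with the HTML from the question, assuming `html.parser`:

```python
from bs4 import BeautifulSoup

# HTML taken from the question above.
html = ('<p class="title"> <a href="abc.com"> Story </a> '
        '<span class="domain"> <a href="xyz.com">comments</a> </span> </p>')
soup = BeautifulSoup(html, "html.parser")

# find() stops at the first matching descendant.
first_link = soup.find("p", class_="title").find("a")
first_text = first_link.get_text(strip=True)
print(first_text)
```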

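3 answers

BeautifulSoup: how to use a loop and extract specific data?

python, beautifulsoup

The HTML code below comes from a movie review site. I want to extract the stars from the code below, namely John C. Reilly, Sarah Silverman, and Gal Gadot. How can I do this? Code: html_doc = """ <html> <head> </head> <body> <div class="credit_summary_item"> <h4 class="inline">Stars:</

Views: 21 | Asked: 2019-01-11 | Votes: 2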

Answer accepted

2 answers

BeautifulSoup: finding text in a table

python, html, beautifulsoup

Here is what my HTML looks like: <head> ... </head> <body> <div> <h2>Something really cool here<h2> <div class="section mylist"> <table id="list_1" class="table"> <thead> ... not important <

Views: 3 | Asked: 2017-08-23 | Votes: 1

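Answer accepted

1 answer

Extracting HTML tables and storing them in separate files

python, html, web-scraping, beautifulsoup

I wrote code to extract a subsection of a table, but I want to extract every table tag from the input and store each one in a separate HTML file. from bs4 import BeautifulSoup soup = BeautifulSoup(myInput) table = soup.find('table', {'class': '*'}) I expected the code to show all the tables contained in the input text, but it outputs an error because * is not defined. Edit: * means every table in the file, like *.txt

Views: 0 | Asked: 2019-07-21 | Votes: 0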

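Answer accepted

1 answer

How do I prevent BeautifulSoup (Python) from closing tags in bad HTML?

python, parsing, html-parsing, beautifulsoup

I automatically translate the content of HTML pages into different languages, so I have to extract all the text nodes from various HTML pages, which are sometimes badly written (and I can't edit that HTML). With BeautifulSoup I can easily extract the text and replace it with the translation, but when I display the HTML after these operations: html = BeautifulSoup(source_html), it is sometimes broken because BeautifulSoup automatically closes tags (for example, a table tag is closed in the wrong place). Is there a way to stop BeautifulSoup from closing these tags? For example, this is my input: html = "<table><tr><td&

Views: 8 | Asked: 2011-09-19 | Votes: 5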

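2 answers

BeautifulSoup: getting elements that have a specific attribute, regardless of its value

python, parsing, xpath, html-parsing, beautifulsoup

Suppose I have the following HTML: <div id='0'> stuff here </div> <div id='1'> stuff here </div> <div id='2'> stuff here </div> <div id='3'> stuff here </div> Is there a simple way to extract all divs that have the id attribute, regardless of its value, using BeautifulSoup? I realize that using Beautiful

Views: 5 | Asked: 2014-05-07 | Votes: 3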
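Beautiful Soup accepts `True` as an attribute filter, meaning "the attribute is present with any value". The sketch below applies that to the HTML from the question; the extra div without an `id` is an addition for illustration, as is the choice of `html.parser`.

```python
from bs4 import BeautifulSoup

# First four divs are from the question; the last one (no id) is added
# here only to show that it is filtered out.
html = """
<div id='0'> stuff here </div>
<div id='1'> stuff here </div>
<div id='2'> stuff here </div>
<div id='3'> stuff here </div>
<div> no id here </div>
"""
soup = BeautifulSoup(html, "html.parser")

# id=True matches any div that has an id attribute, whatever its value.
divs_with_id = soup.find_all("div", id=True)
ids = [d["id"] for d in divs_with_id]
print(ids)
```

The same filter works for other attributes through the `attrs` argument, for example `soup.find_all("div", attrs={"data-handler": True})`.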

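Answer accepted

2 answers

How to scrape specific tags from a <div>

python, html, beautifulsoup

The data I want to extract is from this site. I only want to extract Release date: December 6, 2011 Last updated: January 10, 2012 Vulnerability identifier: APSA11-04 CVE number: CVE-2011-2462 Code: from bs4 import BeautifulSoup div = soup.find("div", attrs={"id": "L0C1-body"}) for p in div.findAll("p"): if p.find('strong'): print(p.text) Output: Relea

Views: 2 | Asked: 2021-03-30 | Votes: 2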

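Answer accepted

2 answers

Extracting data from within the div tag itself

python, html, beautifulsoup, html-parsing

I'm trying to extract data from an HTML file using Python's BeautifulSoup. The HTML line below is the one I'm interested in. <div class="myself" title="Name@email.com [11:07:27 AM]"> <nobr>Name</nobr></div> I want to extract the title (with the email and timestamp). I can use .find('div', attrs={'class':'myself'})) and I'm able to print the whole content of the div from it, or print the information in the tags inside the div, but I

Views: 0 | Asked: 2015-06-23 | Votes: 4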
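The value being asked for is stored in the div's `title` attribute rather than in its text, and a tag's attributes can be read with dictionary-style access. A short sketch with the HTML line from the question, assuming `html.parser`:

```python
from bs4 import BeautifulSoup

# HTML line taken from the question above.
html = ('<div class="myself" title="Name@email.com [11:07:27 AM]"> '
        '<nobr>Name</nobr></div>')
soup = BeautifulSoup(html, "html.parser")

div = soup.find("div", attrs={"class": "myself"})

# Attributes behave like dictionary entries on the tag.
title = div["title"]
print(title)

# .get() returns None instead of raising KeyError if the attribute is absent.
missing = div.get("data-missing")
```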

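Answer accepted

3 answers

Extracting the contents of a div tag, excluding other tags, with BeautifulSoup

python, beautifulsoup

I have the HTML content below, where the div tag looks like this <div class="block">aaa <p> bbb</p> <p> ccc</p> </div> From the above I want to extract only the text "aaa", not the contents of the other tags. When I do soup.find('div', {"class": "block"}) it gives me all the content as text, and I'd like to avoid the contents of the p tags. Is there a way to do this in BeautifulSoup?

Views: 35 | Asked: 2020-11-17 | Votes: 0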
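One way to read only the div's own text is to search for strings with `recursive=False`, which restricts the search to direct children and therefore skips anything inside the `<p>` tags. A minimal sketch with the HTML from the question, assuming `html.parser`:

```python
from bs4 import BeautifulSoup

# HTML taken from the question above.
html = '<div class="block">aaa <p> bbb</p> <p> ccc</p> </div>'
soup = BeautifulSoup(html, "html.parser")

div = soup.find("div", {"class": "block"})

# string=True matches text nodes; recursive=False limits the search to the
# div's direct children, so the text inside <p> is never visited.
own_text = div.find(string=True, recursive=False).strip()
print(own_text)
```

If the div holds several separate text fragments, `div.find_all(string=True, recursive=False)` returns all of them.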

Answer accepted

1 answer

Text disappears when the soup is generated

python, beautifulsoup, yahoo-finance

I'm trying to extract a date from a site, but when the text is generated, the date is not displayed import requests import re from bs4 import BeautifulSoup technicals = {} ticker = "INFY.NS" try: url = "https://finance.yahoo.com/quote/" + ticker + "/key-statistics?p=" + ticker r = requests.get(url) # soup = BeautifulSoup(open("

Views: 0 | Asked: 2020-08-30 | Votes: 0

1 answer

Beautiful Soup: can't extract all elements in one loop

python, beautifulsoup

Code: from bs4 import BeautifulSoup soup = BeautifulSoup('<div><p>p_string</p><div>div_string</div></div>') for m in soup.div: print "extract(first loop): ", m.extract() print "current soup.div(frist loop): ", soup.div #it contains anothe

Views: 5 | Asked: 2014-10-30 | Votes: 1

Answer accepted

1 answer

How to extract text content from a website using select() and a specific CSS selector

python, beautifulsoup

I'm learning how to extract content from websites using Python and BeautifulSoup. This is the HTML structure: <div id="preview-prediction" class="two-cols rc-b rc-r"> <span style="position: absolute; top: 0.5em; left: 1em; color: #808080;">Prediction: </span> <div class="

Views: 3 | Asked: 2015-04-04 | Votes: 0

Answer accepted

1 answer

Using BeautifulSoup 4 (lxml parser), how do I extract the inner HTML from a tag (decode_contents doesn't work)?

python, python-3.x, beautifulsoup, innerhtml

I'm using BeautifulSoup 4 and Python 3.7. I want to extract the inner HTML from an article that was found. I have this soup = BeautifulSoup(html, features="lxml") ... article_elt = top_article_elt.select('div[class*="outer"]')[0] article = article_elt.decode_contents() ... print("article: " + str(article) + " score:" + str(

Views: 6 | Asked: 2019-12-08 | Votes: 0

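Answer accepted

2 answers

Problem printing on a single line the text extracted from multiple links contained in divs with the same class

python, beautifulsoup

I'm trying to extract text from a page that has several divs with the same class. Each div contains a different number of text links. The text extracted from each div needs to be printed on one line. For example, if one div contains three links and another contains two, I want to extract the text from the three links in the first div and print the result on one line, then extract the text from the two links in the second div and print it on a new line. I also want to store the extracted data as a single item in an array. The code below prints the merged data correctly, but in addition to the extracted text it also prints the <a> tags and URLs. I tried adding the text attribute (content.text), but I got the following error: AttributeError: ResultSet object has no attr

Views: 0 | Asked: 2019-08-30 | Votes: 0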

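Answer accepted

1 answer

Extracting text except for one tag in BeautifulSoup

python, beautifulsoup, screen-scraping

I'm trying to extract text using BeautifulSoup. Here is the HTML: <div> "BLABLA" <span> "RRRRR" </span> <span> "ZZZZZ" </span> </div> I only want to get 'BLABLA' and 'RRRR', and not 'ZZZZ'. Of course, soup.text gives me all 3 texts. One solution is to iterate until the second span is found (as in this question: ) But is there a better solution in this case

Views: 4 | Asked: 2013-10-22 | Votes: 2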

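1 answer

How do I decompose and smooth tags from a BeautifulSoup object?

python-3.x, beautifulsoup

How do I decompose and smooth tags from a BeautifulSoup object, rather than a string? From a soup to a soup, without a string. The smooth() method is recommended for removing unwanted whitespace. Can you show me? from bs4 import BeautifulSoup dml = '''<html> <head> <title>TITLE</title> </head> <body>LOOSE TEXT <div></div> <p></p> <

Views: 2 | Asked: 2020-06-18 | Votes: 1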

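Answer accepted

1 answer

Searching for the text under a specific heading in any web page URL document in Python

python, web-scraping, beautifulsoup, scrapy

I've searched for and come across some web crawling libraries in Python, such as Scrapy, Beautiful Soup, and so on. Using these libraries, I want to scrape all the text under a specific heading in a document. I'd appreciate it if any of you could help. I've seen tutorials on how to use Beautiful Soup to get the links under a specific class name (via the view page source option), but how do I get the plain text, rather than links, under a specific heading? Sorry for my bad English import requests from bs4 import BeautifulSoup r=requests.get('https://patents.google.com/patent/US6886010B2/en') print(r.content)

Views: 0 | Asked: 2017-10-25 | Votes: 0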

3 answers

How to extract data from an HTML table with BeautifulSoup

python, html, beautifulsoup

How do I use BeautifulSoup to extract a specific value (39.74% in this case) that comes after "Proj. EPS Growth (F1)" in the example below? I'm completely new to Python. Thanks! <div class="high_low_table" id="high_low_table"> </table> <tbody> <tr> <th class="alpha" scope="row">Proj. EPS Growth (Q1) &l

Views: 2 | Asked: 2021-08-20 | Votes: 0

Answer accepted

1 answer

How do I extract all the "a" tags that follow a specific element?

python-3.x, beautifulsoup

I'm trying to extract all the a elements in a class whose text is Full browser, with import requests from bs4 import BeautifulSoup url = 'https://www.freethesaurus.com/great+adductor+muscle' soup = BeautifulSoup(requests.get(url).content, 'html.parser') main = soup.select('*:has(strong:contains("Full browser")

Views: 15 | Asked: 2020-08-25 | Votes: 0

Answer accepted

3 answers

How do I use BeautifulSoup to extract text from a class that appears after an <a href> tag inside a <div>?

html, python-3.x, beautifulsoup

I'm trying to extract text from a class that appears in (and after) a tag like this: from bs4 import BeautifulSoup html = """<div class="wisbb_teamA"> <a href="http://www.example.com/eg1" class="wisbb_name">Phillies</a> </div>""" soup = BeautifulSoup(html,"lxml") for

Views: 0 | Asked: 2020-06-24 | Votes: 0

Answer accepted