Since the page is not static, the values are not in the initial HTML, so you need to request the Bloomberg API endpoint that the page itself calls. To find it, open the page, choose "Inspect Element", select the "Network" tab, filter by "XHR", reload the page, and look for responses of type JSON. I did that and believe this is what you want: <a href="https://www.bloomberg.com/markets2/api/datastrip/IBVC%3AIND%2CINDU%3AIND%2CSPX%3AIND?locale=en&amp;customTickerList=true" rel="nofollow noreferrer">link</a>
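
The endpoint in the link above carries a comma-separated, percent-encoded list of tickers in its path. A minimal sketch of building such a URL for other tickers, assuming the path and query string keep the shape seen in the Network tab (the helper name `build_datastrip_url` is hypothetical, and the endpoint is undocumented and may change):

```python
from urllib.parse import quote

# Endpoint and query string copied from the request observed in the Network tab.
DATASTRIP_ROOT = "https://www.bloomberg.com/markets2/api/datastrip"
DATASTRIP_PARAMS = "locale=en&customTickerList=true"


def build_datastrip_url(tickers):
    """Join tickers with commas and percent-encode them into the URL path."""
    # safe="" makes quote() encode every reserved character, including ":" and ",".
    encoded = quote(",".join(tickers), safe="")
    return f"{DATASTRIP_ROOT}/{encoded}?{DATASTRIP_PARAMS}"


url = build_datastrip_url(["IBVC:IND", "INDU:IND", "SPX:IND"])
print(url)
```

Called with the three tickers shown, this reproduces the URL behind the link.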

Because the required values are loaded dynamically, you can try Selenium together with BeautifulSoup. Here is some sample code for reference:

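<pre><code>import time
import os
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from bs4 import BeautifulSoup

# Put the chromedriver binary in the same folder as this script.
# Selenium 4 takes the driver path through a Service object.
driver = webdriver.Chrome(service=Service(os.path.join(os.getcwd(), 'chromedriver')))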

driver.get("https://www.bloomberg.com/quote/IBVC:IND")
time.sleep(3)
real_soup = BeautifulSoup(driver.page_source, 'html.parser')
open_ = real_soup.find("span", {"class": "priceText__1853e8a5"}).text
print(f"Price: {open_}")
time.sleep(3)
driver.quit()
</code></pre>

Output:

<pre><code>Price: 50,083.00
</code></pre>

You can search for chromedriver and download the release that matches your Chrome version.
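
The scraped price is text with a thousands separator, so it has to be converted before doing any arithmetic with it. A small sketch, assuming the page always formats numbers like the output above (comma for thousands, dot for decimals); `parse_price` is a hypothetical helper, not part of Selenium or BeautifulSoup:

```python
from decimal import Decimal


def parse_price(text):
    """Convert a string such as '50,083.00' to a Decimal."""
    # Assumption: commas are thousands separators and the dot is the decimal point.
    return Decimal(text.strip().replace(",", ""))


price = parse_price("50,083.00")
print(price)  # 50083.00
```

`Decimal` keeps the two decimal places exactly as displayed, which avoids float rounding when comparing prices.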

As the other answers indicate, the content is generated via JavaScript and is therefore not in the plain HTML. Two different angles of attack have been proposed for this problem:

<ul>
<li><code>Selenium</code> aka The Big Guns: This lets you automate virtually any task in a browser, though at a certain cost in speed.</li>
<li><code>API Request</code> aka Thought Through: This is not always feasible, but when it is, it is much more efficient.</li>
</ul>

I will elaborate on the second one. @ViniciusDAvila already laid out the typical blueprint for such a solution: navigate to the site, inspect the Network tab and figure out which request is responsible for fetching the data.

Once this is done, the rest is a matter of execution:

Scraper

<pre><code>import requests
from urllib.parse import quote


# Constants
HEADERS = {
    'Host': 'www.bloomberg.com',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0',
    'Accept': '*/*',
    'Accept-Language': 'de,en-US;q=0.7,en;q=0.3',
    'Accept-Encoding': 'gzip, deflate, br',
    'Referer': 'https://www.bloomberg.com/quote/',
    'DNT': '1',
    'Connection': 'keep-alive',
    'TE': 'Trailers'
}
URL_ROOT = 'https://www.bloomberg.com/markets2/api/datastrip'
URL_PARAMS = 'locale=en&amp;customTickerList=true'
VALID_TYPE = {'currency', 'index'}


# Scraper
def scraper(object_id: str = None, object_type: str = None, timeout: int = 5) -&gt; list:
    """
    Get the Bloomberg data for the given object.
    :param object_id: The Bloomberg identifier of the object.
    :param object_type: The type of the object. (Currency or Index)
    :param timeout: Maximal number of seconds to wait for a response.
    :return: The parsed JSON response (a list of dictionaries), or an empty list on failure.
    """
    object_type = object_type.lower()
    if object_type not in VALID_TYPE:
        return list()
    # Build headers and url
    object_append = '%s:%s' % (object_id, 'IND' if object_type == 'index' else 'CUR')
    # Copy the headers so the shared HEADERS constant is not modified on every call
    headers = dict(HEADERS)
    headers['Referer'] += object_append
    url = '%s/%s?%s' % (URL_ROOT, quote(object_append), URL_PARAMS)
    # Make the request (honouring the timeout) and check response status code
    response = requests.get(url=url, headers=headers, timeout=timeout)
    if response.status_code in range(200, 230):
        return response.json()
    return list()
</code></pre>

Test

<pre><code># Index
object_id, object_type = 'IBVC', 'index'
data = scraper(object_id=object_id, object_type=object_type)
print('The open price for %s %s is: %d' % (object_type, object_id, data[0]['openPrice']))
# The open price for index IBVC is: 50094

# Exchange rate
object_id, object_type = 'EUR', 'currency'
data = scraper(object_id=object_id, object_type=object_type)
print('The open exchange rate for USD per {} is: {}'.format(object_id, data[0]['openPrice']))
# The open exchange rate for USD per EUR is: 1.0993
</code></pre>
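
Because `scraper` returns an empty list when the type is invalid or the request fails, `data[0]` in the test above raises an `IndexError` in those cases. A minimal sketch of a guard, assuming the response is a list of dictionaries as in the test; `first_field` is a hypothetical helper, and the sample record only mimics the shape of the response:

```python
def first_field(data, field, default=None):
    """Return `field` from the first record of `data`, or `default` if unavailable."""
    # Anything other than a non-empty list of dictionaries yields the default.
    if not isinstance(data, list) or not data or not isinstance(data[0], dict):
        return default
    return data[0].get(field, default)


# Illustrative record shaped like the response used above, not real API output.
sample = [{"openPrice": 50094}]
print(first_field(sample, "openPrice"))
print(first_field([], "openPrice"))
```

With this guard, a failed request shows up as `None` (or whatever default is passed) instead of an exception in the print statement.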

I want to scrape data from the Bloomberg website. The data under &quot;IBVC:IND
Caracas Stock Exchange Stock Market Index&quot; needs to be scraped.
Here is my code so far:
<pre><code>import requests
from bs4 import BeautifulSoup as bs

headers = {
 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
 'Chrome/58.0.3029.110 Safari/537.36 '
}
res = requests.get(&quot;https://www.bloomberg.com/quote/IBVC:IND&quot;, headers=headers)

soup = bs(res.content, 'html.parser')
# print(soup)
itmes = soup.find(&quot;div&quot;, {&quot;class&quot;: &quot;snapshot__0569338b snapshot&quot;})

open_ = itmes.find(&quot;span&quot;, {&quot;class&quot;: &quot;priceText__1853e8a5&quot;}).text
print(open_)
prev_close = itmes.find(&quot;span&quot;, {&quot;class&quot;: &quot;priceText__1853e8a5&quot;}).text
</code></pre>
I can't find the required values in the HTML. Which library should I use to handle that? I'm currently using BeautifulSoup and Requests.

Scrape data from Bloomberg


Stack Overflow user
