How can I extract untagged text with the Symfony crawler? For example, in the sample HTML below, I want to extract "Hello World".
<strong>title</strong>Hello World<strong>Sub-Title</strong><div>This is just stuff</div>

Posted on 2015-07-02 12:46:25
This is easy to do with PHP DOM ;)
$dom = new DOMDocument();
$dom->loadHTML('<strong>title</strong>Hello World<strong>Sub-Title</strong><div>This is just stuff</div>');
$xpath = new DOMXPath($dom);
// use the fact that PHP DOM wraps everything into the body and get the text()
$entries = $xpath->query('//body/text()');
foreach ($entries as $entry) {
    echo $entry->nodeValue;
}

Posted on 2019-09-26 07:43:46
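I have a better way for you:
$ExtractText = $crawler->filter('strong')->eq(1)->text();
This selects the <strong> element at index 1, since the title is at index 0. Note that it returns that element's own text ("Sub-Title"), not the untagged text between the tags.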
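
To target only the text that directly follows the first <strong>, rather than every text node in the body, one option is a sibling-axis XPath expression. The following is a minimal sketch using plain PHP DOM. The helper name extractUntaggedText is hypothetical and does not come from either answer, and the sketch assumes the fragment is parsed the same way as in the DOMDocument answer.

```php
<?php
// Hypothetical helper (an illustration, not part of the original answers):
// returns the untagged text that directly follows the first <strong> element.
function extractUntaggedText(string $html): string
{
    $dom = new DOMDocument();

    // Collect libxml warnings internally so imperfect HTML does not emit notices.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // First text node that is a following sibling of the first <strong>.
    $nodes = $xpath->query('(//strong)[1]/following-sibling::text()[1]');

    return $nodes->length > 0 ? trim($nodes->item(0)->nodeValue) : '';
}

$html = '<strong>title</strong>Hello World<strong>Sub-Title</strong><div>This is just stuff</div>';
echo extractUntaggedText($html), PHP_EOL; // prints "Hello World"
```

Symfony's DomCrawler also provides a filterXPath() method, so the same expression could presumably be passed as $crawler->filterXPath('(//strong)[1]/following-sibling::text()[1]')->text(). That variant has not been verified here and should be treated as an assumption.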
https://stackoverflow.com/questions/31183435