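I am trying to build a web crawler in C# using Abot. I went through many examples and added the Abot web crawler to my project, but all I get from it is log output, not the HTML of the pages. I only want the HTML page output, because that HTML is the input for Html Agility Pack. Please help me get the HTML output from the Abot web crawler in C#. Thanks.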
Posted on 2013-09-24 13:22:07
//Create an instance of the crawler and subscribe to the PageCrawlCompleted event
PoliteWebCrawler crawler = new PoliteWebCrawler();
crawler.PageCrawlCompleted += crawler_ProcessPageCrawlCompleted;

//The event handler method
void crawler_ProcessPageCrawlCompleted(object sender, PageCrawlCompletedArgs e)
{
    CrawledPage crawledPage = e.CrawledPage;

    if (crawledPage.WebException != null || crawledPage.HttpWebResponse.StatusCode != HttpStatusCode.OK)
        Console.WriteLine("Crawl of page failed {0}", crawledPage.Uri.AbsoluteUri);
    else
        Console.WriteLine("Crawl of page succeeded {0}", crawledPage.Uri.AbsoluteUri);

    //crawledPage.Content.Text //raw html
    //crawledPage.HtmlDocument //lazy loaded html agility pack object (HtmlAgilityPack.HtmlDocument)
    //crawledPage.CSDocument //lazy loaded cs query object (CsQuery.Cq)
}
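Since the goal is to hand the HTML to Html Agility Pack, the raw string from `crawledPage.Content.Text` can also be parsed explicitly. The following is only a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced; `HtmlLinkExtractor` and `ExtractLinks` are hypothetical names made up for illustration and are not part of Abot.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

static class HtmlLinkExtractor // hypothetical helper, not part of Abot
{
    // Parses a raw HTML string (for example crawledPage.Content.Text) and
    // returns the href value of every <a> element that has one.
    public static List<string> ExtractLinks(string rawHtml)
    {
        var links = new List<string>();
        if (string.IsNullOrEmpty(rawHtml))
            return links;

        var document = new HtmlDocument();
        document.LoadHtml(rawHtml);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = document.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
            return links;

        foreach (HtmlNode anchor in anchors)
            links.Add(anchor.GetAttributeValue("href", string.Empty));

        return links;
    }
}
```

Inside the event handler this could be called as `HtmlLinkExtractor.ExtractLinks(crawledPage.Content.Text)`.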
Posted on 2015-11-17 23:59:39
void crawler_ProcessPageCrawlCompleted(object sender, PageCrawlCompletedArgs e)
{
    CrawledPage crawledPage = e.CrawledPage;
    string html = crawledPage.Content.Text; // raw HTML of the crawled page
}
Posted on 2016-12-12 04:35:31
To get the HTML page, just use:
crawledPage.Content.Text
inside the function
`static void crawler_ProcessPageCrawlCompleted(object sender, PageCrawlCompletedArgs e)`
For example:
static void crawler_ProcessPageCrawlCompleted(object sender, PageCrawlCompletedArgs e)
{
    CrawledPage crawledPage = e.CrawledPage;

    if (crawledPage.WebException != null || crawledPage.HttpWebResponse.StatusCode != HttpStatusCode.OK)
        Console.WriteLine("Crawl of page failed {0}", crawledPage.Uri.AbsoluteUri);
    else
        Console.WriteLine("Crawl of page succeeded {0}", crawledPage.Uri.AbsoluteUri);

    if (string.IsNullOrEmpty(crawledPage.Content.Text))
        Console.WriteLine("Page had no content {0}", crawledPage.Uri.AbsoluteUri);

    var htmlAgilityPackDocument = crawledPage.HtmlDocument; //Html Agility Pack parser
    var angleSharpHtmlDocument = crawledPage.AngleSharpHtmlDocument;

    //get content: Content is a PageContent object, its Text property holds the raw HTML
    Console.WriteLine(crawledPage.Content.Text);
}
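If the HTML should end up in files rather than on the console, the string can be written to disk from the same handler. This is a rough sketch using only the .NET base class library; `PageHtmlWriter`, `Save`, and the file-naming scheme are assumptions made for illustration and have nothing to do with Abot itself.

```csharp
using System;
using System.IO;
using System.Text;

static class PageHtmlWriter // hypothetical helper, not part of Abot
{
    // Writes rawHtml to a file inside outputDirectory and returns the file path.
    // The file name is built from the host and path of an absolute page URI;
    // the query string is ignored, so URLs differing only in query overwrite each other.
    public static string Save(Uri pageUri, string rawHtml, string outputDirectory)
    {
        Directory.CreateDirectory(outputDirectory);

        var name = new StringBuilder(pageUri.Host + pageUri.AbsolutePath);
        foreach (char invalid in Path.GetInvalidFileNameChars())
            name.Replace(invalid, '_');

        string path = Path.Combine(outputDirectory, name.ToString() + ".html");
        File.WriteAllText(path, rawHtml ?? string.Empty, Encoding.UTF8);
        return path;
    }
}
```

In the handler above it might be called as `PageHtmlWriter.Save(crawledPage.Uri, crawledPage.Content.Text, "pages")`, where the `"pages"` directory name is arbitrary.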
https://stackoverflow.com/questions/18767988