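I'm clearly a JavaScript newbie, much as I hate to admit it. I'm trying to use Node.js to fetch a web page and save its contents to a variable so that I can parse it however I like.
In Python, I would do this: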
from bs4 import BeautifulSoup # for parsing
import urllib
text = urllib.urlopen("http://www.myawesomepage.com/").read()
parse_my_awesome_html(text)
How would I do this in Node? This is as far as I've gotten:
var request = require("request");
request("http://www.myawesomepage.com/", function (error, response, body) {
    /*
     Something here that lets me access the text
     outside of the closure
     This doesn't work:
     this.text = body;
    */ 
})
Posted on 2012-07-07 08:46:16
var request = require("request");
var parseMyAwesomeHtml = function(html) {
    //Have at it
};
request("http://www.myawesomepage.com/", function (error, response, body) {
    if (!error) {
        parseMyAwesomeHtml(body);
    } else {
        console.log(error);
    }
});
Edit: As Kishore said, there are good parsing options available. If you run into python/gyp problems with jsdom on Windows, take a look at cheerio as well. Cheerio on GitHub
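On the `this.text = body` attempt in the question: the callback runs only after the response has arrived, so code outside the callback that reads the text will most likely run too early. Besides calling the parser from inside the callback as above, one alternative is to wrap the fetch in a Promise. The following is a minimal sketch under some assumptions: a Node.js version with Promise support, and a hypothetical helper named `fetchText` that is not part of `request`. It uses Node's built-in `http` module, handles only `http://` URLs, and does not follow redirects or check status codes.

```javascript
const http = require("http");

// Hypothetical helper (not from the original answer or the request library):
// fetches a URL over plain HTTP and resolves with the response body as a string.
function fetchText(url) {
  return new Promise((resolve, reject) => {
    http
      .get(url, (res) => {
        res.setEncoding("utf8");
        let body = "";
        // The body arrives in chunks; collect them until the stream ends.
        res.on("data", (chunk) => {
          body += chunk;
        });
        res.on("end", () => resolve(body));
        res.on("error", reject);
      })
      // Connection-level failures (DNS, refused connection) reject the Promise.
      .on("error", reject);
  });
}
```

A caller could then write `fetchText("http://www.myawesomepage.com/").then(parseMyAwesomeHtml)`, or use `await` inside an `async` function. The text is still not available synchronously; the Promise only makes it easier to pass the result along.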
https://stackoverflow.com/questions/11371310