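Q: Web crawler using R

Stack Overflow user

Asked 2018-06-08 06:28:14

Answers: 1 | Views: 517 | Followers: 0 | Votes: -2

I want to build a web crawler in R for the website "https://www.latlong.net/convert-address-to-lat-long.html" that visits the site with an address as a parameter and then retrieves the generated latitude and longitude from the page. This would be repeated for every row of my dataset.

As I am new to web crawling, I would appreciate some guidance.

Thanks in advance.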

web-scraping

rcrawler

1 Answer

Stack Overflow user

Answered 2018-06-08 08:17:41

In the past I have used an API called ipstack (ipstack.com).

Example: a data frame 'd' that contains a column of IP addresses named 'ipAddress'.

# Create the result column up front so each row can be filled in
d$ipCountry <- NA_character_

for (i in seq_len(nrow(d))) {
  # Get data from the API and save the text to variable 'str'
  lookupPath <- paste0("http://api.ipstack.com/", d$ipAddress[i], "?access_key=INSERT YOUR API KEY HERE&format=1")
  str <- readLines(lookupPath)

  # Save all the data to a file named after the row index
  writeLines(str, paste0(i, ".txt"))

  # Save the country line to the main data frame 'd' as well
  d$ipCountry[i] <- str[7]
  print(paste("Successfully saved ip #:", i))
}

In this example I was specifically interested in the country of each IP, which appears on line 7 of the data returned by the API (hence str[7]).
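Picking the country by line number depends on the response layout staying the same. A minimal sketch of selecting the line by field name instead, assuming the response is pretty-printed JSON with one field per line and that the field is called `country_name` (both are assumptions, not confirmed by the answer). The helper `extract_field` and the sample text are hypothetical and use base R only:

```r
# Hypothetical helper: pull the string value of one field out of
# pretty-printed JSON lines by name rather than by position.
extract_field <- function(lines, field) {
  # Matches lines of the form:   "field": "value"
  pattern <- paste0('^[[:space:]]*"', field,
                    '"[[:space:]]*:[[:space:]]*"([^"]*)".*$')
  hit <- grep(pattern, lines, value = TRUE)
  if (length(hit) == 0) return(NA_character_)  # field not present
  sub(pattern, "\\1", hit[1])
}

# Made-up sample in the assumed layout (not real API output)
sample_lines <- c(
  "{",
  '  "ip": "192.0.2.1",',
  '  "country_code": "ZZ",',
  '  "country_name": "Exampleland",',
  '  "city": "Sampletown"',
  "}"
)

country <- extract_field(sample_lines, "country_name")
print(country)
```

Inside the loop this would replace `str[7]` with `extract_field(str, "country_name")`. For anything beyond a single flat field, a proper JSON parser is the safer choice.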

The API lets you look up 10,000 addresses per month for free, which was enough for my purposes.
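With a monthly quota, it may help to query each distinct IP only once when the data frame contains repeats. A rough sketch of that idea follows; `lookup_unique` and `fake_lookup` are hypothetical names, and `fake_lookup` is a stand-in for the real API call so the example runs offline:

```r
# Hypothetical wrapper: call `lookup` (any function that takes one IP
# string and returns one string) once per distinct IP, then map the
# results back onto every row.
lookup_unique <- function(ips, lookup) {
  ips <- as.character(ips)          # also handles factor columns
  unique_ips <- unique(ips)
  results <- vapply(unique_ips, lookup, character(1), USE.NAMES = FALSE)
  results[match(ips, unique_ips)]
}

# Stand-in for the real request; counts how often it is called
calls <- 0
fake_lookup <- function(ip) {
  calls <<- calls + 1
  paste("country-of", ip)
}

ips <- c("192.0.2.1", "192.0.2.2", "192.0.2.1")
countries <- lookup_unique(ips, fake_lookup)
print(countries)
print(calls)  # number of lookups actually made
```

In the original loop, the equivalent would be something like `d$ipCountry <- lookup_unique(d$ipAddress, my_lookup)`, where `my_lookup` wraps the `readLines` call for one address.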

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/50750968