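I am trying to retrieve citation information from PubMed, using RefManageR and PubMed IDs (pmids).
I chose RefManageR because it makes it very easy to get the output into data.frame format. Understanding and using the PubMed API directly is still difficult for me.
I was able to write code that fetches the data using a "string of PMIDs" as input: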
require(RCurl)
urli <- getURL("https://gist.githubusercontent.com/aurora-mareviv/3840512f6777d5293218/raw/dfd6b76ceb22c52aa073fc05211dcea986406914/pmids.csv", ssl.verifypeer = FALSE)
pmids <- read.csv(textConnection(urli))
head(pmids)
index10 <- pmids$pmId[1:10]
indice10 <- paste(pmids$pmId[1:10], collapse=" ")
# install.packages("RefManageR")
library(RefManageR)
auth.pm10 <- ReadPubMed(indice10, database = "PubMed", mindate = 1950)
auth.pm10d <- data.frame(auth.pm10)
View(auth.pm10d)  # inspect the data.frame built on the previous line
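However, if I want to get citations for 500 pmids, I think I should avoid sending one long query to the PubMed server. My idea is to write a function that loops over all the elements of the vector index10, something like this: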
extract.pub <-
function(id=indice, dbase=d.base, mindat=1950){
require(RefManageR)
indice <- id # Author
d.base <- dbase # like PubMed, etc
min.dat <- mindat # Date from...
auth.pm <- NULL
for(i in indice){
auth.pm <- ReadPubMed(indice, database = d.base, mindate = min.dat)
}
auth.pm <- data.frame(auth.pm)
auth.pm
}
cites <- extract.pub(index10, dbase="PubMed")
View(cites)
It gives the following error: Error : Internal server error
However, if I pass indice10 (a string) instead of index10 (a vector), it works:
cites <- extract.pub(indice10, dbase="PubMed")
View(cites)
How can I get this loop to work? Or is this approach not the best one for my case?
Answer posted on 2014-10-16 05:40:52
ReadPubMed accepts only a single pmid or query per function call. Try:
# pmids is the data.frame from the question; the IDs are in its pmId column
lapply(pmids$pmId[1:3], ReadPubMed, database = "PubMed", mindate = 1950)
This gives:
[[1]]
[1] P. M. Zeltzer, B. Bodey, A. Marlin, et al. “Immunophenotype profile of childhood
medulloblastomas and supratentorial primitive neuroectodermal tumors using 16 monoclonal
antibodies”. Eng. In: _Cancer_ 66.2 (1990), pp. 273-83. PMID: 2196109.
[[2]]
[1] L. C. Rome, R. P. Funke and R. M. Alexander. “The influence of temperature on muscle
velocity and sustained performance in swimming carp”. Eng. In: _The Journal of
experimental biology_ 154 (1990), pp. 163-78. PMID: 2277258.
[[3]]
[1] P. Henry. “[Headache, facial neuralgia. Diagnostic orientation and management]”. Fre.
In: _La Revue du praticien_ 40.7 (1990), pp. 677-81. PMID: 2326596.
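For the 500-ID case in the question, one option is to send several short queries instead of one long one or 500 single ones. The sketch below is only an illustration under assumptions: `chunk_ids`, `build_queries` and `fetch_in_batches` are hypothetical helper names, the `fetch` argument stands in for a call such as `ReadPubMed`, and the batch size of 10 is chosen because ten space-separated IDs is what the question shows working. Larger batches may hit a default cap on the number of returned records, so check the ReadPubMed documentation before raising it.

```r
# Split a vector of IDs into chunks of at most `size` elements.
chunk_ids <- function(ids, size = 10) {
  stopifnot(size >= 1)
  if (length(ids) == 0) return(list())
  split(ids, ceiling(seq_along(ids) / size))
}

# Build one space-separated query string per chunk (same shape as indice10).
build_queries <- function(ids, size = 10) {
  vapply(chunk_ids(ids, size), paste, character(1), collapse = " ")
}

# Call `fetch` once per query string, pausing between calls.
# `fetch` is a placeholder (assumption), e.g.
#   function(q) RefManageR::ReadPubMed(q, database = "PubMed", mindate = 1950)
# A failed query is recorded as NULL instead of aborting the whole run.
fetch_in_batches <- function(ids, fetch, size = 10, pause = 0.5) {
  queries <- build_queries(ids, size)
  results <- vector("list", length(queries))
  for (k in seq_along(queries)) {
    value <- tryCatch(
      fetch(queries[[k]]),
      error = function(e) {
        warning("query ", k, " failed: ", conditionMessage(e))
        NULL
      }
    )
    # results[k] <- list(...) keeps the slot even when value is NULL
    results[k] <- list(value)
    if (k < length(queries)) Sys.sleep(pause)
  }
  results
}
```

With the question's data this might be called as `fetch_in_batches(pmids$pmId, function(q) ReadPubMed(q, database = "PubMed", mindate = 1950))`, which returns one result per batch; batches that failed come back as NULL and can be retried.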
You can put the elements of the BibEntry class into a data.frame and format the authors nicely:
lapply(pmids$pmId[1:3], function(x){  # IDs come from the pmId column of the question's data.frame
tmp <- unlist(ReadPubMed(x, database = "PubMed", mindate = 1950))
tmp <- lapply(tmp, function(z) if(is(z, "person")) paste0(z, collapse = ",") else z)
data.frame(tmp, stringsAsFactors = FALSE)
})
This gives:
title
1 Immunophenotype profile of childhood medulloblastomas and supratentorial primitive neuroectodermal tumors using 16 monoclonal antibodies
2 The influence of temperature on muscle velocity and sustained performance in swimming carp
3 [Headache, facial neuralgia. Diagnostic orientation and management]
author year journal volume number pages eprint language eprinttype bibtype
1 P M Zeltzer,B Bodey,A Marlin,J Kemshead 1990 Cancer 66 2 273-83 2196109 eng pubmed Article
2 L C Rome,R P Funke,R M Alexander 1990 The Journal of experimental biology 154 <NA> 163-78 2277258 eng pubmed Article
3 P Henry 1990 La Revue du praticien 40 7 677-81 2326596 fre pubmed Article
dateobj key
1 1990-01-01 zeltzer1990immunophenotype
2 1990-01-01 rome1990influence
3 1990-01-01 henry1990headache
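The output above is shown as a single table, whereas `lapply` returns a list of one-row data frames, and the records do not all have the same fields (the second one has no `number`). A minimal base-R sketch of one way to combine such a list is given here; `bind_rows_fill` is a hypothetical helper name, not part of RefManageR, and it is not necessarily how the table above was produced.

```r
# Row-bind a list of data frames whose columns may differ,
# filling columns that a given data frame lacks with NA.
bind_rows_fill <- function(dfs) {
  dfs <- Filter(Negate(is.null), dfs)          # drop failed (NULL) entries
  if (length(dfs) == 0) return(data.frame())
  all_cols <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(d) {
    for (m in setdiff(all_cols, names(d))) {
      d[[m]] <- rep(NA, nrow(d))               # add the missing column
    }
    d[, all_cols, drop = FALSE]                # same column order everywhere
  })
  out <- do.call(rbind, padded)
  rownames(out) <- NULL
  out
}
```

Assigning the result of the `lapply` call to a variable, say `cites_list`, and then calling `bind_rows_fill(cites_list)` would give one data.frame with a row per citation.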
https://stackoverflow.com/questions/26401119