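Some time ago, while taking part in an Analytics Vidhya competition, I tried to ensemble several models. I found that R had no easy-to-use open-source package for ensembling.
That was when I decided to take the opportunity to create a simple package that lets people ensemble (stack) models in a few lines of code. The result is a package called ensembleR, which you can find on CRAN. It lets you build ensembles of multiple models in R. To learn more about ensembling in R, read this: https://www.analyticsvidhya.com/blog/2017/02/introduction-to-ensembling-along-with-implementation-in-r/.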
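Now let's write the function that actually predicts the price movement for a given stock symbol. This 'stock_predict' function has to be written in the hello.R file, under the export field. This is what the hello.R file looks like once the function is written: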
#' @title Predicts Stock Price Movement for Given Stock Symbol
#'
#' @description This package predicts whether the stock price at tomorrow's market close would be higher or lower compared to today's closing price.
#'
#' @param symbol Stock symbol to predict, given as a character string such as 'AAPL'
#'
#' @return The predicted probability, printed to the console and returned invisibly
#'
#' @examples stock_predict('AAPL')
#'
#' @export stock_predict
stock_predict<-function(symbol)
{
#To ignore the warnings during usage
options(warn=-1)
options("getSymbols.warning4.0"=FALSE)
#Importing price data for the given symbol
data<-data.frame(xts::as.xts(get(quantmod::getSymbols(symbol))))
#Assigning the column names
colnames(data) <-c("data.Open","data.High","data.Low","data.Close","data.Volume","data.Adjusted")
#Creating lag and lead features of the price column
data <-xts::xts(data,order.by=as.Date(rownames(data)))
data <- as.data.frame(merge(data,lm1=stats::lag(data[,'data.Adjusted'],c(-1,1,3,5,10))))
#Extracting features from Date
data$Date<-as.Date(rownames(data))
data$Day_of_month<-as.integer(format(as.Date(data$Date),"%d"))
data$Month_of_year<-as.integer(format(as.Date(data$Date),"%m"))
data$Year<-as.integer(format(as.Date(data$Date),"%y"))
data$Day_of_week<-as.factor(weekdays(data$Date))
#Naming variables for reference
today <- 'data.Adjusted'
tommorow <- 'data.Adjusted.5'
#Creating outcome
data$up_down <-as.factor(ifelse(data[,tommorow] > data[,today], 1, 0))
#Creating train and test sets
train<-data[stats::complete.cases(data),]
test<-data[nrow(data),]
#Training model
model<-stats::glm(up_down~data.Open+data.High+data.Low+data.Close+
                  data.Volume+data.Adjusted+data.Adjusted.1+
                  data.Adjusted.2+data.Adjusted.3+data.Adjusted.4+
                  Day_of_month+Month_of_year+Year+Day_of_week,
                  family=stats::binomial(link='logit'),data=train)
#Making Predictions
pred<-as.numeric(stats::predict(model,test[,c('data.Open','data.High','data.Low','data.Close','data.Volume','data.Adjusted','data.Adjusted.1','data.Adjusted.2','data.Adjusted.3','data.Adjusted.4','Day_of_month','Month_of_year','Year','Day_of_week')],type= 'response'))
#Printing results
print("Probability of stock price going up tomorrow:")
print(pred)
}
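The lag/lead and outcome steps inside stock_predict can be tried offline on made-up numbers. The following is only a minimal sketch, not the package's code: the price vector, the helper names lag_by and lead_by, and the column names are all hypothetical, and it uses base R alone so that nothing has to be downloaded through quantmod.

```r
# Hypothetical closing prices (made-up numbers, not real market data)
price <- c(10, 11, 9, 12, 13, 12, 14, 15, 13, 16, 17, 15)
n <- length(price)

# Hypothetical helpers: shift a vector by k rows, padding with NA
lag_by  <- function(x, k) c(rep(NA, k), x[seq_len(length(x) - k)])  # value k rows earlier
lead_by <- function(x, k) c(x[-seq_len(k)], rep(NA, k))             # value k rows later

toy <- data.frame(
  today    = price,
  tomorrow = lead_by(price, 1),  # lead feature: the next row's price
  lag1     = lag_by(price, 1),   # lag feature: the previous row's price
  lag3     = lag_by(price, 3)    # lag feature: the price three rows back
)

# Outcome: 1 if the next price is higher than the current one, otherwise 0
toy$up_down <- as.factor(ifelse(toy$tomorrow > toy$today, 1, 0))

# Rows with any NA (no lag history yet, or no known next price) are dropped for training
train <- toy[stats::complete.cases(toy), ]

# The last row is the one to score: its next price, and so its outcome, is unknown
test <- toy[n, ]

print(toy)
```

The sketch shows why the training set is built from complete cases only: the first rows have no lag history and the last row has no known next-day price, which is exactly the row a prediction is wanted for.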