My goal is to write a for loop that converts certain columns of a dataset to factor or integer.
The condition is based on the column name.
# Here is a small reproducible dataset
df <- data.frame(x = c(10,20,30), y = c("yes", "no", "no"), z = c("Big", "Small", "Average"))
# here is a vector that we are going to use inside our if statement
column_factor_names <- c("y", "z")
# for each column in df
for (i in names(df)) {
  print(i)
  # if it's a factor, convert into factor, else convert it into integer
  if (i %in% column_factor_names) {
    print("it's a factor")
    df$i <- as.factor(df$i)
  } else {
    print("it's an integer")
    df$i <- as.integer(df$i)
  }
}
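When I run this, I get: Error in `$<-.data.frame`(`*tmp*`, "i", value = integer(0)) : replacement has 0 rows, data has 3
The problem is in the lines df$i <- as.factor(df$i) and df$i <- as.integer(df$i) inside the if-else statement.
What I don't understand is that the same conversions work when I run them manually. For example: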
df$"x" <- as.integer(df$"x")
df$"y" <- as.factor(df$"y")
df$"z" <- as.factor(df$"z")
str(df)
This works:
'data.frame': 3 obs. of 3 variables:
$ x: int 10 20 30
$ y: Factor w/ 2 levels "no","yes": 2 1 1
$ z: Factor w/ 3 levels "Average","Big",..: 2 3 1
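My question is: why doesn't it work inside the for loop and if statement?
Posted on 2019-09-12 18:45:59
In your code, the subsetting operator $ looks for a column literally named i instead of evaluating the variable i. You can subset the data.frame with [, i] or [[i]] instead: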
x <- data.frame(x = c(10,20,30), y = c("yes", "no", "no"), z = c("Big", "Small", "Average"))
# here is a vector that we are going to use inside our if statement
column_factor_names <- c("y", "z")
# for each column in x
for (i in names(x)) {
  print(i)
  # if it's a factor, convert into factor, else convert it into integer
  if (i %in% column_factor_names) {
    print("it's a factor")
    x[[i]] <- as.factor(x[[i]])
  } else {
    print("it's an integer")
    x[[i]] <- as.integer(x[[i]])
  }
}
See help("$") for more information.
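To make the failure in the question concrete, here is a minimal sketch using a cut-down version of the question's data frame. It assumes nothing beyond base R: because df has no column literally named "i", df$i returns NULL, and as.integer(NULL) is the zero-length integer(0) that appears in the error message.

```r
df <- data.frame(x = c(10, 20, 30), y = c("yes", "no", "no"))
i <- "x"

# `$` treats i as a literal column name, and df has no column called "i"
is.null(df$i)              # TRUE
length(as.integer(df$i))   # 0, hence "replacement has 0 rows"

# `[[` evaluates i first, so this selects the column named "x"
df[[i]]                    # 10 20 30
```

Assigning a zero-length vector into a three-row data frame is what triggers "replacement has 0 rows, data has 3".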
If you don't mind losing the status messages, you can also do this without a loop:
x[column_factor_names] <- lapply(x[column_factor_names], as.factor)
Posted on 2019-09-12 19:00:59
Here is the corrected loop from your code:
# Here is a small reproducible dataset
df <- data.frame(x = c(10,20,30), y = c("yes", "no", "no"), z = c("Big", "Small", "Average"))
# here is a vector that we are going to use inside our if statement
column_factor_names <- c("y", "z")
for (i in names(df)) {
  print(i)
  if (i %in% column_factor_names) {
    print("it's a factor")
    df[,i] <- as.factor(df[,i])
  } else {
    print("it's an integer")
    # as.integer() matches the stated goal; as.numeric() would give doubles
    df[,i] <- as.integer(df[,i])
  }
}
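One way to confirm that a loop like this did what was intended is to inspect the class of every column afterwards. The sketch below is self-contained and uses [[i]] with as.integer(); note that as.numeric() would instead leave the non-factor column as double-precision "numeric", which may or may not matter for your use case.

```r
df <- data.frame(x = c(10, 20, 30), y = c("yes", "no", "no"), z = c("Big", "Small", "Average"))
column_factor_names <- c("y", "z")

for (i in names(df)) {
  if (i %in% column_factor_names) {
    df[[i]] <- as.factor(df[[i]])
  } else {
    df[[i]] <- as.integer(df[[i]])
  }
}

# Named character vector giving the class of each column
col_classes <- sapply(df, class)
print(col_classes)
```

Here col_classes should report "integer" for x and "factor" for y and z, matching the str() output shown in the question.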
https://stackoverflow.com/questions/57905007