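I'm new to coding and to R. I have a Stata dataset and want to use ggplot to plot my data, but I get multiple errors such as:
no applicable method for 'rescale' applied to an object of class "c('haven_labelled', 'vctrs_vctr', 'double')"
I don't know how to convert these columns into something I can plot, so that I can visualise them.
The code is as follows: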
Data <- read_dta("longitudinal_td.dta")
Data <- Data %>%
  select(pidp, wave, age_dv, sex_dv, ethn_dv, sf1_dv, bmi_dv, sf12pcs_dv, fihhmnnet1_dv, sf12mcs_dv) %>%
  filter(wave == "1", age_dv <= 50) %>%
  mutate(pipd = row_number(), age = age_dv, sex = sex_dv, ethnicity = ethn_dv, general_health = sf1_dv,
         bmi = bmi_dv, physical_component_score = sf12pcs_dv, mental_component_score = sf12mcs_dv, household_income = fihhmnnet1_dv) %>%
  select(-pipd, -age_dv, -sex_dv, -ethn_dv, -sf1_dv, -bmi_dv, -sf12pcs_dv, -sf12mcs_dv, -fihhmnnet1_dv)
I hope this is correct. Here is the dput:
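Essentially, I just want to explore BMI, but I don't know whether I can plot these columns as they are, or whether I need to map the numbers to their labels, as is already recorded in the haven labels.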
dput(head(Data))
structure(list(pidp = structure(c(68001367, 68006127, 68008167,
68009527, 68010207, 68010887), label = "cross-wave person identifier (public release)", format.stata = "%12.0g"),
wave = structure(c(1, 1, 1, 1, 1, 1), label = "interview wave", format.stata = "%8.0g"),
age = structure(c(39, 39, 38, 31, 24, 45), label = "Age, derived from dob_dv and intdat_dv", format.stata = "%8.0g"),
sex = structure(c(1, 2, 2, 1, 2, 2), label = "Sex, derived", format.stata = "%8.0g", labels = c(Male = 1,
Female = 2), class = c("haven_labelled", "vctrs_vctr", "double"
)), ethnicity = structure(c(1, 1, 1, 1, 1, 1), label = "Ethnic group (derived from multiple sources)", format.stata = "%8.0g", labels = c(`white uk` = 1,
irish = 2, `gypsy or irish traveller` = 3, `any other white background` = 4,
`white and black caribbean` = 5, `white and black african` = 6,
`white and asian` = 7, `any other mixed background` = 8,
indian = 9, pakistani = 10, bangladeshi = 11, chinese = 12,
`any other asian background` = 13, caribbean = 14, african = 15,
`any other black background` = 16, arab = 17, `any other ethnic group` = 97
), class = c("haven_labelled", "vctrs_vctr", "double")),
general_health = structure(c(2, 4, 5, 3, 1, 1), label = "General health", format.stata = "%8.0g", labels = c(excellent = 1,
`very good` = 2, good = 3, fair = 4, `or Poor?` = 5), class = c("haven_labelled",
"vctrs_vctr", "double")), bmi = structure(c(29.6, 38.8, 21.5,
24.2, 25, 25.5), label = "Body Mass Index", format.stata = "%12.0g")
Posted on 2022-08-16 17:13:48
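Thank you for posting a sample of your data with dput(). The format of the data you posted suggests that it has somehow ended up as a list rather than a data frame. You need to convert it to a data frame. Since you are using haven, I'll stick with the tidyverse and use as_tibble() for the conversion.
Similarly, what you want are the labels, not the underlying integers. You can simply apply as_factor() to the whole data frame.
Your data can then be piped into ggplot2. For example: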
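library(dplyr)
library(ggplot2)
library(haven)
# The native pipe |> requires R 4.1 or later
Data |>
  as_tibble() |>    # make sure the data is a tibble (data frame)
  as_factor() |>    # haven::as_factor(): labelled columns become factors
  ggplot() +
  geom_boxplot(aes(x = sex, y = bmi))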
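
To see what as_factor() does without the original .dta file, here is a minimal sketch on made-up data. The `toy` tibble and the other object names are hypothetical. The values are copied from the first rows of the dput() output in the question, and the labelled column is rebuilt with haven's labelled() constructor, so it only approximates what read_dta() would return.

```r
library(haven)
library(tibble)
library(ggplot2)

# Hypothetical toy data mimicking two columns of the question's dput() output
toy <- tibble(
  sex = labelled(c(1, 2, 2, 1, 2, 2),
                 labels = c(Male = 1, Female = 2),
                 label = "Sex, derived"),
  bmi = c(29.6, 38.8, 21.5, 24.2, 25, 25.5)
)

# Before conversion: a haven_labelled vector, the class named in the error
class(toy$sex)

# as_factor() on a data frame converts its labelled columns to factors
toy_fct <- as_factor(toy)
levels(toy_fct$sex)

# Alternatively, convert a single column and leave the others untouched
toy_one <- toy
toy_one$sex <- as_factor(toy_one$sex)

# With sex now a factor, it can be used as a discrete aesthetic to explore BMI
p <- ggplot(toy_fct, aes(x = bmi, fill = sex)) +
  geom_histogram(binwidth = 5)
```

Printing `p` draws the histogram. The bin width of 5 is an arbitrary choice for illustration and would need adjusting for the full dataset.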

https://stackoverflow.com/questions/73377261