R: Error in solve.default(xtx + diag(pen)): system is computationally singular: reciprocal condition number =

Stack Overflow user
Asked on 2022-08-21 20:47:08
Answers: 1 · Views: 438 · Followers: 0 · Votes: 0

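I want to analyze COVID-19 data. I have done part of the data cleaning and ended up with a dataset of 160260 rows and 34 columns. I converted variables such as continent, location, and tests_units to factors. To check for missing values, I calculated the percentage of missing values per column, with the following result: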

> (colMeans(is.na(dataset1)))*100
          continent                location                    date             total_cases 
          0.0000000               0.0000000               0.0000000               1.9699239 
          new_cases            total_deaths              new_deaths       reproduction_rate 
          2.0366904               8.0094846               8.1130663              14.0078622 
       icu_patients           hosp_patients   weekly_icu_admissions  weekly_hosp_admissions 
         84.7747410              83.7021091              96.2386123              92.5851741 
        total_tests               new_tests           positive_rate          tests_per_case 
         54.4465244              56.6966180              43.9292400              44.7154624 
        tests_units people_fully_vaccinated        new_vaccinations        stringency_index 
         38.0974666              73.6390865              76.2298765              15.7138400 
         population      population_density              median_age           aged_70_older 
          0.0000000               4.3073755              10.5291401              11.0077374 
     gdp_per_capita         extreme_poverty   cardiovasc_death_rate     diabetes_prevalence 
         11.9381006              42.0897292              11.0077374               6.7003619 
     female_smokers            male_smokers  handwashing_facilities         life_expectancy 
         32.9963809              33.9535754              55.9690503               0.4785973 
human_development_index        excess_mortality
             13.3738924              96.1225509

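I don't want to analyze a dataset with missing values, so I searched extensively for a way to fill in these NAs. I found that I can use the mice function to impute them. My goals: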

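  1. Use the mice function without using the variable date as a predictor.
  2. Do not impute values for the variables continent, location, date, and population, because they have no NAs.
  3. Impute values for the variables total_cases, new_cases, total_deaths, new_deaths, reproduction_rate, icu_patients, hosp_patients, weekly_icu_admissions, weekly_hosp_admissions, total_tests, new_tests, positive_rate, tests_per_case, people_fully_vaccinated, new_vaccinations, stringency_index, population_density, median_age, aged_70_older, gdp_per_capita, extreme_poverty, cardiovasc_death_rate, diabetes_prevalence, female_smokers, male_smokers, handwashing_facilities, life_expectancy, human_development_index, and excess_mortality with the pmm (predictive mean matching) method, because these variables are numeric.
  4. Impute values for the variable tests_units with the polyreg (polytomous logistic regression) method, because this variable is a factor with 4 levels.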

I followed every step in the link and ran the following code:

library(mice)

init = mice(dataset1,maxit = 0)
meth = init$method
predM = init$predictorMatrix

predM[, c("date")] = 0 #goal number 1

meth[c("continent","location","date","population")] = "" #goal number 2

meth[c("total_cases","new_cases","total_deaths","new_deaths","reproduction_rate",
   "icu_patients","hosp_patients","weekly_icu_admissions",
   "weekly_hosp_admissions","total_tests","new_tests","positive_rate",
   "tests_per_case","people_fully_vaccinated",
   "new_vaccinations","stringency_index","population_density","median_age",
   "aged_70_older","gdp_per_capita","extreme_poverty",
   "cardiovasc_death_rate","diabetes_prevalence","female_smokers",
   "male_smokers","handwashing_facilities","life_expectancy",
   "human_development_index","excess_mortality")]="pmm" #goal number 3

meth[c("tests_units")] = "polyreg" #goal number 4

set.seed(103)

imputed = mice(dataset1, method=meth, predictorMatrix=predM, m=5)

The result I got was:

> library(mice)
> init = mice(dataset1,maxit = 0)
Warning message:
Number of logged events: 1 
> meth = init$method
> predM = init$predictorMatrix
> predM[, c("date")] = 0
> meth[c("continent","location","date","population")] = ""
> meth[c("total_cases","new_cases","total_deaths","new_deaths","reproduction_rate",
+        "icu_patients","hosp_patients","weekly_icu_admissions",
+        "weekly_hosp_admissions","total_tests","new_tests","positive_rate",
+        "tests_per_case","people_fully_vaccinated",
+        "new_vaccinations","stringency_index","population_density","median_age",
+        "aged_70_older","gdp_per_capita","extreme_poverty",
+        "cardiovasc_death_rate","diabetes_prevalence","female_smokers",
+        "male_smokers","handwashing_facilities","life_expectancy",
+        "human_development_index","excess_mortality")]="pmm"
> meth[c("tests_units")] = "polyreg"
> 
> set.seed(103)
> imputed = mice(dataset1, method=meth, predictorMatrix=predM, m=5)

 iter imp variable
  1   1  total_casesError in solve.default(xtx + diag(pen)) : 
  system is computationally singular: reciprocal condition number = 2.80783e-24

This is not very pleasant. What should I change, or what code should I run?

Thanks in advance!


1 Answer

Stack Overflow user

Answered on 2022-09-24 11:55:21

Have you checked your logged events?

View(init$loggedEvents)
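
Logged events in mice usually point at constant or collinear columns. A rough base-R check for the same two conditions is sketched below. The data frame `toy` and its columns are hypothetical stand-ins for dataset1, and the 0.99 correlation cutoff is an arbitrary assumption, not a value taken from mice.

```r
# Hypothetical toy data standing in for dataset1 (not the asker's data)
set.seed(42)
toy <- data.frame(
  a = rnorm(100),
  b = rnorm(100),
  k = rep(5, 100)              # constant column
)
toy$a_copy <- toy$a * 3 + 1    # exact linear function of a
toy$a[c(3, 7)] <- NA           # a few missing values, as in real data

# Restrict the check to numeric columns
num <- toy[sapply(toy, is.numeric)]

# Columns that take a single value once NAs are ignored
is_constant <- sapply(num, function(v) length(unique(v[!is.na(v)])) <= 1)
constant_cols <- names(num)[is_constant]

# Pairwise correlations among the remaining columns, on pairwise-complete rows
varying <- num[setdiff(names(num), constant_cols)]
cm <- cor(varying, use = "pairwise.complete.obs")
cm[upper.tri(cm, diag = TRUE)] <- NA   # keep each pair once, drop the diagonal

# Pairs whose absolute correlation exceeds the assumed cutoff
high <- which(abs(cm) > 0.99, arr.ind = TRUE)
collinear_pairs <- data.frame(
  var1 = rownames(cm)[high[, "row"]],
  var2 = colnames(cm)[high[, "col"]],
  cor  = cm[high]
)

print(constant_cols)
print(collinear_pairs)
```

Columns flagged this way are candidates for dropping from the data or from the predictor matrix before running the imputation again.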

Maybe it is because of the imputation method you used ("polyreg"). Have you tried a more robust method such as pmm?
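
The error itself is raised by solve() on the matrix xtx + diag(pen), whose reported reciprocal condition number is about 2.8e-24. Predictors on very different scales are one possible contributor, though nothing in this thread confirms it. The sketch below uses made-up columns and assumes a ridge-style penalty proportional to the diagonal, which is a guess at the form of `pen`. It shows how rcond() reacts to scaling.

```r
# Hypothetical columns on very different scales (not the asker's data)
set.seed(1)
n     <- 500
big   <- rnorm(n, mean = 1e9, sd = 1e8)   # e.g. a population-sized count
small <- runif(n, min = 0, max = 0.05)    # e.g. a rate close to zero
X <- cbind(intercept = 1, big, small)

# Cross-product matrix plus an assumed ridge-style penalty on the diagonal
xtx    <- crossprod(X)
pen    <- 1e-5 * diag(xtx)
rc_raw <- rcond(xtx + diag(pen))          # tiny value: numerically singular

# Same construction after centring and scaling the two predictors
Xs        <- cbind(intercept = 1, scale(X[, c("big", "small")]))
xtx_s     <- crossprod(Xs)
pen_s     <- 1e-5 * diag(xtx_s)
rc_scaled <- rcond(xtx_s + diag(pen_s))   # far from zero: well conditioned

print(c(raw = rc_raw, scaled = rc_scaled))
```

If rescaling the numeric columns, or removing near-duplicate ones, raises the reciprocal condition number well above machine precision, that suggests conditioning rather than the choice of imputation method is behind the error.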

Votes: 0
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/73438111
