
statsmodels.api raises MissingDataError: exog contains inf or nans when fitting a multiple regression

Stack Overflow user
Asked 2021-03-21 18:27:49
1 answer, 4.6K views, 0 followers, 1 vote

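I am trying to fit a multiple linear regression model with statsmodels.api. I get the error MissingDataError: exog contains inf or nans. I have checked for NaNs and infs but did not find any. How is this possible? Why am I getting this error?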

Code

import statsmodels.api as sm
from sklearn.linear_model import LinearRegression
import pandas as pd
import numpy as np

df = pd.read_csv('clean_df.csv')
x_multi = df.drop('price', axis=1)  # Feature variables.
x_multi_cons = sm.add_constant(x_multi)  # Add a column of ones for the intercept.

I checked all the exog variables for NA values and did not find any.

x_multi_cons.isna().sum()

const                       0
crime_rate                  0
resid_area                  0
air_qual                    0
room_num                    0
age                         0
teachers                    0
poor_prop                   0
n_hos_beds                  8
n_hot_rooms                 0
rainfall                    0
parks                       0
avg_dist                    0
airport_YES                 0
waterbody_Lake              0
waterbody_Lake and River    0
waterbody_River             0
dtype: int64

I also checked the exog variables for inf values and found none.

np.isinf(x_multi_cons).sum()
const                       0
crime_rate                  0
resid_area                  0
air_qual                    0
room_num                    0
age                         0
teachers                    0
poor_prop                   0
n_hos_beds                  0
n_hot_rooms                 0
rainfall                    0
parks                       0
avg_dist                    0
airport_YES                 0
waterbody_Lake              0
waterbody_Lake and River    0
waterbody_River             0
dtype: int64

Here I fit the model.

y_multi = df['price']  # Dependent variable.
lm_multi = sm.OLS(y_multi, x_multi_cons).fit()

But I still get the error "MissingDataError: exog contains inf or nans". How is this possible?

ERROR: 
MissingDataError                          Traceback (most recent call last)
<ipython-input-67-ca6d2e9ba2c0> in <module>
----> 1 lm_multi = sm.OLS(y_multi, x_multi_cons).fit()

~/anaconda3/envs/python3/lib/python3.6/site-packages/statsmodels/regression/linear_model.py in __init__(self, endog, exog, missing, hasconst, **kwargs)
    871                  **kwargs):
    872         super(OLS, self).__init__(endog, exog, missing=missing,
--> 873                                   hasconst=hasconst, **kwargs)
    874         if "weights" in self._init_keys:
    875             self._init_keys.remove("weights")

~/anaconda3/envs/python3/lib/python3.6/site-packages/statsmodels/regression/linear_model.py in __init__(self, endog, exog, weights, missing, hasconst, **kwargs)
    702             weights = weights.squeeze()
    703         super(WLS, self).__init__(endog, exog, missing=missing,
--> 704                                   weights=weights, hasconst=hasconst, **kwargs)
    705         nobs = self.exog.shape[0]
    706         weights = self.weights

~/anaconda3/envs/python3/lib/python3.6/site-packages/statsmodels/regression/linear_model.py in __init__(self, endog, exog, **kwargs)
    188     """
    189     def __init__(self, endog, exog, **kwargs):
--> 190         super(RegressionModel, self).__init__(endog, exog, **kwargs)
    191         self._data_attr.extend(['pinv_wexog', 'weights'])
    192 

~/anaconda3/envs/python3/lib/python3.6/site-packages/statsmodels/base/model.py in __init__(self, endog, exog, **kwargs)
    235 
    236     def __init__(self, endog, exog=None, **kwargs):
--> 237         super(LikelihoodModel, self).__init__(endog, exog, **kwargs)
    238         self.initialize()
    239 

~/anaconda3/envs/python3/lib/python3.6/site-packages/statsmodels/base/model.py in __init__(self, endog, exog, **kwargs)
     76         hasconst = kwargs.pop('hasconst', None)
     77         self.data = self._handle_data(endog, exog, missing, hasconst,
---> 78                                       **kwargs)
     79         self.k_constant = self.data.k_constant
     80         self.exog = self.data.exog

~/anaconda3/envs/python3/lib/python3.6/site-packages/statsmodels/base/model.py in _handle_data(self, endog, exog, missing, hasconst, **kwargs)
     99 
    100     def _handle_data(self, endog, exog, missing, hasconst, **kwargs):
--> 101         data = handle_data(endog, exog, missing, hasconst, **kwargs)
    102         # kwargs arrays could have changed, easier to just attach here
    103         for key in kwargs:

~/anaconda3/envs/python3/lib/python3.6/site-packages/statsmodels/base/data.py in handle_data(endog, exog, missing, hasconst, **kwargs)
    671     klass = handle_data_class_factory(endog, exog)
    672     return klass(endog, exog=exog, missing=missing, hasconst=hasconst,
--> 673                  **kwargs)

~/anaconda3/envs/python3/lib/python3.6/site-packages/statsmodels/base/data.py in __init__(self, endog, exog, missing, hasconst, **kwargs)
     85         self.const_idx = None
     86         self.k_constant = 0
---> 87         self._handle_constant(hasconst)
     88         self._check_integrity()
     89         self._cache = {}

~/anaconda3/envs/python3/lib/python3.6/site-packages/statsmodels/base/data.py in _handle_constant(self, hasconst)
    131             exog_max = np.max(self.exog, axis=0)
    132             if not np.isfinite(exog_max).all():
--> 133                 raise MissingDataError('exog contains inf or nans')
    134             exog_min = np.min(self.exog, axis=0)
    135             const_idx = np.where(exog_max == exog_min)[0].squeeze()

MissingDataError: exog contains inf or nans
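
For reference, the last frame of the traceback shows the check that fails: statsmodels takes the column-wise maximum of exog and raises if any of those maxima is not finite. The sketch below reproduces only that test with NumPy on a small hypothetical array (the values are made up for illustration, not taken from clean_df.csv). It suggests that a single NaN in any column is enough to trigger the error, even when np.isinf reports zero everywhere.

```python
import numpy as np

# Hypothetical 3x2 design matrix: a constant column and one feature
# with a single missing value.
exog = np.array([[1.0, 5.0],
                 [1.0, np.nan],
                 [1.0, 7.0]])

# np.max propagates NaN, so the maximum of the second column is NaN.
exog_max = np.max(exog, axis=0)

# Same test as in the traceback: a non-finite column maximum means
# the data contains inf or NaN somewhere in that column.
has_bad_values = not np.isfinite(exog_max).all()

print(exog_max)
print(has_bad_values)
```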

1 Answer

Stack Overflow user

Accepted answer

Answered 2021-03-22 17:30:20

If you look at your table, I'm not quite sure how you concluded that there are no NA values:

x_multi_cons.isna().sum()

[...]
n_hos_beds                  8
[...]

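This means that n_hos_beds is missing 8 values. If it doesn't hurt your model, simply drop the NaNs at the start, before building the feature matrix: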

df = df.dropna()
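
As a rough sketch of where that call fits, the example below uses a small hypothetical DataFrame in place of clean_df.csv (the column values are invented for illustration). It lists only the columns that actually contain missing values, then drops the incomplete rows before splitting into features and target, so that both stay aligned on the same rows.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for clean_df.csv with one missing n_hos_beds value.
df = pd.DataFrame({
    "price": [24.0, 21.6, 34.7, 33.4],
    "room_num": [6.5, 6.4, 7.1, 6.9],
    "n_hos_beds": [5.4, np.nan, 7.3, 9.2],
})

# Show only the columns whose missing-value count is nonzero.
na_counts = df.isna().sum()
print(na_counts[na_counts > 0])

# Drop incomplete rows first, then split, so x and y share the same index.
df_clean = df.dropna()
x_multi = df_clean.drop("price", axis=1)
y_multi = df_clean["price"]

print(len(df), len(df_clean))
print(np.isfinite(x_multi.to_numpy()).all())
```

The cleaned x_multi and y_multi can then go through sm.add_constant and sm.OLS as in the question. Alternatively, the OLS constructor accepts missing='drop', which asks statsmodels to drop the incomplete rows itself.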
Votes: 2
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/66731213
