
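Q: Different output from SAS and R in factorial logistic regression

Stack Overflow user

Asked 2018-10-22 22:55:25

Answers: 1 · Views: 806 · Followers: 0 · Votes: 1

I am trying to run these factorial logistic regressions in both SAS and R, but I get different results for the model dry = rt*chi_ur. Why?

My data: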

id  dry rt  chi_ur
1   1   0   1
2   0   0   0
3   0   0   0
4   0   0   0
5   0   0   1
6   0   0   0
7   0   0   0
8   0   0   1
9   0   0   0
10  0   0   0
11  0   0   0
12  0   0   0
13  1   0   0
14  0   0   0
15  0   0   1
16  0   0   1
17  0   0   0
18  1   0   0
19  0   0   0
20  0   0   0
21  0   0   1
22  1   1   0
23  0   1   1
24  0   0   1
25  0   0   1
26  1   0   0
27  1   0   0
28  0   0   0
29  1   0   0
30  1   0   0
31  1   0   1
32  1   0   0
33  0   0   0
34  1   0   0
35  0   0   0
36  0   0   1
37  1   0   0
38  1   0   0
39  0   0   1
40  0   1   0
41  0   1   0
42  1   1   0
43  0   1   0
44  0   0   0
45  0   0   0
46  0   0   1
47  0   0   0
48  0   0   1
49  1   0   0
50  0   0   1
51  0   0   0
52  1   0   0
53  1   0   0
54  1   0   0
55  1   0   0
56  0   0   0
57  1   0   0
58  0   0   0
59  1   0   0
60  1   0   0
61  0   0   0
62  0   1   0
63  0   0   0
64  0   0   0
65  1   1   0
66  0   0   0
67  1   0   0
68  1   0   0
69  1   0   0
70  1   0   0
71  1   0   0
72  1   0   0
73  1   0   0
74  1   0   0
75  1   0   0
76  1   0   0
77  0   1   0
78  1   0   0
79  0   1   0
80  0   1   0
81  1   0   0
82  1   0   0
83  1   0   0
84  1   0   0
85  1   0   0
86  0   0   1
87  1   0   0
88  1   0   0
89  1   0   0
90  1   0   1
91  1   0   
92  1   0   
93  0   0   
94  0   1   
95  0   1   
96  0   1   
97  1   0   
98  1   0

R code:

summary(glm(dry ~ chi_ur, data = en, family = binomial))
summary(glm(dry ~ rt, data = en, family = binomial))
summary(glm(dry ~ rt*chi_ur, data = en, family = binomial))

SAS code:

proc logistic data = en.en1 desc;
class chi_ur ;
model dry = chi_ur / expb;
run;

proc logistic data = en.en1 desc;
class rt ;
model dry = rt / expb;
run;

proc logistic data = en.en1 desc;
class rt chi_ur ;
model dry = rt chi_ur rt*chi_ur/ expb;
run;

My R results:

> summary(glm(dry ~ chi_ur, data = en, family = binomial))

Call:
glm(formula = dry ~ chi_ur, family = binomial, data = en)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.2601  -1.2601  -0.6231   1.0969   1.8626  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)  
(Intercept)   0.1924     0.2352   0.818   0.4133  
chi_ur       -1.7328     0.6782  -2.555   0.0106 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 124.59  on 89  degrees of freedom
Residual deviance: 116.37  on 88  degrees of freedom
  (8 observations deleted due to missingness)
AIC: 120.37

Number of Fisher Scoring iterations: 3

> summary(glm(dry ~ rt, data = en, family = binomial))

Call:
glm(formula = dry ~ rt, family = binomial, data = en)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.2181  -1.2181  -0.6945   1.1372   1.7552  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)  
(Intercept)  0.09531    0.21847   0.436   0.6626  
rt          -1.39459    0.68700  -2.030   0.0424 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 135.69  on 97  degrees of freedom
Residual deviance: 130.81  on 96  degrees of freedom
AIC: 134.81

Number of Fisher Scoring iterations: 4

> summary(glm(dry ~ rt*chi_ur, data = en, family = binomial))

Call:
glm(formula = dry ~ rt * chi_ur, family = binomial, data = en)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.3304  -1.3304  -0.6444   1.0317   1.8297  

Coefficients:
             Estimate Std. Error z value Pr(>|z|)   
(Intercept)    0.3528     0.2559   1.379  0.16798   
rt            -1.2001     0.7360  -1.631  0.10297   
chi_ur        -1.8192     0.6897  -2.637  0.00835 **
rt:chi_ur    -12.8996  1455.3979  -0.009  0.99293   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 124.59  on 89  degrees of freedom
Residual deviance: 113.07  on 86  degrees of freedom
  (8 observations deleted due to missingness)
AIC: 121.07

Number of Fisher Scoring iterations: 14

My SAS results:

The SAS System     

The LOGISTIC Procedure

Model Information 
Data Set EN.EN1 
Response Variable dry 
Number of Response Levels 2 
Model binary logit 
Optimization Technique Fisher's scoring     

Number of Observations Read 98 
Number of Observations Used 90    

Response Profile
Ordered Value   dry   Total Frequency
1               1     43
2               0     47

Probability modeled is dry='1'.

Note: 8 observations were deleted due to missing values for the response or explanatory variables.

Class Level Information
Class    Value   Design Variables
chi_ur   0        1
         1       -1


Model Convergence Status 
Convergence criterion (GCONV=1E-8) satisfied.        

Model Fit Statistics
Criterion   Intercept Only   Intercept and Covariates
AIC         126.589          120.371
SC          129.088          125.371
-2 Log L    124.589          116.371

Testing Global Null Hypothesis: BETA=0 
Test Chi-Square DF Pr > ChiSq 
Likelihood Ratio 8.2175 1 0.0041 
Score 7.6262 1 0.0058 
Wald 6.5262 1 0.0106    

Type 3 Analysis of Effects
Effect   DF   Wald Chi-Square   Pr > ChiSq
chi_ur   1    6.5262            0.0106

Analysis of Maximum Likelihood Estimates
Parameter   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq   Exp(Est)
Intercept   1    -0.6740    0.3391           3.9498            0.0469       0.510
chi_ur 0    1     0.8664    0.3391           6.5262            0.0106       2.378

Odds Ratio Estimates
Effect          Point Estimate   95% Wald Confidence Limits
chi_ur 0 vs 1   5.656            1.497   21.372

Association of Predicted Probabilities and Observed Responses
Percent Concordant   27.7   Somers' D   0.228
Percent Discordant    4.9   Gamma       0.700
Percent Tied         67.4   Tau-a       0.115
Pairs                2021   c           0.614
  --------------------------------------------------------------------------------
The SAS System 


The LOGISTIC Procedure

Model Information 
Data Set EN.EN1 
Response Variable dry 
Number of Response Levels 2 
Model binary logit 
Optimization Technique Fisher's scoring 

Number of Observations Read 98 
Number of Observations Used 98      

Response Profile
Ordered Value   dry   Total Frequency
1               1     47
2               0     51

Probability modeled is dry='1'.

Class Level Information
Class   Value   Design Variables
rt      0        1
        1       -1

Model Convergence Status 
Convergence criterion (GCONV=1E-8) satisfied. 

Model Fit Statistics
Criterion   Intercept Only   Intercept and Covariates
AIC         137.694          134.806
SC          140.279          139.976
-2 Log L    135.694          130.806

Testing Global Null Hypothesis: BETA=0 
Test Chi-Square DF Pr > ChiSq 
Likelihood Ratio 4.8871 1 0.0271 
Score 4.6063 1 0.0319 
Wald 4.1208 1 0.0424 

Type 3 Analysis of Effects
Effect   DF   Wald Chi-Square   Pr > ChiSq
rt       1    4.1208            0.0424

Analysis of Maximum Likelihood Estimates
Parameter   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq   Exp(Est)
Intercept   1    -0.6020    0.3435           3.0712            0.0797       0.548
rt 0        1     0.6973    0.3435           4.1208            0.0424       2.008

Odds Ratio Estimates
Effect      Point Estimate   95% Wald Confidence Limits
rt 0 vs 1   4.033            1.049   15.504

Association of Predicted Probabilities and Observed Responses
Percent Concordant   20.2   Somers' D   0.152
Percent Discordant    5.0   Gamma       0.603
Percent Tied         74.8   Tau-a       0.077
Pairs                2397   c           0.576

--------------------------------------------------------------------------------
The SAS System 

The LOGISTIC Procedure

Model Information 
Data Set EN.EN1 
Response Variable dry 
Number of Response Levels 2 
Model binary logit 
Optimization Technique Fisher's scoring 

Number of Observations Read 98 
Number of Observations Used 90 

Response Profile
Ordered Value   dry   Total Frequency
1               1     43
2               0     47

Probability modeled is dry='1'.

Note: 8 observations were deleted due to missing values for the response or explanatory variables.

Class Level Information
Class    Value   Design Variables
rt       0        1
         1       -1
chi_ur   0        1
         1       -1

Model Convergence Status 
Quasi-complete separation of data points detected. 

Warning: The maximum likelihood estimate may not exist. 


Warning: The LOGISTIC procedure continues in spite of the above warning. Results shown are based on the last maximum likelihood iteration. Validity of the model fit is questionable. 


Model Fit Statistics
Criterion   Intercept Only   Intercept and Covariates
AIC         126.589          121.066
SC          129.088          131.065
-2 Log L    124.589          113.066

Testing Global Null Hypothesis: BETA=0 
Test Chi-Square DF Pr > ChiSq 
Likelihood Ratio 11.5228 3 0.0092 
Score 10.6138 3 0.0140 
Wald 8.6501 3 0.0343       

Joint Tests
Effect      DF   Wald Chi-Square   Pr > ChiSq
rt          1    0.0007            0.9787
chi_ur      1    0.0009            0.9765
rt*chi_ur   1    0.0005            0.9830

Note: Under full-rank parameterizations, Type 3 effect tests are replaced by joint tests. The joint test for an effect is a test that all the parameters associated with that effect are zero. Such joint tests might not be equivalent to Type 3 effect tests under GLM parameterization.

Analysis of Maximum Likelihood Estimates
Parameter       DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq   Exp(Est)
Intercept       1    -3.5417    111.8            0.0010            0.9747        0.029
rt 0            1     2.9849    111.8            0.0007            0.9787       19.785
chi_ur 0        1     3.2945    111.8            0.0009            0.9765       26.963
rt*chi_ur 0 0   1    -2.3849    111.8            0.0005            0.9830        0.092

Association of Predicted Probabilities and Observed Responses
Percent Concordant   40.7   Somers' D   0.319
Percent Discordant    8.8   Gamma       0.646
Percent Tied         50.6   Tau-a       0.161
Pairs                2021   c           0.660

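I find it a bit suspicious that the standard errors of the maximum likelihood estimates in the SAS analysis are all identical...

Any ideas? How can I fix this? Thanks!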
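
The quasi-complete separation warning and the identical standard errors can be traced to a single sparse cell. The sketch below is not from the original thread: the cell counts were tallied by hand from the 90 complete rows of the posted data, and the coefficients are copied from the printed R and SAS output. It converts each package's estimates back into one logit per rt × chi_ur cell, assuming the SAS coding shown under "Class Level Information" (level 0 is coded 1, level 1 is coded -1).

```r
# Cell counts tallied by hand from the 90 complete rows of the posted data
cells <- data.frame(
  rt        = c(0, 0, 1, 1),
  chi_ur    = c(0, 1, 0, 1),
  events    = c(37, 3, 3, 0),   # rows with dry == 1
  nonevents = c(26, 13, 7, 1)   # rows with dry == 0
)
cells$logit <- log(cells$events / cells$nonevents)  # -Inf where events == 0

# SAS effect coding from "Class Level Information": level 0 -> +1, level 1 -> -1
s_rt  <- ifelse(cells$rt == 0, 1, -1)
s_chi <- ifelse(cells$chi_ur == 0, 1, -1)

# Cell logits implied by the printed estimates of each package
cells$sas_logit <- -3.5417 + 2.9849 * s_rt + 3.2945 * s_chi - 2.3849 * s_rt * s_chi
cells$r_logit   <- 0.3528 - 1.2001 * cells$rt - 1.8192 * cells$chi_ur -
  12.8996 * cells$rt * cells$chi_ur
print(cells)

# Approximate SAS standard error: in a saturated 2x2 model each effect-coded
# parameter is a +/- 1/4 combination of the four cell logits, and the variance
# of a fitted cell logit is 1 / (n * p * (1 - p)).
p_hat   <- plogis(cells$sas_logit)
n       <- cells$events + cells$nonevents
se_each <- sqrt(sum(1 / (n * p_hat * (1 - p_hat)))) / 4
print(se_each)
```

Under these assumptions the two packages agree on the three cells that contain both outcomes. They differ only in the rt = 1, chi_ur = 1 cell, which holds a single observation with dry = 0, so its logit has no finite maximum likelihood estimate and each program simply reports the value at which its iterations stopped. The variance of that one cell dominates every effect-coded parameter equally, which is consistent with the repeated 111.8 in the SAS table. Possible remedies, none of which were tried in the thread, include dropping the interaction term or using a penalized fit such as Firth's method (the FIRTH option on the MODEL statement in PROC LOGISTIC, or the logistf package in R).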

sas

regression

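Stack Overflow user

Answered 2018-10-22 23:06:21

I suspect this is because you did not specify the PARAM= and REF= options on the CLASS statement in PROC LOGISTIC, so SAS uses a different parameterization from R. R also does not state what the 'event' is; assuming it models dry = 1, the results should then be similar.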

class rt (param=ref);
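
To illustrate the parameterization point, here is a minimal sketch (not part of the original answer) that fits the single-factor chi_ur model in R under both codings. It assumes cell counts tallied by hand from the 90 complete rows of the posted data, and it uses `contr.sum` as a stand-in for the SAS default effect coding, in which level 0 is coded 1 and level 1 is coded -1.

```r
# Cell counts tallied by hand from the 90 complete rows of the posted data
cells <- data.frame(
  chi_ur    = factor(c(0, 1)),
  events    = c(40, 3),    # rows with dry == 1
  nonevents = c(33, 14)    # rows with dry == 0
)

# Reference (treatment) coding, R's default: level 0 is the baseline
fit_ref <- glm(cbind(events, nonevents) ~ chi_ur,
               family = binomial, data = cells)

# Effect (sum-to-zero) coding: level 0 -> +1, level 1 -> -1
fit_eff <- glm(cbind(events, nonevents) ~ chi_ur,
               family = binomial, data = cells,
               contrasts = list(chi_ur = "contr.sum"))

b_ref <- unname(coef(fit_ref))  # compare with the R output in the question
b_eff <- unname(coef(fit_eff))  # compare with the SAS output in the question

print(round(b_ref, 4))
print(round(b_eff, 4))

# The reference-coded slope is -2 times the effect-coded one
print(b_ref[2] + 2 * b_eff[2])
```

Both fits describe the same model: the effect-coded intercept is the average of the two group logits, and the slope is half their difference. Note that with `param=ref` alone, PROC LOGISTIC takes the last level as the reference, so the sign of the coefficient would still be opposite to R's unless the reference is set to the first level, for example with `ref='0'`.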

Votes: 0

Original content from Stack Overflow: https://stackoverflow.com/questions/52932259
