
What mistake am I making in this CV (cross-validation) code?

Stack Overflow user
Asked on 2022-08-17 16:38:00
1 answer · 71 views · 0 followers · 0 votes

I am trying to do CV (cross-validation) on my training and testing datasets. I am using LinearRegression. However, when I run the code I get the error below, yet when I run the same code with a decision tree there is no error and it works. How do I fix this? Is my CV code correct? Thank you for your help.

CV code reference: scikit-learn cross_validation over-fitting or under-fitting

data_set = pd.read_excel("NEW Collected Data for Preliminary Results Independant variables ONLY_NO AREA_NO_INFILL_DENSITY_no_printing_temperature.xlsx")
pd.set_option('max_columns', 35)
pd.set_option('max_rows', 300)
data_set.head(300)


X, y = data_set[[ "Part's Z-Height (mm)","Part's Solid Volume (cm^3)","Layer Height (mm)","Printing/Scanning Speed (mm/s)","Part's Orientation (Support's volume) (cm^3)"]], data_set [["Climate change (kg CO2 eq.)","Climate change, incl biogenic carbon (kg CO2 eq.)","Fine Particulate Matter Formation (kg PM2.5 eq.)","Fossil depletion (kg oil eq.)","Freshwater Consumption (m^3)","Freshwater ecotoxicity (kg 1,4-DB eq.)","Freshwater Eutrophication (kg P eq.)","Human toxicity, cancer (kg 1,4-DB eq.)","Human toxicity, non-cancer (kg 1,4-DB eq.)","Ionizing Radiation (Bq. C-60 eq. to air)","Land use (Annual crop eq. yr)","Marine ecotoxicity (kg 1,4-DB eq.)","Marine Eutrophication (kg N eq.)","Metal depletion (kg Cu eq.)","Photochemical Ozone Formation, Ecosystem (kg NOx eq.)","Photochemical Ozone Formation, Human Health (kg NOx eq.)","Stratospheric Ozone Depletion (kg CFC-11 eq.)","Terrestrial Acidification (kg SO2 eq.)","Terrestrial ecotoxicity (kg 1,4-DB eq.)"]]
   scaler = preprocessing.MinMaxScaler()
    names = data_set.columns
    d = scaler.fit_transform(data_set)
    scaled_df = pd.DataFrame(d, columns=names)
    X_normalized, y_for_normalized = scaled_df[[ "Part's Z-Height (mm)","Part's Solid Volume (cm^3)","Layer Height (mm)","Printing/Scanning Speed (mm/s)","Part's Orientation (Support's volume) (cm^3)"]], scaled_df [["Climate change (kg CO2 eq.)","Climate change, incl biogenic carbon (kg CO2 eq.)","Fine Particulate Matter Formation (kg PM2.5 eq.)","Fossil depletion (kg oil eq.)","Freshwater Consumption (m^3)","Freshwater ecotoxicity (kg 1,4-DB eq.)","Freshwater Eutrophication (kg P eq.)","Human toxicity, cancer (kg 1,4-DB eq.)","Human toxicity, non-cancer (kg 1,4-DB eq.)","Ionizing Radiation (Bq. C-60 eq. to air)","Land use (Annual crop eq. yr)","Marine ecotoxicity (kg 1,4-DB eq.)","Marine Eutrophication (kg N eq.)","Metal depletion (kg Cu eq.)","Photochemical Ozone Formation, Ecosystem (kg NOx eq.)","Photochemical Ozone Formation, Human Health (kg NOx eq.)","Stratospheric Ozone Depletion (kg CFC-11 eq.)","Terrestrial Acidification (kg SO2 eq.)","Terrestrial ecotoxicity (kg 1,4-DB eq.)"]]
    scaled_df.head(200) 
Part's Z-Height (mm)    Part's Solid Volume (cm^3)  Layer Height (mm)   Printing/Scanning Speed (mm/s)  Part's Orientation (Support's volume) (cm^3)    Climate change (kg CO2 eq.) Climate change, incl biogenic carbon (kg CO2 eq.)   Fine Particulate Matter Formation (kg PM2.5 eq.)    Fossil depletion (kg oil eq.)   Freshwater Consumption (m^3)    Freshwater ecotoxicity (kg 1,4-DB eq.)  Freshwater Eutrophication (kg P eq.)    Human toxicity, cancer (kg 1,4-DB eq.)  Human toxicity, non-cancer (kg 1,4-DB eq.)  Ionizing Radiation (Bq. C-60 eq. to air)    Land use (Annual crop eq. yr)   Marine ecotoxicity (kg 1,4-DB eq.)  Marine Eutrophication (kg N eq.)    Metal depletion (kg Cu eq.) Photochemical Ozone Formation, Ecosystem (kg NOx eq.)   Photochemical Ozone Formation, Human Health (kg NOx eq.)    Stratospheric Ozone Depletion (kg CFC-11 eq.)   Terrestrial Acidification (kg SO2 eq.)  Terrestrial ecotoxicity (kg 1,4-DB eq.)
0   0.258287    0.005030    0.0 0.666667    0.040088    0.069825    0.056976    0.083205    0.010373    0.113808    0.104798    0.086400    0.110358    0.012836    0.091120    0.108676    0.090401    0.087426    0.125608    0.079028    0.080495    0.078380    0.082404    0.045040
1   0.258287    0.005030    0.2 0.666667    0.036597    0.041682    0.022880    0.074884    0.004841    0.045640    0.102285    0.082884    0.044202    0.005414    0.086700    0.105749    0.087161    0.084130    0.060373    0.072878    0.073529    0.074829    0.075438    0.018122
2   0.258287    0.009557    0.4 0.666667    0.031013    0.033310    0.012113    0.073035    0.003458    0.023401    0.102914    0.082494    0.022690    0.003231    0.086279    0.105749    0.086937    0.084130    0.039708    0.071341    0.071981    0.074698    0.073447    0.009856
3   0.258287    0.009054    0.6 0.666667    0.031013    0.029213    0.006954    0.072111    0.002766    0.012936    0.102914    0.082103    0.012524    0.001921    0.086069    0.105423    0.086602    0.084130    0.029579    0.070572    0.071207    0.074435    0.072452    0.005723
4   0.258287    0.010060    1.0 0.666667    0.031711    0.025650    0.001795    0.071803    0.003458    0.002180    0.103542    0.082884    0.002063    0.001048    0.086490    0.106074    0.087049    0.084542    0.019449    0.070572    0.071207    0.074961    0.072452    0.001908
5   0.258287    0.005030    0.0 0.000000    0.040088    0.074279    0.062360    0.084129    0.011065    0.125000    0.104798    0.086790    0.121114    0.014146    0.091330    0.108676    0.091519    0.087426    0.136143    0.080566    0.081269    0.078511    0.083400    0.049385
6   0.258287    0.038226    0.0 0.666667    0.040088    0.097791    0.074249    0.109091    0.038036    0.135174    0.129299    0.111788    0.132164    0.024625    0.116582    0.133725    0.116102    0.112970    0.154781    0.105166    0.106037    0.104419    0.108280    0.064222
7   0.137212    0.004527    0.0 0.666667    0.030314    0.058247    0.046433    0.076117    0.003458    0.095349    0.099144    0.080150    0.092382    0.008907    0.084806    0.102821    0.084702    0.081246    0.106159    0.072878    0.073529    0.072199    0.075438    0.035608
8   0.137212    0.004527    0.2 0.666667    0.029616    0.035269    0.017721    0.069954    0.000000    0.037355    0.098516    0.078197    0.036246    0.002794    0.082281    0.101520    0.082803    0.080010    0.051053    0.068266    0.068885    0.070489    0.070462    0.013247
9   0.137212    0.010060    0.4 0.666667    0.028918    0.031706    0.010543    0.072111    0.002766    0.020494    0.102285    0.081712    0.019891    0.002358    0.085438    0.104773    0.086043    0.083306    0.036467    0.070572    0.071207    0.073908    0.072452    0.008372
10  0.137212    0.010060    0.6 0.666667    0.028220    0.027431    0.005384    0.070878    0.001383    0.010320    0.101657    0.080931    0.010019    0.001484    0.084806    0.104448    0.085373    0.082894    0.026742    0.069803    0.070433    0.073251    0.071457    0.004345
11  0.137212    0.009557    1.0 0.666667    0.027522    0.022800    0.000000    0.069029    0.000000    0.000000    0.101029    0.080150    0.000000    0.000000    0.083754    0.103472    0.084367    0.081658    0.016613    0.068266    0.068885    0.072330    0.070462    0.000000
12  0.137212    0.004527    0.0 0.000000    0.030314    0.062879    0.052266    0.077042    0.004149    0.107122    0.099144    0.080541    0.103875    0.010217    0.085227    0.102821    0.085037    0.081658    0.117099    0.073647    0.074303    0.072462    0.076433    0.040165
13  0.137212    0.037723    0.0 0.666667    0.030314    0.085857    0.063257    0.102003    0.031120    0.116134    0.123645    0.105929    0.112568    0.020695    0.110269    0.127544    0.110515    0.106790    0.134522    0.098247    0.099071    0.097843    0.101314    0.053624
14  0.077118    0.004527    0.0 0.666667    0.054050    0.080335    0.064827    0.091217    0.018672    0.126453    0.111709    0.093821    0.122145    0.016766    0.098485    0.115833    0.098223    0.094842    0.139789    0.087485    0.088235    0.085876    0.090366    0.052777
15  0.077118    0.004527    0.0 0.000000    0.054050    0.085144    0.070884    0.092450    0.019364    0.138081    0.111709    0.094211    0.133638    0.018075    0.099116    0.116158    0.098223    0.094842    0.151135    0.088253    0.089009    0.086139    0.091361    0.057864
16  0.077118    0.004527    0.0 0.333333    0.054050    0.082472    0.067519    0.091834    0.019364    0.132267    0.111709    0.094211    0.127744    0.017639    0.098695    0.116158    0.098223    0.094842    0.144652    0.087485    0.088235    0.086007    0.091361    0.054684
     lin_regressor = LinearRegression()
    
    # pass the order of your polynomial here  
    poly = PolynomialFeatures(1)
    
    # convert to be used further to linear regression
    X_transform = poly.fit_transform(x_train)
    
    # fit this to Linear Regressor
    linear_regg = lin_regressor.fit(X_transform, y_train)
    import numpy as np
    from sklearn.metrics import SCORERS
    from sklearn.model_selection import KFold
    
    scorer = SCORERS['r2']
    
    cv = KFold(n_splits=5, random_state=0,shuffle=True)
    train_scores, test_scores = [], []
    
    for train, test in cv.split(X_normalized):
        X_transform2 = poly.fit_transform(X_normalized)
        OL=lin_regressor.fit(X_transform2.iloc[train], y_for_normalized.iloc[train])
        tr_21 = OL.score(X_train, y_train)
        ts_21 = OL.score(X_test, y_test)
        print ("Train score:", tr_21) # from documentation .score returns r^2
        print ("Test score:", ts_21)   # from documentation .score returns r^2
        
        train_scores.append(tr_21)
        test_scores.append(ts_21)


    
    print ("The Mean for Train scores is:",(np.mean(train_scores)))
        
    print ("The Mean for Test scores is:",(np.mean(test_scores)))

Error message:

        --------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    /var/folders/mm/r4gnnwl948zclfyx12w803040000gn/T/ipykernel_73165/2276765730.py in <module>
         10 for train, test in cv.split(X_normalized):
         11     X_transform2 = poly.fit_transform(X_normalized)
    ---> 12     OL=lin_regressor.fit(X_transform2.iloc[train], y_for_normalized.iloc[train])
         13     tr_21 = OL.score(X_train, y_train)
         14     ts_21 = OL.score(X_test, y_test)
    
    AttributeError: 'numpy.ndarray' object has no attribute 'iloc'

Decision tree

    new_model = DecisionTreeRegressor(max_depth=9,
                                      min_samples_split=10,random_state=0)


    import numpy as np
    from sklearn.metrics import SCORERS
    from sklearn.model_selection import KFold
     
    scorer = SCORERS['r2']
     
    cv = KFold(n_splits=5, random_state=0,shuffle=True)
    train_scores, test_scores = [], []
     
    for train, test in cv.split(X_normalized):
     
        OO=new_model.fit(X_normalized.iloc[train], y_for_normalized.iloc[train])
        tr_2 = OO.score(X_train, y_train)
        ts_2 = OO.score(X_test, y_test)
        print ("Train score:", tr_2) # from documentation .score returns r^2
        print ("Test score:", ts_2)   # from documentation .score returns r^2
         
        train_scores.append(tr_2)
        test_scores.append(ts_2)
     
         
         
    print ("The Mean for Train scores is:",(np.mean(train_scores)))
         
    print ("The Mean for Test scores is:",(np.mean(test_scores)))

Output

    Train score: 0.8960560474997927
    Test score: -0.15521696464773224
    Train score: 0.8852795454592853
    Test score: 0.17650772852710495
    Train score: 0.5825347735306872
    Test score: 0.34789159049344665
    Train score: 0.8549575808716975
    Test score: 0.7615265842042157
    Train score: 0.8340261480334055
    Test score: 0.14011826401728472
    The Mean for Train scores is: 0.8105708190789735
    The Mean for Test scores is: 0.2541654405188639

# Attempt 1

import numpy as np
from sklearn.metrics import SCORERS
from sklearn.model_selection import KFold

scorer = SCORERS['r2']

cv = KFold(n_splits=5, random_state=0,shuffle=True)
train_scores, test_scores = [], []

for train, test in cv.split(X_normalized):
    X_transform2 = poly.fit_transform(X_normalized)
    OL=lin_regressor.fit(X_transform2[train], y_for_normalized[train])
    tr_21 = OL.score(X_train, y_train)
    ts_21 = OL.score(X_test, y_test)
    print ("Train score:", tr_21) # from documentation .score returns r^2
    print ("Test score:", ts_21)   # from documentation .score returns r^2
    
    train_scores.append(tr_21)
    test_scores.append(ts_21)


    
print ("The Mean for Train scores is:",(np.mean(train_scores)))
    
print ("The Mean for Test scores is:",(np.mean(test_scores)))

Error message:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/var/folders/mm/r4gnnwl948zclfyx12w803040000gn/T/ipykernel_90924/12176184.py in <module>
     10 for train, test in cv.split(X_normalized):
     11     X_transform2 = poly.fit_transform(X_normalized)
---> 12     OL=lin_regressor.fit(X_transform2[train], y_for_normalized[train])
     13     tr_21 = OL.score(X_train, y_train)
     14     ts_21 = OL.score(X_test, y_test)

~/opt/anaconda3/lib/python3.9/site-packages/pandas/core/frame.py in __getitem__(self, key)
   3462             if is_iterator(key):
   3463                 key = list(key)
-> 3464             indexer = self.loc._get_listlike_indexer(key, axis=1)[1]
   3465 
   3466         # take() does not accept boolean indexers

~/opt/anaconda3/lib/python3.9/site-packages/pandas/core/indexing.py in _get_listlike_indexer(self, key, axis)
   1312             keyarr, indexer, new_indexer = ax._reindex_non_unique(keyarr)
   1313 
-> 1314         self._validate_read_indexer(keyarr, indexer, axis)
   1315 
   1316         if needs_i8_conversion(ax.dtype) or isinstance(

~/opt/anaconda3/lib/python3.9/site-packages/pandas/core/indexing.py in _validate_read_indexer(self, key, indexer, axis)
   1372                 if use_interval_msg:
   1373                     key = list(key)
-> 1374                 raise KeyError(f"None of [{key}] are in the [{axis_name}]")
   1375 
   1376             not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())

KeyError: "None of [Int64Index([ 0,  1,  3,  4,  5,  6,  9, 10, 11, 12, 14, 15, 17, 18, 19, 20, 21,\n            23, 25, 27, 28, 29, 31, 32, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,\n            44, 45, 46, 47, 48, 49, 50, 51, 52, 56, 57, 58, 59, 60, 61, 62, 63,\n            64, 65, 66, 67, 68, 69, 70, 71, 72, 74, 76, 77, 79, 80, 81, 82, 83,\n            84, 85, 87, 88, 89, 90, 91, 94, 96, 97, 98, 99],\n           dtype='int64')] are in the [columns]"

1 Answer

Stack Overflow user

Accepted answer

Answered on 2022-08-17 21:10:29

Understanding

  • poly.fit_transform returns a numpy.ndarray, so here your X_normalized is converted from a pandas.core.frame.DataFrame into a numpy.ndarray, while your y_for_normalized is still a pandas.core.frame.DataFrame.

  • So, for the numpy.ndarray you pass fold indexes as numpy.ndarray[indexes], while for the pandas.core.frame.DataFrame you pass them as .iloc[indexes], respectively.
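The type mismatch above can be reproduced in a few lines. A minimal sketch, using a hypothetical toy frame standing in for X_normalized:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical toy frame standing in for X_normalized.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

poly = PolynomialFeatures(degree=1)
arr = poly.fit_transform(df)   # returns a plain numpy.ndarray, not a DataFrame

print(type(arr))               # <class 'numpy.ndarray'>
print(hasattr(arr, "iloc"))    # False -- arr.iloc[...] raises AttributeError

idx = [0, 2]                   # fold indexes, as produced by KFold.split
print(arr[idx])                # ndarrays take positional indexes directly
print(df.iloc[idx])            # DataFrames take them through .iloc
```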

Solution

  • For X_transform2, use [] to index the data, since it is a numpy.ndarray; for y_for_normalized, use .iloc[], since it is a pandas.core.frame.DataFrame.

Code

train_scores, test_scores = [], []

for train, test in cv.split(X_normalized):
    X_transform2 = poly.fit_transform(X_normalized)
    # [] for X_transform2, .iloc[] for y_for_normalized
    OL = lin_regressor.fit(X_transform2[train], y_for_normalized.iloc[train])
    tr_21 = OL.score(X_transform2[train], y_for_normalized.iloc[train])
    ts_21 = OL.score(X_transform2[test], y_for_normalized.iloc[test])
    print("Train score:", tr_21)  # from documentation .score returns r^2
    print("Test score:", ts_21)  # from documentation .score returns r^2

    train_scores.append(tr_21)
    test_scores.append(ts_21)


print("The Mean for Train scores is:", (np.mean(train_scores)))

print("The Mean for Test scores is:", (np.mean(test_scores)))
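As a side note, the manual KFold loop can be collapsed into cross_validate with a Pipeline, which also re-fits PolynomialFeatures inside each fold rather than on the full dataset before splitting. A sketch on synthetic stand-in data (make_regression here is only a placeholder for the real X_normalized / y_for_normalized):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-ins for X_normalized / y_for_normalized.
X_demo, y_demo = make_regression(n_samples=100, n_features=5, noise=0.1,
                                 random_state=0)

# The pipeline applies PolynomialFeatures per fold, avoiding the
# ndarray-vs-DataFrame indexing issue entirely.
model = make_pipeline(PolynomialFeatures(degree=1), LinearRegression())
cv = KFold(n_splits=5, shuffle=True, random_state=0)
res = cross_validate(model, X_demo, y_demo, cv=cv, scoring="r2",
                     return_train_score=True)

print("Mean train R^2:", np.mean(res["train_score"]))
print("Mean test R^2:", np.mean(res["test_score"]))
```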

PS:

  • Not sure why you use X_train, y_train and X_test, y_test in OL.score. They should be the fold datasets built from the train and test indexes produced by cv, as reflected in the snippet above. If you defined X_train, y_train, X_test, y_test for a specific reason, then you are fine to use them.

  • Why use PolynomialFeatures() when you want all features at degree 1? PolynomialFeatures() with degree 1 makes no difference.

  • Also check the deprecation warning for SCORERS if you use a newer version of sklearn.
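The last two points can be checked directly. A quick sketch (get_scorer is the lookup that newer scikit-learn versions offer in place of the SCORERS dict):

```python
import numpy as np
from sklearn.metrics import get_scorer
from sklearn.preprocessing import PolynomialFeatures

# degree=1 only prepends the bias column of ones; the features are unchanged.
X = np.array([[2.0, 3.0],
              [4.0, 5.0]])
X1 = PolynomialFeatures(degree=1).fit_transform(X)
print(X1)
# [[1. 2. 3.]
#  [1. 4. 5.]]

# Newer scikit-learn: look scorers up by name instead of the SCORERS dict.
r2_scorer = get_scorer("r2")
print(r2_scorer)
```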
1 vote
Original content provided by Stack Overflow: https://stackoverflow.com/questions/73392074
