I am trying to do CV (cross-validation) for my training and testing datasets. I am using LinearRegression. However, when I run the code I get the error below. When I run the same code with a decision tree, I get no errors and the code works. How can I solve this problem? Is my CV code correct? Thank you for your help.
CV code reference: scikit-learn cross_validation over-fitting or under-fitting
data_set = pd.read_excel("NEW Collected Data for Preliminary Results Independant variables ONLY_NO AREA_NO_INFILL_DENSITY_no_printing_temperature.xlsx")
pd.set_option('max_columns', 35)
pd.set_option('max_rows', 300)
data_set.head(300)
X, y = data_set[["Part's Z-Height (mm)","Part's Solid Volume (cm^3)","Layer Height (mm)","Printing/Scanning Speed (mm/s)","Part's Orientation (Support's volume) (cm^3)"]], data_set[["Climate change (kg CO2 eq.)","Climate change, incl biogenic carbon (kg CO2 eq.)","Fine Particulate Matter Formation (kg PM2.5 eq.)","Fossil depletion (kg oil eq.)","Freshwater Consumption (m^3)","Freshwater ecotoxicity (kg 1,4-DB eq.)","Freshwater Eutrophication (kg P eq.)","Human toxicity, cancer (kg 1,4-DB eq.)","Human toxicity, non-cancer (kg 1,4-DB eq.)","Ionizing Radiation (Bq. C-60 eq. to air)","Land use (Annual crop eq. yr)","Marine ecotoxicity (kg 1,4-DB eq.)","Marine Eutrophication (kg N eq.)","Metal depletion (kg Cu eq.)","Photochemical Ozone Formation, Ecosystem (kg NOx eq.)","Photochemical Ozone Formation, Human Health (kg NOx eq.)","Stratospheric Ozone Depletion (kg CFC-11 eq.)","Terrestrial Acidification (kg SO2 eq.)","Terrestrial ecotoxicity (kg 1,4-DB eq.)"]]
scaler = preprocessing.MinMaxScaler()
names = data_set.columns
d = scaler.fit_transform(data_set)
scaled_df = pd.DataFrame(d, columns=names)
X_normalized, y_for_normalized = scaled_df[["Part's Z-Height (mm)","Part's Solid Volume (cm^3)","Layer Height (mm)","Printing/Scanning Speed (mm/s)","Part's Orientation (Support's volume) (cm^3)"]], scaled_df[["Climate change (kg CO2 eq.)","Climate change, incl biogenic carbon (kg CO2 eq.)","Fine Particulate Matter Formation (kg PM2.5 eq.)","Fossil depletion (kg oil eq.)","Freshwater Consumption (m^3)","Freshwater ecotoxicity (kg 1,4-DB eq.)","Freshwater Eutrophication (kg P eq.)","Human toxicity, cancer (kg 1,4-DB eq.)","Human toxicity, non-cancer (kg 1,4-DB eq.)","Ionizing Radiation (Bq. C-60 eq. to air)","Land use (Annual crop eq. yr)","Marine ecotoxicity (kg 1,4-DB eq.)","Marine Eutrophication (kg N eq.)","Metal depletion (kg Cu eq.)","Photochemical Ozone Formation, Ecosystem (kg NOx eq.)","Photochemical Ozone Formation, Human Health (kg NOx eq.)","Stratospheric Ozone Depletion (kg CFC-11 eq.)","Terrestrial Acidification (kg SO2 eq.)","Terrestrial ecotoxicity (kg 1,4-DB eq.)"]]
scaled_df.head(200)
(Output omitted: the first rows of the min-max-scaled DataFrame, with the 5 feature columns and 19 target columns all rescaled to the [0, 1] range.)
lin_regressor = LinearRegression()
# pass the order of your polynomial here
poly = PolynomialFeatures(1)
# convert to be used further to linear regression
X_transform = poly.fit_transform(x_train)
# fit this to Linear Regressor
linear_regg = lin_regressor.fit(X_transform, y_train)
import numpy as np
from sklearn.metrics import SCORERS
from sklearn.model_selection import KFold
scorer = SCORERS['r2']
cv = KFold(n_splits=5, random_state=0,shuffle=True)
train_scores, test_scores = [], []
for train, test in cv.split(X_normalized):
    X_transform2 = poly.fit_transform(X_normalized)
    OL = lin_regressor.fit(X_transform2.iloc[train], y_for_normalized.iloc[train])
    tr_21 = OL.score(X_train, y_train)
    ts_21 = OL.score(X_test, y_test)
    print("Train score:", tr_21)  # from the documentation, .score returns R^2
    print("Test score:", ts_21)  # from the documentation, .score returns R^2
    train_scores.append(tr_21)
    test_scores.append(ts_21)
print ("The Mean for Train scores is:",(np.mean(train_scores)))
print("The Mean for Test scores is:", np.mean(test_scores))

Error message:
--------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/var/folders/mm/r4gnnwl948zclfyx12w803040000gn/T/ipykernel_73165/2276765730.py in <module>
10 for train, test in cv.split(X_normalized):
11 X_transform2 = poly.fit_transform(X_normalized)
---> 12 OL=lin_regressor.fit(X_transform2.iloc[train], y_for_normalized.iloc[train])
13 tr_21 = OL.score(X_train, y_train)
14 ts_21 = OL.score(X_test, y_test)
AttributeError: 'numpy.ndarray' object has no attribute 'iloc'

Decision tree:
new_model = DecisionTreeRegressor(max_depth=9,
                                  min_samples_split=10, random_state=0)
import numpy as np
from sklearn.metrics import SCORERS
from sklearn.model_selection import KFold
scorer = SCORERS['r2']
cv = KFold(n_splits=5, random_state=0,shuffle=True)
train_scores, test_scores = [], []
for train, test in cv.split(X_normalized):
    OO = new_model.fit(X_normalized.iloc[train], y_for_normalized.iloc[train])
    tr_2 = OO.score(X_train, y_train)
    ts_2 = OO.score(X_test, y_test)
    print("Train score:", tr_2)  # from the documentation, .score returns R^2
    print("Test score:", ts_2)  # from the documentation, .score returns R^2
    train_scores.append(tr_2)
    test_scores.append(ts_2)
print ("The Mean for Train scores is:",(np.mean(train_scores)))
print("The Mean for Test scores is:", np.mean(test_scores))

Output:
Train score: 0.8960560474997927
Test score: -0.15521696464773224
Train score: 0.8852795454592853
Test score: 0.17650772852710495
Train score: 0.5825347735306872
Test score: 0.34789159049344665
Train score: 0.8549575808716975
Test score: 0.7615265842042157
Train score: 0.8340261480334055
Test score: 0.14011826401728472
The Mean for Train scores is: 0.8105708190789735
The Mean for Test scores is: 0.2541654405188639

Attempt 1:
import numpy as np
from sklearn.metrics import SCORERS
from sklearn.model_selection import KFold
scorer = SCORERS['r2']
cv = KFold(n_splits=5, random_state=0,shuffle=True)
train_scores, test_scores = [], []
for train, test in cv.split(X_normalized):
    X_transform2 = poly.fit_transform(X_normalized)
    OL = lin_regressor.fit(X_transform2[train], y_for_normalized[train])
    tr_21 = OL.score(X_train, y_train)
    ts_21 = OL.score(X_test, y_test)
    print("Train score:", tr_21)  # from the documentation, .score returns R^2
    print("Test score:", ts_21)  # from the documentation, .score returns R^2
    train_scores.append(tr_21)
    test_scores.append(ts_21)
print ("The Mean for Train scores is:",(np.mean(train_scores)))
print("The Mean for Test scores is:", np.mean(test_scores))

Error message:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/var/folders/mm/r4gnnwl948zclfyx12w803040000gn/T/ipykernel_90924/12176184.py in <module>
10 for train, test in cv.split(X_normalized):
11 X_transform2 = poly.fit_transform(X_normalized)
---> 12 OL=lin_regressor.fit(X_transform2[train], y_for_normalized[train])
13 tr_21 = OL.score(X_train, y_train)
14 ts_21 = OL.score(X_test, y_test)
~/opt/anaconda3/lib/python3.9/site-packages/pandas/core/frame.py in __getitem__(self, key)
3462 if is_iterator(key):
3463 key = list(key)
-> 3464 indexer = self.loc._get_listlike_indexer(key, axis=1)[1]
3465
3466 # take() does not accept boolean indexers
~/opt/anaconda3/lib/python3.9/site-packages/pandas/core/indexing.py in _get_listlike_indexer(self, key, axis)
1312 keyarr, indexer, new_indexer = ax._reindex_non_unique(keyarr)
1313
-> 1314 self._validate_read_indexer(keyarr, indexer, axis)
1315
1316 if needs_i8_conversion(ax.dtype) or isinstance(
~/opt/anaconda3/lib/python3.9/site-packages/pandas/core/indexing.py in _validate_read_indexer(self, key, indexer, axis)
1372 if use_interval_msg:
1373 key = list(key)
-> 1374 raise KeyError(f"None of [{key}] are in the [{axis_name}]")
1375
1376 not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())
KeyError: "None of [Int64Index([ 0, 1, 3, 4, 5, 6, 9, 10, 11, 12, 14, 15, 17, 18, 19, 20, 21,\n 23, 25, 27, 28, 29, 31, 32, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,\n 44, 45, 46, 47, 48, 49, 50, 51, 52, 56, 57, 58, 59, 60, 61, 62, 63,\n 64, 65, 66, 67, 68, 69, 70, 71, 72, 74, 76, 77, 79, 80, 81, 82, 83,\n 84, 85, 87, 88, 89, 90, 91, 94, 96, 97, 98, 99],\n dtype='int64')] are in the [columns]"

Posted on 2022-08-17 21:10:29
Understanding
poly.fit_transform returns a numpy.ndarray, so here your X_normalized gets converted from a pandas.core.frame.DataFrame into a numpy.ndarray, while your y_for_normalized is still a pandas.core.frame.DataFrame.
For a numpy.ndarray you pass indices as numpy.ndarray[indexes]; for a pandas.core.frame.DataFrame you pass them as .iloc[indexes].
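The type mismatch is easy to reproduce in a few lines (a minimal sketch using made-up data, not the dataset from the question):

```python
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures

# Tiny stand-in for X_normalized; the actual column names don't matter here.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

poly = PolynomialFeatures(1)
arr = poly.fit_transform(df)  # the DataFrame comes back as a plain ndarray

print(type(arr).__name__)  # ndarray -- so arr.iloc[...] raises AttributeError
idx = [0, 2]
print(arr[idx].shape)      # (2, 3): ndarray takes positional rows via plain []
print(df.iloc[idx].shape)  # (2, 2): DataFrame takes positional rows via .iloc[]
```

Indexing `df[idx]` directly would instead look up *columns* named 0 and 2, which is exactly why Attempt 1 below fails with a KeyError on `y_for_normalized[train]`.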
Solution
Use [] to take rows from X_transform2, since it is a numpy.ndarray, and use .iloc[] for y_for_normalized, since it is a DataFrame. Code:
train_scores, test_scores = [], []
for train, test in cv.split(X_normalized):
    X_transform2 = poly.fit_transform(X_normalized)
    # [] for X_transform2, .iloc[] for y_for_normalized
    OL = lin_regressor.fit(X_transform2[train], y_for_normalized.iloc[train])
    tr_21 = OL.score(X_transform2[train], y_for_normalized.iloc[train])
    ts_21 = OL.score(X_transform2[test], y_for_normalized.iloc[test])
    print("Train score:", tr_21)  # from the documentation, .score returns R^2
    print("Test score:", ts_21)  # from the documentation, .score returns R^2
    train_scores.append(tr_21)
    test_scores.append(ts_21)
print("The Mean for Train scores is:", (np.mean(train_scores)))
print("The Mean for Test scores is:", np.mean(test_scores))

PS:
You used X_train, y_train and X_test, y_test in OL.score. It should be the fold data selected by the cv indices train and test, as reflected in the snippet above. If you defined X_train, y_train, X_test, y_test separately for a specific reason, then you are fine to use them.
Why use PolynomialFeatures() at all when you only want all features at degree 1? PolynomialFeatures() with degree 1 makes no difference to the fit.
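That point is easy to verify: with degree=1, PolynomialFeatures only prepends a bias column of ones, and since LinearRegression fits an intercept anyway, the transform changes nothing about the model. A small sketch with toy data:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0, 3.0], [4.0, 5.0]])

# degree=1 keeps the original features and just adds a column of ones
X1 = PolynomialFeatures(degree=1).fit_transform(X)
print(X1)  # [[1. 2. 3.], [1. 4. 5.]]

# with include_bias=False the output is identical to the input
X1_nb = PolynomialFeatures(degree=1, include_bias=False).fit_transform(X)
print(np.array_equal(X1_nb, X))  # True
```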
If you use a newer version of scikit-learn, SCORERS raises a deprecation warning. https://stackoverflow.com/questions/73392074
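If I'm reading the newer scikit-learn releases correctly, the SCORERS dict was deprecated in favor of sklearn.metrics.get_scorer, so the scorer line can be swapped like this (sketch on a toy problem, not the question's data):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import get_scorer

scorer = get_scorer("r2")  # replaces the deprecated SCORERS['r2']

# quick sanity check on a perfectly linear toy problem
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 2.0, 3.0])
model = LinearRegression().fit(X, y)
print(scorer(model, X, y))  # a perfect linear fit gives R^2 = 1.0
```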