I run the following code:
from datetime import datetime
df_students_ages = df_students.dropna()
df_students_ages.loc[:, ['birth_year']] = df_students_ages.birthday.apply(lambda x : x.split('-')[0])
#conditional drop
df_students_ages.drop(df_students_ages[df_students_ages.birth_year > '2015'].index, inplace=True)
df_students_ages.drop(df_students_ages[df_students_ages.birth_year < '1920'].index, inplace=True)
df_students_ages.drop(columns='birth_year', inplace=True)
df_students_ages.loc[:, ['birthday']] = df_students_ages.birthday.apply(pd.to_datetime)
def from_dob_to_age(born):
    today = pd.to_datetime(datetime.now().date())
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))
df_students_ages.loc[:, ['age']] = df_students_ages.birthday.apply(lambda x: from_dob_to_age(x))
df_students_ages.sort_values('age')
and I get warnings like this:
~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/indexing.py:659: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self.obj[k] = np.nan
~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/indexing.py:1745: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
isetter(ilocs[0], value)
~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/frame.py:4163: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
return super().drop(
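How can I avoid this warning? What else should I put in the '.loc[]' form? I don't know how to combine it with the conditional drop.
The output itself is fine.
Posted on 2021-01-07 02:58:08
In this part: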
#conditional drop
df_students_ages.drop(df_students_ages[df_students_ages.birth_year > '2015'].index, inplace=True)
df_students_ages.drop(df_students_ages[df_students_ages.birth_year < '1920'].index, inplace=True)
df_students_ages.drop(columns='birth_year', inplace=True)
df_students_ages.loc[:, ['birthday']] = df_students_ages.birthday.apply(pd.to_datetime)
you may need to change it to something like this:
#conditional drop
# drop() returns a new DataFrame only when inplace=True is omitted;
# with inplace=True it returns None, so the reassignment would lose the data
df_students_ages = df_students_ages.drop(df_students_ages[df_students_ages.birth_year > '2015'].index)
df_students_ages = df_students_ages.drop(df_students_ages[df_students_ages.birth_year < '1920'].index)
df_students_ages = df_students_ages.drop(columns='birth_year')
df_students_ages['birthday'] = df_students_ages.birthday.apply(pd.to_datetime)
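An alternative to dropping rows by index is to keep the rows that pass a boolean mask. The following is only a minimal sketch: the sample data is hypothetical (the real df_students is not shown in the question), and it assumes the 'birthday' column holds 'YYYY-MM-DD' strings. Calling .copy() on the filtered result makes it an independent DataFrame, so later column assignments should not raise SettingWithCopyWarning.

```python
import pandas as pd

# Hypothetical sample data; the real df_students is not shown in the question.
df_students = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "birthday": ["1990-05-17", "1900-01-01", "2020-03-02", None],
})

# dropna() returns a new frame; .copy() makes it explicitly independent.
df_students_ages = df_students.dropna().copy()

# Extract the year and compare it as an integer rather than as a string.
birth_year = df_students_ages["birthday"].str.split("-").str[0].astype(int)

# Keep rows with a year in [1920, 2015]; this matches the two conditional drops.
in_range = birth_year.between(1920, 2015)
df_students_ages = df_students_ages[in_range].copy()

# Plain column assignment on an independent frame.
df_students_ages["birthday"] = pd.to_datetime(df_students_ages["birthday"])
```

Because the year lives in a separate Series here, there is no temporary 'birth_year' column to add and drop again.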
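Posted on 2021-01-08 02:38:19
I found the answer myself. The only thing I actually needed to do was copy the entire source dataframe before I started dropping the invalid rows. That way I can even avoid using '.loc[]' where possible. The final code looks like this: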
from datetime import datetime
import pandas as pd
# this is what has actually changed: work on an explicit copy of the source frame
df_students_ages = df_students.copy()
df_students_ages.dropna(inplace=True)
# new column creation without .loc
df_students_ages['birth_year'] = df_students_ages.birthday.apply(lambda x : x.split('-')[0])
# dropping inplace
df_students_ages.drop(df_students_ages[df_students_ages.birth_year > '2015'].index, inplace=True)
df_students_ages.drop(df_students_ages[df_students_ages.birth_year < '1920'].index, inplace=True)
df_students_ages = df_students_ages.drop(columns='birth_year')
df_students_ages.birthday = df_students_ages.birthday.apply(pd.to_datetime)
def from_dob_to_age(born):
    today = pd.to_datetime(datetime.now().date())
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))
# new column creation without .loc
df_students_ages['age'] = df_students_ages.birthday.apply(lambda x: from_dob_to_age(x))
df_students_ages.sort_values('age')
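The age calculation can also be written without apply. This is a sketch, not part of the original answer: the helper name ages_on is made up for illustration, it assumes the birthdays are already datetime64 values, and it uses a fixed reference date so the result is reproducible.

```python
import pandas as pd

def ages_on(birthdays: pd.Series, today: pd.Timestamp) -> pd.Series:
    """Age in whole years on `today` for a datetime64 Series of birthdays."""
    # True where this year's birthday has not happened yet on `today`.
    before_birthday = (birthdays.dt.month > today.month) | (
        (birthdays.dt.month == today.month) & (birthdays.dt.day > today.day)
    )
    return today.year - birthdays.dt.year - before_birthday.astype(int)

# Hypothetical sample birthdays and a fixed reference date.
birthdays = pd.to_datetime(pd.Series(["1990-05-17", "2000-01-07", "1985-12-31"]))
ages = ages_on(birthdays, pd.Timestamp("2021-01-07"))
```

In real use the reference date could be pd.Timestamp.today().normalize(), and the result could be assigned with df_students_ages['age'] = ages_on(df_students_ages.birthday, ...).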
https://stackoverflow.com/questions/65601220