I need to compare two consecutive rows in a DataFrame, for example:
df:
time door name
00:01:10 in alex
00:01:10 in alex
02:01:10 out alex
03:01:10 in alex
04:01:10 out alex
04:01:10 out alex

If door is in (or out) in two consecutive rows, the duplicate row needs to be removed.
Here is part of my code:
import pandas as pd
file_name='test.xlsx'
df = pd.read_excel(file_name, header=0, index= False)
mydf = df.sort_values(by='time')
for i in range (len(mydf)):
    if (mydf[['door']] != mydf[['door']].shift(-1)).any(axis=1):
        print('ok')
    else:
        print ('nok')

I got an error:
if ((mydf[['Door Name']] != mydf[['Door Name']].shift(-1).any(axis=1))):
File "C:\Users\khou\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py", line 1478, in __nonzero__
.format(self.__class__.__name__))
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I don't know how to fix it; any help would be appreciated.
Answered on 2019-11-19 15:40:01
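For context, the ValueError seems to come from using a whole DataFrame (or Series) as an `if` condition. The comparison inside the loop does not depend on `i`, so it yields one boolean per row, and Python cannot reduce that to a single True/False. A minimal sketch of the problem, with the sample data from the question rebuilt by hand (the names `per_row` and `raised` are illustrative, not from the original code):

```python
import pandas as pd

mydf = pd.DataFrame({
    'time': ['00:01:10', '00:01:10', '02:01:10', '03:01:10', '04:01:10', '04:01:10'],
    'door': ['in', 'in', 'out', 'in', 'out', 'out'],
    'name': ['alex'] * 6,
})

# Comparing the column with its shifted copy gives one boolean per row,
# not a single value.
per_row = mydf['door'] != mydf['door'].shift(-1)
print(per_row.tolist())

# Using that per-row result directly in `if` is what raises the error.
raised = False
try:
    if per_row:
        pass
except ValueError as exc:
    raised = True
    print('ValueError:', exc)
```

So the condition has to be either a comparison of two scalar values (one row against its neighbour) or a boolean mask applied to the whole frame at once.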
You can collect the indices of the repeated rows as follows and then drop them by index:
Code
import pandas as pd
mydf=pd.DataFrame({'time':['00:01:10','00:01:10','02:01:10','03:01:10','04:01:10','04:01:10'],
                   'door':['in','in','out','in','out','out'],
                   'name':['alex','alex','alex','alex','alex','alex']})
idx=[]  # positions of rows whose door repeats the previous row
for i in range (0,len(mydf)):
    if i == 0:
        # the first row has no previous row to compare with
        print ('index '+str(i)+' ok')
    elif mydf['door'][i] != mydf['door'][i-1]:
        print('index '+str(i)+' ok')
    else:
        print ('index '+str(i)+' nok')
        idx.append(i)
# pass the flat list of positions (not a nested list) to the index
mydf.drop(mydf.index[idx],inplace=True)
print('\n',mydf)

Output
index 0 ok
index 1 nok
index 2 ok
index 3 ok
index 4 ok
index 5 nok
time door name
0 00:01:10 in alex
2 02:01:10 out alex
3 03:01:10 in alex
4 04:01:10 out alex

Answered on 2019-11-19 15:16:20
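You can use np.where and then dropna:
import numpy as np
# Set door to NaN where the next row has the same door, then drop those rows
df['door'] = np.where(df['door'] == df['door'].shift(-1), np.nan, df['door'])
df.dropna(how='any', axis=0, inplace=True)
print(df)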
time door name
00:01:10 in alex
02:01:10 out alex
03:01:10 in alex
04:01:10 out alex

Or
If the repeated door values always share the same time, you can simply use df.drop_duplicates with keep='first' and subset=['time', 'door']:
df.drop_duplicates(subset=['time', 'door'], keep='first', inplace= True)
print(df)
time door name
00:01:10 in alex
02:01:10 out alex
03:01:10 in alex
04:01:10 out alex

https://stackoverflow.com/questions/58936689
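A note on the drop_duplicates approach: it is not interchangeable with a consecutive-row check in general. drop_duplicates on ['time', 'door'] removes rows that repeat the same time and door anywhere in the frame, whereas the question asks about consecutive rows with the same door. A sketch of a shift-based filter that looks only at the door column, on made-up data where the repeated row has a different time (the data and the names `sample`, `by_shift` and `by_dupes` are assumptions for illustration):

```python
import pandas as pd

sample = pd.DataFrame({
    'time': ['00:01:10', '00:05:00', '02:01:10', '03:01:10'],
    'door': ['in', 'in', 'out', 'in'],
    'name': ['alex'] * 4,
})

# Keep a row only when its door differs from the previous row's door.
# shift() moves values down by one, so the first row is compared with a
# missing value and is kept.
by_shift = sample[sample['door'] != sample['door'].shift()]

# drop_duplicates sees four distinct (time, door) pairs and removes nothing.
by_dupes = sample.drop_duplicates(subset=['time', 'door'], keep='first')

print(by_shift)
print(by_dupes)
```

With shift() the first row of each run is kept; comparing against shift(-1), as in the np.where version, keeps the last row of each run instead.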