Consider the DataFrame df:
import pandas as pd
df = pd.DataFrame(dict(A=[1, 2], B=['X', 'Y']))
df
   A  B
0  1  X
1  2  Y
If I shift along axis=0 (the default):
df.shift()
     A    B
0  NaN  NaN
1  1.0    X
As expected, it pushes every row down by one.
But when I shift along axis=1:
df.shift(axis=1)
    A   B
0 NaN NaN
1 NaN NaN
Everything comes back as NaN, whereas I expected:
A B
0 NaN 1
1 NaN
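One possible workaround, offered as a sketch rather than a definitive fix: the all-NaN result above is consistent with the shift being applied separately within each dtype (A is integer, B is object), and how shift(axis=1) treats mixed dtypes has differed between pandas versions. Casting the whole frame to object first puts every column in one dtype, so the values move across columns. The variable name shifted is only illustrative.

```python
import pandas as pd

df = pd.DataFrame(dict(A=[1, 2], B=['X', 'Y']))

# Cast to a single dtype (object) so the columns are shifted together
# instead of being handled separately per dtype.
shifted = df.astype(object).shift(axis=1)

# Column B now holds the old A values; column A is all missing.
print(shifted)
```

The cost is that every column ends up with object dtype, so numeric columns may need converting back afterwards.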
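I have a daily (time series) table of rainfall by city. How can I use pandas to fill each NaN with the next day's rainfall for the same city? Thanks.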
import pandas as pd
import numpy as np
rain_before = pd.DataFrame({
    'date': Date*2,
    'city': list('aaaaabbbbb'),
    'rain': [6, np.nan, 1, np.nan, np.nan, 4, np.nan, np.nan, 8, np.nan],
})
# after fillna, the table should look like this.
rain
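A minimal sketch of one way to do this, assuming a backward fill within each city is what is wanted. The question never defines Date, so the five consecutive days below are a hypothetical stand-in, and the rain_filled column name is also made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the undefined `Date`: five consecutive days.
Date = list(pd.date_range('2024-01-01', periods=5, freq='D'))

rain_before = pd.DataFrame({
    'date': Date * 2,
    'city': list('aaaaabbbbb'),
    'rain': [6, np.nan, 1, np.nan, np.nan, 4, np.nan, np.nan, 8, np.nan],
})

# Sort so that "next row" means "next day" within each city.
rain_after = rain_before.sort_values(['city', 'date']).copy()

# Backward-fill inside each city only, so values never leak across cities.
rain_after['rain_filled'] = rain_after.groupby('city')['rain'].bfill()

print(rain_after)
```

Note that bfill takes the next available observation, which may lie more than one day ahead, and trailing NaNs with no later value in the same city stay NaN. Passing limit=1 to bfill restricts how many consecutive NaNs get filled, if that is closer to the intent.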
The question is how to fill NaNs in a pandas DataFrame column with the most frequent level.
The R randomForest package has an option for this: A completed data matrix or data frame. For numeric variables, NAs are replaced with column medians. For factor variables, NAs are replaced with the most frequent levels (breaking ties at random). If object contains no NAs, it is returned unaltered.
In Pa
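A rough pandas equivalent of the R behaviour described above can be sketched as follows. The helper name rough_fix and the demo data are hypothetical, not part of pandas. One difference from R: Series.mode() returns tied values in sorted order, so taking the first one breaks ties deterministically rather than at random.

```python
import numpy as np
import pandas as pd


def rough_fix(frame):
    """Fill numeric columns with the median and other columns with the mode."""
    filled = frame.copy()
    for col in filled.columns:
        if pd.api.types.is_numeric_dtype(filled[col]):
            # Numeric column: replace NaN with the column median.
            filled[col] = filled[col].fillna(filled[col].median())
        else:
            # Non-numeric column: replace NaN with the most frequent value.
            modes = filled[col].mode(dropna=True)
            if not modes.empty:
                filled[col] = filled[col].fillna(modes.iloc[0])
    return filled


demo = pd.DataFrame({
    'num': [1.0, np.nan, 3.0, 10.0],
    'cat': ['x', 'y', None, 'x'],
})
fixed = rough_fix(demo)
print(fixed)
```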
I know the fillna() method can be used to fill NaN values across an entire DataFrame.
df.fillna(df.mean()) # fill with mean of column.
How can I restrict the mean calculation to the group (and column) in which the NaN sits?
Example:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'a': pd.Series([1,1,1,2,2,2]),
'b': pd.Series([1,2,np.nan,1,np.nan,4])
})
print(df)
Input
a b
0 1 1
1
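One common way to do this, shown as a sketch: compute the per-group mean with groupby(...).transform('mean'), which returns a Series aligned to the original rows, and pass that to fillna(). The name group_means is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 1, 2, 2, 2],
    'b': [1, 2, np.nan, 1, np.nan, 4],
})

# Mean of 'b' within each group of 'a', broadcast back to every row.
group_means = df.groupby('a')['b'].transform('mean')

# Only the NaN positions take the group mean; other values are untouched.
df['b'] = df['b'].fillna(group_means)

print(df)
```

With this data the NaN in group a=1 becomes 1.5 and the NaN in group a=2 becomes 2.5. A group whose values are all NaN would stay NaN, because its mean is NaN too.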