I have the following dataset:
ts
Out[227]:
Sales
Month
Jan 1808
Feb 1251
Mar 3023
Apr 4857
May 2506
Jun 2453
Jul 1180
Aug 4239
Sep 1759
Oct 2539
Nov 3923
Dec 2999
After taking a moving average with window=2, the output is as follows:
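# shift(0) is a no-op, so the rolling window runs over the original values
shifted = ts.shift(0)
window = shifted.rolling(window=2)
# the first row has only one observation, so its mean comes out as NaN
means = window.mean()
print(means)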
Sales
Month
Jan NaN
Feb 1529.5
Mar 2137.0
Apr 3940.0
May 3681.5
Jun 2479.5
Jul 1816.5
Aug 2709.5
Sep 2999.0
Oct 2149.0
Nov 3231.0
Dec 3461.0
I would like the NaN to be replaced by its original value. Is that possible?
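For anyone who wants to reproduce this, here is a minimal sketch of how the frame might be built. The construction is an assumption (the post only shows the printed output of ts, not how it was created), but the values are the ones listed above.

```python
import pandas as pd

# Assumed construction: the post only shows the printed frame, not its creation.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
sales = [1808, 1251, 3023, 4857, 2506, 2453,
         1180, 4239, 1759, 2539, 3923, 2999]
ts = pd.DataFrame({"Sales": sales}, index=pd.Index(months, name="Month"))

# For an integer window, min_periods defaults to the window size, so the
# first row (only one observation available) yields NaN.
means = ts.rolling(window=2).mean()
print(means.head(3))
```

The NaN in the Jan row comes from the incomplete first window, not from missing data in ts itself.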
Posted on 2018-03-21 12:33:42
Try this:
In [92]: ts.rolling(window=2, min_periods=1).mean()
Out[92]:
Sales
Jan 1808.0
Feb 1529.5
Mar 2137.0
Apr 3940.0
May 3681.5
Jun 2479.5
Jul 1816.5
Aug 2709.5
Sep 2999.0
Oct 2149.0
Nov 3231.0
Dec 3461.0
Posted on 2018-03-21 12:34:34
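Use (df here is the same frame as ts in the question):
# assign to a new name so that df remains a DataFrame for the code that follows
sales_mean = df['Sales'].rolling(window=2).mean().fillna(df['Sales'])
print(sales_mean)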
Jan 1808.0
Feb 1529.5
Mar 2137.0
Apr 3940.0
May 3681.5
Jun 2479.5
Jul 1816.5
Aug 2709.5
Sep 2999.0
Oct 2149.0
Nov 3231.0
Dec 3461.0
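Name: Sales, dtype: float64
If the rolling window is n > 2, the two solutions give different results: fillna / combine_first keeps the original value wherever the window is incomplete, while min_periods=1 averages the values available so far.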
df['Sales1'] = df['Sales'] * 2
df1 = df.rolling(window=3).mean().combine_first(df)
print (df1)
Sales Sales1
Jan 1808.000000 3616.000000
Feb 1251.000000 2502.000000 <-diff
Mar 2027.333333 4054.666667
Apr 3043.666667 6087.333333
May 3462.000000 6924.000000
Jun 3272.000000 6544.000000
Jul 2046.333333 4092.666667
Aug 2624.000000 5248.000000
Sep 2392.666667 4785.333333
Oct 2845.666667 5691.333333
Nov 2740.333333 5480.666667
Dec 3153.666667 6307.333333
df2 = df.rolling(window=3, min_periods=1).mean()
print (df2)
Sales Sales1
Jan 1808.000000 3616.000000
Feb 1529.500000 3059.000000 <-diff
Mar 2027.333333 4054.666667
Apr 3043.666667 6087.333333
May 3462.000000 6924.000000
Jun 3272.000000 6544.000000
Jul 2046.333333 4092.666667
Aug 2624.000000 5248.000000
Sep 2392.666667 4785.333333
Oct 2845.666667 5691.333333
Nov 2740.333333 5480.666667
Dec 3153.666667 6307.333333
https://stackoverflow.com/questions/49406432
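To see the two approaches from this thread side by side, the following is a minimal, self-contained sketch. It rebuilds the Sales column by hand (an assumption, since the thread never shows how the frame was created) and uses a hypothetical helper name, compare_fill, which is not part of pandas.

```python
import pandas as pd

# Assumed data: same values as in the question, rebuilt by hand.
sales = pd.Series(
    [1808, 1251, 3023, 4857, 2506, 2453, 1180, 4239, 1759, 2539, 3923, 2999],
    index=["Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
    name="Sales",
)


def compare_fill(s: pd.Series, window: int) -> pd.DataFrame:
    """Hypothetical helper: put both NaN-handling strategies in one frame."""
    # Strategy 1: full-window mean, falling back to the raw value where it is NaN.
    filled = s.rolling(window=window).mean().fillna(s)
    # Strategy 2: average over however many values are available so far.
    partial = s.rolling(window=window, min_periods=1).mean()
    return pd.DataFrame({"fillna": filled, "min_periods": partial})


result = compare_fill(sales, window=3)
# Show only the rows where the two strategies disagree.
print(result[result["fillna"] != result["min_periods"]])
```

With window=2 the two columns should coincide, because only the first row has an incomplete window and the mean of a single value is that value. For larger windows they diverge on every incomplete row after the first.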