I have the following pandas DataFrame, df:
L_Time U_Time Eval_Time L_Flux U_Flux
2018-05-01 04:30:00 2018-05-01 05:30:00 2018-05-01 05:23:45 100 200
2018-05-01 07:30:00 2018-05-01 08:30:00 2018-05-01 07:44:11 100 200
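L_Flux and U_Flux hold the radiation flux values at the pandas Timestamps L_Time and U_Time, respectively. I want to interpolate the flux at Eval_Time, with one-second resolution. How can I do this with Python or pandas? I tried a linear interpolation in pandas, but it always gives me the midpoint (150). I want the flux at the Eval_Time timestamp to be interpolated according to its distance from the two hourly timestamps.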
Answered on 2019-05-11 07:04:09
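I needed to upsample the data between L_Time and U_Time to one-second resolution, interpolate the upsampled flux values (which start out as NaN because they are missing), and then extract the interpolated flux at Eval_Time.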
import pandas as pd

INTERPOL_FLUX = []
for i in df.itertuples():
    # Two-point frame for this row: (L_Time, L_Flux) and (U_Time, U_Flux).
    # A separate name is used so the outer df is not overwritten inside the loop.
    pair = pd.DataFrame([(i[1], i[4]), (i[2], i[5])], columns=['Times', 'Flux'])
    pair = pair.set_index('Times')  # Set the timestamps as the index
    series = pair['Flux']  # Squeeze the frame to a Series
    interpolated = series.resample('s').interpolate(method='linear')  # Upsample to seconds ('S' in older pandas), then interpolate linearly
    interpol_flux = interpolated.loc[i[3]]  # Extract the interpolated flux at Eval_Time
    INTERPOL_FLUX.append(interpol_flux)  # Collect the result for this row
df['Eval_Flux'] = INTERPOL_FLUX  # Store the results as the Eval_Flux column
More concisely:
INTERPOL_FLUX = []
for i in df.itertuples():
    # Per-row Series of the two known flux values, indexed by their timestamps
    series = pd.DataFrame([(i[1], i[4]), (i[2], i[5])], columns=['Times', 'Flux']).set_index('Times')['Flux']
    INTERPOL_FLUX.append(series.resample('s').interpolate(method='linear').loc[i[3]])
df['Eval_Flux'] = INTERPOL_FLUX
I expected this to be slow, but it runs quickly.
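For comparison, here is a minimal sketch of why a plain linear interpolation returns the midpoint, together with a shortcut that skips the resampling step. It assumes each row is turned into a three-point Series indexed by L_Time, Eval_Time and U_Time, with NaN at Eval_Time, and that Eval_Time lies strictly between the other two. The helper name flux_at_eval is made up for illustration. With method='linear' pandas ignores the index and treats the points as equally spaced, while method='time' weights by the actual time gaps.

```python
import numpy as np
import pandas as pd

# Data from the question
df = pd.DataFrame({
    'L_Time': pd.to_datetime(['2018-05-01 04:30:00', '2018-05-01 07:30:00']),
    'U_Time': pd.to_datetime(['2018-05-01 05:30:00', '2018-05-01 08:30:00']),
    'Eval_Time': pd.to_datetime(['2018-05-01 05:23:45', '2018-05-01 07:44:11']),
    'L_Flux': [100, 100],
    'U_Flux': [200, 200],
})

def flux_at_eval(row, method):
    """Hypothetical helper: interpolate one row's flux at Eval_Time."""
    # Three points: known flux at L_Time and U_Time, unknown (NaN) at Eval_Time
    s = pd.Series(
        [row.L_Flux, np.nan, row.U_Flux],
        index=pd.DatetimeIndex([row.L_Time, row.Eval_Time, row.U_Time]),
        dtype=float,
    )
    return s.interpolate(method=method).loc[row.Eval_Time]

# 'linear' ignores the index, so the NaN is filled with the midpoint
midpoints = [flux_at_eval(r, 'linear') for r in df.itertuples()]

# 'time' uses the timestamps, so the result depends on the distance to each end
df['Eval_Flux'] = [flux_at_eval(r, 'time') for r in df.itertuples()]

print(midpoints)
print(df['Eval_Flux'])
```

This still loops over the rows, so it is a variation on the approach above rather than a vectorized solution.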
Answered on 2019-05-09 23:23:27
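You can do the interpolation yourself, since it only involves two columns. However, your data looks wrong: with the second row as entered below, L_Time is later than U_Time, so that row is an extrapolation rather than an interpolation. In any case, the following will give you an answer: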
import pandas as pd

df = pd.DataFrame(data={'L_Time': ['2018-05-01 04:30:00', '2018-05-03 07:30:00'],
                        'U_Time': ['2018-05-01 05:30:00', '2018-05-01 08:30:00'],
                        'Eval_Time': ['2018-05-01 05:23:45', '2018-05-01 07:44:11'],
                        'L_Flux': [100, 100],
                        'U_Flux': [200, 200]})
df['L_Time'] = pd.to_datetime(df['L_Time'])
df['U_Time'] = pd.to_datetime(df['U_Time'])
df['Eval_Time'] = pd.to_datetime(df['Eval_Time'])
# The actual maths part - using times between U, L and Eval
df['Eval_Flux'] = df.L_Flux + (df.U_Flux - df.L_Flux)*(df.Eval_Time - df.L_Time)/(df.U_Time - df.L_Time)
L_Time U_Time Eval_Time L_Flux U_Flux Eval_Flux
0 2018-05-01 04:30:00 2018-05-01 05:30:00 2018-05-01 05:23:45 100 200 189.583333
1 2018-05-03 07:30:00 2018-05-01 08:30:00 2018-05-01 07:44:11 100 200 201.624704
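The extrapolation in the second row can be made explicit. The sketch below reuses the same formula and the same data, but stores the fractional position of Eval_Time between L_Time and U_Time in its own column and flags rows where that fraction falls outside [0, 1]. The column names Weight and Extrapolated are assumptions added for illustration; they are not part of the original data.

```python
import pandas as pd

# Same data as above, including the 2018-05-03 L_Time in the second row
df = pd.DataFrame(data={'L_Time': ['2018-05-01 04:30:00', '2018-05-03 07:30:00'],
                        'U_Time': ['2018-05-01 05:30:00', '2018-05-01 08:30:00'],
                        'Eval_Time': ['2018-05-01 05:23:45', '2018-05-01 07:44:11'],
                        'L_Flux': [100, 100],
                        'U_Flux': [200, 200]})
for col in ['L_Time', 'U_Time', 'Eval_Time']:
    df[col] = pd.to_datetime(df[col])

# Fractional position of Eval_Time: 0 at L_Time, 1 at U_Time.
# Dividing two timedelta columns gives a plain float column.
df['Weight'] = (df.Eval_Time - df.L_Time) / (df.U_Time - df.L_Time)

# A weight outside [0, 1] means Eval_Time is not between L_Time and U_Time
df['Extrapolated'] = ~df['Weight'].between(0, 1)

# Same linear formula as above, written in terms of the weight
df['Eval_Flux'] = df.L_Flux + (df.U_Flux - df.L_Flux) * df['Weight']

print(df[['Weight', 'Extrapolated', 'Eval_Flux']])
```

With this data the first row has a weight of about 0.896 and is a genuine interpolation, while the second row has a weight slightly above 1 and is flagged.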
https://stackoverflow.com/questions/56068564