I have a very large CSV file that I opened with Dask.
import numpy as np
import pandas as pd
import hvplot.pandas
import hvplot.dask
import intake
data = '../file.csv'
ddf = intake.open_csv(data).to_dask()
ddf.head()
              Datetime  latitude  longitude  Temp_2m(C)
1  1980-01-02 03:00:00    30.605     50.217        5.31
2  1980-01-02 04:00:00    30.605     50.217        5.36
3  1980-01-02 05:00:00    30.605     50.217        7.04
4  1980-01-02 06:00:00    30.605     50.217       10.24

I want to plot Temp_2m(C) by month with hvplot. Plotting the hourly data against Datetime works correctly, but when I try to group by Datetime as shown below, it returns an error.
# Convert 'Datetime' column to 'datetime64'
ddf["Datetime"] = ddf["Datetime"].astype("M8[us]")
# set index column
ddf = ddf.set_index('Datetime')
g = pd.Grouper(freq='M', key='Datetime')
month_ddf = ddf.groupby(g).mean()
# plot
month_ddf.hvplot('Temp_2m(C)')

Error: ValueError: all keys need to be the same shape

What is my mistake?

In reply to @frkr6591:
month_ddf.describe()
Dask DataFrame Structure:
               latitude longitude Temp_2m(C)
npartitions=1
                float64   float64    float64
                    ...       ...        ...
Dask Name: describe-numeric, 89 tasks

Posted on 2020-11-30 02:22:44

I used to_datetime() and got the correct plot with .plot(). I had trouble installing hvplot, so I could not test that part.
import numpy as np
import pandas as pd
# FIXME : the following does not work
#import hvplot.pandas
%matplotlib inline
d = dict(datetime = ['1980-01-02 02:00:00',
                     '1980-01-02 03:00:00',
                     '1980-01-02 04:00:00',
                     '1980-01-02 05:00:00',
                     '1980-07-02 06:00:00'],
         latitude = [30.605 for n in range(5)],
         longitude = [50.217 for n in range(5)],
         Temp_2m = [np.random.random()*10 for n in range(5)])
df = pd.DataFrame(d)
df['datetime'] = pd.to_datetime(df['datetime'])
df['mon'] = df['datetime'].dt.to_period('M')
print(df)
ddf = df.groupby('mon').mean()
print(ddf)
# This works on my py3.7
ddf.plot('Temp_2m')
# This fails because hvplot could not be imported.
ddf.hvplot('Temp_2m')
             datetime  latitude  longitude   Temp_2m      mon
0 1980-01-02 02:00:00    30.605     50.217  2.512897  1980-01
1 1980-01-02 03:00:00    30.605     50.217  0.247358  1980-01
2 1980-01-02 04:00:00    30.605     50.217  7.678030  1980-01
3 1980-01-02 05:00:00    30.605     50.217  0.637331  1980-01
4 1980-07-02 06:00:00    30.605     50.217  2.156502  1980-07
         latitude  longitude   Temp_2m
mon
1980-01    30.605     50.217  5.080373
1980-07    30.605     50.217  1.324140
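
One possible contributor to the error in the question, though not confirmed in this thread, is that the code calls set_index('Datetime') and then groups with pd.Grouper(key='Datetime'), at which point 'Datetime' is no longer a column. The following is a minimal pandas-only sketch of two equivalent ways to take monthly means: group on the column with key=, or move the column to the index and resample. The sample data is made up for illustration, and the 'MS' (month start) frequency is chosen here only as a labeling convention. Dask DataFrames also provide resample on a datetime index, but that path is not tested here.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly sample standing in for the CSV:
# all of January and February 1980 (31 + 29 days = 1440 hours).
idx = pd.date_range("1980-01-01", periods=1440, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({
    "Datetime": idx,
    "latitude": 30.605,
    "longitude": 50.217,
    # Constant per month so the monthly means are easy to check by eye.
    "Temp_2m(C)": np.where(idx.month == 1, 5.0, 10.0),
})

# Option 1: keep 'Datetime' as a regular column and group with key=.
by_key = df.groupby(pd.Grouper(key="Datetime", freq="MS")).mean()

# Option 2: make 'Datetime' the index and resample (no key needed).
by_index = df.set_index("Datetime").resample("MS").mean()

print(by_key)
print(by_index)
```

Either result has one row per month and can be passed to .plot('Temp_2m(C)') or, if hvplot is installed, to .hvplot('Temp_2m(C)').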

https://stackoverflow.com/questions/65063107