有关如何用xarray处理NetCDF数据前面已经介绍过四期了。把一些处理NetCDF的基本方法都介绍了一下。
但xarray远不止如此,还可以用它处理GRIB和TIFF数据,这两种也是非常常见的数据格式。
看到这里有没有一种买一送三的感觉,学会xarray的基本方法就可以掌握多种数据格式的处理方式了,大大地效降低了学习的成本,剩下来的时间可以更加专注于其他工作。
标记化图片文件格式(TIFF)是地理空间最常用的栅格格式。TIFF文件可以包含多波段,整型高程数据,基本元数据,内部压缩以及其他常用的存储辅助信息的文件格式。TIFF文件可以通过添加标记数据进行扩展,GeoTIFF就是扩展定义的地理空间数据的存储,常用的后缀.tif,.tiff和.gtif。
用open_rasterio
函数可以读取tif数据。
>>>import xarray as xr
>>>url = 'https://github.com/mapbox/rasterio/raw/master/tests/data/RGB.byte.tif'
>>>da = xr.open_rasterio(url)
>>>da
>>>
<xarray.DataArray (band: 3, y: 718, x: 791)>
[1703814 values with dtype=uint8]
Coordinates:
* band (band) int64 1 2 3
* y (y) float64 2.827e+06 2.826e+06 2.826e+06 ... 2.612e+06 2.612e+06
* x (x) float64 1.021e+05 1.024e+05 1.027e+05 ... 3.389e+05 3.392e+05
Attributes:
transform: (300.0379266750948, 0.0, 101985.0, 0.0, -300.041782729805, 2...
crs: +init=epsg:32618
res: (300.0379266750948, 300.041782729805)
is_tiled: 0
用matplotlib简单出图
>>>import cartopy.crs as ccrs
>>>import matplotlib.pyplot as plt
>>>crs = ccrs.UTM('18N')
>>>ax = plt.subplot(projection=crs)
>>>da.plot.imshow(ax=ax, rgb='band', transform=crs)
>>>ax.coastlines('10m')
>>>plt.show()
GRIB格式是一种应用于气象领域的高效存储格式,由世界气象组织进行标准化。当前有3个版本的GRIB格式,目前GRIB1和GRIB2在广泛使用。
如果想用xarray读取GRIB文件,首先要安装一下ECMWF的cfgrib库。它是xarray的用来解析GRIB数据的引擎。
安装就用conda一键安装就好了。
conda install -c conda-forge cfgrib eccodes
用法非常简单,只需要像下面一样指定cfgrib为数据加载引擎就可以了。
ds = xr.open_dataset('example.grib', engine='cfgrib')
官方也给出了测试样例,感兴趣的可以自己动手试一试,http://download.ecmwf.int/test-data/cfgrib/era5-levels-members.grib。
>>> import xarray as xr
>>> ds = xr.open_dataset('era5-levels-members.grib', engine='cfgrib')
>>> ds
<xarray.Dataset>
Dimensions: (isobaricInhPa: 2, latitude: 61, longitude: 120, number: 10, time: 4)
Coordinates:
* number (number) int64 0 1 2 3 4 5 6 7 8 9
* time (time) datetime64[ns] 2017-01-01 ... 2017-01-02T12:00:00
step timedelta64[ns] ...
* isobaricInhPa (isobaricInhPa) int64 850 500
* latitude (latitude) float64 90.0 87.0 84.0 81.0 ... -84.0 -87.0 -90.0
* longitude (longitude) float64 0.0 3.0 6.0 9.0 ... 351.0 354.0 357.0
valid_time (time) datetime64[ns] ...
Data variables:
z (number, time, isobaricInhPa, latitude, longitude) float32 ...
t (number, time, isobaricInhPa, latitude, longitude) float32 ...
Attributes:
GRIB_edition: 1
GRIB_centre: ecmf
GRIB_centreDescription: European Centre for Medium-Range Weather Forecasts
GRIB_subCentre: 0
Conventions: CF-1.7
institution: European Centre for Medium-Range Weather Forecasts
history: ...
上面的数据比较简单,直接加载不会出现报错,但是如果要读取GFS之类比较复杂的数据就需要提前过滤一下,因为像GFS数据存在多种不同的高度层形式,如果不提前指明就会报ValueError
的错。
以GFS为例,数据可自行下载:ftp://ftp.ncep.noaa.gov/pub/data/nccf/com/gfs/prod。
数据中高度层存在多种形式,isobaricInhPa
,heightAboveGround
,pressureFromGroundLayer
等等,需要提前过滤出来。
筛选的关键词可以根据https://www.nco.ncep.noaa.gov/pmb/products/gfs/gfs.t00z.pgrb2.1p00.anl.shtml来确定。下面筛选的是500hPa的位势高度。
>>>ds = xr.open_dataset('gfs.t00z.pgrb2.1p00.anl', engine='cfgrib', backend_kwargs={'filter_by_keys': {'typeOfLevel': 'isobaricInhPa', 'level': 500}})
>>>ds
<xarray.Dataset>
Dimensions: (latitude: 181, longitude: 360)
Coordinates:
time datetime64[ns] ...
step timedelta64[ns] ...
isobaricInhPa int64 ...
* latitude (latitude) float64 90.0 89.0 88.0 87.0 ... -88.0 -89.0 -90.0
* longitude (longitude) float64 0.0 1.0 2.0 3.0 ... 357.0 358.0 359.0
valid_time datetime64[ns] ...
Data variables:
gh (latitude, longitude) float32 ...
t (latitude, longitude) float32 ...
r (latitude, longitude) float32 ...
w (latitude, longitude) float32 ...
wz (latitude, longitude) float32 ...
u (latitude, longitude) float32 ...
v (latitude, longitude) float32 ...
absv (latitude, longitude) float32 ...
clwmr (latitude, longitude) float32 ...
icmr (latitude, longitude) float32 ...
rwmr (latitude, longitude) float32 ...
snmr (latitude, longitude) float32 ...
grle (latitude, longitude) float32 ...
5wavh (latitude, longitude) float32 ...
Attributes:
GRIB_edition: 2
GRIB_centre: kwbc
GRIB_centreDescription: US National Weather Service - NCEP
GRIB_subCentre: 0
Conventions: CF-1.7
institution: US National Weather Service - NCEP
history: 2019-07-10T13:59:22 GRIB to CDM+CF via cfgrib-0....
>>>ds['gh']
<xarray.DataArray 'gh' (latitude: 181, longitude: 360)>
array([[5564.623 , 5564.623 , 5564.623 , ..., 5564.623 , 5564.623 , 5564.623 ],
[5556.423 , 5556.483 , 5556.543 , ..., 5556.1626, 5556.2427, 5556.3228],
[5549.503 , 5549.6626, 5549.8228, ..., 5549.023 , 5549.1826, 5549.343 ],
...,
[4849.483 , 4849.3027, 4849.123 , ..., 4849.943 , 4849.8027, 4849.6426],
[4856.1826, 4856.083 , 4856.023 , ..., 4856.443 , 4856.343 , 4856.2627],
[4865.2427, 4865.2427, 4865.2427, ..., 4865.2427, 4865.2427, 4865.2427]],
dtype=float32)
Coordinates:
time datetime64[ns] ...
step timedelta64[ns] ...
isobaricInhPa int64 ...
* latitude (latitude) float64 90.0 89.0 88.0 87.0 ... -88.0 -89.0 -90.0
* longitude (longitude) float64 0.0 1.0 2.0 3.0 ... 357.0 358.0 359.0
valid_time datetime64[ns] ...
Attributes:
GRIB_paramId: 156
GRIB_shortName: gh
GRIB_units: gpm
GRIB_name: Geopotential Height
GRIB_cfName: geopotential_height
GRIB_cfVarName: gh
GRIB_dataType: an
GRIB_missingValue: 9999
GRIB_numberOfPoints: 65160
GRIB_typeOfLevel: isobaricInhPa
GRIB_NV: 0
GRIB_stepUnits: 1
GRIB_stepType: instant
GRIB_gridType: regular_ll
GRIB_gridDefinitionDescription: Latitude/longitude. Also called...
GRIB_Nx: 360
GRIB_iDirectionIncrementInDegrees: 1.0
GRIB_iScansNegatively: 0
GRIB_longitudeOfFirstGridPointInDegrees: 0.0
GRIB_longitudeOfLastGridPointInDegrees: 359.0
GRIB_Ny: 181
GRIB_jDirectionIncrementInDegrees: 1.0
GRIB_jPointsAreConsecutive: 0
GRIB_jScansPositively: 0
GRIB_latitudeOfFirstGridPointInDegrees: 90.0
GRIB_latitudeOfLastGridPointInDegrees: -90.0
long_name: Geopotential Height
units: gpm
standard_name: geopotential_height