ESGF: Subset CMIP6 Datasets
Search CMIP6 Dataset with ESGF pyclient#
Using esgf-pyclient: https://esgf-pyclient.readthedocs.io/en/latest/index.html
from pyesgf.search import SearchConnection
conn = SearchConnection('https://esgf-data.dkrz.de/esg-search', distrib=True)
ctx = conn.new_context(
    project='CMIP6',
    source_id='MPI-ESM1-2-LR',
    experiment_id='historical',
    variable='tas',
    frequency='mon',
    variant_label='r1i1p1f1',
    data_node='esgf3.dkrz.de')
ctx.hit_count
-------------------------------------------------------------------------------
Warning - defaulting to search with facets=*
This behavior is kept for backward-compatibility, but ESGF indexes might not
successfully perform a distributed search when this option is used, so some
results may be missing. For full results, it is recommended to pass a list of
facets of interest when instantiating a context object. For example,
ctx = conn.new_context(facets='project,experiment_id')
Only the facets that you specify will be present in the facets_counts dictionary.
This warning is displayed when a distributed search is performed while using the
facets=* default, a maximum of once per context object. To suppress this warning,
set the environment variable ESGF_PYCLIENT_NO_FACETS_STAR_WARNING to any value
or explicitly use conn.new_context(facets='*')
-------------------------------------------------------------------------------
1
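The warning above suggests two ways to avoid the `facets=*` default: pass an explicit facet list to `new_context`, or set `ESGF_PYCLIENT_NO_FACETS_STAR_WARNING`. The following is a minimal sketch of both, assuming the same constraints as the query above. The helper `new_context_with_facets` is a hypothetical name, and it is only defined here, not called, so the block runs without a network connection. Setting the environment variable early (before creating the context) is an assumption about when the library checks it.

```python
import os

# Search constraints copied from the query above.
constraints = {
    "project": "CMIP6",
    "source_id": "MPI-ESM1-2-LR",
    "experiment_id": "historical",
    "variable": "tas",
    "frequency": "mon",
    "variant_label": "r1i1p1f1",
    "data_node": "esgf3.dkrz.de",
}

# Ask only for the facets that are actually constrained, instead of facets='*'.
facets = ",".join(constraints)


def new_context_with_facets(conn):
    """Hypothetical helper: open a context on an existing SearchConnection."""
    return conn.new_context(facets=facets, **constraints)


# Alternative named in the warning text: silence the warning via the environment.
os.environ["ESGF_PYCLIENT_NO_FACETS_STAR_WARNING"] = "1"

print(facets)
```

With a live connection, `ctx = new_context_with_facets(conn)` would replace the `conn.new_context(...)` call above; only the listed facets would then appear in the facet counts.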
result = ctx.search()[0]
result.dataset_id
'CMIP6.CMIP.MPI-M.MPI-ESM1-2-LR.historical.r1i1p1f1.Amon.tas.gn.v20190710|esgf3.dkrz.de'
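The dataset ID printed above packs the CMIP6 directory structure components and the data node into one string. As an illustration, it can be split into named parts as sketched below. `parse_dataset_id` and `DRS_KEYS` are hypothetical names, and the sketch assumes that no component contains a dot, which holds for this ID.

```python
# CMIP6 DRS components, in the order they appear in a dataset ID.
DRS_KEYS = (
    "mip_era", "activity_id", "institution_id", "source_id", "experiment_id",
    "member_id", "table_id", "variable_id", "grid_label", "version",
)


def parse_dataset_id(dataset_id):
    """Split '<drs>|<data_node>' into a dict of named components."""
    drs, _, data_node = dataset_id.partition("|")
    parts = drs.split(".")
    if len(parts) != len(DRS_KEYS):
        raise ValueError(f"expected {len(DRS_KEYS)} components, got {len(parts)}")
    info = dict(zip(DRS_KEYS, parts))
    info["data_node"] = data_node
    return info


info = parse_dataset_id(
    "CMIP6.CMIP.MPI-M.MPI-ESM1-2-LR.historical.r1i1p1f1.Amon.tas.gn.v20190710|esgf3.dkrz.de"
)
print(info["table_id"], info["version"], info["data_node"])
# prints: Amon v20190710 esgf3.dkrz.de
```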
files = result.file_context().search()
for file in files:
    print(file.opendap_url)
http://esgf3.dkrz.de/thredds/dodsC/cmip6/CMIP/MPI-M/MPI-ESM1-2-LR/historical/r1i1p1f1/Amon/tas/gn/v20190710/tas_Amon_MPI-ESM1-2-LR_historical_r1i1p1f1_gn_185001-186912.nc
http://esgf3.dkrz.de/thredds/dodsC/cmip6/CMIP/MPI-M/MPI-ESM1-2-LR/historical/r1i1p1f1/Amon/tas/gn/v20190710/tas_Amon_MPI-ESM1-2-LR_historical_r1i1p1f1_gn_187001-188912.nc
http://esgf3.dkrz.de/thredds/dodsC/cmip6/CMIP/MPI-M/MPI-ESM1-2-LR/historical/r1i1p1f1/Amon/tas/gn/v20190710/tas_Amon_MPI-ESM1-2-LR_historical_r1i1p1f1_gn_189001-190912.nc
http://esgf3.dkrz.de/thredds/dodsC/cmip6/CMIP/MPI-M/MPI-ESM1-2-LR/historical/r1i1p1f1/Amon/tas/gn/v20190710/tas_Amon_MPI-ESM1-2-LR_historical_r1i1p1f1_gn_191001-192912.nc
http://esgf3.dkrz.de/thredds/dodsC/cmip6/CMIP/MPI-M/MPI-ESM1-2-LR/historical/r1i1p1f1/Amon/tas/gn/v20190710/tas_Amon_MPI-ESM1-2-LR_historical_r1i1p1f1_gn_193001-194912.nc
http://esgf3.dkrz.de/thredds/dodsC/cmip6/CMIP/MPI-M/MPI-ESM1-2-LR/historical/r1i1p1f1/Amon/tas/gn/v20190710/tas_Amon_MPI-ESM1-2-LR_historical_r1i1p1f1_gn_195001-196912.nc
http://esgf3.dkrz.de/thredds/dodsC/cmip6/CMIP/MPI-M/MPI-ESM1-2-LR/historical/r1i1p1f1/Amon/tas/gn/v20190710/tas_Amon_MPI-ESM1-2-LR_historical_r1i1p1f1_gn_197001-198912.nc
http://esgf3.dkrz.de/thredds/dodsC/cmip6/CMIP/MPI-M/MPI-ESM1-2-LR/historical/r1i1p1f1/Amon/tas/gn/v20190710/tas_Amon_MPI-ESM1-2-LR_historical_r1i1p1f1_gn_199001-200912.nc
http://esgf3.dkrz.de/thredds/dodsC/cmip6/CMIP/MPI-M/MPI-ESM1-2-LR/historical/r1i1p1f1/Amon/tas/gn/v20190710/tas_Amon_MPI-ESM1-2-LR_historical_r1i1p1f1_gn_201001-201412.nc
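Each file name in the listing ends with the month range it covers (`YYYYMM-YYYYMM`), so the files needed for a given period can be picked before anything is opened. A small sketch follows. `month_range` and `overlapping` are hypothetical helpers, and the regular expression assumes the monthly naming pattern seen in the URLs above.

```python
import re

# Trailing '_<start>-<end>.nc' of the monthly file names, e.g. '_185001-186912.nc'.
_RANGE = re.compile(r"_(\d{6})-(\d{6})\.nc$")


def month_range(url):
    """Return (start, end) as YYYYMM integers parsed from a file name or URL."""
    match = _RANGE.search(url)
    if match is None:
        raise ValueError(f"no YYYYMM-YYYYMM range in {url!r}")
    return int(match.group(1)), int(match.group(2))


def overlapping(urls, start, end):
    """Keep the URLs whose month range intersects [start, end] (YYYYMM integers)."""
    selected = []
    for url in urls:
        first, last = month_range(url)
        if first <= end and last >= start:
            selected.append(url)
    return selected


base = ("http://esgf3.dkrz.de/thredds/dodsC/cmip6/CMIP/MPI-M/MPI-ESM1-2-LR/"
        "historical/r1i1p1f1/Amon/tas/gn/v20190710/")
names = [
    "tas_Amon_MPI-ESM1-2-LR_historical_r1i1p1f1_gn_185001-186912.nc",
    "tas_Amon_MPI-ESM1-2-LR_historical_r1i1p1f1_gn_187001-188912.nc",
    "tas_Amon_MPI-ESM1-2-LR_historical_r1i1p1f1_gn_189001-190912.nc",
]
urls = [base + name for name in names]

# Files needed for January 1865 through December 1875.
for url in overlapping(urls, 186501, 187512):
    print(month_range(url))
# prints: (185001, 186912)
#         (187001, 188912)
```

In the notebook, the same filter could be applied to `[f.opendap_url for f in files]`.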
Subset single dataset with xarray#
Using OPeNDAP: http://xarray.pydata.org/en/stable/io.html?highlight=opendap#opendap
import xarray as xr
ds = xr.open_dataset(files[0].opendap_url, chunks={'time': 120})
ds
<xarray.Dataset>
Dimensions:    (time: 240, bnds: 2, lat: 96, lon: 192)
Coordinates:
  * time       (time) datetime64[ns] 1850-01-16T12:00:00 ... 1869-12-16T12:00:00
  * lat        (lat) float64 -88.57 -86.72 -84.86 -83.0 ... 84.86 86.72 88.57
  * lon        (lon) float64 0.0 1.875 3.75 5.625 ... 352.5 354.4 356.2 358.1
    height     float64 ...
Dimensions without coordinates: bnds
Data variables:
    time_bnds  (time, bnds) datetime64[ns] dask.array<chunksize=(120, 2), meta=np.ndarray>
    lat_bnds   (lat, bnds) float64 dask.array<chunksize=(96, 2), meta=np.ndarray>
    lon_bnds   (lon, bnds) float64 dask.array<chunksize=(192, 2), meta=np.ndarray>
    tas        (time, lat, lon) float32 dask.array<chunksize=(120, 96, 192), meta=np.ndarray>
Attributes: (12/48)
    Conventions:                     CF-1.7 CMIP-6.2
    activity_id:                     CMIP
    branch_method:                   standard
    branch_time_in_child:            0.0
    branch_time_in_parent:           0.0
    contact:                         cmip6-mpi-esm@dkrz.de
    ...                              ...
    variable_id:                     tas
    variant_label:                   r1i1p1f1
    license:                         CMIP6 model data produced by MPI-M is li...
    cmor_version:                    3.5.0
    tracking_id:                     hdl:21.14100/6b679cba-17b8-45eb-90dc-23d...
    DODS_EXTRA.Unlimited_Dimension:  time
da = ds['tas']
da = da.isel(time=slice(0, 1))
da = da.sel(lat=slice(-50, 50), lon=slice(0, 50))
#%matplotlib inline
da.plot()
<matplotlib.collections.QuadMesh at 0x7fff543b2620>
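The subset above mixes two kinds of indexing: `isel` selects by integer position (the end of the slice is excluded), while `sel` selects by coordinate label (both ends of the slice are included). The sketch below shows the difference on a small synthetic array, so it runs without network access. The coarse 10° by 30° grid and the name `demo` are assumptions for illustration, not the model grid.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the remote data: coarse grid, three monthly time steps.
time = np.array(["1850-01-16", "1850-02-15", "1850-03-16"], dtype="datetime64[ns]")
lat = np.arange(-90.0, 91.0, 10.0)   # 19 latitudes, ascending like the real grid
lon = np.arange(0.0, 360.0, 30.0)    # 12 longitudes, 0 to 330
data = np.zeros((time.size, lat.size, lon.size), dtype="float32")
demo = xr.DataArray(
    data,
    coords={"time": time, "lat": lat, "lon": lon},
    dims=("time", "lat", "lon"),
    name="tas",
)

# Positional selection: slice(0, 1) keeps only index 0.
first_step = demo.isel(time=slice(0, 1))
# Label selection: both -50 and 50 are part of the result.
box = first_step.sel(lat=slice(-50, 50), lon=slice(0, 50))

print(box.shape)
# prints: (1, 11, 2)
```

Label slices follow the order of the coordinate. On a grid whose latitudes run from north to south, the equivalent selection would be `lat=slice(50, -50)`.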
Subset over multiple datasets#
ds_agg = xr.open_mfdataset([files[0].opendap_url, files[1].opendap_url], chunks={'time': 120}, combine='nested', concat_dim='time')
ds_agg
<xarray.Dataset>
Dimensions:    (time: 480, bnds: 2, lat: 96, lon: 192)
Coordinates:
  * time       (time) datetime64[ns] 1850-01-16T12:00:00 ... 1889-12-16T12:00:00
  * lat        (lat) float64 -88.57 -86.72 -84.86 -83.0 ... 84.86 86.72 88.57
  * lon        (lon) float64 0.0 1.875 3.75 5.625 ... 352.5 354.4 356.2 358.1
    height     float64 2.0
Dimensions without coordinates: bnds
Data variables:
    time_bnds  (time, bnds) datetime64[ns] dask.array<chunksize=(120, 2), meta=np.ndarray>
    lat_bnds   (time, lat, bnds) float64 dask.array<chunksize=(240, 96, 2), meta=np.ndarray>
    lon_bnds   (time, lon, bnds) float64 dask.array<chunksize=(240, 192, 2), meta=np.ndarray>
    tas        (time, lat, lon) float32 dask.array<chunksize=(120, 96, 192), meta=np.ndarray>
Attributes: (12/48)
    Conventions:                     CF-1.7 CMIP-6.2
    activity_id:                     CMIP
    branch_method:                   standard
    branch_time_in_child:            0.0
    branch_time_in_parent:           0.0
    contact:                         cmip6-mpi-esm@dkrz.de
    ...                              ...
    variable_id:                     tas
    variant_label:                   r1i1p1f1
    license:                         CMIP6 model data produced by MPI-M is li...
    cmor_version:                    3.5.0
    tracking_id:                     hdl:21.14100/6b679cba-17b8-45eb-90dc-23d...
    DODS_EXTRA.Unlimited_Dimension:  time
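In the aggregated output above, `lat_bnds` and `lon_bnds` have picked up a `time` dimension that the single-file dataset did not have. This is what happens when every data variable is concatenated, including those without a time axis. The sketch below reproduces the effect offline with `xr.concat` on two tiny synthetic datasets and shows that `data_vars='minimal'` leaves the bounds untouched. `make_part` is a hypothetical helper. `open_mfdataset` accepts the same `data_vars`, `coords` and `compat` keywords, though whether that is the right choice for a given analysis is a judgment call.

```python
import numpy as np
import xarray as xr


def make_part(first_month):
    """Build a tiny synthetic dataset: two time steps, time-independent bounds."""
    time = np.array(
        [f"1850-{first_month:02d}-16", f"1850-{first_month + 1:02d}-16"],
        dtype="datetime64[ns]",
    )
    return xr.Dataset(
        {
            "tas": (("time", "lat"), np.zeros((2, 2), dtype="float32")),
            "lat_bnds": (("lat", "bnds"), np.array([[-90.0, 0.0], [0.0, 90.0]])),
        },
        coords={"time": time, "lat": np.array([-45.0, 45.0])},
    )


parts = [make_part(1), make_part(3)]

# Concatenate every data variable: lat_bnds gains a time axis.
all_vars = xr.concat(parts, dim="time", data_vars="all")

# Concatenate only variables that already have a time axis.
minimal = xr.concat(
    parts, dim="time", data_vars="minimal", coords="minimal", compat="override"
)

print("time" in all_vars["lat_bnds"].dims)
print("time" in minimal["lat_bnds"].dims)
# prints: True
#         False
```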
da = ds_agg['tas']
da = da.isel(time=slice(0, 1))
da = da.sel(lat=slice(-50, 50), lon=slice(0, 50))
da.plot()
<matplotlib.collections.QuadMesh at 0x7fff5413ace0>
Download dataset#
da.to_netcdf('tas_africa_18500116.nc')
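A file name that encodes the date is easier to keep correct when it is derived from the data's own time coordinate. The sketch below builds such a name on a synthetic one-step array standing in for `da`. The variable `subset` and the `_africa_` naming pattern are assumptions for illustration, and nothing is written to disk here.

```python
import numpy as np
import xarray as xr

# Synthetic one-step subset standing in for `da`.
time = np.array(["1850-01-16T12:00:00"], dtype="datetime64[ns]")
subset = xr.DataArray(
    np.zeros((1, 2, 2), dtype="float32"),
    coords={"time": time, "lat": [-10.0, 10.0], "lon": [0.0, 30.0]},
    dims=("time", "lat", "lon"),
    name="tas",
)

# Format the first time stamp as YYYYMMDD and use it in the file name.
stamp = subset["time"].dt.strftime("%Y%m%d").values[0]
filename = f"{subset.name}_africa_{stamp}.nc"

print(filename)
# prints: tas_africa_18500116.nc
```

The resulting name could then be passed to `to_netcdf` in place of a hand-typed string.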