pycontrails.core.met¶
Meteorology data models.
Functions
|
Downselect |
|
Possibly downselect |
Check if data appears to have originated from an ECMWF source. |
|
|
Shift longitude values from any input domain to [bound, 360 + bound) domain. |
|
Rename all variables in dataset from short name to standard name. |
Classes
|
Abstract class for building Meteorology Data handling classes. |
|
Meteorological DataArray of single variable. |
|
Meteorological dataset with multiple variables. |
- class pycontrails.core.met.MetBase¶
Bases:
ABC
,Generic
[XArrayType
]Abstract class for building Meteorology Data handling classes.
All support here should be generic to work on xr.DataArray and xr.Dataset.
- property attrs¶
Pass through to
self.data.attrs
.
- abstract broadcast_coords(name)¶
Broadcast coordinates along other dimensions.
- Parameters:
name (
str
) – Coordinate/dimension name to broadcast. Can be a dimension or non-dimension coordinates.- Returns:
xarray.DataArray
– DataArray of the coordinate broadcasted along all other dimensions. The DataArray will have the same shape as the gridded data.
- cachestore¶
Cache datastore to use for
save()
orload()
- property coords¶
Get coordinates of underlying
data
coordinates.Only return non-dimension coordinates.
See: http://xarray.pydata.org/en/stable/user-guide/data-structures.html#coordinates
- Returns:
dict[str
,np.ndarray]
– Dictionary of coordinates
- copy()¶
Create a shallow copy of the current class.
See
xarray.Dataset.copy()
for reference.- Returns:
Self
– Copy of the current class
- data¶
DataArray or Dataset
- dim_order = ('longitude', 'latitude', 'level', 'time')¶
Default dimension order for DataArray or Dataset (x, y, z, t)
- downselect(bbox)¶
Downselect met data within spatial bounding box.
- Parameters:
bbox (
list[float]
) – List of coordinates defining a spatial bounding box in WGS84 coordinates. For 2D queries, list is [west, south, east, north]. For 3D queries, list is [west, south, min-level, east, north, max-level] with level defined in [\(hPa\)].- Returns:
Self
– Return downselected data
- downselect_met(met, *, longitude_buffer=(0.0, 0.0), latitude_buffer=(0.0, 0.0), level_buffer=(0.0, 0.0), time_buffer=(np.timedelta64(0, 'h'), np.timedelta64(0, 'h')))¶
Downselect
met
to encompass a spatiotemporal region of the data.Warning
This method is analogous to
GeoVectorDataset.downselect_met()
. It does not change the instance data, but instead operates on themet
input. This method is different fromdownselect()
which operates on the instance data.Changed in version 0.54.5: Data is no longer copied when downselecting.
- Parameters:
met (
MetDataset | MetDataArray
) – MetDataset or MetDataArray to downselect.longitude_buffer (
tuple[float
,float]
, optional) – Extend longitude domain past bylongitude_buffer[0]
on the low side andlongitude_buffer[1]
on the high side. Units must be the same as class coordinates. Defaults to(0, 0)
degrees.latitude_buffer (
tuple[float
,float]
, optional) – Extend latitude domain past bylatitude_buffer[0]
on the low side andlatitude_buffer[1]
on the high side. Units must be the same as class coordinates. Defaults to(0, 0)
degrees.level_buffer (
tuple[float
,float]
, optional) – Extend level domain past bylevel_buffer[0]
on the low side andlevel_buffer[1]
on the high side. Units must be the same as class coordinates. Defaults to(0, 0)
[\(hPa\)].time_buffer (
tuple[np.timedelta64
,np.timedelta64]
, optional) – Extend time domain past bytime_buffer[0]
on the low side andtime_buffer[1]
on the high side. Units must be the same as class coordinates. Defaults to(np.timedelta64(0, "h"), np.timedelta64(0, "h"))
.
- Returns:
MetDataset | MetDataArray
– Copy of downselected MetDataset or MetDataArray.
- property hash¶
Generate a unique hash for this met instance.
Note this is not as robust as it could be since repr cuts off.
- Returns:
str
– Unique hash for met instance (sha1)
- property indexes¶
Low level access to underlying
data
indexes.This method is typically is faster for accessing coordinate indexes.
Added in version 0.25.2.
- Returns:
dict[Hashable
,pd.Index]
– Dictionary of indexes.
Examples
>>> from pycontrails.datalib.ecmwf import ERA5 >>> times = (datetime(2022, 3, 1, 12), datetime(2022, 3, 1, 13)) >>> variables = "air_temperature", "specific_humidity" >>> levels = [200, 300] >>> era5 = ERA5(times, variables, levels) >>> mds = era5.open_metdataset() >>> mds.indexes["level"].to_numpy() array([200., 300.])
>>> mda = mds["air_temperature"] >>> mda.indexes["level"].to_numpy() array([200., 300.])
- property is_single_level¶
Check if instance contains “single level” or “surface level” data.
This method checks if
level
dimension contains a single value equal to -1, the pycontrails convention for surface only data.- Returns:
bool
– If instance contains single level data.
- property is_wrapped¶
Check if the longitude dimension covers the closed interval
[-180, 180]
.Assumes the longitude dimension is sorted (this is established by the
MetDataset
orMetDataArray
constructor).- Returns:
bool
– True if longitude coordinates cover[-180, 180]
- property is_zarr¶
Check if underlying
data
is sourced from a Zarr group.Implementation is very brittle, and may break as external libraries change.
Some
dask
intermediate artifact is cached when this is called. Typically, subsequent calls to this method are much faster than the initial call.Added in version 0.26.0.
- Returns:
bool
– Ifdata
is based on a Zarr group.
- abstract property shape¶
Return the shape of the dimensions.
- abstract property size¶
Return the size of (each) array in underlying
data
.- Returns:
int
– Total number of grid points in underlying data
- wrap_longitude()¶
Wrap longitude coordinates.
- Returns:
Self
– Copy of instance with wrapped longitude values. Returns copy of data when longitude values are already wrapped
- class pycontrails.core.met.MetDataArray(data, cachestore=None, wrap_longitude=False, copy=True, name=None)¶
Bases:
MetBase
Meteorological DataArray of single variable.
Wrapper around
xarray.DataArray
to enforce certain variables and dimensions for internal usage.Changed in version 0.54.4: Remove
validate
parameter. Validation is now always performed.- Parameters:
data (
ArrayLike
) – xr.DataArray or other array-like data source. When array-like input is provided, input**kwargs
passed directly to xr.DataArray constructor.cachestore (
CacheStore
, optional) – Cache datastore for staging intermediates withsave()
. Defaults to DiskCacheStore.wrap_longitude (
bool
, optional) – Wrap data along the longitude dimension. If True, duplicate and shift longitude values (ie, -180 -> 180) to ensure that the longitude dimension covers the entire interval[-180, 180]
. Defaults to False.copy (
bool
, optional) – Copy data parameter on construction, by default True. If data is lazy-loaded via dask, this parameter has no effect. If data is already loaded into memory, a copy of the data (rather than a view) may be created if True.name (
Hashable
, optional) – Name of the data variable. If not specified, the name will be set to “met”.
Examples
>>> import numpy as np >>> import xarray as xr >>> rng = np.random.default_rng(seed=456)
>>> # Cook up random xarray object >>> coords = { ... "longitude": np.arange(-20, 20), ... "latitude": np.arange(-30, 30), ... "level": [220, 240, 260, 280], ... "time": [np.datetime64("2021-08-01T12", "ns"), np.datetime64("2021-08-01T16", "ns")] ... } >>> da = xr.DataArray(rng.random((40, 60, 4, 2)), dims=coords.keys(), coords=coords)
>>> # Cast to `MetDataArray` in order to interpolate >>> from pycontrails import MetDataArray >>> mda = MetDataArray(da) >>> mda.interpolate(-11.4, 5.7, 234, np.datetime64("2021-08-01T13")) array([0.52358215])
>>> mda.interpolate(-11.4, 5.7, 234, np.datetime64("2021-08-01T13"), method='nearest') array([0.4188465])
>>> da.sel(longitude=-11, latitude=6, level=240, time=np.datetime64("2021-08-01T12")).item() 0.41884649899766946
- property attrs¶
Pass through to
self.data.attrs
.
- property binary¶
Determine if all data is a binary value (0, 1).
- Returns:
bool
– True if all data values are binary value (0, 1)
- broadcast_coords(name)¶
Broadcast coordinates along other dimensions.
- Parameters:
name (
str
) – Coordinate/dimension name to broadcast. Can be a dimension or non-dimension coordinates.- Returns:
xarray.DataArray
– DataArray of the coordinate broadcasted along all other dimensions. The DataArray will have the same shape as the gridded data.
- property coords¶
Get coordinates of underlying
data
coordinates.Only return non-dimension coordinates.
See: http://xarray.pydata.org/en/stable/user-guide/data-structures.html#coordinates
- Returns:
dict[str
,np.ndarray]
– Dictionary of coordinates
- copy()¶
Create a shallow copy of the current class.
See
xarray.Dataset.copy()
for reference.- Returns:
Self
– Copy of the current class
- data¶
DataArray or Dataset
- dim_order = ('longitude', 'latitude', 'level', 'time')¶
Default dimension order for DataArray or Dataset (x, y, z, t)
- downselect(bbox)¶
Downselect met data within spatial bounding box.
- Parameters:
bbox (
list[float]
) – List of coordinates defining a spatial bounding box in WGS84 coordinates. For 2D queries, list is [west, south, east, north]. For 3D queries, list is [west, south, min-level, east, north, max-level] with level defined in [\(hPa\)].- Returns:
Self
– Return downselected data
- downselect_met(met, *, longitude_buffer=(0.0, 0.0), latitude_buffer=(0.0, 0.0), level_buffer=(0.0, 0.0), time_buffer=(np.timedelta64(0, 'h'), np.timedelta64(0, 'h')))¶
Downselect
met
to encompass a spatiotemporal region of the data.Warning
This method is analogous to
GeoVectorDataset.downselect_met()
. It does not change the instance data, but instead operates on themet
input. This method is different fromdownselect()
which operates on the instance data.Changed in version 0.54.5: Data is no longer copied when downselecting.
- Parameters:
met (
MetDataset | MetDataArray
) – MetDataset or MetDataArray to downselect.longitude_buffer (
tuple[float
,float]
, optional) – Extend longitude domain past bylongitude_buffer[0]
on the low side andlongitude_buffer[1]
on the high side. Units must be the same as class coordinates. Defaults to(0, 0)
degrees.latitude_buffer (
tuple[float
,float]
, optional) – Extend latitude domain past bylatitude_buffer[0]
on the low side andlatitude_buffer[1]
on the high side. Units must be the same as class coordinates. Defaults to(0, 0)
degrees.level_buffer (
tuple[float
,float]
, optional) – Extend level domain past bylevel_buffer[0]
on the low side andlevel_buffer[1]
on the high side. Units must be the same as class coordinates. Defaults to(0, 0)
[\(hPa\)].time_buffer (
tuple[np.timedelta64
,np.timedelta64]
, optional) – Extend time domain past bytime_buffer[0]
on the low side andtime_buffer[1]
on the high side. Units must be the same as class coordinates. Defaults to(np.timedelta64(0, "h"), np.timedelta64(0, "h"))
.
- Returns:
MetDataset | MetDataArray
– Copy of downselected MetDataset or MetDataArray.
- find_edges()¶
Find edges of regions.
- Returns:
Self
– MetDataArray with a binary field, 1 on the edge of the regions, 0 outside and inside the regions.- Raises:
NotImplementedError – If the instance is not binary.
- property hash¶
Generate a unique hash for this met instance.
Note this is not as robust as it could be since repr cuts off.
- Returns:
str
– Unique hash for met instance (sha1)
- property in_memory¶
Check if underlying
data
is loaded into memory.This method uses protected attributes of underlying xarray objects, and may be subject to deprecation.
Changed in version 0.26.0: Rename from
is_loaded
toin_memory
.- Returns:
bool
– If underlying data exists as an np.ndarray in memory.
- property indexes¶
Low level access to underlying
data
indexes.This method is typically is faster for accessing coordinate indexes.
Added in version 0.25.2.
- Returns:
dict[Hashable
,pd.Index]
– Dictionary of indexes.
Examples
>>> from pycontrails.datalib.ecmwf import ERA5 >>> times = (datetime(2022, 3, 1, 12), datetime(2022, 3, 1, 13)) >>> variables = "air_temperature", "specific_humidity" >>> levels = [200, 300] >>> era5 = ERA5(times, variables, levels) >>> mds = era5.open_metdataset() >>> mds.indexes["level"].to_numpy() array([200., 300.])
>>> mda = mds["air_temperature"] >>> mda.indexes["level"].to_numpy() array([200., 300.])
- interpolate(longitude, latitude, level, time, *, method='linear', bounds_error=False, fill_value=nan, localize=False, lowmem=False, indices=None, return_indices=False)¶
Interpolate values over underlying DataArray.
Zero dimensional coordinates are reshaped to 1D arrays.
If
lowmem == False
, method automatically loads underlyingdata
into memory. Otherwise, method iterates through smaller subsets ofdata
and releases subsets from memory once interpolation against each subset is finished.If
method == "nearest"
, the out array will have the samedtype
as the underlyingdata
.If
method == "linear"
, the out array will be promoted to the most precisedtype
of:underlying
data
data.longitude
data.latitude
data.level
longitude
latitude
Added in version 0.24: This method can now handle singleton dimensions with
method == "linear"
. Previously these degenerate dimensions caused nan values to be returned.- Parameters:
longitude (
float | npt.NDArray[np.floating]
) – Longitude values to interpolate. Assumed to be 0 or 1 dimensional.latitude (
float | npt.NDArray[np.floating]
) – Latitude values to interpolate. Assumed to be 0 or 1 dimensional.level (
float | npt.NDArray[np.floating]
) – Level values to interpolate. Assumed to be 0 or 1 dimensional.time (
np.datetime64 | npt.NDArray[np.datetime64]
) – Time values to interpolate. Assumed to be 0 or 1 dimensional.method (
str
, optional) – Additional keyword arguments to pass toscipy.interpolate.RegularGridInterpolator
. Defaults to “linear”.bounds_error (
bool
, optional) – Additional keyword arguments to pass toscipy.interpolate.RegularGridInterpolator
. Defaults toFalse
.fill_value (
float | np.float64
, optional) – Additional keyword arguments to pass toscipy.interpolate.RegularGridInterpolator
. Set to None to extrapolate outside the boundary whenmethod
isnearest
. Defaults tonp.nan
.localize (
bool
, optional) – Experimental. If True, downselect gridded data to smallest bounding box containing all points. By default False.lowmem (
bool
, optional) – Experimental. If True, iterate through points binned by the time coordinate of the grided data, and downselect gridded data to the smallest bounding box containing each binned set of point before loading into memory. This can significantly reduce memory consumption with large numbers of points at the cost of increased runtime. By default False.indices (
tuple | None
, optional) – Experimental. Seeinterpolation.interp()
. None by default.return_indices (
bool
, optional) – Experimental. Seeinterpolation.interp()
. False by default. Note that values returned differ whenlowmem=True
andlowmem=False
, so output should only be re-used in calls with the samelowmem
value.
- Returns:
numpy.ndarray
– Interpolated values
See also
GeoVectorDataset.intersect_met()
Examples
>>> from datetime import datetime >>> import numpy as np >>> import pandas as pd >>> from pycontrails.datalib.ecmwf import ERA5
>>> times = (datetime(2022, 3, 1, 12), datetime(2022, 3, 1, 15)) >>> variables = "air_temperature" >>> levels = [200, 250, 300] >>> era5 = ERA5(times, variables, levels) >>> met = era5.open_metdataset() >>> mda = met["air_temperature"]
>>> # Interpolation at a grid point agrees with value >>> mda.interpolate(1, 2, 300, np.datetime64('2022-03-01T14:00')) array([241.91972984])
>>> da = mda.data >>> da.sel(longitude=1, latitude=2, level=300, time=np.datetime64('2022-03-01T14')).item() 241.9197298421629
>>> # Interpolation off grid >>> mda.interpolate(1.1, 2.1, 290, np.datetime64('2022-03-01 13:10')) array([239.83793798])
>>> # Interpolate along path >>> longitude = np.linspace(1, 2, 10) >>> latitude = np.linspace(2, 3, 10) >>> level = np.linspace(200, 300, 10) >>> time = pd.date_range("2022-03-01T14", periods=10, freq="5min") >>> mda.interpolate(longitude, latitude, level, time) array([220.44347694, 223.08900738, 225.74338924, 228.41642088, 231.10858599, 233.54857391, 235.71504913, 237.86478872, 239.99274623, 242.10792167])
>>> # Can easily switch to alternative low-memory implementation >>> mda.interpolate(longitude, latitude, level, time, lowmem=True) array([220.44347694, 223.08900738, 225.74338924, 228.41642088, 231.10858599, 233.54857391, 235.71504913, 237.86478872, 239.99274623, 242.10792167])
- property is_single_level¶
Check if instance contains “single level” or “surface level” data.
This method checks if
level
dimension contains a single value equal to -1, the pycontrails convention for surface only data.- Returns:
bool
– If instance contains single level data.
- property is_wrapped¶
Check if the longitude dimension covers the closed interval
[-180, 180]
.Assumes the longitude dimension is sorted (this is established by the
MetDataset
orMetDataArray
constructor).- Returns:
bool
– True if longitude coordinates cover[-180, 180]
- property is_zarr¶
Check if underlying
data
is sourced from a Zarr group.Implementation is very brittle, and may break as external libraries change.
Some
dask
intermediate artifact is cached when this is called. Typically, subsequent calls to this method are much faster than the initial call.Added in version 0.26.0.
- Returns:
bool
– Ifdata
is based on a Zarr group.
- classmethod load(hash, cachestore=None, chunks=None)¶
Load saved intermediate from
cachestore
.- Parameters:
hash (
str
) – Saved hash to load.cachestore (
CacheStore
, optional) – Cache datastore to use for sourcing files. Defaults to DiskCacheStore.chunks (
dict[str
,int]
, optional) – Chunks kwarg passed toxarray.open_mfdataset()
when opening files.
- Returns:
MetDataArray
– New MetDataArray with loaded data.
- property name¶
Return the DataArray name.
- Returns:
Hashable
– DataArray name
- property proportion¶
Compute proportion of points with value 1.
- Returns:
float
– Proportion of points with value 1- Raises:
NotImplementedError – If instance does not contain binary data.
- save(**kwargs)¶
Save intermediate to
cachestore
as netcdf.Load and restore using
load()
.- Parameters:
**kwargs (
Any
) – Keyword arguments passed directly toxarray.save_mfdataset()
- Returns:
list[str]
– Returns filenames of saved files
- property shape¶
Return the shape of the dimensions.
- property size¶
Return the size of (each) array in underlying
data
.- Returns:
int
– Total number of grid points in underlying data
- to_polygon_feature(level=None, time=None, fill_value=nan, iso_value=None, min_area=0.0, epsilon=0.0, lower_bound=True, precision=None, interiors=True, convex_hull=False, include_altitude=False, properties=None)¶
Create GeoJSON Feature artifact from spatial array on a single level and time slice.
Computed polygons always contain an exterior linear ring as defined by the GeoJSON Polygon specification <https://www.rfc-editor.org/rfc/rfc7946.html#section-3.1.6>. Polygons may also contain interior linear rings (holes). This method does not support nesting beyond the GeoJSON specification. See the
pycontrails.core.polygon
for additional polygon support.Changed in version 0.25.12: Previous implementation include several additional parameters which have been removed:
The
approximate
parameterAn
path
parameter to save output as JSONPassing arbitrary kwargs to
skimage.measure.find_contours()
.
New implementation includes new parameters previously lacking:
fill_value
min_area
include_altitude
Changed in version 0.38.0: Change default value of
epsilon
from 0.15 to 0.Changed in version 0.41.0: Convert continuous fields to binary fields before computing polygons. The parameters
max_area
andepsilon
are now expressed in terms of longitude/latitude units instead of pixels.- Parameters:
level (
float
, optional) – Level slice to create polygons. If the “level” coordinate is length 1, then the single level slice will be selected automatically.time (
datetime
, optional) – Time slice to create polygons. If the “time” coordinate is length 1, then the single time slice will be selected automatically.fill_value (
float
, optional) – Value used for filling missing data and for padding the underlying data array. Set tonp.nan
by default, which ensures that regions with missing data are never included in polygons.iso_value (
float
, optional) – Value in field to create iso-surface. Defaults to the average of the min and max value of the array. (This is the same convention as used byskimage
.)min_area (
float
, optional) – Minimum area of each polygon. Polygons with area less thanmin_area
are not included in the output. The unit of this parameter is in longitude/latitude degrees squared. Set to 0 to omit any polygon filtering based on a minimal area conditional. By default, 0.0.epsilon (
float
, optional) – Control the extent to which the polygon is simplified. A value of 0 does not alter the geometry of the polygon. The unit of this parameter is in longitude/latitude degrees. By default, 0.0.lower_bound (
bool
, optional) – Whether to useiso_value
as a lower or upper bound on values in polygon interiors. By default, True.precision (
int
, optional) – Number of decimal places to round coordinates to. If None, no rounding is performed.interiors (
bool
, optional) – If True, include interior linear rings (holes) in the output. True by default.convex_hull (
bool
, optional) – EXPERIMENTAL. If True, compute the convex hull of each polygon. Only implemented for depth=1. False by default. A warning is issued if the underlying algorithm fails to make valid polygons after computing the convex hull.include_altitude (
bool
, optional) – If True, include the array altitude [\(m\)] as a z-coordinate in the GeoJSON output <https://www.rfc-editor.org/rfc/rfc7946#section-3.1.1>. False by default.properties (
dict
, optional) – Additional properties to include in the GeoJSON output. By default, None.
- Returns:
dict[str
,Any]
– Python representation of GeoJSON Feature with MultiPolygon geometry.
Notes
Cocip
andCocipGrid
set some quantities to 0 and other quantities tonp.nan
in regions where no contrails form. When computing polygons fromCocip
orCocipGrid
output, take care that the choice offill_value
correctly includes or excludes contrail-free regions. See theCocip
documentation for details aboutnp.nan
in model output.See also
to_polyhedra()
,polygons.find_multipolygons()
Examples
>>> from pprint import pprint >>> from pycontrails.datalib.ecmwf import ERA5 >>> era5 = ERA5("2022-03-01", variables="air_temperature", pressure_levels=250) >>> mda = era5.open_metdataset()["air_temperature"] >>> mda.shape (1440, 721, 1, 1)
>>> pprint(mda.to_polygon_feature(iso_value=239.5, precision=2, epsilon=0.1)) {'geometry': {'coordinates': [[[[167.88, -22.5], [167.75, -22.38], [167.62, -22.5], [167.75, -22.62], [167.88, -22.5]]], [[[43.38, -33.5], [43.5, -34.12], [43.62, -33.5], [43.5, -33.38], [43.38, -33.5]]]], 'type': 'MultiPolygon'}, 'properties': {}, 'type': 'Feature'}
- to_polygon_feature_collection(time=None, fill_value=nan, iso_value=None, min_area=0.0, epsilon=0.0, lower_bound=True, precision=None, interiors=True, convex_hull=False, include_altitude=False, properties=None)¶
Create GeoJSON FeatureCollection artifact from spatial array at time slice.
See the
to_polygon_feature()
method for a description of the parameters.- Returns:
dict[str
,Any]
– Python representation of GeoJSON FeatureCollection. This dictionary is comprised of individual GeoJON Features, one perself.data["level"]
.
- to_polyhedra(*, time=None, iso_value=0.0, simplify_fraction=1.0, lower_bound=True, return_type='geojson', path=None, altitude_scale=1.0, output_vertex_normals=False, closed=True)¶
Create a collection of polyhedra from spatial array corresponding to a single time slice.
- Parameters:
time (
datetime
, optional) – Time slice to create mesh. If the “time” coordinate is length 1, then the single time slice will be selected automatically.iso_value (
float
, optional) – Value in field to create iso-surface. Defaults to 0.simplify_fraction (
float
, optional) – Apply open3d simplify_quadric_decimation method to simplify the polyhedra geometry. This parameter must be in the half-open interval (0.0, 1.0]. Defaults to 1.0, corresponding to no reduction.lower_bound (
bool
, optional) – Whether to useiso_value
as a lower or upper bound on values in polyhedra interiors. By default, True.return_type (
str
, optional) – Must be one of “geojson” or “mesh”. Defaults to “geojson”. If “geojson”, this method returns a dictionary representation of a geojson MultiPolygon object whose polygons are polyhedra faces. If “mesh”, this method returns an open3d TriangleMesh instance.path (
str
, optional) – Output geojson or mesh to file. If return_type is “mesh”, see Open3D File I/O for Mesh for file type options.altitude_scale (
float
, optional) – Rescale the altitude dimension of the mesh, [\(m\)]output_vertex_normals (
bool
, optional) – Ifpath
is defined, write out vertex normals. Defaults to False.closed (
bool
, optional) – If True, pad spatial array along all axes to ensure polyhedra are “closed”. This flag often gives rise to cleaner visualizations. Defaults to True.
- Returns:
dict | open3d.geometry.TriangleMesh
– Python representation of geojson object or Open3D Triangle Mesh depending on the return_type parameter.- Raises:
ModuleNotFoundError – Method requires the vis optional dependencies
ValueError – If input parameters are invalid.
Notes
Uses the scikit-image Marching Cubes algorithm to reconstruct a surface from the point-cloud like arrays.
- property values¶
Return underlying numpy array.
This methods loads
data
if it is not already in memory.- Returns:
numpy.ndarray
– Underlying numpy array
See also
- wrap_longitude()¶
Wrap longitude coordinates.
- Returns:
Self
– Copy of instance with wrapped longitude values. Returns copy of data when longitude values are already wrapped
- class pycontrails.core.met.MetDataset(data, cachestore=None, wrap_longitude=False, copy=True, attrs=None, **attrs_kwargs)¶
Bases:
MetBase
Meteorological dataset with multiple variables.
Composition around xr.Dataset to enforce certain variables and dimensions for internal usage
- Parameters:
data (
xarray.Dataset
) –xarray.Dataset
containing meteorological variables and coordinatescachestore (
CacheStore
, optional) – Cache datastore for staging intermediates withsave()
. Defaults to None.wrap_longitude (
bool
, optional) – Wrap data along the longitude dimension. If True, duplicate and shift longitude values (ie, -180 -> 180) to ensure that the longitude dimension covers the entire interval[-180, 180]
. Defaults to False.copy (
bool
, optional) – Copy data on construction. Defaults to True.attrs (
dict[str
,Any]
, optional) – Attributes to add todata.attrs
. Defaults to None. Generally, pycontrailsModels
may use the following attributes:provider
: Name of the data provider (e.g. “ECMWF”).dataset
: Name of the dataset (e.g. “ERA5”).product
: Name of the product type (e.g. “reanalysis”).
**attrs_kwargs (
Any
) – Keyword arguments to add todata.attrs
. Defaults to None.
Examples
>>> import numpy as np >>> import pandas as pd >>> import xarray as xr >>> from pycontrails.datalib.ecmwf import ERA5
>>> time = ("2022-03-01T00", "2022-03-01T02") >>> variables = ["air_temperature", "specific_humidity"] >>> pressure_levels = [200, 250, 300] >>> era5 = ERA5(time, variables, pressure_levels)
>>> # Open directly as `MetDataset` >>> met = era5.open_metdataset() >>> # Use `data` attribute to access `xarray` object >>> assert isinstance(met.data, xr.Dataset)
>>> # Alternatively, open with `xarray` and cast to `MetDataset` >>> ds = xr.open_mfdataset(era5._cachepaths) >>> met = MetDataset(ds)
>>> # Access sub-`DataArrays` >>> mda = met["t"] # `MetDataArray` instance, needed for interpolation operations >>> da = mda.data # Underlying `xarray` object
>>> # Check out a few values >>> da[5:8, 5:8, 1, 1].values array([[224.08959005, 224.41374427, 224.75945349], [224.09456429, 224.42037658, 224.76525676], [224.10036756, 224.42617985, 224.77106004]])
>>> # Mean temperature over entire array >>> da.mean().load().item() 223.5083
- property attrs¶
Pass through to
self.data.attrs
.
- broadcast_coords(name)¶
Broadcast coordinates along other dimensions.
- Parameters:
name (
str
) – Coordinate/dimension name to broadcast. Can be a dimension or non-dimension coordinates.- Returns:
xarray.DataArray
– DataArray of the coordinate broadcasted along all other dimensions. The DataArray will have the same shape as the gridded data.
- property coords¶
Get coordinates of underlying
data
coordinates.Only return non-dimension coordinates.
See: http://xarray.pydata.org/en/stable/user-guide/data-structures.html#coordinates
- Returns:
dict[str
,np.ndarray]
– Dictionary of coordinates
- copy()¶
Create a shallow copy of the current class.
See
xarray.Dataset.copy()
for reference.- Returns:
Self
– Copy of the current class
- data¶
DataArray or Dataset
- property dataset_attr¶
Look up the ‘dataset’ attribute with a custom error message.
- Returns:
str
– Dataset of the data. If not one of ‘ERA5’, ‘HRES’, ‘IFS’, or ‘GFS’, a warning is issued.
- dim_order = ('longitude', 'latitude', 'level', 'time')¶
Default dimension order for DataArray or Dataset (x, y, z, t)
- downselect(bbox)¶
Downselect met data within spatial bounding box.
- Parameters:
bbox (
list[float]
) – List of coordinates defining a spatial bounding box in WGS84 coordinates. For 2D queries, list is [west, south, east, north]. For 3D queries, list is [west, south, min-level, east, north, max-level] with level defined in [\(hPa\)].- Returns:
Self
– Return downselected data
- downselect_met(met, *, longitude_buffer=(0.0, 0.0), latitude_buffer=(0.0, 0.0), level_buffer=(0.0, 0.0), time_buffer=(np.timedelta64(0, 'h'), np.timedelta64(0, 'h')))¶
Downselect
met
to encompass a spatiotemporal region of the data.Warning
This method is analogous to
GeoVectorDataset.downselect_met()
. It does not change the instance data, but instead operates on themet
input. This method is different fromdownselect()
which operates on the instance data.Changed in version 0.54.5: Data is no longer copied when downselecting.
- Parameters:
met (
MetDataset | MetDataArray
) – MetDataset or MetDataArray to downselect.longitude_buffer (
tuple[float
,float]
, optional) – Extend longitude domain past bylongitude_buffer[0]
on the low side andlongitude_buffer[1]
on the high side. Units must be the same as class coordinates. Defaults to(0, 0)
degrees.latitude_buffer (
tuple[float
,float]
, optional) – Extend latitude domain past bylatitude_buffer[0]
on the low side andlatitude_buffer[1]
on the high side. Units must be the same as class coordinates. Defaults to(0, 0)
degrees.level_buffer (
tuple[float
,float]
, optional) – Extend level domain past bylevel_buffer[0]
on the low side andlevel_buffer[1]
on the high side. Units must be the same as class coordinates. Defaults to(0, 0)
[\(hPa\)].time_buffer (
tuple[np.timedelta64
,np.timedelta64]
, optional) – Extend time domain past bytime_buffer[0]
on the low side andtime_buffer[1]
on the high side. Units must be the same as class coordinates. Defaults to(np.timedelta64(0, "h"), np.timedelta64(0, "h"))
.
- Returns:
MetDataset | MetDataArray
– Copy of downselected MetDataset or MetDataArray.
- ensure_vars(vars, raise_error=True)¶
Ensure variables exist in xr.Dataset.
- Parameters:
vars (
MetVariable | str | Sequence[MetVariable | str | list[MetVariable]]
) – List of MetVariable (or string key), or individual MetVariable (or string key). Ifvars
contains an element with a list[MetVariable], then only one variable in the list must be present in dataset.raise_error (
bool
, optional) – Raise KeyError if data does not contain variables. Defaults to True.
- Returns:
list[str]
– List of met keys verified in MetDataset. Returns an empty list if any MetVariable is missing.- Raises:
KeyError – Raises when dataset does not contain variable in
vars
- classmethod from_coords(longitude, latitude, level, time)¶
Create a
MetDataset
containing a coordinate skeleton from coordinate arrays.- Parameters:
longitude, latitude (
npt.ArrayLike | float
) – Horizontal coordinates, in [\(\deg\)]level (
npt.ArrayLike | float
) – Vertical coordinate, in [\(hPa\)]time (
npt.ArrayLike | np.datetime64,
) – Temporal coordinates, in [\(UTC\)]. Will be sorted.
- Returns:
Self
– MetDataset with no variables.
Examples
>>> # Create skeleton MetDataset >>> longitude = np.arange(0, 10, 0.5) >>> latitude = np.arange(0, 10, 0.5) >>> level = [250, 300] >>> time = np.datetime64("2019-01-01") >>> met = MetDataset.from_coords(longitude, latitude, level, time) >>> met MetDataset with data: <xarray.Dataset> Size: 360B Dimensions: (longitude: 20, latitude: 20, level: 2, time: 1) Coordinates: * longitude (longitude) float64 160B 0.0 0.5 1.0 1.5 ... 8.0 8.5 9.0 9.5 * latitude (latitude) float64 160B 0.0 0.5 1.0 1.5 ... 8.0 8.5 9.0 9.5 * level (level) float64 16B 250.0 300.0 * time (time) datetime64[ns] 8B 2019-01-01 air_pressure (level) float32 8B 2.5e+04 3e+04 altitude (level) float32 8B 1.036e+04 9.164e+03 Data variables: *empty*
>>> met.shape (20, 20, 2, 1)
>>> met.size 800
>>> # Fill it up with some constant data >>> met["temperature"] = xr.DataArray(np.full(met.shape, 234.5), coords=met.coords) >>> met["humidity"] = xr.DataArray(np.full(met.shape, 0.5), coords=met.coords) >>> met MetDataset with data: <xarray.Dataset> Size: 13kB Dimensions: (longitude: 20, latitude: 20, level: 2, time: 1) Coordinates: * longitude (longitude) float64 160B 0.0 0.5 1.0 1.5 ... 8.0 8.5 9.0 9.5 * latitude (latitude) float64 160B 0.0 0.5 1.0 1.5 ... 8.0 8.5 9.0 9.5 * level (level) float64 16B 250.0 300.0 * time (time) datetime64[ns] 8B 2019-01-01 air_pressure (level) float32 8B 2.5e+04 3e+04 altitude (level) float32 8B 1.036e+04 9.164e+03 Data variables: temperature (longitude, latitude, level, time) float64 6kB 234.5 ... 234.5 humidity (longitude, latitude, level, time) float64 6kB 0.5 0.5 ... 0.5
>>> # Convert to a GeoVectorDataset >>> vector = met.to_vector() >>> vector.dataframe.head() longitude latitude level time temperature humidity 0 0.0 0.0 250.0 2019-01-01 234.5 0.5 1 0.0 0.0 300.0 2019-01-01 234.5 0.5 2 0.0 0.5 250.0 2019-01-01 234.5 0.5 3 0.0 0.5 300.0 2019-01-01 234.5 0.5 4 0.0 1.0 250.0 2019-01-01 234.5 0.5
- classmethod from_zarr(store, **kwargs)¶
Create a
MetDataset
from a path to a Zarr store.- Parameters:
store (
Any
) – Path to Zarr store. Passed intoxarray.open_zarr()
.**kwargs (
Any
) – Other keyword only arguments passed intoxarray.open_zarr()
.
- Returns:
Self
– MetDataset with data from Zarr store.
- get(key, default_value=None)¶
Shortcut to
data.get(k, v)()
method.
- property hash¶
Generate a unique hash for this met instance.
Note this is not as robust as it could be since repr cuts off.
- Returns:
str
– Unique hash for met instance (sha1)
- property indexes¶
Low level access to underlying
data
indexes.This method is typically is faster for accessing coordinate indexes.
Added in version 0.25.2.
- Returns:
dict[Hashable
,pd.Index]
– Dictionary of indexes.
Examples
>>> from pycontrails.datalib.ecmwf import ERA5 >>> times = (datetime(2022, 3, 1, 12), datetime(2022, 3, 1, 13)) >>> variables = "air_temperature", "specific_humidity" >>> levels = [200, 300] >>> era5 = ERA5(times, variables, levels) >>> mds = era5.open_metdataset() >>> mds.indexes["level"].to_numpy() array([200., 300.])
>>> mda = mds["air_temperature"] >>> mda.indexes["level"].to_numpy() array([200., 300.])
- property is_single_level¶
Check if instance contains “single level” or “surface level” data.
This method checks if
level
dimension contains a single value equal to -1, the pycontrails convention for surface only data.- Returns:
bool
– If instance contains single level data.
- property is_wrapped¶
Check if the longitude dimension covers the closed interval
[-180, 180]
.Assumes the longitude dimension is sorted (this is established by the
MetDataset
orMetDataArray
constructor).- Returns:
bool
– True if longitude coordinates cover[-180, 180]
- property is_zarr¶
Check if underlying
data
is sourced from a Zarr group.Implementation is very brittle, and may break as external libraries change.
Some
dask
intermediate artifact is cached when this is called. Typically, subsequent calls to this method are much faster than the initial call.Added in version 0.26.0.
- Returns:
bool
– Ifdata
is based on a Zarr group.
- classmethod load(hash, cachestore=None, chunks=None)¶
Load saved intermediate from
cachestore
.- Parameters:
hash (
str
) – Saved hash to load.cachestore (
CacheStore
, optional) – Cache datastore to use for sourcing files. Defaults to DiskCacheStore.chunks (
dict[str
:int]
, optional) – Chunks kwarg passed toxarray.open_mfdataset()
when opening files.
- Returns:
Self
– New MetDataArray with loaded data.
- property product_attr¶
Look up the ‘product’ attribute with a custom error message.
- Returns:
str
– Product of the data. If not one of ‘forecast’, ‘ensemble’, or ‘reanalysis’, a warning is issued.
- property provider_attr¶
Look up the ‘provider’ attribute with a custom error message.
- Returns:
str
– Provider of the data. If not one of ‘ECMWF’ or ‘NCEP’, a warning is issued.
- save(**kwargs)¶
Save intermediate to
cachestore
as netcdf.Load and restore using
load()
.- Parameters:
**kwargs (
Any
) – Keyword arguments passed directly toxarray.Dataset.to_netcdf()
- Returns:
list[str]
– Returns filenames saved
- property shape¶
Return the shape of the dimensions.
- property size¶
Return the size of (each) array in underlying
data
.- Returns:
int
– Total number of grid points in underlying data
- standardize_variables(variables)¶
Standardize variables in-place.
- Parameters:
variables (
Iterable[MetVariable]
) – Data source variables
See also
- to_vector(transfer_attrs=True)¶
Convert a
MetDataset
to aGeoVectorDataset
by raveling data.If
data
is lazy, it will be loaded.- Parameters:
transfer_attrs (
bool
, optional) – Transfer attributes fromdata
to outputGeoVectorDataset
. By default, True, meaning that attributes are transferred.- Returns:
GeoVectorDataset
– ConvertedGeoVectorDataset
. The variables on the returned instance include all of those on the input instance, plus the four core spatial temporal variables.
Examples
>>> from pycontrails.datalib.ecmwf import ERA5 >>> times = "2022-03-01", "2022-03-01T01" >>> variables = ["air_temperature", "specific_humidity"] >>> levels = [250, 200] >>> era5 = ERA5(time=times, variables=variables, pressure_levels=levels) >>> met = era5.open_metdataset() >>> met.to_vector(transfer_attrs=False) GeoVectorDataset [6 keys x 4152960 length, 0 attributes] Keys: longitude, latitude, level, time, air_temperature, ..., specific_humidity Attributes: time [2022-03-01 00:00:00, 2022-03-01 01:00:00] longitude [-180.0, 179.75] latitude [-90.0, 90.0] altitude [10362.8, 11783.9]
- update(other=None, **kwargs)¶
Shortcut to
data.update()
.See
xarray.Dataset.update()
for reference.- Parameters:
other (
MutableMapping
) – Variables with which to update this dataset**kwargs (
Any
) – Variables defined by keyword arguments. If a variable exists both inother
and as a keyword argument, the keyword argument takes precedence.
See also
-
meth:xarray.Dataset.update
- wrap_longitude()¶
Wrap longitude coordinates.
- Returns:
Self
– Copy of instance with wrapped longitude values. Returns copy of data when longitude values are already wrapped
- pycontrails.core.met.downselect(data, bbox)¶
Downselect
xr.Dataset
orxr.DataArray
with spatial bounding box.- Parameters:
data (
XArrayType
) – xr.Dataset or xr.DataArray to downselectbbox (
tuple[float
,]
) – Tuple of coordinates defining a spatial bounding box in WGS84 coordinates.For 2D queries,
bbox
takes the form(west, south, east, north)
For 3D queries,
bbox
takes the form(west, south, min-level, east, north, max-level)
with level defined in [\(hPa\)].
- Returns:
XArrayType
– Downselected xr.Dataset or xr.DataArray- Raises:
ValueError – If parameter
bbox
has wrong length.
- pycontrails.core.met.maybe_downselect_mds(big_mds, little_mds, t0, t1)¶
Possibly downselect
big_mds
in the time domain to cover[t0, t1]
.If possible,
little_mds
is recycled to avoid re-loading data.This implementation assumes
t0 <= t1
, but this is not enforced.If
little_mds
already covers the time range, it is returned as-is.If
big_mds
doesn’t cover the time range, no error is raised.- Parameters:
big_mds (
MetDataset
) – Larger MetDatasetlittle_mds (
MetDataset | None
) – Smaller MetDataset. This is assumed to be a subset ofbig_mds
, though the implementation may work if this is not the case.t0, t1 (
numpy.datetime64
) – Time range to cover
- Returns:
MetDataset
– MetDataset covering the time range[t0, t1]
comprised of data fromlittle_mds
when possible, otherwise frombig_mds
.
- pycontrails.core.met.originates_from_ecmwf(met)¶
Check if data appears to have originated from an ECMWF source.
Added in version 0.27.0: Experimental. Implementation is brittle.
- Parameters:
met (
MetDataset | MetDataArray
) – Dataset or array to inspect.- Returns:
bool
– True if data appears to be derived from an ECMWF source.
See also
-
class:ERA5
-
class:HRES
- pycontrails.core.met.shift_longitude(data, bound=-180.0)¶
Shift longitude values from any input domain to [bound, 360 + bound) domain.
Sorts data by ascending longitude values.
- Parameters:
data (
XArrayType
) –xr.Dataset
orxr.DataArray
with longitude dimensionbound (
float
, optional) – Lower bound of the domain. Output domain will be [bound, 360 + bound). Defaults to -180, which results in longitude domain [-180, 180).
- Returns:
XArrayType
–xr.Dataset
orxr.DataArray
with longitude values on [a, 360 + a).
- pycontrails.core.met.standardize_variables(ds, variables)¶
Rename all variables in dataset from short name to standard name.
This function does not change any variables in
ds
that are not found invariables
.When there are multiple variables with the same short name, the last one is used.
- Parameters:
ds (
DatasetType
) – Anxr.Dataset
orMetDataset
. When aMetDataset
is passed, the underlyingxr.Dataset
is modified in place.variables (
Iterable[MetVariable]
) – Data source variables
- Returns:
DatasetType
– Dataset with variables renamed to standard names