pycontrails.core.vector¶
Lightweight data structures for vector paths.
Functions
vector_to_lon_lat_grid | Convert vectors to a longitude-latitude grid.
Classes
AttrDict | Thin wrapper around dict to warn when setting a key that already exists.
GeoVectorDataset | Base class to hold 1D geospatial arrays of consistent size.
VectorDataDict | Thin wrapper around dict[str, np.ndarray] to ensure consistency.
VectorDataset | Base class to hold 1D arrays of consistent size.
- class pycontrails.core.vector.AttrDict¶
-
Thin wrapper around dict to warn when setting a key that already exists.
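The warn-on-overwrite behavior can be illustrated with a small dict subclass. This is a minimal sketch, not the library implementation: the class name WarnOnOverwriteDict and the warning text are hypothetical, and the real AttrDict may apply extra conditions before warning.

```python
import warnings


class WarnOnOverwriteDict(dict):
    """Hypothetical stand-in for AttrDict: warn when an existing key is set again."""

    def __setitem__(self, key, value):
        if key in self:
            # Assumption: a plain UserWarning; the library may word this differently.
            warnings.warn(f"Overwriting existing key {key!r}.", stacklevel=2)
        super().__setitem__(key, value)


attrs = WarnOnOverwriteDict(aircraft_type="B737")
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    attrs["flight_id"] = "abc"       # new key: no warning
    attrs["aircraft_type"] = "A320"  # existing key: one warning, value still updated
```

The value is still overwritten; the warning only flags that it happened.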
- class pycontrails.core.vector.GeoVectorDataset(data=None, *, longitude=None, latitude=None, altitude=None, altitude_ft=None, level=None, time=None, attrs=None, copy=True, **attrs_kwargs)¶
Bases:
VectorDataset
Base class to hold 1D geospatial arrays of consistent size.
GeoVectorDataset is required to have the geospatial coordinate keys defined in required_keys.
Expect latitude-longitude CRS in WGS 84. Expect altitude in [\(m\)]. Expect level in [\(hPa\)].
Each spatial variable is expected to have “float32” or “float64” dtype. The time variable is expected to have “datetime64[ns]” dtype.
- Parameters:
data (dict[str, npt.ArrayLike] | pd.DataFrame | VectorDataDict | VectorDataset | None, optional) – Data dictionary or pandas.DataFrame. Must include keys/columns time, latitude, longitude, and altitude or level. Keyword arguments for time, latitude, longitude, altitude or level override data inputs. Expects altitude in meters and time as a DatetimeLike (or an array that can be processed with pd.to_datetime()). Additional waypoint-specific data can be included as additional keys/columns.
longitude (npt.ArrayLike, optional) – Longitude data. Defaults to None.
latitude (npt.ArrayLike, optional) – Latitude data. Defaults to None.
altitude (npt.ArrayLike, optional) – Altitude data, [\(m\)]. Defaults to None.
altitude_ft (npt.ArrayLike, optional) – Altitude data, [\(ft\)]. Defaults to None.
level (npt.ArrayLike, optional) – Level data, [\(hPa\)]. Defaults to None.
time (npt.ArrayLike, optional) – Time data. Expects an array of DatetimeLike values, or an array that can be processed with pd.to_datetime(). Defaults to None.
attrs (dict[Hashable, Any] | AttrDict, optional) – Additional properties as a dictionary. Defaults to {}.
copy (bool, optional) – Copy data on class creation. Defaults to True.
**attrs_kwargs (Any) – Additional properties passed as keyword arguments.
- Raises:
KeyError – Raises if data input does not contain at least time, latitude, longitude, (altitude or level).
- T_isa()¶
Calculate the ICAO standard atmosphere temperature at each point.
- Returns:
npt.NDArray[np.float64]
– ISA temperature, [\(K\)]
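As a rough illustration of the calculation, the ISA temperature profile can be sketched in plain Python. The sketch assumes the standard ISA constants (sea-level temperature 288.15 K, lapse rate 6.5 K per km, constant temperature above the 11 km tropopause) and is only meaningful up to about 20 km. The function name isa_temperature is hypothetical and is not the library implementation.

```python
T_MSL = 288.15           # ISA sea-level temperature, K
LAPSE_RATE = 0.0065      # ISA tropospheric lapse rate, K/m
H_TROPOPAUSE = 11_000.0  # ISA tropopause altitude, m


def isa_temperature(altitude_m: float) -> float:
    """Hypothetical helper: ISA temperature in K for an altitude in m."""
    # Temperature decreases linearly up to the tropopause and is constant above it.
    return T_MSL - LAPSE_RATE * min(altitude_m, H_TROPOPAUSE)


temperatures = [isa_temperature(h) for h in (0.0, 5_000.0, 11_000.0, 12_000.0)]
```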
- property air_pressure¶
Get air_pressure values for points.
- Returns:
npt.NDArray[np.float64]
– Point air pressure values, [\(Pa\)]
- property altitude¶
Get altitude.
Automatically calculates altitude from the level key using units.pl_to_m().
Note that if the altitude key exists in data, the data at the altitude key will be returned. This allows an override of the default calculation of altitude from pressure level.
- Returns:
npt.NDArray[np.float64]
– Altitude, [\(m\)]
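The conversion between pressure level and altitude can be sketched with the ISA barometric formulas. This is a minimal sketch under stated assumptions: the constants below are the usual ISA values, and the names pl_to_m_sketch and m_to_pl_sketch are hypothetical stand-ins, not the signatures or internals of units.pl_to_m() and units.m_to_pl().

```python
import math

P_MSL = 1013.25          # ISA sea-level pressure, hPa
T_MSL = 288.15           # ISA sea-level temperature, K
LAPSE_RATE = 0.0065      # ISA tropospheric lapse rate, K/m
G = 9.80665              # gravitational acceleration, m/s^2
R_D = 287.05             # gas constant of dry air, J/(kg K)
H_TROPOPAUSE = 11_000.0  # ISA tropopause altitude, m

T_TROPOPAUSE = T_MSL - LAPSE_RATE * H_TROPOPAUSE
EXPONENT = G / (R_D * LAPSE_RATE)
P_TROPOPAUSE = P_MSL * (T_TROPOPAUSE / T_MSL) ** EXPONENT


def m_to_pl_sketch(altitude_m: float) -> float:
    """Hypothetical helper: altitude in m to ISA pressure level in hPa."""
    if altitude_m <= H_TROPOPAUSE:
        return P_MSL * (1.0 - LAPSE_RATE * altitude_m / T_MSL) ** EXPONENT
    # Isothermal layer above the tropopause: pressure decays exponentially.
    return P_TROPOPAUSE * math.exp(-G * (altitude_m - H_TROPOPAUSE) / (R_D * T_TROPOPAUSE))


def pl_to_m_sketch(level_hpa: float) -> float:
    """Hypothetical helper: ISA pressure level in hPa to altitude in m."""
    if level_hpa >= P_TROPOPAUSE:
        return (T_MSL / LAPSE_RATE) * (1.0 - (level_hpa / P_MSL) ** (1.0 / EXPONENT))
    return H_TROPOPAUSE + (R_D * T_TROPOPAUSE / G) * math.log(P_TROPOPAUSE / level_hpa)


altitude_250 = pl_to_m_sketch(250.0)           # troposphere branch
altitude_200_ft = pl_to_m_sketch(200.0) / 0.3048  # stratosphere branch, converted to feet
round_trip = m_to_pl_sketch(pl_to_m_sketch(300.0))
```

Under these assumptions, 200 hPa maps to roughly 38,660 ft, which is in line with the altitude_ft values in the to_dict() example on this page.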
- property altitude_ft¶
Get altitude in feet.
- Returns:
npt.NDArray[np.float64]
– Altitude, [\(ft\)]
- property constants¶
Return a dictionary of constant attributes and data values.
Includes attrs and values from columns in data with a unique value.
- Returns:
dict[str, Any]
– Properties and their constant values
- property coords¶
Get geospatial coordinates for compatibility with MetDataArray.
- Returns:
pandas.DataFrame
– pd.DataFrame with columns longitude, latitude, level, and time.
- coords_intersect_met(met)¶
Return boolean mask of data inside the bounding box defined by met.
- Parameters:
met (MetDataset | MetDataArray) – MetDataset or MetDataArray to compare.
- Returns:
npt.NDArray[np.bool_]
– True if point is inside the bounding box defined by met.
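The idea of a bounding-box mask can be sketched without any met data. In this sketch, plain lists stand in for NumPy arrays, the bounds dictionary is a hypothetical summary of the met coordinate ranges, and the function name inside_bounding_box is made up for illustration. Whether the library treats the box edges as inclusive is an assumption here.

```python
def inside_bounding_box(points, bounds):
    """Hypothetical helper: one boolean per point, True if every coordinate is in range.

    points: dict mapping coordinate name to equal-length lists.
    bounds: dict mapping coordinate name to an inclusive (low, high) pair.
    """
    size = len(next(iter(points.values())))
    return [
        all(low <= points[key][i] <= high for key, (low, high) in bounds.items())
        for i in range(size)
    ]


points = {
    "longitude": [-5.0, 10.0, 60.0],
    "latitude": [2.0, 5.0, 5.0],
    "level": [250.0, 250.0, 250.0],
}
# Assumed bounding box of a met dataset (illustrative values only).
bounds = {"longitude": (0.0, 50.0), "latitude": (0.0, 10.0), "level": (200.0, 300.0)}
mask = inside_bounding_box(points, bounds)
```

Only the second point lies inside the box on every coordinate, so it is the only True entry in mask.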
- classmethod create_empty(keys=None, attrs=None, **attrs_kwargs)¶
Create instance with variables defined by keys and size 0.
If instance requires additional variables to be defined, these keys will automatically be attached to returned instance.
- Parameters:
keys (Iterable[str]) – Keys to include in empty VectorDataset instance.
attrs (dict[str, Any] | None, optional) – Attributes to attach to the instance.
**attrs_kwargs (Any) – Define attributes as keyword arguments.
- Returns:
Self
– Empty VectorDataset instance.
- downselect_met(met, *, longitude_buffer=(0.0, 0.0), latitude_buffer=(0.0, 0.0), level_buffer=(0.0, 0.0), time_buffer=(np.timedelta64(0, 'h'), np.timedelta64(0, 'h')), copy=True)¶
Downselect met to encompass a spatiotemporal region of the data.
- Parameters:
met (MetDataset | MetDataArray) – MetDataset or MetDataArray to downselect.
longitude_buffer (tuple[float, float], optional) – Extend the longitude domain by longitude_buffer[0] on the low side and longitude_buffer[1] on the high side. Units must be the same as class coordinates. Defaults to (0, 0) degrees.
latitude_buffer (tuple[float, float], optional) – Extend the latitude domain by latitude_buffer[0] on the low side and latitude_buffer[1] on the high side. Units must be the same as class coordinates. Defaults to (0, 0) degrees.
level_buffer (tuple[float, float], optional) – Extend the level domain by level_buffer[0] on the low side and level_buffer[1] on the high side. Units must be the same as class coordinates. Defaults to (0, 0) [\(hPa\)].
time_buffer (tuple[np.timedelta64, np.timedelta64], optional) – Extend the time domain by time_buffer[0] on the low side and time_buffer[1] on the high side. Units must be the same as class coordinates. Defaults to (np.timedelta64(0, "h"), np.timedelta64(0, "h")).
copy (bool) – If True, the returned object is a copy of the original; otherwise it is a view. True by default.
- Returns:
MetDataset | MetDataArray
– Copy of downselected MetDataset or MetDataArray.
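How a buffer widens the selected domain can be sketched for a single coordinate. This is a minimal sketch, assuming the domain is the min-max range of the vector data along that coordinate; the helper name buffered_range is hypothetical, and the library additionally has to snap the range to the met grid, which is not shown.

```python
from datetime import datetime, timedelta


def buffered_range(values, buffer):
    """Hypothetical helper: (low, high) range of values, widened by buffer on each side."""
    return min(values) - buffer[0], max(values) + buffer[1]


longitude = [0.0, 25.0, 50.0]
time = [datetime(2022, 3, 1, 0), datetime(2022, 3, 1, 2)]

# 1 degree on the low side, 2 degrees on the high side.
lon_range = buffered_range(longitude, (1.0, 2.0))
# One hour on each side of the time domain.
time_range = buffered_range(time, (timedelta(hours=1), timedelta(hours=1)))
```

A nonzero time buffer is a common way to make sure the met data brackets the first and last waypoints for interpolation.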
- intersect_met(mda, *, longitude=None, latitude=None, level=None, time=None, use_indices=False, **interp_kwargs)¶
Intersect waypoints with MetDataArray.
- Parameters:
mda (MetDataArray) – MetDataArray containing a meteorological variable at spatio-temporal coordinates.
longitude (npt.NDArray[np.float64], optional) – Override existing coordinates for met interpolation.
latitude (npt.NDArray[np.float64], optional) – Override existing coordinates for met interpolation.
level (npt.NDArray[np.float64], optional) – Override existing coordinates for met interpolation.
time (npt.NDArray[np.datetime64], optional) – Override existing coordinates for met interpolation.
use_indices (bool, optional) – Experimental.
**interp_kwargs (Any) – Additional keyword arguments to pass to MetDataArray.intersect_met(). Examples include method, bounds_error, and fill_value. If an error such as ValueError: One of the requested xi is out of bounds in dimension 2 occurs, try calling this function with bounds_error=False. In addition, setting fill_value=0.0 will replace NaN values with 0.0.
- Returns:
npt.NDArray[np.float64]
– Interpolated values
Examples
>>> from datetime import datetime
>>> import pandas as pd
>>> import numpy as np
>>> from pycontrails.datalib.ecmwf import ERA5
>>> from pycontrails import Flight
>>> # Get met data
>>> times = (datetime(2022, 3, 1, 0), datetime(2022, 3, 1, 3))
>>> variables = ["air_temperature", "specific_humidity"]
>>> levels = [300, 250, 200]
>>> era5 = ERA5(time=times, variables=variables, pressure_levels=levels)
>>> met = era5.open_metdataset()
>>> # Example flight
>>> df = pd.DataFrame()
>>> df['longitude'] = np.linspace(0, 50, 10)
>>> df['latitude'] = np.linspace(0, 10, 10)
>>> df['altitude'] = 11000
>>> df['time'] = pd.date_range("2022-03-01T00", "2022-03-01T02", periods=10)
>>> fl = Flight(df)
>>> # Intersect
>>> fl.intersect_met(met['air_temperature'], method='nearest')
array([231.62969892, 230.72604651, 232.24318771, 231.88338483,
       231.06429438, 231.59073409, 231.65125393, 231.93064004,
       232.03344087, 231.65954432])
>>> fl.intersect_met(met['air_temperature'], method='linear')
array([225.77794552, 225.13908414, 226.231218  , 226.31831528,
       225.56102321, 225.81192149, 226.03192642, 226.22056121,
       226.03770174, 225.63226188])
>>> # Interpolate and attach to `Flight` instance
>>> for key in met:
...     fl[key] = fl.intersect_met(met[key])
>>> # Show the final three columns of the dataframe
>>> fl.dataframe.iloc[:, -3:].head()
                 time  air_temperature  specific_humidity
0 2022-03-01 00:00:00       225.777946           0.000132
1 2022-03-01 00:13:20       225.139084           0.000132
2 2022-03-01 00:26:40       226.231218           0.000107
3 2022-03-01 00:40:00       226.318315           0.000171
4 2022-03-01 00:53:20       225.561022           0.000109
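The effect of method, bounds_error, and fill_value can be seen in one dimension without downloading any met data. The following is a sketch only: interp_1d is a hypothetical helper, the grid and temperature values are made up, and the real method interpolates over four dimensions rather than one.

```python
import bisect
import math


def interp_1d(grid, values, x, method="linear", bounds_error=True, fill_value=math.nan):
    """Hypothetical 1D analogue of grid interpolation on a sorted grid."""
    if not grid[0] <= x <= grid[-1]:
        if bounds_error:
            raise ValueError("One of the requested points is out of bounds")
        return fill_value
    # Indices of the grid points bracketing x.
    hi = max(bisect.bisect_left(grid, x), 1)
    lo = hi - 1
    weight = (x - grid[lo]) / (grid[hi] - grid[lo])
    if method == "nearest":
        return values[hi] if weight > 0.5 else values[lo]
    if method == "linear":
        return (1.0 - weight) * values[lo] + weight * values[hi]
    raise ValueError(f"Unknown method: {method}")


levels = [200.0, 250.0, 300.0]       # hPa, hypothetical pressure-level grid
temperature = [217.0, 221.0, 229.0]  # K, made-up values on that grid

nearest = interp_1d(levels, temperature, 227.0, method="nearest")
linear = interp_1d(levels, temperature, 227.0, method="linear")
# Outside the grid: no error is raised and the fill value is returned instead.
filled = interp_1d(levels, temperature, 350.0, bounds_error=False, fill_value=0.0)
```

Nearest-neighbour interpolation returns a grid value unchanged, while linear interpolation blends the two bracketing values, which is why the two arrays in the example above differ by several kelvin.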
- property level¶
Get pressure level values for points.
Automatically calculates pressure level from the altitude key using units.m_to_pl().
Note that if the level key exists in data, the data at the level key will be returned. This allows an override of the default calculation of pressure level from altitude.
- Returns:
npt.NDArray[np.float64]
– Point pressure level values, [\(hPa\)]
- required_keys = ('longitude', 'latitude', 'time')¶
Required keys for creating GeoVectorDataset
- to_geojson_points()¶
Return dataset as GeoJSON FeatureCollection of Points.
Each Feature has a properties attribute that includes time and other data besides latitude, longitude, and altitude in data.
- Returns:
dict[str, Any]
– Python representation of GeoJSON FeatureCollection
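The shape of the returned structure can be sketched with plain dictionaries. The layout follows the GeoJSON convention of [longitude, latitude, altitude] Point coordinates; the helper name and the exact property formatting are assumptions for illustration, not the library output.

```python
import json


def to_geojson_points_sketch(data):
    """Hypothetical helper: build a GeoJSON FeatureCollection of Points from column data."""
    coordinate_keys = ("longitude", "latitude", "altitude")
    features = []
    for i in range(len(data["longitude"])):
        features.append(
            {
                "type": "Feature",
                "geometry": {
                    "type": "Point",
                    "coordinates": [data[key][i] for key in coordinate_keys],
                },
                # Everything that is not a spatial coordinate becomes a property.
                "properties": {
                    key: column[i] for key, column in data.items() if key not in coordinate_keys
                },
            }
        )
    return {"type": "FeatureCollection", "features": features}


data = {
    "longitude": [-100.0, -110.0],
    "latitude": [40.0, 50.0],
    "altitude": [11000.0, 11000.0],
    "time": ["2020-01-01T09:00:00", "2020-01-01T09:30:00"],
}
collection = to_geojson_points_sketch(data)
as_text = json.dumps(collection)  # the result is JSON-serializable
```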
- to_lon_lat_grid(agg, *, spatial_bbox=(-180.0, -90.0, 180.0, 90.0), spatial_grid_res=0.5)¶
Convert vectors to a longitude-latitude grid.
- transform_crs(crs)¶
Transform trajectory data from one coordinate reference system (CRS) to another.
- vertical_keys = ('altitude', 'level', 'altitude_ft')¶
At least one of these vertical-coordinate keys must also be included
- class pycontrails.core.vector.VectorDataDict(data=None)¶
-
Thin wrapper around dict[str, np.ndarray] to ensure consistency.
- Parameters:
data (dict[str, np.ndarray], optional) – Dictionary input
- setdefault(k, default=None)¶
Thin wrapper around dict.setdefault.
The main purpose of overriding is to run _validate_array() on set.
- Parameters:
k (str) – Key
default (npt.ArrayLike, optional) – Default value for key k
- Returns:
Any
– Value at k
- update(other=None, **kwargs)¶
Update values without warning if overwriting.
This method casts values in other to numpy.ndarray and ensures that the array sizes are consistent with the instance.
- Parameters:
other (dict[str, npt.ArrayLike] | None, optional) – Fields to update as dict
**kwargs (npt.ArrayLike) – Fields to update as kwargs
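The size-consistency guarantee can be sketched with a dict subclass that validates every value on assignment. This is an illustrative sketch: ConsistentSizeDict is a hypothetical name, lists stand in for numpy.ndarray, and the real class performs its checks in _validate_array(), whose details are not reproduced here.

```python
class ConsistentSizeDict(dict):
    """Hypothetical stand-in: every value must have the same length."""

    def __init__(self, data=None):
        super().__init__()
        self._size = None
        self.update(data)

    def __setitem__(self, key, value):
        value = list(value)  # the library casts to numpy.ndarray instead
        if self._size is None:
            self._size = len(value)
        elif len(value) != self._size:
            raise ValueError(
                f"Size of {key!r} ({len(value)}) does not match existing size ({self._size})."
            )
        super().__setitem__(key, value)

    def update(self, other=None, **kwargs):
        # Route every item through __setitem__ so that validation always runs.
        for key, value in {**(other or {}), **kwargs}.items():
            self[key] = value


d = ConsistentSizeDict({"a": [1, 2, 3]})
d.update(b=(4, 5, 6))  # same size: accepted and cast to a list
try:
    d["c"] = [1, 2]    # wrong size: rejected
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
```

Overriding update and setdefault matters because the built-in dict versions bypass __setitem__, which would let unvalidated arrays in.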
- class pycontrails.core.vector.VectorDataset(data=None, *, attrs=None, copy=True, **attrs_kwargs)¶
Bases:
object
Base class to hold 1D arrays of consistent size.
- Parameters:
data (dict[str, npt.ArrayLike] | pd.DataFrame | VectorDataDict | VectorDataset | None, optional) – Initial data, by default None
attrs (dict[str, Any] | AttrDict, optional) – Dictionary of attributes, by default None
copy (bool, optional) – Copy data on class creation, by default True
**attrs_kwargs (Any) – Additional attributes passed as keyword arguments
- Raises:
ValueError – If “time” variable cannot be converted to numpy array.
- attrs¶
Generic dataset attributes
- broadcast_attrs(keys, overwrite=False, raise_error=True)¶
Attach values from keys in attrs onto data.
If possible, use dtype = np.float32 when broadcasting. If not possible, use whatever dtype is inferred from the data by numpy.full().
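Broadcasting an attribute means repeating a scalar once per data point so it can be used as a column. The sketch below uses plain lists in place of numpy.full() and a hypothetical function name. How the library handles a key that already exists in data when overwrite is False is an assumption here; the sketch simply leaves the existing column untouched.

```python
def broadcast_attrs_sketch(data, attrs, keys, overwrite=False):
    """Hypothetical helper: copy scalar attrs onto data as constant columns, in place."""
    size = len(next(iter(data.values())))
    for key in keys:
        if key not in attrs:
            raise KeyError(f"{key!r} not found in attrs.")
        if key in data and not overwrite:
            continue  # assumption: existing columns are kept unless overwrite=True
        data[key] = [attrs[key]] * size


data = {"longitude": [0.0, 1.0, 2.0], "wingspan": [30.0, 30.0, 30.0]}
attrs = {"wingspan": 35.8, "engine_efficiency": 0.3}

broadcast_attrs_sketch(data, attrs, ["engine_efficiency", "wingspan"])
kept = list(data["wingspan"])  # unchanged, because overwrite defaults to False
broadcast_attrs_sketch(data, attrs, ["wingspan"], overwrite=True)
```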
- broadcast_numeric_attrs(ignore_keys=None, overwrite=False)¶
Attach numeric values in attrs onto data.
Iterate through values in attrs and attach float and int values to data.
This method modifies the object in place.
- copy(**kwargs)¶
Return a copy of this instance.
- Parameters:
**kwargs (Any) – Additional keyword arguments passed into the constructor of the returned class.
- Returns:
Self
– Copy of class
- classmethod create_empty(keys, attrs=None, **attrs_kwargs)¶
Create instance with variables defined by keys and size 0.
If instance requires additional variables to be defined, these keys will automatically be attached to returned instance.
- Parameters:
keys (Iterable[str]) – Keys to include in empty VectorDataset instance.
attrs (dict[str, Any] | None, optional) – Attributes to attach to the instance.
**attrs_kwargs (Any) – Define attributes as keyword arguments.
- Returns:
Self
– Empty VectorDataset instance.
- data¶
Vector data with labels as keys and numpy.ndarray as values
- property dataframe¶
Shorthand property to access to_dataframe() with copy=False.
- Returns:
pandas.DataFrame
– Equivalent to the output from to_dataframe()
- ensure_vars(vars, raise_error=True)¶
Ensure variables exist in column of data or attrs.
- Parameters:
vars (str | Iterable[str]) – A single string variable name or a sequence of string variable names.
raise_error (bool, optional) – Raise KeyError if data does not contain variables. Defaults to True.
- Returns:
bool
– True if all variables exist. False otherwise.
- Raises:
KeyError – Raises when dataset does not contain variable in vars
- filter(mask, copy=True, **kwargs)¶
Filter data according to a boolean array mask.
Entries corresponding to mask == True are kept.
- Parameters:
mask (npt.NDArray[np.bool_]) – Boolean array with compatible shape.
copy (bool, optional) – Copy data on filter. Defaults to True. See numpy best practices for insight into whether copy is appropriate.
**kwargs (Any) – Additional keyword arguments passed into the constructor of the returned class.
- Returns:
Self
– Containing filtered data
- Raises:
TypeError – If mask is not a boolean array.
- classmethod from_dict(obj, copy=True, **obj_kwargs)¶
Create instance from dict representation containing data and attrs.
- Parameters:
obj (dict[str, Any]) – Dict representation of VectorDataset (e.g. to_dict())
copy (bool, optional) – Passed to VectorDataset constructor. Defaults to True.
**obj_kwargs (Any) – Additional properties passed as keyword arguments.
- Returns:
Self
– VectorDataset instance.
- generate_splits(n_splits, copy=True)¶
Split instance into n_splits sub-vectors.
- Parameters:
n_splits (int) – Number of splits.
copy (bool, optional) – Passed into filter(). Defaults to True. Recommend to keep as True based on numpy best practices.
- Returns:
Generator[Self, None, None]
– Generator of split vectors.
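Splitting a vector into nearly equal contiguous chunks can be sketched with a generator of index ranges. The sketch assumes the same convention as numpy.array_split, where the first chunks absorb the remainder; whether the library splits exactly this way is an assumption, and split_indices is a hypothetical name.

```python
def split_indices(size, n_splits):
    """Hypothetical helper: yield n_splits contiguous index ranges covering range(size)."""
    base, extra = divmod(size, n_splits)
    start = 0
    for i in range(n_splits):
        # The first `extra` chunks get one additional element.
        stop = start + base + (1 if i < extra else 0)
        yield range(start, stop)
        start = stop


data = {"a": list(range(10)), "b": [0.5 * x for x in range(10)]}
splits = [
    {key: [column[i] for i in indices] for key, column in data.items()}
    for indices in split_indices(10, 3)
]
sizes = [len(split["a"]) for split in splits]
```

Every key is split along the same index ranges, so each sub-vector remains internally consistent in size.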
- get(key, default_value=None)¶
- get_data_or_attr(key, default=<object object>)¶
-
This method first checks if key is in data and returns the value if so. If key is not in data, then this method checks if key is in attrs and returns the value if so. If key is not in data or attrs, then the default value is returned if provided. Otherwise a KeyError is raised.
- Parameters:
- Returns:
Any
– Value at data[key] or attrs[key]
- Raises:
KeyError – If key is not in data or attrs and default is not provided.
Examples
>>> vector = VectorDataset({"a": [1, 2, 3]}, attrs={"b": 4})
>>> vector.get_data_or_attr("a")
array([1, 2, 3])
>>> vector.get_data_or_attr("b")
4
>>> vector.get_data_or_attr("c")
Traceback (most recent call last):
...
KeyError: "Key 'c' not found in data or attrs."
>>> vector.get_data_or_attr("c", default=5)
5
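The lookup order, and the reason the signature shows a bare object as the default, can be sketched as a free function. Using a private sentinel instead of None lets callers pass None as a legitimate default. The function below is an illustrative sketch over two plain dictionaries, not the method itself.

```python
_MISSING = object()  # sentinel meaning "no default was provided"


def get_data_or_attr_sketch(data, attrs, key, default=_MISSING):
    """Hypothetical helper: look in data first, then attrs, then fall back to default."""
    if key in data:
        return data[key]
    if key in attrs:
        return attrs[key]
    if default is not _MISSING:
        return default
    raise KeyError(f"Key '{key}' not found in data or attrs.")


data = {"a": [1, 2, 3]}
attrs = {"a": "shadowed", "b": 4}

from_data = get_data_or_attr_sketch(data, attrs, "a")   # data wins over attrs
from_attrs = get_data_or_attr_sketch(data, attrs, "b")
explicit_none = get_data_or_attr_sketch(data, attrs, "c", default=None)
```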
- property hash¶
Generate a unique hash for this class instance.
- Returns:
str
– Unique hash for flight instance (sha1)
- select(keys, copy=True)¶
Return new class instance only containing specified keys.
- Parameters:
keys (Iterable[str]) – An iterable of keys to filter by.
copy (bool, optional) – Copy data on selection. Defaults to True.
- Returns:
VectorDataset
– VectorDataset containing only data associated to keys. Note that this method always returns a VectorDataset, even if the calling class is a proper subclass of VectorDataset.
- setdefault(key, default=None)¶
Shortcut to VectorDataDict.setdefault().
- Parameters:
- Returns:
numpy.ndarray
– Values at key
- sort(by)¶
Sort data by key(s).
This method always creates a copy of the data by calling pandas.DataFrame.sort_values().
- Parameters:
by (str | list[str]) – Key or list of keys to sort by.
- Returns:
Self
– Instance with sorted data.
- classmethod sum(vectors, infer_attrs=True, fill_value=None)¶
Sum a list of VectorDataset instances.
- Parameters:
vectors (Sequence[VectorDataset]) – List of VectorDataset instances to concatenate.
infer_attrs (bool, optional) – If True, infer attributes from the first element in the sequence.
fill_value (float, optional) – Fill value to use when concatenating arrays. By default None, which raises an error if incompatible keys are found.
- Returns:
VectorDataset
– Sum of all instances in vectors.
- Raises:
KeyError – If incompatible data keys are found among vectors.
Examples
>>> from pycontrails import VectorDataset
>>> v1 = VectorDataset({"a": [1, 2, 3], "b": [4, 5, 6]})
>>> v2 = VectorDataset({"a": [7, 8, 9], "b": [10, 11, 12]})
>>> v3 = VectorDataset({"a": [13, 14, 15], "b": [16, 17, 18]})
>>> v = VectorDataset.sum([v1, v2, v3])
>>> v.dataframe
    a   b
0   1   4
1   2   5
2   3   6
3   7  10
4   8  11
5   9  12
6  13  16
7  14  17
8  15  18
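The role of fill_value shows up only when the inputs do not share the same keys. The sketch below concatenates plain dictionaries of lists; concat_vectors is a hypothetical helper, and the exact padding rules of the library are assumed rather than confirmed.

```python
import math


def concat_vectors(vectors, fill_value=None):
    """Hypothetical helper: concatenate column dicts, padding missing keys with fill_value."""
    keys = list(dict.fromkeys(key for vector in vectors for key in vector))
    out = {key: [] for key in keys}
    for vector in vectors:
        size = len(next(iter(vector.values())))
        for key in keys:
            if key in vector:
                out[key].extend(vector[key])
            elif fill_value is None:
                raise KeyError(f"Key {key!r} is missing from one of the vectors.")
            else:
                out[key].extend([fill_value] * size)
    return out


v1 = {"a": [1, 2], "b": [3.0, 4.0]}
v2 = {"a": [5]}  # no "b" column

filled = concat_vectors([v1, v2], fill_value=math.nan)
try:
    concat_vectors([v1, v2])  # fill_value=None: incompatible keys raise
    raised = False
except KeyError:
    raised = True
```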
- to_dataframe(copy=True)¶
Create pd.DataFrame in which each key-value pair in data is a column.
DataFrame does not copy data by default. Use the copy parameter to copy data values on creation.
- Parameters:
copy (bool, optional) – Copy data on DataFrame creation.
- Returns:
pandas.DataFrame
– DataFrame holding key-values as columns.
- to_dict()¶
Create dictionary with data and attrs.
If geo-spatial coordinates (e.g. "latitude", "longitude", "altitude") are present, round to a reasonable precision. If a "time" variable is present, round to unix seconds. When the instance is a GeoVectorDataset, disregard any "altitude" or "level" coordinate and only include "altitude_ft" in the output.
Examples
>>> import pprint
>>> from pycontrails import Flight
>>> fl = Flight(
...     longitude=[-100, -110],
...     latitude=[40, 50],
...     level=[200, 200],
...     time=[np.datetime64("2020-01-01T09"), np.datetime64("2020-01-01T09:30")],
...     aircraft_type="B737",
... )
>>> fl = fl.resample_and_fill("5min")
>>> pprint.pprint(fl.to_dict())
{'aircraft_type': 'B737',
 'altitude_ft': [38661.0, 38661.0, 38661.0, 38661.0, 38661.0, 38661.0, 38661.0],
 'latitude': [40.0, 41.724, 43.428, 45.111, 46.769, 48.399, 50.0],
 'longitude': [-100.0, -101.441, -102.959, -104.563, -106.267, -108.076, -110.0],
 'time': [1577869200, 1577869500, 1577869800, 1577870100, 1577870400, 1577870700, 1577871000]}
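The rounding applied to time and coordinates can be reproduced with the standard library. In this sketch, times are treated as UTC and coordinates are rounded to three decimals; the three-decimal precision is read off the example output above and is an assumption about the library, not a documented value.

```python
from datetime import datetime, timezone

# First and last waypoint times from the example, interpreted as UTC.
times = [
    datetime(2020, 1, 1, 9, 0, tzinfo=timezone.utc),
    datetime(2020, 1, 1, 9, 30, tzinfo=timezone.utc),
]
unix_seconds = [round(t.timestamp()) for t in times]

# Assumed precision of three decimals for geospatial coordinates.
latitude = [40.0, 41.72431]
rounded_latitude = [round(value, 3) for value in latitude]
```

The resulting integers match the first and last entries of the 'time' list in the example output.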
- pycontrails.core.vector.vector_to_lon_lat_grid(vector, agg, *, spatial_bbox=(-180.0, -90.0, 180.0, 90.0), spatial_grid_res=0.5)¶
Convert vectors to a longitude-latitude grid.
- Parameters:
vector (GeoVectorDataset) – Contains the longitude, latitude and variables for aggregation.
agg (dict[str, str]) – Variable name and the function selected for aggregation, e.g. {"segment_length": "sum"}.
spatial_bbox (tuple[float, float, float, float]) – Spatial bounding box, (lon_min, lat_min, lon_max, lat_max), [\(\deg\)]. By default, the entire globe is used.
spatial_grid_res (float) – Spatial grid resolution, [\(\deg\)]
- Returns:
xarray.Dataset
– Aggregated variables in a longitude-latitude grid.
Examples
>>> rng = np.random.default_rng(234)
>>> vector = GeoVectorDataset(
...     longitude=rng.uniform(-10, 10, 10000),
...     latitude=rng.uniform(-10, 10, 10000),
...     altitude=np.zeros(10000),
...     time=np.zeros(10000).astype("datetime64[ns]"),
... )
>>> vector["foo"] = rng.uniform(0, 1, 10000)
>>> ds = vector.to_lon_lat_grid({"foo": "sum"}, spatial_bbox=(-10, -10, 9.5, 9.5))
>>> da = ds["foo"]
>>> da.coords
Coordinates:
  * longitude  (longitude) float64 320B -10.0 -9.5 -9.0 -8.5 ... 8.0 8.5 9.0 9.5
  * latitude   (latitude) float64 320B -10.0 -9.5 -9.0 -8.5 ... 8.0 8.5 9.0 9.5
>>> da.values.round(2)
array([[2.23, 0.67, 1.29, ..., 4.66, 3.91, 1.93],
       [4.1 , 3.84, 1.34, ..., 3.24, 1.71, 4.55],
       [0.78, 3.25, 2.33, ..., 3.78, 2.93, 2.33],
       ...,
       [1.97, 3.02, 1.84, ..., 2.37, 3.87, 2.09],
       [3.74, 1.6 , 4.01, ..., 4.6 , 4.27, 3.4 ],
       [2.97, 0.12, 1.33, ..., 3.54, 0.74, 2.59]])
>>> da.sum().item() == vector["foo"].sum()
np.True_
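The aggregation itself can be sketched as binning points into grid cells and summing per cell. This is a minimal sketch for the "sum" aggregation only. It assumes each grid coordinate labels the lower-left corner of a cell of width spatial_grid_res, which is consistent with the example above, where points between 9.5 and 10 degrees still contribute to the total. The helper name lon_lat_grid_sum is hypothetical, and the result is a dict keyed by cell index rather than an xarray.Dataset.

```python
import math


def lon_lat_grid_sum(lons, lats, values, bbox=(-180.0, -90.0, 180.0, 90.0), res=0.5):
    """Hypothetical helper: sum values per (lon_index, lat_index) grid cell."""
    lon_min, lat_min, lon_max, lat_max = bbox
    n_lon = round((lon_max - lon_min) / res) + 1
    n_lat = round((lat_max - lat_min) / res) + 1
    cells = {}
    for lon, lat, value in zip(lons, lats, values):
        # Assumption: a point belongs to the cell whose lower-left corner is at or below it.
        i = math.floor((lon - lon_min) / res)
        j = math.floor((lat - lat_min) / res)
        if 0 <= i < n_lon and 0 <= j < n_lat:
            cells[(i, j)] = cells.get((i, j), 0.0) + value
    return cells


lons = [-10.0, -9.8, 9.7]
lats = [-10.0, -9.9, 9.6]
values = [1.0, 2.0, 4.0]
cells = lon_lat_grid_sum(lons, lats, values, bbox=(-10.0, -10.0, 9.5, 9.5))
total = sum(cells.values())
```

The first two points fall into the same cell and are summed, and the total over all cells equals the total over all points, mirroring the final check in the example.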