pycontrails.GeoVectorDataset
- class pycontrails.GeoVectorDataset(data=None, *, longitude=None, latitude=None, altitude=None, altitude_ft=None, level=None, time=None, attrs=None, copy=True, **attrs_kwargs)
Bases: VectorDataset
Base class to hold 1D geospatial arrays of consistent size.
GeoVectorDataset is required to have geospatial coordinate keys defined in required_keys.
Expect latitude-longitude CRS in WGS 84. Expect altitude in [\(m\)]. Expect level in [\(hPa\)].
Each spatial variable is expected to have “float32” or “float64” dtype. The time variable is expected to have “datetime64[ns]” dtype.
- Parameters:
data (dict[str, npt.ArrayLike] | pd.DataFrame | VectorDataset | None, optional) – Data dictionary or pandas.DataFrame. Must include keys/columns time, latitude, longitude, altitude or level. Keyword arguments for time, latitude, longitude, altitude or level override data inputs. Expects altitude in meters and time as a DatetimeLike (or array that can be processed with pd.to_datetime()). Additional waypoint-specific data can be included as additional keys/columns.
longitude (npt.ArrayLike, optional) – Longitude data. Defaults to None.
latitude (npt.ArrayLike, optional) – Latitude data. Defaults to None.
altitude (npt.ArrayLike, optional) – Altitude data, [\(m\)]. Defaults to None.
altitude_ft (npt.ArrayLike, optional) – Altitude data, [\(ft\)]. Defaults to None.
level (npt.ArrayLike, optional) – Level data, [\(hPa\)]. Defaults to None.
time (npt.ArrayLike, optional) – Time data. Expects an array of DatetimeLike values, or array that can be processed with pd.to_datetime(). Defaults to None.
attrs (dict[Hashable, Any] | AttrDict, optional) – Additional properties as a dictionary. Defaults to {}.
copy (bool, optional) – Copy data on class creation. Defaults to True.
**attrs_kwargs (Any) – Additional properties passed as keyword arguments.
- Raises:
KeyError – Raises if data input does not contain at least time, latitude, longitude, (altitude or level).
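The key requirement above can be sketched in plain Python. This only illustrates the rule and is not pycontrails' implementation: the helper name check_geo_keys is hypothetical, and the two tuples simply mirror the required_keys and vertical_keys attributes documented on this page.

```python
REQUIRED_KEYS = ("longitude", "latitude", "time")
VERTICAL_KEYS = ("altitude", "level", "altitude_ft")


def check_geo_keys(data):
    """Raise KeyError unless ``data`` has every required key and a vertical key.

    Hypothetical helper for illustration only.
    """
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise KeyError(f"Missing required keys: {missing}")
    if not any(key in data for key in VERTICAL_KEYS):
        raise KeyError(f"Need at least one of: {VERTICAL_KEYS}")


# Passes: all required keys plus one vertical coordinate (level).
check_geo_keys(
    {"longitude": [0.0], "latitude": [0.0], "time": ["2022-03-01"], "level": [250.0]}
)
```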
- __init__(data=None, *, longitude=None, latitude=None, altitude=None, altitude_ft=None, level=None, time=None, attrs=None, copy=True, **attrs_kwargs)
Methods
T_isa() – Calculate the ICAO standard atmosphere temperature at each point.
__init__([data, longitude, latitude, ...])
broadcast_attrs(keys[, overwrite, raise_error])
broadcast_numeric_attrs([ignore_keys, overwrite])
coords_intersect_met(met) – Return boolean mask of data inside the bounding box defined by met.
copy(**kwargs) – Return a copy of this instance.
create_empty([keys, attrs]) – Create instance with variables defined by keys and size 0.
downselect_met(met, *[, longitude_buffer, ...]) – Downselect met to encompass a spatiotemporal region of the data.
ensure_vars(vars[, raise_error])
filter(mask[, copy]) – Filter data according to a boolean array mask.
from_dict(obj[, copy]) – Create instance from dict representation containing data and attrs.
generate_splits(n_splits[, copy]) – Split instance into n_splits sub-vectors.
get(key[, default_value])
get_data_or_attr(key[, default])
intersect_met(mda, *[, longitude, latitude, ...]) – Intersect waypoints with MetDataArray.
select(keys[, copy]) – Return new class instance only containing specified keys.
setdefault(key[, default]) – Shortcut to VectorDataDict.setdefault().
sort(by) – Sort data by key(s).
sum(vectors[, infer_attrs, fill_value]) – Sum a list of VectorDataset instances.
to_dataframe([copy]) – Create pd.DataFrame in which each key-value pair in data is a column.
to_dict() – Create dictionary with data and attrs.
to_geojson_points() – Return dataset as GeoJSON FeatureCollection of Points.
to_lon_lat_grid(agg, *[, spatial_bbox, ...]) – Convert vectors to a longitude-latitude grid.
transform_crs(crs) – Transform trajectory data from one coordinate reference system (CRS) to another.
update([other]) – Update values in data dict without warning if overwriting.
Attributes
air_pressure – Get air_pressure values for points.
altitude – Get altitude.
altitude_ft – Get altitude in feet.
attrs – Generic dataset attributes
constants – Return a dictionary of constant attributes and data values.
coords – Get geospatial coordinates for compatibility with MetDataArray.
data – Vector data with labels as keys and numpy.ndarray as values
dataframe – Shorthand property to access to_dataframe() with copy=False.
hash – Generate a unique hash for this class instance.
level – Get pressure level values for points.
required_keys – Required keys for creating GeoVectorDataset
shape – Shape of each array in data.
size – Length of each array in data.
vertical_keys – At least one of these vertical-coordinate keys must also be included
- T_isa()
Calculate the ICAO standard atmosphere temperature at each point.
- Returns:
npt.NDArray[np.floating] – ISA temperature, [\(K\)]
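For orientation, the standard-atmosphere temperature profile can be sketched as below. This is a minimal stand-alone version using the usual ISA constants, valid for the troposphere and the isothermal layer above it (up to about 20 km). The function name isa_temperature is hypothetical, and this is not claimed to be how T_isa() is implemented.

```python
T_MSL = 288.15           # ISA sea-level temperature, K
LAPSE_RATE = -0.0065     # ISA tropospheric lapse rate, K/m
H_TROPOPAUSE = 11_000.0  # ISA tropopause altitude, m


def isa_temperature(altitude_m):
    """Return the ISA temperature in K at ``altitude_m`` meters.

    Temperature falls linearly up to the tropopause and is held
    constant above it.
    """
    h = min(altitude_m, H_TROPOPAUSE)
    return T_MSL + LAPSE_RATE * h


print(isa_temperature(0.0))
print(isa_temperature(11_000.0))
```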
- property air_pressure
Get air_pressure values for points.
- Returns:
npt.NDArray[np.floating] – Point air pressure values, [\(Pa\)]
- property altitude
Get altitude.
Automatically calculates altitude from the level key using units.pl_to_m().
Note that if the altitude key exists in data, the data at the altitude key will be returned. This allows an override of the default calculation of altitude from pressure level.
- Returns:
npt.NDArray[np.floating] – Altitude, [\(m\)]
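A pressure-level-to-altitude conversion of the kind described above can be sketched with the ISA barometric formulas. The constants are the usual ISA values and the function name pressure_level_to_altitude is hypothetical; whether units.pl_to_m() uses exactly these constants is an assumption.

```python
import math

P_MSL = 101_325.0   # sea-level pressure, Pa
T_MSL = 288.15      # sea-level temperature, K
LAPSE = 0.0065      # tropospheric lapse rate magnitude, K/m
G = 9.80665         # gravitational acceleration, m/s^2
R_D = 287.05        # gas constant of dry air, J/(kg K)
H_TROP = 11_000.0   # tropopause altitude, m
T_TROP = T_MSL - LAPSE * H_TROP                           # tropopause temperature, K
P_TROP = P_MSL * (T_TROP / T_MSL) ** (G / (R_D * LAPSE))  # tropopause pressure, Pa


def pressure_level_to_altitude(level_hpa):
    """Convert a pressure level in hPa to an ISA altitude in meters."""
    p = level_hpa * 100.0
    if p >= P_TROP:
        # Troposphere: invert p = p0 * (1 - L h / T0) ** (g / (R L)).
        return (T_MSL / LAPSE) * (1.0 - (p / P_MSL) ** (R_D * LAPSE / G))
    # Isothermal layer above the tropopause: pressure decays exponentially.
    return H_TROP + (R_D * T_TROP / G) * math.log(P_TROP / p)


print(pressure_level_to_altitude(200.0))
```

For 200 hPa this gives roughly 11.8 km, which is in line with the 38661 ft reported for level 200 in the to_dict() example on this page.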
- property altitude_ft
Get altitude in feet.
- Returns:
npt.NDArray[np.floating] – Altitude, [\(ft\)]
- attrs
Generic dataset attributes
- broadcast_attrs(keys, overwrite=False, raise_error=True)
Attach values from keys in attrs onto data.
If possible, use dtype = np.float32 when broadcasting. If not possible, use whatever dtype is inferred from the data by numpy.full().
- broadcast_numeric_attrs(ignore_keys=None, overwrite=False)
Attach numeric values in attrs onto data.
Iterate through values in attrs and attach float and int values to data.
This method modifies the object in place.
- property constants
Return a dictionary of constant attributes and data values.
Includes attrs and values from columns in data with a unique value.
- Returns:
dict[str, Any] – Properties and their constant values
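The idea behind constants can be sketched with plain dictionaries and lists. The function name collect_constants is hypothetical, and the precedence used here (a constant data column overwrites an attribute of the same name) is an assumption that the text above does not settle.

```python
def collect_constants(data, attrs):
    """Merge ``attrs`` with every data column that holds one unique value."""
    out = dict(attrs)
    for key, values in data.items():
        unique = set(values)
        if len(unique) == 1:
            # Assumption: constant columns take precedence over attrs.
            out[key] = unique.pop()
    return out


data = {"engine": ["CFM56", "CFM56", "CFM56"], "altitude": [10000.0, 10500.0, 11000.0]}
attrs = {"aircraft_type": "B737"}
print(collect_constants(data, attrs))
```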
- property coords
Get geospatial coordinates for compatibility with MetDataArray.
- Returns:
pandas.DataFrame – pd.DataFrame with columns longitude, latitude, level, and time.
- coords_intersect_met(met)
Return boolean mask of data inside the bounding box defined by met.
- Parameters:
met (MetDataset | MetDataArray) – MetDataset or MetDataArray to compare.
- Returns:
npt.NDArray[np.bool_] – True if point is inside the bounding box defined by met.
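A bounding-box mask of this kind can be illustrated as follows. The sketch covers only longitude and latitude with bounds passed in explicitly; the documented method takes its bounds from met, and treating the edges as inclusive is an assumption. The function name inside_bbox is hypothetical.

```python
def inside_bbox(lons, lats, lon_bounds, lat_bounds):
    """Return one boolean per point: True if the point lies inside the box."""
    lon_min, lon_max = lon_bounds
    lat_min, lat_max = lat_bounds
    return [
        lon_min <= lon <= lon_max and lat_min <= lat <= lat_max
        for lon, lat in zip(lons, lats)
    ]


mask = inside_bbox([0.0, 50.0, 200.0], [0.0, 10.0, 0.0], (-10.0, 60.0), (-5.0, 5.0))
print(mask)
```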
- copy(**kwargs)
Return a copy of this instance.
- Parameters:
**kwargs (Any) – Additional keyword arguments passed into the constructor of the returned class.
- Returns:
Self – Copy of class
- classmethod create_empty(keys=None, attrs=None, **attrs_kwargs)
Create instance with variables defined by keys and size 0.
If instance requires additional variables to be defined, these keys will automatically be attached to returned instance.
- Parameters:
keys (Iterable[str]) – Keys to include in empty VectorDataset instance.
attrs (dict[str, Any] | None, optional) – Attributes to attach to the instance.
**kwargs (Any) – Additional keyword arguments passed into the constructor of the returned class.
- Returns:
Self – Empty VectorDataset instance.
- data
Vector data with labels as keys and numpy.ndarray as values
- property dataframe
Shorthand property to access to_dataframe() with copy=False.
- Returns:
pandas.DataFrame – Equivalent to the output from to_dataframe()
- downselect_met(met, *, longitude_buffer=(0.0, 0.0), latitude_buffer=(0.0, 0.0), level_buffer=(0.0, 0.0), time_buffer=(np.timedelta64(0, 'h'), np.timedelta64(0, 'h')))
Downselect met to encompass a spatiotemporal region of the data.
Changed in version 0.54.5: Returned object is no longer copied.
- Parameters:
met (MetDataset | MetDataArray) – MetDataset or MetDataArray to downselect.
longitude_buffer (tuple[float, float], optional) – Extend the longitude domain by longitude_buffer[0] on the low side and longitude_buffer[1] on the high side. Units must be the same as class coordinates. Defaults to (0, 0) degrees.
latitude_buffer (tuple[float, float], optional) – Extend the latitude domain by latitude_buffer[0] on the low side and latitude_buffer[1] on the high side. Units must be the same as class coordinates. Defaults to (0, 0) degrees.
level_buffer (tuple[float, float], optional) – Extend the level domain by level_buffer[0] on the low side and level_buffer[1] on the high side. Units must be the same as class coordinates. Defaults to (0, 0) [\(hPa\)].
time_buffer (tuple[np.timedelta64, np.timedelta64], optional) – Extend the time domain by time_buffer[0] on the low side and time_buffer[1] on the high side. Units must be the same as class coordinates. Defaults to (np.timedelta64(0, "h"), np.timedelta64(0, "h")).
- Returns:
MetDataset | MetDataArray – Downselected MetDataset or MetDataArray.
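The low-side and high-side buffer arithmetic can be sketched on its own. This shows only how a (low, high) buffer pair widens the range spanned by a coordinate; how the widened range is then applied to met is not shown, and the function name buffered_domain is hypothetical.

```python
from datetime import datetime, timedelta


def buffered_domain(values, buffer):
    """Return (low, high) bounds of ``values`` widened by ``buffer``."""
    low_buffer, high_buffer = buffer
    return min(values) - low_buffer, max(values) + high_buffer


# Longitude in degrees: widen by 1 degree on the low side, 2 on the high side.
lon_bounds = buffered_domain([0.0, 25.0, 50.0], (1.0, 2.0))

# Time: widen by one hour on each side.
times = [datetime(2022, 3, 1, 0), datetime(2022, 3, 1, 2)]
time_bounds = buffered_domain(times, (timedelta(hours=1), timedelta(hours=1)))

print(lon_bounds)
print(time_bounds)
```

A nonzero time buffer is a natural choice when later steps need met data slightly before the first waypoint or after the last one.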
- ensure_vars(vars, raise_error=True)
Ensure variables exist in column of data or attrs.
- Parameters:
vars (str | Iterable[str]) – A single string variable name or a sequence of string variable names.
raise_error (bool, optional) – Raise KeyError if data does not contain variables. Defaults to True.
- Returns:
bool – True if all variables exist. False otherwise.
- Raises:
KeyError – Raises when dataset does not contain variable in vars
- filter(mask, copy=True, **kwargs)
Filter data according to a boolean array mask.
Entries corresponding to mask == True are kept.
- Parameters:
mask (npt.NDArray[np.bool_]) – Boolean array with compatible shape.
copy (bool, optional) – Copy data on filter. Defaults to True. See numpy best practices for insight into whether copy is appropriate.
**kwargs (Any) – Additional keyword arguments passed into the constructor of the returned class.
- Returns:
Self – Containing filtered data
- Raises:
TypeError – If mask is not a boolean array.
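The masking behavior described above, including the TypeError for a non-boolean mask, can be sketched with plain lists. The function name filter_data is hypothetical and the sketch always builds new lists, so it corresponds loosely to copy=True.

```python
def filter_data(data, mask):
    """Keep the entries of every column where ``mask`` is True."""
    if not all(isinstance(m, bool) for m in mask):
        raise TypeError("mask must be a boolean array")
    return {
        key: [value for value, keep in zip(values, mask) if keep]
        for key, values in data.items()
    }


data = {"longitude": [0.0, 10.0, 20.0], "latitude": [1.0, 2.0, 3.0]}
print(filter_data(data, [True, False, True]))
```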
- classmethod from_dict(obj, copy=True, **obj_kwargs)
Create instance from dict representation containing data and attrs.
- Parameters:
obj (dict[str, Any]) – Dict representation of VectorDataset (e.g. to_dict())
copy (bool, optional) – Passed to VectorDataset constructor. Defaults to True.
**obj_kwargs (Any) – Additional properties passed as keyword arguments.
- Returns:
Self – VectorDataset instance.
- generate_splits(n_splits, copy=True)
Split instance into n_splits sub-vectors.
- Parameters:
n_splits (int) – Number of splits.
copy (bool, optional) – Passed into filter(). Defaults to True. Recommend to keep as True based on numpy best practices.
- Returns:
Generator[Self, None, None] – Generator of split vectors.
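Splitting into sub-vectors can be pictured as a generator over contiguous chunks. The chunk-size rule below (sizes differ by at most one, larger chunks first) mirrors numpy.array_split and is an assumption about how the splits are sized; the function name split_chunks is hypothetical.

```python
def split_chunks(values, n_splits):
    """Yield ``n_splits`` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(values), n_splits)
    start = 0
    for i in range(n_splits):
        # The first ``extra`` chunks each take one additional element.
        stop = start + base + (1 if i < extra else 0)
        yield values[start:stop]
        start = stop


for chunk in split_chunks(list(range(10)), 3):
    print(chunk)
```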
- get(key, default_value=None)
- get_data_or_attr(key, default=<object object>)
This method first checks if key is in data and returns the value if so. If key is not in data, then this method checks if key is in attrs and returns the value if so. If key is not in data or attrs, then the default value is returned if provided. Otherwise a KeyError is raised.
- Parameters:
- Returns:
Any – Value at data[key] or attrs[key]
- Raises:
KeyError – If key is not in data or attrs and default is not provided.
Examples
>>> from pycontrails import VectorDataset
>>> vector = VectorDataset({"a": [1, 2, 3]}, attrs={"b": 4})
>>> vector.get_data_or_attr("a")
array([1, 2, 3])
>>> vector.get_data_or_attr("b")
4
>>> vector.get_data_or_attr("c")
Traceback (most recent call last):
...
KeyError: "Key 'c' not found in data or attrs."
>>> vector.get_data_or_attr("c", default=5)
5
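The default=<object object> in the signature suggests a sentinel default, which lets the method tell "no default given" apart from an explicit default=None. A stand-alone sketch of that lookup order on plain dictionaries follows; the names lookup and _MISSING are hypothetical and this is not the library's code.

```python
_MISSING = object()  # unique sentinel, so that None remains a valid default


def lookup(data, attrs, key, default=_MISSING):
    """Return data[key], else attrs[key], else ``default`` if one was given."""
    if key in data:
        return data[key]
    if key in attrs:
        return attrs[key]
    if default is not _MISSING:
        return default
    raise KeyError(f"Key '{key}' not found in data or attrs.")


data = {"a": [1, 2, 3]}
attrs = {"b": 4}
print(lookup(data, attrs, "a"))
print(lookup(data, attrs, "c", default=None))
```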
- property hash
Generate a unique hash for this class instance.
- Returns:
str – Unique hash for flight instance (sha1)
- intersect_met(mda, *, longitude=None, latitude=None, level=None, time=None, use_indices=False, **interp_kwargs)
Intersect waypoints with MetDataArray.
- Parameters:
mda (MetDataArray) – MetDataArray containing a meteorological variable at spatio-temporal coordinates.
longitude (npt.NDArray[np.floating], optional) – Override existing coordinates for met interpolation
latitude (npt.NDArray[np.floating], optional) – Override existing coordinates for met interpolation
level (npt.NDArray[np.floating], optional) – Override existing coordinates for met interpolation
time (npt.NDArray[np.datetime64], optional) – Override existing coordinates for met interpolation
use_indices (bool, optional) – Experimental.
**interp_kwargs (Any) – Additional keyword arguments to pass to MetDataArray.intersect_met(). Examples include method, bounds_error, and fill_value. If an error such as
ValueError: One of the requested xi is out of bounds in dimension 2
occurs, try calling this function with bounds_error=False. In addition, setting fill_value=0.0 will replace NaN values with 0.0.
- Returns:
npt.NDArray[np.floating] – Interpolated values
Examples
>>> from datetime import datetime
>>> import pandas as pd
>>> import numpy as np
>>> from pycontrails.datalib.ecmwf import ERA5
>>> from pycontrails import Flight
>>> # Get met data
>>> times = (datetime(2022, 3, 1, 0), datetime(2022, 3, 1, 3))
>>> variables = ["air_temperature", "specific_humidity"]
>>> levels = [300, 250, 200]
>>> era5 = ERA5(time=times, variables=variables, pressure_levels=levels)
>>> met = era5.open_metdataset()
>>> # Example flight
>>> df = pd.DataFrame()
>>> df['longitude'] = np.linspace(0, 50, 10)
>>> df['latitude'] = np.linspace(0, 10, 10)
>>> df['altitude'] = 11000
>>> df['time'] = pd.date_range("2022-03-01T00", "2022-03-01T02", periods=10)
>>> fl = Flight(df)
>>> # Intersect
>>> fl.intersect_met(met['air_temperature'], method='nearest')
array([231.62969892, 230.72604651, 232.24318771, 231.88338483,
       231.06429438, 231.59073409, 231.65125393, 231.93064004,
       232.03344087, 231.65954432])
>>> fl.intersect_met(met['air_temperature'], method='linear')
array([225.77794552, 225.13908414, 226.231218  , 226.31831528,
       225.56102321, 225.81192149, 226.03192642, 226.22056121,
       226.03770174, 225.63226188])
>>> # Interpolate and attach to `Flight` instance
>>> for key in met:
...     fl[key] = fl.intersect_met(met[key])
>>> # Show the final three columns of the dataframe
>>> fl.dataframe.iloc[:, -3:].head()
                 time  air_temperature  specific_humidity
0 2022-03-01 00:00:00       225.777946           0.000132
1 2022-03-01 00:13:20       225.139084           0.000132
2 2022-03-01 00:26:40       226.231218           0.000107
3 2022-03-01 00:40:00       226.318315           0.000171
4 2022-03-01 00:53:20       225.561022           0.000109
- property level
Get pressure level values for points.
Automatically calculates pressure level from the altitude key using units.m_to_pl().
Note that if the level key exists in data, the data at the level key will be returned. This allows an override of the default calculation of pressure level from altitude.
- Returns:
npt.NDArray[np.floating] – Point pressure level values, [\(hPa\)]
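An altitude-to-pressure-level conversion can be sketched with the ISA barometric formulas, using the usual ISA constants. The function name altitude_to_pressure_level is hypothetical, and whether units.m_to_pl() uses exactly these constants is an assumption.

```python
import math

P_MSL = 101_325.0   # sea-level pressure, Pa
T_MSL = 288.15      # sea-level temperature, K
LAPSE = 0.0065      # tropospheric lapse rate magnitude, K/m
G = 9.80665         # gravitational acceleration, m/s^2
R_D = 287.05        # gas constant of dry air, J/(kg K)
H_TROP = 11_000.0   # tropopause altitude, m
T_TROP = T_MSL - LAPSE * H_TROP                           # tropopause temperature, K
P_TROP = P_MSL * (T_TROP / T_MSL) ** (G / (R_D * LAPSE))  # tropopause pressure, Pa


def altitude_to_pressure_level(altitude_m):
    """Convert an ISA altitude in meters to a pressure level in hPa."""
    if altitude_m <= H_TROP:
        # Troposphere: temperature falls linearly with height.
        p = P_MSL * (1.0 - LAPSE * altitude_m / T_MSL) ** (G / (R_D * LAPSE))
    else:
        # Isothermal layer above the tropopause: exponential decay.
        p = P_TROP * math.exp(-G * (altitude_m - H_TROP) / (R_D * T_TROP))
    return p / 100.0


print(altitude_to_pressure_level(11_000.0))
```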
- required_keys = ('longitude', 'latitude', 'time')
Required keys for creating GeoVectorDataset
- select(keys, copy=True)
Return new class instance only containing specified keys.
- Parameters:
keys (Iterable[str]) – An iterable of keys to filter by.
copy (bool, optional) – Copy data on selection. Defaults to True.
- Returns:
VectorDataset – VectorDataset containing only data associated to keys. Note that this method always returns a VectorDataset, even if the calling class is a proper subclass of VectorDataset.
- setdefault(key, default=None)
Shortcut to VectorDataDict.setdefault().
- Parameters:
- Returns:
numpy.ndarray – Values at key
- sort(by)
Sort data by key(s).
This method always creates a copy of the data by calling pandas.DataFrame.sort_values().
- Parameters:
by (str | list[str]) – Key or list of keys to sort by.
- Returns:
Self – Instance with sorted data.
- classmethod sum(vectors, infer_attrs=True, fill_value=None)
Sum a list of VectorDataset instances.
- Parameters:
vectors (Sequence[VectorDataset]) – List of VectorDataset instances to concatenate.
infer_attrs (bool, optional) – If True, infer attributes from the first element in the sequence.
fill_value (float, optional) – Fill value to use when concatenating arrays. By default None, which raises an error if incompatible keys are found.
- Returns:
VectorDataset – Sum of all instances in vectors.
- Raises:
KeyError – If incompatible data keys are found among vectors.
Examples
>>> from pycontrails import VectorDataset
>>> v1 = VectorDataset({"a": [1, 2, 3], "b": [4, 5, 6]})
>>> v2 = VectorDataset({"a": [7, 8, 9], "b": [10, 11, 12]})
>>> v3 = VectorDataset({"a": [13, 14, 15], "b": [16, 17, 18]})
>>> v = VectorDataset.sum([v1, v2, v3])
>>> v.dataframe
    a   b
0   1   4
1   2   5
2   3   6
3   7  10
4   8  11
5   9  12
6  13  16
7  14  17
8  15  18
- to_dataframe(copy=True)
Create pd.DataFrame in which each key-value pair in data is a column.
DataFrame does not copy data by default. Use the copy parameter to copy data values on creation.
- Parameters:
copy (bool, optional) – Copy data on DataFrame creation.
- Returns:
pandas.DataFrame – DataFrame holding key-values as columns.
- to_dict()
Create dictionary with data and attrs.
If geo-spatial coordinates (e.g. "latitude", "longitude", "altitude") are present, round to a reasonable precision. If a "time" variable is present, round to unix seconds. When the instance is a GeoVectorDataset, disregard any "altitude" or "level" coordinate and only include "altitude_ft" in the output.
Examples
>>> import pprint
>>> import numpy as np
>>> from pycontrails import Flight
>>> fl = Flight(
...     longitude=[-100, -110],
...     latitude=[40, 50],
...     level=[200, 200],
...     time=[np.datetime64("2020-01-01T09"), np.datetime64("2020-01-01T09:30")],
...     aircraft_type="B737",
... )
>>> fl = fl.resample_and_fill("5min")
>>> pprint.pprint(fl.to_dict())
{'aircraft_type': 'B737',
 'altitude_ft': [38661.0, 38661.0, 38661.0, 38661.0, 38661.0, 38661.0, 38661.0],
 'latitude': [40.0, 41.724, 43.428, 45.111, 46.769, 48.399, 50.0],
 'longitude': [-100.0,
               -101.441,
               -102.959,
               -104.563,
               -106.267,
               -108.076,
               -110.0],
 'time': [1577869200,
          1577869500,
          1577869800,
          1577870100,
          1577870400,
          1577870700,
          1577871000]}
- to_geojson_points()
Return dataset as GeoJSON FeatureCollection of Points.
Each Feature has a properties attribute that includes time and other data besides latitude, longitude, and altitude in data.
- Returns:
dict[str, Any] – Python representation of GeoJSON FeatureCollection
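The general shape of such a FeatureCollection can be sketched from plain lists. GeoJSON orders Point coordinates as longitude, latitude, then optional elevation; the exact property layout and value formatting that pycontrails emits are not specified above, so the details below are assumptions and the function name points_feature_collection is hypothetical.

```python
import json

COORD_KEYS = ("longitude", "latitude", "altitude")


def points_feature_collection(data):
    """Build a GeoJSON FeatureCollection with one Point Feature per waypoint."""
    features = []
    for i in range(len(data["longitude"])):
        features.append(
            {
                "type": "Feature",
                "geometry": {
                    "type": "Point",
                    "coordinates": [data[key][i] for key in COORD_KEYS],
                },
                # Everything that is not a coordinate goes into properties.
                "properties": {
                    key: values[i]
                    for key, values in data.items()
                    if key not in COORD_KEYS
                },
            }
        )
    return {"type": "FeatureCollection", "features": features}


data = {
    "longitude": [-100.0, -110.0],
    "latitude": [40.0, 50.0],
    "altitude": [11000.0, 11000.0],
    "time": [1577869200, 1577871000],
}
fc = points_feature_collection(data)
print(json.dumps(fc["features"][0], indent=2))
```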
- to_lon_lat_grid(agg, *, spatial_bbox=(-180.0, -90.0, 180.0, 90.0), spatial_grid_res=0.5)
Convert vectors to a longitude-latitude grid.
See also
vector_to_lon_lat_grid
- transform_crs(crs)
Transform trajectory data from one coordinate reference system (CRS) to another.
- update(other=None, **kwargs)
Update values in data dict without warning if overwriting.
- Parameters:
other (dict[str, npt.ArrayLike] | None, optional) – Fields to update as dict
**kwargs (npt.ArrayLike) – Fields to update as kwargs
- vertical_keys = ('altitude', 'level', 'altitude_ft')
At least one of these vertical-coordinate keys must also be included