openghg_inversions.utils#
Module containing common Python functions that can be called when running HBMCMC and other inversion models.
The main functions are related to applying basis functions to the flux and boundary conditions, and their sensitivities.
Many functions in this submodule originated in the ACRG code base (in acrg.name).
- openghg_inversions.utils.areagrid(lat: ndarray, lon: ndarray) ndarray#
Calculate grid of areas (m^2), given arrays of latitudes and longitudes.
- Parameters:
lat – 1D array of latitudes.
lon – 1D array of longitudes.
- Returns:
2D array of grid-cell areas with shape (len(lat), len(lon)).
- Return type:
np.ndarray
Examples
>>> import numpy as np
>>> from openghg_inversions import utils
>>> lat = np.arange(50., 60., 1.)
>>> lon = np.arange(0., 10., 1.)
>>> area = utils.areagrid(lat, lon)
- openghg_inversions.utils.combine_datasets(dataset_a: Dataset, dataset_b: Dataset, method: str | None = 'nearest', tolerance: float | None = None) Dataset#
Merges two datasets and re-indexes to the first dataset.
If an “fp” variable is found in the combined dataset, the “time” values where the “lat” and “lon” dimensions did not match are removed.
NOTE: this is a temporary solution while waiting for .load() to be added to the openghg version of combine_datasets.
- Parameters:
dataset_a – First dataset to merge
dataset_b – Second dataset to merge
method – One of None, ‘nearest’, ‘ffill’, ‘bfill’. See xarray.DataArray.reindex_like for the full list of options and their meaning. Defaults to ‘nearest’.
tolerance – Maximum allowed tolerance between matches.
- Returns:
Combined dataset indexed to dataset_a
- Return type:
xr.Dataset
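Example
A minimal sketch of typical use; the variable names and coordinate values below are illustrative placeholders, not real footprint or flux data.
>>> import numpy as np
>>> import xarray as xr
>>> from openghg_inversions.utils import combine_datasets
>>> ds_a = xr.Dataset(
...     {"a": (("lat", "lon"), np.zeros((3, 3)))},
...     coords={"lat": [50.0, 51.0, 52.0], "lon": [0.0, 1.0, 2.0]},
... )
>>> ds_b = xr.Dataset(
...     {"b": (("lat", "lon"), np.ones((3, 3)))},
...     coords={"lat": [50.1, 51.1, 52.1], "lon": [0.1, 1.1, 2.1]},
... )
>>> combined = combine_datasets(ds_a, ds_b, method="nearest", tolerance=0.5)  # re-indexed to ds_a's lat/lon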
- openghg_inversions.utils.get_country(domain: str, country_file: str | Path | None = None)#
Open country file for given domain and return as a SimpleNamespace.
NOTE: a SimpleNamespace is like a dict with class-like attribute access
- Parameters:
domain – domain of inversion
country_file – optional string or Path to country file. If None, then the first file found in openghg_inversions/countries/ is used.
- Returns:
SimpleNamespace with attributes lon, lat, lonmax, lonmin, latmax, latmin, country, and name.
- Return type:
SimpleNamespace
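Example
A minimal sketch, assuming an inversion domain named "EUROPE" and a matching country file under openghg_inversions/countries/ (both are assumptions for illustration).
>>> from openghg_inversions.utils import get_country
>>> countries = get_country("EUROPE")  # "EUROPE" is a hypothetical domain name
>>> countries.name  # country names defined in the country file
>>> countries.country  # country map, typically gridded over (lat, lon)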
- openghg_inversions.utils.get_country_file_path(country_file: str | Path | None = None, domain: str | None = None)#
- openghg_inversions.utils.open_ds(path: str | Path, chunks: dict | None = None, combine: Literal['by_coords', 'nested'] = 'by_coords') Dataset#
Efficiently open xarray Datasets.
- Parameters:
path – Path to file to open.
chunks – Size of chunks for each dimension, e.g. {‘lat’: 50, ‘lon’: 50}. If given, the dataset is opened lazily with dask, so that the data is not all loaded into memory. Defaults to None, in which case the dataset is opened without dask.
combine – Way in which the data should be combined (if using chunks): either ‘by_coords’ (order the datasets before concatenating; default) or ‘nested’ (concatenate the datasets in the order supplied).
- Returns:
Opened xarray Dataset.
- Return type:
xr.Dataset
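Example
A minimal sketch; the file path is a placeholder and the chunk sizes should be adapted to the data.
>>> from openghg_inversions.utils import open_ds
>>> ds_lazy = open_ds("/path/to/footprints.nc", chunks={"lat": 50, "lon": 50})  # lazy, dask-backed
>>> ds_eager = open_ds("/path/to/footprints.nc")  # chunks=None: opened without dask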
- openghg_inversions.utils.read_netcdfs(files: list[str] | list[Path], dim: str = 'time', chunks: dict | None = None, verbose: bool = True) Dataset#
Use xarray to open sequential netCDF files and concatenate them along the specified dimension.
Note: this function makes sure that each file is closed after the open_dataset call.
- Parameters:
files – List of netCDF filenames.
dim – Dimension of netCDF to use for concatenating the files. Default = “time”.
chunks – Size of chunks for each dimension, e.g. {‘lat’: 50, ‘lon’: 50}. If given, the datasets are opened lazily with dask, so that the data is not all loaded into memory. Defaults to None, in which case the datasets are opened without dask.
verbose – If True, print progress information.
- Returns:
All files open as one concatenated xarray.Dataset object.
- Return type:
xr.Dataset
Note
This could most likely be done more efficiently with xr.open_mfdataset.
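Example
A minimal sketch; the file names are placeholders for sequential netCDF files sharing a “time” dimension.
>>> from openghg_inversions.utils import read_netcdfs
>>> files = ["/path/to/fp_201801.nc", "/path/to/fp_201802.nc"]  # placeholder paths
>>> ds = read_netcdfs(files, dim="time", chunks={"lat": 50, "lon": 50}, verbose=False)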