openghg_inversions.inversion_data.get_data

Functions for retrieving observations and datasets for creating forward simulations.

Current data processing options include:

  • “data_processing_surface_notracer”: surface-based measurements, without tracers

Future data processing options will include:

  • “data_processing_surface_tracer”: surface-based measurements, with tracers

This module also includes functions for saving and loading “merged data” created by the data processing functions.

openghg_inversions.inversion_data.get_data.add_obs_error(sites: list[str], fp_all: dict, add_averaging_error: bool = True) → None

Create mf_error variable.

The mf_error variable contains mf_repeatability, mf_variability, or, if add_averaging_error is True, the square root of the sum of their squares.

This function modifies fp_all in place, adding mf_error and making sure that both mf_repeatability and mf_variability are present.

Note: if averaging_period is specified in data_processing_surface_notracer, then OpenGHG will add an mf_variability variable containing the standard deviation of the observations over the specified period. If mf_variability is already present (for instance, for Picarro data), the existing variable is overwritten. If the averaging_period matches the frequency of the data, mf_variability will be zero (since the standard deviation of one value is 0).

Parameters:
  • sites – list of site names to process

  • fp_all – dictionary of ModelScenario objects, keyed by site names

  • add_averaging_error – if True, combine repeatability and variability to make the mf_error variable. If False, mf_error equals mf_repeatability when it is present, and mf_variability otherwise.

Returns:

None, modifies fp_all in place.
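The combination rule described above can be sketched in plain Python. This is an illustration only, not the library's implementation: combine_errors_sketch is a hypothetical helper that works on lists of floats rather than on the ModelScenario objects in fp_all, and filling a missing input with zeros is an assumption made for the sketch.

```python
import math


def combine_errors_sketch(repeatability, variability, add_averaging_error=True):
    """Hypothetical helper: combine two per-observation error lists.

    Either input may be None to represent a missing variable.
    """
    if repeatability is None and variability is None:
        raise ValueError("need at least one of repeatability or variability")

    if add_averaging_error:
        # Assumption: a missing variable is treated as all zeros.
        n = len(repeatability if repeatability is not None else variability)
        rep = repeatability if repeatability is not None else [0.0] * n
        var = variability if variability is not None else [0.0] * n
        # Square root of the sum of the squares, element by element.
        return [math.hypot(r, v) for r, v in zip(rep, var)]

    # Without the averaging error, prefer repeatability when it exists.
    return list(repeatability if repeatability is not None else variability)


print(combine_errors_sketch([3.0], [4.0]))  # [5.0]
print(combine_errors_sketch([3.0], [4.0], add_averaging_error=False))  # [3.0]
```

When the variability is zero, as in the case noted above where the averaging period matches the data frequency, the combined error reduces to the repeatability.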

openghg_inversions.inversion_data.get_data.convert_to_list(x: list[str | None] | str | None, length: int, name: str | None = None) → list[str | None]

Convert variable that might be list, str, or None to a list of the expected size.

Parameters:
  • x – variable to convert to list

  • length – length of the output list

  • name – name to use for error message

Returns:

list of specified length; either the original list, or a list containing repeats of the input value

Raises:
  • ValueError – if input is a list and its length differs from the specified length.
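The documented behaviour can be sketched as follows. This is a minimal re-implementation for illustration, not the library's source: the name convert_to_list_sketch and the wording of the error message are assumptions.

```python
def convert_to_list_sketch(x, length, name=None):
    """Hypothetical stand-in mirroring the documented behaviour."""
    if isinstance(x, list):
        if len(x) != length:
            label = name if name is not None else "input"
            raise ValueError(
                f"{label} has length {len(x)}, but length {length} was expected"
            )
        # A list of the right length is returned unchanged.
        return x
    # A single str or None is repeated to fill the list.
    return [x] * length


print(convert_to_list_sketch("1H", 2))   # ['1H', '1H']
print(convert_to_list_sketch(None, 3))   # [None, None, None]
```

Broadcasting a single value in this way lets one setting, such as an averaging period, apply to every site without repeating it by hand.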

openghg_inversions.inversion_data.get_data.data_processing_surface_notracer(species: str, sites: list | str, domain: str, averaging_period: list[str | None] | str | None, start_date: str, end_date: str, obs_data_level: list[str | None] | str | None = None, platform: list[str | None] | str | None = None, inlet: list[str | None] | str | None = None, instrument: list[str | None] | str | None = None, max_level: int | None = None, calibration_scale: str | None = None, met_model: list[str | None] | str | None = None, fp_model: str | None = None, fp_height: list[str | None | Literal['auto']] | Literal['auto'] | str | None = None, fp_species: str | None = None, emissions_name: list | None = None, use_bc: bool = True, bc_input: str | None = None, bc_store: str | None = None, obs_store: str | list[str] | None = None, footprint_store: str | list[str] | None = None, emissions_store: str | None = None, averagingerror: bool = True, save_merged_data: bool = False, merged_data_name: str | None = None, merged_data_dir: str | None = None, output_name: str | None = None) → tuple[dict, list, list, list, list, list]

Retrieve and prepare fixed-surface datasets from specified OpenGHG object stores.

Use for forward simulations and model-data comparisons that do not use tracers.

Parameters:
  • species – Atmospheric trace gas species of interest e.g. “co2”

  • sites – List of measurement station/site abbreviations, e.g. [“MHD”, “TAC”]. Note: for satellite data, pass the site as “satellitename-obs_region”, e.g. “GOSAT-BRAZIL”, and pass the corresponding platform as “satellite”.

  • domain – Model domain region of interest; e.g. “EUROPE”

  • averaging_period – List of averaging periods to apply to mole fraction data, e.g. [“1H”, “1H”]. Note: len(averaging_period) must equal len(sites).

  • start_date – Date from which to gather data; e.g. “2020-01-01”

  • end_date – Date until which to gather data; e.g. “2020-02-01”

  • obs_data_level – ICOS observations data level. For non-ICOS sites use “None”

  • inlet – Specific inlet height for the site observations (length must match number of sites)

  • instrument – Specific instrument for the site (length must match number of sites)

  • max_level – Maximum atmospheric level to extract. Only needed if using satellite data.

  • calibration_scale – Convert measurements to defined calibration scale

  • met_model – Meteorological model used in the LPDM. List must be same length as number of sites.

  • fp_model – LPDM used for generating footprints.

  • fp_height – Inlet height used in footprints for corresponding sites.

  • fp_species – Species name associated with footprints in the object store

  • emissions_name – List of keywords associated with emissions files in the object store. Corresponds to “source” in OpenGHG.

  • use_bc – Option to include boundary conditions in model

  • bc_input – Variable for calling BC data from ‘bc_store’ - equivalent of ‘emissions_name’ for fluxes.

  • bc_store – Name of object store to retrieve boundary conditions data from.

  • obs_store – Name of object store to retrieve observations data from.

  • footprint_store – Name of object store to retrieve footprints data from.

  • emissions_store – Name of object store to retrieve emissions data from.

  • averagingerror – Adds the variability in the averaging period to the measurement error if set to True.

  • save_merged_data – Save forward simulations data and observations.

  • merged_data_name – Filename for saved forward simulations data and observations.

  • merged_data_dir – Directory path for saved forward simulations data and observations.

  • output_name – Optional name used to create merged data name.

Returns:

A tuple containing:

  • fp_all: dictionary containing flux data (key “.flux”), bc data (key “.bc”), and observations data (site short name as key)

  • sites: Updated list of sites. All names are converted to upper case, and any site whose data was not extracted correctly is dropped from the rest of the inversion.

  • inlet: List of inlet heights for the updated list of sites

  • fp_height: List of footprint heights for the updated list of sites

  • instrument: List of instruments for the updated list of sites

  • averaging_period: List of averaging_period for the updated list of sites

Return type:

tuple
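A call might look like the sketch below. The argument values are the illustrative ones from the parameter descriptions above, not a tested configuration, and the call itself only works with openghg_inversions installed and object stores already populated with matching observations, footprints, emissions and boundary conditions. The wrapper run_data_processing is a hypothetical helper added for this example; the import is deferred inside it so the arguments can be checked without the package.

```python
# Illustrative values taken from the parameter descriptions; assumptions only.
sites = ["MHD", "TAC"]

kwargs = {
    "species": "co2",
    "sites": sites,
    "domain": "EUROPE",
    "averaging_period": ["1H", "1H"],  # one entry per site
    "start_date": "2020-01-01",
    "end_date": "2020-02-01",
}


def run_data_processing(kwargs):
    """Hypothetical wrapper: call the function and unpack its six return values."""
    # Deferred import: requires openghg_inversions and populated object stores.
    from openghg_inversions.inversion_data.get_data import (
        data_processing_surface_notracer,
    )

    fp_all, sites, inlet, fp_height, instrument, averaging_period = (
        data_processing_surface_notracer(**kwargs)
    )
    return fp_all, sites, inlet, fp_height, instrument, averaging_period
```

Because sites with missing data are dropped, the returned sites list may be shorter than the one passed in; the other returned lists are aligned with the returned sites, so later steps should use those rather than the original inputs.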