openghg_inversions.models.coords
Helpers for managing xarray and PyMC coordinate mismatches.
Xarray coordinates can contain rich objects, including MultiIndex coordinates
from stacked or ragged dimensions such as nmeasure representing stacked
(site, time) observations. PyMC does not reliably accept all such objects as
model coordinates, so model construction should use sanitized, PyMC-safe coords.
The current sanitization policy is intentionally simple: convert each known
dimension coordinate to a range index. The original scientific coordinates are
stored separately so they can later be restored onto ArviZ InferenceData.
- class openghg_inversions.models.coords.CoordRegistry(pymc_coords: dict[str, numpy.ndarray] = <factory>, original_coords: dict[str, typing.Any] = <factory>, auxiliary_coords: dict[str, xarray.core.dataarray.DataArray] = <factory>)
Bases: object
Track scientific and PyMC-safe coordinates for a model.
- Variables:
pymc_coords (dict[str, numpy.ndarray]) – Sanitized coordinates actually registered with PyMC.
original_coords (dict[str, Any]) – Original scientific coordinates keyed by model dimension name.
auxiliary_coords (dict[str, xarray.core.dataarray.DataArray]) – Additional non-dimension coordinates attached to model dimensions, such as exploded time or site coordinates derived from a stacked nmeasure MultiIndex.
- add(coords: dict[str, Any] | Coordinates, *, model_dims: tuple[str, ...] | list[str] | set[str] | None = None) -> None
Register model and auxiliary coordinates with consistency checks.
- Parameters:
coords – Coordinate mapping or xarray coordinate container to register.
model_dims – Optional subset of model dimensions represented by the current data variable. Auxiliary coordinates attached to these dimensions are also preserved when possible.
- Raises:
ValueError – If the same coordinate name is registered more than once with conflicting lengths, shapes, or values.
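The consistency check described above can be illustrated with a minimal sketch. `TinyRegistry` is a hypothetical stand-in written for this example, not the library's `CoordRegistry`; it assumes that re-registering an identical coordinate is accepted and that only a mismatch in shape or values is rejected.

```python
import numpy as np


class TinyRegistry:
    """Hypothetical stand-in for a coordinate registry (not the library class)."""

    def __init__(self):
        self.original_coords = {}

    def add(self, coords):
        for name, values in coords.items():
            new = np.asarray(values)
            if name in self.original_coords:
                old = self.original_coords[name]
                # Assumption: identical re-registration is allowed;
                # differing shape or values is treated as a conflict.
                if old.shape != new.shape or not np.array_equal(old, new):
                    raise ValueError(f"conflicting values for coordinate {name!r}")
            else:
                self.original_coords[name] = new


registry = TinyRegistry()
registry.add({"site": ["MHD", "TAC"]})
registry.add({"site": ["MHD", "TAC"]})  # identical values, accepted

try:
    registry.add({"site": ["MHD", "BSD"]})  # same name, different values
    conflict_detected = False
except ValueError:
    conflict_detected = True
```

Failing early on a conflicting registration keeps two data variables from silently sharing a dimension name while disagreeing about what that dimension means.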
- openghg_inversions.models.coords.add_coords(coords: dict[str, ndarray] | Coordinates, *, model_dims: tuple[str, ...] | list[str] | set[str] | None = None) -> None
Register coordinates on the active model and capture scientific metadata.
- Parameters:
coords – Coordinate mapping or xarray coordinate container to register.
model_dims – Optional subset of model dimensions represented by the current data variable. When provided, auxiliary coordinates attached to those dimensions are also stored in the registry.
This helper must be called inside an active pm.Model context.
- openghg_inversions.models.coords.attach_coord_registry(model: Model, registry: CoordRegistry) -> None
Attach a coordinate registry to a PyMC model.
- openghg_inversions.models.coords.get_coord_registry(model: Model) -> CoordRegistry | None
Return the coordinate registry attached to a PyMC model, if any.
- openghg_inversions.models.coords.restore_inferencedata_coords(idata: InferenceData, coords_or_registry: CoordRegistry | dict[str, Any]) -> InferenceData
Restore saved scientific coordinates onto matching InferenceData groups.
- Parameters:
idata – Inference data object returned by sampling.
coords_or_registry – Either a CoordRegistry or a legacy mapping of original coordinates keyed by dimension name.
- Returns:
The same InferenceData object with compatible original coordinates and auxiliary coordinates restored onto its xarray groups.
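As a rough illustration of what restoring coordinates onto one xarray group might involve, the sketch below reattaches auxiliary `time` and `site` coordinates to a dataset indexed by a sanitized `nmeasure` range. The helper `restore_aux_coords` and the sample values are hypothetical, and the length check is an assumed reading of "compatible"; the library's own logic may differ.

```python
import numpy as np
import xarray as xr

# A posterior-like dataset whose nmeasure dimension carries only a range index.
ds = xr.Dataset(
    {"y": (("nmeasure",), np.array([1.0, 2.0, 3.0]))},
    coords={"nmeasure": np.arange(3)},
)

# Auxiliary coordinates saved before sanitization (illustrative values).
auxiliary = {
    "site": ("nmeasure", np.array(["MHD", "MHD", "TAC"])),
    "time": (
        "nmeasure",
        np.array(["2019-01-01", "2019-01-02", "2019-01-01"], dtype="datetime64[ns]"),
    ),
}


def restore_aux_coords(dataset, aux):
    """Hypothetical helper: attach only coordinates whose dimension and length match."""
    compatible = {
        name: (dim, values)
        for name, (dim, values) in aux.items()
        if dim in dataset.dims and len(values) == dataset.sizes[dim]
    }
    return dataset.assign_coords(compatible)


restored = restore_aux_coords(ds, auxiliary)
```

Skipping coordinates that do not match leaves groups with different dimensions untouched rather than raising an error.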
- openghg_inversions.models.coords.sanitize_coords_for_pymc(coords: dict[str, Any] | Coordinates | object, *, model_dims: tuple[str, ...] | list[str] | set[str] | None = None) -> dict[str, ndarray]
Convert coordinate metadata into the range-based format to use with PyMC.
PyMC accepts fewer coordinate types than Xarray, so for simplicity, we convert all coordinates to range coordinates, and use the range coordinates with PyMC.
- Parameters:
coords – Coordinate mapping or xarray coordinate container.
model_dims – Optional subset of dimensions to sanitize. When omitted, all dimensions found in coords are considered.
- Returns:
A mapping from model dimension name to a simple np.arange index of the corresponding length.
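The range-based policy can be sketched in a few lines. `sanitize_sketch` is a hypothetical stand-in that takes a plain dict of sequences; it is not the library function and does not handle xarray coordinate containers or MultiIndex objects directly.

```python
import numpy as np


def sanitize_sketch(coords, model_dims=None):
    """Hypothetical stand-in: map each dimension to np.arange(len(values))."""
    dims = coords.keys() if model_dims is None else model_dims
    # Assumption: dimensions requested but absent from coords are skipped.
    return {dim: np.arange(len(coords[dim])) for dim in dims if dim in coords}


coords = {
    # Stacked (site, time) observations, represented here as plain tuples.
    "nmeasure": [("MHD", "2019-01-01"), ("MHD", "2019-01-02"), ("TAC", "2019-01-01")],
    "region": ["north", "south"],
}

safe = sanitize_sketch(coords)                           # every dimension
subset = sanitize_sketch(coords, model_dims={"region"})  # only the listed dimension
```

Because only the length of each coordinate survives, the original values have to be kept elsewhere if they are to be restored after sampling.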