openghg_inversions.models.components

Reusable PyMC model graph helpers.

These helpers are xarray-first and must be called inside an active PyMC model context (for example, within a with pm.Model(): block). They return PyTensor/PyMC tensors and should not implement their own coordinate sanitization policy; coordinate handling lives in openghg_inversions.models.coords.

Naming conventions:
  • data_name – name for the registered pm.Data

  • var_name – name for the latent random variable

  • output_name – name for the aligned deterministic output

  • a plain name is reserved for helpers that truly create only one semantic object, or where a base name is the clearest API

Frequency indicators may be supplied explicitly or derived from observation coordinates using shared helper logic based on openghg_inversions.inversion_inputs.make_freq_indicator.
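As a rough illustration of what deriving a frequency indicator from observation coordinates can look like, the sketch below maps each observation time to a 0-based monthly period index with NumPy. The function name monthly_indicator and the monthly frequency are assumptions made for this example; the actual signature and behaviour of make_freq_indicator are not documented on this page.

```python
import numpy as np


def monthly_indicator(times):
    """Hypothetical stand-in: map each observation time to a 0-based month index."""
    # Truncate every timestamp to its calendar month.
    months = np.asarray(times).astype("datetime64[M]")
    # np.unique sorts the distinct months; the inverse gives one period index per observation.
    _, indicator = np.unique(months, return_inverse=True)
    return indicator


times = np.array(
    ["2019-01-03", "2019-01-20", "2019-02-11", "2019-03-01", "2019-03-30"],
    dtype="datetime64[D]",
)
indicator = monthly_indicator(times)
```

The resulting array has one entry per observation, which is the shape an observation-aligned indicator needs in order to index a per-period latent variable.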

class openghg_inversions.models.components.LinearComponentResult(data: TensorVariable, latent: TensorVariable, output: TensorVariable)

Bases: object

Objects created by add_linear_component.

data: TensorVariable
latent: TensorVariable
output: TensorVariable

openghg_inversions.models.components.add_inferpymc_likelihood_component(data: Dataset, /, mu: TensorVariable, mu_bc: TensorVariable | None, sigprior: dict, offset: TensorVariable | None = None, power: dict | float = 1.99, pollution_events_from_obs: bool = False, no_model_error: bool = False, sigma_per_site: bool = True, output_dim: str = 'nmeasure') → TensorVariable

Add the inferpymc observation model.

mu is the non-baseline forward-model contribution. mu_bc is the baseline contribution, usually H_bc @ bc, plus offset if applicable.

Parameters:
  • data – Canonical inferpymc input dataset.

  • mu – Non-baseline forward-model contribution.

  • mu_bc – Baseline contribution, if present.

  • sigprior – Prior specification for sigma.

  • offset – Optional aligned offset term.

  • power – Scalar or prior specification controlling pollution-event scaling.

  • pollution_events_from_obs – Whether to derive pollution events from the observations instead of mu.

  • no_model_error – Whether to bypass the model-error term.

  • sigma_per_site – Whether sigma varies by site.

  • output_dim – Observation/output dimension name.

Returns:

The epsilon deterministic variable used by the observation model.
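To make the roles of power, pollution_events_from_obs and no_model_error more concrete, here is a simplified NumPy sketch of one plausible form of the total observation uncertainty. The formula is an assumption for illustration: this page does not state the exact expression, and the real component works on PyTensor variables and may include additional terms. The names epsilon_sketch, y and y_error are hypothetical.

```python
import numpy as np


def epsilon_sketch(y, y_error, mu, mu_bc=None, sigma=1.0, power=1.99,
                   pollution_events_from_obs=False, no_model_error=False):
    """Illustrative only; not the library implementation."""
    y = np.asarray(y, dtype=float)
    y_error = np.asarray(y_error, dtype=float)
    mu = np.asarray(mu, dtype=float)
    if no_model_error:
        # Bypass the model-error term: only the measurement error remains.
        return y_error
    if pollution_events_from_obs:
        # Assumed form: enhancement of the observations above the baseline.
        baseline = 0.0 if mu_bc is None else np.asarray(mu_bc, dtype=float)
        pollution_event = np.abs(y - baseline)
    else:
        # Assumed form: size of the modelled non-baseline contribution.
        pollution_event = np.abs(mu)
    # Model error scales with the pollution event raised to `power`.
    model_error = sigma * pollution_event ** power
    # Combine measurement and model error in quadrature.
    return np.sqrt(y_error ** 2 + model_error ** 2)


eps_from_mu = epsilon_sketch([10.0], [3.0], [4.0], mu_bc=[2.0], power=1.0)
eps_from_obs = epsilon_sketch([10.0], [3.0], [4.0], mu_bc=[2.0], power=1.0,
                              pollution_events_from_obs=True)
eps_no_model = epsilon_sketch([10.0], [3.0], [4.0], no_model_error=True)
```

In this sketch, larger pollution events inflate the uncertainty, so observations dominated by strong local signals carry less weight in the likelihood.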

openghg_inversions.models.components.add_linear_component(data: DataArray, /, data_name: str, prior_args: dict, var_name: str, output_name: str, output_dim: str = 'nmeasure', compute_deterministic: bool = True) → LinearComponentResult

Add a linear latent component and its aligned forward-model contribution.

Parameters:
  • data – Sensitivity matrix or other linear data term.

  • data_name – Name used when registering the data as pm.Data.

  • prior_args – Prior specification for the latent random variable.

  • var_name – Name for the latent random variable.

  • output_name – Name for the aligned deterministic output.

  • output_dim – Observation/output dimension name.

  • compute_deterministic – Whether to wrap the aligned output in pm.Deterministic.

Returns:

A LinearComponentResult containing the registered data tensor, the effective latent variable, and the aligned output tensor.
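The arithmetic behind a linear component can be sketched with plain NumPy: a sensitivity matrix multiplied by a latent vector yields one value per observation. The class LinearSketchResult and the function linear_component_sketch are hypothetical stand-ins, and the (nmeasure, nparam) orientation of the matrix is an assumption; the real helper registers pm.Data, draws the latent variable from a prior, and returns tensors.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class LinearSketchResult:
    """Hypothetical NumPy analogue of the three documented result fields."""
    data: np.ndarray
    latent: np.ndarray
    output: np.ndarray


def linear_component_sketch(h, x):
    h = np.asarray(h, dtype=float)  # assumed shape: (nmeasure, nparam)
    x = np.asarray(x, dtype=float)  # assumed shape: (nparam,)
    # The aligned output has one entry per observation.
    return LinearSketchResult(data=h, latent=x, output=h @ x)


h = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]])
x = np.array([2.0, 4.0])
result = linear_component_sketch(h, x)
```

Returning all three objects together lets a caller reuse the latent variable or the data tensor elsewhere in the model without looking them up by name.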

openghg_inversions.models.components.add_model_data(data: DataArray, name: str | None = None) → TensorVariable

Add labelled xarray data to the active PyMC model.

Parameters:
  • data – Xarray data to register as pm.Data.

  • name – Optional PyMC variable name. If omitted, data.name is used.

Returns:

The registered pm.Data tensor for data.

Raises:

ValueError – If no name can be determined for the data variable.
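The name-resolution rule described above (explicit name first, then data.name, otherwise an error) can be sketched in isolation. The helper resolve_data_name is hypothetical and takes the array's name as a plain argument so the example stays free of xarray and PyMC.

```python
def resolve_data_name(data_name, name=None):
    """Hypothetical sketch of the documented fallback from `name` to `data.name`."""
    resolved = name if name is not None else data_name
    if resolved is None:
        # Neither an explicit name nor a name on the data is available.
        raise ValueError("No name could be determined for the data variable.")
    return resolved


from_data = resolve_data_name("H")                # falls back to the data's own name
explicit = resolve_data_name("H", name="H_bc")    # explicit name takes precedence
```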

openghg_inversions.models.components.add_offset_component(site_indicator: DataArray, /, prior_args: dict, offset_freq_indicator: DataArray | ndarray | None = None, offset_freq: str | None = None, var_name: str = 'offset_latent', output_name: str = 'offset', output_dim: str = 'nmeasure', drop_first: bool = False) → TensorVariable

Add a site-only or site-by-period offset component.

Parameters:
  • site_indicator – Observation-aligned site indicator.

  • prior_args – Prior specification for the offset latent variable.

  • offset_freq_indicator – Optional explicit observation-aligned offset frequency indicator.

  • offset_freq – Optional frequency string used to derive an indicator when offset_freq_indicator is not provided.

  • var_name – Name for the latent offset variable.

  • output_name – Name for the aligned deterministic offset output.

  • output_dim – Observation/output dimension name.

  • drop_first – Whether to omit the first site indicator column.

Returns:

The aligned offset deterministic variable.
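For the site-only case, aligning an offset to observations can be pictured as a one-hot site matrix multiplied by a vector of per-site offsets. The sketch below assumes site_index holds an integer site number per observation and that drop_first makes the first site a zero-offset reference; both are assumptions for illustration, and offset_sketch is a hypothetical name.

```python
import numpy as np


def offset_sketch(site_index, offset_latent, n_sites, drop_first=False):
    """Illustrative only: align per-site offsets to observations."""
    site_index = np.asarray(site_index)
    # One row per observation, one indicator column per site.
    one_hot = np.eye(n_sites)[site_index]
    if drop_first:
        # Omit the first site's column, so that site receives no offset.
        one_hot = one_hot[:, 1:]
    return one_hot @ np.asarray(offset_latent, dtype=float)


site_index = [0, 0, 1, 2, 1]
all_sites = offset_sketch(site_index, [0.5, -1.0, 2.0], n_sites=3)
reference_first = offset_sketch(site_index, [-1.0, 2.0], n_sites=3, drop_first=True)
```

With drop_first enabled the latent vector has one fewer entry than there are sites, which is a common way to avoid an offset that is degenerate with a shared baseline term.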

openghg_inversions.models.components.add_sigma_component(site_indicator: DataArray, /, prior_args: dict, sigma_freq_index: DataArray | None = None, sigma_freq: str | None = None, var_name: str = 'sigma', output_name: str | None = None, per_site: bool = True, output_dim: str = 'nmeasure', compute_deterministic: bool = False) → TensorVariable

Add inferpymc-compatible sigma terms and align them to observations.

Parameters:
  • site_indicator – Observation-aligned site indicator.

  • prior_args – Prior specification for the sigma random variable.

  • sigma_freq_index – Optional explicit observation-aligned frequency indicator.

  • sigma_freq – Optional frequency string used to derive an indicator when sigma_freq_index is not provided.

  • var_name – Name for the latent sigma random variable.

  • output_name – Optional name for an observation-aligned deterministic output.

  • per_site – Whether sigma varies by site.

  • output_dim – Observation/output dimension name.

  • compute_deterministic – Whether to register the aligned sigma term as a deterministic variable.

Returns:

The observation-aligned sigma tensor or deterministic variable.

Raises:

ValueError – If no frequency information is available.
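Aligning sigma to observations amounts to picking, for each observation, the entry that matches its site and its frequency period. The NumPy sketch below assumes sigma is stored as a (site, period) array and that per_site=False collapses the site axis to a single row; these layouts are assumptions, and sigma_sketch is a hypothetical name.

```python
import numpy as np


def sigma_sketch(sigma, site_index, freq_index, per_site=True):
    """Illustrative only: pick one sigma value per observation."""
    sigma = np.asarray(sigma, dtype=float)    # assumed shape: (nsite or 1, nperiod)
    site_index = np.asarray(site_index)
    freq_index = np.asarray(freq_index)
    # Without per-site variation every observation reads from the single shared row.
    rows = site_index if per_site else np.zeros_like(site_index)
    return sigma[rows, freq_index]


site_index = [0, 1, 1, 0]
freq_index = [0, 0, 1, 1]
per_site_sigma = sigma_sketch([[0.1, 0.2], [0.3, 0.4]], site_index, freq_index)
shared_sigma = sigma_sketch([[0.1, 0.2]], site_index, freq_index, per_site=False)
```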

openghg_inversions.models.components.get_model_latent(variable: TensorVariable, base_name: str) → TensorVariable

Return the effective latent variable for a named model component.

Parameters:
  • variable – User-facing variable returned by parse_prior.

  • base_name – Base model variable name used to look up a reparameterized latent variable.

Returns:

The reparameterized latent variable {base_name}_latent when it is present on the active model, otherwise variable.

openghg_inversions.models.components.resolve_model_variable(model: Model, base_name: str) → TensorVariable | None

Return a named model variable, preferring the reparameterized latent form.

Parameters:
  • model – PyMC model to inspect.

  • base_name – Base variable name to resolve.

Returns:

The reparameterized latent variable {base_name}_latent when it is present on model, otherwise the user-facing variable named base_name. Returns None if neither variable exists.
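The lookup order shared by get_model_latent and resolve_model_variable can be sketched with a plain dictionary standing in for the model's named variables. The function names and the dictionary stand-in are hypothetical; the real helpers inspect a PyMC model and return tensors.

```python
def resolve_variable_sketch(named_vars, base_name):
    """Prefer '{base_name}_latent', then 'base_name', else None."""
    latent_name = f"{base_name}_latent"
    if latent_name in named_vars:
        return named_vars[latent_name]
    return named_vars.get(base_name)


def get_latent_sketch(named_vars, variable, base_name):
    """Prefer '{base_name}_latent', otherwise fall back to the given variable."""
    return named_vars.get(f"{base_name}_latent", variable)


# Strings stand in for tensors; 'x' was reparameterized, 'bc' was not.
named_vars = {"x": "x (user-facing)", "x_latent": "x (latent)", "bc": "bc (user-facing)"}

resolved_x = resolve_variable_sketch(named_vars, "x")
resolved_bc = resolve_variable_sketch(named_vars, "bc")
resolved_missing = resolve_variable_sketch(named_vars, "offset")
latent_bc = get_latent_sketch(named_vars, "bc (user-facing)", "bc")
```

The difference between the two sketches mirrors the documented return values: one falls back to the variable passed in, the other to a name lookup that may yield None.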