Catalog records
A catalog record represents one catalogued artifact. Every record contains a fixed set of reserved fields plus three metadata namespaces.
Reserved fields
| Field | Description |
|---|---|
| id | Stable string identifier assigned at ingest time. |
| | Name of the catalog that owns the record. |
| record_type | Kind of artifact, e.g. managed_file. |
| | Describes where the artifact lives (see Locators and storage). |
| | How the artifact was stored, e.g. |
| | Source filename at ingest time. |
| | File suffix list derived from the source path. |
| | ISO 8601 timestamp when the record was created. |
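To make the last three rows of the table concrete, the sketch below derives a filename, a suffix list, and an ISO 8601 timestamp using only the Python standard library. It shows what such values can look like and is not ogcat's ingest code: the helper name `describe_source`, the dictionary keys, and the sample path are all hypothetical.

```python
from datetime import datetime, timezone
from pathlib import Path


def describe_source(source_path):
    """Return filename, suffix list and a creation timestamp for a path.

    Hypothetical helper: the key names are illustrative, not ogcat field names.
    """
    path = Path(source_path)
    return {
        "filename": path.name,      # final path component
        "suffixes": path.suffixes,  # every suffix, in order
        "created": datetime.now(timezone.utc).isoformat(),  # ISO 8601, UTC
    }


info = describe_source("data/co2_2020.tar.gz")
print(info["filename"])  # co2_2020.tar.gz
print(info["suffixes"])  # ['.tar', '.gz']
```

A suffix list, rather than a single extension, keeps compound endings such as `.tar.gz` intact.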
Metadata namespaces
user_metadata
: Key–value pairs supplied by the caller at ingest time. Any
JSON-serialisable value is accepted. This is the primary place to store
domain metadata such as species, year, or instrument.
derived_metadata
: Metadata added automatically during ingest by extractors and hooks. For
netCDF files this includes dimension names and sizes when xarray is
installed. Do not rely on derived metadata being present for every file
type.
naming_metadata
: Internal metadata used to evaluate directory and filename templates. You
do not normally need to read or set this directly.
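Two of the points above lend themselves to a short defensive sketch: checking that user metadata is JSON-serialisable before ingest, and reading derived metadata without assuming it exists. The helper names `check_user_metadata` and `netcdf_dims` are hypothetical, plain dictionaries stand in for the record's namespaces, and the `netcdf` / `dims` nesting with names mapped to sizes is an assumption based on the dotted path `derived_metadata.netcdf.dims.time` used in this page.

```python
import json


def check_user_metadata(metadata):
    """Return metadata unchanged, or raise TypeError if it is not JSON-serialisable."""
    json.dumps(metadata)  # raises TypeError on unsupported values such as sets
    return metadata


def netcdf_dims(derived_metadata):
    """Return the netCDF dimensions mapping, or {} when it is absent.

    Assumes a nested layout: derived_metadata["netcdf"]["dims"].
    """
    return derived_metadata.get("netcdf", {}).get("dims", {})


user = check_user_metadata({"species": "CO2", "year": 2020})
print(netcdf_dims({"netcdf": {"dims": {"time": 12}}}))  # {'time': 12}
print(netcdf_dims({}))                                  # {}
```

Returning an empty mapping for missing derived metadata keeps calling code free of special cases for file types that have no extractor.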
Searching across namespaces
When you search with an unqualified field name such as species, ogcat
looks in this order:
1. top-level record fields (id, record_type, …)
2. user_metadata
3. derived_metadata
Use an explicit dotted path to target a specific namespace:
user_metadata.species, derived_metadata.netcdf.dims.time, or the
short aliases user.species and derived.netcdf.dims.time.
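The lookup rules above can be sketched in plain Python. This is an illustration of the described search order under stated assumptions, not ogcat's implementation: the function `resolve`, the helper `_walk`, the dictionary-based record, and the sample values are all hypothetical, and the sketch assumes that an unqualified dotted path is tried against each scope in the same order.

```python
ALIASES = {"user": "user_metadata", "derived": "derived_metadata"}
NAMESPACES = ("user_metadata", "derived_metadata")
_MISSING = object()  # sentinel so that stored None values are still found


def _walk(mapping, keys):
    """Follow keys through nested dicts; return _MISSING if any step fails."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def resolve(record, field, default=None):
    """Look up a field name or dotted path in a record dictionary."""
    head, *rest = field.split(".")
    head = ALIASES.get(head, head)  # expand user. / derived. aliases
    if head in NAMESPACES:
        # Explicit namespace: search only there.
        value = _walk(record.get(head, {}), rest)
        return default if value is _MISSING else value
    # Unqualified name: top-level fields first, then each namespace in order.
    keys = [head, *rest]
    for scope in (record, *(record.get(ns, {}) for ns in NAMESPACES)):
        value = _walk(scope, keys)
        if value is not _MISSING:
            return value
    return default


record = {  # sample values for illustration only
    "id": "rec-001",
    "record_type": "managed_file",
    "user_metadata": {"species": "CO2"},
    "derived_metadata": {"species": "unknown", "netcdf": {"dims": {"time": 12}}},
}

print(resolve(record, "species"))                   # CO2
print(resolve(record, "derived.species"))           # unknown
print(resolve(record, "derived.netcdf.dims.time"))  # 12
```

In this sketch the user-supplied `species` shadows the derived one for unqualified searches, which is why an explicit namespace is useful when both namespaces carry the same key.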
Python API
Records are returned as CatalogRecord instances.
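```python
# `catalog` is an existing catalog object and `path` is the file to ingest.
record = catalog.add_file(path, metadata={"species": "CO2"})

print(record.id)             # stable identifier assigned at ingest
print(record.record_type)    # "managed_file"
print(record.user_metadata)  # {"species": "CO2", ...}
print(record.path())         # stored Path
```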