Metadata and validation
User metadata
User metadata is a flat or nested JSON-compatible dictionary attached to each record. Any key–value pairs are accepted by default.
record = catalog.add_file(
    path,
    metadata={
        "species": "CO2",
        "year": 2024,
        "tags": ["paris", "europe"],
    },
)
Keys and values must be JSON-serialisable (strings, numbers, booleans, lists, or nested dictionaries).
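One way to check this constraint before adding a record is to try encoding the metadata with the standard library's json module. This is a minimal sketch: the helper name is_json_serialisable is hypothetical and not part of ogcat, and whether ogcat performs exactly this check internally is an assumption.

```python
import json


def is_json_serialisable(metadata):
    """Return True if json.dumps accepts the metadata as strict JSON.

    Hypothetical helper for illustration; not part of ogcat.
    """
    try:
        # allow_nan=False rejects NaN and Infinity, which are not valid JSON.
        json.dumps(metadata, allow_nan=False)
    except (TypeError, ValueError):
        return False
    return True


ok = is_json_serialisable(
    {"species": "CO2", "year": 2024, "tags": ["paris", "europe"]}
)
# A Python set has no JSON representation, so this metadata fails the check.
bad = is_json_serialisable({"tags": {"paris", "europe"}})
```

Converting sets to lists (and dates to ISO strings) before ingest is usually enough to pass a check like this.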
List values can be searched with contains (list-membership) filters, which match a record when the list includes the given item:
matches = catalog.search(contains={"tags": "paris"})
The equivalent CLI forms are:
ogcat search --catalog ./my-catalog tags:paris
ogcat search --catalog ./my-catalog --contains tags=paris
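The list-membership matching described above can be pictured with a small pure-Python stand-in. It does not call ogcat: contains_match is a hypothetical function, and the fallback to equality for scalar values is an assumption rather than documented behaviour.

```python
def contains_match(metadata, key, wanted):
    """Hypothetical stand-in for a contains filter on a single key."""
    value = metadata.get(key)
    if isinstance(value, list):
        # List metadata: match when the wanted item is a member of the list.
        return wanted in value
    # Assumption: scalar metadata is compared by plain equality.
    return value == wanted


records = [
    {"species": "CO2", "tags": ["paris", "europe"]},
    {"species": "CH4", "tags": ["london", "europe"]},
]

# Only the first record carries the "paris" tag.
matches = [r for r in records if contains_match(r, "tags", "paris")]
```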
When list metadata is used in a naming template, list items are joined with
hyphens before path-safe normalisation, so ["a", "b", "c"] renders as
a-b-c.
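The join-then-normalise order can be sketched as follows. The function render_template_value is hypothetical, and the normalisation rule used here (replacing anything outside letters, digits, dot, underscore, and hyphen with an underscore) is an assumption; ogcat's actual path-safe rules may differ.

```python
import re


def render_template_value(value):
    """Hypothetical sketch of rendering one metadata value for a path."""
    # Step 1: list items are joined with hyphens; scalars pass through.
    items = value if isinstance(value, list) else [value]
    joined = "-".join(str(item) for item in items)
    # Step 2 (assumed rule): replace path-unsafe characters with "_".
    return re.sub(r"[^A-Za-z0-9._-]+", "_", joined)


rendered = render_template_value(["a", "b", "c"])
```

Because the join happens first, the hyphens between items survive normalisation while unsafe characters inside an item are still replaced.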
Record schemas
A record schema declares which metadata fields a catalog expects. Schemas
are stored in catalog.json and are purely advisory unless you also enable
strict validation.
from ogcat import CatalogSpec, RecordSchema, MetadataFieldDescription

spec = CatalogSpec(
    catalog_name="fluxes",
    default_schema=RecordSchema(
        metadata_fields=[
            MetadataFieldDescription(
                name="species",
                description="Chemical species code.",
                example="CO2",
                required=True,
            ),
            MetadataFieldDescription(
                name="year",
                description="Calendar year.",
                example=2024,
                required=True,
            ),
        ],
    ),
)
Named schemas let one catalog hold different record types with different metadata expectations:
spec = CatalogSpec(
    catalog_name="measurements",
    record_schemas={
        "surface": RecordSchema(
            metadata_fields=[
                MetadataFieldDescription(name="site", description="Site code.", required=True),
            ],
        ),
        "satellite": RecordSchema(
            metadata_fields=[
                MetadataFieldDescription(name="platform", description="Satellite name.", required=True),
            ],
        ),
    },
)
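To make the idea of per-type expectations concrete, the sketch below models schemas as plain lists of required field names and picks one by record type. It does not use ogcat's classes; effective_required_fields is a hypothetical helper, and falling back to the default schema when the record type is unknown is an assumption, not documented behaviour.

```python
def effective_required_fields(record_schemas, default_fields, record_type=None):
    """Hypothetical lookup of the required fields that apply to a record."""
    if record_type in record_schemas:
        return record_schemas[record_type]
    # Assumption: records without a matching named schema use the default.
    return default_fields


# Plain-data stand-ins for the named schemas in the example above.
record_schemas = {
    "surface": ["site"],
    "satellite": ["platform"],
}

surface_fields = effective_required_fields(record_schemas, [], "surface")
fallback_fields = effective_required_fields(record_schemas, [], "aircraft")
```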
Validation
Validation checks that required fields declared by the effective schema are present in the record’s user metadata.
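from ogcat import validate_metadata

# `schema` is the effective RecordSchema for this record.
report = validate_metadata(record.user_metadata, schema)
if report.issues:
    for issue in report.issues:
        # Each issue names the offending field path and explains the problem.
        print(issue.path, issue.message)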
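The required-field check itself amounts to a membership test. The following stand-alone sketch illustrates the idea without calling ogcat; missing_required is a hypothetical helper, and treating a key with an empty or null value as "present" is an assumption.

```python
def missing_required(metadata, required_fields):
    """Return the required field names that are absent from the metadata.

    Hypothetical illustration; the order of required_fields is preserved.
    """
    # Assumption: a field counts as present if its key exists at all.
    return [name for name in required_fields if name not in metadata]


missing = missing_required({"species": "CO2"}, ["species", "year"])
complete = missing_required({"species": "CO2", "year": 2024}, ["species", "year"])
```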
Validation is run automatically during add_file() and add_artifact().
Missing required fields are errors and block ingest. Use a
before_validate_metadata hook to fill defaults before validation, or call
validate_metadata() directly when you want to inspect a report without
writing a record.
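The body of a default-filling hook could look like the sketch below. The hook's real signature and registration mechanism are not shown in this section, so fill_defaults is written as a plain, hypothetical function over dictionaries that would need adapting to whatever arguments ogcat actually passes.

```python
def fill_defaults(metadata, defaults):
    """Return a copy of metadata with defaults applied to absent keys.

    Hypothetical helper; values already supplied by the user always win.
    """
    filled = dict(defaults)   # start from the defaults
    filled.update(metadata)   # user-supplied values override them
    return filled


user_metadata = {"species": "CO2"}
filled = fill_defaults(user_metadata, {"species": "CH4", "year": 2024})
```

Returning a new dictionary rather than mutating the input keeps the caller's original metadata unchanged, which makes the function easier to test in isolation.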
The ogcat fields command
ogcat fields --catalog ./my-catalog
ogcat fields --catalog ./my-catalog --record-type surface
ogcat fields --catalog ./my-catalog --json
This prints the declared metadata fields from the catalog spec.
To discover fields actually present in stored records, use:
ogcat fields --catalog ./my-catalog --stored
ogcat fields --catalog ./my-catalog --values species
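Conceptually, discovering stored fields means scanning the metadata of every record. The sketch below shows that idea over plain dictionaries; it does not reproduce ogcat's implementation or output format, both function names are hypothetical, and flattening list values into individual items is an assumption.

```python
from collections import Counter


def stored_fields(records):
    """Count how many records carry each metadata key."""
    counts = Counter()
    for metadata in records:
        counts.update(metadata.keys())
    return dict(counts)


def distinct_values(records, field):
    """Collect the distinct values seen for one field.

    Assumes values are scalars or lists of scalars of one sortable type.
    """
    values = set()
    for metadata in records:
        if field not in metadata:
            continue
        value = metadata[field]
        # Assumption: list values contribute each item separately.
        values.update(value if isinstance(value, list) else [value])
    return sorted(values)


records = [
    {"species": "CO2", "year": 2024, "tags": ["paris", "europe"]},
    {"species": "CH4", "year": 2024},
    {"species": "CO2", "tags": ["london", "europe"]},
]
field_counts = stored_fields(records)
species_values = distinct_values(records, "species")
```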