Catalog API
- class ogcat.Catalog(root, spec, repository, hook_manager=<factory>)
Bases: object
User-facing API bound to one catalog root.
- Parameters:
root (Path) – Root directory containing catalog.json, db.json, and managed files.
spec (CatalogSpec) – Catalog specification loaded from or written to catalog.json.
repository (CatalogRepository) – Record storage backend.
hook_manager (HookManager) – Dispatcher for lifecycle hooks.
- root: Path
- spec: CatalogSpec
- repository: CatalogRepository
- hook_manager: HookManager
- classmethod create(root, spec, *, plugins=None, hooks=None)
Create a catalog directory and write its specification.
- Parameters:
root (str | Path) – Directory to create or reuse for the catalog.
spec (CatalogSpec) – Catalog specification to persist.
plugins (PluginRegistry | None) – Optional plugin registry used to build a hook manager.
hooks (HookManager | None) – Optional hook manager. Pass either plugins or hooks, not both.
- Return type:
- Returns:
Open catalog instance bound to root.
- Raises:
ValueError – If the configured backend is unsupported, or both plugins and hooks are supplied.
- classmethod open(root, *, plugins=None, hooks=None)
Open an existing catalog from disk.
- Parameters:
root (str | Path) – Existing catalog root containing catalog.json.
plugins (PluginRegistry | None) – Optional plugin registry used to build a hook manager.
hooks (HookManager | None) – Optional hook manager. Pass either plugins or hooks, not both.
- Return type:
- Returns:
Open catalog instance bound to root.
- Raises:
FileNotFoundError – If catalog.json is missing.
ValueError – If the configured backend is unsupported, or both plugins and hooks are supplied.
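The either/or rule for plugins and hooks described for create() and open() can be illustrated with a small standalone sketch. The helper name resolve_hook_source and its return strings are hypothetical and are not part of ogcat; only the "both supplied raises ValueError" behaviour comes from the reference above.

```python
from typing import Optional


def resolve_hook_source(plugins: Optional[object] = None,
                        hooks: Optional[object] = None) -> str:
    """Hypothetical helper: report which argument would supply the hook manager."""
    if plugins is not None and hooks is not None:
        # Mirrors the documented ValueError when both arguments are supplied.
        raise ValueError("pass either 'plugins' or 'hooks', not both")
    if hooks is not None:
        return "hooks"      # a ready-made hook manager was passed in
    if plugins is not None:
        return "plugins"    # a hook manager would be built from the registry
    return "default"        # assumption: neither given means a default manager
```

Validating the pair up front keeps the failure at the call site rather than deep inside hook dispatch.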
- add_file(path, metadata=None, operation=None, record_type=None)
Add a local file using managed copy or move.
- Parameters:
path (str | Path) – Source file to ingest.
metadata (MetadataDict | None) – JSON-compatible user metadata.
operation (str | None) – "copy" or "move". Defaults to the catalog spec.
record_type (str | None) – Optional named schema to validate against.
- Return type:
- Returns:
Persisted catalog record.
- Raises:
TypeError – If metadata is not a dictionary.
ValueError – If validation fails, the operation is unsupported, or record_type names an unknown schema.
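The difference between the "copy" and "move" operations, and the fallback to a catalog-level default, can be sketched with the standard library alone. The function ingest_file, its managed_dir argument, and the default_operation parameter are hypothetical stand-ins; this is not how ogcat implements add_file().

```python
import shutil
from pathlib import Path
from typing import Optional


def ingest_file(source: Path, managed_dir: Path,
                operation: Optional[str] = None,
                default_operation: str = "copy") -> Path:
    """Hypothetical stand-in for a managed copy or move (not ogcat code)."""
    # Fall back to the catalog-level default when no operation is given.
    chosen = operation if operation is not None else default_operation
    if chosen not in ("copy", "move"):
        raise ValueError(f"unsupported operation: {chosen!r}")
    managed_dir.mkdir(parents=True, exist_ok=True)
    target = managed_dir / source.name
    if chosen == "copy":
        shutil.copy2(source, target)            # the source file stays in place
    else:
        shutil.move(str(source), str(target))   # the source file is removed
    return target
```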
- plan_artifact_storage(path=None, *, record_type=None, metadata=None, locator=None, target_kind='file', write_mode=None, ogcat_owned=True, storage_root=None)
Plan artifact storage without writing data or a catalog record.
- Parameters:
path (str | Path | None) – Optional local source path used for naming and copy/move plans.
record_type (str | None) – Optional named schema to validate and use for naming.
metadata (MetadataDict | None) – JSON-compatible user metadata.
locator (ArtifactLocator | None) – Optional pre-resolved target locator. When omitted, schema naming templates are rendered under storage_root or this catalog’s managed files root.
target_kind (Literal['file', 'directory']) – Whether the target is a file-like or directory-like artifact.
write_mode (Optional[Literal['copy', 'move', 'write', 'reference']]) – Desired materialisation mode. Defaults to "write" for owned artifacts and "reference" otherwise.
ogcat_owned (bool) – Whether ogcat should treat the target as managed.
storage_root (str | Path | None) – Optional local root or fsspec URL root for rendered template targets.
- Return type:
- Returns:
Planned storage decision.
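The documented default for write_mode ("write" for owned artifacts, "reference" otherwise) amounts to a small decision rule. The sketch below restates it as a standalone function; effective_write_mode is a hypothetical name, and rejecting unknown modes with ValueError is an assumption rather than documented behaviour.

```python
from typing import Optional

# The four modes listed in the write_mode parameter description.
VALID_WRITE_MODES = ("copy", "move", "write", "reference")


def effective_write_mode(write_mode: Optional[str] = None,
                         ogcat_owned: bool = True) -> str:
    """Hypothetical helper: resolve the materialisation mode for a plan."""
    if write_mode is None:
        # Documented default: owned artifacts are written, others referenced.
        return "write" if ogcat_owned else "reference"
    if write_mode not in VALID_WRITE_MODES:
        # Assumption: an unknown mode is rejected up front.
        raise ValueError(f"unsupported write_mode: {write_mode!r}")
    return write_mode
```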
- add_artifact(*, record_type, locator=None, storage_plan=None, metadata=None, storage_mode=None, original_path=None, original_filename=None, suffixes=None, derived_metadata=None, naming_metadata=None, time_added=None, source=None, artifact_writer=None, transaction=None)
Add an artifact record and optionally materialise planned storage.
This is the minimal general record API. add_file() remains the managed ingest convenience wrapper that prepares a path-backed locator and delegates through the same lifecycle.
- Parameters:
record_type (str) – Logical type of record to create.
locator (ArtifactLocator | None) – Artifact locator to store with the record. Required unless storage_plan is supplied.
storage_plan (StoragePlan | None) – Optional planned storage decision to use instead of a standalone locator.
metadata (MetadataDict | None) – JSON-compatible user metadata.
storage_mode (str | None) – Optional description such as "external".
original_path (str | Path | None) – Optional source path or URI.
original_filename (str | None) – Optional source filename.
suffixes (list[str] | None) – Optional suffix list for the source artifact.
derived_metadata (MetadataDict | None) – Optional derived metadata to persist.
naming_metadata (MetadataDict | None) – Optional naming metadata to persist.
time_added (str | None) – Optional timestamp override.
source (OperationSource | None) – Optional operation source for hooks and writers.
artifact_writer (ArtifactWriter | None) – Optional writer that materialises data before the record is written.
transaction (UnitOfWork | None) – Optional caller-owned unit of work.
- Return type:
- Returns:
Persisted or staged catalog record.
- Raises:
TypeError – If metadata or writer inputs are invalid.
ValueError – If validation fails or the transaction belongs to a different repository.
- transaction()
Create a best-effort unit of work for composed catalog operations.
The current TinyDB backend uses staged writes and compensating rollback actions. This context manager does not provide true database transactions or ACID semantics.
- Return type:
Iterator[UnitOfWork]
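To make "staged writes and compensating rollback actions" concrete, here is a minimal standalone sketch of that pattern. SketchUnitOfWork, stage(), on_rollback(), and sketch_transaction() are hypothetical names chosen for illustration; they are not ogcat's UnitOfWork API, and the real backend may differ in detail.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List


class SketchUnitOfWork:
    """Hypothetical best-effort unit of work (not ogcat's class)."""

    def __init__(self) -> None:
        self._staged: List[Callable[[], None]] = []
        self._undo: List[Callable[[], None]] = []

    def stage(self, write: Callable[[], None]) -> None:
        """Defer a record write until the block exits cleanly."""
        self._staged.append(write)

    def on_rollback(self, undo: Callable[[], None]) -> None:
        """Register a compensating action for a side effect already performed."""
        self._undo.append(undo)

    def commit(self) -> None:
        for write in self._staged:
            write()

    def rollback(self) -> None:
        # Undo in reverse order; keep going if one compensation fails,
        # which is why this is best effort rather than atomic.
        for undo in reversed(self._undo):
            try:
                undo()
            except Exception:
                pass
        self._staged.clear()


@contextmanager
def sketch_transaction() -> Iterator[SketchUnitOfWork]:
    uow = SketchUnitOfWork()
    try:
        yield uow
    except BaseException:
        uow.rollback()   # discard staged writes, run compensations
        raise
    uow.commit()         # a failure here is not rolled back in this sketch
```

Because side effects such as file copies happen before commit, a failed compensation can leave stray data behind. That is the gap between this pattern and true ACID transactions.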
- add_artifacts(artifacts)
Add multiple artifact records.
Each item should provide the same keyword-style fields accepted by add_artifact(). Items are added one at a time so hooks and artifact writers run consistently for each record. Earlier items remain committed if a later item fails.
- Parameters:
artifacts (list[dict[str, object]]) – List of dictionaries accepted by add_artifact().
- Return type:
list[CatalogRecord]
- Returns:
Persisted records in input order.
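The one-at-a-time behaviour, including the fact that earlier items stay committed when a later one fails, can be sketched as a plain loop. add_many and make_fake_add are hypothetical helpers that stand in for add_artifacts() and add_artifact(); the in-memory list plays the role of the repository.

```python
from typing import Any, Callable, Dict, List


def add_many(items: List[Dict[str, Any]],
             add_one: Callable[..., Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Hypothetical loop: add each item separately, with no batch rollback."""
    results = []
    for item in items:
        # Each dictionary is unpacked as keyword arguments, one call per item.
        results.append(add_one(**item))
    return results


def make_fake_add(store: List[Dict[str, Any]]) -> Callable[..., Dict[str, Any]]:
    """Build a fake single-record add that commits to an in-memory list."""
    def fake_add(*, record_type: str, metadata: Any = None) -> Dict[str, Any]:
        if not record_type:
            raise ValueError("record_type is required")
        record = {"record_type": record_type, "metadata": metadata or {}}
        store.append(record)   # committed immediately
        return record
    return fake_add
```

If the second of three items is invalid, the exception propagates, the first record remains in the store, and the third is never attempted. Callers that need all-or-nothing behaviour would have to compose the calls themselves.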
- search(query=None, *, where=None, contains=None, regex=None, match=None, exists=None, missing=None, ignore_case=False, as_record_set=False)
Search catalog records using backend-neutral query semantics.
- Parameters:
query (SearchQuery | None) – Optional pre-built search query.
where (dict[str, object] | None) – Equality filters.
contains (dict[str, object] | None) – Substring or list-membership filters.
regex (dict[str, str] | None) – Regular-expression filters.
match (dict[str, str] | None) – Glob or substring filters.
exists (Sequence[str] | None) – Fields that must be present.
missing (Sequence[str] | None) – Fields that must be absent.
ignore_case (bool) – Whether string comparisons should be case-insensitive.
as_record_set (bool) – Return a CatalogRecordSet instead of a list.
- Return type:
list[CatalogRecord] | CatalogRecordSet
- Returns:
Matching records, either as a list or record-set view.
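One plausible reading of the filter parameters is sketched below as a predicate over plain dictionaries. It is an illustration under assumptions, not ogcat's implementation: matches_filters is a hypothetical name, only top-level keys are resolved, all filters are combined with AND, and the match (glob) filter is left out.

```python
import re
from typing import Any, Dict, Optional, Sequence


def matches_filters(record: Dict[str, Any], *,
                    where: Optional[Dict[str, Any]] = None,
                    contains: Optional[Dict[str, Any]] = None,
                    regex: Optional[Dict[str, str]] = None,
                    exists: Optional[Sequence[str]] = None,
                    missing: Optional[Sequence[str]] = None,
                    ignore_case: bool = False) -> bool:
    """Hypothetical predicate: True when the record passes every filter."""
    def norm(value: Any) -> Any:
        # Case folding applies to strings only.
        return value.lower() if ignore_case and isinstance(value, str) else value

    for field, expected in (where or {}).items():        # equality
        if field not in record or norm(record[field]) != norm(expected):
            return False
    for field, needle in (contains or {}).items():       # substring or membership
        value = record.get(field)
        if isinstance(value, str):
            if not isinstance(needle, str) or norm(needle) not in norm(value):
                return False
        elif isinstance(value, (list, tuple)):
            if norm(needle) not in [norm(item) for item in value]:
                return False
        else:
            return False
    flags = re.IGNORECASE if ignore_case else 0
    for field, pattern in (regex or {}).items():         # regular expression
        value = record.get(field)
        if not isinstance(value, str) or re.search(pattern, value, flags) is None:
            return False
    if any(field not in record for field in (exists or ())):
        return False
    if any(field in record for field in (missing or ())):
        return False
    return True
```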
- record_set(records)
Wrap records in a sequence-like container.
- Parameters:
records (Sequence[CatalogRecord]) – Records to expose through CatalogRecordSet helpers.
- Return type:
- Returns:
Record set using this catalog’s field resolution order.
- describe()
Return a serialisable summary of catalog configuration and contents.
- Return type:
dict[str, object]
- list_metadata_fields(record_type=None)
Return serialisable metadata field descriptions for a schema.
- Return type:
list[MetadataDict]
- list_record_fields()
Return discoverable field paths present in stored records.
- Return type:
list[str]
- unique_values(field)
Return unique scalar values present for a field across stored records.
- Return type:
list[JsonValue]
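A rough idea of what collecting "unique scalar values" involves is sketched below over plain dictionaries. unique_scalar_values is a hypothetical helper; skipping lists and objects, preserving first-seen order, and treating 1 and True as distinct are all assumptions, since the reference does not specify them.

```python
from typing import Any, Dict, Iterable, List


def unique_scalar_values(records: Iterable[Dict[str, Any]], field: str) -> List[Any]:
    """Hypothetical helper: distinct scalar values of one top-level field."""
    seen: List[Any] = []
    for record in records:
        if field not in record:
            continue
        value = record[field]
        if isinstance(value, (dict, list)):
            continue   # assumption: non-scalar values are skipped
        # Compare type as well as value so that 1 and True stay distinct.
        if not any(value == item and type(value) is type(item) for item in seen):
            seen.append(value)
    return seen
```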
- get_schema(record_type=None)
Return a serialisable schema description.
- Return type:
dict[str, object]
- get(record_id)
Get a record by id.
- Return type:
CatalogRecord | None
- path(record_id)
Return the stored path for a path-backed record, if present.
- Return type:
Path | None
- add_record_schema(name, schema, *, overwrite=False)
Add or replace a record schema in the catalog spec.
- Parameters:
name (str) – Record schema name.
schema (RecordSchema | dict[str, object]) – Schema object or serialised schema dictionary.
overwrite (bool) – Whether an existing schema may be replaced.
- Raises:
ValueError – If the schema already exists and overwrite is false, or if the resulting spec is invalid.
TypeError – If schema is not a valid schema object.
- Return type:
None
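The overwrite guard described for add_record_schema() can be sketched against a plain dictionary of schemas. register_schema is a hypothetical helper; it accepts only dictionaries, whereas the real method also accepts RecordSchema objects and validates the resulting spec.

```python
from typing import Any, Dict


def register_schema(schemas: Dict[str, Dict[str, Any]], name: str,
                    schema: Dict[str, Any], *, overwrite: bool = False) -> None:
    """Hypothetical helper: add or replace a named schema in a mapping."""
    if not isinstance(schema, dict):
        # Mirrors the documented TypeError for invalid schema objects.
        raise TypeError("schema must be a dictionary in this sketch")
    if name in schemas and not overwrite:
        # Mirrors the documented ValueError for an existing schema.
        raise ValueError(f"schema {name!r} already exists")
    schemas[name] = schema
```

Defaulting overwrite to False means replacing a schema that existing records may depend on is always an explicit choice by the caller.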