0025, Applying a ratified mapping: env-discovered dataset, ingest-time, DSN from env
- Status: Accepted (2026-06-07)
- Deciders: Ariadne maintainers
- Refines: ADR-0020 (the "Apply" step of the first Postgres slice)
Context
ADR-0020's first slice is a
propose → ratify → freeze → apply loop over a user's Postgres. Everything up to
freeze shipped: read-only information_schema introspection
(introspect/postgres.py), a SchemaMapper that proposes a draft
(mapping/propose.py), a deterministic structural validator + a mapping.toml
model (mapping/schema.py), a MappingDrivenAdapter that projects a user's rows
onto the canonical schema (mapping/adapter.py), a postgres_row_reader, an
ariadne map command that writes the draft, and a testcontainers test proving the
adapter loads canonical records off live Postgres.
The apply step was the gap: nothing wired a ratified mapping.toml into the
pipeline so that the existing ariadne index loads it into the stores and
ariadne workup / eval resolve it. The map command literally ended with the
dangling instruction "load it with a MappingDrivenAdapter", with no command that
does so. This ADR records how the apply step closes that gap, since the wiring
(where the DSN lives, how a dataset name resolves across fresh processes) is a
contestable design call, not a mechanical detail.
Decision drivers
- Run the existing pipeline unchanged. ADR-0020's stated done-state is "the existing indexer + workup + eval run unchanged on the user's data." Apply must add a source, not a parallel pipeline.
- Governance posture is preserved. Workups query Ariadne's own stores through the read-only MCP surface; they must not open ad-hoc live connections to a user's database mid-workup.
- Secrets stay off the command line. A connection string carries a password; command-line arguments are world-readable via ps and /proc, so a --dsn flag is the wrong home for it (June-2026 secrets-handling practice).
- Match the existing idiom. Datasets self-register into DATASETS by import; ARIADNE_PROFILES already establishes "an env var points to ratified TOML." Apply should reuse these, not invent a new resolution path.
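The secrets driver can be illustrated with a minimal sketch of env-first DSN resolution. The function name resolve_source_dsn, its parameters, and the error wording are hypothetical and are not taken from Ariadne's code; only the ARIADNE_SOURCE_DSN variable name comes from this ADR.

```python
import os
from typing import Mapping, Optional

DEFAULT_DSN_ENV = "ARIADNE_SOURCE_DSN"  # default variable name used in this ADR


def resolve_source_dsn(
    dsn_env: str = DEFAULT_DSN_ENV,
    override: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the source DSN: an explicit override wins, else the env var.

    Hypothetical helper; the name and signature are assumptions.
    """
    env = os.environ if environ is None else environ
    dsn = override or env.get(dsn_env)
    if not dsn:
        # Name the variable in the error, never any part of the secret.
        raise RuntimeError(f"source DSN not set; export {dsn_env}")
    return dsn
```

Passing the environment as a parameter keeps the sketch testable without mutating os.environ; a real resolver could just as well read os.environ directly.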
Considered options
1. Query-time federation: workups query the user's Postgres live, per question. Rejected. Unbounded (every store/dialect becomes a live tool surface), and it breaks the read-only-MCP-over-our-own-stores governance posture. The 2026 pattern keeps sources behind a declarative layer; ingest-then-query is the bounded form.
2. A --mapping flag threaded through every command (index --mapping, workup --mapping, eval --mapping) with argparse choices relaxed. Rejected. More code, and it diverges from the DATASETS registry idiom that keeps the agent/connectors/eval ignorant of which dataset is loaded. workup does not even need the mapping (it queries the already-indexed stores).
3. Ingest-time apply via env-discovered registration. Chosen. A ratified mapping.toml self-registers as a normal dataset; index loads it through the existing indexer; workup/eval resolve the name with zero changes.
Decision
Adopt option 3.
- Ingest-time, not query-time. A ratified mapping is applied at index: the MappingDrivenAdapter projects the user's rows onto canonical records, and the existing load_graph/load_documents write them into Ariadne's own Neo4j + Postgres. workup/eval/governance then run unchanged against those stores.
- Discovery-registration mirroring ARIADNE_PROFILES. Ratified mappings live as *.toml under an opt-in, env-pointed directory ARIADNE_MAPPINGS (unset ⇒ no user datasets, zero import-time surprise). discover_and_register reads the dir and register()s one MappingDrivenAdapter per file, so --dataset <name> resolves identically for index, workup, and eval across fresh processes. The mapping.toml gains an optional [dataset] header (name, dsn_env, schema); the structural [[entities]]/[[relationships]] body is unchanged.
- Source DSN from env, never argv. The [dataset] header names the env var (dsn_env, default ARIADNE_SOURCE_DSN) holding the connection string. The row reader connects lazily, inside load(), opening a short-lived read-only connection per table, so only index ever touches the source database; workup/eval register the adapter but never open it. One _source_dsn resolver is shared with ariadne map, whose --dsn flag becomes an optional override of the same env var (DRY; the env path is documented as primary).
Loop-closure done-state. The permanent gate is a testcontainers integration
test: a frozen mapping.toml → adapter → the real indexer → records queryable in
Neo4j (the foreign key resolves to a MATCH-able typed edge). The end-to-end
grounded note (a live LLM workup) is the existing synthetic e2e once the user's
data is in the stores; it is driven and recorded rather than added as a
cost-incurring, variance-prone assertion, consistent with how live-judge tests are
gated.
Consequences
- ADR-0020's first slice closes: a maintainer points Ariadne at a Postgres nobody hand-wrote an adapter for, ratifies a mapping.toml, and index/workup/eval run on it unchanged.
- No new query-time attack surface or live cross-store federation; the read-only governance spine is untouched. The source DB is read once, at ingest, over short-lived read-only connections, with the credential off the command line.
- A second store dialect (CSV, Neo4j-as-source, …) is a new reader + header, not a new resolution path: discovery, naming, and the indexer are dialect-agnostic.
- The [dataset] header is additive: a header-less mapping.toml still parses for the existing validator/adapter tests; only registration needs the header.
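The "new reader, not a new resolution path" consequence can be pictured with a small sketch. The RowReader protocol, CsvRowReader, and project function below are hypothetical names chosen for illustration; the ADR does not specify the reader interface, so the shape shown here is an assumption.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, Protocol


class RowReader(Protocol):
    """Hypothetical reader contract: yield one dict per source row of a table."""

    def rows(self, table: str) -> Iterator[Dict[str, str]]: ...


class CsvRowReader:
    """Illustrative second dialect: each 'table' is CSV text keyed by name."""

    def __init__(self, tables: Dict[str, str]) -> None:
        self._tables = tables

    def rows(self, table: str) -> Iterator[Dict[str, str]]:
        # Parsing happens lazily, only when the caller iterates the rows.
        yield from csv.DictReader(io.StringIO(self._tables[table]))


def project(reader: RowReader, table: str,
            columns: Dict[str, str]) -> Iterable[Dict[str, str]]:
    """Rename source columns to canonical field names, independent of dialect."""
    for row in reader.rows(table):
        yield {canonical: row[source] for source, canonical in columns.items()}
```

In this sketch the projection step depends only on the rows() contract, so swapping a Postgres reader for a CSV one would leave discovery, naming, and indexing untouched.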
Sources: avoid secrets on the command line (argv visible in process listings) — smallstep, How to Handle Secrets on the Command Line; connection strings out of source/argv, env or a vault — Securing Connection Strings; psycopg / libpq environment-variable connection parameters (PostgreSQL libpq env vars); sources behind a declarative layer queried via tools, not federated live (Truto, Mapping AI Agent Patterns to Integration Platforms 2026).