Skip to content

Architecture

Design notes for Ariadne's harness. The shape below is implemented end to end, from heterogeneous retrieval and analytic rigor through distribution and an adaptive, self-improving harness; the Decision log records each contestable choice, and the Roadmap tracks what is built versus planned.

Building blocks

The orchestration layer is the Claude Agent SDK. Its primitives are the vocabulary every design choice is expressed in:

  • Tools: callable retrieval/processing functions (in-process or via MCP).
  • Skills: packaged multi-step analytic procedures (e.g. entity-workup).
  • Hooks: lifecycle interceptors for provenance, authorization, and audit.
  • Subagents: context-isolated workers for parallel per-source retrieval.
  • MCP: the connector standard surfacing graph / SQL / vector stores as tools.

See the Claude Agent SDK Reference for the full doc-cited mechanics.

Decisions

Significant, contestable choices, which store, which connector, what was deferred and why, live in the Decision log as ADRs: the single place to point when asked "why this instead of that?" Notable examples: the Postgres-over-Redis comparison (ADR-0004) and the dataset-abstraction approach (ADR-0006).

Emerging shape (from the research)

The best-practice research points to an orchestrator-worker design: a lead agent runs the gather → act → verify → repeat loop and dispatches context-isolated subagents (one per source) that retrieve via MCP and hand back only their findings. GraphRAG traverses organizational hierarchies; multimodal evidence is fused agentically by converting imagery/video to structured text before reasoning over it.

Today the loop runs as a single lead agent querying the stores directly. Subagent fan-out is deferred as YAGNI (trigger: store count ≥4 or a measured latency bottleneck; the provenance blocker is dissolved now that the SDK hook fires inside subagents) per ADR-0005 and ADR-0015. The diagram below shows the target shape.

Target entitya person, unit, or org node
Lead agentgather → act → verify → repeat
dispatches parallel, context-isolated subagents
GraphNeo4jrelationships via MCP
RelationalPostgresrecords via MCP
UnstructuredHybrid searchtext + vector via MCP
findings return to the lead
Synthesisprovenance + citation gate
Cited analytic note

Datasets

A developer adds a corpus by writing one adapter that maps raw data to the canonical schema below; a dataset-agnostic indexer fans the records into the stores. The agent, entity-workup skill, connectors, and eval harness never change. Governance (access-control, PII gating) attaches at the canonical layer once, not per dataset.

Attribute
key
value
Document
text
source
Entity
name
type
aliases[]
Relationship
type
source → Entity
target → Entity
1 : N
mentions
source · target

Free-text retrieval is hybrid: Postgres full-text (tsvector GIN, websearch_to_tsquery) fused with pgvector semantic search via Reciprocal Rank Fusion. The agent reaches it through the in-process mcp__ariadne__hybrid_search tool (opt-in --semantic), per ADR-0007.

The same seam takes any source. Four adapters spanning four modalities map into one canonical schema with no change to the agent, connectors, or eval harness (ADR-0006; pipeline design spec).

syntheticgraph
enronemail · text
lahmanrelational
worldspeechaudio
adapterone seam
canonical schema
Neo4j + Postgres

Distribution

Ariadne is consumable as an MCP server (the workup tool runs the full harness and returns a cited analytic note) from any MCP client, with a Claude Code plugin wrapper for one-click install and slash-command UX, per ADR-0009. It is published to PyPI as ariadne-sensemaking (uvx ariadne-sensemaking).

Adaptive & self-improvement

Beyond the built-in datasets, Ariadne adapts to a user's own store and improves from experience (ADR-0020). Every change rides one cycle, turning around a core the loop can never touch:

Proposethe agent drafts a declarative artifact
Ratifya human approves or rejects
Freezeit becomes config the gates check
eval gate · governancethe loop can never edit this

The loop edits only ratified artifacts; the eval harness it is scored against stays off-limits, so an agent can never quietly grade its own work.

  • Adapt (Axis A): introspect a real Postgres, propose a mapping into the canonical schema (deterministic or LLM), ratify it, and the existing indexer / workup / eval run unchanged on the user's data (ariadne map); plus a declarative user ontology and a dynamic MCP surface that activates a ratified store at runtime.
  • Learn (Axis B): distil a high-scoring workup into a reusable analytic skill (ariadne distil), reflect on a low-scoring one and propose grounded, gold-free refinements (ariadne reflect), deepen an existing skill from a new run (distil --into), and measure whether a learned change actually helps before adopting it (ariadne compare: repairs net of regressions on the same eval instance). The eval harness is the external verifiable reward the loop can never edit.

Knowing it works

The brief's central challenge is "how do you know what works?" Ariadne answers it with a tiered eval pyramid: cheap, exhaustive checks on every output at the base; expensive, sampled judgment at the apex.

LLM-as-judge
ICD-203 analytic-standards rubric, on a sample of survivors
slow · costly · sampled
Entailment (NLI)
HHEM-2.1 checks each cited claim against its evidence
fast · local · per claim
Deterministic floor
citation gate + ICD-203 tradecraft lint on every output
microseconds · $0 · every claim

volume falls and cost rises toward the apex

Still open

The pieces not yet built or still settling:

  • Multimodal fusion (Phase 3): image/video/OCR extraction and cross-modal evidence fusion.
  • Subagent fan-out: parallel per-source workers, deferred as YAGNI until store count ≥4 or a measured latency bottleneck (design specified in ADR-0015).
  • Entity resolution across stores: currently a stable shared key per entity; a richer resolver remains an open research question.
  • Open-weight validation: which self-hostable model clears the eval bar through the air-gap seam (open-weight validation).