# Aletheia v2: Citation Verification Agent (Design Spec)
Status: design approved 2026-05-23 (AJ) — implementation plan to follow.
Author: Claude (Opus 4.7) in collaboration with AJ.
Path: docs/architecture/2026-05-23-aletheia-v2-citation-verification-design.md
## 1. Problem
LLMs hallucinate academic citations at empirically observed rates of 14-95% (CiteVerifier benchmark, May 2026). NeurIPS 2025 surfaced ~100 fabricated citations across 53 published papers; ICLR 2026 reported 20% of sampled papers contain at least one AI-generated hallucination. arXiv began a 1-year policy of removing submissions with LLM errors. The "ACM venue rejected 100 camera-ready papers" incident AJ flagged is one observable consequence.
A direct manifestation in this project: today's vFL PR #36 added an attribution of arXiv:2502.03801 to "Yifei Liu et al." The actual first author is Heyi Zhang. Five separate citation errors made it into the PR before a manual audit; four would have been caught mechanically by the design specified here.
The kourai agent ecosystem already has Aletheia (agents/aletheia/agent.py)
in the role of "research validator and citation enforcer". Today she runs
Brave Search for generic claim verification.
This spec specializes her to academic-source verification with a mechanical
artifact-first workflow that closes the categories of hallucination Brave can't
catch.
## 2. Goals + non-goals
Goals:
- Given a research claim, produce a citation backed by a verifiable on-disk artifact whose metadata fields are populated verbatim from official APIs.
- Mechanical (deterministic, code-verifiable) guarantees on the load-bearing parts: identity provenance, triangulation gate, verbatim-excerpt check, artifact-file existence.
- Probabilistic (LLM-judgment) pieces clearly labeled and surfaced for reviewer audit, not hidden.
- Default-deny: every error path returns `(None, ...)` rather than emitting an unverified citation.
- Per-project artifacts committed alongside code, reviewable in PR diffs.
- A scheduled audit pass to detect drift (retractions, metadata corrections).
Non-goals (out of scope):
- IEEE Xplore API integration (requires institutional API key application; multi-week calendar time). Defer until first concrete need.
- Google Scholar integration via paid SERP API ($50-300/mo recurring). The free S2 + arXiv + OpenAlex stack covers the same papers with ToS-compliant access; add Scholar as opt-in only if a developer specifically needs it.
- Replacing Aletheia's existing `validate_research()` / `find_unsupported_claims()` / `verify_claim_with_search()`; those stay for non-academic claim validation (industry standards, RFCs, generic web claims).
- Pre-commit hook with full LLM call. The May 2026 "10-second pre-commit rule" (devs bypass hooks >5s) rules this out; pre-commit gets only the mechanical file-existence check.
- A novel citation-graph or paper-recommendation system. Aletheia verifies citations the developer or another agent is about to make; she does not proactively suggest papers.
- "Find every citation in a paper and verify it" batch tool. The artifact audit pass covers re-verification of existing artifacts, which is the closest equivalent and what this design ships.
## 3. Research foundation (May 2026 state of the art)
| Topic | Source | Verified |
|---|---|---|
| Cite-hallucination rates 14-95% | CiteVerifier benchmark, CiteAudit paper | yes |
| NeurIPS 2025 ~100 fabricated cites | Fortune, Jan 2026, TechCrunch | yes |
| ICLR 2026 20% hallucination sample | same | yes |
| Multi-component verification pipeline | CiteAudit (arXiv:2602.23452) — Claim Extractor → Retriever → Evidence Matcher → Reasoner → Judge | yes |
| ReAct as native tool use | Agentic AI Design Patterns 2026 — "modern LLMs handle reasoning-action loop natively through built-in tool calling" | yes |
| Semantic Scholar API: 200M+ papers, no auth | api docs | yes |
| OpenAlex API: 250M+ works, key required from 2026-02-13 | docs — 30-second signup | yes |
| arXiv API: free, no auth, Atom XML | user manual; HTML5 versions at arxiv.org/html/{id} since late 2023 | yes |
| MCP servers reliability: 71% median pass rate | 100-MCP stress test, Feb-Apr 2026 | yes; supports direct-API over MCP-based approach |
| PDF→Markdown: Docling 0.877 on opendataloader-bench, Apache 2.0 | Granite-Docling-258M, Jan 2026 | yes |
| PyMuPDF4LLM is AGPL-3.0 (transitive-dep risk) | PyPI | yes — rejected for that reason |
| Title fuzzy threshold 0.92 = high-precision; 0.85-0.90 = "everything-is-duplicate" | pdfmux benchmark, Vikranth3140/Citation-Hallucination-Detection (0.92 exact, 0.70 fuzzy hybrid) | yes |
| vcrpy + httpx async works | vcrpy 8.0 changelog — "BREAKING fix for httpx support" landed | yes |
| Hard gate > soft gate for high-stakes | vLLM HaluGate, Tool Receipts arXiv:2603.10060 | yes — triangulation is a hard gate |
| Pre-commit "10-second rule" | DeployHQ AI hooks guide | yes — LLM calls belong in CI |
| Anthropic deterministic-grader guidance | Demystifying Evals for AI Agents | yes |
| CFF (Citation File Format) YAML schema | citation-file-format.github.io | yes — field-name source for artifact frontmatter |
## 4. Architecture
### 4.1 Component decomposition (CiteAudit-aligned, five tools)
Aletheia exposes `verify_and_cite(claim: str, hint: str | None = None) -> tuple[Citation | None, ArtifactPath | ConflictReport]`,
implemented as five composable tools that Claude's native tool-use loop calls
in sequence. There is no explicit ReAct prompting: Claude's built-in tool
calling is the loop.
| Tool | Role | Backed by |
|---|---|---|
| `extract_claim(text)` | Claim Extractor: distills the assertion that needs grounding | LLM call (Aletheia, low temp) |
| `search_papers(query, year_hint)` | Retriever: top-K candidates with metadata + abstract | Semantic Scholar `/graph/v1/paper/search` |
| `fetch_paper_text(paper_id)` | Retriever, deeper: full paper text for excerpt extraction | arXiv HTML5 (preferred) → Docling for PDF |
| `match_evidence(claim, paper_text)` | Evidence Matcher: verbatim excerpts supporting claim | LLM call returning quotes; substring-verified mechanically |
| `triangulate(paper_meta)` | Judge: cross-source metadata agreement check | OpenAlex DOI lookup (or arXiv if no DOI) |
### 4.2 Data flow
```
verify_and_cite(claim, hint=None)
│
├─ extract_claim → search_query                        (LLM)
│
├─ search_papers(query) → [candidates]                 (S2 API)
│
├─ rank candidates by abstract-match                   (LLM, max 3 ReAct iterations)
│    │
│    └─ if no good match: return (None, NoCandidates)
│
├─ fetch_paper_text(best_candidate)                    (arXiv HTML5 → Docling)
│    │
│    └─ if all fetches fail: abstract-only fallback (artifact written
│       with text_excerpts_unavailable=true, flagged) OR return (None, ...)
│
├─ match_evidence(claim, paper_text) → [verbatim_excerpts]   (LLM + substring-check)
│    │
│    └─ if zero excerpts pass: return (None, NoSupportingExcerpts)
│
├─ triangulate(paper_meta)                             (OpenAlex)
│    │
│    ├─ if no DOI + no arxiv_id: skip, mark single_source_verified=true
│    ├─ if decisive_field mismatch: return (None, TriangulationConflict)
│    └─ if agree: continue
│
├─ write_citation_artifact(meta, claim, excerpts, project_root)
│    │
│    └─ writes docs/citations/{slug}.md with frontmatter populated
│       verbatim from API responses
│
└─ return (citation_string, artifact_path)
```
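
The default-deny shape of this flow can be sketched in a few lines. This is a minimal illustration and not Aletheia's implementation: the five stages are passed in as plain callables, the ranking step is elided, the failure types carry only the fields named in this spec, and the citation string format is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple, Union


@dataclass(frozen=True)
class NoCandidates:
    query: str


@dataclass(frozen=True)
class NoSupportingExcerpts:
    candidate: str


@dataclass(frozen=True)
class TriangulationConflict:
    fields: tuple


Failure = Union[NoCandidates, NoSupportingExcerpts, TriangulationConflict]


def verify_and_cite_sketch(
    claim: str,
    search: Callable[[str], list],                # query -> candidate metadata dicts
    fetch_text: Callable[[dict], str],            # candidate -> full paper text
    match: Callable[[str, str], list],            # (claim, text) -> verified excerpts
    triangulate: Callable[[dict], list],          # candidate -> mismatched field names
    write_artifact: Callable[[dict, list], str],  # (candidate, excerpts) -> artifact path
) -> Tuple[Optional[str], Union[str, Failure]]:
    candidates = search(claim)
    if not candidates:
        return None, NoCandidates(query=claim)
    best = candidates[0]  # LLM ranking elided in this sketch
    excerpts = match(claim, fetch_text(best))
    if not excerpts:
        return None, NoSupportingExcerpts(candidate=best["title"])
    conflicts = triangulate(best)
    if conflicts:
        return None, TriangulationConflict(fields=tuple(conflicts))
    # Only reached when every gate passed: the artifact is written first,
    # then the citation string is built from the same API metadata.
    path = write_artifact(best, excerpts)
    citation = f'{best["first_author"]} et al. ({best["year"]}). {best["title"]}.'
    return citation, path
```

Every early return has `None` in the first slot, so a caller cannot obtain a citation string from a path where a gate failed.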
### 4.3 Module layout
| Module | Status | Purpose |
|---|---|---|
| `agents/aletheia/agent.py` | extend existing | Add `verify_and_cite()` + the five tool implementations. Keep existing `validate_research`, `find_unsupported_claims`, `verify_claim_with_search` for non-academic claim verification. |
| `shared/src/kourai_common/academic_search.py` | new | `search_semantic_scholar()`, `fetch_arxiv_html()`, `fetch_paper_pdf()`, `lookup_openalex_by_doi()`, `lookup_arxiv_metadata()`; all use `httpx.AsyncClient` + `tenacity` retries + polite User-Agent header with kourai's email. |
| `shared/src/kourai_common/citation_artifacts.py` | new | `write_citation_artifact()`, `read_citation_artifact()`, `slug_for_paper()`, `normalize_doi()`, `normalize_title()`, `normalize_surname()`. |
| `shared/src/kourai_common/triangulate.py` | new | `triangulate()` + helpers for field-level comparison. |
| `scripts/check_citations.py` | new | Mechanical pre-commit + CI check: every `# research:` / `[^cite]` resolves to an existing artifact. No LLM, <1s runtime. |
| `tests/unit/test_*.py` | new (five files) | Unit tests for normalizers, artifact round-trip, triangulation, slug generation. Hypothesis property tests. |
| `tests/integration/test_academic_search.py` | new | vcrpy-cassette tests for the three APIs. |
| `tests/integration/test_aletheia_verify_cite.py` | new | End-to-end agent tests with FakeLLM + replayed cassettes. |
| `tests/nightly/test_api_contracts.py` | new | One real query per source; detects upstream schema drift. |
### 4.4 New runtime dependencies
| Package | License | Purpose | Wheel size |
|---|---|---|---|
| `httpx` | BSD-3 | Async HTTP client | small |
| `tenacity` | Apache 2.0 | Retry decorator | small |
| `rapidfuzz` | MIT | Title fuzzy match | ~2 MB |
| `docling` | Apache 2.0 | PDF → Markdown | ~50 MB |
| `pyyaml` | MIT | YAML frontmatter | small |
| `pydantic` | MIT | Frontmatter schema validation (probably already present) | already in kourai |
Excluded (with rationale):
- pymupdf4llm — AGPL-3.0 transitive dep risk for a project that might want
permissive licensing later. Docling is a clean Apache 2.0 swap with
comparable academic-PDF quality and CPU-friendly performance.
- scholarly — ToS-grey scraping of Google Scholar, CAPTCHA-prone, not
recommended for production.
- Any paid SERP API (SerpAPI / ScraperAPI) — $50-300/mo recurring; free
alternatives cover the same papers.
- MCP servers (paper-search-mcp, scholar-mcp, arxiv-mcp-server) — May 2026
100-MCP study shows median 71% pass rate; depending on three external
MCPs has compound failure rate. Direct HTTP is more reliable.
## 5. The triangulation gate (load-bearing)
### 5.1 Cross-check source priority
- OpenAlex via DOI lookup — when S2 returns a DOI (~95% of published papers)
- arXiv API via arxiv_id — when no DOI but paper is on arXiv
- Skip with note: preprint with neither DOI nor arxiv_id. In this case `triangulate()` returns `verified=true` with `notes=["no_secondary_source_available"]`, the artifact still gets written, and `single_source_verified: true` is recorded in the frontmatter so reviewers know to be more skeptical of this artifact.
### 5.2 Decisive fields (mismatch → REJECT)
| Field | Normalization | Match rule |
|---|---|---|
| DOI | lowercase, strip `https://doi.org/` prefix | exact equality (DOI spec is case-insensitive) |
| arXiv ID | strip version suffix (`v1`, `v2`, ...) | exact equality |
| Title | Unicode NFKD, lowercase, strip punctuation, collapse whitespace | RapidFuzz `fuzz.ratio` ≥ 92 (RapidFuzz scores run 0-100; this is the 0.92 threshold) |
| First-author surname | NFKD, lowercase, strip accents/hyphens, drop suffixes ("Jr.") | exact equality |
| Year | none | exact equality |
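
The normalizers in this table can be sketched with the standard library alone. This is an illustration under stated assumptions, not the shipped `citation_artifacts.py`: `difflib.SequenceMatcher` stands in for RapidFuzz (its scores are on a 0-1 scale and will differ somewhat from `fuzz.ratio`), the suffix list is hypothetical, combining accents are stripped from titles as well as surnames, and `normalize_arxiv_id` and `titles_match` are names introduced here.

```python
import re
import unicodedata
from difflib import SequenceMatcher

_DOI_PREFIX = re.compile(r"^https?://(dx\.)?doi\.org/")
_SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}  # hypothetical list


def normalize_doi(doi: str) -> str:
    s = doi.strip().lower()
    # Strip the resolver prefix until nothing changes, so that
    # normalize_doi(normalize_doi(x)) == normalize_doi(x) always holds.
    while True:
        stripped = _DOI_PREFIX.sub("", s).strip()
        if stripped == s:
            return s
        s = stripped


def normalize_arxiv_id(arxiv_id: str) -> str:
    s = re.sub(r"^arxiv:", "", arxiv_id.strip(), flags=re.IGNORECASE)
    return re.sub(r"v\d+$", "", s)  # drop a trailing version suffix


def _strip_accents(s: str) -> str:
    decomposed = unicodedata.normalize("NFKD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_title(title: str) -> str:
    s = _strip_accents(title).lower()
    s = re.sub(r"[^\w\s]", " ", s)          # punctuation becomes a space
    return re.sub(r"\s+", " ", s).strip()   # collapse whitespace


def normalize_surname(surname: str) -> str:
    s = _strip_accents(surname).lower()
    tokens = [t for t in re.split(r"[\s,]+", s) if t and t.strip(".") not in _SUFFIXES]
    return re.sub(r"[\W_]+", "", "".join(tokens))  # drops hyphens and periods


def titles_match(a: str, b: str, threshold: float = 0.92) -> bool:
    ratio = SequenceMatcher(None, normalize_title(a), normalize_title(b)).ratio()
    return ratio >= threshold
```

Usage: `normalize_doi("https://doi.org/10.1109/TSP.2022.3153135")` and `normalize_doi("10.1109/tsp.2022.3153135")` produce the same string, which is what lets the DOI row use exact equality.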
### 5.3 Non-decisive fields (note in artifact, do not reject)
- Venue name (NeurIPS / NIPS / "Advances in Neural Information Processing Systems" are aliases; a small hand-maintained `VENUE_ALIASES` map covers ~15 common cases). Everything else gets recorded as a note: `venue_aliases: "S2 says X" vs "OpenAlex says Y"`.
- Author count beyond first (S2 sometimes truncates lists at 5-10).
- Author order beyond first author.
### 5.4 Rejection behavior
On any decisive-field mismatch, `verify_and_cite()` returns
`(None, TriangulationConflict)`. The `TriangulationConflict` lists
field-level disagreements with both sources' values. The caller can:
- Pick a different candidate from the retrieval (Aletheia surfaces the top-K, not just top-1).
- Escalate to manual review — print the conflict, ask AJ.
- Override: invoke with `override=True`. This records `human_overridden: true` plus a required `override_reason: "..."` in the artifact so the audit trail shows the decision.
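
The gate and its override path might look roughly like the following. This is a sketch under assumptions: both inputs are already-normalized metadata dicts, a field missing from one source is skipped and not treated as a conflict, `difflib` again stands in for RapidFuzz on the title, and `GateResult`, `FieldDisagreement` and `triangulation_gate` are illustrative names that do not come from this spec.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional

DECISIVE_FIELDS = ("doi", "arxiv_id", "title", "first_author_surname", "year")


@dataclass(frozen=True)
class FieldDisagreement:
    name: str
    primary: object
    secondary: object


@dataclass(frozen=True)
class GateResult:
    verified: bool
    disagreements: tuple = ()
    human_overridden: bool = False
    override_reason: Optional[str] = None


def decisive_disagreements(primary: dict, secondary: dict) -> tuple:
    found = []
    for name in DECISIVE_FIELDS:
        a, b = primary.get(name), secondary.get(name)
        if a is None or b is None:
            continue  # assumption: absent in one source means "not checked"
        if name == "title":
            agree = SequenceMatcher(None, a, b).ratio() >= 0.92
        else:
            agree = a == b
        if not agree:
            found.append(FieldDisagreement(name, a, b))
    return tuple(found)


def triangulation_gate(primary, secondary, override=False, override_reason=None):
    disagreements = decisive_disagreements(primary, secondary)
    if not disagreements:
        return GateResult(verified=True)
    if override:
        if not override_reason:
            raise ValueError("override=True requires an override_reason")
        # The conflict is kept in the result so the artifact can record it.
        return GateResult(True, disagreements, True, override_reason)
    return GateResult(verified=False, disagreements=disagreements)
```

A rejected result carries both sources' values per field, which is the information the caller needs to pick another candidate or escalate.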
### 5.5 Why this catches today's specific bug
The "Yifei Liu" misattribution would have been caught at the first-author surname check. S2 returns `first_author="Heyi Zhang"`; OpenAlex returns `first_author="Heyi Zhang"`; arXiv returns `first_author="Heyi Zhang"`. An LLM-generated "Liu" attribution does not match any of these, so the citation is rejected before emission.
## 6. Artifact format
### 6.1 Path + slug
Per-project: `docs/citations/{slug}.md` in the consuming repo (vFL, phalanx-fl,
kourai-khryseai itself). Per-project rather than a central kourai cache: artifacts
ship with the code, surface in PR diffs, and get reviewed alongside the code that
cites them.

Slug pattern: `{paper_id}-{first_author_surname}-{title_keyword}.md`
| Example | Identifier | Slug |
|---|---|---|
| arXiv preprint | arXiv:2502.03801 | 2502.03801-zhang-flpoison.md |
| Published DOI | 10.1109/TSP.2022.3153135 | 10.1109_tsp.2022.3153135-pillutla-rfa.md |
| Neither | first-author + title hash | noid-blanchard-krum-c3d8.md |
DOI slashes → underscores (filesystem safety). Title keyword: first significant word from title (skip articles, prepositions, etc.), lowercase, alphanumeric only.
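
A slug builder that follows the pattern above could be sketched like this. It is an assumption-laden illustration: the title keyword is taken as an argument, because the example keywords in the table (for instance `flpoison`, which does not appear in the title shown in 6.2) look chosen by hand, and the "neither identifier" case with a title hash is not covered.

```python
import re


def _alnum_lower(s: str) -> str:
    """Lowercase and keep ASCII letters and digits only."""
    return re.sub(r"[^a-z0-9]", "", s.lower())


def slug_for_paper(paper_id: str, first_author_surname: str, title_keyword: str) -> str:
    # DOI slashes become underscores; anything else unsafe in a filename is dropped.
    safe_id = paper_id.lower().replace("/", "_")
    safe_id = re.sub(r"[^a-z0-9._-]", "", safe_id)
    return f"{safe_id}-{_alnum_lower(first_author_surname)}-{_alnum_lower(title_keyword)}.md"
```

Usage: `slug_for_paper("10.1109/TSP.2022.3153135", "Pillutla", "RFA")` yields `10.1109_tsp.2022.3153135-pillutla-rfa.md`, matching the table row. The function is pure, so the same inputs always give the same slug.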
### 6.2 Frontmatter schema (YAML)
```yaml
---
# === Identity (verbatim from API; never LLM-generated) ===
title: "SoK: Benchmarking Poisoning Attacks and Defenses in Federated Learning"
authors:
  - Heyi Zhang
  - Yule Liu
  - Xinlei He
  - Jun Wu
  - Tianshuo Cong
  - Xinyi Huang
year: 2025
venue: "arXiv preprint"
venue_full: null  # optional alternative form

# === Identifiers (at least one of arxiv_id, doi required) ===
arxiv_id: "2502.03801"
doi: null

# === Resolvable URLs ===
urls:
  abs: https://arxiv.org/abs/2502.03801
  pdf: https://arxiv.org/pdf/2502.03801
  html: https://arxiv.org/html/2502.03801  # optional; preferred for parsing

# === Verification trail (Aletheia-populated) ===
sources_consulted: [semantic_scholar, openalex]
triangulation:
  primary_source: semantic_scholar
  secondary_source: openalex
  decisive_fields_agreed: true
  decisive_fields_checked: [title, first_author_surname, year, arxiv_id]
  notes: []
single_source_verified: false  # true iff triangulation was skipped
verified_by: aletheia
verified_at: "2026-05-23T17:42:00Z"
verification_version: "1.0"

# === Human override (optional) ===
human_overridden: false
override_reason: null

# === Claim linkage (LLM-generated; supported by verbatim excerpts below) ===
claim_supported: "FLPoison canonical headliner set includes ALIE, Fang, sign-flip"
---
```
Field names are borrowed from CFF where applicable (`authors`, `title`, `year`,
`doi`). Aletheia-specific fields are namespaced under their own keys so future CFF
tooling won't conflict.
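
The cross-field rules stated in the schema comments can be checked mechanically once the frontmatter is parsed into a dict. The sketch below uses only the standard library instead of pydantic; the set of required fields is an assumption (the spec does not enumerate one), and `frontmatter_problems` is an illustrative name.

```python
REQUIRED_FIELDS = ("title", "authors", "year", "verified_at")  # assumed minimum


def frontmatter_problems(fm: dict) -> list:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = [f"missing required field: {name}" for name in REQUIRED_FIELDS if not fm.get(name)]
    # Rule from the schema comment: at least one identifier must be present.
    if not (fm.get("arxiv_id") or fm.get("doi")):
        problems.append("at least one of arxiv_id, doi is required")
    # Rule from 5.4: an override must carry its reason.
    if fm.get("human_overridden") and not fm.get("override_reason"):
        problems.append("human_overridden requires override_reason")
    return problems
```

Returning a list of problems and not raising on the first one lets a CI job print everything wrong with an artifact in a single run.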
### 6.3 Body sections (markdown)
````markdown
## Abstract

{verbatim from API; no LLM rewrite}

## Excerpts supporting the claim

> "We evaluate 15 representative poisoning attacks and 17 defense
> strategies in a unified benchmark..."
> (Abstract, paragraph 2)

> "Our headliner attack set includes ALIE [Baruch et al. 2019],
> Fang [Fang et al. 2020], and sign-flipping..."
> (Section 3.1, p. 4)

## Citation snippet

Heyi Zhang, Yule Liu, Xinlei He, Jun Wu, Tianshuo Cong, Xinyi Huang.
*SoK: Benchmarking Poisoning Attacks and Defenses in Federated Learning*.
arXiv:2502.03801 (2025).

## BibTeX

```bibtex
@article{zhang2025sok,
  title={SoK: Benchmarking Poisoning Attacks and Defenses in Federated Learning},
  author={Zhang, Heyi and Liu, Yule and He, Xinlei and Wu, Jun and Cong, Tianshuo and Huang, Xinyi},
  journal={arXiv preprint arXiv:2502.03801},
  year={2025}
}
```
````
### 6.4 Verbatim-excerpt rule (with whitespace tolerance)
Every excerpt in the "Excerpts supporting the claim" section must be a
verbatim substring of the paper's text (the output of `fetch_paper_text()`,
e.g. Docling-extracted Markdown). The substring check applies *after*
whitespace normalization on both sides (every run of whitespace, line breaks
included, collapses to a single space), so PDF column breaks do not make a
real quote fail the check. Words hyphenated across a line break are not
rejoined by this normalization and would need an additional step.
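```python
import re


def _normalize_whitespace(s: str) -> str:
    # Collapse every run of whitespace (spaces, tabs, line breaks) to one space.
    return re.sub(r"\s+", " ", s).strip()


def excerpt_verifies(claim_quote: str, paper_text: str) -> bool:
    quote = _normalize_whitespace(claim_quote)
    # An empty quote is a substring of everything, so reject it explicitly.
    return bool(quote) and quote in _normalize_whitespace(paper_text)
```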
If zero excerpts pass the check → triangulation soft-fails →
`(None, NoSupportingExcerpts)` is returned.
### 6.5 Code-side citation linking
```python
# research(2026-05): Baruch et al. ALIE perturbation
# see docs/citations/1902.06156-baruch-alie.md
def alie_attack(*args, **kwargs):
    ...
```

For Markdown:

```markdown
ALIE [Baruch et al. 2019][^alie] perturbs honestly-trained updates by...

[^alie]: See [docs/citations/1902.06156-baruch-alie.md] for the verified source.
```

The mechanical pre-commit + CI check (`scripts/check_citations.py`) verifies
each `# research:` or `[^cite]` link resolves to an existing file in
`docs/citations/`.
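
A minimal version of that existence check might look like the sketch below. It assumes every cite link contains the literal `docs/citations/<slug>.md` path, as in the two examples above, and that only `.py` and `.md` files need scanning; the function name and return shape are illustrative and are not the planned script.

```python
import re
from pathlib import Path

CITE_PATH = re.compile(r"docs/citations/[A-Za-z0-9._-]+\.md")


def missing_artifacts(project_root: Path, suffixes=(".py", ".md")) -> list:
    """Return (file, line number, reference) for each cite link with no artifact."""
    citations_dir = project_root / "docs" / "citations"
    missing = []
    for path in sorted(project_root.rglob("*")):
        rel = path.relative_to(project_root)
        if any(part.startswith(".") for part in rel.parts):
            continue  # skip .git and other hidden directories
        if not path.is_file() or path.suffix not in suffixes:
            continue
        if citations_dir in path.parents:
            continue  # the artifacts themselves are not scanned
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for ref in CITE_PATH.findall(line):
                if not (project_root / ref).is_file():
                    missing.append((rel, lineno, ref))
    return missing
```

A wrapper script would exit non-zero when the list is non-empty. No network or LLM call is involved, which is what keeps the check inside the pre-commit latency budget.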
## 7. Anti-hallucination guarantees
### 7.1 Mechanical (provable from code structure)
- Identity provenance: `title`, `authors`, `year`, `doi`, `arxiv_id`, `urls` are written by `write_citation_artifact()` directly from parsed API JSON. The LLM is not in the call path for these fields. A code diff is the proof.
- Triangulation gate as hard reject: `verify_and_cite()` returns `(None, ConflictReport)` if `triangulate()` returns `verified=False`. The function signature enforces this: caller cannot get a citation string without a successful triangulation. (JPMorgan-style hard gate.)
- Verbatim-excerpt check: every excerpt is whitespace-normalized substring-matched against the parsed paper text. If LLM proposes a quote that isn't in the paper → excerpt rejected. Zero passing excerpts → no artifact.
- Artifact-file existence as code-side gate: `scripts/check_citations.py` (no LLM, <1s) walks the codebase and confirms every cite link resolves to a file in `docs/citations/`. Runs in pre-commit + CI.
- Re-verifiability via audit pass: `audit_existing_citations()` re-runs triangulation on every artifact's identifiers against current API data. Reports drift: DOI now resolves to a different paper, paper retracted, first-author surname corrected upstream.
### 7.2 Probabilistic (LLM judgment, audited by the artifact)
- Candidate selection: LLM ranks retrieval results. Wrong-but-plausible pick caught by triangulation (e.g., wrong-year of same paper → year mismatch).
- Excerpt selection: LLM picks which quotes to extract; substring check verifies they're real.
- `claim_supported` field: LLM's stated link between paper and claim; reviewer-auditable in artifact body.
### 7.3 Coverage of today's failure modes
| Failure (from vFL PR #36 audit) | Caught by |
|---|---|
| Wrong first author ("Yifei Liu" → "Heyi Zhang") | Triangulation gate (mechanical) |
| Wrong attribution of std=100 to FLPoison reference | Verbatim-excerpt check (mechanical; no excerpt matches) |
| Overstated "uses same formula" claim on IPM | Verbatim-excerpt check + reviewer audit of claim_supported |
| Unverifiable "eq. 3" precision pointer | Verbatim-excerpt check (no excerpt with that section ref) |
| Paper-of-origin overclaim (sign-flip predates Damaskinos) | LLM judgment piece — surfaced in claim_supported body for reviewer to audit |
Four of five caught mechanically. The fifth (semantic judgment) is surfaced explicitly in the artifact rather than hidden.
## 8. Trigger model
Four tiers, aligned with the May 2026 "LLM calls belong in CI, not pre-commit" consensus:
| Tier | When | What runs | Latency | LLM? |
|---|---|---|---|---|
| 1. On-demand (primary) | Developer or agent calls `aletheia.verify_and_cite(claim)` | Full agentic loop | 5-15s | yes |
| 2. Pre-commit mechanical | `git commit` | `scripts/check_citations.py`: every cite link resolves to a file | <1s | no |
| 3. CI mechanical | Every PR | Tier 2 + frontmatter schema validation (well-formed YAML, required fields, `verified_at` not stale beyond 365 days; academic metadata is reasonably stable but does drift on retractions/corrections) | <5s | no |
| 4. Scheduled audit | Weekly cron / pre-publication | `audit_existing_citations(project_root)`: re-runs triangulation on every artifact | minutes | yes |
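
The Tier 3 staleness rule on `verified_at` is simple enough to sketch with the standard library. The function name is illustrative, and treating a timestamp without a timezone as UTC is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_AGE = timedelta(days=365)


def is_stale(verified_at: str, now: Optional[datetime] = None) -> bool:
    """True when the ISO-8601 timestamp is more than 365 days older than `now`."""
    # Older Python versions do not parse a trailing "Z", so map it to an offset.
    stamp = datetime.fromisoformat(verified_at.replace("Z", "+00:00"))
    if stamp.tzinfo is None:
        stamp = stamp.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    current = now if now is not None else datetime.now(timezone.utc)
    return current - stamp > MAX_AGE
```

Passing `now` explicitly keeps the check deterministic in tests; CI would call it with the default.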
Phase 2 (deferred): proactive inline guard — when Techne / Kallos is generating output containing a research claim, they call Aletheia inline before emitting. Uses Anthropic's managed-agents pattern. Wire after Tier 1-3 prove stable; only if there's a real signal that proactive catching is needed.
## 9. Error handling (default-deny)
Five failure modes with recovery. Every error path returns (None, ...)
rather than emitting an unverified citation.
| Failure | Recovery | Returns |
|---|---|---|
| API timeout / 5xx | `tenacity` retry: 3 attempts, exponential backoff with jitter | After retries: `(None, RetriesExhausted)` |
| No candidates from S2 | Surface query terms to caller | `(None, NoCandidates(query=...))` |
| Triangulation reject | Surface field-level disagreement list | `(None, TriangulationConflict(fields=[...]))` |
| PDF parse failure | arXiv HTML5 → Docling → abstract-only fallback (artifact with `text_excerpts_unavailable: true`) | `(citation, artifact)` with flag, or `(None, ...)` if even abstract unreadable |
| Zero verbatim excerpts support claim | Soft triangulation reject; surface candidate for manual decision | `(None, NoSupportingExcerpts(candidate=...))` |
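
The first row's retry-then-deny behavior can be illustrated without `tenacity`. The sketch below is a standard-library stand-in and does not use the tenacity API: which exceptions count as retryable, the base delay, and the `call_with_retries` name are assumptions, and `sleep` is injectable so tests do not wait.

```python
import random
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class RetriesExhausted:
    attempts: int
    last_error: str


def call_with_retries(fn, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(); return (value, None) on success or (None, RetriesExhausted)."""
    last = None
    for attempt in range(attempts):
        try:
            return fn(), None
        except (TimeoutError, ConnectionError) as exc:  # assumed retryable set
            last = exc
            if attempt < attempts - 1:
                delay = base_delay * (2 ** attempt)         # exponential backoff
                sleep(delay + random.uniform(0, delay))     # plus jitter
    return None, RetriesExhausted(attempts=attempts, last_error=repr(last))
```

The failure comes back as a value in the second slot and not as an exception, which keeps it in the same `(None, ...)` shape as the other rows.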
## 10. Testing strategy
Three test tiers, mirroring kourai's existing layout (tests/unit/,
tests/integration/, tests/nightly/):
### 10.1 Tier 1: Unit tests (no network, no LLM)
- `test_doi_normalization.py`: idempotency, case-insensitivity, URL-prefix
- `test_title_match.py`: RapidFuzz pair fixtures (equivalent + different)
- `test_author_normalization.py`: Unicode NFKD, accents, hyphens, "Jr."
- `test_slug_generation.py`: determinism + filesystem safety
- `test_citation_artifacts.py`: write/read round-trip, schema validation
- `test_triangulate.py`: decisive vs non-decisive field handling
Hypothesis property tests:
- normalize_doi(normalize_doi(x)) == normalize_doi(x) (idempotency)
- For valid PaperMetadata m: read_artifact(write_artifact(m)) == m (round-trip)
- Same paper → same slug (determinism)
- Triangulation symmetric: swap primary↔secondary → same verdict
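
The idempotency property is less trivial than it looks: a normalizer that strips the `https://doi.org/` prefix only once fails on inputs that contain it twice. The sketch below checks the property with a seeded random generator from the standard library as a stand-in for Hypothesis; the `normalize_doi` body and the input generator are illustrative assumptions.

```python
import random
import re

_DOI_PREFIX = re.compile(r"^https?://(dx\.)?doi\.org/")


def normalize_doi(doi: str) -> str:
    s = doi.strip().lower()
    while True:  # strip prefixes until a fixed point is reached
        stripped = _DOI_PREFIX.sub("", s).strip()
        if stripped == s:
            return s
        s = stripped


def random_doi_like(rng: random.Random) -> str:
    # Hypothetical generator: adversarial mixes of prefixes, spaces and DOI parts.
    pieces = ["https://doi.org/", "HTTP://DX.DOI.ORG/", " ", "10.1109/", "TSP.2022", "abc", "/"]
    return "".join(rng.choice(pieces) for _ in range(rng.randint(0, 6)))


def check_properties(trials: int = 2000, seed: int = 0) -> int:
    rng = random.Random(seed)
    for _ in range(trials):
        x = random_doi_like(rng)
        once = normalize_doi(x)
        assert normalize_doi(once) == once, f"not idempotent for {x!r}"
        assert normalize_doi(x.upper()) == once, f"case-sensitive for {x!r}"
    return trials
```

With Hypothesis the same two assertions would sit inside a `@given(...)` test, and a failing input would be shrunk automatically.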
### 10.2 Tier 2: API cassette tests (recorded HTTP, no LLM)
tests/integration/test_academic_search.py uses
pytest-recording / vcrpy.
First run hits real S2/OpenAlex/arXiv; subsequent runs replay from
tests/cassettes/*.yaml. Documented 14s → 1.4s
speedup.
Cassettes are YAML — grep-able, diff-able, AJ-auditable.
Fixture papers:
- arxiv:1902.06156 (Baruch ALIE)
- arxiv:2502.03801 (Zhang FLPoison)
- doi:10.1109/TSP.2022.3153135 (Pillutla RFA)
- one preprint-with-no-DOI case
- one S2-finds-but-OpenAlex-doesn't case
Refresh: uv run pytest --record-mode=rewrite tests/integration/test_academic_search.py.
### 10.3 Tier 3: Agent end-to-end (mocked LLM + replayed HTTP)
`tests/integration/test_aletheia_verify_cite.py` mocks Aletheia's LLM via a
`FakeLLM` fixture (per Anthropic guidance) and replays Tier 2 cassettes.
Coverage:
- Happy path: ALIE claim → Baruch citation + artifact
- Triangulation reject: hand-crafted wrong-first-author S2 response → (None, TriangulationConflict)
- No candidates: vague claim → (None, NoCandidates)
- API timeout: cassette 504 → retries → (None, RetriesExhausted)
- Abstract-only fallback: PDF parse fails → artifact with text_excerpts_unavailable: true
- Human override: triangulation conflict + override=True → artifact with human_overridden: true
### 10.4 Nightly: live-API contract check
tests/nightly/test_api_contracts.py runs one real query per source. Detects
upstream schema drift before it breaks cassettes. ~3 queries, no LLM, fast.
Failure means an API has changed shape and our cassettes need rebuild.
### 10.5 Shared-state hygiene
Per Anthropic eval guidance,
"unnecessary shared state between runs can cause correlated failures."
Aletheia tests use per-test tmp_path for artifact writes; no
docs/citations/ pollution. pytest-mock reset at function scope.
## 11. Open questions / Phase 2
- IEEE Xplore integration — deferred until first concrete need; institutional API key application is multi-week calendar time.
- Google Scholar via paid SERP API: opt-in, behind a `SCHOLAR_SERPAPI_KEY` env var. Defer until a developer specifically needs it.
- Proactive inline guard (Tier 5): Techne / Kallos call Aletheia inline when generating output with research claims. Cost: 5-15s per generation that contains a citation. Defer; ship Tier 1-3 first and measure how often citations actually need verification.
- CSL-JSON export from artifact — convert frontmatter to Citation Style Language JSON so Zotero / Pandoc can consume artifacts directly. Easy add later if needed.
- DOI registration agents (Crossref, DataCite) — if vFL or phalanx-fl ever issue their own DOIs for datasets / software, Aletheia could verify the registration trail too. Out of scope.
## 12. Implementation plan placeholder
Implementation plan will be drafted via the
superpowers:writing-plans skill after
AJ reviews this spec. Anticipated rough order:
- `shared/src/kourai_common/citation_artifacts.py` + unit tests
- `shared/src/kourai_common/triangulate.py` + unit tests
- `shared/src/kourai_common/academic_search.py` + cassette tests (with one-time real-API recording)
- Extend `agents/aletheia/agent.py` with `verify_and_cite()` and the five tool wrappers
- End-to-end agent test with FakeLLM
- `scripts/check_citations.py` + CI wiring
- Documentation: `docs/agents/aletheia.md` describing the new capability
Each step lands as its own PR with green CI before the next begins.
## 13. Decision log (for posterity)
| Decision | Date | Rationale |
|---|---|---|
| Extend existing Aletheia, don't build new agent | 2026-05-23 | Aletheia is already "research validator and citation enforcer" — natural specialization. |
| Direct HTTP, not MCP servers | 2026-05-23 | 100-MCP stress test: 71% median pass rate; coordinating 3 MCPs compounds tail risk. |
| Triangulation as a hard gate, not parallel-vote | 2026-05-23 | CiteGuard (single-source ReAct) is published SOTA; triangulation as a Judge-step gate adds AJ's instinct without the parallel-vote complexity. |
| Docling over PyMuPDF4LLM | 2026-05-23 | PyMuPDF4LLM is AGPL transitive-dep risk; Docling is Apache 2.0 with comparable quality and CPU-friendly. |
| Per-project `docs/citations/`, not central kourai cache | 2026-05-23 | Artifacts ship with code, show up in PR diffs, get reviewed alongside the code that cites them. |
| Title fuzzy threshold 0.92 | 2026-05-23 | Matches Vikranth3140 hybrid pipeline's "exact" tier; May 2026 benchmark shows 0.85-0.90 is "everything is duplicate" — 0.92 for high precision. |
| Skip pre-commit LLM call | 2026-05-23 | May 2026 "10-second pre-commit rule" — devs bypass slow hooks. Mechanical existence check only in pre-commit. |
| Defer Google Scholar | 2026-05-23 | No official API; scraping ToS-grey; paid SERP $50-300/mo. Free S2 + OpenAlex + arXiv covers the same papers. |
| `verify_and_cite()` returns `tuple[CitationString \| None, ArtifactPathOrError]` | 2026-05-23 | Default-deny: typed boundary forces caller to handle `None`; no silent emit-with-warning path. |