Infrastructure¶
🔍 Observability¶
The agent stack ships a three-pane observability triad — Jaeger for traces, Prometheus for metrics, Dozzle for live container logs. One command opens all three:
Each log line emitted inside a span context carries the active trace ID, so a slow span found in Jaeger is grep-findable in Dozzle without any code change between observation and search.
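As an illustration of how a trace ID can end up on every log line without touching call sites, here is a minimal stdlib-only sketch. It is not the stack's actual wiring: the context variable, `fake_span`, the logger name, and the log format are all hypothetical stand-ins for whatever the real tracer integration provides.

```python
import contextlib
import contextvars
import logging
import sys

# Hypothetical stand-in for the tracer's span context (assumption, not the real API).
_active_trace_id = contextvars.ContextVar("active_trace_id", default=None)


class TraceIdFilter(logging.Filter):
    """Stamp each record with the active trace ID, or '-' outside any span."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = _active_trace_id.get() or "-"
        return True  # never drop the record, only annotate it


@contextlib.contextmanager
def fake_span(trace_id: str):
    """Simulate entering and leaving a span that carries the given trace ID."""
    token = _active_trace_id.set(trace_id)
    try:
        yield
    finally:
        _active_trace_id.reset(token)


def build_logger(stream=None) -> logging.Logger:
    """Return a logger whose output lines include trace_id=<id>."""
    logger = logging.getLogger("kourai.sketch")  # hypothetical logger name
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream or sys.stderr)
    handler.setFormatter(
        logging.Formatter("%(levelname)s trace_id=%(trace_id)s %(message)s")
    )
    handler.addFilter(TraceIdFilter())
    logger.addHandler(handler)
    return logger
```

Because the ID is attached in a filter on the handler, application code keeps calling `logger.info(...)` as usual, and a trace ID copied from Jaeger can be pasted straight into a log search.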
The full mental model, triage runbook, and current coverage gaps live on the dedicated Observability page.
🐳 Infrastructure¶
Docker¶
A single generic Dockerfile at docker/agent.Dockerfile builds any agent via the AGENT_NAME build arg:
docker build --build-arg AGENT_NAME=mneme -f docker/agent.Dockerfile -t kourai-mneme .
The build is multi-stage: the builder stage installs dependencies with uv, and the runtime stage copies only the resulting virtual environment. Each container defines a health check against /.well-known/agent-card.json.
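For readers who want to probe an agent by hand, the sketch below shows one way such a check could look in Python. It is an assumption about the shape of the probe, not the Dockerfile's actual health check command; the function names and the "200 plus a JSON object" success rule are illustrative choices.

```python
import json
import urllib.error
import urllib.request

# Path taken from the text above; everything else in this block is illustrative.
AGENT_CARD_PATH = "/.well-known/agent-card.json"


def agent_card_url(base_url: str) -> str:
    """Join an agent's base URL with the well-known agent card path."""
    return base_url.rstrip("/") + AGENT_CARD_PATH


def agent_is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the agent card endpoint answers 200 with a JSON object."""
    try:
        with urllib.request.urlopen(agent_card_url(base_url), timeout=timeout) as resp:
            if resp.status != 200:
                return False
            card = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or a non-JSON body.
        return False
    return isinstance(card, dict)
```

A call such as `agent_is_healthy("http://hephaestus:10000")` would only succeed from inside the Compose network, where that hostname resolves.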
Docker Compose¶
docker-compose.yml defines all ten agents plus the supporting infrastructure. Running docker compose up starts everything; agents resolve each other via Docker service names (e.g., http://hephaestus:10000).
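Service-name resolution means a client only needs the agent's Compose service name to build its URL. The helper below is a hypothetical sketch: it assumes every agent listens on port 10000 inside the network (only the hephaestus example above confirms that port), and the `overrides` mapping is an invented convenience for running outside Compose.

```python
from typing import Mapping, Optional

# Assumption: the port shown in the hephaestus example applies to every agent.
DEFAULT_AGENT_PORT = 10000


def agent_base_url(
    service_name: str,
    port: int = DEFAULT_AGENT_PORT,
    overrides: Optional[Mapping[str, str]] = None,
) -> str:
    """Build an agent's base URL from its Docker Compose service name.

    `overrides` (hypothetical) maps service names to explicit URLs, which is
    handy when an agent runs on the host instead of inside the network.
    """
    if overrides and service_name in overrides:
        return overrides[service_name].rstrip("/")
    if not service_name:
        raise ValueError("service_name must be non-empty")
    return f"http://{service_name}:{port}"
```

Inside the network, `agent_base_url("hephaestus")` yields the URL from the example above; on a developer machine, an override such as `{"hephaestus": "http://localhost:10000"}` could point the same code at a locally running agent.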
🔑 Key Design Decisions¶
Why a2a-sdk directly, not AgentStack?
AgentStack requires Kubernetes running in a Lima VM, Windows support additionally needs WSL2, and its releases bring frequent breaking changes. Decision: a2a-sdk + Starlette + uvicorn gives full A2A compliance without the Kubernetes overhead.
Why A2A 1.0.x (M7 cutover, shipped 2026-04-30)?
1.0 introduced breaking changes — Part type unification, enum case changes, method renames, well-known URL rename — that the codebase had been firewalled against via a dual-shape inspection layer under the prior >=0.3.25,<1.0 pin. The M7 cutover (six phases) retired the firewall, switched every client to the 1.0 JSON-RPC binding, and shipped AgentCard.supported_interfaces declaring v1.0 only. The shared pin is now a2a-sdk[http-server]>=1.0,<2.0.
Why LiteLLM?
LiteLLM provides a model-agnostic interface: Claude for production, Ollama for free local development. Swap providers with one environment variable, e.g. KOURAI_PROVIDER=local.
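The one-variable swap can be pictured as a small resolver that turns KOURAI_PROVIDER into a LiteLLM model string. This is a sketch under assumptions: both model identifiers are placeholders rather than the project's configured models, and treating any value other than `local` as "use Claude" is a guess at the default behaviour.

```python
import os
from typing import Mapping, Optional

# Placeholder identifiers (assumptions). LiteLLM routes on the provider prefix,
# e.g. "anthropic/..." or "ollama/...", but the model names here are invented.
PRODUCTION_MODEL = "anthropic/claude-model-placeholder"
LOCAL_MODEL = "ollama/local-model-placeholder"


def resolve_model(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick a model string based on KOURAI_PROVIDER.

    Assumed rule: "local" selects the Ollama model; anything else, including
    an unset variable, falls back to the production Claude model.
    """
    env = os.environ if env is None else env
    provider = env.get("KOURAI_PROVIDER", "").strip().lower()
    return LOCAL_MODEL if provider == "local" else PRODUCTION_MODEL
```

The returned string would then be passed as the `model` argument of `litellm.completion(model=..., messages=[...])`, so no call site needs to know which provider is active.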
Why sequential pipelines, not parallel?
Agents build on each other's output — Techne needs Metis's spec, Dokimasia needs Techne's code, Kallos needs the files written. Parallelism doesn't help when there's a data dependency chain. The Kallos-Techne loop is the one place where iteration (not parallelism) adds value.
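The dependency chain and the single iterative loop can be sketched with stub agents. Everything below is illustrative: the stubs return canned strings instead of calling real agents, the roles given to Dokimasia (checking the code) and Kallos (reviewing the written output) are guesses from the sentence above, and the placement of the review loop after Dokimasia and the `max_review_rounds` cap are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PipelineState:
    request: str
    spec: str = ""
    code: str = ""
    test_report: str = ""
    review_notes: List[str] = field(default_factory=list)
    trace: List[str] = field(default_factory=list)  # order in which agents ran


def metis(state: PipelineState) -> PipelineState:
    state.spec = f"spec for: {state.request}"
    state.trace.append("metis")
    return state


def techne(state: PipelineState) -> PipelineState:
    if not state.spec:
        raise ValueError("techne needs metis's spec")
    revision = state.trace.count("techne") + 1
    state.code = f"code v{revision} implementing [{state.spec}]"
    state.review_notes = []  # a new revision is assumed to address open notes
    state.trace.append("techne")
    return state


def dokimasia(state: PipelineState) -> PipelineState:
    if not state.code:
        raise ValueError("dokimasia needs techne's code")
    state.test_report = f"checked: {state.code}"
    state.trace.append("dokimasia")
    return state


def kallos(state: PipelineState) -> PipelineState:
    if not state.code:
        raise ValueError("kallos needs the files written")
    # Stub rule: only the first revision draws a review note.
    state.review_notes = ["tidy naming"] if state.code.startswith("code v1 ") else []
    state.trace.append("kallos")
    return state


def run_pipeline(request: str, max_review_rounds: int = 3) -> PipelineState:
    # Strictly sequential: each stage consumes the previous stage's output.
    state = dokimasia(techne(metis(PipelineState(request))))
    # The one iterative part: Kallos reviews, Techne revises, until clean.
    for _ in range(max_review_rounds):
        state = kallos(state)
        if not state.review_notes:
            break
        state = techne(state)
    return state
```

Each stub raises if its input is missing, which makes the data dependency explicit: there is nothing for a later stage to do until the earlier one has finished, so running them concurrently would gain nothing.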
📚 References¶
A2A Protocol¶
- A2A Protocol Spec (v0.4.0)
- A2A GitHub
- A2A Python Samples
- A2A SDK (PyPI)
- A2A Purchasing Concierge Codelab
Industry Context¶
- Google Blog: A2A — A New Era of Agent Interoperability
- IBM: What Is Agent2Agent Protocol
- AWS: Inter-Agent Communication on A2A
- Linux Foundation A2A Project