Infrastructure

🔍 Observability

The agent stack ships a three-pane observability triad — Jaeger for traces, Prometheus for metrics, Dozzle for live container logs. One command opens all three:

make observe

Each log line emitted inside a span context carries the active trace ID, so a slow span found in Jaeger can be located in Dozzle by searching for that trace ID, with no code change needed between spotting the span and searching the logs.
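To make the trace-ID correlation concrete, here is a minimal sketch of how a log line can pick up the active trace ID. It is not the stack's actual logging setup: it uses only the standard library, and the `current_trace_id` context variable is a hypothetical stand-in for whatever the tracing SDK exposes as the current span. The logger name and the trace ID value are also illustrative.

```python
import contextvars
import io
import logging

# Hypothetical stand-in for the tracing SDK's "current span" state.
# Assumption: the real stack reads the ID from its tracer instead.
current_trace_id = contextvars.ContextVar("current_trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Attach the active trace ID to every record that passes through."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = current_trace_id.get()
        return True


def build_logger(stream) -> logging.Logger:
    """Return a logger whose lines are prefixed with trace_id=<id>."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s trace_id=%(trace_id)s %(message)s")
    )
    handler.addFilter(TraceIdFilter())
    logger = logging.getLogger("kourai.example")  # illustrative logger name
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


buffer = io.StringIO()
log = build_logger(buffer)

# Simulate entering a span, logging inside it, then leaving it.
token = current_trace_id.set("4bf92f3577b34da6a3ce929d0e0e4736")
log.info("calling downstream agent")
current_trace_id.reset(token)
log.info("outside any span")

print(buffer.getvalue(), end="")
```

With a format like this, copying a trace ID from Jaeger and searching for `trace_id=<id>` in Dozzle would surface every line logged inside that trace.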

The full mental model, triage runbook, and current coverage gaps live on the dedicated Observability page.


🐳 Infrastructure

Docker

A single generic Dockerfile at docker/agent.Dockerfile builds any agent via the AGENT_NAME build arg:

# Build a single agent
docker build --build-arg AGENT_NAME=mneme -f docker/agent.Dockerfile -t kourai-mneme .

Multi-stage build: builder installs deps with uv, runtime copies only the venv. Each container has a health check against /.well-known/agent-card.json.
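The Dockerfile's health check itself is not shown here, so the following is only a Python sketch of an equivalent probe against /.well-known/agent-card.json. The function names, the 2-second timeout, and the rule "HTTP 200 plus a JSON object means healthy" are assumptions made for illustration.

```python
import json
import urllib.error
import urllib.request

AGENT_CARD_PATH = "/.well-known/agent-card.json"


def agent_card_url(base_url: str) -> str:
    """Build the agent card URL for an agent's base URL."""
    return base_url.rstrip("/") + AGENT_CARD_PATH


def card_looks_valid(status: int, body: bytes) -> bool:
    """Assumed health rule: HTTP 200 and a JSON object in the body."""
    if status != 200:
        return False
    try:
        return isinstance(json.loads(body), dict)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return False


def probe(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the agent serves a plausible agent card."""
    try:
        with urllib.request.urlopen(agent_card_url(base_url), timeout=timeout) as resp:
            return card_looks_valid(resp.status, resp.read())
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx response.
        return False
```

Calling `probe("http://hephaestus:10000")` from inside the Compose network would return True once that agent is serving its card.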

Docker Compose

docker-compose.yml defines all ten agents plus the supporting infrastructure. Running docker compose up starts everything; agents resolve each other via Docker service names (e.g., http://hephaestus:10000).
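As a rough illustration of service-name resolution, the sketch below derives an agent's base URL from its Compose service name. Port 10000 comes from the hephaestus example above and may not apply to every agent. The `<SERVICE>_URL` override variable is hypothetical, included only to show how a run outside Docker might point at a different host.

```python
import os

# Assumption: taken from the hephaestus example; other agents may use other ports.
DEFAULT_PORT = 10000


def agent_base_url(service: str, env=None) -> str:
    """Resolve an agent's base URL from its Docker Compose service name."""
    env = os.environ if env is None else env
    # Hypothetical override, e.g. MNEME_URL=http://localhost:10003
    override = env.get(f"{service.upper()}_URL")
    if override:
        return override.rstrip("/")
    # Inside the Compose network, the service name is a resolvable hostname.
    return f"http://{service}:{DEFAULT_PORT}"


print(agent_base_url("hephaestus", env={}))
```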


🔑 Key Design Decisions

Why a2a-sdk directly, not AgentStack?

AgentStack requires Kubernetes running in a Lima VM, its Windows support depends on WSL2, and it ships frequent breaking changes. Decision: a2a-sdk + Starlette + uvicorn gives full A2A compliance without K8s overhead.

Why A2A 1.0.x (M7 cutover, shipped 2026-04-30)?

1.0 introduced breaking changes — Part type unification, enum case changes, method renames, well-known URL rename — that the codebase had been firewalled against via a dual-shape inspection layer under the prior >=0.3.25,<1.0 pin. The M7 cutover (six phases) retired the firewall, switched every client to the 1.0 JSON-RPC binding, and shipped AgentCard.supported_interfaces declaring v1.0 only. The shared pin is now a2a-sdk[http-server]>=1.0,<2.0.
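To show which releases the >=1.0,<2.0 range admits, here is a small standard-library sketch. It compares plain release numbers only and is not a full implementation of Python's version specifier rules: pre-release tags such as rc1 are out of scope. The helper names are illustrative.

```python
def parse_release(text: str, width: int = 3) -> tuple:
    """Turn '1.4' into (1, 4, 0) so versions compare component by component."""
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (width - len(parts))  # pad short versions with zeros
    return tuple(parts)


def satisfies_pin(version: str, lower: str = "1.0", upper: str = "2.0") -> bool:
    """True when lower <= version < upper (release numbers only)."""
    return parse_release(lower) <= parse_release(version) < parse_release(upper)


for candidate in ("0.3.25", "1.0.0", "1.4.2", "2.0.0"):
    print(candidate, satisfies_pin(candidate))
```

The last 0.x release allowed by the old pin and any future 2.x release both fall outside the range, which is the point of the upper bound: a further breaking release cannot be picked up silently.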

Why LiteLLM?

Model-agnostic interface. Claude for production, Ollama for free local dev. Swap with one env var: KOURAI_PROVIDER=local.
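A minimal sketch of the one-variable swap follows. Only KOURAI_PROVIDER=local comes from the text above. The default value "claude", the mapping table, and both model identifiers are placeholders chosen for illustration, not the project's real settings. LiteLLM selects a backend from a provider-prefixed model string, so the resolved value would be passed as the model argument to litellm.completion.

```python
import os

# Placeholder model identifiers (assumptions, not the project's configuration).
MODELS = {
    "local": "ollama/llama3",                   # free local development via Ollama
    "claude": "anthropic/claude-placeholder",   # production; substitute a real model ID
}


def resolve_model(env=None) -> str:
    """Pick the LiteLLM model string from KOURAI_PROVIDER."""
    env = os.environ if env is None else env
    provider = env.get("KOURAI_PROVIDER", "claude")  # assumed default
    if provider not in MODELS:
        raise ValueError(f"unknown KOURAI_PROVIDER: {provider!r}")
    return MODELS[provider]


print(resolve_model({"KOURAI_PROVIDER": "local"}))
```

Because every call site would receive the same resolved string, switching providers touches the environment only and no agent code.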

Why sequential pipelines, not parallel?

Agents build on each other's output — Techne needs Metis's spec, Dokimasia needs Techne's code, Kallos needs the files written. Parallelism doesn't help when there's a data dependency chain. The Kallos-Techne loop is the one place where iteration (not parallelism) adds value.
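The dependency chain and the single iterative loop can be sketched as below. The agents are stub functions rather than real A2A calls. The roles assigned to them (Metis writes a spec, Techne writes code, Dokimasia tests it, Kallos reviews it) are inferred from the dependency description above. The revision cap of three rounds and the choice not to re-run tests after a revision are assumptions.

```python
def metis(request: str) -> str:
    return f"spec({request})"


def techne(spec: str, feedback=None) -> str:
    # A revision is marked with "+fix" so the stub reviewer can recognise it.
    return f"code[{spec}]" + ("+fix" if feedback else "")


def dokimasia(code: str) -> str:
    return f"tests-pass({code})"


def kallos(code: str):
    """Return review feedback, or None when the code is accepted."""
    return None if code.endswith("+fix") else "rename variables"


def run_pipeline(request: str, max_rounds: int = 3) -> dict:
    # Each stage consumes the previous stage's output, so nothing can overlap.
    spec = metis(request)
    code = techne(spec)
    tests = dokimasia(code)

    # The one iterative part: review, revise, and review again.
    revisions = 0
    for _ in range(max_rounds):
        feedback = kallos(code)
        if feedback is None:
            break
        code = techne(spec, feedback)
        revisions += 1

    return {"code": code, "tests": tests, "revisions": revisions}


result = run_pipeline("todo app")
print(result["revisions"], result["code"])
```

Each call needs a value that only the previous call can produce, which is why running the stages concurrently would buy nothing here.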

