System Architecture

The persistent memory kernel, multi-model routing, and the harness contract.

Persistent memory and knowledge graph

NOME maintains state across sessions, models, and devices. Historical context, user preferences, project architectures, and interaction nuances persist in vector stores and continuously updated knowledge graphs.

Memory retrieval is thread-aware and project-aware — it finds what matters when it matters, built for long-running projects instead of session amnesia.
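As an illustrative sketch only (the `Memory` type, `retrieve` function, and boost weights are hypothetical, not NOME's actual API), thread- and project-aware retrieval can be modeled as vector similarity plus scoped boosts:

```python
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    embedding: list[float]
    thread_id: str
    project_id: str

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_emb: list[float], memories: list[Memory],
             thread_id: str, project_id: str, k: int = 3) -> list[Memory]:
    """Rank memories by similarity, boosting same-thread and same-project hits."""
    def score(m: Memory) -> float:
        s = cosine(query_emb, m.embedding)
        if m.thread_id == thread_id:
            s += 0.2  # thread-aware boost (illustrative weight)
        if m.project_id == project_id:
            s += 0.1  # project-aware boost (illustrative weight)
        return s
    return sorted(memories, key=score, reverse=True)[:k]
```

The boosts are what distinguish this from plain similarity search: a memory from the current thread and project outranks an equally similar memory from elsewhere.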

Three-plane route separation

NOME separates routing into three independent planes that compose but never collapse:

Product routing — the universal router classifies intent, assigns roles, and determines effort budgets.

Execution topology — the topology controller decides single vs centralized vs swarm fanout, worker model selection, and parallelization.

Provider routing — the model resolver selects the concrete provider, model, and fallback chain.

No public framework cleanly separates these three planes the way NOME does.
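A minimal sketch of the three-plane separation, with all names and heuristics hypothetical (not NOME's real routing rules): each plane is its own function with its own output type, and `dispatch` composes them without collapsing them.

```python
from dataclasses import dataclass

# Plane 1: product routing -- intent, role, effort budget.
@dataclass
class ProductRoute:
    intent: str
    role: str
    effort_budget: int  # illustrative unit, e.g. tokens

# Plane 2: execution topology -- single vs centralized vs swarm.
@dataclass
class Topology:
    mode: str     # "single" | "centralized" | "swarm"
    workers: int

# Plane 3: provider routing -- concrete provider, model, fallback chain.
@dataclass
class Resolution:
    provider: str
    model: str
    fallbacks: list[str]

def route_product(prompt: str) -> ProductRoute:
    intent = "code" if "fix" in prompt or "def " in prompt else "chat"
    return ProductRoute(intent, role="assistant",
                        effort_budget=1000 if intent == "code" else 200)

def plan_topology(route: ProductRoute) -> Topology:
    # Heavier budgets justify fanout; simple chat stays single.
    return Topology("swarm", 4) if route.effort_budget >= 1000 else Topology("single", 1)

def resolve_provider(route: ProductRoute) -> Resolution:
    model = "big-coder" if route.intent == "code" else "fast-chat"
    return Resolution("primary-cloud", model, ["fallback-cloud"])

def dispatch(prompt: str) -> tuple[ProductRoute, Topology, Resolution]:
    route = route_product(prompt)       # plane 1
    topo = plan_topology(route)         # plane 2
    res = resolve_provider(route)       # plane 3
    return route, topo, res
```

Because each plane has its own interface, swapping a provider (plane 3) never touches intent classification (plane 1) or fanout policy (plane 2).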

HarnessRun contract and work-item continuity

Every objective opens a HarnessRun — the canonical unit of governed work. A run binds to a work_item_id, thread, session, workspace, and project. Receipts, artifacts, and approvals are typed, cross-device shared truth.

The harness contract is surface-neutral: Agents, CoWork, Code, Productivity, CLI, and any future surface are profiles over this contract, not parallel harness definitions. Models are replaceable; the contract is not.
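The shape of the contract can be sketched as follows; the field names and `SURFACE_PROFILES` mapping are assumptions for illustration, not NOME's published schema:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass(frozen=True)
class RunBinding:
    """The identifiers every run binds to -- the unit of governed work."""
    work_item_id: str
    thread_id: str
    session_id: str
    workspace_id: str
    project_id: str

@dataclass
class HarnessRun:
    binding: RunBinding
    surface: str                                 # "Agents", "Code", "CLI", ...
    receipts: list[dict] = field(default_factory=list)
    artifacts: list[dict] = field(default_factory=list)
    approvals: list[dict] = field(default_factory=list)

    def record_receipt(self, kind: str, payload: Any) -> None:
        # Typed receipts: each entry carries its kind so any device can replay it.
        self.receipts.append({"kind": kind, "payload": payload})

# Surfaces are profiles over the one contract, not parallel harness definitions.
SURFACE_PROFILES = {
    "CLI": {"stream_output": True},
    "CoWork": {"stream_output": False},
}
```

The key design point survives even in this toy form: every surface constructs the same `HarnessRun`, and only the profile overlay differs.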

Cross-device state model

NOME uses a three-tier state split for cross-device continuity:

Shared state — durable truth synced across all devices (threads, work items, receipts).

Soft state — last-writer-wins data such as preferences, draft buffers, and tab registries.

Per-device state — local-only data like viewport scroll position and transient UI state.

Object-based continuity is keyed to work_item_id, with fail-closed restore and schema validation.

Ensemble routing and model selection

NOME dynamically routes each request to the optimal model based on workload characteristics, cost constraints, and latency requirements. The ensemble supports best-of-n, consensus voting, and merge-and-refine aggregation strategies.
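Two of the named aggregation strategies are simple enough to show in miniature (these are generic textbook forms, not NOME's internal implementations):

```python
from collections import Counter
from typing import Callable

def best_of_n(candidates: list[str], score: Callable[[str], float]) -> str:
    """Best-of-n: sample n candidate answers, keep the highest-scoring one."""
    return max(candidates, key=score)

def consensus(candidates: list[str]) -> str:
    """Consensus voting: the most frequent answer wins; ties break by first seen."""
    return Counter(candidates).most_common(1)[0][0]
```

Merge-and-refine is the third strategy the text names: instead of picking one candidate, a model is asked to synthesize a new answer from all of them, which is why it has no equally tiny pure-function form.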

Effort budgets on each run frame prevent over-investment in simple queries and under-investment in complex ones. Cost tracking is a first-class run metric.
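An effort budget on a run frame reduces to a charge-or-refuse check; the `RunFrame` name and cent-denominated units are assumptions for the sketch:

```python
from dataclasses import dataclass

@dataclass
class RunFrame:
    budget_cents: int      # effort cap for this frame
    spent_cents: int = 0   # cost tracking as a first-class run metric

    def charge(self, cost_cents: int) -> bool:
        """Charge a step against the frame's budget; refuse work past the cap."""
        if self.spent_cents + cost_cents > self.budget_cents:
            return False   # over-investment blocked before the call is made
        self.spent_cents += cost_cents
        return True
```

The same mechanism prevents under-investment in reverse: a complex objective simply opens a frame with a larger cap.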

Offline and local execution

Offline runs use the same HarnessRun contract as cloud runs. Tool invocations that cannot execute offline are queued as deferred tasks and resolve when connectivity returns.
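The defer-and-resolve pattern can be sketched as a queue of pending tool calls (a generic pattern; `DeferredQueue` and its methods are hypothetical names, not NOME's API):

```python
from collections import deque
from typing import Any, Callable

class DeferredQueue:
    """Queue tool calls that need the network; drain them when connectivity returns."""

    def __init__(self) -> None:
        self._pending: deque = deque()

    def invoke(self, tool: Callable, args: tuple, online: bool) -> Any:
        if online:
            return tool(*args)            # executes immediately
        self._pending.append((tool, args))
        return None                       # deferred: no result yet

    def drain(self) -> list:
        """Run every deferred call in order once connectivity is back."""
        results = []
        while self._pending:
            tool, args = self._pending.popleft()
            results.append(tool(*args))
        return results
```

Because offline and cloud runs share one HarnessRun contract, a deferred call resolving later lands in the same receipts stream as an immediate one.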

Local model selection follows the compute policy and local model registry. Blackout is the dedicated native-first offline surface with zero network latency and total data privacy on Apple Silicon.
