05 — AI & Multi-Agent Systems
A working multi-agent infrastructure
The infrastructure that makes the rest of the work possible.
A custom-built, bidirectional, multi-agent operations system runs continuously on dedicated hardware. It was designed and built from scratch — not assembled from open-source frameworks.
This is the center of gravity of the current work. Everything else on this site is shipped through it or coordinated by it.
A multi-agent operations infrastructure runs continuously on a dedicated server. It is bidirectional and always-on: agents hold persistent memory and a persistent runtime, so they do not start cold each time I reach for them. The system was designed and built from scratch — not assembled out of open-source agent frameworks — because the requirements were specific and the coordination had to be mine to control end to end.
There are distinct agent classes, each with a defined role. A central orchestration brain interprets intent and routes work. Ambassadors translate that intent into each provider and system's own protocol. Auditors verify what other agents produce. Messengers move information between layers without loss. Researchers gather context and feed it back. Senior executors do full-capability work, up to sixteen at once; below them, narrower sub-agents fan out — more than three hundred in parallel when the workload demands it.
I command the fleet from several surfaces. From the workstation through VSCode and the Linux CLI; from mobile devices in real time; over socket-level bidirectional connections that stay open rather than polling; and through real-time voice calls with running instances. The point of the breadth is not novelty — it is that the same infrastructure is reachable from wherever the work happens to be.
senior executor instances, full-capability, running concurrently
narrower sub-agents in parallel when the workload calls for it
LLM providers orchestrated at once — Anthropic, OpenAI, Google, xAI, Moonshot
always-on, on dedicated hardware, with persistent memory and runtime
The numbers it actually runs at.
Infrastructure / steady state
- Senior executor instances
- up to 16 concurrent
- Sub-agents
- 300+ in parallel
- LLM providers
- 5, orchestrated simultaneously
- Transport
- socket-level, bidirectional, low-latency
- Command surfaces
- workstation · mobile · voice
- Runtime host
- dedicated Dell server, always-on
- Operational state
- persistent — agents never start cold
- Agent-to-agent
- protocol-mediated, audited
- Context interfaces
- function calling · MCP · RAG
- Origin
- built from scratch — not a framework assembly
One brain, several classes, many workers.
Intent enters at the center. It is decomposed, routed to the class fit for the task, executed, and verified — then the result returns to the center, which holds the only complete picture.
Schematic. Class roles are fixed; instance counts scale with workload.
What each layer is responsible for.
The central brain holds the whole problem.
One orchestration brain sits at the center. It interprets intent, decomposes a mission into tasks, routes those tasks to the agents best suited to them, and holds the global context that no single worker can see on its own.
It also resolves conflicts between agents and maintains operational state across the run. When two paths disagree, the decision is made here, with the full picture — not at the edge by a worker that only knows its own slice.
- Interprets intent and decomposes it into discrete tasks
- Routes work to the agent class fit for each task
- Holds global context and resolves conflicts
- Maintains operational state across the whole run
Translation and transport between layers.
Ambassadors translate intent into each provider, CLI, and system's own protocol. A task expressed once at the center reaches each endpoint in the form that endpoint actually expects, so the orchestration layer is not coupled to any single vendor's interface.
Messengers move information between layers without loss. Their job is fidelity: what one layer produced is what the next layer receives, intact, rather than degraded through repeated paraphrase.
- Ambassadors speak each provider, CLI, and system protocol natively
- Messengers preserve fidelity moving information across layers
- Together they keep the core vendor-neutral
Other agents' work gets checked before it counts.
Auditors inspect, validate, and verify the work other agents produce — for correctness, for security, and for coherence with the rest of the system. They are a separate class on purpose: the agent doing the work is not the agent confirming it.
This is the same principle as code review and separation of duties, applied inside the fleet. Output is not accepted because an agent claims it is done; it is accepted because a different agent verified it against the requirement.
- Inspect and validate other agents' output
- Check correctness, security, and coherence
- Separation of duties — maker is never the checker
Context gathered before, intelligence fed back during.
Researchers gather context, investigate open questions, and feed intelligence back into the system. When a mission needs information the fleet does not already hold, this class goes and finds it rather than guessing.
Their output flows back to the orchestration brain and into working memory, so a fact established once is available to every agent that needs it afterward.
- Gather context and investigate open questions
- Feed findings back into shared working memory
- Turn unknowns into inputs the rest of the fleet can use
Idle capacity becomes preparation capacity
When there is no task, the fleet studies.
When no task is assigned, agents do not sit idle. They enter a self-directed study and cross-training loop: working through material, comparing approaches, and teaching each other across classes.
That learning is not left as transcript. Implementer agents take what was studied, fine-tune the studied model on it, and inject the resulting knowledge back into the system for future operation. The fleet that handles tomorrow's work is measurably more prepared than the one that finished today's.
- Self-directed study and cross-training during idle windows
- Implementer agents fine-tune the studied model on what was learned
- Knowledge injected back for future operation, not just logged
Provider-neutral, and able to modify weights directly.
Provider routing
- Anthropic
- OpenAI
- xAI
- Moonshot
One task, the model that fits it. No vendor lock.
Fine-tuning · LoRA · a research direction
Not vendor-locked — and not limited to prompting.
The infrastructure orchestrates multiple LLM providers at once — Anthropic, OpenAI, Google, xAI, Moonshot — and routes each task to the model that suits it. It is not tied to a single vendor. Where it helps, I go below the prompt: fine-tuning, LoRA, and advanced weight and bias modification on open-weight models.
There is also active research I will describe only at a high level. It uses small open-weight models in the four-to-twenty-billion-parameter range, with a proprietary process that activates only a minimal part — in memory — of much larger models in the hundred-to-six-hundred-billion range, and integrates that into the small models. The method is withheld; the direction has large potential, and that is as much as I will say here.
- Five providers orchestrated simultaneously, routed per task
- Fine-tuning, LoRA, weight/bias modification on open-weight models
- Small-model research (~4–20B) integrating selective activation of larger ones (~100–600B) — method withheld
This is the infrastructure that makes the rest of the work possible.
What it can reach, and how it stays sound.
Function calling
Agents act through typed tool interfaces — read, write, query, execute — rather than emitting prose I have to interpret by hand.
MCP
Model Context Protocol connects agents to tools and data sources through a common contract, so a capability added once is available across the fleet.
RAG
Retrieval feeds agents the specific context a task needs from a working memory store, instead of relying on what a model happened to be trained on.
Persistent memory
State, prior decisions, and accumulated context survive between sessions. The infrastructure remembers; nothing has to be re-explained each morning.
Observability
What each agent did, why, and against which inputs is recorded — so behaviour can be inspected after the fact, not just trusted in the moment.
Security self-audit
Continuous, authorized self-audit with BlackArch Linux across local and cloud servers. To date: no open breaches and no information leakage.
How one objective becomes coordinated work.
A request does not go straight to a worker. It is read as a single objective, broken into ordered tasks, routed by class, executed, verified, and returned to the center. Each stage has a defined owner.
Intake to return
- 01 Intake Intent arrives from a command surface and is read as a single objective, not yet a plan.
- 02 Decompose The orchestration brain breaks the objective into discrete, ordered tasks with explicit dependencies.
- 03 Route Each task is assigned to the agent class fit for it — a senior executor, or a fan-out of sub-agents.
- 04 Execute Work runs in parallel where the dependency graph allows, against typed tool interfaces.
- 05 Audit A separate agent verifies each result for correctness, security, and coherence before it counts.
- 06 Return Verified output flows back to the center, which alone holds the complete picture of the run.
Connections stay open in both directions.
The link between a command surface and a running instance is socket-level and bidirectional. It stays open rather than polling, so I can push to an instance and it can push back — mid-task, not only on completion.
Bidirectional socket
One line, held open, used both ways.
Socket-level · bidirectional · low-latency
Not request-and-wait. A held line.
Most automation talks to a model by asking a question and waiting for an answer. This does not. The transport is a persistent socket in both directions: I send instruction or context into a live instance, and the instance sends progress, questions, and results back on the same open line.
Because the connection is held rather than re-established per message, latency stays low and an exchange can be continuous. An agent can ask me something mid-task and act on the reply without tearing down and rebuilding the channel.
- Persistent socket — no per-message reconnect overhead
- Both ends can initiate — push in, push back
- Continuous mid-task exchange, not request-and-wait
Reachable from wherever the work is.
The same infrastructure answers to three surfaces. The breadth is not for novelty — it means the fleet is reachable from the workstation, from a phone, and by voice, without a separate system for each.
All three surfaces ride the same bidirectional transport into one core.
Idle, study, fine-tune, inject — then repeat.
The fleet's preparedness is not fixed at deployment. Between active tasks it studies, consolidates what it studied into a model change, and injects the result back. The loop is what makes tomorrow's fleet more prepared than today's.
- Active task The fleet executes against a mission. Intent is decomposed, routed, executed in parallel, audited, and returned. Persistent state is updated as the run proceeds.
- Idle window Self-directed study and cross-training. With no task assigned, agents work through material, compare approaches, and teach each other across classes rather than sitting idle.
- Consolidation Studied material becomes a model change. Implementer agents fine-tune the studied open-weight model on what was learned — LoRA and direct weight and bias modification where it helps.
- Injection Knowledge is fed back for future operation. The result is injected into the system, not left as a transcript, so the next run starts more prepared than the last one finished.
What the fleet knows, and how it reaches it.
An agent does not rely on what a model happened to be trained on. Context is retrieved for the task at hand, state persists between sessions, and every action runs through a typed contract. Nothing has to be re-explained each morning.
Context interfaces
- Working memory store
- task-scoped context, retrieved on demand
- Retrieval
- RAG — the specific context a task needs, not a model's training guess
- Persistent state
- decisions and accumulated context survive between sessions
- Tool contract
- MCP — a capability added once is reachable fleet-wide
- Action interface
- function calling — typed read / write / query / execute
- Provenance
- what each agent did, why, and against which inputs is recorded
The infrastructure checks itself.
The same fleet that does the work is held to an inward-facing security audit. It runs continuously, with authorization, against my own servers — local and cloud — using the BlackArch toolset.
Authorized scope
The self-audit runs against my own local and cloud servers, continuously and with explicit authorization. It is an inward-facing exercise, not an outward one.
BlackArch toolset
BlackArch Linux supplies the tooling the audit runs with — the same instruments used to probe a system are turned on the infrastructure that holds the fleet.
Standing result
To date the audit reports no open breaches and no information leakage. It is a recurring check, so the result is a current state rather than a one-time certificate.
Open to the right work
If your problem needs an operations layer this deliberate, that is the conversation.
If you are holding a problem that doesn't fit inside one field, that is the conversation I want.