05 — AI & Multi-Agent Systems

A working multi-agent infrastructure

The infrastructure that makes the rest of the work possible.

A custom-built, bidirectional, multi-agent operations system runs continuously on dedicated hardware. It was designed and built from scratch — not assembled from open-source frameworks.

What it is

This is the center of gravity of the current work. Everything else on this site is shipped through it or coordinated by it.

A multi-agent operations infrastructure runs continuously on a dedicated server. It is bidirectional and always-on: agents hold persistent memory and a persistent runtime, so they do not start cold each time I reach for them. The system was designed and built from scratch — not assembled out of open-source agent frameworks — because the requirements were specific and the coordination had to be mine to control end to end.

There are distinct agent classes, each with a defined role. A central orchestration brain interprets intent and routes work. Ambassadors translate that intent into the protocol of each provider and system. Auditors verify what other agents produce. Messengers move information between layers without loss. Researchers gather context and feed it back. Senior executors do full-capability work, up to sixteen at once; below them, narrower sub-agents fan out, more than three hundred in parallel when the workload demands it.

I command the fleet from three surfaces: the workstation, through VS Code and the Linux CLI; mobile devices, in real time; and real-time voice calls with running instances. All three ride socket-level bidirectional connections that stay open rather than polling. The point of the breadth is not novelty. It is that the same infrastructure is reachable from wherever the work happens to be.

16

senior executor instances, full-capability, running concurrently

300+

narrower sub-agents in parallel when the workload calls for it

5

LLM providers orchestrated at once — Anthropic, OpenAI, Google, xAI, Moonshot

24/7

always-on, on dedicated hardware, with persistent memory and runtime

Operating parameters

The numbers it actually runs at.

Infrastructure / steady state

Senior executor instances: up to 16 concurrent
Sub-agents: 300+ in parallel
LLM providers: 5, orchestrated simultaneously
Transport: socket-level, bidirectional, low-latency
Command surfaces: workstation · mobile · voice
Runtime host: dedicated Dell server, always-on
Operational state: persistent; agents never start cold
Agent-to-agent: protocol-mediated, audited
Context interfaces: function calling · MCP · RAG
Origin: built from scratch, not a framework assembly

Topology

One brain, several classes, many workers.

Intent enters at the center. It is decomposed, routed to the class fit for the task, executed, and verified — then the result returns to the center, which holds the only complete picture.

[Figure: agent topology. A central orchestration brain connects to four coordination classes (ambassadors, auditors, messengers and researchers), which connect to senior executors (up to 16), which fan out to sub-agent workers (300+ in parallel).]

Schematic. Class roles are fixed; instance counts scale with workload.
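
One way to picture "instance counts scale with workload" under a fixed ceiling is a semaphore that admits at most sixteen senior executors at a time. This is a minimal sketch, not the system's actual scheduler: `run_senior`, `run_batch`, and the `stats` dictionary are hypothetical names, and the real work is replaced by a placeholder.

```python
import asyncio

MAX_SENIOR_EXECUTORS = 16  # ceiling stated in the text


async def run_senior(task_id: int, gate: asyncio.Semaphore, stats: dict) -> int:
    """Run one hypothetical senior-executor task behind the concurrency gate."""
    async with gate:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0)  # placeholder for real work; yields to other tasks
        stats["active"] -= 1
    return task_id


async def run_batch(n_tasks: int) -> dict:
    """Submit n_tasks at once; the semaphore keeps at most 16 running."""
    gate = asyncio.Semaphore(MAX_SENIOR_EXECUTORS)
    stats = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(run_senior(i, gate, stats) for i in range(n_tasks))
    )
    stats["done"] = len(results)
    return stats
```

Calling `asyncio.run(run_batch(40))` completes all forty tasks while the recorded peak never exceeds the ceiling. A sub-agent tier could use the same pattern with a larger limit.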

The classes, in depth

What each layer is responsible for.

The central brain holds the whole problem.

One orchestration brain sits at the center. It interprets intent, decomposes a mission into tasks, routes those tasks to the agents best suited to them, and holds the global context that no single worker can see on its own.

It also resolves conflicts between agents and maintains operational state across the run. When two paths disagree, the decision is made here, with the full picture — not at the edge by a worker that only knows its own slice.

  • Interprets intent and decomposes it into discrete tasks
  • Routes work to the agent class fit for each task
  • Holds global context and resolves conflicts
  • Maintains operational state across the whole run
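
The routing responsibility listed above can be sketched as a lookup from task kind to agent class, with the center recording every assignment. The class names come from the text; the task kinds, the `ROUTES` table, and the `Brain` type are illustrative assumptions, not the real routing logic.

```python
from dataclasses import dataclass, field
from enum import Enum


class AgentClass(Enum):
    AMBASSADOR = "ambassador"
    AUDITOR = "auditor"
    MESSENGER = "messenger"
    RESEARCHER = "researcher"
    SENIOR_EXECUTOR = "senior_executor"
    SUB_AGENT = "sub_agent"


# Hypothetical task kinds; the source names the classes, not the routing keys.
ROUTES = {
    "translate_protocol": AgentClass.AMBASSADOR,
    "verify": AgentClass.AUDITOR,
    "relay": AgentClass.MESSENGER,
    "gather_context": AgentClass.RESEARCHER,
    "full_capability": AgentClass.SENIOR_EXECUTOR,
    "narrow": AgentClass.SUB_AGENT,
}


@dataclass
class Task:
    name: str
    kind: str


@dataclass
class Brain:
    # The center keeps the only complete record of who was assigned what.
    global_context: dict = field(default_factory=dict)

    def route(self, task: Task) -> AgentClass:
        if task.kind not in ROUTES:
            raise ValueError(f"no route for task kind {task.kind!r}")
        agent_class = ROUTES[task.kind]
        self.global_context[task.name] = agent_class
        return agent_class
```

Keeping the assignment record on the `Brain` rather than on the workers mirrors the claim that only the center sees the whole run.
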
Idle time
[Image: metallic icosahedron, an abstract representation of the agent fleet.]

Idle capacity becomes preparation capacity

When there is no task, the fleet studies.

When no task is assigned, agents do not sit idle. They enter a self-directed study and cross-training loop: working through material, comparing approaches, and teaching each other across classes.

That learning is not left as transcript. Implementer agents take what was studied, fine-tune the studied model on it, and inject the resulting knowledge back into the system for future operation. The fleet that handles tomorrow's work is measurably more prepared than the one that finished today's.

  • Self-directed study and cross-training during idle windows
  • Implementer agents fine-tune the studied model on what was learned
  • Knowledge injected back for future operation, not just logged
Models

Provider-neutral, and able to modify weights directly.

Provider routing

  • Anthropic
  • OpenAI
  • Google
  • xAI
  • Moonshot

One task, the model that fits it. No vendor lock.

Fine-tuning · LoRA · a research direction

Not vendor-locked — and not limited to prompting.

The infrastructure orchestrates multiple LLM providers at once — Anthropic, OpenAI, Google, xAI, Moonshot — and routes each task to the model that suits it. It is not tied to a single vendor. Where it helps, I go below the prompt: fine-tuning, LoRA, and advanced weight and bias modification on open-weight models.

There is also active research I will describe only at a high level. It uses small open-weight models in the four-to-twenty-billion-parameter range, with a proprietary process that activates only a minimal part — in memory — of much larger models in the hundred-to-six-hundred-billion range, and integrates that into the small models. The method is withheld; the direction has large potential, and that is as much as I will say here.

  • Five providers orchestrated simultaneously, routed per task
  • Fine-tuning, LoRA, weight/bias modification on open-weight models
  • Small-model research (~4–20B) integrating selective activation of larger ones (~100–600B) — method withheld
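
Per-task routing without vendor lock can be sketched as a preference list with a fallback order. The five provider names come from the text; the task kinds and the pairings in `PREFERENCES` are arbitrary placeholders, since the source does not say which provider handles which work, and no real provider API is called.

```python
from typing import Dict, List, Set

PROVIDERS = ("Anthropic", "OpenAI", "Google", "xAI", "Moonshot")

# Placeholder pairings for illustration only; not the system's real policy.
PREFERENCES: Dict[str, List[str]] = {
    "code_review": ["Anthropic", "OpenAI"],
    "long_document": ["Google", "Anthropic"],
}


def pick_provider(task_kind: str, available: Set[str]) -> str:
    """Return the first preferred provider that is currently available.

    Falls back to the general provider order, so losing one vendor
    never leaves a task without a model.
    """
    candidates = PREFERENCES.get(task_kind, []) + list(PROVIDERS)
    for name in candidates:
        if name in available:
            return name
    raise RuntimeError("no provider available")
```

For example, `pick_provider("code_review", {"Google", "xAI"})` skips the two unavailable preferences and falls through to the first available provider in the general order.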

This is the infrastructure that makes the rest of the work possible.

Capabilities

What it can reach, and how it stays sound.

01

Function calling

Agents act through typed tool interfaces — read, write, query, execute — rather than emitting prose I have to interpret by hand.

02

MCP

Model Context Protocol connects agents to tools and data sources through a common contract, so a capability added once is available across the fleet.

03

RAG

Retrieval feeds agents the specific context a task needs from a working memory store, instead of relying on what a model happened to be trained on.

04

Persistent memory

State, prior decisions, and accumulated context survive between sessions. The infrastructure remembers; nothing has to be re-explained each morning.

05

Observability

What each agent did, why, and against which inputs is recorded — so behaviour can be inspected after the fact, not just trusted in the moment.

06

Security self-audit

Continuous, authorized self-audit with BlackArch Linux across local and cloud servers. To date: no open breaches and no information leakage.
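
The function-calling and observability cards above fit together naturally: if every action goes through a typed call, recording who did what, why, and against which inputs is a side effect of the call itself. A minimal sketch, assuming a hypothetical `ToolRegistry` and `ToolCall` (neither name comes from the system):

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

ALLOWED_TOOLS = ("read", "write", "query", "execute")  # verbs named in the text


@dataclass(frozen=True)
class ToolCall:
    agent: str            # who is acting
    tool: str             # which typed interface is used
    args: Dict[str, Any]  # the inputs the call runs against
    reason: str           # why the agent made the call


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}
        self.log: List[Dict[str, Any]] = []

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        if name not in ALLOWED_TOOLS:
            raise ValueError(f"unknown tool type {name!r}")
        self._tools[name] = fn

    def invoke(self, call: ToolCall) -> Any:
        fn = self._tools[call.tool]  # KeyError if the tool was never registered
        result = fn(**call.args)
        # Record the call after it succeeds so it can be inspected later.
        self.log.append({
            "agent": call.agent,
            "tool": call.tool,
            "args": dict(call.args),
            "reason": call.reason,
            "result": result,
        })
        return result
```

A registered `read` tool invoked by an auditor leaves one log entry carrying the agent name, the arguments, and the stated reason.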

Mission decomposition

How one objective becomes coordinated work.

A request does not go straight to a worker. It is read as a single objective, broken into ordered tasks, routed by class, executed, verified, and returned to the center. Each stage has a defined owner.

Intake to return

  1. Intake: intent arrives from a command surface and is read as a single objective, not yet a plan.
  2. Decompose: the orchestration brain breaks the objective into discrete, ordered tasks with explicit dependencies.
  3. Route: each task is assigned to the agent class fit for it, either a senior executor or a fan-out of sub-agents.
  4. Execute: work runs in parallel where the dependency graph allows, against typed tool interfaces.
  5. Audit: a separate agent verifies each result for correctness, security, and coherence before it counts.
  6. Return: verified output flows back to the center, which alone holds the complete picture of the run.

[Figure: mission decomposition flow. A single objective enters the orchestration brain and is split into parallel tasks routed to executor classes; each result is audited, and verified output returns to the center.]
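
The execute, audit, and return stages can be sketched with Python's standard `graphlib.TopologicalSorter` (available since Python 3.9): tasks whose dependencies are all verified run in parallel, and a result only counts once the audit accepts it. The `run_mission` function and its `execute` and `audit` callbacks are hypothetical stand-ins, not the system's own interfaces.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, Set


def run_mission(
    deps: Dict[str, Set[str]],
    execute: Callable[[str], Any],
    audit: Callable[[str, Any], bool],
) -> Dict[str, Any]:
    """Run tasks in dependency order; deps maps each task to its prerequisites."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the plan has a cycle
    verified: Dict[str, Any] = {}
    with ThreadPoolExecutor() as pool:
        while sorter.is_active():
            # Everything in one ready batch has no unmet dependency,
            # so the batch can run in parallel.
            ready = list(sorter.get_ready())
            for task, result in zip(ready, pool.map(execute, ready)):
                if not audit(task, result):
                    raise ValueError(f"audit rejected {task!r}")
                verified[task] = result
                sorter.done(task)  # unlocks tasks that depend on this one
    return verified
```

Marking a task done only after the audit passes means a rejected result can never unlock downstream work.
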
Transport

Connections stay open in both directions.

The link between a command surface and a running instance is socket-level and bidirectional. It stays open rather than polling, so I can push to an instance and it can push back — mid-task, not only on completion.

Bidirectional socket

[Figure: bidirectional socket communication. A command surface (me) and a running, warm agent instance hold one open socket. Instruction and context travel one way; progress, questions, and results travel the other.]

One line, held open, used both ways.

Socket-level · bidirectional · low-latency

Not request-and-wait. A held line.

Most automation talks to a model by asking a question and waiting for an answer. This does not. The transport is a persistent socket in both directions: I send instruction or context into a live instance, and the instance sends progress, questions, and results back on the same open line.

Because the connection is held rather than re-established per message, latency stays low and an exchange can be continuous. An agent can ask me something mid-task and act on the reply without tearing down and rebuilding the channel.

  • Persistent socket — no per-message reconnect overhead
  • Both ends can initiate — push in, push back
  • Continuous mid-task exchange, not request-and-wait
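
The held-line behaviour can be sketched with the standard library alone. `socket.socketpair()` stands in for the real network connection, and the newline-delimited JSON framing and message types are assumptions made for the example, not the system's wire protocol. The point is that the agent side asks a question mid-task and receives the reply on the same open connection.

```python
import json
import socket
import threading


def send(stream, message: dict) -> None:
    """Write one newline-delimited JSON message and flush it immediately."""
    stream.write(json.dumps(message) + "\n")
    stream.flush()


def recv(stream) -> dict:
    """Block until one full message line arrives, then decode it."""
    return json.loads(stream.readline())


def agent(conn: socket.socket) -> None:
    """Instance side: receives an instruction, asks a question mid-task, replies."""
    with conn, conn.makefile("rw", encoding="utf-8") as stream:
        task = recv(stream)
        send(stream, {"type": "progress", "text": "started " + task["text"]})
        send(stream, {"type": "question", "text": "which branch?"})
        answer = recv(stream)  # the reply arrives on the same open line
        send(stream, {"type": "result", "text": f"{task['text']} on {answer['text']}"})


def demo() -> list:
    """Command-surface side: push an instruction, answer questions, collect output."""
    surface, instance = socket.socketpair()
    worker = threading.Thread(target=agent, args=(instance,))
    worker.start()
    seen = []
    with surface, surface.makefile("rw", encoding="utf-8") as stream:
        send(stream, {"type": "instruction", "text": "deploy"})
        while True:
            message = recv(stream)
            seen.append(message["type"])
            if message["type"] == "question":
                send(stream, {"type": "context", "text": "main"})
            elif message["type"] == "result":
                seen.append(message["text"])
                break
    worker.join()
    return seen
```

Nothing is reconnected between the question and the answer; both ends simply keep reading and writing on the one connection.
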
Command surfaces

Reachable from wherever the work is.

The same infrastructure answers to three surfaces. The breadth is not for novelty — it means the fleet is reachable from the workstation, from a phone, and by voice, without a separate system for each.

[Figure: command surfaces. Workstation (VS Code, Linux CLI), mobile (real time, in the field), and voice all connect over bidirectional sockets to one orchestration core that drives the agent fleet.]

All three surfaces ride the same bidirectional transport into one core.

The learning loop

Idle, study, fine-tune, inject — then repeat.

The fleet's preparedness is not fixed at deployment. Between active tasks it studies, consolidates what it studied into a model change, and injects the result back. The loop is what makes tomorrow's fleet more prepared than today's.

[Figure: the idle-study, fine-tune, inject loop. Four stages arranged as a closed cycle: study during the idle window, consolidate (implementer agents), fine-tune the open-weight model (LoRA, weights), and inject the result back into the fleet. A more prepared fleet re-enters the loop.]

  1. Active task: the fleet executes against a mission. Intent is decomposed, routed, executed in parallel, audited, and returned. Persistent state is updated as the run proceeds.
  2. Idle window: self-directed study and cross-training. With no task assigned, agents work through material, compare approaches, and teach each other across classes rather than sitting idle.
  3. Consolidation: studied material becomes a model change. Implementer agents fine-tune the studied open-weight model on what was learned, using LoRA and direct weight and bias modification where it helps.
  4. Injection: knowledge is fed back for future operation. The result is injected into the system, not left as a transcript, so the next run starts more prepared than the last one finished.
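
The consolidation stage names LoRA. As a reminder of what a LoRA update does to a weight matrix, the standard formulation adds a scaled low-rank product to the frozen weights: W' = W + (alpha / r) * B * A, where r is the adapter rank. The sketch below shows only that merge step in plain Python on toy matrices; it is the textbook formula, not a description of how implementer agents train or apply adapters here, and `lora_merge` is a hypothetical name.

```python
from typing import List

Matrix = List[List[float]]


def matmul(left: Matrix, right: Matrix) -> Matrix:
    """Plain nested-list matrix product: (d x r) times (r x k) gives (d x k)."""
    return [
        [sum(x * y for x, y in zip(row, col)) for col in zip(*right)]
        for row in left
    ]


def lora_merge(w: Matrix, a: Matrix, b: Matrix, alpha: float) -> Matrix:
    """Fold a LoRA adapter into base weights: W + (alpha / r) * (B @ A).

    w is d x k, b is d x r, a is r x k; r is the adapter rank.
    """
    rank = len(a)
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]
```

Because B and A together hold far fewer values than W when r is small, only a small number of parameters is trained while the base weights stay frozen until the merge.
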
Memory & context

What the fleet knows, and how it reaches it.

An agent does not rely on what a model happened to be trained on. Context is retrieved for the task at hand, state persists between sessions, and every action runs through a typed contract. Nothing has to be re-explained each morning.

[Figure: memory and RAG architecture. An agent queries task-scoped context from a working-memory store via retrieval (RAG), acts through typed tool contracts (MCP, function calling), and writes verified results and decisions back into persistent state, which survives sessions.]

Context interfaces

Working memory store: task-scoped context, retrieved on demand
Retrieval: RAG, the specific context a task needs, not a model's training guess
Persistent state: decisions and accumulated context survive between sessions
Tool contract: MCP, so a capability added once is reachable fleet-wide
Action interface: function calling, typed read / write / query / execute
Provenance: what each agent did, why, and against which inputs is recorded
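
Task-scoped retrieval can be illustrated with a deliberately simple scorer. The source does not say how the working memory store ranks entries, so the word-overlap scoring below is a stand-in chosen to keep the example dependency-free; `retrieve` and the store layout are assumptions.

```python
import re
from collections import Counter
from typing import Dict, List


def tokens(text: str) -> Counter:
    """Lowercased word counts; a crude substitute for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, store: Dict[str, str], k: int = 2) -> List[str]:
    """Return up to k entry ids from the store, ranked by word overlap with the query."""
    query_tokens = tokens(query)
    scored = []
    for entry_id, text in store.items():
        overlap = sum((query_tokens & tokens(text)).values())
        if overlap:
            scored.append((overlap, entry_id))
    # Highest overlap first; ties broken by id so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [entry_id for _, entry_id in scored[:k]]
```

Only the entries that match the task at hand are handed to the agent; entries with no overlap are never returned, which is the difference between retrieval and relying on whatever a model was trained on.
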
Self-audit

The infrastructure checks itself.

The same fleet that does the work is held to an inward-facing security audit. It runs continuously, with authorization, against my own servers — local and cloud — using the BlackArch toolset.

01

Authorized scope

The self-audit runs against my own local and cloud servers, continuously and with explicit authorization. It is an inward-facing exercise, not an outward one.

02

BlackArch toolset

BlackArch Linux supplies the tooling the audit runs with — the same instruments used to probe a system are turned on the infrastructure that holds the fleet.

03

Standing result

To date the audit reports no open breaches and no information leakage. It is a recurring check, so the result is a current state rather than a one-time certificate.

Open to the right work

If your problem needs an operations layer this deliberate, that is the conversation.

If you are holding a problem that doesn't fit inside one field, that is the conversation I want.
