OpenClaw Subagents: How to Spawn and Orchestrate Multiple AI Workers

Every OpenClaw operator eventually hits a ceiling: the agent can only do one thing at a time. You ask it to research a competitor, and while it works, you wait. You need a weekly report generated, a batch of URLs crawled, and a Slack digest composed all at once. A single agent session is a single thread of execution, and the world does not wait for sequential processing.

Subagents change that equation. OpenClaw’s subagent system lets you spawn independent child agent sessions that run in parallel, each with its own state, its own tool access, and its own completion timeline. The parent agent can fire off a dozen workers, each specialized for a different task, and collect results as they arrive. True parallelism. Specialized workers. No more waiting.

Learning how to orchestrate multiple agents with OpenClaw subagents is the skill that separates casual OpenClaw users from production operators running autonomous multi-agent workflows at scale. This guide covers the complete subagent architecture, concrete spawning examples with working JSON configurations, push-based completion, three proven orchestration patterns, the most common failure mode (the isolation trap), and the cost discipline that separates production-grade subagent usage from token-burning chaos.

How Subagents Work: The Architecture

Subagents in OpenClaw are powered by a single primitive: sessions_spawn. When a parent agent calls sessions_spawn, the OpenClaw runtime creates a brand new agent session. That child session is fully independent. It has its own context window, its own tool access, its own conversation history. The parent and child share nothing unless the parent explicitly includes information in the spawning prompt.

This separation is intentional and fundamental. Each subagent is a full, isolated agent runtime. It loads its own system prompt, initializes its own tool environment, and maintains its own conversation state. The parent does not block while the child runs. Instead, the parent continues executing its own work, or simply yields and waits for subagent completion messages to arrive asynchronously.

The gateway routes all communication between parent and child sessions. When you call sessions_spawn, the gateway registers the new session, assigns it a unique session ID, and begins executing the child’s task. The gateway also handles timeout enforcement, session lifecycle, and the delivery of completion announcements back to the parent.

Runtime Options

OpenClaw supports two subagent runtimes, each suited for different workloads:

  • runtime: "subagent" (default) – An OpenClaw-native subagent spawned within the same gateway instance. This is the standard choice for most workloads. The child runs the same agent model and tools as the parent unless overridden. Fast to start, minimal overhead, native integration with the gateway’s session management system.
  • runtime: "acp" – An ACP (Agent Client Protocol) harness subagent. This routes the task to an external coding or specialized agent – think Codex, Claude Code, or other ACP-compatible agents. Use this when the work needs a different toolset or model entirely, such as spawning a dedicated code-generation agent for a complex refactoring task or routing analysis to a model with specialized capabilities not available in the default agent runtime.

Modes

Each subagent operates in one of two modes, controlling the child session’s lifecycle:

  • mode: "run" – One-shot task execution. The subagent receives its task prompt, executes, produces output, and terminates automatically. This is the common case for parallel workers and pipeline stages. The subagent lives just long enough to do its job and then releases all resources. Use this when you need a specific piece of work done and the result returned.
  • mode: "session" – A persistent subagent session. The child remains alive as an independent session, potentially accepting ongoing conversation or further instructions. Use this for monitoring agents that watch dashboards continuously, support bots that field incoming questions from users, or any subagent that needs to maintain long-running state across multiple interactions. Persistent sessions consume resources until explicitly terminated, so use this mode deliberately.

Key Parameters

The sessions_spawn call accepts these essential parameters that control every aspect of subagent behavior:

  • task (required) – The prompt or instruction the subagent executes. For mode: "run", this is the complete task and the subagent’s only instruction. For mode: "session", this is the initial instruction and the subagent can receive more via the steer action. This parameter is the single most important factor in subagent success or failure – more on that in the isolation trap section below.
  • label – An optional human-readable identifier for tracking. Appears in logs, management output, and completion announcements. Use descriptive labels like "competitor-research-1" or "weekly-report-generator" instead of generic names. When you have 15 subagents running simultaneously, good labels are the difference between knowing what is happening and guessing.
  • runTimeoutSeconds – Maximum wall-clock time the subagent is allowed to run. If exceeded, the runtime terminates the child and reports a timeout. Critical for cost control – always set this. A missing or overly generous timeout is one of the easiest ways to lose control of token spending.
  • model – Override the default model for this subagent. A cheap model for simple filtering, an expensive reasoning model for complex analysis. Mix and match per task to optimize cost while maintaining quality where it matters.
  • lightContext – A lightweight bootstrap configuration. When set to true, the subagent starts with a minimal system prompt and context, reducing startup latency and token consumption for simple tasks. Ideal for high-volume, low-complexity subagent work where the full bootstrap would be wasteful.
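
The parameters above can be assembled programmatically. The following is a minimal sketch, not part of OpenClaw: build_spawn_payload is a hypothetical helper, and its validation rules (for example, refusing to build a payload without a timeout) are assumptions that simply encode the advice in this section. The field names are the ones used in this guide.

```python
import json

VALID_RUNTIMES = {"subagent", "acp"}
VALID_MODES = {"run", "session"}

def build_spawn_payload(task, label=None, runtime="subagent", mode="run",
                        run_timeout_seconds=None, model=None, light_context=False):
    """Hypothetical helper: assemble a sessions_spawn payload as a plain dict."""
    if not task or not task.strip():
        raise ValueError("task is required")
    if runtime not in VALID_RUNTIMES:
        raise ValueError(f"unknown runtime: {runtime}")
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode}")
    # Assumption: we treat a missing timeout as an error, per the advice above.
    if run_timeout_seconds is None or run_timeout_seconds <= 0:
        raise ValueError("always set a positive runTimeoutSeconds")

    payload = {
        "action": "sessions_spawn",
        "runtime": runtime,
        "mode": mode,
        "task": task,
        "runTimeoutSeconds": run_timeout_seconds,
    }
    # Optional fields are only emitted when they are actually set.
    if label:
        payload["label"] = label
    if model:
        payload["model"] = model
    if light_context:
        payload["lightContext"] = True
    return payload

payload = build_spawn_payload(
    task="Summarize the attached text in 3 bullet points.",
    label="summary-generator",
    run_timeout_seconds=30,
    light_context=True,
)
print(json.dumps(payload, indent=2))
```

Keeping the optional keys out of the payload unless they are set means the runtime defaults described above still apply.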

Spawning Your First Subagent: Complete Example

Here is a minimal, working subagent spawn. The parent spawns a child to research a market competitor, and the child’s result comes back to the parent when the run completes.

{
  "action": "sessions_spawn",
  "runtime": "subagent",
  "mode": "run",
  "label": "competitor-research-1",
  "task": "Research the company LangChain Inc. Find their latest funding round, primary product offerings, target market, and any recent press releases. Format the output as a structured brief with sections for Company Overview, Products, Funding, and Recent News.",
  "runTimeoutSeconds": 120
}

In practice, this call would be made from within an agent’s tool call – typically as a JSON block or via a subagent-spawning function in the agent’s toolset. The parent sends the JSON to the gateway, the gateway spawns the new session, and control returns to the parent immediately. The parent does not block. It receives the result later as a subagent_announce message.

A more advanced spawn with model override and light context:

{
  "action": "sessions_spawn",
  "runtime": "subagent",
  "mode": "run",
  "label": "summary-generator",
  "model": "claude-sonnet-4-20250514",
  "lightContext": true,
  "task": "Summarize the following text in exactly 3 bullet points. Keep each bullet under 20 words.\n\n[PASTE TEXT HERE]",
  "runTimeoutSeconds": 30
}

The model parameter lets you use a cheaper, faster model for simple summarization while reserving expensive reasoning models for complex analysis. The lightContext flag skips loading the full agent bootstrap, making startup near-instant for trivial tasks. Combined, these two parameters can reduce per-subagent overhead by 40-60 percent for simple workloads.

Push-Based Completion: How to Receive Results

OpenClaw uses a push-based completion model. When a subagent finishes its task – whether by completing successfully, timing out, or hitting an error – the runtime sends a subagent_announce message back to the parent session. The parent receives this message as a normal event and can process the results immediately.

The subagent_announce message includes these fields:

  • The subagent’s session ID (the unique identifier assigned at spawn time)
  • The label set at spawn time (your human-readable identifier)
  • The final output or error message from the subagent
  • The completion status (one of: success, timeout, error, killed)
  • Token usage (input tokens, output tokens, total) and wall-clock duration

Do not poll for results. This is the most common newcomer mistake with subagents. The architecture is designed around push from day one, so there is nothing to gain from repeatedly checking on a child. The parent agent should simply yield control and wait for the subagent_announce messages to arrive. Busy-polling wastes tokens, increases latency for other tasks, and creates race conditions in multi-subagent scenarios where subagents complete at unpredictable times.

When spawning multiple subagents, the parent can maintain a tracking structure – a simple dictionary mapping session IDs to expected work – and fill in results as each subagent_announce arrives. The parent knows it is done when all expected announcements have been received or a supervisor timeout has elapsed. This pattern scales cleanly from 2 subagents to 200.
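
The tracking structure described above might look like the sketch below. AnnounceTracker is a hypothetical class, and the message keys it reads (sessionId, status, output) are assumed names for the fields listed earlier, not a confirmed wire format.

```python
class AnnounceTracker:
    """Hypothetical parent-side bookkeeping for subagent_announce messages."""

    def __init__(self, expected):
        # expected: dict mapping session ID -> description of the expected work
        self.expected = dict(expected)
        self.results = {}

    def on_announce(self, message):
        # Assumed message keys: sessionId, status, output.
        session_id = message["sessionId"]
        if session_id not in self.expected:
            return False  # announcement from a session this parent did not spawn
        self.results[session_id] = {
            "status": message["status"],
            "output": message.get("output"),
        }
        return True

    def pending(self):
        # Session IDs we are still waiting on, in a stable order.
        return sorted(set(self.expected) - set(self.results))

    def done(self):
        return not self.pending()

    def succeeded(self):
        # Only the outputs of workers that reported success.
        return {sid: r["output"] for sid, r in self.results.items()
                if r["status"] == "success"}

tracker = AnnounceTracker({"s1": "source-1", "s2": "source-2"})
tracker.on_announce({"sessionId": "s2", "status": "success", "output": "themes: pricing"})
print(tracker.pending())
tracker.on_announce({"sessionId": "s1", "status": "timeout", "output": None})
print(tracker.done(), tracker.succeeded())
```

Announcements can arrive in any order, and a timeout still counts as an arrival, so done() becomes true even when some workers failed.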

Parallel Execution: Fan-Out Pattern

The fan-out pattern is the most powerful subagent use case. A parent spawns N subagents simultaneously, each working on an independent subtask. Total wall-clock time equals the slowest subagent, not the sum of all tasks. This is where subagents deliver their highest value: wall-clock time stops growing with N and is bounded by the slowest single task instead, provided the gateway has the capacity to run all N workers at once.

Consider a content analysis system that needs to analyze 20 different sources. A sequential approach would take 20x the per-source analysis time. With subagent fan-out, it takes roughly 1x the slowest analysis time.

[
  {
    "action": "sessions_spawn",
    "runtime": "subagent",
    "mode": "run",
    "label": "source-1",
    "task": "Analyze the following source for key themes, sentiment, and named entities. Output JSON with keys: themes, sentiment, entities.\n\nSOURCE: [source URL or text 1]",
    "runTimeoutSeconds": 60
  },
  {
    "action": "sessions_spawn",
    "runtime": "subagent",
    "mode": "run",
    "label": "source-2",
    "task": "Analyze the following source for key themes, sentiment, and named entities. Output JSON with keys: themes, sentiment, entities.\n\nSOURCE: [source URL or text 2]",
    "runTimeoutSeconds": 60
  }
]

The parent can fire all 20 spawn commands in a single batch. Each subagent runs in its own isolated session. The parent tracks incoming subagent_announce messages and merges results as they arrive. The total time is roughly the duration of the slowest single analysis, not 20 times the average.

For large fan-outs, the parent should stagger spawns slightly or use rate limiting if the subagents share resources – for example, if all 20 subagents are hitting the same external API. OpenClaw does not impose an artificial cap on subagent count, but your gateway’s available memory, thread pool, and token budget are real constraints. Test your gateway’s subagent capacity under load before relying on large fan-outs in production.
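
Writing 20 near-identical spawn blocks by hand is error-prone, so the payloads are usually generated. The sketch below builds one payload per source and groups them into batches for staggered spawning. The helper names, the batch size of 5, and the example.com URLs are all illustrative assumptions; actually sending each batch is left to whatever spawn mechanism your agent uses.

```python
TASK_TEMPLATE = (
    "Analyze the following source for key themes, sentiment, and named entities. "
    "Output JSON with keys: themes, sentiment, entities.\n\nSOURCE: {source}"
)

def fan_out_payloads(sources, timeout_seconds=60):
    """Build one sessions_spawn payload per source, labelled source-1..source-N."""
    return [
        {
            "action": "sessions_spawn",
            "runtime": "subagent",
            "mode": "run",
            "label": f"source-{index}",
            "task": TASK_TEMPLATE.format(source=source),
            "runTimeoutSeconds": timeout_seconds,
        }
        for index, source in enumerate(sources, start=1)
    ]

def in_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Hypothetical sources, for illustration only.
sources = [f"https://example.com/article-{n}" for n in range(1, 21)]
payloads = fan_out_payloads(sources)
batches = list(in_batches(payloads, 5))
print(len(payloads), "payloads in", len(batches), "batches")
```

Spawning one batch, pausing briefly, then spawning the next is one simple way to stagger load on a shared external API.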

The Isolation Trap: What Subagents Cannot See

This is the single most common subagent failure mode, and it can waste hours of debugging time if you do not understand it upfront.

Subagents have no access to the parent session’s context. None. Zero. The child session starts with a blank state. It does not inherit the parent’s conversation history, file handles, environment variables, API credentials, loaded tools, or current working directory. Everything the subagent needs must be explicitly included in the task parameter.

Common mistakes that new subagent users make:

  • Assuming the subagent can read files the parent has open – it cannot. Pass file contents directly in the task prompt, or include enough information that the subagent can locate and read files on its own.
  • Assuming the subagent inherits API keys or credentials from the parent’s environment – it does not. Include necessary authentication tokens in the task prompt or ensure the subagent’s tool configuration includes the required credentials.
  • Assuming the subagent shares the parent’s memory, conversation history, or session state – it does not. The child starts completely fresh, with no knowledge of anything that happened before its spawn moment.
  • Assuming variables defined in the parent’s session are available in the child – they are not. Any variable value needed by the subagent must be inlined into the task string.

The fix is simple but requires discipline: treat every subagent task as a self-contained instruction. If the subagent needs to write to a specific directory, include the absolute path in the task. If it needs an API key, make sure the subagent’s tool configuration provides it or, failing that, pass it in the task string. If it needs context from earlier in the parent’s conversation, include the relevant snippet as part of the instruction.

A good test before spawning any subagent: read the task prompt aloud. Would it make complete sense to someone who walked into the room with zero prior context? If the answer is no, the subagent will likely fail or produce incorrect results, and you will waste time debugging an issue that is not a bug – it is an isolation architecture working exactly as designed.
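
The same test can be partly automated. The sketch below is a hypothetical render_task helper that inlines values into a task template and refuses to produce a task that still contains an unfilled {name} slot or a bracketed stub such as [PASTE TEXT HERE]. The placeholder conventions are assumptions chosen for this example.

```python
import re

BRACE_SLOT = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")
BRACKET_STUB = re.compile(r"\[[A-Z][A-Z ]+\]")

def render_task(template, **values):
    """Inline every value into the task string; fail if anything is left unresolved."""
    # Check the template, not the rendered text, so inlined content
    # that happens to contain braces or brackets is never flagged.
    missing = [name for name in BRACE_SLOT.findall(template) if name not in values]
    stubs = BRACKET_STUB.findall(template)
    if missing or stubs:
        raise ValueError(f"task is not self-contained: missing={missing}, stubs={stubs}")
    return BRACE_SLOT.sub(lambda match: str(values[match.group(1)]), template)

task = render_task(
    "Write the report to {output_dir}/report.md. "
    "Context from the earlier discussion: {context}",
    output_dir="/srv/reports/weekly",
    context="The client wants a one-page summary.",
)
print(task)
```

A check like this catches only mechanical gaps. Whether the prose itself makes sense without prior context still needs the read-aloud test.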

Managing Subagents: List, Steer, and Kill

OpenClaw provides a subagents() tool for runtime management of active child sessions. This tool accepts an action parameter that controls the operation. These management operations give you visibility and control over the subagent lifecycle without needing to restart or modify configurations.

List Active Subagents

subagents(action="list")

Returns a list of all active (running) subagents spawned by the current session. Each entry includes the session ID, label, runtime mode, elapsed time, and current status. Use this for inventory – know what is running at any moment, especially when debugging unexpected behavior or tracking down orphaned subagents that should have completed.

Steer an Active Subagent

subagents(action="steer", sessionId="abc123", message="Stop researching LangChain and switch to Anthropic")

Sends an intervention message to a running subagent. The child receives the message in its context and can respond or adjust its behavior. Use steer to correct course mid-execution, provide additional context that was not available at spawn time, or redirect work without killing and respawning. This is especially useful for long-running mode: "session" subagents where the task may need ongoing guidance.

Kill a Runaway Subagent

subagents(action="kill", sessionId="abc123")

Terminates a subagent immediately. Any uncompleted work is lost. Use this for cost control: if a subagent is stuck in a loop, producing garbage output, or exceeding its expected runtime, kill it. Always set runTimeoutSeconds as an automatic safety net, but kill is the manual override when you spot a problem before the timeout fires. Kill early, kill often – a runaway subagent is the fastest way to burn through a token budget.
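
Deciding what to kill can be reduced to a small rule over the list output. The sketch below assumes each list entry is a dict with sessionId, label, and elapsedSeconds keys. Those key names are guesses based on the fields described above, and kill_candidates is a hypothetical helper rather than an OpenClaw function.

```python
def kill_candidates(entries, budgets, default_budget=60):
    """Return session IDs whose elapsed time exceeds the budget for their label.

    entries: list of dicts with assumed keys sessionId, label, elapsedSeconds.
    budgets: dict mapping label -> expected maximum runtime in seconds.
    """
    return [
        entry["sessionId"]
        for entry in entries
        if entry["elapsedSeconds"] > budgets.get(entry["label"], default_budget)
    ]

# Illustrative snapshot of two running subagents.
entries = [
    {"sessionId": "abc123", "label": "competitor-research-1", "elapsedSeconds": 200},
    {"sessionId": "def456", "label": "source-1", "elapsedSeconds": 20},
]
to_kill = kill_candidates(entries, budgets={"competitor-research-1": 120})
print(to_kill)
```

Each returned ID would then be passed to the kill action shown above.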

Three Orchestration Patterns

Most subagent workflows fall into three patterns. Each serves a different need, and understanding which pattern fits your use case is the key to building reliable multi-agent systems. These patterns correspond to the same categories used in the broader multi-agent orchestration literature, adapted for OpenClaw’s subagent architecture.

Fan-Out (Parallel Workers)

One parent spawns N independent workers. Workers never communicate with each other. The parent collects results as they arrive, typically merging them into a combined output.

Use when: Tasks are truly independent – analyzing different sources, processing different files, querying different APIs. The work partitions cleanly into disjoint units that do not need intermediate coordination.

Pitfall: If the aggregation logic is complex – merging, deduplicating, ordering, or resolving conflicts between worker outputs – the parent can become a bottleneck. Design the merge step carefully and test it with edge cases like empty results from some workers.

Pipeline (Sequential Stages)

Subagent A produces output consumed by Subagent B, which feeds Subagent C. The parent orchestrates the chain: spawn A, wait for its completion announcement, spawn B with A’s output included in B’s task prompt, then spawn C with B’s output.

Use when: Each stage genuinely depends on the previous stage’s output. Data extraction feeds entity recognition, which feeds summarization. A document processing pipeline where raw text goes to classification, classification goes to structured extraction, and extraction goes to database insertion is the classic pipeline use case.

Pitfall: Total latency equals the sum of all stages. Not suitable for real-time workloads where end-to-end response time matters. A failure in any stage cascades through the entire chain – recovery requires replaying from the last known-good stage, not just retrying the failed stage in isolation.
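
The chaining logic can be sketched independently of the runtime. In the example below, run_stage is a stand-in for "spawn a subagent and wait for its subagent_announce". It is injected as a function so the orchestration can be exercised with a fake. All names here are hypothetical.

```python
def run_pipeline(stages, initial_input, run_stage):
    """Run stages in order, feeding each stage's output into the next task prompt.

    stages: list of (label, instruction) pairs.
    run_stage: callable (label, task) -> (status, output); a stand-in for
               spawning a subagent and waiting for its completion announcement.
    """
    data = initial_input
    completed = []
    for label, instruction in stages:
        # The previous output is inlined, because the child sees nothing else.
        task = f"{instruction}\n\nINPUT:\n{data}"
        status, output = run_stage(label, task)
        if status != "success":
            # Keep the last known-good data so a replay can resume from here.
            return {"ok": False, "failed_stage": label,
                    "completed": completed, "last_good": data}
        completed.append(label)
        data = output
    return {"ok": True, "completed": completed, "output": data}

def fake_run_stage(label, task):
    # Simulated worker: the summarize stage always times out.
    if label == "summarize":
        return ("timeout", None)
    return ("success", f"<{label} output>")

result = run_pipeline(
    [("extract", "Extract the raw text."),
     ("classify", "Classify the document."),
     ("summarize", "Summarize the classified document.")],
    "raw document bytes",
    fake_run_stage,
)
print(result)
```

Returning the last known-good data alongside the failed stage label is what makes replaying from that stage possible.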

Supervisor (Monitor and Recover)

A parent subagent (the supervisor) spawns child workers and actively monitors their health. If a worker fails or times out, the supervisor respawns it. The supervisor can also aggregate partial results from surviving workers when some fail, preventing the loss of completed work.

Use when: Reliability is critical and the cost of failure is high. Batch processing, ETL pipelines, data migration jobs, and any workload where you cannot afford a single point of failure. The supervisor pattern is the most robust but also the most complex to implement correctly.

Pitfall: The supervisor itself can become a source of complexity. Design error handling before success handling. Decide ahead of time: when do you abort the entire batch versus continue with partial results? What is the retry policy – retry immediately, back off, or skip? Does the supervisor need to detect duplicate results from retries? These questions must be answered in the supervisor’s task prompt, not discovered in production.
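
One way to force those decisions early is to write the policy down as code before writing the supervisor prompt. The sketch below is deliberately sequential and covers only the retry and abort policy. run_worker is a hypothetical stand-in for spawning a worker and receiving its announcement, and the thresholds are arbitrary example values.

```python
def supervise(tasks, run_worker, max_attempts=3, min_success_ratio=0.8):
    """Retry failed workers, then decide whether partial results are acceptable.

    tasks: dict mapping label -> task prompt.
    run_worker: callable (label, task, attempt) -> (status, output).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    results, failures = {}, {}
    for label, task in tasks.items():
        for attempt in range(1, max_attempts + 1):
            status, output = run_worker(label, task, attempt)
            if status == "success":
                # Keyed by label, so a retry can never produce a duplicate entry.
                results[label] = output
                break
        else:
            # Every attempt failed; record the last status seen.
            failures[label] = status
    ratio = len(results) / len(tasks) if tasks else 1.0
    return {"results": results, "failures": failures,
            "abort": ratio < min_success_ratio}

def flaky_worker(label, task, attempt):
    # Simulated behaviour: "b" fails once and then succeeds, "c" always times out.
    if label == "c":
        return ("timeout", None)
    if label == "b" and attempt == 1:
        return ("error", None)
    return ("success", f"{label}-done")

outcome = supervise({"a": "task a", "b": "task b", "c": "task c"}, flaky_worker)
print(outcome)
```

Here two of three workers succeed, which falls below the 0.8 threshold, so the batch is flagged for abort while the completed results remain available.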

Cost Management: Subagents Are Not Free

Every subagent is a separate model session with its own token budget. A fan-out of 10 subagents pays the fixed per-session costs 10 times, where a single agent doing the same work sequentially pays them once. This is not a bug – it is the price of parallelism, and it must be accounted for in your cost model.

Consider the arithmetic: a sequential agent processing 10 items pays for one context window, one system prompt load, and one set of tool overheads. Ten parallel subagents each pay for their own context window, their own system prompt, and their own tool environment, plus the overhead of spawning and session management. The system prompt and bootstrap overhead – often 1,000 to 3,000 tokens per session – is multiplied by the subagent count. For 10 subagents, that is 10,000 to 30,000 tokens of pure overhead before any real work begins.
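
The overhead arithmetic is easy to reproduce for your own numbers. In the sketch below, the 1,000 and 3,000 token figures are the range quoted above. The 2,000 overhead and 6,000 work tokens used for the share calculation are made-up inputs for illustration.

```python
def bootstrap_overhead(subagents, per_session_tokens):
    """Tokens spent on system prompt and bootstrap across all subagents."""
    return subagents * per_session_tokens

def overhead_share(subagents, per_session_tokens, work_tokens_per_subagent):
    """Fraction of total tokens that is bootstrap overhead rather than real work."""
    if subagents < 1:
        raise ValueError("need at least one subagent")
    overhead = bootstrap_overhead(subagents, per_session_tokens)
    total = overhead + subagents * work_tokens_per_subagent
    return overhead / total

low = bootstrap_overhead(10, 1_000)
high = bootstrap_overhead(10, 3_000)
share = overhead_share(10, 2_000, 6_000)
print(low, high, share)
```

With these inputs the overhead range is 10,000 to 30,000 tokens, and a worker that spends 6,000 tokens on real work after a 2,000 token bootstrap is spending a quarter of its budget on overhead.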

When NOT to Use Subagents

Subagents are not the right tool for every job. Here are the cases where you should keep work in a single agent session:

  • Simple sequential tasks. If the work is light and fast – a single web fetch, a brief analysis, a short generation – a single agent is cheaper and simpler. Subagents add spawning overhead, session management overhead, and result routing overhead for no benefit.
  • Tightly coupled tasks. If workers need to share intermediate state, coordinate decisions, or reference each other’s outputs mid-execution, subagents create complexity that outweighs parallelism. Use a single agent with tool loops instead – a for loop over items in a single session is often the right pattern.
  • Cost-sensitive batch processing. If you are running on a tight token budget and wall-clock time is acceptable, test whether sequential execution is fast enough. Parallel wins on latency but loses on total cost, because every worker pays its own bootstrap and session overhead.
  • Rapid-fire tasks. Spawning a subagent for a task that takes 3 seconds and a few hundred tokens adds more overhead than it saves. Use subagents when individual task runtime is measured in tens of seconds or more, where the parallel speedup justifies the overhead.

Cost Optimization Strategies

When subagents are the right tool, these strategies keep costs under control:

  • Use cheaper models for simple subagents. Not every worker needs GPT-4 or Claude Opus. Route simple extraction or filtering to smaller, cheaper models via the model parameter. Reserve expensive reasoning models for the minority of tasks that genuinely need them.
  • Use lightContext for trivial tasks. Setting lightContext: true reduces bootstrap token overhead significantly – often by 50 percent or more for simple subagent tasks.
  • Set aggressive runTimeoutSeconds. A stuck subagent burning tokens indefinitely is the most expensive failure mode. Set timeouts aggressively. You can always respawn with a longer timeout if needed, but you cannot recover tokens burned in a runaway loop.
  • Batch small tasks into larger prompts. Sometimes it is cheaper to give one agent a list of 10 simple tasks than to spawn 10 subagents. Measure both approaches with your specific workload. The breakeven point depends on model pricing, task complexity, and parallelism requirements.
  • Monitor token usage per subagent. Track the total tokens consumed by child sessions. OpenClaw’s completion announcements include this data as part of the payload. Log it, track it per subagent label, and use the data to refine your cost model over time.
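
Per-label tracking can be a few lines of aggregation over the completion announcements. The sketch below assumes each announcement has been reduced to a dict with label and totalTokens keys. Those key names and the usage_by_label_group helper are illustrative, not the documented payload shape.

```python
from collections import defaultdict

def label_group(label):
    """Strip a trailing numeric suffix so source-1 and source-2 share a group."""
    base, _, suffix = label.rpartition("-")
    return base if base and suffix.isdigit() else label

def usage_by_label_group(announcements):
    """Sum token usage and run counts per label group."""
    totals = defaultdict(lambda: {"runs": 0, "total_tokens": 0})
    for message in announcements:
        group = totals[label_group(message["label"])]
        group["runs"] += 1
        group["total_tokens"] += message["totalTokens"]
    return dict(totals)

# Illustrative announcements, reduced to the two assumed fields.
announcements = [
    {"label": "source-1", "totalTokens": 1200},
    {"label": "source-2", "totalTokens": 1800},
    {"label": "summary-generator", "totalTokens": 400},
]
usage = usage_by_label_group(announcements)
print(usage)
```

Grouping by label prefix is one reason to keep the descriptive naming scheme recommended earlier: the labels double as cost-reporting keys.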

