OpenClaw uses your most expensive model for everything, even simple tasks

OpenClaw routes every request through whatever model you set as default. There is no built-in logic that says “this task is simple, use something cheaper.” That routing is yours to define. Most operators set a default model once during setup and never revisit it. If that model is a frontier model (Claude Sonnet, GPT‑4o, etc.), it is running everything: your heartbeats, your file reads, your status checks. This is how to fix that without breaking the tasks that actually need a capable model. The fix is routing, not replacement.

TL;DR: Your expensive model is running file reads, status checks, and heartbeats. None of those need it. The fix is routing specific task types away from it. Route heartbeats to a local model in config, add explicit routing rules for simple tasks in your agent prompt, and keep your capable model for the complex work. In typical setups, moving heartbeats and file reads to a local model cuts daily background spend by 40-60% with no quality loss on anything that matters.
Before you start: You will need access to your OpenClaw config file (openclaw.json) and your agent instruction files (SOUL.md, AGENTS.md). The blockquotes throughout this article are commands you paste directly into your OpenClaw chat. Your agent will run them and report back. You do not need to open a terminal or edit any files manually.

Why OpenClaw does not route by task complexity out of the box

OpenClaw’s design prioritises simplicity and flexibility over automated cost optimisation. The gateway does not know whether a task is “simple” or “complex”. It only knows which model is configured as default, and which models are available as fallbacks. That is intentional. If the gateway tried to guess task complexity, it would regularly break workflows.

This means the responsibility for routing sits with you, the operator. The good news: you have full control. The bad news: if you never set up routing rules, everything runs through your default model, regardless of whether that model is appropriate for the task.

The three routing layers available to you, from easiest to most granular:

  • Config-level routing: Set a different model for heartbeat tasks. This is a single config key (agents.defaults.heartbeatModel) that applies to all heartbeat calls. It does not affect any other task type.
  • Agent-level routing: Add routing rules to your agent prompt (SOUL.md or AGENTS.md). These are conditional instructions that tell your agent “for tasks of type X, use model Y.” They take effect immediately in the current session.
  • Per-task overrides: When you send a message, you can explicitly specify which model to use for that one task. This is manual and not scalable, but useful for edge cases.

This article covers the first two layers. Per-task overrides are covered in Cheap Claw.

Step 1: Find out what your agent actually runs

Before changing any routing, know what you are working with. Your task mix determines which routing rules make sense. A setup that runs mostly file reads and status checks needs different routing than one that runs mostly multi-step reasoning tasks.

Read my SOUL.md, AGENTS.md, and any active queue or task files. List the types of tasks I run regularly, grouped by complexity: simple (status checks, file reads, formatting, logging) vs complex (multi-tool chains, reasoning, code, anything where a wrong answer has consequences). Estimate the frequency of each task type per day.

Your agent will read your instruction files and give you a concrete list. The categories below are starting points. Your setup is what matters.

Step 2: Tasks that never needed a frontier model

These are safe to move to a cheaper or local model without any meaningful quality loss. If your list from Step 1 includes any of these, they are your first routing targets.

  • Heartbeat checks. The agent reads HEARTBEAT.md and decides if anything needs doing. This is a read-and-decide task with no external consequences. A local 8B model handles it at zero cost.
  • File reads and short summaries. Reading a file and extracting key points is a pattern-matching task, not a reasoning task. Local models handle it fine.
  • Status checks and health pings. “Is the gateway running? Is Ollama responding? Is the workspace accessible?” These are yes/no questions with clear criteria.
  • Log writing and archiving. Appending a line to a log file or moving old logs to an archive directory is a formatting and file operation, not a reasoning task.
  • Simple yes/no decisions with clear criteria. “Is this file larger than 1MB?” “Does this string match that pattern?” “Is today a weekday?”
  • Formatting and cleanup tasks. Reformatting JSON, cleaning up whitespace, converting markdown to HTML, etc.

These are the tasks burning money right now. A small local model handles all of them at zero API cost. If you do not have Ollama installed, a cheap API model like DeepSeek V3 handles them at a fraction of frontier model pricing.

Here is what the cost difference looks like in practice, using March 2026 pricing:

  • Heartbeat check (8,000‑token system prompt): Claude Sonnet 4: $0.024 per call. At every 5 minutes: $6.91/day. DeepSeek V3: $0.0022 per call, $0.63/day. Ollama llama3.1:8b: $0/day.
  • File read (2,000‑token file + 500‑token summary): Sonnet: $0.0075 per read. DeepSeek V3: $0.00068 per read. Ollama: $0.
  • Status check (500‑token query + 100‑token response): Sonnet: $0.0018 per check. DeepSeek V3: $0.00016 per check. Ollama: $0.

If your agent makes 50 file reads and 100 status checks per day, that is $0.375 + $0.18 = $0.555/day on Sonnet, versus $0.034 + $0.016 = $0.05/day on DeepSeek V3, versus $0 on Ollama. The difference compounds quickly.

Now add heartbeat costs: at every 5 minutes, Sonnet costs $6.91/day, DeepSeek V3 costs $0.63/day, Ollama costs $0/day. The total daily difference between a Sonnet‑default setup and an Ollama‑routed setup for these routine tasks is over $7. That is $210 per month, or $2,520 per year, just for background operations that do not need a frontier model.
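The arithmetic above is easy to reproduce yourself. Here is a quick sanity-check script using the article's per-call figures; treat the prices as assumptions that will drift as providers change rates:

```python
# Reproduce the daily cost figures from the article's per-call prices.
# Per-call costs below are the article's March 2026 estimates, not live pricing.
HEARTBEATS_PER_DAY = 24 * 60 // 5  # one heartbeat every 5 minutes = 288/day

per_call = {
    "sonnet":   {"heartbeat": 0.024,  "file_read": 0.0075,  "status": 0.0018},
    "deepseek": {"heartbeat": 0.0022, "file_read": 0.00068, "status": 0.00016},
    "ollama":   {"heartbeat": 0.0,    "file_read": 0.0,     "status": 0.0},
}

def daily_cost(model, file_reads=50, status_checks=100):
    """Daily background spend: heartbeats plus a typical task mix."""
    p = per_call[model]
    return (HEARTBEATS_PER_DAY * p["heartbeat"]
            + file_reads * p["file_read"]
            + status_checks * p["status"])

for m in per_call:
    print(f"{m}: ${daily_cost(m):.2f}/day")
```

Running this puts a Sonnet-default setup at roughly $7.47/day for background operations alone, versus about $0.68/day on DeepSeek V3 and $0 on Ollama. Swap in your own call counts to see your gap.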

Most operators do not realise how much they are spending on these tasks because the costs are diffuse and buried in the total bill. Routing makes them visible and eliminates them.

The goal is not to eliminate all API spend. It is to align spend with value. Pay for capability when you need it, not for routine operations.

Step 3: Change 1: Heartbeat routing (config change)

Heartbeats are a separate config key (agents.defaults.heartbeatModel). If this key is absent, heartbeats fall back to your default model. If you have Ollama running locally, setting this key is the easiest cost win available. The change requires a gateway restart to take effect, but does not require a new session.

Check if agents.defaults.heartbeatModel is set in my openclaw.json. If it is missing or set to a paid model, set it to my fastest local Ollama model (or deepseek/deepseek-chat if no Ollama). Restart the gateway, then confirm the next heartbeat fires without error.

Manual fallback: Open ~/.openclaw/openclaw.json. Find agents.defaults.heartbeatModel. If it is missing or set to a paid model, change it to "ollama/llama3.1:8b" (or whichever local model you have). Save the file and restart OpenClaw. On Linux (system service): sudo systemctl restart openclaw. On Linux (user service): systemctl --user restart openclaw. On macOS: stop and restart the process from your terminal.
Requires Ollama installed with at least one model pulled. If you do not have Ollama, set the heartbeat model to your cheapest API model (DeepSeek V3). Any routing away from a frontier model for heartbeat checks is a net win.
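For reference, the relevant slice of openclaw.json looks roughly like this. This is a sketch of the shape, not a complete config; verify the key names against your own file before editing:

```json
{
  "agents": {
    "defaults": {
      "model": "deepseek/deepseek-chat",
      "heartbeatModel": "ollama/llama3.1:8b"
    }
  }
}
```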

After making this change, watch your provider dashboard for 24 hours. If you had steady background spend even on idle days, it should drop noticeably. Heartbeat spend is almost always underestimated because it is invisible in the moment and only shows up as a diffuse daily cost.

Step 4: Change 2: Task-type routing (prompt change)

Per-task routing lives in your agent prompt, not in openclaw.json. This is a change to your SOUL.md or AGENTS.md routing rules. It takes effect immediately in the current session.

Open my SOUL.md or AGENTS.md. If there is a model routing section, find it. If there is no routing section yet, create one. Add explicit routing rules for the following task types, using my cheapest configured model or a local Ollama model: heartbeat checks, file reads, status checks, log writing, simple formatting. Keep my current primary model for everything else. Show me the full routing section before writing it.

Ask to see the changes before they are written. If your agent proceeds without showing you first, say: “Wait, show me the routing section before writing it.” Routing rules that are too broad will catch tasks you did not intend to move. The goal is to match the task type, not to guess at complexity.

Example routing rule format (from a production AGENTS.md):

## Model Routing Rules

**Default:** deepseek/deepseek-chat (ds-v3) for most tasks.
**Complex/nuanced/tool-heavy:** anthropic/claude-sonnet-4-6 (sonnet) when ds-v3 cannot handle it.
**Local quality:** ollama/phi4:latest for drafts, summaries, subagents.
**Local fast:** ollama/llama3.1:8b for heartbeats, file operations, status checks.

**Routing by task type:**
- Heartbeat checks → ollama/llama3.1:8b
- File reads, status checks, log writing → ollama/llama3.1:8b
- Simple formatting, cleanup, yes/no decisions → ollama/llama3.1:8b
- Multi-step reasoning, code, tool chains → deepseek/deepseek-chat
- When deepseek fails or task is clearly beyond it → sonnet

This structure gives you clear fallback ordering and task-type routing in one place. Your agent reads this and follows it.

Step 5: One thing to watch for with chained tasks

If you have a task that chains a file read into a complex output, keep the whole chain on one model tier. Mixing a slow cheap model for the read step with a fast frontier model for the output step creates a latency bottleneck. For chained workflows, pick the tier that fits the most demanding step in the chain and use it throughout.

Example: a task that reads a 10,000‑word document, summarises it, and then writes a report based on the summary. The read step is simple, the summarisation is moderate, the report writing is complex. If you route the read step to a local model and the report writing to a frontier model, you add latency between steps and risk context loss when switching models. Better to run the whole chain on a capable paid model (DeepSeek V3) rather than split it.
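The “most demanding step wins” rule is mechanical enough to write down. A sketch, using the model names and tier ordering from this article's examples (the numeric tiers are an assumption: higher means more capable and more expensive):

```python
# Pick one model for a whole chain: the tier of its most demanding step.
TIER = {
    "ollama/llama3.1:8b": 0,
    "deepseek/deepseek-chat": 1,
    "anthropic/claude-sonnet-4-6": 2,
}

def chain_model(step_models):
    """Return the single model to run every step of a chained task on."""
    return max(step_models, key=lambda m: TIER[m])

# read -> summarise -> report: the report step sets the tier for the chain
print(chain_model([
    "ollama/llama3.1:8b",       # file read (simple)
    "deepseek/deepseek-chat",   # summarise (moderate)
    "deepseek/deepseek-chat",   # write report (complex)
]))
```

The design choice here is deliberate: a chain never mixes tiers, so you trade a little per-step cost for no mid-chain latency or context loss.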

Review my task list from Step 1. Flag any tasks that involve multiple steps where different steps have different complexity requirements. For those tasks, recommend whether to keep the whole chain on one model tier or split them.

Step 6: How to verify the routing changes worked

After applying the two changes, confirm they took effect. Config changes do not always behave as expected, and routing rules can be overridden by other instructions.

Check 1: Confirm heartbeat model changed. Wait for a heartbeat cycle to fire (or trigger one manually), then check your provider dashboard. If the heartbeat is now routing to Ollama, you should see no new API calls during idle periods. If you still see calls, the heartbeat model setting did not apply, or the heartbeat is not configured to use the override.

Check 2: Test a simple task. Ask your agent to read a file and summarise it. Then ask: “What model did you just use for that?” It should report the local or cheap model you configured for file reads. If it reports your frontier model, the routing rule did not apply.

Read /home/node/.openclaw/workspace/SOUL.md and summarise it in three bullet points. After you are done, tell me which model you used for this task.

Check 3: Compare spend after 24 hours. Wait one full day after making the changes, then compare your provider dashboard spend to the same day the previous week. The reduction should be visible in the per-model breakdown (Anthropic: platform.anthropic.com → Usage → by model. OpenAI: platform.openai.com → Usage. DeepSeek: platform.deepseek.com → Billing → Usage). Heartbeat and idle spend will appear as a drop in your previous default model’s line. Task spend will shift to the cheaper model.

These two changes handle the obvious wins. Cheap Claw goes further: spend caps so autonomous agents stop at night, fallback chain ordering, and per-task overrides for edge cases your routing rules do not catch.

Complete fix

Cheap Claw

The complete cost reduction playbook. Every lever, ranked by impact. Drop it into your agent and it reads the guide and makes the changes. Operators report 60-80% spend reduction within a week.

Get it for $17 →

Common routing mistakes and how to avoid them

Routing is powerful but easy to get wrong. These are the mistakes that show up most frequently, and how to sidestep them.

Mistake 1: Over‑routing complex tasks

The most common error is routing a task that looks simple but actually requires nuance. Example: “Check if this email is spam.” That seems like a yes/no decision, but spam detection requires understanding context, sender reputation, and content patterns. A local model will get it wrong. The fix: test your routing rules on real examples before committing. Ask your agent to run the task on the cheap model, then evaluate whether the output is acceptable.

Mistake 2: Forgetting about cron jobs

Cron jobs (scheduled tasks that fire automatically at set intervals) run on whatever model is set in the cron entry, or the default if none is specified. If you set up routing rules in your prompt but forget to update cron job model fields, those jobs continue using the expensive default. The fix: audit all cron jobs after changing your default model. Ask your agent to list every cron job and its configured model, then update each one’s model field explicitly.

Mistake 3: Assuming local models are always available

If Ollama crashes or the local model is not loaded, OpenClaw falls back to the default model. This means your routing rules silently stop working. The fix: add a health check to your heartbeat or startup routine that verifies local models are responsive. If they are not, log an alert and temporarily route tasks to a cheap API model instead.
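A health check like this can be as small as one HTTP probe. Ollama serves its API on localhost:11434 by default; the sketch below assumes that default and nothing else about your setup:

```python
# Verify the local model server answers before trusting local routing rules.
# Ollama's API listens on http://localhost:11434 by default.
import urllib.request

def ollama_alive(url="http://localhost:11434", timeout=2):
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # connection refused, DNS failure, timeout
        return False

# If this returns False: log an alert and route simple tasks to a cheap
# API model until the local server is back.
```

Wire this into your heartbeat or startup routine so a dead Ollama shows up as an alert instead of a silent fallback to your expensive default.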

Mistake 4: Not accounting for subagent costs

When your main agent spawns a subagent (a separate agent instance it creates to handle a specific task), that subagent makes its own API calls. If your routing rules only apply to the main agent, subagents will use the expensive default. The fix: include subagent routing in your instruction files. Example: “When spawning a subagent for a simple task, pass the model override as part of the spawn command.”

Mistake 5: Routing by token count instead of consequence

It is tempting to route tasks based on how many tokens they use: “Anything under 1,000 tokens goes to the cheap model.” This fails because a 500‑token instruction to “transfer $100 to account X” has high consequence, while a 5,000‑token file read has low consequence. Route by what happens if the model gets it wrong, not by token volume.
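“Route by consequence” can be encoded directly in a routing helper. A hypothetical sketch: the low-consequence task types come from Step 2, and the side-effect flag is exactly the signal a token count cannot give you:

```python
# Route by what a wrong answer costs, not by how many tokens the task uses.
LOW_CONSEQUENCE = {"heartbeat", "file_read", "status_check", "log_write", "formatting"}

def pick_model(task_type, has_external_effects):
    # Anything that moves money, sends messages, or deletes data gets
    # the capable model regardless of how short the instruction is.
    if has_external_effects:
        return "anthropic/claude-sonnet-4-6"
    if task_type in LOW_CONSEQUENCE:
        return "ollama/llama3.1:8b"  # wrong answers are obvious and harmless
    return "deepseek/deepseek-chat"

# A 500-token transfer instruction still routes to the capable model:
print(pick_model("payment", has_external_effects=True))
```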

Mistake 6: Assuming all local models are equal

Ollama’s llama3.1:8b is fine for yes/no decisions and file reads, but struggles with tasks that require following multi‑step instructions or maintaining context across turns. Phi‑4 is better at instruction following but slower. Match the local model to the task type, not just “any local model.”

Real‑world examples of routing rules

Here are three actual routing configurations from different OpenClaw setups. Use them as templates, not as copy‑paste solutions.

Example 1: Personal assistant with Ollama

## Model Routing

**Default:** deepseek/deepseek-chat
**Fallback:** anthropic/claude-sonnet-4-6 (only when explicitly requested)
**Heartbeat:** ollama/llama3.1:8b
**Local tasks:** ollama/llama3.1:8b

**Routing rules:**
- Heartbeat checks → ollama/llama3.1:8b
- File reads, status checks, log writing → ollama/llama3.1:8b
- Simple formatting, cleanup, yes/no decisions → ollama/llama3.1:8b
- Everything else → deepseek/deepseek-chat
- If deepseek fails or task is clearly beyond it → ask me whether to use sonnet

Example 2: VPS deployment without local models

A VPS (Virtual Private Server) is a rented Linux machine in a data center with a public IP but typically no GPU, common for always‑on OpenClaw deployments where you can’t run Ollama. Without local models, all routing is between API models. The cost reduction comes from not using a frontier model for simple tasks.

## Model Routing

**Default:** deepseek/deepseek-chat
**Fallback:** anthropic/claude-sonnet-4-6
**Heartbeat:** deepseek/deepseek-chat (same as default, but separate config key)

**Routing rules:**
- Heartbeat checks → deepseek/deepseek-chat
- File reads, status checks, log writing → deepseek/deepseek-chat
- Simple formatting, cleanup → deepseek/deepseek-chat
- Multi‑step reasoning, code, tool chains → deepseek/deepseek-chat (same)
- When deepseek fails or task requires very long context → sonnet

**Note:** No local models available. All routing is between two API models. The cost reduction comes from not using sonnet for simple tasks.

Example 3: Mixed‑use with per‑cron routing

## Model Routing

**Default:** deepseek/deepseek-chat
**Fallback:** anthropic/claude-sonnet-4-6
**Heartbeat:** ollama/llama3.1:8b

**Cron job models:**
- Morning summary (8am daily) → ollama/phi4:latest
- Queue processor (every 30min) → ollama/llama3.1:8b
- Weekly research (Sunday 2am) → deepseek/deepseek-chat
- Memory cleanup (Sunday 3am) → ollama/llama3.1:8b

**Prompt routing rules:**
- File reads, status checks → ollama/llama3.1:8b
- Everything else → deepseek/deepseek-chat

These examples show the range of possibilities. Your routing should match your actual task mix and available models.

How to test routing without breaking anything

Before you commit routing changes to your production instruction files, test them in a controlled way. Here is a safe testing protocol.

Create a temporary test file at /workspace/test-routing.md with the proposed routing rules. Then, for each task type in my regular workflow, run one example task using those rules and report which model was used and whether the output was correct. Do not modify my actual SOUL.md or AGENTS.md until I approve the test results.

This gives you a sandbox. Your agent will use the test rules for the duration of that session, but your production files remain unchanged. Once you are satisfied, you can apply the changes permanently.

Testing checklist: For each task type, verify (1) the correct model was selected, (2) the output is acceptable, (3) latency is within expectations, and (4) there are no unexpected side effects (like missing context because the model switch dropped conversation history). Only after all four pass should you deploy the routing rules.

Routing is not a set‑and‑forget configuration. It is a living part of your OpenClaw setup that should evolve as your task mix changes. The test protocol above lets you iterate safely.

How to monitor routing effectiveness

After implementing routing, you need to know whether it is working. These metrics tell you more than your provider dashboard alone.

Metric 1: Spend per task type

Use the spend‑log.md approach from the audit article to track which tasks are using which models. After a week, calculate average cost per task type. If file reads are still costing frontier‑model prices, your routing rule is not firing.

Analyze /workspace/spend-log.md for the past 7 days. Group entries by task type and model used. Show me the top 5 task types by total estimated cost, and flag any where the model used is more expensive than necessary. If spend-log.md does not exist yet, tell me how to create it.
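If you prefer to run the grouping yourself, it is a few lines of code. The line format below (one `task_type,model,cost` entry per line) is an assumption for illustration; adapt the parsing to whatever structure your spend-log.md actually uses:

```python
# Group a spend log by (task type, model) and surface the costliest entries.
from collections import defaultdict

def top_spend(lines, n=5):
    totals = defaultdict(float)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comment lines
        task_type, model, cost = line.split(",")
        totals[(task_type, model)] += float(cost)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]

log = [
    "file_read,anthropic/claude-sonnet-4-6,0.0075",  # still on the frontier model!
    "file_read,anthropic/claude-sonnet-4-6,0.0075",
    "heartbeat,ollama/llama3.1:8b,0.0",
]
print(top_spend(log))  # file reads billing at frontier prices float to the top
```

Any (task type, model) pair near the top that is more expensive than necessary is a routing rule that is not firing.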

Metric 2: Model switch frequency

How often does your agent switch models within a single session? Frequent switching can indicate poorly defined routing boundaries or tasks that are borderline between simple and complex. Aim for stable routing. Most sessions should stay on one model tier for the majority of their work.

Metric 3: Error rate by model

Track how often tasks fail or produce unacceptable output when routed to a cheap model versus a capable model. If the error rate on the cheap model is high for a particular task type, that task should move back to the capable model. Error rate is more important than raw cost. A task that fails 30% of the time is not worth the savings.

Monitoring cadence: Check these metrics weekly for the first month after implementing routing, then monthly once stable. Routing is not a one‑time setup. It needs maintenance as your task mix evolves.

What to do when routing fails

Sometimes a routing rule will send a task to the wrong model, or a model will fail mid‑task. Here is how to recover without losing work or spending money on repeated attempts.

Symptom: The cheap model produces garbage output

If your local or cheap model returns nonsense for a task that should be simple, the problem is usually one of three things: (1) the model is not loaded or is out of memory, (2) the task is actually more complex than you thought, or (3) the prompt format does not match the model’s expectations. Fix: check Ollama status, test the same task on a capable model to see if it works there, and simplify the instruction if possible.

Check Ollama status: is the model loaded? If not, load it. Then retry the failed task on the cheap model with a simpler instruction. If it still fails, route that task type to a capable model and note it for later review.

Symptom: The model switch loses context

When you switch models mid‑conversation, the new model does not have the previous context unless you explicitly pass it. This can break chained tasks. Fix: either keep the whole chain on one model, or include a context summary when switching. Example: “Here is what we have done so far: [summary]. Now continue with the next step using model Y.”

Symptom: Routing rules are ignored

If your agent uses the wrong model despite your routing rules, check for conflicting instructions elsewhere. A protocol file might override the routing section, or a cron job has a hard‑coded model. Fix: ask your agent to show you all model‑related instructions in effect for the current session.

Show me all model routing instructions currently active in this session, including any from SOUL.md, AGENTS.md, cron job configs, and session‑level overrides. Highlight any conflicts.

Symptom: Spend did not drop after routing changes

If you made routing changes but your provider dashboard shows the same spend pattern, the changes did not take effect, or the tasks you routed were not the ones causing the spend. Fix: use the audit guide from article 1 to identify which tasks are actually costing money, then verify your routing rules cover those tasks.

Routing failures are normal during setup. The key is to detect them quickly and adjust. Do not assume your first routing configuration is perfect. Expect to iterate.

Once you have routing in place, your spend should align with task complexity. Simple tasks cost little or nothing. Complex tasks cost what they need to. The gap between those two is where most of the waste happens. Close it, and your bill becomes predictable.

The two changes in this guide (heartbeat config and prompt routing) close 80% of that gap for most operators. They are the highest‑leverage, lowest‑risk adjustments you can make. Start with them, then layer on the finer controls. The remaining 20% is fine‑tuning: cron job models, subagent routing, fallback chain ordering, and spend caps. Those are in Cheap Claw.

What routing cannot fix

Routing solves the problem of using an expensive model for simple tasks. It does not solve every cost problem. These are the limits of routing, and what to do about them.

Problem 1: Too many complex tasks

If your workload consists mostly of tasks that genuinely need a frontier model (research, code generation, complex reasoning), routing will not reduce spend much. The solution is either to accept the cost as necessary, or to redesign the tasks to be simpler (break them down, use caching, pre‑compute answers).

Problem 2: Unbounded autonomous operation

An agent that runs continuously without stopping will accumulate cost regardless of routing. Routing lowers the per‑task cost, but does not limit total volume. The solution is spend caps or time‑based shutdowns.

Problem 3: Plugin‑initiated API calls

Some plugins make their own LLM calls outside your agent’s routing rules. Memory extraction plugins are a common example. Routing rules in your prompt do not affect these calls. The solution is to configure the plugin’s LLM settings directly, or choose plugins that respect your routing configuration.

Problem 4: Context window bloat

If your system prompt is 20,000 tokens long, every API call pays for those tokens. Routing cannot reduce that overhead. The solution is prompt optimisation and caching.

Routing is one tool in the cost‑reduction toolbox. It is the most powerful tool for most setups, but not the only one. Use it alongside the other cost tools for full coverage.

Quick start checklist

If you want to implement routing now and refine later, follow this five‑step sequence. It gets you 80% of the benefit with minimal risk.

  1. Step 1: Set heartbeat model. If you have Ollama, set agents.defaults.heartbeatModel to ollama/llama3.1:8b. If not, set it to deepseek/deepseek-chat. Restart the gateway.
  2. Step 2: Change default model. Set agents.defaults.model to deepseek/deepseek-chat. Keep your frontier model as a fallback (the model OpenClaw uses if the primary fails). Restart the gateway, then start a fresh session (/new) to pick up the new default.
  3. Step 3: Add routing rules to your prompt. In SOUL.md or AGENTS.md, add a section that says: “For heartbeat checks, file reads, status checks, log writing, and simple formatting, use the cheapest available model (local if available, otherwise DeepSeek V3). For everything else, use DeepSeek V3. If DeepSeek fails or the task is clearly beyond it, use the frontier model.”
  4. Step 4: Test one task type. Ask your agent to read a file and summarise it. Verify it used the cheap model. If it did not, adjust the routing rule wording.
  5. Step 5: Monitor for a week. Check your provider dashboard after 7 days. Spend should be lower, especially during idle periods. If not, revisit Steps 1‑3.

This checklist assumes you have DeepSeek V3 configured. If you do not, substitute your cheapest available API model. The principle remains: route simple tasks away from your most expensive model.

I want to implement the quick start checklist now. Walk me through each step, showing me the exact config changes and prompt additions before applying them.

Note: This checklist is a starting point, not a complete solution. It addresses the most common cost leaks. For fine‑tuning (cron job models, subagent routing, spend caps), see Cheap Claw.

FAQ

Does changing the default model affect sessions that are already running?

No. OpenClaw caches the model setting when a session starts. Existing sessions continue with whatever model they started on. The new default only applies to sessions started after the config change and gateway restart. This is why the routing rules in your prompt take effect immediately. They are session-level instructions, not config-level changes.

What if I don’t have Ollama running locally?

Use a cheap API model instead. DeepSeek V3 costs a fraction of frontier model pricing and handles simple tasks well. Set it as your heartbeat model and default, and use sonnet or equivalent only when you explicitly need it. You do not need local models to see meaningful cost reduction. The key is routing, not the specific model you route to.

How do I know if a task is simple enough to route to a cheaper model?

Ask: what happens if the model gets it wrong? For a heartbeat ping or a file read, a wrong answer is immediately obvious and harmless. For a multi-step reasoning task or anything that triggers external actions, the cost of a failure is higher. Route by consequence, not by token count. If a wrong answer could cause data loss, financial cost, or significant rework, keep it on a capable model.

Will prompt caching help with these tasks too?

Prompt caching helps most when you have a long, stable system prompt that repeats across many calls. For heartbeats and file reads, the system prompt overhead is real but the bigger issue is that you are paying frontier prices at all. Fix the routing first, then layer caching on top for additional savings. Caching alone will not fix a routing problem.

What if my routing rules break a task that actually needed the frontier model?

Your routing rules should include a fallback clause: “If the cheap model fails or produces obviously wrong output, retry with the capable model.” In practice, this happens rarely for the simple task types listed in Step 2. For edge cases, you can add an explicit override in the task instruction: “Use sonnet for this one.” The goal is to route the routine tasks automatically, not to eliminate all use of capable models.

How often should I review and update my routing rules?

Every time you add a new type of task to your workflow. If you start running a new pipeline, cron job, or plugin that makes API calls, check whether it fits your existing routing categories. Also review after any major OpenClaw update that changes model behavior or adds new routing capabilities. For stable setups, a quarterly review is sufficient.

Can I route different cron jobs to different models?

Yes. Each cron job entry in openclaw.json can have an optional model field that overrides the default. This is the most precise way to route scheduled tasks. If you have a morning summary cron job that does not need a frontier model, set its model to ollama/llama3.1:8b. If you have a weekly research task that does need a capable model, set it to deepseek/deepseek-chat. This is more reliable than relying on prompt-based routing for cron jobs.
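A cron section with per-job model overrides might look roughly like this. It is a sketch: the exact field names and schedule syntax in your openclaw.json may differ, so check an existing entry before editing:

```json
{
  "cron": [
    {
      "name": "morning-summary",
      "schedule": "0 8 * * *",
      "model": "ollama/phi4:latest"
    },
    {
      "name": "weekly-research",
      "schedule": "0 2 * * 0",
      "model": "deepseek/deepseek-chat"
    }
  ]
}
```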

What is the difference between heartbeat model and default model?

The heartbeat model is used exclusively for heartbeat polls: the scheduled checks that read HEARTBEAT.md. The default model is used for everything else unless overridden by routing rules or per-task instructions. They are separate config keys. You can set heartbeat to a local model and default to a cheap API model, giving you two layers of routing before you even touch prompt-based rules.

Does routing affect subagents spawned by my main agent?

Yes. When your main agent spawns a subagent, the subagent inherits the model routing rules from the parent session unless you specify otherwise. If your main agent is using a cheap model for simple tasks, and it spawns a subagent to handle a simple task, the subagent will also use the cheap model. This is generally what you want. Subagents should match the complexity of the work they are doing.

Can I route based on time of day?

Yes, but not directly in OpenClaw config. You can add a time‑based rule to your agent prompt: “After 10pm local time, route all non‑urgent tasks to the local model and postpone anything that requires a paid model until morning.” This is a soft routing rule. The agent follows it if it can, but there is no hard enforcement. For hard time‑based routing, use cron jobs with explicit model fields that fire at specific times.

What if my cheap model is slower than my frontier model?

That is expected. Local models are slower than API models, and cheap API models are slower than frontier models. The trade‑off is cost versus speed. For tasks where latency matters (interactive conversations, real‑time monitoring), you can choose to keep them on a faster model even if they are simple. For background tasks (heartbeats, log writing, scheduled summaries), speed does not matter. Cost does.

Go deeper

I woke up to a $300 OpenClaw bill and had no idea what caused it

Find the source before changing anything. Which model, which tasks, and how to build a log that shows what is actually running.

Read →

I turned on prompt caching and my bill didn’t change at all

Caching only helps in specific situations. When it works, when it does nothing, and when it makes costs worse.

Read →

Setting spend limits so your agent stops at night

Agents that run while you sleep can rack up bills with no kill switch. How to build one.

Read →