OpenClaw 2026-4-24: Voice Calls, DeepSeek V4, and Browser Automation — What’s New and What It Means
Opening
On April 24, 2026, the OpenClaw project shipped its 2026-4-24 stable release, and it is the most consequential update in the platform’s history for two opposing reasons. On one side, this release adds three transformative capabilities: native voice calling, DeepSeek V4 model integration, and computer-use-style browser automation. On the other, those same capabilities expand OpenClaw’s attack surface at a moment when the project is already under siege. The same week saw a coordinated trojan horse campaign compromise approximately 28,000 OpenClaw nodes, and eight critical CVEs were patched in this exact release. OpenClaw now has over 3 million active installs. Every new feature is a new vector for an ecosystem that attackers have already proven they will exploit. This article covers every major feature in the 2026-4-24 release, the security implications of each, the cost arithmetic behind DeepSeek V4 adoption, and whether you should upgrade today.
What’s New in 2026-4-24
The 2026-4-24 release is dense. Here is what shipped, feature by feature.
Voice Calls
OpenClaw agents can now initiate and receive phone calls natively. The voice call plugin uses Twilio for telephony transport and supports two modes: a standard interactive voice response loop and a “realtime brain” mode that connects calls through Gemini Live for low-latency conversational responses. The architecture is plugin-based, meaning voice call capability can be enabled or disabled per-gateway. The release also includes a `voicecall smoke` command that dry-runs provider readiness before placing a live call, and the `openclaw_agent_consult` tool lets phone calls hand off to the full OpenClaw agent for deep, tool-backed answers.
Google Meet also ships as a bundled participant plugin with Chrome and Twilio realtime transports, paired-node Chrome support, and artifact/attendance export workflows. The Control UI gained browser WebRTC realtime voice sessions backed by OpenAI Realtime with Gateway-minted ephemeral client secrets.
DeepSeek V4 Integration
DeepSeek V4 Flash and V4 Pro are now in the bundled model catalog, and V4 Flash is the onboarding default for new OpenClaw installs. This is significant because it means every new user who runs `openclaw setup` will default to a self-hostable, open-weight model instead of an API-gated provider. The integration supports DeepSeek’s thinking and replay behavior across tool-call turns, which was broken in earlier beta builds. For existing operators, switching a workflow from Anthropic or OpenAI to a self-hosted DeepSeek V4 instance is a configuration change, not a code change.
Browser Automation (Computer-Use Style)
OpenClaw agents can now browse the web through a managed Chrome instance with viewport coordinate clicks, longer default action budgets, per-profile headless overrides, and tab reuse and recovery. The feature is comparable to Anthropic’s computer-use capability, but self-hosted. An operator can launch `openclaw browser start --headless` as a one-shot override, and agents can inspect page state through CDP-native role snapshots with iframe-aware refs and cursor-clickable detection. The browser doctor tool now has a `--deep` probe mode for slow hosts like Raspberry Pi.
Gateway Pairing Fixes
The gateway pairing system got a targeted improvement: `gateway.nodes.pairing.autoApproveCidrs`, disabled by default, allows first-time node pairing from explicit trusted CIDRs. All upgrade flows and operator/browser pairing remain manual. This directly addresses a pain point for operators managing fleets of nodes on known internal networks.
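To make the behavior concrete, a CIDR allowlist check of this kind can be sketched in a few lines of Python. This is an illustration only, not OpenClaw’s implementation: the function name, the sample ranges, and the choice to fail closed on an empty list are assumptions. Only the setting name `gateway.nodes.pairing.autoApproveCidrs` comes from the release.

```python
import ipaddress

# Hypothetical sample value for gateway.nodes.pairing.autoApproveCidrs.
# The real setting is disabled by default; these ranges are made up.
AUTO_APPROVE_CIDRS = ["10.20.0.0/16", "192.168.50.0/24"]


def is_auto_approvable(remote_ip: str, cidrs: list[str]) -> bool:
    """Return True only if remote_ip falls inside one of the trusted CIDRs."""
    if not cidrs:
        # An empty list means the feature is off: nothing is auto-approved.
        return False
    addr = ipaddress.ip_address(remote_ip)
    for cidr in cidrs:
        # strict=True rejects sloppy entries such as "10.20.3.4/16".
        network = ipaddress.ip_network(cidr, strict=True)
        if addr.version == network.version and addr in network:
            return True
    return False


if __name__ == "__main__":
    for ip in ("10.20.3.4", "203.0.113.7"):
        print(ip, is_auto_approvable(ip, AUTO_APPROVE_CIDRS))
```

The point of the sketch is the failure mode: a range that is too broad, or one that overlaps an untrusted network segment, silently turns manual pairing into automatic pairing for every host in that range.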
Other Improvements
The plugin registry migrated to a cold persisted model that cuts startup time. OpenTelemetry coverage expanded across model calls, token usage, tool loops, harness runs, and exec processes. The Codex harness is now integrated with OpenClaw’s context engine. The agent tool access panel in the Control UI was redesigned with compact live-tool chips and direct per-tool toggles. Finally, the eight critical CVEs referenced above, spanning authentication bypass, remote code execution, privilege escalation, and sandbox escape, are all fixed.
The Voice Call Surface
Native voice support changes what an OpenClaw agent can reach. Phone calls are not chat messages. They operate in real time, over PSTN or VoIP, and they carry social engineering weight that text messages do not.
What voice enables for operators.
The obvious use cases are practical. A customer support agent can call a client when a system alert triggers. A DevOps agent can escalate an incident by phone when a Slack message goes unread. A personal assistant agent can place reminder calls, make restaurant reservations, or confirm appointments. The Google Meet plugin extends this into meeting participation: an agent can join a video call, listen, and respond with tool-backed answers drawn from the full agent stack.
The security considerations are not hypothetical.
Every voice call is a bidirectional channel. An agent that can place calls can also receive them. If an attacker compromises a gateway’s voice plugin — or exploits the WebRTC realtime voice session — they gain an audio channel into the agent’s environment. Voice phishing through a compromised agent is a credible attack: an attacker could call a target, spoof the agent’s voice persona, and extract information the agent has access to.
The spoofing surface is also larger than it may appear. OpenClaw’s TTS system now supports eleven providers including Azure Speech, ElevenLabs v3, and Xiaomi. ElevenLabs voice cloning is known to produce convincing replicas with short audio samples. If an attacker gains write access to an agent’s TTS configuration, they could impersonate a trusted contact over a phone call placed by the agent itself.
The realtime brain mode adds another dimension. Voice calls that use Gemini Live for low-latency responses go through Google’s infrastructure. The WebSocket endpoint is owner-auth gated, but the exposure surface for misconfiguration is the same as any WebSocket-facing capability: if authentication is improperly scoped, an external caller could connect to the agent’s voice session without authorization.
Operators deploying voice should consider these mitigations. Restrict the voice call plugin to outbound-only mode if inbound calls are not needed. Audit TTS provider configurations regularly. Run the `voicecall smoke` dry run before enabling live calling. And never deploy voice on a gateway that also has mirror mode or debugging enabled, since CVE-2026-41355 (mirror mode code injection) could allow an attacker who compromises the gateway to inject malicious audio commands.
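These mitigations can be turned into a preflight check that runs before voice is switched on. The sketch below is a minimal illustration under stated assumptions: the configuration is modeled as a plain dictionary, and every key name (`voice`, `inbound_enabled`, `inbound_needed`, `smoke_passed`, `mirror_mode`, `debug`) is hypothetical rather than taken from OpenClaw’s actual config schema.

```python
def voice_preflight(config: dict) -> list[str]:
    """Return policy violations; an empty list means voice may be enabled.

    All key names are hypothetical stand-ins for a real gateway config.
    """
    problems: list[str] = []
    voice = config.get("voice", {})
    if not voice.get("enabled", False):
        return problems  # voice is off, so there is nothing to check

    # Voice should not share a gateway with mirror mode or debugging.
    if config.get("mirror_mode", False):
        problems.append("mirror mode is enabled on a voice gateway")
    if config.get("debug", False):
        problems.append("debugging is enabled on a voice gateway")

    # Prefer outbound-only unless inbound calls are explicitly needed.
    if voice.get("inbound_enabled", False) and not voice.get("inbound_needed", False):
        problems.append("inbound calls are enabled but not needed")

    # Require a recorded dry run before live calling.
    if not voice.get("smoke_passed", False):
        problems.append("no passing dry run recorded")
    return problems


if __name__ == "__main__":
    risky = {"mirror_mode": True, "voice": {"enabled": True, "inbound_enabled": True}}
    for problem in voice_preflight(risky):
        print("BLOCK:", problem)
```

A check like this is only as good as the data fed into it, so the `smoke_passed` flag would need to be written by whatever process actually runs the dry run, not set by hand.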
Browser Automation: Power and Risk
Browser automation is the highest-risk new feature in the 2026-4-24 release, and security researchers on r/OpenClaw flagged it the same day the release notes went live. The reason is straightforward: giving an agent the ability to browse the web, interact with pages, and execute actions in a real browser creates a substantially larger attack surface than text-only tool use.
What it does.
OpenClaw’s browser automation works through the Chrome DevTools Protocol (CDP). An agent can launch a managed Chrome instance, navigate to URLs, click elements (now at viewport coordinates), fill forms, extract page content, and manage tabs and sessions. The operator can override headless mode per profile, set action timeout budgets (default 60 seconds), and run the browser doctor tool for diagnostics.
This is architecturally comparable to Anthropic’s computer-use feature, which allows Claude to directly interact with desktop applications. The difference is that OpenClaw’s implementation is self-hosted and runs through CDP rather than screen coordinate analysis. The agent sees the browser’s accessible DOM and CDP events, not a screenshot or video stream.
Why it is the highest-risk new surface.
There are four reasons.
First, the browser is a full execution environment. When an agent navigates to a webpage, that page’s JavaScript runs in the browser context, and a malicious webpage could try to abuse that position to send commands back to the agent. CDP is designed for programmatic browser control, and although OpenClaw’s CDP connection is controlled by the gateway, a crafted page could still attempt cross-origin attacks, trigger browser exploits, or serve content designed to manipulate the agent’s decision loop.
Second, the agent’s browser session inherits the agent’s credentials. If the agent is logged into a service (GitHub, Gmail, a corporate dashboard), the browser session carries those cookies and tokens. An attacker who redirects the agent to a malicious page that successfully reads cookies or triggers an authenticated action could exfiltrate those credentials.
Third, the 2026-4-24 release also patched CVE-2026-41349 (agentic consent bypass via config patch) and CVE-2026-41352 (node scope gate RCE). Had either vulnerability remained unpatched, an attacker who chained it with browser automation could have escaped the browser sandbox entirely and executed arbitrary commands on the host. Both vulnerabilities are now fixed, but the existence of the browser automation feature means that any future sandbox vulnerability in the CDP layer or the browser automation plugin has a direct path to host-level access.
Fourth, automated browsing changes the detection profile. An agent running automated browser sessions will leave logs, cookies, and session artifacts. It will trigger rate limits and CAPTCHAs. It will be detected by anti-bot systems. If an operator has not configured appropriate rate limiting, credential scoping, and session isolation, the agent could expose the operator’s infrastructure to website operators and threat intelligence platforms.
Comparison to Anthropic’s computer-use.
Anthropic’s computer-use feature analyzes screenshots and performs actions through mouse and keyboard events. OpenClaw’s approach is different: it uses CDP directly, which means it has structured access to the page’s DOM, network events, and browser state. This is both more powerful and more dangerous. CDP access means the agent can do things screen-based computer-use cannot, like reading network request details and directly dispatching CDP commands. It also means the CDP connection itself is a native attack surface that screen-based approaches do not expose.
Mitigations.
Operators deploying browser automation should:
- Run browser sessions in a dedicated, sandboxed environment (a separate container or VM).
- Never run the browser with the same credentials as the gateway.
- Use per-profile headless mode to prevent any visual feedback from being observable by attackers.
- Set explicit navigation allowlists if the agent only needs to access specific domains.
- Monitor CDP logs for unexpected cross-origin navigation.
- Consider whether browser automation is necessary at all. Many tasks that agents need to do on the web (API calls, webhook interactions, authenticated data retrieval) can be done better through direct tool calls without a browser.
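The navigation allowlist is the easiest of these mitigations to get subtly wrong, because substring or prefix matching on URLs is trivially bypassed. The following sketch shows one conservative way to check a URL before navigation. It is not OpenClaw’s API: the function name and the sample hosts are hypothetical, and it assumes exact hostname matching over HTTPS is acceptable for the deployment.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: exact hostnames only, no wildcards.
ALLOWED_HOSTS = {"docs.example.com", "status.example.com"}


def is_navigation_allowed(url: str, allowed_hosts: set[str]) -> bool:
    """Allow only https URLs whose parsed hostname is exactly allowlisted."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL: fail closed
    if parts.scheme != "https":
        return False  # rejects http:, javascript:, data:, file:, etc.
    # Compare the parsed hostname, never the raw string, so that
    # "docs.example.com.evil.test" and "docs.example.com@evil.test"
    # are not mistaken for the trusted host.
    host = (parts.hostname or "").lower()
    return host in allowed_hosts


if __name__ == "__main__":
    for candidate in (
        "https://docs.example.com/guide",
        "https://docs.example.com.evil.test/",
        "https://docs.example.com@evil.test/",
        "http://docs.example.com/",
    ):
        print(candidate, is_navigation_allowed(candidate, ALLOWED_HOSTS))
```

A check like this only covers the initial navigation. Redirects and in-page navigations would need the same test applied to each new URL, which is where monitoring CDP logs for cross-origin navigation comes in.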
DeepSeek V4 Cost Arbitrage
The DeepSeek V4 integration in 2026-4-24 is the most important financial decision OpenClaw operators will face this quarter. For the first time, switching from API-gated models to a self-hosted open-weight alternative is a configuration change, not a migration project. The question is whether it makes economic sense for your workload.
What you save.
DeepSeek V4 Flash and V4 Pro are open-weight models. You can host them yourself and pay no per-token API fees. GPT-5.5 costs $5.00 per million input tokens and $30.00 per million output tokens. Claude Opus 4.6 costs more. At scale, the API costs add up quickly.
Per the DeepSeek V4 analysis published by Red Rook AI, a self-hosted V4 Flash instance requires roughly four consumer GPUs or a single enterprise GPU (H100 or equivalent), depending on quantization. Cloud GPU rental runs approximately $1,500 to $3,000 per month depending on provider and region. For V4 Pro, the hardware requirements jump significantly: roughly eight enterprise GPUs or a cluster, at $8,000 to $15,000 per month.
At GPT-5.5 API pricing, 10 million output tokens per month cost $300. At 100 million output tokens, the bill is $3,000, which matches the upper end of the V4 Flash rental range. At 500 million output tokens, the API cost hits $15,000, the upper end of the V4 Pro range, which is where self-hosting V4 Pro becomes cost-competitive even at the highest rental estimate.
The crossover point is workload-dependent, but it exists. For any operator generating more than roughly 50 to 100 million output tokens per month in agentic workloads, the math favors self-hosting V4 Flash. These figures count output tokens only, so input-heavy workloads cross over sooner.
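The arithmetic behind these figures is simple enough to rerun against your own numbers. The sketch below uses only the prices quoted in this article ($30.00 per million GPT-5.5 output tokens, and the monthly rental ranges above). It deliberately ignores input tokens and operational overhead, which is an assumption you should revisit for your workload. The function names are illustrative.

```python
GPT55_OUTPUT_USD_PER_MILLION = 30.00  # price quoted in this article


def api_cost_usd(output_tokens_millions: float,
                 usd_per_million: float = GPT55_OUTPUT_USD_PER_MILLION) -> float:
    """Monthly API bill for a given output volume (input tokens ignored)."""
    return output_tokens_millions * usd_per_million


def break_even_millions(monthly_hosting_usd: float,
                        usd_per_million: float = GPT55_OUTPUT_USD_PER_MILLION) -> float:
    """Output volume (millions of tokens) at which API cost equals hosting cost."""
    return monthly_hosting_usd / usd_per_million


if __name__ == "__main__":
    for volume in (10, 100, 500):
        print(f"{volume}M output tokens -> ${api_cost_usd(volume):,.0f}/month via API")
    # Rental ranges quoted above: V4 Flash $1,500-$3,000, V4 Pro $8,000-$15,000.
    for label, rent in (("V4 Flash low", 1500), ("V4 Flash high", 3000),
                        ("V4 Pro low", 8000), ("V4 Pro high", 15000)):
        print(f"{label}: break-even at {break_even_millions(rent):.0f}M output tokens")
```

Under these assumptions the V4 Flash break-even falls between 50 and 100 million output tokens per month, and V4 Pro between roughly 267 and 500 million.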
Who should switch and who should not.
Switch if:
- Your workload is token-heavy (over 50M tokens/month).
- You have the hardware budget for cloud GPU rental or already own the GPUs.
- Your data does not have strict residency requirements that prevent self-hosting.
- You have the operational capacity to manage a model serving stack (vLLM, TensorRT-LLM, or equivalent).
- You want to eliminate per-token cost variability from your budget.
Do not switch if:
- You are under the token-volume crossover point and do not expect to cross it.
- Your data is subject to compliance requirements that mandate API-gated models with specific data processing agreements.
- You lack the infrastructure to self-host (no GPU hardware, no cloud budget, no DevOps support).
- You need guaranteed SLAs that a self-hosted deployment cannot match.
- Your workload requires reasoning capabilities on which V4 Pro trails, specifically MMLU-Pro (87.5% vs. Gemini 3.1 Pro at 91.0%) or GPQA Diamond (90.1% vs. Gemini at 94.3%).
For operators in the middle, comfortable with self-hosting but not at scale, V4 Flash is the pragmatic choice. It costs less to run, supports agentic tasks well (91.6% on LiveCodeBench, 79.0% on SWE-bench Verified), and can be upgraded to V4 Pro later if needed without any code change in OpenClaw.
Data residency and compliance.
DeepSeek V4 is a Chinese company’s open-weight model. The weights are publicly available and auditable. The model runs entirely on your hardware with no telemetry or backchannel to DeepSeek. But the provenance of the training data, the regulatory environment governing the model’s development, and the potential for export control changes are all real considerations. Organizations in defense, critical infrastructure, or regulated industries should have their legal and compliance teams review the model’s licensing and data provenance before production deployment.
OpenClaw’s integration architecture means you can also run a hybrid approach: route sensitive or complex tasks through a self-hosted V4 instance and keep non-sensitive, high-volume tasks on cheaper hosted models. The model routing configuration in OpenClaw is granular enough to support workload-specific model selection.
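As a rough illustration of what workload-specific selection means, the rule described above reduces to a small decision function. This sketch does not use OpenClaw’s routing configuration, whose syntax is not shown in this article. The `Task` fields and the two backend labels are hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical backend labels, not real OpenClaw model identifiers.
SELF_HOSTED = "self-hosted-deepseek-v4"
HOSTED = "hosted-api-model"


@dataclass(frozen=True)
class Task:
    name: str
    is_sensitive: bool  # touches data that must stay on your hardware
    is_complex: bool    # needs the stronger reasoning path


def choose_backend(task: Task) -> str:
    """Route sensitive or complex tasks to the self-hosted instance."""
    if task.is_sensitive or task.is_complex:
        return SELF_HOSTED
    return HOSTED


if __name__ == "__main__":
    tasks = [
        Task("summarize-customer-ticket", is_sensitive=True, is_complex=False),
        Task("classify-public-rss-item", is_sensitive=False, is_complex=False),
    ]
    for task in tasks:
        print(task.name, "->", choose_backend(task))
```

The important design choice is that sensitivity is evaluated first and independently of cost, so a cheap path is never selected for data that must not leave your infrastructure.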
Should You Update?
Yes, you should update. The 2026-4-24 release patches eight critical CVEs. If your gateway is exposed to the internet, as many of the 3 million installs are likely to be, you are vulnerable. CVE-2026-41342 alone allows an unauthenticated remote attacker to complete the node onboarding handshake without a valid bootstrap token. There is no scenario where staying on an earlier version is the safer choice.
But there are caveats.
If you run OpenClaw in a high-security environment (defense, critical infrastructure, or any deployment where an automated browser could cause real damage), consider isolating your gateway before enabling the new features. Specifically:
- Do not enable voice calls or browser automation on the same gateway that handles administrative functions. Use separate gateways for experimentation and production.
- If you deploy browser automation, run the browser in a sandboxed container or VM that has no network access to the gateway or your internal network. The CDP connection from the agent to the browser should be the only communication path.
- Audit your plugin configuration. The 2026-4-24 release adds multiple new bundled plugins (Google Meet, voice call, the cold registry infrastructure). Only enable the plugins you need. Every plugin is a potential vector.
- Review your pairing configuration. The new autoApproveCidrs setting is disabled by default and should remain disabled unless you have a specific fleet-management use case and a well-defined trusted network.
- Upgrade and patch all connected nodes. The eight CVEs cover vulnerabilities in the gateway, node scope, and privilege layers. Nodes connected to a patched gateway remain vulnerable if they are running older builds.
For everyone else — personal gateways, small teams, standard VPS deployments — the update is straightforward. The 2026-4-24 stable release is available from the OpenClaw GitHub releases page. Run your normal update procedure. The breaking changes are minor (the Pi-only embedded extension factory path was removed) and should not affect standard deployments.
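Whichever group you fall into, it is worth confirming after the upgrade that no connected node is still on an older build. The sketch below shows one way to flag stragglers from an inventory of node versions. It assumes dotted version strings in the style of the GitHub release tags (`v2026.4.24`, `v2026.4.25-beta.4`). The inventory format, node names, and function names are hypothetical, and how you collect the versions is left to your own tooling.

```python
PATCHED_RELEASE = (2026, 4, 24)


def is_patched(version: str) -> bool:
    """True if the version is the patched stable release or newer."""
    core, _, suffix = version.removeprefix("v").partition("-")
    numbers = tuple(int(part) for part in core.split("."))
    if numbers == PATCHED_RELEASE and suffix:
        # A pre-release of 2026.4.24 may predate the fixes; be conservative.
        return False
    return numbers >= PATCHED_RELEASE


def unpatched_nodes(inventory: dict[str, str]) -> list[str]:
    """Return the sorted names of nodes still running a pre-patch build."""
    return sorted(name for name, ver in inventory.items() if not is_patched(ver))


if __name__ == "__main__":
    # Hypothetical inventory: node name -> reported version string.
    fleet = {
        "edge-01": "v2026.4.24",
        "edge-02": "2026.4.23",
        "lab-pi": "2026.4.24-beta.2",
        "canary": "v2026.4.25-beta.4",
    }
    print("Needs upgrade:", unpatched_nodes(fleet))
```

Comparing versions as integer tuples rather than strings avoids the classic mistake of treating `2026.10.1` as older than `2026.4.24`.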
Sources
- OpenClaw 2026.4.24 GitHub Release Notes. https://github.com/openclaw/openclaw/releases/tag/v2026.4.24
- OpenClaw 2026.4.25 Pre-release Notes (TTS upgrades, browser safety improvements). https://github.com/openclaw/openclaw/releases/tag/v2026.4.25-beta.4
- “OpenClaw CVE Batch April 2026: Eight Critical Vulnerabilities Every Operator Needs to Know.” Red Rook AI, April 2026. https://redrook.ai/openclaw-cve-batch-april-2026/
- “DeepSeek V4 Pro and Flash: What Open-Weight Agentic AI Means for Enterprise Deployments.” Red Rook AI, April 26, 2026. https://redrook.ai/deepseek-v4-enterprise-agentic-2026/
- “OpenClaw trojan horse agent campaign compromises 28,000 nodes.” TechRadar, March 2026.
- “Trojan horse OpenClaw agents found on npm, 28,000 systems affected.” BleepingComputer, March 2026.
- r/OpenClaw community discussion, April 24-25, 2026. Reddit.
