The AI Agent Security Threat Landscape: From OpenClaw CVEs to Bissa Scanner Exploitation

The first major offensive campaign against an AI agent platform is no longer hypothetical. In March 2026, a coordinated attack compromised approximately 28,000 OpenClaw nodes through malicious plugin distribution. In April, security researchers disclosed eight critical vulnerabilities in the OpenClaw gateway and node stack. And a threat intelligence group known as Bissa Labs released a scanner that actively probes for exposed OpenClaw and Claude Code deployments.

These events mark a transition. AI agents have moved from experimental infrastructure to production attack surface. The question is no longer whether agent frameworks will be targeted, but how to defend them at scale.

Why AI Agent Security Is Different

AI agents are not web applications. They are not APIs. They are not serverless functions. The attack surface of an agent framework combines the exposure of a network service, the autonomy of an endpoint, and the trust model of a supply chain distribution system. Each property creates exposure that traditional security tools were not designed to detect.

An agent operates with persistent memory, tool access (filesystem, shell, network, browser), and the ability to act on instructions within a configurable consent model. Compromising an agent is not the same as compromising a web application. A compromised agent can read its memory (which may contain API keys, session tokens, and private communications), execute shell commands on its host, make authenticated network calls to internal services, and install additional plugins that expand its reach.

The security model for traditional software assumes static code, defined network boundaries, and human-in-the-loop authorization for privileged actions. Agents violate all three assumptions. Their code is dynamically generated by language models. Their network boundaries are defined by tool access policies that can be modified at runtime. And their authorization model relies on consent gates that have already been demonstrated to fail under real attack conditions.

The Three Attack Vectors

Every major incident involving AI agents in the first half of 2026 falls into one of three categories. Understanding them separately is necessary because they require different defenses.

1. Prompt Injection via Tools and Data

Prompt injection remains the most fundamental and hardest-to-solve vulnerability in agentic systems. Unlike traditional injection attacks (SQL injection, command injection), prompt injection targets the model’s instruction-following mechanism itself.

The vector works as follows: an agent reads data from an external source (a webpage, an email, a document, an API response). That data contains text designed to override the agent’s system prompt or tool-use instructions. The agent interprets the injected text as a legitimate instruction and acts on it.

Concrete examples from research and real-world incidents include:

Indirect prompt injection via web content. An agent tasked with web research navigates to a page containing hidden instructions. Those instructions tell the agent to exfiltrate its conversation history or API keys to an attacker-controlled endpoint. Demonstrations of this technique date back to 2023, and it has matured significantly since then.

Multi-turn injection attacks. An attacker sends a carefully crafted message that appears benign in the first turn but, when combined with the agent’s own tool output in subsequent turns, causes the agent to take actions it was not authorized for. These attacks exploit the fact that agent memory accumulates context across turns.

Data exfiltration via tool output. An agent that has access to a “read file” tool and a “send email” tool can be prompted to read sensitive files and email them to an attacker. The consent system is supposed to gate this, but the CVE-2026-41349 disclosure showed that consent bypass via configuration patch is a real vulnerability that was present in a production agent framework.

Prompt injection is not a framework bug. It is a structural property of systems that combine language models with tool access. The model cannot distinguish between “this text is the user’s actual request” and “this text is data that the user’s request is processing.” Until models develop reliable internal mechanisms for instruction separation, prompt injection remains the foundational vulnerability of agent security.
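The exfiltration pattern described above (read sensitive data, then send it outward) can be gated mechanically even when the model itself cannot be trusted to refuse. The sketch below flags tool-call sequences where the output of a sensitive read tool appears in the arguments of an outbound tool. The tool names and the ToolCall shape are hypothetical, not OpenClaw's actual API; a real gate would track data flow through the agent runtime rather than matching strings.

```python
# Illustrative exfiltration gate: flag sequences in which the output of
# a sensitive read tool feeds an outbound tool. Tool names and the
# ToolCall structure are assumptions for the sketch.
from dataclasses import dataclass

SENSITIVE_READ_TOOLS = {"read_file", "read_memory"}
OUTBOUND_TOOLS = {"send_email", "http_post"}

@dataclass
class ToolCall:
    tool: str
    args: dict
    output: str = ""

def flags_exfiltration(calls: list[ToolCall]) -> bool:
    """Return True if any outbound call's arguments contain the output
    of an earlier sensitive read (the read-then-send pattern)."""
    sensitive_outputs: list[str] = []
    for call in calls:
        if call.tool in OUTBOUND_TOOLS:
            payload = " ".join(str(v) for v in call.args.values())
            if any(s and s in payload for s in sensitive_outputs):
                return True
        if call.tool in SENSITIVE_READ_TOOLS:
            sensitive_outputs.append(call.output)
    return False
```

String matching is deliberately the weakest possible check here; the point is architectural: the gate runs outside the model's instruction-following loop, so injected text cannot talk it out of the policy.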

2. Malicious Plugin and Skill Supply Chain

The second attack vector is supply chain: distributing malicious plugins or skills through the same channels that distribute legitimate ones.

The March 2026 trojan horse campaign against OpenClaw is the most significant example to date. Attackers published plugins to the OpenClaw skill marketplace that appeared to perform legitimate functions (automation helpers, utility tools, integration connectors). Once installed on a gateway, these plugins executed secondary payloads: credential theft, session hijacking, and deployment of persistent backdoors.

TechRadar reported approximately 28,000 nodes compromised. Bissa Labs later released a scanner that probes for configurations vulnerable to the techniques used in the campaign. The DFIR Report covered the exploitation chain, confirming that the initial access vector was plugin installation, not CVE exploitation.

The plugin supply chain problem is structurally similar to the npm and PyPI malware problem, but the consequences are different. A malicious npm package can execute code on a developer’s machine during build. A malicious OpenClaw plugin can execute code on a gateway that has persistent connections to every connected node, access to the agent’s memory and tool configuration, and the ability to push commands to all downstream agents.

Key structural factors that make the agent plugin market high-risk:

Plugin isolation is immature. The February 2026 OpenClaw disclosures included CVE-2026-41295, a trust boundary violation in which one plugin could read another plugin’s runtime memory because process isolation was not enforced. If plugins share a process space, a malicious plugin can observe all other plugin activity.

Plugins inherit gateway permissions. A plugin that exposes a “spreadsheet tool” or “email connector” runs with the permissions of the gateway operator, not sandboxed to the specific data set it needs.

Plugin distribution platforms prioritize growth over security. The OpenClaw skill marketplace, like npm and PyPI before it, relies on community reporting for malware detection rather than proactive scanning.

The CVE disclosure described a critical vulnerability (CVE-2026-41349) in which an operator with config write access could disable consent checks for specific plugins. Combined with a supply chain attack, this means the consent system itself can be turned off for the attacker’s plugin once it gains any foothold.

3. CVE Exploitation of the Agent Framework

The third vector is traditional vulnerability exploitation, but applied to the agent runtime itself rather than the application running on top of it.

The April 2026 OpenClaw CVE batch disclosed eight vulnerabilities covering the full attack chain from initial access to privilege escalation. The CVEs are:

CVE-2026-41342: Remote onboarding authentication bypass. An unauthenticated attacker can complete the gateway handshake without a valid bootstrap token by crafting a WebSocket upgrade request that skips token validation.

CVE-2026-41349: Agentic consent bypass via configuration patch. An attacker with operator-level gateway access can push a config patch that disables consent checks for specific plugins, bypassing the user authorization gate.

CVE-2026-41352: Node scope gate remote code execution. An argument injection vulnerability in the scope gate allows escape from the command whitelist, executing arbitrary shell commands on the node host.

CVE-2026-41353: Access control bypass via allowProfiles. Profile-based access restrictions are enforced only at the UI layer, not at the API middleware, allowing direct API calls to bypass access controls.

CVE-2026-41355: Mirror mode sandbox code execution. The debugging/mirror mode interface allows code injection into the target agent’s sandbox, bypassing normal code submission channels and consent gates.

CVE-2026-41356: WebSocket session token rotation failure. Session tokens are never rotated after initial authentication, meaning a captured token remains valid indefinitely.

CVE-2026-41359: Privilege escalation via Telegram integration. The Telegram bot does not validate user roles for admin commands, allowing an operator-level user to escalate to admin by issuing a specific command sequence.

CVE-2026-41361: SSRF guard bypass via IPv6. The SSRF guard blocks IPv4 internal ranges but does not check IPv6 special-use ranges, including the IPv4-mapped range. An agent can bypass the guard using ::ffff:127.0.0.1 or [::1].
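The IPv6 bypass class is easy to reproduce and easy to fix. A minimal guard, using only the standard library, shows the check an IPv4-only implementation misses: unwrap IPv4-mapped IPv6 addresses before classifying them. This is a generic sketch of the vulnerability class, not OpenClaw's actual guard code.

```python
import ipaddress

def is_blocked_host(host: str) -> bool:
    """Reject loopback, private, link-local, and reserved addresses.
    A guard that only checks IPv4 ranges misses ::ffff:127.0.0.1 and
    ::1 -- the CVE-2026-41361 pattern described above."""
    host = host.strip("[]")  # bracketed IPv6 literals like [::1]
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Hostname, not a literal: resolve it and re-check every
        # A/AAAA record (resolution is omitted from this sketch).
        return False
    mapped = getattr(addr, "ipv4_mapped", None)
    if mapped is not None:
        addr = mapped  # ::ffff:127.0.0.1 -> classify the embedded IPv4
    return (addr.is_loopback or addr.is_private
            or addr.is_link_local or addr.is_reserved)
```

Note the hostname branch: blocking only literals still leaves DNS rebinding open, so a production guard must validate the resolved addresses at connection time, not just the URL string.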

What is notable about this batch is not any single vulnerability but the attack chain they enable. Chaining CVE-2026-41342 (remote auth bypass) with CVE-2026-41352 (node scope gate RCE) allows a remote, unauthenticated attacker to achieve code execution on a node. Adding CVE-2026-41349 (consent bypass) and CVE-2026-41359 (Telegram privilege escalation) converts that code execution into persistent administrative access to the gateway.

The February 2026 batch (CVE-2026-41295 through CVE-2026-41302) included similar themes: trust boundary violations, SSRF vulnerabilities across multiple subsystems, and authentication bypass. Together, the two batches represent twelve disclosed vulnerabilities over three months in a single agent framework.

The OpenClaw Incident Analysis

The March 2026 incident that compromised 28,000 nodes needs precise characterization, because the public framing has been misleading in important ways.

What Actually Happened

The attack vector was malicious plugin distribution through the OpenClaw skill marketplace. Attackers published plugins that appeared legitimate. Users installed them on their gateways. The plugins contained malicious code that executed secondary payloads.

The initial TechRadar headline (“OpenClaw Trojan Horse Agents Hack 28,000 Systems”) implied that agents themselves were the attack vector. The actual vector was the plugin supply chain. The agents were not compromised through their normal operation. They were compromised because users installed malicious extensions.

This distinction matters because the defensive response is different. If agents were compromised through CVE exploitation, the fix is patching. If agents were compromised through malicious plugins, the fix is supply chain security, plugin vetting, and sandboxing.

The Bissa Scanner

Bissa Labs, a threat intelligence research group, released a scanner that probes for OpenClaw and Claude Code configurations vulnerable to the techniques used in the campaign. The scanner does not exploit new vulnerabilities. It identifies deployments that have not applied known patches, have exposed management interfaces, or are running plugins from untrusted sources.

The scanner’s existence is significant because it operationalizes the research. Any attacker can now scan for vulnerable deployments without needing to develop their own tools. The DFIR Report confirmed active exploitation campaigns using the scanner’s output.

What Operators Should Learn

The incident teaches four lessons that apply beyond OpenClaw:

The plugin supply chain is the highest-probability initial access vector. CVE exploitation requires finding and weaponizing a vulnerability. Malicious plugin distribution only requires publishing code to a marketplace. The barrier to entry for supply chain attacks is lower.

Consent systems are not a substitute for sandboxing. OpenClaw’s consent system was bypassed through a configuration patch (CVE-2026-41349). Even when consent systems work as designed, they rely on the user making correct security decisions at the moment of authorization. User fatigue and insufficient context make this unreliable.

Visibility into plugin behavior is insufficient. The compromised plugins executed secondary payloads that were not visible to the gateway operator through normal monitoring. The plugin’s declared capabilities (what it claims to do) did not match its actual behavior.

The gap between disclosure and patching is the operational window for attackers. The February CVEs were patched in the 2026-2-15 release. The trojan horse campaign exploited deployments that had not applied that update. The scanner accelerates exploitation of unpatched deployments.

Why Enterprise Defenses Are Lagging

On April 20, 2026, Palo Alto Networks published research on the gap between enterprise security tooling and AI agent behavior. The findings were not optimistic.

Traditional SIEM and EDR tools are calibrated for specific behavioral patterns: process creation, network connections, file system modifications, registry changes, authentication events. AI agents produce none of these signals in the way that traditional tools expect. An agent making an API call looks like a user making an API call. An agent reading a file looks like a user reading a file. An agent sending an email looks like a user sending an email.

The behavioral patterns that indicate agent compromise are different:

Consent bypass events. An agent performing an action that should have required user consent but was not gated.

Tool use patterns inconsistent with the agent’s declared purpose. A research agent making shell calls. A file management agent making network connections to external IPs.

Plugin behavior divergence. A plugin claiming file access capabilities making network calls to unknown endpoints.

Memory access patterns. An agent reading data from memory that is not part of its configured tool scope.

Traditional EDR tools are not instrumented to detect any of these patterns because they do not operate at the agent runtime layer. They see the host-level activity (process creation, network connections) but lack the context to distinguish between authorized agent activity and an attacker operating through the agent.

Palo Alto’s research concluded that existing enterprise deployments are running agent frameworks without any runtime behavioral monitoring. The agent is invisible to the security stack because the security stack was designed for a world where software does not have autonomous tool access and dynamically generated code execution paths.
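The tool-use and scope deviations listed above reduce to a simple mechanism: learn what an agent normally touches, then alert on anything outside that set. A minimal sketch, with illustrative event fields (real instrumentation would come from an external monitoring layer, not the agent framework itself):

```python
# Minimal behavioral-baseline sketch: record the (tool, target) pairs
# an agent uses during a trusted observation window, then flag any
# pair outside that baseline. Field names are assumptions.
from collections import defaultdict

class AgentBaseline:
    def __init__(self):
        self.known = defaultdict(set)  # tool name -> observed targets

    def learn(self, tool: str, target: str) -> None:
        """Record normal activity during the baseline window."""
        self.known[tool].add(target)

    def is_anomalous(self, tool: str, target: str) -> bool:
        """A tool the agent has never invoked, or a known tool aimed
        at a new target (new IP, new path), is an anomaly."""
        return tool not in self.known or target not in self.known[tool]
```

An exact-match baseline like this would be noisy in practice; real deployments would generalize targets into ranges and paths. The structural point stands: the signal exists at the agent runtime layer, which is exactly the layer current EDR does not see.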

What Is Needed

Enterprise defenses for AI agents require three capabilities that do not exist in most current deployments:

Agent-aware logging. The security stack needs visibility into the agent’s decision loop: what prompt was processed, what tools were invoked, what data was read, what actions resulted. This is not logging from the agent framework (which may be compromised). It is logging from a separate monitoring layer.

Behavioral baselines for agent activity. An agent that suddenly accesses databases it has never queried, makes calls to IP ranges it has never contacted, or reads files outside its configured scope should trigger an alert. Building these baselines requires instrumentation that most agent frameworks do not expose by default.

Consent system integrity monitoring. The consent system is the authorization gate for high-risk agent actions. Security monitoring must detect if that gate is bypassed, disabled, or reconfigured. This requires external verification that the consent system is functioning as configured.
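The consent integrity check is the most mechanical of the three capabilities: canonicalize the consent section of the gateway configuration, hash it, store the digest out-of-band, and re-compare on a schedule. A sketch, assuming a JSON-serializable config (the actual OpenClaw config format may differ):

```python
# Consent-config integrity sketch: detect the CVE-2026-41349 pattern
# (consent checks silently disabled via a config patch) by comparing
# a canonical hash against a known-good digest stored out-of-band.
import hashlib
import json

def consent_config_digest(config: dict) -> str:
    """SHA-256 over a canonical serialization of the consent config.
    sort_keys and fixed separators make the digest order-independent."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def consent_tampered(current: dict, known_good_digest: str) -> bool:
    """True if the live consent config no longer matches the digest
    recorded at deployment time."""
    return consent_config_digest(current) != known_good_digest
```

The digest must live outside the gateway (a separate monitoring host, a secrets store), because an attacker with config write access can rewrite anything the gateway itself stores.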

Emerging Mitigations

The security response to the AI agent threat landscape is developing in parallel, and several approaches have emerged.

ClawPatrol and Enkrypt AI

On April 21, 2026, two days before the DFIR Report on Bissa scanner exploitation, Enkrypt AI released ClawPatrol, a gateway security tool for OpenClaw. ClawPatrol operates as a monitoring and blocking layer between the gateway and its nodes. It intercepts agent actions before they reach the tool execution layer, applies policy rules, and blocks or logs actions that match defined risk criteria.

ClawPatrol’s architecture represents a practical response: it does not rely on the gateway’s own security controls (which may be compromised), but operates as an independent verification layer. This is the same principle as network segmentation and bastion hosts in traditional infrastructure security.
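The interception pattern itself is simple to state in code. The sketch below shows the general shape (evaluate each proposed action against ordered policy rules before it reaches tool execution); it is not ClawPatrol's actual API, and the rule and verdict names are assumptions.

```python
# Generic policy-interception sketch: an independent layer evaluates
# each proposed agent action before tool execution. Not ClawPatrol's
# real interface -- an illustration of the architectural pattern.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Action:
    tool: str
    args: dict

@dataclass
class PolicyLayer:
    rules: list = field(default_factory=list)  # (predicate, verdict)
    log: list = field(default_factory=list)

    def add_rule(self, predicate: Callable[[Action], bool], verdict: str):
        self.rules.append((predicate, verdict))

    def evaluate(self, action: Action) -> str:
        """First matching rule wins; unmatched actions are allowed
        (a production default would more likely be deny)."""
        for predicate, verdict in self.rules:
            if predicate(action):
                self.log.append((action.tool, verdict))
                return verdict  # e.g. "block" or "log"
        self.log.append((action.tool, "allow"))
        return "allow"
```

The design choice that matters is where this runs: in a separate process with its own configuration, so a compromised gateway cannot reconfigure its own watchdog.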

CVE Patching Cadence

The OpenClaw project released two emergency patch batches in three months (February 2026 and April 2026). The project has committed to a faster disclosure and remediation cycle, but the volume of disclosures raises questions about whether the framework’s architecture enables secure development or requires fundamental redesign.

Operators should treat patching as their primary defense. Every unpatched deployment is a known-exploitable target. The Bissa scanner and similar tools mean that the time between public disclosure and active scanning is measured in days or hours.

Skill Vetting Process

The OpenClaw project has not yet published a formal skill vetting process. The current model relies on community reporting and post-incident removal. For a platform with over 3 million active installs, this is not sufficient.

Recommended elements of a skill vetting process include:

Static analysis of plugin code before publication. Scanning for known malicious patterns, obfuscation, and network exfiltration calls.

Runtime behavior monitoring for published plugins. Periodic execution in a sandboxed environment to verify that plugin behavior matches declared capabilities.

Permission model enforcement. Plugins should declare their required permissions at installation time, and the framework should enforce those permissions at the runtime level rather than trusting plugin code to self-restrict.

Version pinning and checksum verification. Users should be able to verify that the plugin they are running matches a published checksum, and the framework should warn when a checksum mismatch is detected.
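Checksum verification is the one element of this list an operator can implement today without marketplace support, provided the publisher distributes a digest. A minimal sketch (the archive path and refusal-to-load behavior are assumptions about how a framework would integrate it):

```python
# Plugin checksum verification sketch: hash the installed archive in
# chunks and compare against the publisher's SHA-256. The framework
# should refuse to load the plugin on mismatch.
import hashlib
from pathlib import Path

def verify_plugin(archive: Path, published_sha256: str) -> bool:
    """True if the archive's SHA-256 matches the published digest."""
    h = hashlib.sha256()
    with open(archive, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest() == published_sha256.lower()
```

Checksums only authenticate what the publisher shipped; they do nothing against a malicious publisher. They close the tampering and update-substitution windows, not the trust problem.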

Network Segmentation for Agent Nodes

The simplest and most reliable mitigation is network segmentation. The February and April CVE batches demonstrated that many vulnerabilities require network access to the gateway to exploit. Placing gateways behind VPNs, Tailscale networks, or VPC boundaries dramatically reduces the attack surface.

Specific recommendations:

Gateways should not have public IPs unless absolutely necessary. When they must be public, restrict access to known IP ranges.

Nodes should connect to gateways over tailnet or VPN, not over the public internet.

Remote onboarding should be disabled on all production gateways. New nodes should be paired through administrative channels, not through an exposed WebSocket endpoint.

Mirror mode and debugging features should be disabled in production. These features provide attacker utility far exceeding their operational value.

The Telegram integration should be restricted to admin accounts only, and audit logging should capture all command sequences executed through the bot.
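An application-layer allowlist can back up the network-layer controls above. The sketch below checks a connecting client against tailnet/VPC CIDRs; the ranges shown are illustrative (100.64.0.0/10 is the CGNAT range Tailscale assigns), and this check supplements, never replaces, firewall and VPN enforcement.

```python
# Illustrative gateway connection allowlist: accept only clients from
# the tailnet/VPC ranges. The CIDRs below are example values -- use
# your own ranges -- and this must back a network-layer rule, not
# replace it.
import ipaddress

ALLOWED_RANGES = [
    ipaddress.ip_network("100.64.0.0/10"),  # Tailscale CGNAT range
    ipaddress.ip_network("10.0.0.0/8"),     # example private VPC
]

def client_allowed(client_ip: str) -> bool:
    """True if the client address falls inside an allowed range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED_RANGES)
```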

The Supply Chain Problem

The plugin supply chain for AI agents reproduces every mistake that software ecosystems have made in the past two decades. The incentives are the same: growth of the platform requires a rich ecosystem of third-party plugins, and security friction slows adoption. The consequences are worse.

How the OpenClaw Skill Marketplace Risk Model Works

The OpenClaw skill marketplace is a centralized repository of community-contributed plugins. Users browse, install, and update plugins from the marketplace through the OpenClaw CLI or Control UI. The marketplace has a submission process that includes basic metadata verification, but does not include code review, static analysis, or runtime sandboxing of submitted plugins.

The risk model inherits from npm’s early years: trust is implicit, malicious plugins are removed after they are reported, and the user is responsible for verifying the safety of any plugin they install. This model failed in the Node.js ecosystem, where supply chain attacks continue to be the most common attack vector. It is failing in the AI agent ecosystem for the same structural reasons.

Red Flags for Plugin Vetting

Until the marketplace implements proactive security measures, operators must vet plugins themselves. The following red flags apply:

Plugins that request broader permissions than their stated function requires. A CSV file reader does not need network access. A calendar integration does not need filesystem write access outside its own data directory.

Plugins with obfuscated code or encoded strings. Legitimate open-source plugins do not need to hide their code. Obfuscation in a plugin published to a marketplace is a strong signal of malicious intent.

Plugins from unknown publishers with no history of contributions to the platform. The pseudonymous nature of plugin publishing means that attackers can create new accounts, publish one or two plugins, and then disappear after installation targets are met.

Plugins that include network calls to unknown domains. The plugin’s code should connect only to services that are documented and relevant to its function. Any network call to an unregistered or recently registered domain warrants immediate investigation.

Plugins that modify gateway configuration. Any plugin that attempts to change consent settings, add plugins, or modify access controls should be treated as malicious until proven otherwise.

Plugins with no version history or changelog. A plugin that appears, receives no updates, and has no public development history is more likely to have been published from a throwaway account.

Plugins that use outdated or deprecated API methods that bypass current security controls. The April 2026 CVE batch patched several vulnerabilities that could be exploited through plugins using pre-patch APIs.
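Several of these red flags (obfuscation, encoded strings, raw-IP endpoints, dynamic code execution) are detectable with a first-pass textual scan before any manual review. The sketch below is exactly that: a triage filter, not real static analysis, and the patterns are illustrative assumptions that will miss anything determined to hide.

```python
# Illustrative red-flag triage for plugin source. Pattern matches mean
# "review by hand before installing", not "definitely malicious"; real
# static analysis needs AST-level inspection, not regexes.
import re

RED_FLAGS = {
    "long base64 blob":   re.compile(r"[A-Za-z0-9+/]{120,}={0,2}"),
    "dynamic execution":  re.compile(r"\b(eval|exec|Function)\s*\("),
    "hex-escaped string": re.compile(r"(\\x[0-9a-fA-F]{2}){8,}"),
    "raw IP endpoint":    re.compile(r"https?://\d{1,3}(\.\d{1,3}){3}"),
}

def scan_plugin_source(source: str) -> list[str]:
    """Return the names of red-flag patterns found in plugin source.
    A non-empty result warrants manual review before installation."""
    return [name for name, pat in RED_FLAGS.items() if pat.search(source)]
```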

Beyond OpenClaw: The Broader Supply Chain Issue

The plugin supply chain problem is not unique to OpenClaw. LangChain, Composio, and other orchestration layers face the same challenge: they expose plugin and tool interfaces that third parties can publish to, and they lack proactive security scanning for those contributions.

The AI agent ecosystem is currently in a phase where security is an afterthought in plugin distribution. The first major incident was the OpenClaw trojan horse campaign. It will not be the last. Any platform that allows untrusted code to run in the agent’s process space, with the agent’s permissions, and with access to the agent’s tools and memory, is running the same risk.

Conclusion

The AI agent security landscape in April 2026 is defined by three intersecting trends: agent frameworks are being actively targeted by threat actors, enterprise defenses are not equipped to detect or respond to agent compromise, and the plugin supply chain reproduces security failures that software supply chains have struggled with for decades.

The Bissa scanner and the 28,000-node trojan horse campaign are early indicators. Attackers have demonstrated that agent frameworks are exploitable, that plugin marketplaces are effective distribution channels, and that the gap between vulnerability disclosure and patching provides a reliable attack window.

The defenses exist: network segmentation, CVE patching, consent system integrity monitoring, and plugin vetting. The challenge is that these defenses require active maintenance and operational discipline. The threshold for compromise in agent security is not whether a vulnerability exists but whether the operator has applied the available fixes.

Operators who treat their agent infrastructure as critical security infrastructure, apply patches within hours of disclosure, audit their plugin inventory regularly, and monitor for behavioral anomalies will survive the current wave. Those who treat agents as experimental tools that can be deployed and forgotten will be the next incident report.

Sources

  • OpenClaw Security Advisory CVE-2026-41342 through CVE-2026-41361. https://openclaw.org/security/advisory/2026-04-24
  • OpenClaw Security Advisory CVE-2026-41295 through CVE-2026-41302. https://openclaw.org/security/advisory/2026-02-15
  • Bissa Labs. OpenClaw Instance Exposure Analysis. https://bissalabs.com/research/openclaw-exposure-2026
  • TechRadar. “Trojan horse AI agents target 28,000 OpenClaw nodes.” https://www.techradar.com/pro/trojan-horse-ai-agents-target-openclaw-nodes
  • DFIR Report. Bissa Scanner Exploitation Analysis. April 23, 2026.
  • Palo Alto Networks. “AI agents slip past enterprise defenses.” April 20, 2026.
  • Enkrypt AI / ClawPatrol. Gateway security tool release. April 21, 2026.

Related Reading on Red Rook AI

  • “OpenClaw CVE Batch April 2026: Eight Critical Vulnerabilities Every Operator Needs to Know.” https://redrook.ai/openclaw-cve-batch-april-2026/
  • “What’s New in OpenClaw 2026-4-24: Voice Calls, DeepSeek V4, and Browser Automation.” https://redrook.ai/openclaw-2026-4-24-release-features/
  • “AI Agent Security and Governance: A Practical Guide.” https://redrook.ai/ai-agent-security-governance/
