Agentic AI Risks Existing Security Controls Weren't Built For

July 2, 2026

Lionel Menchaca

AI Security

Agentic AI has moved from pilot projects to production infrastructure faster than most security teams expected. According to Gartner, 40 percent of enterprise applications will embed task-specific AI agents by the end of 2026, up from less than 5 percent in 2025. These systems are no longer just answering questions. They are reading files, sending messages, executing code, querying databases and calling APIs, often with no human in the loop at any step.

That operational shift changes the security conversation in a fundamental way. Traditional AI risk focused on what a model might say: a hallucination, a biased output, a privacy leak in a response. Agentic AI risk is about what a system will do. When an agent acts autonomously across enterprise infrastructure, its access permissions, its memory, its tool integrations and its connections to other agents all become potential attack surfaces.

The OWASP Top 10 for Agentic Applications 2026, developed with input from over 100 security researchers and peer-reviewed by NIST, Microsoft and NVIDIA, codified these risks for the first time in December 2025. The framework establishes something practitioners have been watching in real-world incidents for the better part of a year: agentic AI systems inherit all the vulnerabilities of traditional AI and introduce entirely new ones specific to autonomy, tool use and persistent state.

This post walks through the seven agentic AI security risks that security teams need to understand and act on, explains which mitigation strategies actually address each one and describes the data security controls that determine whether those mitigations hold at scale.

Why Agentic AI Creates a Different Risk Category

The distinction between generative AI and agentic AI matters more than most security policies currently reflect. A generative AI tool produces output for a human to review and act on. An agentic AI system acts directly, autonomously, across the systems and data it can reach.

Three properties of agentic AI create security risks that simply do not exist in traditional AI deployments.

Autonomy at machine speed. Agentic systems operate in observe-orient-decide-act loops without waiting for human confirmation at each step. A compromised or misdirected agent can exfiltrate data, modify records or trigger downstream workflows in seconds, before any alert surfaces.

Broad and inherited access. Every agent deployed in an enterprise creates non-human identities, one per tool, API and data source it connects to. Most of those identities inherit human-level permissions because traditional identity management systems were never designed to right-size access for autonomous actors operating dynamically across tool chains. According to Rubrik Zero Labs, non-human identities now outnumber human users 82-to-1 in enterprise environments.

Interconnected action chains. Agentic systems increasingly communicate with other agents through protocols like the Model Context Protocol (MCP) and Agent-to-Agent (A2A) frameworks. That interconnection means a compromise in one agent can propagate through an entire workflow before security teams have visibility into what happened.

These three properties are what make the risk category distinct. The seven risks below each trace back to one or more of them.

7 Agentic AI Security Risks Security Teams Need to Address

1. Prompt injection and goal hijacking

Prompt injection is the most documented and immediately exploitable agentic AI security risk. An attacker embeds malicious instructions in content the agent is designed to process: a document, a support ticket, a web page, an email. When the agent retrieves and processes that content, it follows the embedded instruction rather than the original task.

The direct variant, where the user themselves injects malicious instructions, is the easier one to catch. The dangerous variant is indirect: the attacker's instructions sit in external data the agent retrieves mid-task, and the agent has no reliable mechanism to distinguish legitimate instructions from content it should be processing.

In mid-2025, the EchoLeak vulnerability (CVE-2025-32711) against Microsoft 365 Copilot demonstrated exactly this. Attackers embedded engineered prompts in Word documents and emails. When Copilot summarized those files, it executed the hidden instructions and exfiltrated sensitive data without any user interaction. The same pattern appeared in a documented GitHub MCP server attack, where a malicious issue in a repository injected hidden instructions that hijacked an agent and triggered data exfiltration from private repositories.

OWASP classifies this as Agent Goal Hijack (ASI01). The data security implication is specific: prompt injection is not primarily a model problem. It is an access problem. An agent that can be misdirected through content can only do damage proportional to the data it can reach. Reducing that access surface is the most reliable structural defense.

2. Memory poisoning

Agentic AI systems maintain context across sessions through persistent memory stores. This persistence is part of what makes them useful: an agent that remembers previous interactions can operate more coherently over time. It also creates an attack surface that does not exist in stateless generative AI systems.

Memory poisoning occurs when an attacker corrupts or manipulates what an agent retains between sessions. If an agent's memory can be written with false context, fabricated user preferences or misdirected behavioral patterns, every subsequent action that draws on that memory becomes compromised. The attack is stealthy by design: the visible interaction looks normal while the underlying state has been altered.

In multi-agent architectures, poisoned memory compounds. One agent's corrupted output becomes another agent's trusted input. OWASP designates this as Memory and Context Poisoning (ASI06). Unlike most traditional threats, which are event-based, memory poisoning is cumulative. Detection requires behavioral monitoring that can identify drift in agent outputs over time, not just inspection of individual interactions.
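One complementary control, alongside the drift monitoring described above, is to record where each memory entry came from and to verify that stored entries have not been altered. The following is a minimal sketch, not a reference implementation. `MemoryEntry`, `write_memory`, `read_memory` and the hard-coded `SIGNING_KEY` are hypothetical names invented for illustration, and a real deployment would load the key from a secrets manager.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import Optional, Set

# Assumption: a demo key. In practice this would come from a secrets manager.
SIGNING_KEY = b"demo-key-not-for-production"


@dataclass(frozen=True)
class MemoryEntry:
    content: str
    source: str      # provenance label, e.g. "user", "tool:crm", "web"
    signature: str   # HMAC over content and source


def _sign(content: str, source: str) -> str:
    payload = json.dumps({"content": content, "source": source}, sort_keys=True)
    return hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()


def write_memory(content: str, source: str) -> MemoryEntry:
    """Store content together with its provenance and an integrity signature."""
    return MemoryEntry(content, source, _sign(content, source))


def read_memory(entry: MemoryEntry, trusted_sources: Set[str]) -> Optional[str]:
    """Return the content only if it is unmodified and from a trusted source."""
    if not hmac.compare_digest(_sign(entry.content, entry.source), entry.signature):
        return None  # entry was altered after it was written
    if entry.source not in trusted_sources:
        return None  # entry came from a source this task should not rely on
    return entry.content
```

A signature only catches entries changed in the store after the fact. Content that arrives through the normal write path is signed like anything else, which is why the provenance label and the behavioral monitoring still matter.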

3. Overprivileged access and identity abuse

Agentic AI inherits a structural problem that identity security teams have been fighting for years with service accounts: too much access, held too long, with too little audit visibility.

The problem compounds with agents because the provisioning process lacks the human review that even imperfect IAM processes apply to human accounts. An agent configured to automate a workflow gets the access required for that workflow, or more often the access the deploying user had, and it retains those permissions indefinitely. A 2026 enterprise security survey found that 37 percent of non-human identity security incidents trace back to overprivileged identities.

The security consequence follows directly. A compromised agent with broad permissions has a proportionally broad blast radius. An attacker who successfully manipulates an overprivileged agent through prompt injection, memory poisoning or supply chain compromise gains the ability to take actions the original designer never intended, across every system the agent can reach.

This is classified as Agent Identity and Privilege Abuse (ASI03) in the OWASP taxonomy. Identity-first approaches to securing agents treat agent identities with the same lifecycle management as human accounts: unique identity per agent, scoped to the minimum access required, with continuous behavioral monitoring and automated deprovisioning.

4. MCP and tool chain vulnerabilities

The Model Context Protocol has become the standard integration layer for agentic AI, connecting language models to external tools, data sources and APIs. Its rapid adoption has expanded the attack surface in ways that most enterprise security architectures were not designed to handle.

Researchers and the NSA's AI Security Center have both identified core security gaps in MCP deployments. Many MCP server implementations ship with no authentication controls. The protocol itself does not define how a session maps to a verifiable identity. Role-based access control is not part of the baseline specification. The result is that every MCP integration point is potentially an unauthenticated, unmonitored data pipeline.

Tool poisoning is the most direct exploitation path: a malicious or compromised MCP server exposes tools with names or descriptions that appear legitimate but perform unauthorized actions, including data exfiltration. The Supabase incident in mid-2025 illustrated the pattern clearly. A Cursor agent with privileged service-role access processed support tickets containing user-supplied SQL. The combination of excessive privilege, untrusted input and an external output channel resulted in integration tokens being exfiltrated into a public thread. No malware. No stolen credentials. Just an agent doing what it was designed to do in a context it wasn't designed to handle.

OWASP classifies MCP and integration vulnerabilities primarily under Tool Misuse and Exploitation (ASI02) and Agentic Supply Chain Compromise (ASI04).

5. Supply chain risks

Agentic AI systems are assembled from components: foundation models, third-party agent frameworks, external MCP packages, vendor-supplied tools and API integrations. Each component in that chain is a potential vector for compromise that occurs upstream, before the enterprise ever deploys the agent.

Malicious packages in agent repositories have already moved from theoretical to documented. In September 2025, a malicious package targeting agent ecosystems was confirmed in production. The OWASP Top 10 for Agentic Applications designates Agentic Supply Chain Compromise (ASI04) as a distinct risk category for this reason. Unlike software supply chain attacks, which target code dependencies, agentic supply chain attacks can target the behavioral instructions, tool definitions and memory schemas that shape how an agent operates. Compromising an agent at that layer can be far harder to detect than compromising a library.

The governance gap is specific: most organizations apply rigorous code review to software dependencies but have no equivalent review process for the agent components, external MCP packages or third-party tool integrations their AI systems use. Treating external MCP packages with the same scrutiny as third-party code libraries is the operational practice that closes this gap.
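One way to apply library-style scrutiny to agent components is to pin each reviewed artifact to a content hash and refuse anything that does not match. This is a sketch under stated assumptions: `sha256_of`, `is_approved` and the `pinned` mapping are hypothetical names, and the "artifact" here stands for whatever bytes were reviewed (a package archive, a tool definition file or similar).

```python
import hashlib
from typing import Dict


def sha256_of(artifact: bytes) -> str:
    """Hex SHA-256 digest of the reviewed artifact."""
    return hashlib.sha256(artifact).hexdigest()


def is_approved(name: str, artifact: bytes, pinned: Dict[str, str]) -> bool:
    """Approve only components whose bytes match the hash recorded at review."""
    expected = pinned.get(name)
    if expected is None:
        return False  # never reviewed: deny by default
    return sha256_of(artifact) == expected
```

Because the hash covers tool descriptions as well as code, a silent change to a tool definition after review would fail the check in the same way a modified library would.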

6. Cascading failures in multi-agent systems

Individual agent vulnerabilities are serious. Multi-agent systems introduce a second-order problem: failure states that propagate across the workflow before any human reviewer can intervene.

When agents operate in chains, one agent's output becomes the next agent's input. A prompt injection that misdirects the first agent in a chain affects every downstream agent that acts on its output. A memory poisoning attack in one node corrupts the context shared across the system. An overprivileged agent that is compromised can grant that access, functionally if not technically, to other agents it communicates with through A2A protocols.

The OWASP taxonomy designates this as Cascading Agent Failures (ASI08). The circuit breaker pattern addresses this directly by deploying monitoring controls that detect anomalous behavior in an agent's outputs and suspend or isolate that agent before the failure propagates further. Runtime behavioral monitoring that covers agent-to-agent communication, not just user-to-agent interaction, is the control that makes this detectable rather than invisible until post-incident.

7. Shadow AI agents and governance blind spots

The unauthorized deployment of AI agents represents a specific evolution of the shadow IT problem, with a materially higher risk profile. Shadow IT created ungoverned access to external applications. Shadow AI agents create ungoverned autonomous actors with inherited enterprise credentials operating inside the network perimeter.

According to Gartner, 69 percent of organizations suspect employees are using prohibited AI tools. The agents employees deploy outside sanctioned channels typically lack security review, connect to enterprise data sources using inherited broad permissions and leave no audit trail that security teams can observe. The CASB controls that surface sanctioned and unsanctioned AI tool usage are the foundational visibility layer for addressing shadow AI, but shadow agents require additional controls that can detect autonomous activity, not just employee-initiated access.

The risk profile of shadow AI agents mirrors the insider threat pattern closely: a trusted entity with privileged access taking actions outside its intended scope, at machine speed, across systems that were never designed to audit autonomous behavior. The behavioral detection logic used to surface AI insider threats applies directly here, and organizations running insider risk programs are increasingly extending that framework to cover agent activity for exactly this reason.

The governance blind spot compounds over time. Shadow agents are rarely deprovisioned when an employee changes roles or leaves the organization. Long-lived credentials tied to shadow agent identities are specifically identified in the OWASP Non-Human Identities Top 10 as one of the highest-risk patterns in non-human identity (NHI) security.

Mitigation Strategies That Actually Work at Scale

Understanding the risks above is useful. Deciding how to respond to them requires a framework that maps controls to root causes rather than symptoms. The following strategies address the structural conditions that make each risk exploitable.

Start with data: classify before agents can reach it

Every risk in the taxonomy above is proportional to what the agent can access. An agent that can only reach the data its specific task requires has a fraction of the blast radius of one that inherited broad permissions. The practical starting point for reducing agentic AI risk is not agent-level controls. It is data classification.

Data access governance defines who, and what, can access sensitive information and under what conditions. Extending that framework to agent identities, before deployment rather than as remediation, is the structural intervention that limits what any exploitation can actually accomplish.
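To make the idea concrete, here is a minimal sketch of a classification check applied before an agent reads anything. The `Sensitivity` levels and the `can_read` function are hypothetical, and real classification schemes are usually richer than a single ordered scale.

```python
from enum import IntEnum
from typing import Optional


class Sensitivity(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


def can_read(agent_clearance: Sensitivity, label: Optional[Sensitivity]) -> bool:
    """Allow a read only when the data is classified and within clearance."""
    if label is None:
        return False  # unclassified data is denied until it has a label
    return label <= agent_clearance
```

Denying unlabeled data is the design choice that reflects "classify before agents can reach it": an agent cannot reach a file simply because nobody got around to classifying it.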

Apply least-privilege access to every agent identity

Each agent should operate with a distinct, verified identity scoped to the minimum access its specific use case requires. This means departing from the common pattern of agents inheriting the permissions of the user who deployed them. It means provisioning agent identities with the same lifecycle management applied to privileged human accounts: explicit scope, time-limited credentials where possible and immediate deprovisioning when the agent's function changes or ends.

The age of agentic AI is forcing a rethink of identity management frameworks that were built entirely around human actors. Zero Trust applied to agents means never inheriting standing permissions: access is granted per action, per task, for the specific scope required.
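A scoped, time-limited grant per agent can be sketched in a few lines. The names below (`Grant`, `issue_grant`, `is_allowed`) and the scope strings are assumptions made for illustration and do not correspond to any specific identity product.

```python
import time
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Grant:
    agent_id: str
    scopes: FrozenSet[str]
    expires_at: float  # seconds since the epoch


def issue_grant(agent_id: str, scopes: Iterable[str], ttl_seconds: float,
                now: Optional[float] = None) -> Grant:
    """Issue a grant for one agent, limited to explicit scopes and a TTL."""
    issued = time.time() if now is None else now
    return Grant(agent_id, frozenset(scopes), issued + ttl_seconds)


def is_allowed(grant: Grant, agent_id: str, scope: str,
               now: Optional[float] = None) -> bool:
    """Check identity, scope and expiry on every action, not once at login."""
    current = time.time() if now is None else now
    return (
        grant.agent_id == agent_id
        and scope in grant.scopes
        and current < grant.expires_at
    )
```

The `now` parameter exists only to make the expiry logic easy to test. The point of the sketch is that nothing is inherited from the deploying user: the agent holds only what was explicitly issued, and only until it expires.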

Validate inputs and treat external content as untrusted

Because no fully reliable defense against prompt injection exists at the model level, the durable mitigation is structural: an agent that receives misdirected instructions simply should not be able to take high-impact actions or reach external endpoints as a result of those instructions.

Input sanitization and validation at the point where agents consume external data reduces the attack surface for both direct and indirect injection. LLM firewalls and runtime guardrails add a second layer. But the governance principle matters more than any individual control: external content is untrusted by default. An agent processing a document, a web page or a support ticket should treat the instructions in that content the same way a secure system treats user input: as data to be validated, not commands to be executed.
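The structural version of this principle can be sketched as a simple taint rule: once untrusted content is in the agent's working context, high-impact tools are off the table for that step. The tool names in `HIGH_IMPACT_TOOLS`, the `ContextItem` type and `authorize_tool_call` are hypothetical and shown only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import List

# Assumption: an illustrative list of tools with external or irreversible effects.
HIGH_IMPACT_TOOLS = {"send_email", "http_post", "delete_record"}


@dataclass(frozen=True)
class ContextItem:
    text: str
    trusted: bool  # False for retrieved documents, web pages, tickets, emails


def authorize_tool_call(tool: str, context: List[ContextItem]) -> bool:
    """Deny high-impact tools whenever untrusted content is in the context."""
    tainted = any(not item.trusted for item in context)
    if tool in HIGH_IMPACT_TOOLS and tainted:
        return False
    return True
```

This check does not try to detect an injection. It assumes one may succeed and limits what the agent can do as a result, which is the durable part of the defense.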

Monitor agent behavior continuously, not just at access points

Traditional security monitoring is event-based. It looks for known-bad signatures at known access points. Agentic AI threats operate differently. Memory poisoning accumulates gradually. Cascading failures propagate through agent chains. Goal hijacking may redirect an agent's behavior in ways that generate no alert but produce meaningful data exposure over time.

Behavioral monitoring is the control that makes these threats detectable. It establishes a baseline for each agent's normal operating patterns and alerts on deviations, including changes in what data an agent accesses, where its outputs are routed and which tools it invokes. Circuit breakers that can suspend an agent whose behavior deviates beyond defined thresholds contain failures before they propagate.
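A very small circuit breaker illustrates the pattern. This sketch assumes the baseline is simply the set of tools an agent normally invokes and the threshold is a count of out-of-baseline calls. `AgentCircuitBreaker` is a hypothetical class, and production monitoring would track far more signals than tool names.

```python
from typing import Iterable


class AgentCircuitBreaker:
    """Suspend an agent after too many tool calls outside its baseline."""

    def __init__(self, baseline_tools: Iterable[str], max_anomalies: int) -> None:
        self.baseline_tools = set(baseline_tools)
        self.max_anomalies = max_anomalies
        self.anomalies = 0
        self.suspended = False

    def record(self, tool: str) -> bool:
        """Record one tool invocation and return True if it may proceed."""
        if self.suspended:
            return False
        if tool not in self.baseline_tools:
            self.anomalies += 1
            if self.anomalies >= self.max_anomalies:
                self.suspended = True  # trip the breaker; stays open until reviewed
                return False
        return True
```

Once tripped, the breaker rejects every call, including ones that would normally be fine. That is deliberate: in an agent chain, stopping the node is what keeps a bad output from becoming the next agent's input.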

Govern MCP and third-party integrations as a security-critical layer

Every MCP integration is a trust decision. Treat external MCP packages with the same scrutiny applied to third-party code libraries. Require authentication for all remote MCP server connections. Apply least-privilege scoping to the tools each MCP integration exposes. Audit MCP server configurations on the same cadence as other critical infrastructure. The AI cybersecurity landscape has made MCP supply chain integrity one of the fastest-growing attack categories of 2025 and 2026.
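Two of these practices, requiring authentication for remote servers and scoping the tools each integration exposes, lend themselves to an automated check. The sketch below is not the MCP SDK or any part of the protocol. `McpIntegration`, `audit_findings` and `exposed_tools` are hypothetical names for an internal inventory an organization might keep.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class McpIntegration:
    name: str
    remote: bool
    requires_auth: bool
    allowed_tools: FrozenSet[str]


def audit_findings(integration: McpIntegration) -> List[str]:
    """Return configuration findings for one integration (empty means clean)."""
    findings = []
    if integration.remote and not integration.requires_auth:
        findings.append(f"{integration.name}: remote server without authentication")
    if not integration.allowed_tools:
        findings.append(f"{integration.name}: no tool allowlist defined")
    return findings


def exposed_tools(integration: McpIntegration, advertised: List[str]) -> List[str]:
    """Expose only allowlisted tools, whatever the server advertises."""
    return [tool for tool in advertised if tool in integration.allowed_tools]
```

Filtering against an allowlist means a server that starts advertising a new tool does not automatically hand that tool to the agent. Someone has to review it and add it first.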

Embed human oversight at high-impact decision points

Not every agent action needs human review. But irreversible, high-impact actions should require human confirmation before the agent proceeds: sending external communications, executing financial transactions, modifying access permissions or transmitting data outside the enterprise perimeter. This is the NIST AI RMF principle applied operationally: human oversight is not about slowing AI down. It is about defining the boundary beyond which autonomous action creates unacceptable risk.
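An approval gate of this kind can be sketched as a thin wrapper around action execution. The action names in `HIGH_IMPACT_ACTIONS` and the `run_action` function are illustrative assumptions, and the `approve` callback stands in for whatever review workflow an organization actually uses.

```python
from typing import Callable

# Assumption: illustrative action names mirroring the categories in the text.
HIGH_IMPACT_ACTIONS = {
    "send_external_email",
    "execute_payment",
    "modify_permissions",
    "transfer_data_external",
}


def run_action(action: str,
               execute: Callable[[], str],
               approve: Callable[[str], bool]) -> str:
    """Run low-impact actions directly; require approval for high-impact ones."""
    if action in HIGH_IMPACT_ACTIONS and not approve(action):
        return "blocked: human approval required"
    return execute()
```

Low-impact actions never touch the `approve` callback, so the gate adds no friction where autonomy is acceptable and a hard stop where it is not.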

How Forcepoint Addresses Agentic AI Security Risks

Most agentic AI security risks ultimately collapse to a data problem. An attacker who successfully hijacks an agent's goals, poisons its memory, exploits a misconfigured MCP server or compromises an upstream supply chain component gains the ability to reach, move or exfiltrate data. The controls that determine the actual damage are the ones governing what data the agent can touch and how its activity is monitored.

Forcepoint approaches agentic AI security from the data layer out, rather than from the network perimeter in. This matters because agent-to-LLM traffic and agent-to-agent communications frequently never pass through a traditional network control point. Proxy-based approaches cannot inspect what they cannot see.

Forcepoint DSPM discovers and classifies sensitive data across cloud, SaaS and on-premises environments before any AI tool or agent can reach it. Powered by AI Mesh, Forcepoint's network of small language models fine-tuned for specific industries and datasets, DSPM delivers classification accuracy that generic models cannot match. By mapping where sensitive data lives, how it is accessed and what permissions govern it, DSPM establishes the upstream foundation that makes every downstream agent control more precise. Data access governance tied to DSPM classification determines, at the data level, what any agent can retrieve, which limits the blast radius of every risk in the taxonomy above. For agentic AI deployments specifically, DSPM identifies overshared files, misconfigured permissions and mislabeled sensitive content before agent integrations go live, not as post-incident remediation.

Forcepoint DLP extends to the AI interaction layer through both API-based enforcement for SaaS AI platforms and inline endpoint enforcement. DLP policies that already govern email, web and endpoint activity extend to agent prompts, outputs and workflows without reclassification, so the protection that already covers your most sensitive data channels follows that data into AI. Inline enforcement reaches agent-to-LLM traffic that never passes through a network control point. When a policy fires, Forcepoint enforces, blocks, coaches or redirects the interaction at the point it occurs, not after the fact. For a closer look at how DLP enforcement works across AI channels, including evaluation criteria and use cases for agentic environments, see our dedicated guide.

Together, these capabilities give security teams the two controls that matter most for agentic AI: visibility into what data agents can reach before deployments go live, and enforcement when agent activity crosses a policy boundary during operation. The audit trail connecting agent identity, triggering user, data accessed and action taken provides the attribution that incident response requires and that regulators are increasingly asking for.

For a detailed technical look at how these attack chains operate end-to-end, the Forcepoint X-Labs AI kill chain simulation walks through a real MCP server attack scenario from initial LLM exploit through to full compromise.

Build an Agentic AI Security Program That Keeps Pace

Agentic AI security is not a configuration problem you solve once at deployment. Agents evolve. Integrations expand. New tools are connected. New agents are provisioned. The attack techniques targeting these systems are evolving at the same pace.

The organizations managing this well are treating agentic AI security as a continuous program: continuous data discovery and classification before new deployments go live; behavioral monitoring that adapts as agent operating patterns change; governance frameworks tied to established standards, including the OWASP Top 10 for Agentic Applications 2026 and the NIST AI Risk Management Framework; and enforcement controls that can keep pace with how fast AI adoption is actually happening.

Forcepoint's data access governance capabilities and AI Security platform are built for that operational reality. If you want to see how the data-first framework for secure AI adoption works in practice, or you're ready to assess where your current governance architecture has gaps, start with a Forcepoint Data Security Cloud conversation.

Lionel Menchaca
Lionel Menchaca has covered data security at Forcepoint since 2020, writing about DLP, DSPM, insider risk and AI security for security and IT leaders. He works with Forcepoint X-Labs threat researchers to turn their findings on emerging threats, from AI-targeted supply chain attacks to prompt injection, into practical guidance, and he leads the company's editorial strategy across the blog and the X-Labs newsletter. Before Forcepoint, Lionel founded and ran Dell's corporate blog for seven years and spent two decades helping enterprise tech companies explain security, cloud and AI.