Hero image for Indirect Prompt Injection in 2026: What Google Actually Found

Indirect Prompt Injection in 2026: What Google Actually Found

PC Drama
80 views

Indirect Prompt Injection: Google’s Data Ends the Debate

According to Google’s AI security team, the biggest story this spring is not a zero-day threat. Google released a report from their four month monitoring campaign. which uncovered an increase in prompt injection. Google scanned billions of pages from Common Crawl and monitored the live public web and these results confirm what red teamers have been warning about since 2023: Indirect prompt injection isn’t theoretical anymore. It’s already here.

What follows is the verified account. Every statistic presented here is grounded in primary sources. The more sensational figures circulating on content farms, the “340% surge,” the “88% of enterprises breached,” and similar claims, lack credible backing, and we flag them accordingly.

TL;DR

  • Google measured a 32% relative increase in malicious indirect prompt injection payloads on the public web between November 2025 and February 2026.
  • Palo Alto Networks Unit 42 catalogued 22 distinct techniques used in real-world web-based attacks, with examples targeting Stripe payments, database deletion, and scam ad approval.
  • Perplexity's Comet browser was publicly demonstrated to leak email addresses and one-time passwords after reading a Reddit comment hidden behind a spoiler tag.
  • OWASP ranks Prompt Injection as the #1 LLM vulnerability (LLM01:2025), and Munich Re's 2026 cyber outlook names it as a top attack vector against AI models.
  • Sophistication is still low. Most attempts are the equivalent of "ignore previous instructions" written in white text. That changes the moment a payload pays off.

What is indirect prompt injection?

Indirect prompt injection is when malicious instructions are hidden within web content, and an AI agent acts on these prompts or commands. The attacker never talks to you. They publish a page, embed a command (often invisibly), and wait for an agent to fetch it. The agent does the rest. No malware. No exploit chain. No zero-day. Just a webpage with extra opinions.

Definition: Indirect Prompt Injection

A class of attack where an AI agent processes attacker-controlled content from a third-party source (a webpage, document, calendar invite, email, screenshot) and executes hidden instructions embedded in that content. The agent acts with the user's privileges, often without the user seeing the instruction at all.

Finding 1: Google's 32% increase is real, and the methodology matters

Google's security team analyzed billions of pages from Common Crawl snapshots between November 2025 and February 2026 and reported a 32% relative increase in the malicious category of indirect prompt injection payloads. The data appears on the Google security blog and has been corroborated by SecurityWeek, Help Net Security, Cybernews, and Decrypt.

The number is real. The framing is more interesting than the headline.

Google looked at static web content (blogs, forums, comment sections) using 2 to 3 billion pages per monthly Common Crawl pull. They did not measure attacks against users. They measured malicious payloads sitting in published pages. Sophistication remained low. Most attempts were variations of "ignore prior instructions and do X." A few were not. One real payload contained a fully specified PayPal transaction targeting agents with payment capabilities.

"Attackers are experimenting. Sophistication is still low. That is not reassuring. That is the warm-up."

Treat the 32% number as a tripwire, not a casualty count. It tells you the supply side is scaling. The demand side (agents that fetch the web with tool access) is also scaling. Those two lines are about to cross.

Finding 2: Unit 42’s Attacker Playbook (22 Techniques)

Palo Alto Networks Unit 42 published its real-world IDPI taxonomy in March 2026, drawn from telemetry across their detection products. They identified 22 distinct payload techniques in active use. The greatest hits include:

  • Zero-sized fonts and off-screen positioning
  • CSS suppression and visibility tricks
  • SVG encapsulation
  • Base64-encoded JavaScript assembled at runtime
  • Visible plaintext nested inside HTML attributes
  • Comments and collapsed sections that humans skim past
  • Text in images that survives OCR

One scam page they analysed layered 24 separate injection attempts in a single document. Unit 42's breakdown of attacker intent is the part defenders should screenshot: roughly 14.2% of observed attacks targeted data destruction, 9.5% attempted AI content moderation bypass, and the rest spread across payment rail attacks (Stripe, PayPal), data exfiltration, and persistence in agent memory.

Why this taxonomy matters more than the count

Twenty-two techniques sounds like a lot until you realize they all share one architectural premise: the AI cannot reliably tell instructions from data. Every defense that depends on the model spotting the trick is fighting that premise. The defenses that work do not ask the model to be smarter. They put a wall between fetched content and tool execution.

Finding 3: Perplexity's Comet browser became the proof of concept

If anything turned indirect prompt injection from a research topic into a board-meeting topic, it was Perplexity Comet. In July 2025, Brave researchers disclosed a vulnerability where a Reddit comment hidden behind a spoiler tag could hijack the Comet assistant's "Summarize the current webpage" function. The hidden instructions sent the agent off to extract the user's email address and the one-time password sitting in their inbox, then handed both to the attacker.

That demo was followed by:

  • CometJacking (October 2025, LayerX): a single crafted URL turning Comet into a Gmail and calendar siphon via Base64-obfuscated payloads.
  • Calendar invite hijacks demonstrating that any content surface the agent reads is an instruction surface.
  • Unseeable screenshot injections (Brave, follow-up disclosure) showing the attack vector extends to images rendered into the agent's context.

Perplexity shipped mitigations and has continued to harden the product. The point is not that Comet is uniquely flawed. The point is that every agentic browser inherits this attack surface the moment it lets a model decide what to do with fetched content.

Pro Tip: Treat fetched content as untrusted input, always

If your workflow involves an AI agent reading vendor pages, customer documents, or research links, assume every byte fetched from outside your trust boundary is an instruction until proven otherwise. The right question is not "what does this page say." The right question is "what could this page tell my agent to do."

Finding 4: OWASP, Munich Re, and CIS all named it. That's three different audiences agreeing.

Three independent bodies, three different rooms, one threat at the top of the page.

  • OWASP lists Prompt Injection as LLM01:2025, the #1 risk in their LLM Top 10. Their reasoning is structural: models cannot reliably distinguish trusted instructions from untrusted content, and there is no known foolproof mitigation.
  • Munich Re's 2026 cyber outlook identifies prompt injection and data poisoning as major attack vectors against AI models, noting that agentic AI will increasingly plan multi-stage operations, learn from detection responses, and exploit vulnerabilities with minimal human input. The reinsurer expects AI to raise attack frequency more than severity in the near term.
  • CIS issued a report flagging prompt injection as a growing risk to generative AI deployments.

When the standards bodies, the reinsurers, and the defensive security community all use the same vocabulary, the threat has moved out of the lab and into procurement decks.

Finding 5: The folk stats are mostly fake. Here's what to ignore.

This is where things get fun. While researching this article we kept tripping over claims like "prompt injection attacks surged 340% in 2026" and "88% of enterprises already breached." These numbers appear on content-farm blogs that all link back to each other and to nothing real. None of them trace to a primary source. Some of them are obvious hallucinations from AI-generated security content.

Specifically, one figure circulating in summary pieces claims "multi-hop agent attacks rose 70% year over year." We could not find a primary source for that statistic. The closest real number is a 70% success rate in controlled experiments, where researchers tested whether prompt-injected agents could be made to leak credentials. That is a useful number. It is also a completely different number, measuring a different thing.

Warning: Filter out the content-farm stats

If a cybersecurity article quotes a precise percentage but does not name the study, the methodology, or the publication date, treat the number as decorative. The prompt injection field is now actively polluted by AI-generated content that confabulates statistics. The verified primary sources are short: Google's security blog, Unit 42's research, Brave's disclosures, OWASP's LLM Top 10, and Munich Re's annual cyber outlook.

How the verified findings compare at a glance

Source What they measured Headline number What it actually means
Google Security (Apr 2026) Malicious IDPI payloads in Common Crawl snapshots 32% relative increase Nov 2025 to Feb 2026 Supply of attack content is scaling. Sophistication is still low.
Unit 42 (Mar 2026) Active payload techniques in production telemetry 22 distinct techniques catalogued Attackers are iterating on delivery, not just intent.
Brave / LayerX (2025) Perplexity Comet browser exploitability Multiple proof-of-concept exfiltrations Agentic browsers inherit the full attack surface of the web.
OWASP (2025) LLM application risk ranking LLM01: Prompt Injection Structural problem. No silver-bullet fix today.
Munich Re (2026) Cyber insurance threat outlook Named alongside data poisoning as major AI attack vector The insurance market is pricing this in.

So what should a small business actually do?

Most of the defense conversation lives at the model vendor level. That is real, but it is not the layer you control. The layer you control is workflow design and tool privilege. We covered the full protocol in our companion piece on how to safely let AI agents browse the web. Three habits matter most for owners and operators:

  1. Separate "read" from "do." A summary task should never run with the same tool privileges as an action task. If an agent can browse, it should not also be able to spend money or send mail in the same session without a human gate.
  2. Require explicit confirmation for sensitive actions. Payment, deletion, sharing, posting. Make the model ask. Not as a courtesy. As a circuit breaker.
  3. Audit fetched content separately from the agent's response. Keep the raw HTML the agent read. If something weird happens later, you want to be able to see what the page actually said versus what the agent reported.

Expert Tip: Audit the fetch, not just the answer

Logging the agent's final output tells you what it decided. Logging the fetched HTML before the agent processed it tells you what it was told. The second log is what feeds anomaly detection and catches indirect prompt injection. The first one will gaslight you into trusting the model's summary of the page that just compromised your workflow.

What's the realistic threat trajectory through the rest of 2026?

Three things are likely. Two are concerning. One is almost certain.

Almost certain: the supply of malicious payloads will keep climbing because the cost of publishing one is essentially zero. Google's 32% in four months is a soft floor, not a ceiling.

Concerning, near-term: sophistication will start to track payoff. Today's attacks are blunt because there is no consistent payoff. The moment an attacker proves a profitable payload (a successful PayPal transaction at scale, a working credential exfiltration), the techniques will tighten fast.

Concerning, structural: multi-agent and toolchain attacks are next. The research consensus is that defenses designed for single-agent prompt injection do not transfer to multi-agent systems, and goal hijacking in agent pipelines is genuinely hard to mitigate.

The bright spot is that the defensive primitives (input validation, sandboxing, least-privilege tool access, human-in-the-loop on sensitive actions) work. They are not perfect. They are not invisible. They do not require a research breakthrough. They require treating fetched web content the way you already treat untrusted user input.

Frequently asked questions

Is indirect prompt injection a real risk for small businesses, or just an enterprise problem?

It's a real risk for any organization whose staff use AI assistants to read external content. That includes summarizing vendor pages, processing customer documents, researching procurement options, or having an agent browse on your behalf. Small businesses are arguably more exposed because they have less segmentation between research workflows and production systems. If a salesperson asks an assistant to summarize a competitor's pricing page and the agent has access to your CRM, you have an indirect prompt injection problem whether you call it that or not.

Does the 32% increase mean my AI tools are 32% more dangerous?

No. The 32% is a measurement of malicious payloads sitting on the public web, not a measurement of successful attacks against users. Think of it as ammunition stockpiling. The risk to your specific tools depends on what those tools can do with content they fetch (browse, summarize, take action, retain memory) and what privileges they hold when they do it.

Why can't AI vendors just patch this?

Because it is not a bug. It is a property of how current language models work. Models treat instructions and data as the same kind of token stream, then use learned patterns to figure out which is which. Attackers exploit the ambiguity. Vendors are adding detection layers, sandboxing, and tool guardrails, and those help. There is no known fix that eliminates the underlying problem without also breaking the usefulness of agents that read external content.

Is using a browser like Perplexity Comet safe now?

Safer than it was at first disclosure, but not solved. Perplexity has shipped mitigations and continues to harden the product, and the same is true for other agentic browsers under active scrutiny. The honest framing is that agentic browsing is a new category, and every product in it inherits the same structural attack surface. Use them with the understanding that any page you summarize can also try to instruct the agent. Reserve sensitive sessions (banking, internal admin tools, password managers) for browsers that are not running an AI agent in the background.

How can I tell if a page contains a prompt injection attempt?

Most attempts are designed to be invisible to humans, which is the whole point. You can spot some by viewing the page source and looking for off-screen text, zero-sized fonts, content hidden inside HTML comments, or Base64 blobs that decode to instructions. For non-technical users, the practical move is to assume any page processed by an AI agent is a potential vector and to keep tool privileges minimal during those sessions. Detection tools at the gateway layer (model vendor or middleware) are improving fast but are not yet ubiquitous.

What makes this different from regular web malware?

Two things. First, there is no executable code on the victim's machine, so antivirus and endpoint detection are blind to it. Second, the attacker does not need an exploit chain or a vulnerability. They just need a page the agent will read. This puts the defensive burden in a place most security stacks are not built to cover, which is the boundary between fetched content and AI tool execution.

Key Takeaways

  • 32% real, four months, billions of pages: Google's number is the cleanest signal we have that the supply of attack content is scaling fast.
  • 22 techniques, not 1: attackers iterate on delivery (CSS, SVG, Base64, OCR). Defenders cannot rely on the model spotting the trick.
  • Comet was the wake-up call: agentic browsers inherit the full attack surface of the web, and the first round of public demos proved it.
  • OWASP, Munich Re, CIS all agree: prompt injection is the named top-tier risk for LLM applications in 2025/2026.
  • Filter the content-farm stats: if a number does not name its source and methodology, treat it as decoration.
  • Defense is workflow, not magic: separate read from do, gate sensitive actions, audit fetched content.

Where to go from here

If you operate AI workflows on behalf of a small business, the next two steps are simple. First, read our companion piece on how to safely let AI agents browse the web and walk through the protocol with your team. Second, take a hard look at which of your AI-assisted workflows let an agent both read external content and take action in the same session. Those are your priority candidates for separation. Browse our other cybersecurity coverage for adjacent topics, or look at secure web solutions if you want a starting point on the defensive stack as a whole.

Indirect prompt injection is not a future problem. It is a present problem with a real number attached to it. The good news is that the defense is not a research breakthrough. The defense is treating the web like the web, which is to say, like something that might be trying to talk to your assistant when you are not looking.

Sources

Related Articles