Runtime Governance Series
Copilot SearchLeak: Why Logging Can't Stop One-Click Data Exfiltration
SearchLeak shows why observability cannot stop one-click AI exfiltration; only inline enforcement can block the operation before data leaves.


A single click on a real microsoft.com link could drain sensitive data from a mailbox, MFA codes included. The fix that matters is not better logging. It is enforcement placed in the execution path.
A user clicks a link that points to a genuine microsoft.com address. Nothing downloads. There is no second prompt. Within seconds, the subject lines of their email, including one-time passcodes and password reset links, are sitting in a stranger's web server logs.
That is SearchLeak, and it is a sharp recent example of one-click data exfiltration through an AI assistant. For a CISO, the uncomfortable part is not the bug itself. It is that most standard observability tooling watching that agent would have recorded the theft and prevented none of it.
What SearchLeak actually is
SearchLeak is a three-stage attack chain, disclosed by Varonis Threat Labs in June 2026, that turned Microsoft 365 Copilot Enterprise Search into a silent exfiltration channel. Microsoft tracks it as CVE-2026-42824 and has patched it server-side. Microsoft rates it Critical under its own severity scale; the CVSS 3.1 base score is 6.5, which the CVSS scale classifies as Medium.
The chain works because three weak points line up:
Parameter-to-prompt injection. The q parameter in a Copilot Search URL is read as an executable instruction, not just a search term.
An HTML rendering race condition. An image tag in Copilot's streamed response fires its request before the output sanitizer can wrap it as plain text.
A server-side request forgery through Bing. Bing's image endpoint is allowlisted in the page's content security policy, so it fetches the attacker's URL from Microsoft's own infrastructure.
The result is a single motion. The victim clicks, Copilot searches their mailbox, the stolen subject line is embedded in an image URL, and Bing delivers it to the attacker. No malware, no extra permissions, no second click. Because the link uses a trusted Microsoft domain, anti-phishing and URL filters wave it through.
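The third stage is worth seeing in miniature, because it shows why a host allowlist alone is not enough: an allowlisted endpoint that fetches a URL passed in its query string becomes a relay. The sketch below is a defensive check over image URLs found in model output. The host names, the `src` parameter, and the function name are hypothetical placeholders, not the real Copilot or Bing endpoints.

```python
from urllib.parse import urlparse, parse_qs, quote

# Hypothetical allowlist for illustration only.
ALLOWED_IMAGE_HOSTS = {"images.example-cdn.test"}

def image_url_findings(url, allowed_hosts):
    """Return reasons an image URL in model output looks like an exfiltration carrier."""
    parsed = urlparse(url)
    findings = []
    host = parsed.hostname or ""
    if host not in allowed_hosts:
        findings.append(f"host not allowlisted: {host}")
    # An allowlisted host can still relay data if it fetches a URL given in the query.
    for key, values in parse_qs(parsed.query).items():
        for value in values:
            nested = urlparse(value)
            if nested.scheme in ("http", "https"):
                findings.append(f"parameter {key!r} carries nested URL to {nested.hostname}")
    return findings

# A relay-style URL: trusted host, attacker destination hidden in a parameter.
relay_url = "https://images.example-cdn.test/fetch?src=" + quote(
    "https://attacker.example/log?d=123456", safe=""
)
findings = image_url_findings(relay_url, ALLOWED_IMAGE_HOSTS)
```

The relay URL passes the host check and is caught only by the nested-URL check, which is the property the SSRF stage relies on.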
Why observability tools log the theft but cannot stop it
Observability tells you what an agent did. It does not decide what an agent may do. That distinction is the whole problem.
Tools like Langfuse are built to trace LLM applications: every call, retrieval, tool use, and token, captured with full context for debugging and evaluation. They are very good at it. But a trace is a recording made alongside execution, not a gate placed inside it. The data is written so you can inspect it afterward.
SearchLeak finishes during the render stream. The exfiltration request leaves the browser before Copilot has even finished its answer. An after-the-fact record of that event is forensics, not prevention. You learn, in precise detail, how you were robbed.
This is the gap that matters for the agents you build and run. Observability and enforcement are complementary; you need both. But watching is not the same as authorizing, and a tool designed to record cannot substitute for a tool designed to refuse. To stop a one-click chain, something has to sit in the path and refuse the operation before it completes.
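The difference between recording and refusing comes down to where the check sits relative to the side effect. A minimal sketch, with every name invented for illustration (this is not the Langfuse or OpenBox API): `traced` runs the operation and then writes a record, while `gated` decides first and runs the operation only if permitted.

```python
trace_log = []   # what an observability layer would keep
sent = []        # stands in for requests that actually left the process

def send_request(destination):
    sent.append(destination)  # the side effect we care about
    return "sent"

def traced(operation):
    """Observability: run the operation, then record what happened."""
    def wrapper(destination):
        result = operation(destination)
        trace_log.append({"destination": destination, "result": result})
        return result
    return wrapper

def gated(operation, is_allowed):
    """Enforcement: decide first, run the operation only if permitted."""
    def wrapper(destination):
        if not is_allowed(destination):
            raise PermissionError(f"refused before execution: {destination}")
        return operation(destination)
    return wrapper
```

With `traced`, a request to an attacker's host appears in both `sent` and `trace_log`. With `gated`, the same call raises before `send_request` runs, so nothing is added to `sent`.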
Where enforcement has to live
Enforcement has to live in the execution path, on the operation, before it runs. This is the job of the Authorize phase in the OpenBox (docs.openbox.ai) Trust Lifecycle, whose five phases are Assess, Authorize, Monitor, Verify, and Adapt.
When an agent operation is evaluated, OpenBox returns one of four governance decisions: ALLOW, REQUIRE_APPROVAL, BLOCK, or HALT. ALLOW lets the operation proceed. REQUIRE_APPROVAL pauses it for a human reviewer. BLOCK denies the operation but leaves the session running. HALT terminates the session outright. Precedence runs HALT > BLOCK > REQUIRE_APPROVAL > ALLOW, so the strictest applicable decision wins.
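The precedence rule is easy to express as an ordered enum. This is a sketch of the ordering described above, not OpenBox's implementation; the `resolve` helper and its default of ALLOW when no rule matched are assumptions, and a deployment could just as reasonably default to deny.

```python
from enum import IntEnum

class Decision(IntEnum):
    # Higher value means stricter, mirroring HALT > BLOCK > REQUIRE_APPROVAL > ALLOW.
    ALLOW = 0
    REQUIRE_APPROVAL = 1
    BLOCK = 2
    HALT = 3

def resolve(decisions):
    """Return the strictest decision; ALLOW if nothing matched (an assumption)."""
    return max(decisions, default=Decision.ALLOW)

outcome = resolve([Decision.ALLOW, Decision.REQUIRE_APPROVAL, Decision.BLOCK])
```

Here `outcome` is `Decision.BLOCK`: one rule allowing the operation does not override another rule that denies it.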
The point is timing. These decisions are returned on the operation, in line, while it can still be refused. That is the difference between a control and a camera.
How OPA/Rego policies stop the operation before data leaves
A policy can deny the outbound step on its own, without knowing anything about the steps before it. In OpenBox, Policies are stateless permission checks written in OPA Rego. Each one evaluates a single operation and returns a decision. They do not track session history. They answer one question: is this specific operation allowed right now?
That single-operation view is enough to break an exfiltration step. A policy reads the operation's classified spans, each tagged with a semantic type such as llm_completion or file_read. A rule can require approval, or deny, when an operation tries to reach a destination outside an allowlist, or when a model completion attempts to embed retrieved data in an external URL. The chain needs that outbound request to succeed. A policy that refuses it ends the attack at the last step.
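OpenBox policies are written in Rego, but the logic of a stateless egress check can be sketched in Python to show its shape. Everything here is an assumption made for illustration: the operation layout, the `http_request` span type, the `url` and `text` fields, and the allowlist. Only `llm_completion` comes from the semantic types named above.

```python
import re
from urllib.parse import urlparse

EGRESS_ALLOWLIST = {"api.internal.example"}      # hypothetical allowlist
URL_PATTERN = re.compile(r"https?://[^\s\"'<>)]+")
ORDER = ["ALLOW", "REQUIRE_APPROVAL", "BLOCK", "HALT"]

def stricter(a, b):
    return max(a, b, key=ORDER.index)

def evaluate(operation):
    """Stateless check: inspects only this operation's spans, never session history."""
    decision = "ALLOW"
    for span in operation.get("spans", []):
        if span.get("type") == "http_request":   # hypothetical span type
            host = urlparse(span.get("url", "")).hostname
            if host not in EGRESS_ALLOWLIST:
                decision = stricter(decision, "BLOCK")
        elif span.get("type") == "llm_completion":
            # A completion that embeds a URL to a non-allowlisted host goes to a human.
            for url in URL_PATTERN.findall(span.get("text", "")):
                if urlparse(url).hostname not in EGRESS_ALLOWLIST:
                    decision = stricter(decision, "REQUIRE_APPROVAL")
    return decision
```

The function keeps no state between calls, which is the point: it can refuse the outbound request even if it has seen nothing of the injection or the mailbox read that came before.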
Policies also tighten by risk. A higher-risk agent can be held to stricter rules than a low-risk one, so the same operation that passes for a trusted agent routes to human approval for a sensitive one.
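Risk tiering can be pictured as a lookup applied to the same operation. The tier names, the mapping, and the `operation_is_sensitive` flag are hypothetical, chosen only to show one operation producing different decisions for different agents.

```python
# Hypothetical mapping from an agent's risk tier to the decision for a sensitive operation.
TIER_DECISION = {"low": "ALLOW", "medium": "REQUIRE_APPROVAL", "high": "BLOCK"}

def decide(agent_risk_tier, operation_is_sensitive):
    if not operation_is_sensitive:
        return "ALLOW"
    # Unknown tiers fall back to the strictest mapping rather than passing silently.
    return TIER_DECISION.get(agent_risk_tier, "BLOCK")
```

Falling back to the strictest outcome for an unrecognized tier is a design choice in this sketch: a misconfigured agent fails closed instead of inheriting the most permissive rule.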
How behavioral rules catch the multi-step chain
Some attacks only look dangerous as a sequence. SearchLeak is one: read the mailbox, reshape the result into an image URL, fire an outbound fetch. Each step, on its own, can look ordinary.
This is what Behavioral Rules are for. Where policies judge one operation, Behavioral Rules are stateful. They detect multi-step patterns across a session and escalate to BLOCK, REQUIRE_APPROVAL, or HALT. A rule that recognizes an internal data read, followed by an external URL built from it, followed by an outbound request, can halt the session the moment that pattern forms, even when no single operation crossed a line.
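A stateful rule of this kind is, at its simplest, a small state machine kept per session. The sketch below tracks only the order of events; the event names are hypothetical, and a real rule would also need to confirm that the URL was built from the data that was read, which this version does not attempt.

```python
class ExfiltrationSequenceRule:
    """Stateful rule: internal read, then external URL built, then outbound request."""

    # Hypothetical event names used for illustration.
    STAGES = ("internal_read", "external_url_built", "outbound_request")

    def __init__(self):
        self.progress = 0  # number of stages matched so far in this session

    def observe(self, event_type):
        """Feed one session event; return HALT once the full pattern has formed."""
        if self.progress < len(self.STAGES) and event_type == self.STAGES[self.progress]:
            self.progress += 1
        if self.progress == len(self.STAGES):
            return "HALT"
        return "ALLOW"

rule = ExfiltrationSequenceRule()
session = ["internal_read", "llm_completion", "external_url_built", "outbound_request"]
decisions = [rule.observe(event) for event in session]
```

Unrelated events in between do not reset the rule, and an outbound request on its own, with no preceding read, does not trigger it.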
Used together, the two layers cover both shapes of the threat. Policies hold the line on the individual operation. Behavioral rules catch the trajectory.
The SearchLeak chain mapped to enforcement
Here is the chain, what an observability tool records at each stage, and the OpenBox control that would gate the same pattern in an agent you govern.
| SearchLeak stage | What an observability tool records | OpenBox control that gates it |
|---|---|---|
| Prompt injection via the q parameter | The prompt and the model's response, stored as a trace | Policy on the input operation: require approval or deny when untrusted input carries embedded instructions |
| Internal search reshaped into an external image URL | The tool call and its output, logged after the fact | Behavioral rule: flag the sequence of internal read, then external URL constructed from it |
| Outbound fetch that carries the data out | The outbound request, recorded once it has already fired | Policy on the egress operation: deny destinations outside the allowlist, before the request leaves |
One honest caveat: SearchLeak itself lives inside Microsoft's Copilot and was fixed by Microsoft. You cannot bolt OpenBox onto a closed Microsoft product. The mapping above applies to the agents you build and wrap, where OpenBox sits in the execution path.
Proving what the agent did, for the incident review
When an operation is refused, you need evidence that holds up later. OpenBox records every governance decision in an immutable audit trail. Each session's events are hashed into a Merkle tree and digitally signed, creating a verifiable proof certificate that no governance data was altered after the fact.
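To make the tamper-evidence idea concrete, here is a minimal Merkle root over a session's events using SHA-256. It is a generic sketch, not OpenBox's format: the event fields, the JSON serialization, and the leaf and node prefixes are assumptions, and the signing step is omitted because it would require a key and a signature library.

```python
import hashlib
import json

def leaf_hash(event):
    # Canonical JSON so the same event always hashes to the same value.
    payload = json.dumps(event, sort_keys=True).encode()
    return hashlib.sha256(b"\x00" + payload).digest()

def merkle_root(events):
    """Fold per-event hashes pairwise into a single 32-byte root."""
    if not events:
        raise ValueError("no events to hash")
    level = [leaf_hash(e) for e in events]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [
            hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]

events = [
    {"op": "mail_search", "decision": "ALLOW"},
    {"op": "http_request", "decision": "BLOCK"},
    {"op": "session_end", "decision": "ALLOW"},
]
root = merkle_root(events)
```

Changing any field of any event, such as rewriting the BLOCK to an ALLOW after the fact, produces a different root, so a signature over the original root no longer matches.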
For incident response, that turns “we think the control fired” into something an auditor can verify. The decision, the reason, the operation, and the timing are all preserved and tamper-evident.
The takeaway
SearchLeak will not be the last one-click data exfiltration chain, because the ingredient that makes it work, an AI that treats input as instructions, is now standard. The agents you deploy will face the same shape of attack, without Microsoft's server-side fix to save them. The defense is not more logging. It is a decision returned on the operation, in the path, before the data leaves.
For how runtime enforcement fits into a full program, see OpenBox's complete AI agent governance guide for enterprise teams.
Frequently asked questions
What is the Copilot SearchLeak vulnerability?
SearchLeak is a one-click attack chain disclosed by Varonis Threat Labs in June 2026 that turned Microsoft 365 Copilot Enterprise Search into a data exfiltration channel. It combined prompt injection, an HTML render race condition, and a Bing server-side request forgery. Microsoft tracks it as CVE-2026-42824 and has patched it. Microsoft rates it Critical under its own severity scale; the CVSS 3.1 base score is 6.5 (Medium by the CVSS scale).
Can observability tools like Langfuse stop prompt injection attacks?
No. Observability tools trace and record what an agent does, for debugging and evaluation. They sit alongside execution, not inside it, so they capture an exfiltration after it happens. Stopping a one-click chain needs enforcement that refuses the operation in the execution path before it completes.
What are the four OpenBox governance decisions?
When an agent operation is evaluated, OpenBox returns ALLOW, REQUIRE_APPROVAL, BLOCK, or HALT. ALLOW proceeds, REQUIRE_APPROVAL pauses for a human reviewer, BLOCK denies the operation while the session continues, and HALT terminates the session. Precedence runs HALT, BLOCK, REQUIRE_APPROVAL, ALLOW, so the strictest decision wins.
What is the difference between OpenBox policies and behavioral rules?
Policies are stateless OPA Rego checks that evaluate a single operation and return a decision, with no memory of prior steps. Behavioral rules are stateful and detect multi-step patterns across a session, then escalate. Policies gate one operation; behavioral rules catch a sequence.
Does OpenBox protect Microsoft 365 Copilot?
No. SearchLeak lives inside Microsoft's closed Microsoft 365 Copilot product and was fixed by Microsoft. OpenBox governs AI agents you build and wrap, such as those on LangChain, LangGraph, Temporal, or Mastra. SearchLeak matters here as a template for the same attack pattern in agents you control.
How does OpenBox prove an agent's actions for an audit?
OpenBox records every governance decision in an immutable audit trail. Each session's events are hashed into a Merkle tree and digitally signed, producing a verifiable proof certificate. That gives incident responders tamper-evident evidence of which operation was refused, why, and when.
Sources
Varonis Threat Labs, “SearchLeak: How We Turned M365 Copilot Into a One-Click Data Exfiltration Weapon.” https://www.varonis.com/blog/searchleak Accessed June 18, 2026.
Microsoft Security Response Center, CVE-2026-42824. https://msrc.microsoft.com/update-guide/vulnerability/CVE-2026-42824 Accessed June 18, 2026.
OpenBox (docs.openbox.ai), “Governance Decisions.” https://docs.openbox.ai/core-concepts/governance-decisions Accessed June 18, 2026.
OpenBox (docs.openbox.ai), “Policies.” https://docs.openbox.ai/trust-lifecycle/authorize/policies Accessed June 18, 2026.
OpenBox (docs.openbox.ai), “Compliance & Audit.” https://docs.openbox.ai/administration/compliance-and-audit Accessed June 18, 2026.
OpenBox (docs.openbox.ai), “Attestation & Cryptographic Proof.” https://docs.openbox.ai/administration/attestation-and-cryptographic-proof Accessed June 18, 2026.
Langfuse, “LLM Observability & Application Tracing.” https://langfuse.com/docs/observability/overview Accessed June 18, 2026.

