The Shared Responsibility Problem in GPAI Compliance

Model providers own baseline transparency. Deployers own what the model does in production. Most enterprise teams assume upstream compliance covers them. It does not.

An enterprise insurance company integrates a GPAI model into its commercial underwriting workflow, using an AI agent to assess risk factors, summarize policy conditions, and generate coverage recommendations for broker review. The model provider has published a technical summary, maintained a model card, and, where applicable, fulfilled the notification requirements that apply to systemic-risk GPAI models. From the compliance team's perspective, that upstream documentation is substantial. In practice, they treat it as coverage for their own deployment. The EU AI Act's GPAI obligation structure draws a precise line between what a model provider is obligated to demonstrate and what a deployer is obligated to govern. That line runs through the production environment. Everything on the deployer's side of it is ungoverned unless the deployer has built governance themselves, and most enterprise teams deploying GPAI models in 2025 have not.

The provider's compliance obligations primarily address the model as supplied. They do not, by themselves, establish governance over how the model is configured, integrated, and used within a deployer's specific production environment. The deployer is responsible for that. This is not an interpretation of the regulation. It is the regulation's explicit structure, and it creates a compliance gap that is both provable under audit and widening as enterprise GPAI adoption outpaces the maturity of deployer governance programs.

The Deployer Assumption Gap

Call this structural failure the Deployer Assumption Gap: the condition in which a deployer treats upstream provider compliance as a substitute for deployer-layer governance, leaving the production execution environment uncontrolled and unattested.

The gap is not a mistake in the deployer's compliance program. It is a structural consequence of how enterprise AI procurement works. A compliance team evaluating a GPAI model integration examines the provider's documentation, confirms the model has passed available conformity assessments, and files the technical summary alongside the system design. That work is correct and necessary. It is also entirely focused on the model as a static artifact at the point of supply. The EU AI Act's deployer obligations are concerned with something different: the model as a dynamic participant in a production decision-making environment. Those two things require different governance programs, and only one of them has typically been built.

The gap widens with every deployment configuration that diverges from the provider's documented reference conditions. An underwriting agent that retrieves policy documents from a proprietary corpus, operates against customer data under data processing agreements, and generates outputs that feed directly into coverage decisions is not operating under the conditions the provider assessed. It is operating under conditions the deployer controls. The controls the provider demonstrated do not, by themselves, establish governance over how the agent operates in that deployment environment. The deployer's governance program, if it exists, is what matters. In most enterprise deployments, it does not exist at the required level of specificity.

What the GPAI Obligation Structure Actually Requires

The EU AI Act's GPAI provisions establish a two-layer obligation structure. Under Article 53 of the EU AI Act, providers of general-purpose AI models are required to maintain technical documentation, provide information necessary for downstream compliance, establish a policy to comply with Union copyright law, and make publicly available a sufficiently detailed summary of the content used for training the model. These are baseline transparency and safety obligations. They apply to the model as a product at the point of supply. The GPAI Code of Practice (final version, European AI Office, 10 July 2025) operationalises these obligations as a voluntary compliance instrument, structuring provider requirements across three chapters: Transparency, Copyright, and Safety and Security.

When a GPAI model is integrated into a high-risk AI system, deployers may become subject to obligations arising from the high-risk AI system provisions applicable to their role and deployment context, in addition to obligations associated with the use of GPAI-enabled systems. The deployer must ensure appropriate human oversight, must implement measures to manage risks arising from the specific deployment, and must maintain records sufficient to demonstrate that governance operated correctly over the agent's decisions in production.

The critical point is that provider documentation cannot discharge deployer obligations. A model card produced by the provider does not attest to how the agent behaved inside the underwriting workflow on a specific date. A technical summary does not document which risk factors the agent weighted when generating a specific coverage recommendation. A conformity assessment filed by the provider does not constitute evidence that the deployer's data processing agreements were respected during a specific session. These are deployer obligations, and they require deployer-layer evidence. The Deployer Assumption Gap is the absence of that evidence in organizations that assumed the provider had already produced it.

Provider compliance documents the model as shipped. Deployer governance attests to what the model did in production. These are not the same record, and no regulator examining a specific incident will treat them as equivalent.

OpenBox addresses this by operating as a runtime governance layer at the deployer boundary, enforcing governance decisions during execution rather than reconstructing them after deployment incidents occur.

Why the Gap Is Structural, Not Procedural

The instinct in most compliance programs is to treat the Deployer Assumption Gap as a documentation gap. If the provider's technical summary is comprehensive enough, if the deployment design document is detailed enough, if the risk assessment is thorough enough, then the gap closes. It does not. The gap is not a documentation problem. It is an execution problem.

Three distinctions define what deployer-layer governance requires that documentation alone cannot provide. Logging records what the agent produced. Authorization determines whether the agent was permitted to act before it acted. Governance, in the sense the regulation requires, decides whether the agent's actions conformed to the policy conditions the deployer is responsible for enforcing, at the moment of execution, in the specific context of each decision. The first two are served by existing enterprise infrastructure. The third is not.

An underwriting agent that produces a coverage recommendation has logged an output. It has not attested to the governance decisions that preceded it. The log does not record whether the agent's retrieval was constrained to the authorized data scope for that customer. It does not record whether a behavioral pattern across multiple queries in the session triggered a policy constraint. It does not record whether the output was evaluated against the deployer's current risk policy before being surfaced to the broker. Without that record, the deployer cannot demonstrate governance. It can demonstrate that the agent produced outputs. Regulators examining a specific adverse decision will find that distinction significant.

Runtime Governance at the Deployer Layer


OpenBox (docs.openbox.ai) operates as the runtime governance layer for AI agents. It wraps existing agent infrastructure without architectural change and enforces governance decisions at the deployer layer, at the point of execution, rather than describing them in advance documentation. The Trust Lifecycle (Assess, Authorize, Monitor, Verify, Adapt) maps directly onto the deployer obligations that the GPAI obligation structure establishes.

Assess establishes a risk and behavioral baseline for each agent in the deployment. Each agent receives a Trust Score derived from risk, behavioral, and alignment signals calibrated to the deployment context. The Risk Profile Score produces the agent's Trust Tier and contributes 40% of the composite Trust Score that determines its permitted operating scope. An underwriting agent operating against customer financial data in a regulated insurance context does not begin production at the same Trust Tier as an internal document summarization agent with no regulatory exposure. The baseline is specific to the deployment context, not inherited from the provider's model assessment.
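To make the weighting concrete, here is a minimal sketch of how a composite score of this kind could be computed. Only the 40% weight on the Risk Profile Score comes from the description above. The split of the remaining 60%, the 0-100 scale, the tier boundaries, the tier numbering, and every name in the code are assumptions made for illustration, not OpenBox's actual formula.

```python
from dataclasses import dataclass

# Only RISK_WEIGHT is taken from the article. Everything else is assumed.
RISK_WEIGHT = 0.40
BEHAVIORAL_WEIGHT = 0.35  # assumption: share of the remaining 60%
ALIGNMENT_WEIGHT = 0.25   # assumption: share of the remaining 60%

# Hypothetical boundaries on a 0-100 Risk Profile Score, where a higher
# score means lower assessed risk and tier 1 is the least restricted tier.
TIER_FLOORS = [(80, 1), (60, 2), (40, 3)]


@dataclass(frozen=True)
class AgentSignals:
    risk_profile: float
    behavioral: float
    alignment: float


def trust_score(signals: AgentSignals) -> float:
    """Weighted composite of the three signal groups."""
    return (
        RISK_WEIGHT * signals.risk_profile
        + BEHAVIORAL_WEIGHT * signals.behavioral
        + ALIGNMENT_WEIGHT * signals.alignment
    )


def trust_tier(risk_profile_score: float) -> int:
    """The tier is derived from the Risk Profile Score alone."""
    for floor, tier in TIER_FLOORS:
        if risk_profile_score >= floor:
            return tier
    return 4


# Two agents built on the same model can start at different tiers.
underwriting_agent = AgentSignals(risk_profile=55, behavioral=70, alignment=80)
summarizer_agent = AgentSignals(risk_profile=85, behavioral=70, alignment=80)
```

Under these assumed boundaries the underwriting agent starts in a more restricted tier than the internal summarizer, even though both sit on the same provider model, which is the point of a deployment-specific baseline.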

Authorize is where the deployer's governance obligations resolve at runtime. Runtime authorization operates through three control surfaces. Guardrails are hard constraints on agent actions, applied at the point of execution. Specific guardrail types are detailed in the OpenBox guardrails documentation. Each type can be configured to block on violation, log violations, or both. Policies, expressed as OPA/Rego stateless permission checks, encode the deployer's own risk management requirements as enforceable conditions on every agent action. Behavioral Rules detect stateful multi-step patterns that no single-action check can surface: the agent that, across a sequence of individually authorized retrievals, produces a recommendation drawing on data outside the scope of the customer's processing agreement. Each step passes a single-action check. The sequence does not. The output of every Authorize evaluation is a Governance Decision: ALLOW, BLOCK, REQUIRE_APPROVAL, or HALT. ALLOW permits the operation; BLOCK rejects the specific action while the agent continues; REQUIRE_APPROVAL pauses the operation for human review; HALT terminates the entire session. These runtime governance decisions are the operational form deployer accountability takes under the regulation.
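The difference between a stateless check and a stateful Behavioral Rule is easier to see in code. The sketch below is illustrative only: the four decision names come from the description above, but the `Action` and `Session` types, the scope names, and the evaluation order are hypothetical, and the stateless function merely stands in for what would be an OPA/Rego policy in practice.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOW = "ALLOW"
    BLOCK = "BLOCK"
    REQUIRE_APPROVAL = "REQUIRE_APPROVAL"
    HALT = "HALT"


@dataclass(frozen=True)
class Action:
    kind: str             # "retrieve" or "recommend" (hypothetical action kinds)
    data_scope: str = ""  # corpus touched by a retrieval


@dataclass
class Session:
    agreement_scopes: frozenset              # scopes the customer's agreement covers
    scopes_seen: set = field(default_factory=set)


# Hypothetical: corpora the agent is technically permitted to read.
AGENT_READABLE = {"policy_docs", "customer_financials", "claims_history"}


def stateless_policy(action: Action) -> Decision:
    """Judges one action in isolation, with no memory of the session."""
    if action.kind == "retrieve" and action.data_scope not in AGENT_READABLE:
        return Decision.BLOCK
    return Decision.ALLOW


def behavioral_rule(session: Session, action: Action) -> Decision:
    """Judges the sequence: what has this session drawn on so far?"""
    if action.kind == "retrieve":
        session.scopes_seen.add(action.data_scope)
        return Decision.ALLOW
    if action.kind == "recommend" and not session.scopes_seen <= session.agreement_scopes:
        return Decision.REQUIRE_APPROVAL
    return Decision.ALLOW


def authorize(session: Session, action: Action) -> Decision:
    decision = stateless_policy(action)
    if decision is not Decision.ALLOW:
        return decision  # a rejected action never enters the session state
    return behavioral_rule(session, action)


session = Session(agreement_scopes=frozenset({"policy_docs", "customer_financials"}))
steps = [
    Action("retrieve", "policy_docs"),
    Action("retrieve", "claims_history"),  # readable, but outside the agreement
    Action("recommend"),
]
decisions = [authorize(session, step) for step in steps]
```

Both retrievals pass on their own, and only the recommendation is paused for review, because the rule that catches it depends on what the session accumulated rather than on any single action.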

Monitor is real-time behavioral observation. Every agent action, and the governance decision applied to it, is captured as it occurs, producing a continuous behavioral record across the full production session. The observation is not sampled from representative outputs. It is continuous across every agent action in every session.

Verify evaluates whether the agent's actions across a session remained aligned with the governance conditions under which it was deployed. It validates that the controls applied during Authorize continued to hold as behavior evolved: did the Policies fire where they should have, did the Behavioral Rules surface the patterns they were configured to detect, did the Guardrails enforce the boundaries they were designed for. Session Replay reconstructs the complete decision path for any session in a form reviewable by the deployer's compliance team or an external regulatory examiner. Verify is the structural complement to Authorize, not a substitute for it.
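One way to picture the "did the Policies fire where they should have" check is a replay that re-evaluates each recorded action and compares the result with the decision captured at runtime. This is a simplified sketch under stated assumptions: the record shape, the `scope_policy` rule, and the scope names are hypothetical, and OpenBox's Session Replay is not documented here at this level of detail.

```python
from typing import Callable, NamedTuple


class RecordedEvent(NamedTuple):
    step: int
    action: dict   # the action as captured during the session
    decision: str  # the governance decision recorded at runtime


def replay(events: list, evaluate: Callable[[dict], str]) -> list:
    """Return the steps whose recorded decision differs from re-evaluation."""
    return [e.step for e in events if evaluate(e.action) != e.decision]


def scope_policy(action: dict) -> str:
    # Hypothetical stateless rule: only two corpora are in scope.
    in_scope = {"policy_docs", "customer_financials"}
    return "ALLOW" if action["data_scope"] in in_scope else "BLOCK"


recorded = [
    RecordedEvent(1, {"data_scope": "policy_docs"}, "ALLOW"),
    RecordedEvent(2, {"data_scope": "claims_history"}, "ALLOW"),  # should have been blocked
    RecordedEvent(3, {"data_scope": "customer_financials"}, "ALLOW"),
]
mismatched_steps = replay(recorded, scope_policy)
```

An empty result means the controls held for the whole session. A non-empty one points the reviewer at the exact steps where the recorded outcome and the configured policy disagree.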

Cryptographic Attestation produces a tamper-evident proof certificate per governance session. This is the deployer-layer evidence the regulation requires: not a description of governance that was intended, but a signed, tamper-evident proof certificate of governance events, verifiable against the complete session record. When a regulator, an auditor, or a counterparty demands evidence of how a specific agent decision was governed, the attestation record is the answer.
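The idea of a tamper-evident certificate can be sketched with a hash chain over the session's governance events and a signature over the final hash. This is a generic construction using Python's standard library, not a description of OpenBox's attestation format, which the article does not specify. The event fields, function names, and the use of a shared-secret HMAC are all assumptions; a production scheme would more plausibly use an asymmetric signature so that an external examiner can verify without holding the signing key.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # starting value for the chain


def _link(prev_hash: str, event: dict) -> str:
    """Hash one event together with the hash of everything before it."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(prev_hash.encode() + payload).hexdigest()


def attest(events: list, key: bytes) -> dict:
    """Build a per-session certificate over the ordered governance events."""
    head = GENESIS
    for event in events:
        head = _link(head, event)
    signature = hmac.new(key, head.encode(), hashlib.sha256).hexdigest()
    return {"event_count": len(events), "chain_head": head, "signature": signature}


def verify(events: list, certificate: dict, key: bytes) -> bool:
    """Recompute the chain from the session record and compare."""
    expected = attest(events, key)
    return (
        expected["event_count"] == certificate["event_count"]
        and hmac.compare_digest(expected["chain_head"], certificate["chain_head"])
        and hmac.compare_digest(expected["signature"], certificate["signature"])
    )


key = b"demo-signing-key"  # placeholder; never hard-code a real key
session_events = [
    {"step": 1, "action": "retrieve", "decision": "ALLOW"},
    {"step": 2, "action": "recommend", "decision": "REQUIRE_APPROVAL"},
]
certificate = attest(session_events, key)

# Rewriting a single decision after the fact breaks verification.
tampered = [session_events[0], {**session_events[1], "decision": "ALLOW"}]
```

Because each hash covers the one before it, editing, dropping, or reordering any event changes the chain head, so the certificate no longer matches the record it is checked against.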

Adapt is the policy update layer. When the Risk Profile Score crosses a Trust Tier boundary, the agent is reclassified automatically, and the new tier changes the governance strictness level applied to it. Policy changes are not applied automatically. Based on violation patterns and behavioral signals, OpenBox surfaces policy suggestions in the Adapt phase, and each suggestion can be accepted (creating the rule in Authorize), rejected, or modified. Guardrail and behavioral rule changes likewise require explicit operator action. The governance posture adjusts on the signal the system produces, not on the review cycle the compliance calendar dictates.
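The split between automatic reclassification and operator-approved rule changes can be sketched as follows. All names, the tier boundaries, and the generated rule text are hypothetical. The only behavior taken from the description is that the tier changes on its own, while a suggested rule takes effect only once an operator accepts it, optionally in modified form.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass
class Suggestion:
    rule: str
    status: Status = Status.PENDING


@dataclass
class AgentGovernance:
    tier: int
    active_rules: list = field(default_factory=list)
    suggestions: list = field(default_factory=list)


def tier_for(risk_profile_score: float) -> int:
    # Hypothetical boundaries; tier 1 is assumed to be the least restricted.
    for floor, tier in ((80, 1), (60, 2), (40, 3)):
        if risk_profile_score >= floor:
            return tier
    return 4


def on_score_change(gov: AgentGovernance, score: float) -> Optional[Suggestion]:
    """Reclassify automatically; surface a suggestion without applying it."""
    new_tier = tier_for(score)
    if new_tier == gov.tier:
        return None
    gov.tier = new_tier
    suggestion = Suggestion(rule=f"require_approval_for_recommendations_at_tier_{new_tier}")
    gov.suggestions.append(suggestion)
    return suggestion


def resolve(gov: AgentGovernance, suggestion: Suggestion, accept: bool,
            modified_rule: Optional[str] = None) -> None:
    """Explicit operator action: accept (optionally modified) or reject."""
    if suggestion.status is not Status.PENDING:
        raise ValueError("suggestion already resolved")
    if not accept:
        suggestion.status = Status.REJECTED
        return
    suggestion.status = Status.ACCEPTED
    gov.active_rules.append(modified_rule or suggestion.rule)


gov = AgentGovernance(tier=2)
pending = on_score_change(gov, score=55)  # tier changes, rules do not
```

At this point `gov.tier` has already moved, but `gov.active_rules` is still empty. It only changes once an operator calls `resolve(gov, pending, accept=True)`.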

What Changes for Deployer Compliance Programs

The shift from documentation-based compliance to runtime governance changes what a deployer can demonstrate, and to whom.

For compliance and legal teams, the work changes from assembling a pre-deployment evidence package to maintaining a continuous production evidence record. The tamper-evident audit trail is the compliance artifact. There is no retrospective reconstruction project when an adverse decision is examined, because the governance record for that specific decision already exists in attested form. The question shifts from 'can we demonstrate we intended to govern this' to 'here is the signed record of how it was governed.'

For risk teams, the deployer's liability exposure changes form. The EU AI Act's enforcement structure and the national implementation measures emerging across member states attach to the deployer's ability to demonstrate governance over specific decisions. An organization that can produce a session-level governance attestation for an adverse underwriting decision is in a fundamentally different position than one that can produce only a general description of its AI risk management framework. The first is evidence. The second is a statement of intent.

For engineering and AI teams, governance stops being a gate that slows deployment. Behavioral Rules are reviewed as part of the deployment pipeline and policy updates are managed through the Authorize phase. The compliance team and the engineering team operate against the same artifacts. New model versions enter a governance environment that already enforces deployer policy, which means compliance review applies to the governance configuration, not to every feature that passes through it.

August 2026 and the Closing Threshold

The EU AI Act's implementation timeline has shifted, but the underlying obligation has not. Under the original Act, high-risk AI system obligations for standalone Annex III systems were due to apply from 2 August 2026. The Digital Omnibus on AI, which reached provisional political agreement on 7 May 2026 and is expected to be formally adopted before 2 August 2026, has deferred those obligations to 2 December 2027 for Annex III systems and to 2 August 2028 for AI embedded in regulated products under Annex I. The Article 50 transparency obligations, including the requirement to disclose to users when they are interacting with AI systems, remain on schedule and proceed from 2 August 2026 as originally planned, while the Article 50(2) watermarking obligations have been pushed back to 2 December 2026 for all providers. The obligation to demonstrate deployer-layer governance over high-risk AI system deployments is therefore not immediate for most enterprise GPAI integrations, but it is not distant either. Organizations that integrated GPAI models before the Act's GPAI obligations began to apply in August 2025 had a transitional window to build compliant governance programs. That window is now bounded by the revised dates, and the work required to build deployer-layer governance does not compress to fit a shorter calendar.

The Deployer Assumption Gap will not close through better provider documentation, more detailed deployment design documents, or more thorough pre-deployment risk assessments. None of those artifacts attest to what the agent did in production on a specific date against a specific customer's data under a specific policy configuration. The regulation requires deployers to demonstrate compliance with the obligations that apply to their deployment context. In practice, that means maintaining evidence capable of withstanding external examination rather than relying solely on descriptions of intended governance. The revised timeline does not remove this requirement; it extends the runway. Organizations that use that runway to build runtime governance now will be in a fundamentally different position when the Annex III obligations take effect in December 2027 than those that defer the work until the deadline is imminent.

The GPAI compliance programs that will survive regulatory examination are the ones built on the assumption that every governance decision the agent makes must be attested at the moment it is made: not reconstructed after the fact or summarized through periodic review processes, but attested continuously in a form indistinguishable from the production record itself. The revised timeline makes this shift more achievable, not less necessary. Organizations that have not made this architectural shift are not running ungoverned agents. They are running agents whose governance cannot survive external examination. The difference between a program built before December 2027 and one assembled in the months before it is the difference between evidence and aspiration.


