How It Works — RAGuard Runtime Governance Architecture

Request Lifecycle

What happens to every request.

Seven runtime operations execute on every prompt before it reaches your model, and on every response before it reaches your users. Policy and evidence travel with the interaction.

1

Request Ingress

Your application sends an API request to your RAGuard proxy endpoint instead of directly to OpenAI or Anthropic. RAGuard's native adapters parse the request format — no SDK changes required.

2

Threat Detection

The prompt is inspected by a two-tier injection detector: a fast regex pattern engine runs first (under 1ms), and borderline cases then go to ML classification via Meta Prompt Guard 2. Content safety classification runs in parallel. This stage typically adds 20–80ms.
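The two-tier flow can be sketched roughly as follows. This is a minimal illustration, not RAGuard's actual detector: the patterns, the `SUSPICIOUS_HINTS` borderline test, and `stub_classifier` (a stand-in for a model such as Prompt Guard) are all hypothetical.

```python
import re
from typing import Callable

# Hypothetical tier-1 patterns; a real deployment would maintain a much larger set.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (your|the) system prompt", re.IGNORECASE),
]

# Hypothetical hint words that mark a prompt as borderline and worth the slower tier.
SUSPICIOUS_HINTS = re.compile(r"\b(instructions|system prompt|jailbreak)\b", re.IGNORECASE)


def detect_injection(prompt: str, classify: Callable[[str], float]) -> float:
    """Return an injection score in [0, 1] using a regex tier, then a classifier tier."""
    # Tier 1: a direct pattern hit is treated as a confident detection.
    if any(p.search(prompt) for p in INJECTION_PATTERNS):
        return 1.0
    # Tier 2: only borderline prompts pay the cost of the ML classifier.
    if SUSPICIOUS_HINTS.search(prompt):
        return classify(prompt)
    return 0.0


def stub_classifier(prompt: str) -> float:
    """Stand-in for an ML model; returns a fixed score instead of running inference."""
    return 0.5
```

The design point is that most prompts are resolved by the cheap tier, so the classifier's latency is only paid when the regex tier cannot decide.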

3

Data Loss Prevention

Named Entity Recognition and pattern matching scan for PII (names, emails, SSNs, phone numbers, credit cards), organisation data, and secrets (API keys, tokens, credentials). Detected entities are redacted according to your OPA policy before forwarding.
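As a rough sketch of the pattern-matching half of this step, the snippet below redacts two entity types with regular expressions. The patterns and the `[TYPE]` placeholder format are assumptions for illustration; names and organisation data would need NER, which is not shown.

```python
import re

# Illustrative patterns only; they are not RAGuard's actual rules.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> tuple[str, list[dict]]:
    """Replace each match with a [TYPE] placeholder and report what was found."""
    entities = []
    for entity_type, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            entities.append({"type": entity_type, "value": match.group()})
        text = pattern.sub(f"[{entity_type}]", text)
    return text, entities
```

The returned entity list uses a `type` field so that it could be passed to a policy as `pii_entities`, matching the shape the Rego example on this page reads.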

4

Policy Evaluation

The sanitised request is evaluated against your tenant-specific OPA/Rego policy. Rules determine whether to allow, block, modify, or log the request. Forbidden actions are deterministically enforced — no probabilistic reasoning at this layer.
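One way such an evaluation could be wired up is through OPA's Data API, which accepts a POST with an `input` document. The sketch below only builds the request and interprets a result; the sidecar address, the input field names, and the `decide` mapping are assumptions, and the `raguard/policy` path simply mirrors the package name in the policy example on this page.

```python
import json
import urllib.request

# Assumption: OPA runs as a sidecar on localhost:8181 and serves package raguard.policy.
OPA_URL = "http://localhost:8181/v1/data/raguard/policy"


def build_opa_request(tenant: str, injection_score: float, pii_entities: list) -> urllib.request.Request:
    """Build (but do not send) a POST to OPA's Data API with the request facts as input."""
    body = {"input": {"tenant": tenant, "injection_score": injection_score, "pii_entities": pii_entities}}
    return urllib.request.Request(
        OPA_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decide(opa_result: dict) -> str:
    """Map a policy result to one action; any deny reason wins over redaction."""
    if opa_result.get("deny"):
        return "block"
    if opa_result.get("redact"):
        return "modify"
    return "allow"
```

Because `decide` is a plain function of the policy result, the same input always yields the same action, which is the deterministic property this step describes.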

5

Provider Forwarding

Sanitised, policy-compliant requests are forwarded to your LLM provider with appropriate authentication handling. RAGuard manages API key rotation and provider-specific request transformation transparently.

6

Response Filtering

The model's response is inspected before being returned to your application. Credential leakage in generated code, hallucinated PII, and policy violations are caught and remediated at this layer — catching what the model itself failed to prevent.
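A simplified view of credential-leak filtering on a response might look like this. The two patterns cover only an AWS-style access key ID and a PEM private key header, and the `[REDACTED_SECRET]` marker is a hypothetical choice; a production filter would cover far more shapes.

```python
import re

# Two widely documented secret shapes, used here purely as examples.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def filter_response(text: str) -> tuple[str, list[str]]:
    """Mask anything that looks like a credential and list which kinds were found."""
    findings = []
    for name, pattern in SECRET_PATTERNS.items():
        if pattern.search(text):
            findings.append(name)
            text = pattern.sub("[REDACTED_SECRET]", text)
    return text, findings
```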

7

Audit Logging & Evidence Generation

Every request-response pair is logged with a risk score, the policy decisions that were applied, and a ZKP-based evidence bundle. Logs are immutable, tamper-evident, and independently verifiable: a SHA-256 hash and a signed manifest are generated for every interaction.

Latency Profile

Built for production speed.

RAGuard adds sub-300ms overhead at P99. In practice, median additional latency is under 80ms for standard workloads. The detection pipeline is parallelised — threat detection, DLP, and content classification run concurrently, not sequentially.

Detection (parallel): 20–80ms

Policy evaluation (OPA): 5–15ms

Response filtering: 10–30ms

Audit logging: <5ms (async)

P99 total runtime overhead: under 300ms. Median: under 80ms. Audit logging is async and does not add to request latency.
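To illustrate why concurrent detection matters, here is a small sketch using `asyncio.gather`, where wall time follows the slowest stage instead of the sum. The stage names and sleep durations are placeholders, and `stage_budget_ms` is a hypothetical helper that adds up stages which run one after another.

```python
import asyncio


async def stage(name: str, seconds: float) -> tuple[str, float]:
    """Simulate one detection stage that takes `seconds` to finish."""
    await asyncio.sleep(seconds)
    return name, seconds


async def run_detection() -> dict[str, float]:
    """Run the three stages concurrently; wall time tracks the slowest stage, not the sum."""
    results = await asyncio.gather(
        stage("threat_detection", 0.03),
        stage("dlp", 0.02),
        stage("content_classification", 0.01),
    )
    return dict(results)


def stage_budget_ms(stages: dict[str, tuple[int, int]]) -> tuple[int, int]:
    """Sum per-stage (low, high) latencies for stages that run one after another."""
    return sum(lo for lo, _ in stages.values()), sum(hi for _, hi in stages.values())
```

Feeding the typical ranges from the breakdown above into `stage_budget_ms` (detection 20–80, policy 5–15, response filtering 10–30) gives 35–125ms for the synchronous stages, which sits inside the stated P99 bound of 300ms.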

Deployment

Two ways to deploy.

Choose the deployment mode that fits your infrastructure requirements.

Managed Proxy (Default)

Point your existing LLM API calls to your RAGuard URL. One environment variable change. Works with any language or framework using the OpenAI or Anthropic SDK. Zero infrastructure to manage.
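The single-variable switch could look like the sketch below. It assumes the base URL is read from `OPENAI_BASE_URL` (the official OpenAI Python SDK reads a variable of this name); the proxy URL is a made-up placeholder, and `chat_completions_url` is a hypothetical helper, not part of any SDK.

```python
import os

# Hypothetical proxy URL; your actual RAGuard endpoint will differ.
EXAMPLE_PROXY_URL = "https://raguard.example.com/v1"
DEFAULT_PROVIDER_URL = "https://api.openai.com/v1"


def chat_completions_url() -> str:
    """Resolve the endpoint from OPENAI_BASE_URL, falling back to the provider default."""
    base = os.environ.get("OPENAI_BASE_URL", DEFAULT_PROVIDER_URL).rstrip("/")
    return f"{base}/chat/completions"
```

With the variable unset, calls go straight to the provider; setting it to the proxy URL reroutes them without touching application code.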

Self-Hosted

Enterprise customers deploy RAGuard within their own infrastructure for data residency requirements. Container image available for Kubernetes, ECS, and bare metal. Full configuration control.

SDK Integration (Coming Soon)

Native SDKs for Python and TypeScript are planned for deeper integration into agentic frameworks (LangChain, LlamaIndex, custom MCP orchestrators).

Compliance Engine

The ZKP Evidence Model.

Prove compliance without exposing content. Using Zero-Knowledge Proofs, RAGuard generates cryptographic evidence that a specific policy was applied to a specific interaction — verifiable by any third party, without revealing the underlying data.

SHA-256 Commitment

Every interaction is hashed using SHA-256, creating a cryptographic commitment to the content without storing the content itself in the evidence bundle.
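A hash commitment of this kind can be sketched in a few lines. The random nonce is an assumption added here so that short or guessable content cannot be recovered by brute-forcing the hash; the page itself only states that SHA-256 is used, so treat the exact construction as illustrative.

```python
import hashlib
import hmac
import secrets


def commit(content: bytes) -> tuple[str, bytes]:
    """Return (commitment, nonce); only the commitment goes into the evidence bundle."""
    nonce = secrets.token_bytes(32)  # random blinding value, kept with the content owner
    digest = hashlib.sha256(nonce + content).hexdigest()
    return digest, nonce


def verify(commitment: str, nonce: bytes, content: bytes) -> bool:
    """Check that revealed content and nonce reproduce the stored commitment."""
    expected = hashlib.sha256(nonce + content).hexdigest()
    return hmac.compare_digest(expected, commitment)
```

The bundle holds only the 64-character digest, so the content stays out of the evidence while any later disclosure can still be checked against it.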

Policy Decision Record

Applied policy rules, detection results, and risk scores are signed and appended to the interaction record. Immutable and tamper-evident.
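Tamper evidence for a decision record can be illustrated with a keyed signature over a canonical encoding. This sketch uses HMAC-SHA256 from the standard library as a stand-in; the record fields are hypothetical, and verification by an outside party would more plausibly rely on an asymmetric signature scheme, which is not shown.

```python
import hashlib
import hmac
import json


def sign_record(record: dict, key: bytes) -> dict:
    """Attach an HMAC-SHA256 tag computed over a canonical JSON encoding of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"record": record, "signature": tag}


def verify_record(signed: dict, key: bytes) -> bool:
    """Recompute the tag; any change to the record makes verification fail."""
    expected = sign_record(signed["record"], key)["signature"]
    return hmac.compare_digest(expected, signed["signature"])
```

Sorting keys before hashing keeps the encoding stable, so two records with the same fields always produce the same tag.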

Verifiable Proof Bundle

Zero-Knowledge Proofs allow a regulator or auditor to independently verify compliance was achieved — without accessing the interaction content.

Why this matters: GDPR and HIPAA require demonstrable compliance without creating secondary data exposures. The ZKP model lets you prove your AI governance to auditors without handing them sensitive interaction logs.

OPA / Rego Policy Framework

Policy-as-code governance for AI.

RAGuard uses Open Policy Agent (OPA) for policy enforcement — the same framework used by Kubernetes, Envoy, and major cloud providers for infrastructure governance. Enterprise security teams already know and trust it.

policy.rego
# Example: Tenant-specific RAGuard policy in Rego
package raguard.policy

# Enables the `if` / `contains` / `in` keywords on OPA versions before 1.0
import rego.v1

# Block if injection confidence exceeds threshold
deny contains "prompt_injection_detected" if {
    input.injection_score > 0.85
}

# Redact when any detected entity is an email address
# (add further rules for other entity types)
redact contains "pii" if {
    some entity in input.pii_entities
    entity.type == "EMAIL"
}

# Rate limit by tenant
rate_limit := 1000 if {
    input.tenant == "startup_tier"
}

# Version controlled, peer reviewed, instantly deployable

Tenant-specific rules

Different business units or customers can operate under different risk profiles within a single RAGuard deployment.

Version controlled

Policies live in your Git repo alongside your infrastructure code. Review, test, and deploy like any other configuration.

Deterministic enforcement

Forbidden actions are enforced deterministically — not probabilistically. No model discretion at the policy layer.

Instantly deployable

Policy updates deploy in seconds, not release cycles. Roll back just as fast if a rule has unintended consequences.

A runtime layer that governs
AI behaviour in production.

Ready to govern your AI stack?
