Architecture

A runtime layer that governs AI behaviour in production.

RAGuard sits inline with your existing AI stack to inspect risk, mediate policy, and retain governance evidence, with groundwork in place for tool- and MCP-aware control paths. Your application keeps the same model APIs while RAGuard enforces runtime trust boundaries, adding under 300ms of overhead at P99.

Request Lifecycle

What happens to every request.

Seven runtime operations execute on every prompt before it reaches your model, and on every response before it reaches your users. Policy and evidence travel with the interaction.

1. Request Ingress

Your application sends an API request to your RAGuard proxy endpoint instead of directly to OpenAI or Anthropic. RAGuard's native adapters parse the request format — no SDK changes required.

2. Threat Detection

The prompt is inspected by the two-tier injection detector — a fast regex pattern engine first (sub-1ms), then ML classification via Meta Prompt Guard 2 for borderline cases. Content safety classification runs in parallel. Total overhead: typically 20–80ms.
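
The two-tier flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not RAGuard's detector: the regex patterns, the "borderline" heuristic, and `stub_classifier` (a stand-in for a real model call such as Prompt Guard) are all hypothetical. The 0.85 threshold is borrowed from the example policy later on this page.

```python
import re
from typing import Callable

# Hypothetical tier-1 patterns; a real deployment would maintain a much larger set.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (your|the) system prompt", re.IGNORECASE),
]

# Hypothetical hints that mark a prompt as borderline and worth the slower ML pass.
BORDERLINE_HINTS = re.compile(
    r"\b(system prompt|jailbreak|pretend you are)\b", re.IGNORECASE
)


def detect_injection(
    prompt: str, classify: Callable[[str], float], threshold: float = 0.85
) -> dict:
    """Tier 1 is regex only; tier 2 calls the classifier for borderline prompts."""
    for pattern in INJECTION_PATTERNS:
        if pattern.search(prompt):
            return {"blocked": True, "tier": "regex", "score": 1.0}
    if BORDERLINE_HINTS.search(prompt):
        score = classify(prompt)
        return {"blocked": score > threshold, "tier": "ml", "score": score}
    return {"blocked": False, "tier": "regex", "score": 0.0}


def stub_classifier(prompt: str) -> float:
    """Stand-in for a real model call; returns a fixed score for illustration."""
    return 0.9 if "jailbreak" in prompt.lower() else 0.1
```

The point of the ordering is cost: obvious attacks and obviously benign prompts never pay for model inference, so only the ambiguous middle adds ML latency.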

3. Data Loss Prevention

Named Entity Recognition and pattern matching scan for PII (names, emails, SSNs, phone numbers, credit cards), organisation data, and secrets (API keys, tokens, credentials). Detected entities are redacted according to your OPA policy before forwarding.
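
As a rough sketch of the pattern-matching half of this step (the NER half is omitted), the snippet below redacts only the entity types a policy has enabled. The patterns, the `redact` helper, and the placeholder format are assumptions made for illustration, not RAGuard's implementation.

```python
import re

# Illustrative patterns only; production DLP pairs NER with far more robust matching.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str, enabled: set[str]) -> tuple[str, list[str]]:
    """Replace each enabled entity type with a placeholder.

    Returns the redacted text and the entity types that were found.
    """
    found = []
    for entity_type, pattern in PATTERNS.items():
        if entity_type not in enabled:
            continue  # the policy did not ask for this type to be redacted
        text, count = pattern.subn(f"[{entity_type}]", text)
        if count:
            found.append(entity_type)
    return text, found
```

Passing the enabled types in as a parameter mirrors the idea that the policy, not the scanner, decides what gets redacted.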

4. Policy Evaluation

The sanitised request is evaluated against your tenant-specific OPA/Rego policy. Rules determine whether to allow, block, modify, or log the request. Forbidden actions are deterministically enforced — no probabilistic reasoning at this layer.

5. Provider Forwarding

Sanitised, policy-compliant requests are forwarded to your LLM provider with appropriate authentication handling. RAGuard manages API key rotation and provider-specific request transformation transparently.
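
To make "provider-specific request transformation" concrete, here is a minimal sketch that maps an OpenAI-style chat body onto the shape of Anthropic's Messages API, where the system prompt is a top-level field and `max_tokens` is required. The function name, the `target_model` parameter, and the default token limit are hypothetical; how RAGuard performs this mapping is not described on this page.

```python
def openai_to_anthropic(
    body: dict, target_model: str, default_max_tokens: int = 1024
) -> dict:
    """Convert an OpenAI-style chat request body to an Anthropic Messages-style body."""
    # Anthropic takes the system prompt as a top-level field, not as a message.
    system_parts = [m["content"] for m in body["messages"] if m["role"] == "system"]
    messages = [
        {"role": m["role"], "content": m["content"]}
        for m in body["messages"]
        if m["role"] in ("user", "assistant")
    ]
    out = {
        "model": target_model,
        "max_tokens": body.get("max_tokens", default_max_tokens),
        "messages": messages,
    }
    if system_parts:
        out["system"] = "\n".join(system_parts)
    return out
```

Only plain-text messages are handled here; tool calls, images, and streaming options would each need their own mapping.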

6. Response Filtering

The model's response is inspected before being returned to your application. Credential leakage in generated code, hallucinated PII, and policy violations are caught and remediated at this layer — catching what the model itself failed to prevent.
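
A simplified sketch of credential filtering on model output is shown below. The two patterns (an AWS-style access key ID and a PEM private key block) and the `filter_response` helper are illustrative assumptions; a production filter would cover many more secret formats and also the hallucinated-PII and policy checks mentioned above.

```python
import re

# Illustrative secret patterns; real filters track many more formats.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
}


def filter_response(text: str) -> tuple[str, list[str]]:
    """Redact anything that looks like a credential and report what was found."""
    findings = []
    for name, pattern in SECRET_PATTERNS.items():
        text, count = pattern.subn("[REDACTED]", text)
        if count:
            findings.append(name)
    return text, findings
```

Returning the findings alongside the cleaned text lets the caller record them in the audit log rather than silently rewriting the response.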

7. Audit Logging & Evidence Generation

Every request-response pair is logged with a risk score, the policy decisions applied, and a ZKP-based evidence bundle. Logs are tamper-evident and independently verifiable: a SHA-256 hash and a signed manifest are generated for every interaction.
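
One common way to make a log tamper-evident is a hash chain, in which each entry's SHA-256 hash covers the previous entry's hash. The sketch below illustrates that general technique; whether RAGuard structures its logs this way is an assumption, and the record fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: list, record: dict) -> None:
    """Append a record, linking it to the hash of the entry before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify(log: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash depends on everything before it, changing one historical record invalidates every later entry, which is what makes after-the-fact edits detectable.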

Latency Profile

Built for production speed.

RAGuard adds sub-300ms overhead at P99. In practice, median additional latency is under 80ms for standard workloads. The detection pipeline is parallelised — threat detection, DLP, and content classification run concurrently, not sequentially.

Stage                     Added latency
Detection (parallel)      20–80ms
Policy evaluation (OPA)   5–15ms
Response filtering        10–30ms
Audit logging             <5ms (async)

P99 total runtime overhead: under 300ms. Median: under 80ms. Audit logging is async and does not add to request latency.
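
The effect of running the detection checks concurrently can be sketched with `asyncio`. The check names follow the text above, but the sleep durations are made-up stand-ins for real work, chosen only to show that wall-clock time tracks the slowest check rather than the sum of all three.

```python
import asyncio
import time


async def check(name: str, seconds: float) -> str:
    """Simulate one detection stage taking `seconds` of I/O-bound work."""
    await asyncio.sleep(seconds)
    return name


async def run_detection() -> tuple[list[str], float]:
    """Run all three checks concurrently and report the elapsed wall-clock time."""
    start = time.perf_counter()
    results = await asyncio.gather(
        check("threat_detection", 0.05),
        check("dlp", 0.04),
        check("content_classification", 0.03),
    )
    return list(results), time.perf_counter() - start


if __name__ == "__main__":
    names, elapsed = asyncio.run(run_detection())
    print(names, f"{elapsed * 1000:.0f}ms")
```

Run sequentially, these simulated checks would take about 120ms; run with `asyncio.gather`, they finish in roughly the time of the slowest one.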

Deployment

Two ways to deploy.

Choose the deployment mode that fits your infrastructure requirements.

Managed Proxy (Default)

Point your existing LLM API calls to your RAGuard URL. One environment variable change. Works with any language or framework using the OpenAI or Anthropic SDK. Zero infrastructure to manage.
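
The "one environment variable" idea can be sketched as follows. This standard-library example builds (but does not send) a chat completion request against whatever base URL is configured. `OPENAI_BASE_URL` is the variable the official OpenAI Python SDK is understood to read; the RAGuard hostname in the usage note is a placeholder, not a real endpoint.

```python
import json
import os
import urllib.request


def build_chat_request(payload: dict) -> urllib.request.Request:
    """Build a chat completion request against the configured base URL."""
    # Falls back to the provider's own endpoint when no proxy is configured.
    base_url = os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1").rstrip("/")
    api_key = os.environ.get("OPENAI_API_KEY", "")
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

With `OPENAI_BASE_URL=https://raguard.example.com/v1` set (hypothetical URL), the same application code targets the proxy instead of the provider, with no other change.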

Self-Hosted

Enterprise customers deploy RAGuard within their own infrastructure for data residency requirements. Container image available for Kubernetes, ECS, and bare metal. Full configuration control.

SDK Integration (Coming Soon)

Native SDKs for Python and TypeScript are planned for deeper integration into agentic frameworks (LangChain, LlamaIndex, custom MCP orchestrators).

Compliance Engine

The ZKP Evidence Model.

Prove compliance without exposing content. Using Zero-Knowledge Proofs, RAGuard generates cryptographic evidence that a specific policy was applied to a specific interaction — verifiable by any third party, without revealing the underlying data.

SHA-256 Commitment

Every interaction is hashed using SHA-256, creating a cryptographic commitment to the content without storing the content itself in the evidence bundle.
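
A hash commitment of this kind can be sketched in a few lines. The random nonce is an assumption added here, not something the page states: without it, short or guessable content could be recovered by hashing candidates and comparing. The function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def commit(content: bytes) -> tuple[str, bytes]:
    """Return a SHA-256 commitment to the content and the nonce needed to open it."""
    nonce = secrets.token_bytes(32)  # assumption: salting to resist guessing attacks
    digest = hashlib.sha256(nonce + content).hexdigest()
    return digest, nonce


def verify_commitment(digest: str, nonce: bytes, content: bytes) -> bool:
    """Check that the content and nonce open the commitment."""
    expected = hashlib.sha256(nonce + content).hexdigest()
    return hmac.compare_digest(expected, digest)
```

Only the digest needs to go into the evidence bundle; the content and nonce stay with whoever is entitled to see them.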

Policy Decision Record

Applied policy rules, detection results, and risk scores are signed and appended to the interaction record, which makes the record tamper-evident.
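
As a rough illustration of a signed decision record, the sketch below uses HMAC-SHA256 from the standard library as a stand-in for a signature. That choice is an assumption: HMAC needs a shared secret, so a system meant for third-party verification would more plausibly use an asymmetric scheme. The record fields and function names are hypothetical.

```python
import hashlib
import hmac
import json


def _canonical(record: dict) -> bytes:
    """Serialise the record deterministically so the same data signs the same way."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_record(record: dict, key: bytes) -> dict:
    """Attach an HMAC-SHA256 tag computed over the canonical record."""
    signature = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return {"record": record, "signature": signature}


def verify_record(signed: dict, key: bytes) -> bool:
    """Recompute the tag; any change to the record makes verification fail."""
    expected = hmac.new(key, _canonical(signed["record"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["signature"])
```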

Verifiable Proof Bundle

Zero-Knowledge Proofs allow a regulator or auditor to independently verify compliance was achieved — without accessing the interaction content.

Why this matters: regulations such as GDPR and HIPAA expect you to demonstrate compliance, yet handing auditors raw interaction logs creates a secondary data exposure. The ZKP model lets you prove your AI governance to auditors without sharing those logs.

OPA / Rego Policy Framework

Policy-as-code governance for AI.

RAGuard uses Open Policy Agent (OPA) for policy enforcement — the same framework used by Kubernetes, Envoy, and major cloud providers for infrastructure governance. Enterprise security teams already know and trust it.

policy.rego
# Example: tenant-specific RAGuard policy in Rego (OPA 1.0 syntax)
package raguard.policy

# Block if injection confidence exceeds the threshold
deny contains "prompt_injection_detected" if {
    input.injection_score > 0.85
}

# Redact when any detected PII entity is an email address
redact contains "pii" if {
    input.pii_entities[_].type == "EMAIL"
}

# Rate limit for tenants on the startup tier
rate_limit := 1000 if {
    input.tenant == "startup_tier"
}

# Version controlled, peer reviewed, instantly deployable
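
To show how a policy in the `raguard.policy` package might be queried, the sketch below builds a request for OPA's REST Data API (`POST /v1/data/<package path>` with an `input` document) and maps the result to an action. The OPA address, the input fields, and the `decide` mapping are assumptions for illustration; how RAGuard actually calls OPA is not described on this page.

```python
import json
import urllib.request


def build_opa_query(opa_url: str, input_doc: dict) -> urllib.request.Request:
    """Build (but do not send) a Data API query for the raguard.policy package."""
    return urllib.request.Request(
        url=f"{opa_url.rstrip('/')}/v1/data/raguard/policy",
        data=json.dumps({"input": input_doc}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decide(opa_response: dict) -> str:
    """Map an OPA response to an action; any deny reason takes precedence."""
    result = opa_response.get("result", {})
    if result.get("deny"):
        return "block"
    if result.get("redact"):
        return "modify"
    return "allow"
```

For example, a response of `{"result": {"deny": ["prompt_injection_detected"]}}` maps to `block`, while an empty result maps to `allow`. Checking `deny` first keeps the outcome deterministic when several rules fire at once.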

Tenant-specific rules

Different business units or customers can operate under different risk profiles within a single RAGuard deployment.

Version controlled

Policies live in your Git repo alongside your infrastructure code. Review, test, and deploy like any other configuration.

Deterministic enforcement

Forbidden actions are enforced deterministically — not probabilistically. No model discretion at the policy layer.

Instantly deployable

Policy updates deploy in seconds, not release cycles. Roll back just as fast if a rule has unintended consequences.

Ready to govern your AI stack?

Deploy in minutes. Govern prompts, tools, policy, and evidence without rebuilding your application.