RAGuard sits inline with your existing AI stack to inspect risk, mediate policy, retain governance evidence, and prepare for tool- and MCP-aware control paths. Your application keeps the same model APIs while RAGuard enforces runtime trust boundaries in under 300ms.
Seven runtime operations execute on every prompt before it reaches your model, and on every response before it reaches your users. Policy and evidence travel with the interaction.
Your application sends an API request to your RAGuard proxy endpoint instead of directly to OpenAI or Anthropic. RAGuard's native adapters parse the request format — no SDK changes required.
The prompt is inspected by the two-tier injection detector — a fast regex pattern engine first (sub-1ms), then ML classification via Meta Prompt Guard 2 for borderline cases. Content safety classification runs in parallel. Total overhead: typically 20–80ms.
Named Entity Recognition and pattern matching scan for PII (names, emails, SSNs, phone numbers, credit cards), organisation data, and secrets (API keys, tokens, credentials). Detected entities are redacted according to your OPA policy before forwarding.
The sanitised request is evaluated against your tenant-specific OPA/Rego policy. Rules determine whether to allow, block, modify, or log the request. Forbidden actions are deterministically enforced — no probabilistic reasoning at this layer.
Sanitised, policy-compliant requests are forwarded to your LLM provider with appropriate authentication handling. RAGuard manages API key rotation and provider-specific request transformation transparently.
The model's response is inspected before being returned to your application. Credential leakage in generated code, hallucinated PII, and policy violations are caught and remediated at this layer — catching what the model itself failed to prevent.
Every request-response pair is logged with a risk score, applied policy decisions, and a ZKP-based evidence bundle. Logs are cryptographically immutable and independently verifiable. SHA-256 hashing and signed manifests are generated for every interaction.
RAGuard adds sub-300ms overhead at P99. In practice, median additional latency is under 80ms for standard workloads. The detection pipeline is parallelised — threat detection, DLP, and content classification run concurrently, not sequentially.
Choose the deployment mode that fits your infrastructure requirements.
Point your existing LLM API calls to your RAGuard URL. One environment variable change. Works with any language or framework using the OpenAI or Anthropic SDK. Zero infrastructure to manage.
Enterprise customers deploy RAGuard within their own infrastructure for data residency requirements. Container image available for Kubernetes, ECS, and bare metal. Full configuration control.
Native SDK for Python and TypeScript for deeper integration into agentic frameworks (LangChain, LlamaIndex, custom MCP orchestrators).
Prove compliance without exposing content. Using Zero-Knowledge Proofs, RAGuard generates cryptographic evidence that a specific policy was applied to a specific interaction — verifiable by any third party, without revealing the underlying data.
Every interaction is hashed using SHA-256, creating a cryptographic commitment to the content without storing the content itself in the evidence bundle.
Applied policy rules, detection results, and risk scores are signed and appended to the interaction record. Immutable and tamper-evident.
Zero-Knowledge Proofs allow a regulator or auditor to independently verify compliance was achieved — without accessing the interaction content.
RAGuard uses Open Policy Agent (OPA) for policy enforcement — the same framework used by Kubernetes, Envoy, and major cloud providers for infrastructure governance. Enterprise security teams already know and trust it.
Different business units or customers can operate under different risk profiles within a single RAGuard deployment.
Policies live in your Git repo alongside your infrastructure code. Review, test, and deploy like any other configuration.
Forbidden actions are enforced deterministically — not probabilistically. No model discretion at the policy layer.
Policy updates deploy in seconds, not release cycles. Roll back just as fast if a rule has unintended consequences.