The control plane for AI agent actions.
Intercept, policy-check, and cryptographically audit every action your agents take — with proof every regulator can verify offline.
Five incidents from 2025. Each one is what Rabit is designed to prevent.
These are not hypothetical threat models. Each entry below is a named CVE or disclosed incident from a production AI product in the last ten months. Next to each, the Rabit mechanism that closes it.
*.delete_* or rm -rf class patterns before dispatch.Four contracts. Enforced in production, not documented in a wiki.
Rabit is implemented as four property-based tests that fail CI if any one breaks. Three of them in motion below; the fourth — tenant isolation — is enforced silently at every database query.
Policy-violating actions, blocked before execution. contract p1
Per-tool egress, deny by default. NIST SP 800-53 SC-7(5)
Every action in a hash chain anchored to an RFC 3161 timestamp. contract p4
Every agent action passes through five stages. Each one can stop it.
The agent's intended tool call is captured at the adapter layer before any side effect. The call is serialized, identity-bound (per-agent mTLS), and enqueued to GOVERN. Nothing has left the control plane yet.
Three layers evaluate the call. L1 regex over structural invariants. L2 Groq-hosted llama-3.1-8b-instant judge against your policy corpus. L3 MiniLM semantic-drift check. Any layer can deny.
On allow, the call is dispatched via a typed adapter. Each adapter carries a per-tool egress allowlist; connections to anywhere not on the list fail at the socket level.
The tool's return value is scanned for injected content before it reaches the agent's context window. This is the stage that closes EchoLeak and the CurXecute class.
Verdict, payload hash, and evidence append to a tamper-evident chain. previous_hash is SHA-256 of the prior record; daily leaves hash to a Merkle root timestamped by FreeTSA per RFC 3161.
Maps to the standards your auditor already has in their checklist.
Rabit is not a prompt-injection filter. Not an LLM gateway. Not an observability tool.
Rabit is not a guardrail classifier.
Guardrail classifiers score prompts. Rabit evaluates actions. A classifier that scores the user's message benign will still let the agent's resulting aws.iam.put_user_policy call through. Rabit doesn't care about the prompt; it cares about the tool call.
Rabit is not an LLM gateway.
LLM gateways proxy inference requests and enforce rate limits and PII redaction on the text passing to and from the model. Rabit sits one layer down: between the agent's decision to act and the action itself. You can run Rabit behind an LLM gateway; the two don't overlap.
Rabit is not an observability tool.
Observability tells you, with good latency, that an agent did something unexpected yesterday. Rabit tells the agent, at sub-second latency, that it cannot do the thing now. Observability is post-hoc; Rabit is pre-commit.
Rabit is not a GRC platform.
Vanta and Drata manage the evidence that your controls exist. Rabit is a control. Its compliance bundle is designed to be imported into your GRC workflow, not to replace it.
Three ways to put an AI agent in production. Two of them take a quarter.
You write the control plane. Everything below is on your roadmap instead of your product.
- The policy engine
- The hash-chain audit store
- Merkle timestamping against a TSA
- The MCP adapter with egress allowlists
- The 200-case adversarial eval suite
- Maintaining all of the above — while shipping your product
You instrument the agent, ship it, and read the dashboards the morning after.
- A dashboard that tells you about EchoLeak the next morning
- A post-incident Slack channel
- An auditor asking for controls you don't have
- A compliance bundle you can't produce
Four contracts enforced in production. Offline-verifiable audit. A compliance bundle your auditor already knows how to read.
- Four property-based contracts in production
- A standalone verifier CLI your regulator runs offline
- Compliance bundle mapped to EU AI Act Art 12/14
- The 200-case eval suite against OWASP / NIST / MITRE
- Direct Slack with the founder
Read the source. Run the verifier. Check the tests.
policy: staging-agents-v3 identity: require: mtls allow: - tool: github.* when: agent in ["pr-review-bot"] - tool: slack.post_message when: channel in ["#agent-notify"] deny: - tool: aws.iam.* - tool: "*.delete_*" - network: "*.internal.corp" audit: all approval_required: - risk_tier: [sensitive, high_risk]
$ rabit-verify ./bundle-2026-04-15.zip [OK] bundle signature verified (Ed25519) [OK] 1,247 entries — hash chain intact [OK] merkle root matches leaf set (sha256:7f3a…b21c) [OK] RFC 3161 timestamp valid — 2026-04-15T23:59:59Z [OK] OWASP LLM01, LLM06, LLM07 evidence present [OK] NIST SP 800-53 SC-7(5) evidence present [OK] EU AI Act Art 12 log completeness verified verification complete — 0 errors trust anchor: offline
If you don't see your threat here, we'd like to know.
Rabit's policy boundary is built around four canonical taxonomies. Below, the threat catalogue Rabit is designed to close — mapped paragraph-by-paragraph to OWASP LLM Top 10 2025, MITRE ATLAS v5.4.0, NIST AI 100-2e2025, and our own STRIDE adaptation for AI agent action surfaces.
Five from 2025 already in §01. Five more below — and the catalogue grows weekly.
*.delete_*, drop_*, truncate_* patterns; risk-tier high_risk requires explicit approval.[*] Data poisoning is upstream of Rabit's trust boundary — Rabit assumes the model itself may be adversarial and validates every action regardless. Detection of poisoning is in scope for v2.
If you find a threat Rabit doesn't yet close, email adam.shibli2001@gmail.com with subject "RABIT-DISCLOSURE". We'll respond within 48 hours, credit you on the changelog (when it exists), and add the threat to the suite. Rabit's adversarial eval suite has 200 cases today; we add new cases every time a threat is reported or published.
Two kinds of teams should be on this page. If you're not one of them, come back in six months.
Security / platform engineer.
You run the infrastructure AI agents run on. You've already had one "oh shit" moment this quarter. You know what SC-7(5) is without looking it up. You will click view-source on the verifier CLI. You want the compliance bundle before you want the feature.
AI platform lead.
You're the person rolling agents into production. Security and legal are blocking you. You need a single artifact you can hand to both teams that answers their actual questions — policy, audit, egress, evidence.
We're taking five design partners in Q2 2026.
Design partners get direct Slack with the founder, a 24-hour response on every incident, deployment in your VPC within one business day, and a seat at the roadmap table. In exchange, we ask for one 30-minute call a week and the right to cite you once you're comfortable. We'll only take the meeting if you're running agents in production today.