NEW RESEARCH: Your Sandbox Is Made of Glass

Read

Trinitite

PricingResearchBlogPodcasts

Glossary / AI Guardrails

Definition

What is AI Guardrails?

AI guardrails are controls that keep a model’s inputs and outputs inside policy. Beyond a two-valued allow/block filter, Trinitite’s guardrail returns a five-valued verdict — pass, correct, mask, block, or escalate — and signs a replayable receipt for every decision, trained on your policy rather than a generic classifier.

Most guardrails fail a near-miss shut, crashing the workflow. The three middle verdicts keep good traffic flowing: correct rewrites a near-miss in place via an RFC 6902 patch, mask reversibly tokenizes sensitive fields, and escalate parks a genuinely ambiguous call for a human.

Because the guardrail is a per-tenant model distilled from your policy and run on a determinism-fixed kernel, the verdict it renders inline is the same opinion your auditor signed off on — replayable in a post-incident review.

See AI Guardrails in action.

Run the free 1,000-log pre-audit and get a signed, reproducible report you can verify in a browser — no NDA.