Modern LLMs are getting better at conversational safety. On their own, they are still not enough.
We tested adversarial conversation patterns against GPT-5 mini and Claude Haiku 4.5. GPT failed outright on one Detection scenario and held the line on Escalation. Claude held on both Memory and Audit, and in one case explicitly named the manipulation pattern as it unfolded.
The point isn't which model performs better. It's that even when the model holds the line, your platform still has an Audit problem, an Escalation problem, and a duty-to-warn problem. The model is one component of the system. The system is what regulators evaluate. These four gaps remain regardless of which model you ship: Detection, Escalation, Memory, and Audit.
These four gaps map to the structural failure categories documented in The Generation Gap research paper. Detection and Memory correspond to the Pattern Gap: behavioral signals that only become visible across turns. Escalation and Audit are system-level failures that persist even when the model detects correctly: the model held the line, but no workflow fired, no human was paged, and no third-party target was warned.
Detection. The model doesn't catch the harmful pattern at all, or only catches the surface request while missing the structural harm.
Escalation. The model handles the conversation in-line, but no human is paged, no audit trail is preserved, and no follow-up workflow fires.
Memory. The model refuses one turn and complies with the next. Per-turn safety has no memory of what was just refused.
Audit. The model refused perfectly. And nothing else happened. No audit trail, no per-user intelligence, no warning to the third-party target. A perfect refusal is not the same as a safety system; a sketch of the missing system layer follows below.
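To make the three system-level gaps concrete, here is a minimal sketch of the conversation-level layer a bare refusal leaves out, assuming a generic chat pipeline. Every name in it (SafetyEvent, ConversationSafetyState, page_human, the audit.jsonl path) is hypothetical and illustrative, not the Sango Guard implementation or any platform's actual API. The idea is that the model's per-turn verdict is only an input: the system records it (Audit), remembers it on later turns (Memory), and pages a human when pressure repeats (Escalation).

```python
# Hypothetical sketch of a conversation-level safety layer around a per-turn model.
# All names are illustrative; this is not the Sango Guard implementation.
import json
import time
from dataclasses import dataclass, field, asdict


@dataclass
class SafetyEvent:
    """One refusal or risk signal, kept so later turns can see it (Memory)."""
    turn: int
    user_id: str
    category: str          # e.g. "manipulation", "third_party_harm"
    model_refused: bool
    excerpt: str


@dataclass
class ConversationSafetyState:
    user_id: str
    events: list = field(default_factory=list)

    def record(self, event: SafetyEvent, audit_log_path: str = "audit.jsonl") -> None:
        """Audit: persist every event, not just the final verdict."""
        self.events.append(event)
        with open(audit_log_path, "a") as f:
            f.write(json.dumps({"ts": time.time(), **asdict(event)}) + "\n")

    def recently_refused(self, category: str, window: int = 5) -> bool:
        """Memory: was this category refused within the last `window` events?"""
        recent = self.events[-window:]
        return any(e.model_refused and e.category == category for e in recent)

    def should_escalate(self, category: str) -> bool:
        """Escalation: repeated pressure after a refusal goes to a human,
        regardless of whether the model keeps holding the line."""
        refusals = [e for e in self.events if e.model_refused and e.category == category]
        return len(refusals) >= 2


def page_human(state: ConversationSafetyState, category: str) -> None:
    # Placeholder for a real on-call / trust-and-safety workflow.
    print(f"[ESCALATION] user={state.user_id} category={category} "
          f"events={len(state.events)}")


# Usage: two refusals on the same category trigger the workflow a bare
# refusal never would.
state = ConversationSafetyState(user_id="u-123")
state.record(SafetyEvent(3, "u-123", "third_party_harm", True, "refused location request"))
state.record(SafetyEvent(5, "u-123", "third_party_harm", True, "refused reworded request"))

if state.should_escalate("third_party_harm"):
    page_human(state, "third_party_harm")
```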
All case studies use real conversation transcripts captured from the named model. Sango Guard analysis is generated by replaying each transcript through the live engine at kingsango.com/guard.
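The replay methodology is straightforward to reproduce against any detection engine. The sketch below assumes a transcript stored as a JSON list of role/content turns and a hypothetical analyze_turn callable standing in for whatever engine you point it at; it is an assumption for illustration, not the kingsango.com/guard API.

```python
# Hypothetical replay harness: feed a captured transcript turn by turn into a
# detection engine and collect per-turn verdicts. `analyze_turn` is a stand-in,
# not the actual Sango Guard API.
import json
from typing import Callable


def replay_transcript(path: str, analyze_turn: Callable[[list], dict]) -> list:
    """Replay a JSON transcript (a list of {"role", "content"} turns) and
    return the engine's verdict after each user turn, with full history."""
    with open(path) as f:
        transcript = json.load(f)

    verdicts = []
    history = []
    for turn in transcript:
        history.append(turn)
        if turn["role"] == "user":
            # The engine sees the whole conversation so far, not just one turn,
            # which is what makes cross-turn patterns visible at all.
            verdicts.append(analyze_turn(list(history)))
    return verdicts
```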