Modern LLMs are getting better at conversational safety. On their own, they are still not enough.
We tested adversarial conversation patterns against GPT-5 mini and Claude Haiku 4.5. GPT failed outright on one Detection scenario and held the line on Escalation. Claude held on both Memory and Audit, and in one case explicitly named the manipulation pattern as it unfolded.
The point isn't which model performs better. It's that even when the model holds the line, your platform still has an Audit problem, an Escalation problem, and a duty-to-warn problem. The model is one component of the system. The system is what regulators evaluate. These four gaps remain regardless of which model you ship: Detection, Escalation, Memory, and Audit.
These four gaps map to the structural failure categories documented in The Generation Gap research paper. Detection and Memory correspond to the Pattern Gap: behavioral signals that only become visible across turns. Escalation and Audit are system-level failures that persist even when the model detects correctly: the model held the line, but no workflow fired, no human was paged, and no third-party target was warned.
Detection. The model doesn't catch the harmful pattern at all, or only catches the surface request while missing the structural harm.
Escalation. The model handles the conversation in-line, but no human is paged, no audit trail is preserved, and no follow-up workflow fires.
Memory. The model refuses one turn and complies with the next. Per-turn safety has no memory of what was just refused.
Audit. The model refused perfectly. And nothing else happened. No audit trail, no per-user intelligence, no warning to the third-party target. A perfect refusal is not the same as a safety system; a sketch of the missing system layer follows below.
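To make the three system-level gaps concrete, here is a minimal sketch of the conversation-level layer a bare refusal leaves out, assuming a generic chat pipeline. Every name in it (SafetyEvent, ConversationSafetyState, page_human, the audit.jsonl path) is hypothetical and illustrative, not the Sango Guard implementation or any platform's actual API. The idea is that the model's per-turn verdict is only an input: the system records it (Audit), remembers it on later turns (Memory), and pages a human when pressure repeats (Escalation).

```python
# Hypothetical sketch of a conversation-level safety layer around a per-turn model.
# All names are illustrative; this is not the Sango Guard implementation.
import json
import time
from dataclasses import dataclass, field, asdict


@dataclass
class SafetyEvent:
    """One refusal or risk signal, kept so later turns can see it (Memory)."""
    turn: int
    user_id: str
    category: str          # e.g. "manipulation", "third_party_harm"
    model_refused: bool
    excerpt: str


@dataclass
class ConversationSafetyState:
    user_id: str
    events: list = field(default_factory=list)

    def record(self, event: SafetyEvent, audit_log_path: str = "audit.jsonl") -> None:
        """Audit: persist every event, not just the final verdict."""
        self.events.append(event)
        with open(audit_log_path, "a") as f:
            f.write(json.dumps({"ts": time.time(), **asdict(event)}) + "\n")

    def recently_refused(self, category: str, window: int = 5) -> bool:
        """Memory: was this category refused within the last `window` events?"""
        recent = self.events[-window:]
        return any(e.model_refused and e.category == category for e in recent)

    def should_escalate(self, category: str) -> bool:
        """Escalation: repeated pressure after a refusal goes to a human,
        regardless of whether the model keeps holding the line."""
        refusals = [e for e in self.events if e.model_refused and e.category == category]
        return len(refusals) >= 2


def page_human(state: ConversationSafetyState, category: str) -> None:
    # Placeholder for a real on-call / trust-and-safety workflow.
    print(f"[ESCALATION] user={state.user_id} category={category} "
          f"events={len(state.events)}")


# Usage: two refusals on the same category trigger the workflow a bare
# refusal never would.
state = ConversationSafetyState(user_id="u-123")
state.record(SafetyEvent(3, "u-123", "third_party_harm", True, "refused location request"))
state.record(SafetyEvent(5, "u-123", "third_party_harm", True, "refused reworded request"))

if state.should_escalate("third_party_harm"):
    page_human(state, "third_party_harm")
```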
All case studies use real conversation transcripts captured from the named model. Sango Guard analysis is generated by replaying each transcript through the live engine at kingsango.com/guard.
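The replay methodology is straightforward to reproduce against any detection engine. The sketch below assumes a transcript stored as a JSON list of role/content turns and a hypothetical analyze_turn callable standing in for whatever engine you point it at; it is an assumption for illustration, not the kingsango.com/guard API.

```python
# Hypothetical replay harness: feed a captured transcript turn by turn into a
# detection engine and collect per-turn verdicts. `analyze_turn` is a stand-in,
# not the actual Sango Guard API.
import json
from typing import Callable


def replay_transcript(path: str, analyze_turn: Callable[[list], dict]) -> list:
    """Replay a JSON transcript (a list of {"role", "content"} turns) and
    return the engine's verdict after each user turn, with full history."""
    with open(path) as f:
        transcript = json.load(f)

    verdicts = []
    history = []
    for turn in transcript:
        history.append(turn)
        if turn["role"] == "user":
            # The engine sees the whole conversation so far, not just one turn,
            # which is what makes cross-turn patterns visible at all.
            verdicts.append(analyze_turn(list(history)))
    return verdicts
```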