Education Is the Key Defense Against AI Attacks
by Sander Schulhof on December 21, 2025
AI security is fundamentally broken because guardrails don't work against determined attackers, and the problem space is effectively infinite.
Sander Schulhof, a leading AI red-teaming researcher, has concluded that current AI security approaches are fundamentally flawed. As he bluntly puts it: "Guardrails do not work. If someone is determined enough to trick GPT-5, they're gonna deal with that guardrail no problem." The problem is mathematical in scale: the number of possible attacks against an LLM equals the number of possible prompts, a quantity that is effectively infinite, on the order of one followed by a million zeros.
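A back-of-envelope calculation (my own illustration, using assumed round numbers rather than any specific model's vocabulary or context size) shows where a figure like "one followed by a million zeros" can come from:

```python
import math

vocab_size = 100_000       # assumed token vocabulary (illustrative, not a real model's)
context_length = 200_000   # assumed maximum prompt length in tokens (illustrative)

# Count distinct prompts of exactly context_length tokens: vocab_size ** context_length.
# Work in log10 so we never build the astronomically large integer itself.
digits = context_length * math.log10(vocab_size)
print(f"roughly 10^{digits:,.0f} possible prompts")  # -> roughly 10^1,000,000
```

Even if only a vanishing fraction of those prompts are viable attacks, the absolute number of attacks is still far too large to enumerate or filter.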
This creates a critical distinction between traditional cybersecurity and AI security: "You can patch a bug but you can't patch a brain." With traditional software, fixing a vulnerability provides high confidence the issue is resolved. With AI systems, even after addressing a specific vulnerability, you can be nearly certain the problem still exists in some form because the attack surface is so vast.
The industry's claims about security effectiveness are misleading. When a guardrail provider claims to catch 99% of attacks, that figure is measured against a finite benchmark and says little when the space of possible attacks is effectively unbounded. Human attackers consistently break through every defense within 10 to 30 attempts, and even the best automated attack systems struggle to match that success rate.
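To see why a per-attempt catch rate says so little, consider a deliberately simplified model (my own illustration, not Schulhof's numbers) in which each attempt is blocked independently with probability 0.99. Even under that generous assumption a persistent attacker gets through with modest effort, and real attackers adapt between attempts rather than retrying blindly:

```python
# Toy model, illustrative only: assume each attack attempt is blocked
# independently with probability catch_rate. Real attackers learn from each
# refusal, so they do far better than this independence assumption suggests.
catch_rate = 0.99

for attempts in (10, 30, 100, 500):
    p_bypass = 1 - catch_rate ** attempts  # chance at least one attempt gets through
    print(f"{attempts:>3} attempts -> {p_bypass:.0%} chance of at least one bypass")
```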
For leaders making AI deployment decisions, this has several important implications:
For read-only chatbots without action capabilities, security concerns are minimal. If users can only harm themselves through malicious prompts, the risk is contained. However, once AI systems can take actions (sending emails, updating databases, controlling robots), the risk profile changes dramatically.
The most effective defense is proper permissioning combined with classical cybersecurity. Hire someone at the intersection of AI security and classical cybersecurity who can map exactly what data and actions your AI has access to, then contain those capabilities. Consider implementing Google's CaMeL framework, which restricts AI actions to what the user has explicitly requested; the sketch below illustrates the general principle.
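To make the permissioning idea concrete, here is a minimal sketch of a capability allowlist in the spirit of CaMeL. It is not CaMeL's actual implementation, and every name in it (ToolPolicy, run_tool, send_email, delete_records) is invented for illustration; the point is that the policy layer, not the model, decides which side-effecting tools a given request may touch.

```python
# Hypothetical permissioning layer: names and structure are invented for this
# example and are not part of CaMeL or any real framework.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ToolPolicy:
    """Allowlist of tool names derived from what the user explicitly requested."""
    granted: set[str] = field(default_factory=set)

    def check(self, tool_name: str) -> None:
        if tool_name not in self.granted:
            raise PermissionError(
                f"Tool '{tool_name}' was not granted for this request; refusing."
            )

def send_email(to: str, body: str) -> str:
    return f"email sent to {to}"

def delete_records(table: str) -> str:
    return f"records deleted from {table}"

TOOLS: dict[str, Callable[..., str]] = {
    "send_email": send_email,
    "delete_records": delete_records,
}

def run_tool(policy: ToolPolicy, tool_name: str, **kwargs) -> str:
    """Every tool call the model proposes passes through the policy first."""
    policy.check(tool_name)  # hard stop if the user never asked for this capability
    return TOOLS[tool_name](**kwargs)

# The user asked only to send an email, so only that capability is granted.
policy = ToolPolicy(granted={"send_email"})
print(run_tool(policy, "send_email", to="ops@example.com", body="weekly report"))

# A prompt-injected instruction to wipe a table is blocked at the policy layer,
# no matter how convincingly the model was tricked into proposing it.
try:
    run_tool(policy, "delete_records", table="customers")
except PermissionError as exc:
    print(exc)
```

The design choice that matters is that the check happens outside the model: even a perfectly crafted prompt injection can only propose actions, never authorize them.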
The lack of major AI security incidents so far isn't because systems are secure - it's because adoption is still early and capabilities are limited. As AI agents gain more power and autonomy, the risk will increase exponentially. This is especially concerning with AI-powered robotics, where physical harm becomes possible.
For organizations building or deploying AI systems, education is crucial. Ensure your team understands these limitations rather than relying on false security promises. The most valuable investment isn't in guardrails but in understanding the fundamental security challenges of AI and designing systems with proper containment from the start.