Sander Schulhof
AI Security Researcher and CEO of HackAPrompt
Sander Schulhoff is a leading AI researcher specializing in AI security, prompt injection, and red teaming. He authored the first comprehensive guide on prompt engineering and organized the first prompt injection competition, influencing AI labs and Fortune 500 companies. His work highlights the vulnerabilities in current AI guardrails and emphasizes the need for integrating classical cybersecurity expertise with AI knowledge.
Episodes (1)
Insights (10)
Camel Blocks Prompt Injection Through Permission Control
strategic thinkingCamel grants agents only the minimal read/write permissions inferred from the user’s request, blocking malicious actions introduced via prompt injection.
Education is Key Defense Against AI Attacks
leadership perspectivesSander emphasises raising team awareness of prompt injection and hiring AI red-teaming experts, pointing listeners to a quarterly Maven course.
Three Security Tiers For AI Systems
strategic thinkingSander lays out three security tiers—read-only chatbot, verified read-only with classical security, and agentic systems requiring extra defences against prompt injection.
Remotely.io Twitter Bot Hijacked to Make Threats
case studies lessonsRemotely.io’s Twitter bot was prompt-injected to threaten the president, tarnishing the brand and leading to shutdown.
First AI Red-Teaming Competition Created Benchmark Dataset
case studies lessonsSander organised the first generative AI red-teaming event, creating a public prompt-injection dataset now used by frontier labs to benchmark security.
AI Legislation Drives Compliance Industry
growth scaling tacticsWith expanding AI legislation, firms like Trustable prosper by keeping companies updated on compliance and governance requirements.
AI Security Researcher on Team Validates Model Capabilities
leadership perspectivesPlace an AI security researcher on the team to validate model abilities and cut through misinformation.
Training AI Models Early for Adversarial Robustness
strategic thinkingSander suggests injecting adversarial training early in the training stack—when the model is a “very small baby”—to raise baseline robustness.
Adaptive Evaluation for Adversarial Robustness
strategic thinkingSander explains the best way to gauge adversarial robustness is an adaptive evaluation where your defence faces an attacker that learns and improves over time.
Vegas Cybertruck Bombing Planner Used ChatGPT
case studies lessonsA Vegas bombing plotter reportedly jailbroke ChatGPT for bomb-building instructions, showing lone-user exploitation risk.