Why it matters in practice
Prompt Injection & Jailbreaks matters because it shapes how an operator scopes the work, chooses validation steps, prioritizes evidence and explains risk. The point is not to accumulate trivia; it is to understand which control boundary is in play and how that boundary can fail under realistic pressure.
This note keeps prompt injection & jailbreaks tied to offensive workflow: what to observe, what to prove, what usually goes wrong, and which references remain useful once an assessment moves from planning into active validation.
Primary coverage
- Direct injection against the visible chat surface.
- Indirect injection through RAG, browsing, imported files and helpdesk content.
- System prompt extraction and policy leakage.
- Safety-evasion chains that rely on roleplay, translation, summarisation or format-shifting.
- Output steering where the model convinces another component or analyst to take an unsafe step.
Selected public references
Good reporting preserves the exact payload, the preconditions, the response pattern, the trust boundary crossed and the downstream consequence. That is what turns a jailbreak into a security finding instead of a screenshot.
Selected public references
- OWASP Top 10 for LLM Applications 2025Useful framing for prompt injection, insecure output handling and sensitive information disclosure.
- MITRE ATLASTechnique mapping relevant to AI attacks.
- OWASP Gen AI Security ProjectCurrent project material and guidance.
