HackTheCore // AI Security

Domain overview

AI security is useful only when it moves beyond novelty. This domain focuses on prompt injection, tool misuse, retrieval abuse, memory contamination, model APIs and operational red teaming.

Good AI assessment work combines application security, API review, auth logic, cloud exposure and workflow abuse with model-specific pressure. Prompt injection, context poisoning, output steering, authority confusion, unsafe tool invocation, retrieval exfiltration and agent compromise are all just different ways of asking whether language can seize control of automation.

Primary operator questions

Can untrusted content override or reshape the hidden instruction hierarchy?
Can a retrieved document, email, web page or ticket poison the model's planning path?
Can the assistant call tools, query data stores or send actions with more authority than it should?
Can model output be trusted by code, analysts or business workflows without verification?
Can the system be pushed from harmless chat into data exposure, lateral movement or destructive action?

Red-team pressure lines

Useful pressure usually follows five lanes. First, instruction attacks: direct prompt injection, indirect prompt injection, jailbreak chaining and system prompt leakage. Second, retrieval attacks: poisoned corpora, malicious documents, embedded instructions and confidence laundering through RAG. Third, agent abuse: unauthorized tool use, connector overreach, action replay and confirmation bypass. Fourth, API and inference weaknesses: weak auth, file-handling mistakes, quota abuse, plugin boundaries and tenant leakage. Fifth, reporting discipline: proving whether the behavior is reachable, repeatable and tied to real business impact.

Related certification context

These certifications are not the point of the domain, but they are useful orientation anchors for operators who want a formal practice path beside the field notes.

OffSec OSAI+ / AI-300 · Advanced AI Red TeamingClosest fit for offensive work against LLMs, agents, RAG and AI infrastructure.
OffSec OSCP+ / PEN-200 · Penetration Testing with Kali LinuxUseful baseline for scoping, evidence handling and exploitation discipline.
OffSec OSWE / WEB-300 · Advanced Web Attacks and ExploitationStrong adjacent context because most AI systems still fail at classic web, API and trust-boundary controls.

Selected public references

OWASP Gen AI Security ProjectProject home for LLM and GenAI security guidance.
OWASP Top 10 for LLM Applications 2025Useful risk framing for prompt injection, insecure output handling, sensitive information disclosure and model abuse.
OWASP Top 10 for Agentic Applications 2026Agent-focused risk framing for autonomous planning, tool invocation and multi-step workflow compromise.
MITRE ATLASAdversarial tactics and techniques mapped to AI-enabled systems.
NIST AI RMF 1.0Operational risk-management framing for AI systems.
NIST SP 800-218ASecure development practices for generative AI and dual-use foundation models.