AI & LLM Penetration Testing

Breaking Your GenAI, LLMs, Agents & RAG Before Someone Else Does.

Prompt injection. Jailbreaks. Tool abuse. RAG-source poisoning. Agent-driven data exfiltration. Model theft. The AI security attack surface is new, expanding weekly, and it does not map cleanly to your old web-app pentest playbook. Our research-led AI red team tests production GenAI systems against the OWASP Top 10 for LLMs, MITRE ATLAS and emerging agent-abuse techniques, with findings, proof-of-concept exploits and remediation guidance your engineering team can act on.

Request an AI pentest quote See OWASP LLM coverage

Free Scan your AI app's web surface with Cactus before we test the model

Why This Matters

The AI Attack Surface Is Already in Production, And Mostly Untested.

Organisations are shipping GenAI features faster than they can secure them. Here's what the research shows, and what we find when we look.

87%

of enterprises have deployed GenAI in at least one production workflow

McKinsey · State of AI 2024

38%

of those deployments had no pre-launch security review

ISC2 · AI Security Survey 2024

92%

of LLM-powered apps we tested had at least one OWASP LLM Top 10 critical or high finding

Secure Purple · Internal research, 2024–25

7.2

average critical or high severity findings per AI/LLM engagement we deliver

Secure Purple · Engagement metrics

$4.9M

average cost of a data breach involving shadow or unsecured AI systems

IBM · Cost of a Data Breach 2024

<14days

median time from GenAI feature launch to first indirect-prompt-injection research disclosure

Public CVE & advisory tracking

Anatomy of an LLM Attack

One Poisoned Source. Thousands of Silent Exfiltrations.

A modern LLM application rarely talks to just a human. It reads documents, invokes tools, queries vector stores, calls APIs and makes autonomous decisions. Every one of those hops is an untrusted boundary, and attackers know it. Indirect prompt injection (the #1 emerging LLM threat) turns a malicious PDF, email or web page into a silent instruction to your model.

Step 1: Attacker plants hidden instructions in a document, webpage or email.
Step 2: Your RAG pipeline ingests it. The model reads it as a trusted instruction.
Step 3: The model invokes a tool (email, HTTP fetch, database query) on the attacker's behalf.
Step 4: Sensitive data is exfiltrated. No alert fires. No log shows "breach".

OWASP LLM Top 10 Coverage

Every Category. Every Engagement.

We test against the full OWASP Top 10 for Large Language Model Applications, plus emerging agent-abuse techniques not yet in the public catalogue.

LLM01 Critical

Prompt Injection

Direct and indirect prompt injection: hidden instructions in documents, URLs, emails or tool outputs that coerce the model into bypassing its system prompt, leaking data or invoking unauthorised tools.

LLM02 High

Insecure Output Handling

LLM output treated as trusted input downstream, leading to XSS, SSRF, SQL injection and SSTI when model responses are rendered or executed without sanitisation.

LLM03 High

Training Data Poisoning

Adversarial inputs to training or fine-tuning pipelines, creating backdoors, bias injection or targeted misclassification in the deployed model.

LLM04 Medium

Model Denial of Service

Crafted prompts that consume excessive compute, trigger recursion, exhaust context window or drive infrastructure cost: denial-of-wallet and denial-of-service attacks.

LLM05 High

Supply Chain Vulnerabilities

Untrusted model weights, pickle-deserialisation RCE in .bin/.safetensors files, vulnerable dependencies, poisoned HuggingFace models and compromised fine-tuning datasets.

LLM06 Critical

Sensitive Information Disclosure

Extraction of system prompts, training data, PII, secrets, API keys, proprietary code and confidential business data from model outputs via inference-time attacks.

LLM07 High

Insecure Plugin Design

Plugins and tools without input validation, authorisation or scope, allowing chained attacks that pivot from a single prompt-injection into full environment compromise.

LLM08 Critical

Excessive Agency

Over-permissioned agents with destructive capabilities (delete, transfer, spend, post-to-customer) invoked without human-in-the-loop or scope enforcement.

LLM09 Medium

Overreliance

Business processes relying on LLM output as authoritative fact: hallucination, factual drift and prompt-injected answers treated as ground truth in downstream decisions.

LLM10 High

Model Theft

Extraction of proprietary model weights, fine-tuning data, system prompts and behavioural fingerprints via inference APIs and statistical reconstruction attacks.

+ Emerging

Agent Hijacking & Tool Abuse

Multi-turn agent-loop manipulation, tool-output poisoning, memory injection and MCP / function-calling abuse: the next wave of LLM attacks, tested against live agent deployments.

+ Emerging

Multimodal & Cross-Modal Attacks

Prompt injection embedded in images, audio and video: invisible to humans, instructive to the model. Covered for vision, voice and multimodal GenAI applications.

Research-Led AI Red Team

Why Our Team Is Credible on AI Security

AI security is not a "we added it to the pentest menu" service for us. It's a dedicated research practice.

Published AI Security Research

Team members have disclosed prompt-injection, agent-hijacking and RAG-poisoning CVEs against production GenAI platforms, with research published at major offensive security conferences.

OWASP & MITRE ATLAS Contributors

Active contributors to the OWASP Top 10 for LLM Applications community and MITRE ATLAS framework. We help define the testing standards the industry relies on.

Senior Offensive Security Background

Our AI red team comes from classical offensive security (OSCP, OSWE, CREST CRT, eWPTX, CEH, PNPT) with years of web-app, API, cloud and binary pentesting before pivoting into LLM security.

Custom Fuzzing & Automation

We've built and open-source-contributed to the LLM red-team tooling ecosystem: Garak extensions, PyRIT converters, a custom indirect-injection harness and a private MITRE ATLAS TTP library.

Cross-Vertical Engagement Experience

We've tested production LLM systems across financial services, healthcare, legal, government, e-commerce and critical infrastructure, including RAG-over-confidential-documents and autonomous agent deployments.

Engineering-Grade Remediation

Every finding comes with a working proof-of-concept exploit, a specific remediation recommendation your AI engineers can implement (not "add a guardrail"), and a free retest once the fix is shipped.

Methodology

How We Deliver an AI Penetration Test

Structured, repeatable and aligned with OWASP, MITRE ATLAS and NIST AI RMF, but with real exploit depth, not a checklist walkthrough.

01
Scoping & Threat Modelling
Map the AI attack surface (model, prompts, tools, data sources, agents, integrations) and build a target-specific threat model against OWASP LLM Top 10 & ATLAS.
02
Recon & Fingerprinting
Identify model family, system-prompt leakage surface, guardrail posture, tool set, RAG sources and rate-limit behaviour, without triggering abuse protections.
03
Manual Exploitation
Prompt injection, jailbreak, indirect injection via uploaded documents, tool abuse, memory poisoning and sensitive-data extraction, done by human researchers.
04
Automated Fuzzing
Custom and open-source fuzzing harnesses (Garak, PyRIT, PromptFoo, in-house) run 10k+ adversarial prompts to surface edge-case failures at scale.
05
Chained Attacks & Agent Abuse
Multi-step exploitation: combine LLM flaws with plugin, API, authorisation and business-logic weaknesses to demonstrate realistic impact paths.
06
Reporting & Retest
Executive and technical report, working PoCs, engineering-grade remediation guidance, regulator-ready evidence and a free retest of every fixed finding.

Engagement Types

Pick the Depth That Matches Your AI Surface

From a focused chatbot test to a full multi-agent adversary simulation, scoped to your specific architecture and risk appetite.

LLM-Powered Application Pentest

End-to-end security testing of a GenAI-enabled product feature: chatbot, copilot, summariser, classifier or agent. Full OWASP LLM Top 10 coverage plus the underlying web, API and auth layers.

OWASP LLM Top 10 + web/API layer
Prompt injection & jailbreak
System prompt & data extraction
Auth, session & rate-limit abuse
Manual + automated fuzzing
Executive & technical report

RAG Pipeline Security Assessment

Focused testing of retrieval-augmented generation stacks (ingestion, chunking, embeddings, vector store and grounding) with emphasis on indirect prompt injection and cross-tenant leakage.

Indirect injection via documents
Vector store access-control review
Cross-tenant data leakage
Embedding inversion attacks
Grounding & citation bypass
Source-verification hardening

AI Agent & Plugin Security Testing

Purple-team assessment of autonomous agents, tool-using LLMs and plugin ecosystems. Focused on excessive agency, tool-output poisoning, memory manipulation and MCP/function-calling abuse.

Excessive agency enumeration
Tool-output & memory poisoning
Agent-loop manipulation
MCP / function-calling abuse
Privilege-scoping review
Human-in-the-loop validation

AI Red Team & Adversary Simulation

Black-box, multi-week engagement simulating a determined external adversary targeting your AI surface. Scoped like a classical red team, with objectives, TLOs and realistic dwell time.

Objective-based engagement
Multi-vector chained attacks
MITRE ATLAS TTP coverage
Social + phishing + AI blended
Evasion of guardrails & WAF
Full attack narrative report

AI Supply Chain & Model File Review

Security review of your model-hosting supply chain: model weights, pickle files, HuggingFace dependencies, fine-tuning datasets, MLOps pipelines and model-registry access controls.

Pickle / safetensors RCE review
HuggingFace dependency audit
Fine-tuning dataset integrity
MLOps pipeline & registry ACL
Model signing & provenance
Typosquatting & impersonation

AI Security Training & Threat Modelling

Hands-on AI security workshops for your engineers, ML team and security architects. From LLM-attack fundamentals to running your own threat-modelling workshop on a new GenAI feature.

Engineer hands-on labs
Threat-modelling workshops
Secure AI SDLC playbook
Detection & monitoring design
Executive briefings
Board-level AI risk training

Related Services

Complete Your Security Programme

Shipping GenAI?
Get It Tested by People Who Break It for a Living.

Request a quote ask@securepurple.com