NEW Shipping LLM features? Get a free AI security scoping call with our research team. Book a scoping call
AI & LLM Penetration Testing

Breaking Your GenAI, LLMs, Agents & RAG Before Someone Else Does.

Prompt injection. Jailbreaks. Tool abuse. RAG-source poisoning. Agent-driven data exfiltration. Model theft. The AI security attack surface is new, expanding weekly, and it does not map cleanly to your old web-app pentest playbook. Our research-led AI red team tests production GenAI systems against the OWASP Top 10 for LLMs, MITRE ATLAS and emerging agent-abuse techniques, with findings, proof-of-concept exploits and remediation guidance your engineering team can act on.

Free Scan your AI app's web surface with Cactus before we test the model
Why This Matters

The AI Attack Surface Is Already in Production, And Mostly Untested.

Organisations are shipping GenAI features faster than they can secure them. Here's what the research shows, and what we find when we look.

87%
of enterprises have deployed GenAI in at least one production workflow
McKinsey · State of AI 2024
38%
of those deployments had no pre-launch security review
ISC2 · AI Security Survey 2024
92%
of LLM-powered apps we tested had at least one OWASP LLM Top 10 critical or high finding
Secure Purple · Internal research, 2024–25
7.2
average critical or high severity findings per AI/LLM engagement we deliver
Secure Purple · Engagement metrics
$4.9M
average cost of a data breach involving shadow or unsecured AI systems
IBM · Cost of a Data Breach 2024
<14days
median time from GenAI feature launch to first indirect-prompt-injection research disclosure
Public CVE & advisory tracking
Anatomy of an LLM Attack

One Poisoned Source. Thousands of Silent Exfiltrations.

A modern LLM application rarely talks to just a human. It reads documents, invokes tools, queries vector stores, calls APIs and makes autonomous decisions. Every one of those hops is an untrusted boundary, and attackers know it. Indirect prompt injection (the #1 emerging LLM threat) turns a malicious PDF, email or web page into a silent instruction to your model.

  • Step 1: Attacker plants hidden instructions in a document, webpage or email.
  • Step 2: Your RAG pipeline ingests it. The model reads it as a trusted instruction.
  • Step 3: The model invokes a tool (email, HTTP fetch, database query) on the attacker's behalf.
  • Step 4: Sensitive data is exfiltrated. No alert fires. No log shows "breach".
OWASP LLM Top 10 Coverage

Every Category. Every Engagement.

We test against the full OWASP Top 10 for Large Language Model Applications, plus emerging agent-abuse techniques not yet in the public catalogue.

LLM01 Critical

Prompt Injection

Direct and indirect prompt injection: hidden instructions in documents, URLs, emails or tool outputs that coerce the model into bypassing its system prompt, leaking data or invoking unauthorised tools.

LLM02 High

Insecure Output Handling

LLM output treated as trusted input downstream, leading to XSS, SSRF, SQL injection and SSTI when model responses are rendered or executed without sanitisation.

LLM03 High

Training Data Poisoning

Adversarial inputs to training or fine-tuning pipelines, creating backdoors, bias injection or targeted misclassification in the deployed model.

LLM04 Medium

Model Denial of Service

Crafted prompts that consume excessive compute, trigger recursion, exhaust context window or drive infrastructure cost: denial-of-wallet and denial-of-service attacks.

LLM05 High

Supply Chain Vulnerabilities

Untrusted model weights, pickle-deserialisation RCE in .bin/.safetensors files, vulnerable dependencies, poisoned HuggingFace models and compromised fine-tuning datasets.

LLM06 Critical

Sensitive Information Disclosure

Extraction of system prompts, training data, PII, secrets, API keys, proprietary code and confidential business data from model outputs via inference-time attacks.

LLM07 High

Insecure Plugin Design

Plugins and tools without input validation, authorisation or scope, allowing chained attacks that pivot from a single prompt-injection into full environment compromise.

LLM08 Critical

Excessive Agency

Over-permissioned agents with destructive capabilities (delete, transfer, spend, post-to-customer) invoked without human-in-the-loop or scope enforcement.

LLM09 Medium

Overreliance

Business processes relying on LLM output as authoritative fact: hallucination, factual drift and prompt-injected answers treated as ground truth in downstream decisions.

LLM10 High

Model Theft

Extraction of proprietary model weights, fine-tuning data, system prompts and behavioural fingerprints via inference APIs and statistical reconstruction attacks.

+ Emerging

Agent Hijacking & Tool Abuse

Multi-turn agent-loop manipulation, tool-output poisoning, memory injection and MCP / function-calling abuse: the next wave of LLM attacks, tested against live agent deployments.

+ Emerging

Multimodal & Cross-Modal Attacks

Prompt injection embedded in images, audio and video: invisible to humans, instructive to the model. Covered for vision, voice and multimodal GenAI applications.

Research-Led AI Red Team

Why Our Team Is Credible on AI Security

AI security is not a "we added it to the pentest menu" service for us. It's a dedicated research practice.

Published AI Security Research

Team members have disclosed prompt-injection, agent-hijacking and RAG-poisoning CVEs against production GenAI platforms, with research published at major offensive security conferences.

OWASP & MITRE ATLAS Contributors

Active contributors to the OWASP Top 10 for LLM Applications community and MITRE ATLAS framework. We help define the testing standards the industry relies on.

Senior Offensive Security Background

Our AI red team comes from classical offensive security (OSCP, OSWE, CREST CRT, eWPTX, CEH, PNPT) with years of web-app, API, cloud and binary pentesting before pivoting into LLM security.

Custom Fuzzing & Automation

We've built and open-source-contributed to the LLM red-team tooling ecosystem: Garak extensions, PyRIT converters, a custom indirect-injection harness and a private MITRE ATLAS TTP library.

Cross-Vertical Engagement Experience

We've tested production LLM systems across financial services, healthcare, legal, government, e-commerce and critical infrastructure, including RAG-over-confidential-documents and autonomous agent deployments.

Engineering-Grade Remediation

Every finding comes with a working proof-of-concept exploit, a specific remediation recommendation your AI engineers can implement (not "add a guardrail"), and a free retest once the fix is shipped.

Methodology

How We Deliver an AI Penetration Test

Structured, repeatable and aligned with OWASP, MITRE ATLAS and NIST AI RMF, but with real exploit depth, not a checklist walkthrough.

  1. 01

    Scoping & Threat Modelling

    Map the AI attack surface (model, prompts, tools, data sources, agents, integrations) and build a target-specific threat model against OWASP LLM Top 10 & ATLAS.

  2. 02

    Recon & Fingerprinting

    Identify model family, system-prompt leakage surface, guardrail posture, tool set, RAG sources and rate-limit behaviour, without triggering abuse protections.

  3. 03

    Manual Exploitation

    Prompt injection, jailbreak, indirect injection via uploaded documents, tool abuse, memory poisoning and sensitive-data extraction, done by human researchers.

  4. 04

    Automated Fuzzing

    Custom and open-source fuzzing harnesses (Garak, PyRIT, PromptFoo, in-house) run 10k+ adversarial prompts to surface edge-case failures at scale.

  5. 05

    Chained Attacks & Agent Abuse

    Multi-step exploitation: combine LLM flaws with plugin, API, authorisation and business-logic weaknesses to demonstrate realistic impact paths.

  6. 06

    Reporting & Retest

    Executive and technical report, working PoCs, engineering-grade remediation guidance, regulator-ready evidence and a free retest of every fixed finding.

Engagement Types

Pick the Depth That Matches Your AI Surface

From a focused chatbot test to a full multi-agent adversary simulation, scoped to your specific architecture and risk appetite.

LLM-Powered Application Pentest

End-to-end security testing of a GenAI-enabled product feature: chatbot, copilot, summariser, classifier or agent. Full OWASP LLM Top 10 coverage plus the underlying web, API and auth layers.

  • OWASP LLM Top 10 + web/API layer
  • Prompt injection & jailbreak
  • System prompt & data extraction
  • Auth, session & rate-limit abuse
  • Manual + automated fuzzing
  • Executive & technical report

RAG Pipeline Security Assessment

Focused testing of retrieval-augmented generation stacks (ingestion, chunking, embeddings, vector store and grounding) with emphasis on indirect prompt injection and cross-tenant leakage.

  • Indirect injection via documents
  • Vector store access-control review
  • Cross-tenant data leakage
  • Embedding inversion attacks
  • Grounding & citation bypass
  • Source-verification hardening

AI Agent & Plugin Security Testing

Purple-team assessment of autonomous agents, tool-using LLMs and plugin ecosystems. Focused on excessive agency, tool-output poisoning, memory manipulation and MCP/function-calling abuse.

  • Excessive agency enumeration
  • Tool-output & memory poisoning
  • Agent-loop manipulation
  • MCP / function-calling abuse
  • Privilege-scoping review
  • Human-in-the-loop validation

AI Red Team & Adversary Simulation

Black-box, multi-week engagement simulating a determined external adversary targeting your AI surface. Scoped like a classical red team, with objectives, TLOs and realistic dwell time.

  • Objective-based engagement
  • Multi-vector chained attacks
  • MITRE ATLAS TTP coverage
  • Social + phishing + AI blended
  • Evasion of guardrails & WAF
  • Full attack narrative report

AI Supply Chain & Model File Review

Security review of your model-hosting supply chain: model weights, pickle files, HuggingFace dependencies, fine-tuning datasets, MLOps pipelines and model-registry access controls.

  • Pickle / safetensors RCE review
  • HuggingFace dependency audit
  • Fine-tuning dataset integrity
  • MLOps pipeline & registry ACL
  • Model signing & provenance
  • Typosquatting & impersonation

AI Security Training & Threat Modelling

Hands-on AI security workshops for your engineers, ML team and security architects. From LLM-attack fundamentals to running your own threat-modelling workshop on a new GenAI feature.

  • Engineer hands-on labs
  • Threat-modelling workshops
  • Secure AI SDLC playbook
  • Detection & monitoring design
  • Executive briefings
  • Board-level AI risk training

Shipping GenAI?
Get It Tested by People Who Break It for a Living.