§ LLM Security Testing·RAG · copilots · agents

LLM security testing for systems where prompts can move data or tools.

DSE tests LLM applications beyond the chat box: prompt injection, sensitive-data disclosure, RAG isolation, tool abuse, excessive agency, output handling, and runtime cost controls. Findings map to the OWASP LLM Top 10 and MITRE ATLAS and include reproducible evidence.

Scope LLM security testing → See full assessment →

Prompt injection RAG isolation Tool abuse OWASP LLM Top 10

Coverage

We test the places LLM applications actually break.

A scanner cannot prove whether your agent will leak data, call the wrong tool, retrieve another tenant's document, or loop through spend. We use hands-on adversarial testing and capture the evidence.

LLM01

Prompt injection

Direct and indirect injection against prompts, retrieved content, documents, email, and tool outputs.

LLM02

Data leakage

System prompt leakage, sensitive context exposure, cross-user disclosure, and unsafe logging paths.

RAG

Retrieval isolation

Tenant boundaries, metadata filters, poisoning paths, chunk attribution, and source-grounding behavior.

Agents

Tool abuse

Excessive agency, unsafe function calls, permission escalation, tool chaining, and confirmation bypass.

Runtime

Cost and loop bounds

Token ceilings, tool-call limits, retry behavior, rate limits, and denial-of-wallet scenarios.

Evidence

Remediation proof

Reproducible payloads, transcripts, severity, control mapping, and retest notes where scoped.

Scope Fit

Which LLM testing route fits the system?

System	Risk	Recommended scope
Internal chatbot	Prompt injection, data leakage, unsafe advice.	Focused LLM security X-Ray.
RAG assistant	Cross-tenant retrieval, poisoning, ungrounded answers.	RAG isolation and prompt-injection testing.
Agent workflow	Tool abuse, excessive agency, unauthorized action.	Full AI red-team sprint.
MCP/tool ecosystem	Tool-surface drift, unsafe connector behavior, supply-chain risk.	MCP and tool-supply-chain review.

Scope

Tell us what the model can read and what it can do.

Useful scoping details: model/provider, whether RAG is used, tools/functions available, data classes touched, staging access, and test boundaries. We will return fit, next questions, or a fixed-fee testing scope.

Start scoping →

LLM security testing is a point-in-time, sampling-based assessment of the scoped system. It is not a guarantee that every flaw is found, not a certification, and not 24/7 monitoring or managed detection.