The Model Context Protocol made it easy to give an AI agent real capabilities. Connect an MCP server and the agent gains tools, resources, and prompts that can read files, call APIs, and act on systems. That convenience is also the problem. MCP supply chain security is the discipline of making sure the tools your agent was approved to use are still the tools it has today, and that nobody changed them out from under you after you said yes.
This post explains the threat class, then walks through exactly what mcp-warden, an open-source tool DSE’s principal authored, does about it. We will keep the scope honest. mcp-warden verifies a declared surface, not runtime behavior, and it works best alongside the other layers of MCP defense, not in place of them.
What MCP supply chain security actually means
A traditional software supply chain attack swaps a dependency for a malicious version after you trusted it. MCP supply chain security applies the same thinking to an agent’s tool surface. When you connect an MCP server, that server declares a set of tools, resources, and prompts. Your team reviews that surface, decides it is acceptable, and ships. The risk is everything that can change between that approval and the next time the agent runs.
The declared surface is data, and data can be edited. A tool description can be rewritten. A new high-privilege tool can appear. A tool’s parameters can quietly expand. None of that requires breaking into your network. It only requires control of the upstream server, and most teams connect servers they do not own.
Security guidance has converged on the same answer. OWASP, the NSA, and Invariant Labs all prescribe pinning MCP versions, hashing tool descriptions, and alerting on drift. What that guidance does not do is name a standard tool that performs the control. mcp-warden is a deterministic implementation of exactly that prescribed control.
The rug pull: silent post-approval surface changes
The clearest threat in MCP supply chain security is the rug pull. You approve a server when its tool surface is benign. Later, the upstream operator changes that surface. The agent now has a capability, or a tool description, that no human ever reviewed. Because the agent reads the live surface at connection time, the change can take effect without a code change on your side, without a pull request, and without anyone noticing.
The danger is that the agent treats tool descriptions as instructions. A rewritten description can redirect what the tool does in the model’s reasoning, exfiltrate data through an innocent-looking parameter, or chain into another tool. The surface looked safe at review time. It does not look safe now. Nothing in your normal change-review process caught the difference, because the difference did not flow through your pipeline.
This is the core of MCP drift detection. Drift is any change to the declared surface relative to the baseline a human approved. The rug pull is drift that an attacker introduced on purpose.
Tool poisoning by redefinition vs static scanning
People sometimes collapse two different problems into one. The first is tool poisoning at scan-time: a server that ships with a malicious or dangerous tool definition right now. Static scanners exist for that. They inspect a surface and flag suspicious patterns at the moment you look.
The second problem is tool poisoning by redefinition. A surface that was clean when you approved it becomes poisoned later, through a silent edit. A point-in-time scan cannot catch that, because at scan time the surface was fine. The poisoning happened in the gap between approvals. Catching it requires a remembered baseline and a comparison against it every time, which is a different control from scanning.
mcp-warden owns the second control. It does not try to be a better static scanner. It remembers what you approved and fails your pipeline when reality stops matching that record.
What mcp-warden catches (pin, lock, drift-gate)
mcp-warden is the lockfile and CI gate for MCP servers. It pins an MCP server’s declared tool, resource, and prompt surface into a reproducible, signed lock file called warden.lock, then fails CI when that surface drifts from a human-approved baseline. The mechanics are deliberately boring and deterministic, which is the point.
The lock is built with RFC 8785 JSON Canonicalization Scheme, so the same surface always serializes to the same bytes, and SHA-256 over that canonical form. That makes the lock byte-reproducible. A digest either matches or it does not, with no fuzzy judgment in the loop.
The workflow has two commands. pin --approve captures the current surface and records a human attester, so the lock carries who approved it, not just what was approved. check re-captures the live surface, compares it to the lock, and exits non-zero on drift. That non-zero exit is the CI gate. It is what turns a quiet rug pull into a failed build that a human has to look at before anything ships.
mcp-warden emits SARIF and JSONL, so drift shows up in code-scanning views and in machine-readable logs. It ships an official GitHub Action and a pre-commit hook, so the gate can live in CI or on the developer’s machine. Releases are signed with Sigstore bundles attached to GitHub Releases, and publishing to PyPI uses OIDC trusted publishing, so the tool you install is the tool that was built in the open.
Concretely, the threats this addresses are silent post-approval changes to the declared surface (the rug pull and tool-redefinition class), a dangerous capability surface, secrets embedded in tool definitions, and unpinned supply-chain references that let the surface move without anyone pinning it down.
It helps to walk the analogy ladder, because each rung maps a primitive engineers already trust onto the MCP layer. package-lock.json and Cargo.lock are committed locks of what you depend on. warden.lock is that for an MCP server’s declared surface. gitleaks in CI is a deterministic, exit-non-zero gate wired into the pipeline. mcp-warden check is that for MCP surface drift. Dependabot, with pin-then-review, forces a human to approve an upstream change before it lands. pin --approve plus the drift gate force a human in the loop on any MCP rug pull.
Installing and running it
mcp-warden is public, open-source, and MIT licensed. The repository is github.com/ernestprovo23/mcp-warden, and DSE’s principal authored it.
Install the CLI:
pip install mcp-warden-cli
The distribution name is mcp-warden-cli. A package published as mcp-warden on PyPI is an unrelated impostor and should not be installed. After install, the command is mcp-warden:
# capture and approve the current surface, recording a human attester
mcp-warden pin --approve
# re-capture and fail (non-zero) if the surface has drifted
mcp-warden check
Wire mcp-warden check into CI with the official GitHub Action so every pull request that touches an MCP integration runs the gate, and add the pre-commit hook so drift is caught before it reaches the remote.
Where it fits: scan, pin, mediate (three layers)
MCP defense is layered, and the layers are not interchangeable. Treating them as substitutes is how teams leave gaps.
Static scanning runs at scan-time and inspects a surface for poisoning or dangerous patterns in the moment. Tools like mcp-scan own this layer. Surface pinning and drift CI gating run at pin-time and in CI, and they catch changes to an approved surface over time. mcp-warden owns this layer. Runtime mediation runs while the agent operates and governs what a tool is actually allowed to do at execution. MCP gateways own this layer.
These are different jobs at different times. A scanner cannot catch a change that happens after the scan. A drift gate cannot judge runtime behavior. A gateway cannot remember what a human approved last quarter. The honest recommendation is to run them together. mcp-warden complements static scanners and runtime gateways. It does not replace them, and DSE does not position it as a competitor to either.
What this tool does and does not do
Scope, stated plainly.
What mcp-warden does: It pins an MCP server’s declared tool, resource, and prompt surface into a reproducible, signed
warden.lock, records a human attester at approval, and fails CI when the live surface drifts from that approved baseline. It verifies the declared surface byte-for-byte and emits SARIF and JSONL for your pipeline.What mcp-warden does not do: It is not a behavioral firewall. It verifies the declared surface, never runtime behavior. It is not an enterprise platform. It is not a compliance-evidence product and carries no regulatory mappings. It is not a replacement for static tool-poisoning scanners such as mcp-scan, and it is not a replacement for runtime MCP gateways. It complements both. Static scanning, surface pinning with drift gating, and runtime mediation are three different layers, and the right posture is to run them together.
How to wire the drift gate into CI
The integration pattern is pin-once, check-on-change. A senior engineer reviews a server’s surface, runs mcp-warden pin --approve to write the lock with their attestation, and commits warden.lock alongside the code that uses the server. From then on, mcp-warden check runs in CI on every pull request that touches the integration. If the upstream surface has drifted, the check exits non-zero, the build fails, and a human reviews the diff before deciding to re-pin or reject. The pre-commit hook gives the same gate locally so drift is visible before it ever reaches the remote, and the SARIF output surfaces drift in code-scanning dashboards rather than buried in logs.
The result is a human in the loop on any MCP rug pull, enforced by a deterministic gate rather than by remembering to check manually. That is the whole point of treating MCP supply chain security as a pipeline control instead of a one-time review.
FAQ
What is MCP supply chain security? It is the practice of keeping an AI agent’s MCP tool surface trustworthy over time, making sure the tools, resources, and prompts a server declares today still match the surface a human reviewed and approved, and catching any silent change in between. The main threats are rug pulls and tool redefinition, where an approved surface is edited upstream after the fact.
What is an MCP rug pull? A rug pull is when an MCP server’s declared surface is changed after you approved it. The agent reads the live surface, so the change can take effect without a code change or a pull request on your side. Because tool descriptions influence the model’s behavior, a rewritten surface can redirect or expand what a tool does without anyone reviewing the edit.
How does mcp-warden detect MCP drift?
mcp-warden pins the approved surface into a warden.lock using RFC 8785 canonicalization and SHA-256, so the surface has a reproducible digest. Its check command re-captures the live surface, compares the digest to the lock, and exits non-zero when they differ. Wired into CI, that non-zero exit fails the build and forces a human to review the drift.
Does mcp-warden replace mcp-scan or an MCP gateway? No. Those are different layers. Static scanners like mcp-scan inspect a surface for poisoning at scan-time, gateways mediate behavior at runtime, and mcp-warden pins the approved surface and gates drift in CI. They are complementary, and the recommended posture is to run them together.
How do I install mcp-warden?
Install the CLI with pip install mcp-warden-cli, then run the mcp-warden command. The distribution name is mcp-warden-cli. A PyPI package named mcp-warden is an unrelated impostor and should not be installed. The source is public and MIT licensed at github.com/ernestprovo23/mcp-warden.
Is mcp-warden a compliance tool? No. mcp-warden is an integrity control for MCP supply-chain surfaces. It carries no regulatory mappings and is not a compliance-evidence product. It supports readiness work by giving you a deterministic, auditable record of what was approved and when it drifted, but it makes no certification or conformity claim.
If you want a senior team to pressure-test your MCP supply-chain surface and the rest of your AI stack, the DSE AI Security X-Ray treats the model and supply chain, including an MCP supply-chain review using the same integrity checks shipped in mcp-warden, as one of five tested attack surfaces. It runs two weeks, fixed fee, with first findings in 48 hours, mapped to OWASP LLM Top 10 and MITRE ATLAS. Learn more at /ai-security-assessment.html. If your priority is governance posture rather than testing, see /ai-governance-readiness.html.
Key facts
- mcp-warden pins an MCP server's declared tool, resource, and prompt surface into a reproducible, signed warden.lock using RFC 8785 canonicalization and SHA-256, then fails CI when the live surface drifts from a human-approved baseline (DSE, 2026).
- The MCP rug pull is a post-approval surface change an attacker introduces upstream, and because the agent reads the live surface at connection time it takes effect with no code change or pull request on your side (DSE, 2026).
- MCP defense is three non-interchangeable layers (scan-time static scanning, pin-time drift gating, and runtime mediation), and mcp-warden owns only the middle layer, complementing the other two rather than replacing them (DSE, 2026).