shipping production AI · since 2026 NAICS 541330 / 541511 / 541512 / 541519  ·  CMMC-aware
Refinery Report / AI Security / post · cklist
AI SecurityMCP SecuritySecurity ChecklistSupply Chain

MCP Security Checklist: Pin, Hash, and Gate Drift

A vendor-neutral MCP security checklist: inventory and approve servers, pin the declared surface, hash tool definitions, scan for poisoning, and mediate at runtime.

D
By the DSE practice team
Operator-led practice · how we research & review
June 14, 2026
16 min · 3,448 words

By the DSE practice team · published June 14, 2026 · reviewed June 14, 2026

The Model Context Protocol turns an AI agent into something that acts. Connect a server and the agent gains tools, resources, and prompts that can read files, call APIs, and change state. Most teams secure that connection once, at the moment they say yes, and never again. This MCP security checklist is the opposite posture. It treats an agent’s tool surface as a living dependency that has to be inventoried, pinned, scanned, watched for drift, and mediated at runtime, the same way you already treat code and packages.

This is a vendor-neutral checklist. It names tool categories honestly, because no single tool covers every item. Static scanners, surface pinning with drift gating, and runtime gateways are three different layers, and the right posture is to run all three. Where a specific implementation is useful to ground a control, we name one, including mcp-warden, an open-source tool DSE’s principal authored. But mcp-warden is one item’s implementation, not the point of the list.

Why MCP needs its own security checklist

You already have a software supply chain checklist. You pin dependency versions, you commit lockfiles, you scan for secrets, you gate CI. MCP needs its own version of that discipline because the threat model is adjacent but not identical, and the existing controls do not reach it.

The difference is where the risk lives. With a normal dependency, the code you trusted is the code that runs, and a lockfile freezes it. With an MCP server, the agent reads a declared surface (tools, resources, prompts) at connection time, and that surface is data the upstream operator controls. They can rewrite a tool description, add a high-privilege tool, or expand a parameter, and the agent will pick it up. None of that flows through your pull request pipeline, so none of your normal change review sees it.

Security guidance has converged on the same prescription. OWASP, the NSA, and Invariant Labs all point to pinning MCP versions, hashing tool descriptions, and alerting on drift. What that guidance generally does not do is hand you an ordered set of controls and tell you which layer each one defends. That is the gap this MCP security checklist fills.

A checklist is necessary but not sufficient, and it helps to say so up front. Working the list does not make a system secure. It makes the known failure modes visible and gives a human the chance to catch them. The sections below are ordered as controls, not as trivia.

Inventory and approve your MCP servers

You cannot secure a surface you have not enumerated. The first control is an inventory, and most teams discover here that the real list is longer than the one in their heads. Developers wire up MCP servers locally, in side projects, and in shadow integrations that never went through review.

Control 1 is enumerate every MCP server the organization connects, owned and third-party, including local developer connections. Record the server’s origin, who owns it on your side, and what it is allowed to touch. Treat a server you do not control as higher risk by default, because its surface can move without your involvement.

Control 2 is approve each server’s declared surface explicitly, with a named human attester. Approval is not “we use this server.” Approval is “a person looked at this exact set of tools, resources, and prompts and accepted it.” The attester matters because the whole rest of the list compares reality back to a human decision, and a decision with no owner cannot be defended later.

Control 3 is right-size the surface before you approve it. If a server exposes ten tools and the agent needs two, do not approve ten. Disable or scope down what you do not use, because every approved tool is something that can later be redefined against you. Least privilege at the tool level is the cheapest control on this list and the one most often skipped.

The output of this stage is a written, owned inventory where every connected server maps to a named approval. Everything downstream depends on that baseline existing.

Pin the declared surface (and what pinning does not cover)

Once a surface is approved, freeze it. Control 4 is pin the approved tool, resource, and prompt surface into a committed artifact, the way you commit package-lock.json or Cargo.lock. A pinned surface gives you a fixed reference: this is exactly what a human said yes to, byte for byte.

This is where a specific implementation helps make the control concrete. mcp-warden captures an MCP server’s declared surface into a reproducible lock file called warden.lock and records who approved it. It builds the lock with RFC 8785 JSON Canonicalization and SHA-256, so the same surface always serializes to the same bytes and the same digest. The approval is human-attested, so the lock carries who said yes, not just what they said yes to.

# install the CLI (the distribution name is mcp-warden-cli)
pip install mcp-warden-cli

# capture and approve the current surface, recording a human attester
mcp-warden pin --approve

One caution on the install: a PyPI package published as the bare name mcp-warden is an unrelated impostor and should not be installed. The distribution name is mcp-warden-cli, and after install the command is mcp-warden. The source is public and MIT licensed at github.com/ernestprovo23/mcp-warden.

Now the honest part, because pinning is necessary, not sufficient. A pin freezes the declared surface. It does not validate that the surface was safe to begin with, and it says nothing about what the tool does when it actually runs. Pinning a poisoned tool just gives you a faithful record of a poisoned tool. Pinning a tool that behaves badly at runtime does not constrain that behavior at all. The pin defends one thing well: it remembers exactly what a human approved. The next two controls cover the gaps it leaves on either side.

Hash tool definitions and alert on drift

A pin is only useful if something checks reality against it. Control 5 is hash the tool definitions and re-check them on a schedule and in CI, alerting on any drift from the approved baseline. Drift is any change to the declared surface relative to what a human approved. A rewritten tool description, a new tool, an expanded parameter: all drift, and an attacker-introduced drift is the rug pull.

This control wants to be deterministic, not fuzzy. You are comparing a digest of the current surface against the digest you pinned. They either match or they do not. A point-in-time scan cannot do this job, because at scan time a later-poisoned surface looked fine; the poisoning happens in the gap between checks. Catching it requires a remembered baseline and a comparison every time.

Grounding it in an implementation again: mcp-warden check re-captures the live surface, compares it to warden.lock, and exits non-zero when they differ. That non-zero exit is the gate.

# re-capture and fail (non-zero) if the surface has drifted
mcp-warden check

Control 6 is wire that check into CI so it runs on every change that touches an MCP integration, not just when someone remembers. mcp-warden ships an official GitHub Action and a pre-commit hook, and it emits SARIF and JSONL, so drift shows up in code-scanning dashboards and machine-readable logs rather than buried in a terminal. Its releases are signed with Sigstore, so the tool you run is the tool that was built in the open. The control here is the principle, a deterministic drift gate in the pipeline, and mcp-warden is one way to satisfy it. The point is that a quiet upstream change becomes a failed build that a human has to look at before anything ships.

What drift gating does not do: it watches the declared surface, not runtime behavior, and it does not judge whether a brand-new, never-approved surface is dangerous on its own. That is the next control.

Scan for tool poisoning at scan-time

Pinning and drift gating defend against change over time. They assume the surface you first approved was acceptable. Static scanning tests that assumption. Control 7 is scan MCP server surfaces for tool poisoning and dangerous patterns at scan-time, before you approve them and whenever you onboard a new server.

Tool poisoning at scan-time means a server that ships with a malicious or dangerous tool definition right now: a description that smuggles instructions to the model, a tool that exfiltrates through an innocent-looking parameter, a capability that has no business being there. A static scanner inspects the surface and flags those patterns in the moment you look. Tools like mcp-scan own this layer.

This is a genuinely different job from drift gating, and conflating them leaves a gap. A scanner catches what is dangerous now. A drift gate catches what changed since you approved it. A surface can pass a scan, get approved and pinned, and then be poisoned by a later redefinition that the original scan never had a chance to see. Run the scanner at approval time and on every new server, and keep the drift gate running continuously after.

Control 8 is feed scanner findings into the same approval decision from the inventory stage. A flagged surface is not approved until a human resolves the finding. Scanning that produces a report nobody acts on is theater. The value is that a poisoning pattern blocks the approval, and the named attester owns that call.

What scanning does not do: it is point-in-time by nature, so it cannot speak to surfaces changed after the scan, and it inspects declared definitions rather than governing what a tool is permitted to do once it runs. The last layer covers execution.

Mediate behavior at runtime with a gateway

Everything above governs the declared surface: what tools say they are. None of it constrains what a tool actually does when the agent calls it. Control 9 is mediate tool behavior at runtime with an MCP gateway, so execution is governed by policy rather than by trust in the surface.

A runtime gateway sits between the agent and the servers it calls and enforces policy at execution: which tools can be invoked, with what arguments, against which resources, at what rate, with what data allowed to leave. This is the layer that contains a tool that was approved and pinned and scanned clean but is still asked to do something it should not, whether through a prompt injection, a chained tool call, or an over-broad capability that looked fine on paper. MCP gateways own this layer.

Runtime mediation is not a substitute for the earlier controls, and they are not a substitute for it. A scanner cannot catch a change after the scan. A drift gate cannot judge runtime behavior. A gateway cannot remember what a human approved last quarter or tell you the surface drifted. Each defends a different point in the lifecycle: scan-time, pin-time and CI, and execution time. Run them together. That is the entire reason this checklist is vendor-neutral: no single category covers all three, and a posture built on one of them is a posture with two open layers.

Control 10 is define the runtime policy from the same approved inventory, so the gateway enforces the least-privilege scope you set at approval rather than a default-allow posture. The inventory you built in control 1 is what makes the gateway’s policy meaningful instead of arbitrary.

Review before you re-pin (human in the loop)

The controls above will, by design, surface changes. A drift gate fails a build. A scanner flags a new server. A gateway denies a call. The final control is what you do with those signals, and it is the one that makes the rest worth running.

Control 11 is keep a human in the loop on every re-pin. When the drift gate fails, do not re-pin to make it green. Review the diff. Decide whether the change is a legitimate upstream update or a rug pull, and only then re-approve with a fresh attestation. The whole architecture exists to force this review; re-pinning reflexively throws the protection away and launders an attacker’s change into your approved baseline.

Control 12 is treat the inventory, the locks, and the approvals as living records, not one-time setup. Servers get added, surfaces change for legitimate reasons, owners move teams. Re-run the scan when you onboard, re-attest when you re-pin, and reconcile the inventory on a cadence so it does not drift from reality the way the surfaces it tracks can. A stale inventory is how a shadow MCP server ends up in production with nobody’s name on it.

The pattern across the whole list is pin-once, check-on-change, review-before-re-pin, mediate-always. A senior engineer approves and pins a surface, CI checks it on every change, a static scan vets new and onboarding servers, a gateway governs execution, and a human reviews any drift before it is re-approved. That is the shape of treating MCP as a pipeline-and-runtime control instead of a one-time decision.

What this checklist covers / What it does not cover.

What it covers: It orders the known MCP failure modes into controls a senior team can work: inventory and named approval, surface pinning, deterministic drift gating in CI, scan-time poisoning detection, runtime mediation, and human-reviewed re-pinning. Worked end to end, it makes those failure modes visible and puts a human in the loop on the ones that matter.

What it does not cover: A checklist is necessary, not sufficient. It is point-in-time guidance, and working it does not equal being secure. It will not find a novel attack class it does not enumerate, it does not replace adversarial testing of the surrounding LLM application, and it carries no compliance, certification, or conformity claim. The tool categories it names (static scanners, pinning with drift gating, runtime gateways) are complementary layers, not interchangeable products. Treat this as the floor a competent team starts from, then test the parts a list cannot reach. For where checklists go quiet, see the sibling post on real testing.

FAQ

What is an MCP security checklist? It is an ordered set of controls for keeping an AI agent’s MCP tool surface trustworthy over its lifecycle: inventory and approve every server, pin the approved surface, hash tool definitions and gate drift in CI, scan for poisoning at scan-time, mediate behavior at runtime with a gateway, and keep a human in the loop on any re-pin. It exists because MCP surfaces are upstream-controlled data that can change after approval without flowing through your normal code review.

Does running an MCP security checklist make my deployment secure? No. A checklist is necessary but not sufficient. It is point-in-time guidance that makes known failure modes visible and puts a human in the loop, but it cannot find an attack class it does not enumerate, and it does not replace adversarial testing of the LLM application around the MCP layer. Treat it as the floor, then test the parts a list cannot reach.

What are the three layers of MCP defense, and why do I need all of them? Static scanning at scan-time catches poisoning and dangerous patterns in the moment (tools like mcp-scan). Surface pinning with drift gating in CI catches changes to an approved surface over time (mcp-warden is one implementation). Runtime mediation with a gateway governs what a tool can actually do at execution. They defend different points in the lifecycle and are not interchangeable, so a posture built on one leaves two layers open.

Where does mcp-warden fit on this checklist? It is one implementation of the pin and drift-gate controls. It captures an MCP server’s declared surface into a reproducible, signed warden.lock with a human attester using RFC 8785 canonicalization and SHA-256, and mcp-warden check fails CI when the live surface drifts from that baseline. It verifies the declared surface, not runtime behavior, so it complements static scanners and runtime gateways rather than replacing them. Install with pip install mcp-warden-cli (a bare mcp-warden package on PyPI is an unrelated impostor), source at github.com/ernestprovo23/mcp-warden.

Is this checklist a compliance or certification standard? No. It is practitioner guidance for reducing MCP attack surface, aligned with the pin, hash, and drift principles in OWASP, NSA, and Invariant Labs guidance. It carries no certification or conformity claim. It supports readiness work by making approvals and drift auditable, but working the list is not a compliance attestation.


Want this checklist run against your actual MCP footprint by a senior team, not just self-assessed? Start with the DSE AI Security X-Ray. It treats your model and supply chain, including an MCP review using the same integrity checks shipped in mcp-warden, as one of five tested attack surfaces, runs two weeks at a fixed fee, and maps findings to OWASP LLM Top 10 and MITRE ATLAS. For the underlying threat class behind the pin-and-drift controls, see the companion post on MCP supply chain security, and for where checklists stop and real testing begins, see AI red-teaming vs a checklist scan. Start at /ai-security-assessment.html. If your priority is governance posture rather than testing, see /ai-governance-readiness.html.

Key facts

Read next · AI Security & Governance

P
Founder · Principal Engineer
Data & AI engineer · 10+ yrs hands-on

Writes most of the long-form here. Lives in the codebase. Active on GitHub and LinkedIn.

§ Next step

Not sure which of these is you?

Tell us what's broken in a paragraph and a principal reads it directly — or walk the ladder from a low-commitment first engagement up to retained work.

One long-form a week. No marketing.

Subscribe to the Refinery Report. Practitioner deep-dives on AI engineering, security, and the realities of running production systems. Unsubscribe in one click.

~12 issues / quarter