A third-party AI vendor risk assessment for a bank should verify the vendor’s AI governance program, how it handles your data and whether your data trains its models, the transparency of the model and its documentation, the security testing it runs against AI-specific attacks, its sub-processor and fourth-party AI dependencies, and its commitments to notify you of model changes. It must also map where your own deployment-side controls remain, because a vendor certificate covers the vendor, not how you configure, prompt, integrate, and monitor the tool. The checklist below groups every item by assessment domain so a risk team can request and verify each one directly.
This guide is for the CRO, CCO, head of third-party or vendor risk, and head of model risk at a bank or fintech onboarding or reviewing an AI vendor. That includes Microsoft Copilot, Azure OpenAI, and the long tail of niche AI SaaS tools that arrive through business units rather than procurement. The framing applies the June 2023 interagency third-party guidance to AI, and it is a practitioner orientation, not legal or regulatory advice.
Why AI vendor risk is distinct from traditional vendor risk
A traditional vendor assessment assumes the product you evaluated is the product you deployed, and that it behaves the same way next quarter. AI vendors break that assumption in ways that matter for a regulated institution. The behavior of a model can change after the contract is signed, without any visible change to the contract, the interface, or the SLA.
Five differences make AI vendor risk its own category. Model behavior drifts as the vendor retrains or tunes, so the system you validated is a moving target. Training-data provenance is often opaque, which means you may not know what the model learned or whether it carries copyright, bias, or privacy exposure into your use. Outputs are non-deterministic, so the same input can return different answers and fixed test cases do not fully characterize the system.
The last two differences are the ones that surprise risk teams. Sub-processor model swaps mean your vendor can change the underlying foundation model under you, turning a stable dependency into a new one overnight. And data leakage into training is a live concern, because if your prompts or documents feed the vendor’s model improvement loop, your confidential and customer data can surface beyond your control. None of these five has a clean analog in a SOC 2 report for a payroll vendor.
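The non-determinism point can be made concrete with a small measurement. The sketch below is a minimal illustration, not a validation method: it sends the same prompt repeatedly and reports how often the most common answer comes back. `agreement_rate` and `fake_vendor_model` are hypothetical names, and the stub stands in for whatever client your vendor actually provides.

```python
import random
from collections import Counter
from typing import Callable


def agreement_rate(call_model: Callable[[str], str], prompt: str, runs: int = 20) -> float:
    """Fraction of runs that returned the single most common answer for one prompt."""
    answers = Counter(call_model(prompt) for _ in range(runs))
    most_common_count = answers.most_common(1)[0][1]
    return most_common_count / runs


# Hypothetical stand-in for a vendor model: the answer varies between runs.
_rng = random.Random(7)


def fake_vendor_model(prompt: str) -> str:
    return _rng.choice(["approve", "approve", "approve", "refer"])


rate = agreement_rate(fake_vendor_model, "Summarize applicant risk", runs=200)
print(f"agreement rate: {rate:.2f}")
```

A rate well below 1.0 on a prompt that feeds a decision is a signal that a single passing test case says little about production behavior.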
Applying the June 2023 interagency third-party guidance to AI
The Interagency Guidance on Third-Party Relationships: Risk Management, issued by the OCC, Federal Reserve, and FDIC on 6 June 2023, is the supervisory backbone here. It organizes third-party risk across a lifecycle, and each stage takes on AI-specific weight when the third party is an AI vendor. The point is to apply the lifecycle, not just cite it.
Planning is where you decide whether an AI tool belongs in a given process at all, and at what risk tier. A copilot drafting internal email is not the same risk as a model scoring credit applications, and the planning stage is where that distinction gets set. Due diligence and third-party selection is where the checklist below does most of its work, because an AI vendor demands questions a generic due-diligence questionnaire never asks.
Contract negotiation is where you turn answers into enforceable terms: data-use restrictions, a no-train commitment on your data, notice of model version changes, security testing evidence, sub-processor disclosure, and audit rights. Ongoing monitoring is where AI vendor risk diverges most sharply from traditional vendor risk. Because the model can change between reviews, monitoring must catch model version changes and behavior drift, not just uptime. Termination planning covers data deletion, return of any fine-tuned artifacts, and an exit path that does not strand a critical process on a model you can no longer govern.
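As a rough sketch of what catching a model version change can look like in monitoring, the check below compares the model identifier and version observed in production against the one recorded at validation. All names here (`ValidatedBaseline`, `needs_revalidation`, the example identifiers) are hypothetical, and it assumes the vendor exposes a model identifier and version at all, which is itself something to confirm in due diligence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidatedBaseline:
    vendor: str
    model_id: str       # identifier the vendor reported when you validated
    model_version: str  # version you actually validated


def needs_revalidation(baseline: ValidatedBaseline, observed_id: str, observed_version: str) -> bool:
    """True when the model serving production differs from the validated one."""
    return (observed_id, observed_version) != (baseline.model_id, baseline.model_version)


# Illustrative values only; not real vendor or model names.
baseline = ValidatedBaseline(vendor="example-vendor", model_id="model-a", model_version="2024-01")

unchanged = needs_revalidation(baseline, "model-a", "2024-01")
changed = needs_revalidation(baseline, "model-a", "2024-06")
print(unchanged, changed)
```

A check like this covers only declared version changes. Silent behavior drift still needs output-level monitoring on your side.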
What to assess: the seven domains
The checklist is organized into seven assessment domains. Six cover the vendor. The seventh, and the one institutions most often skip, covers the controls that stay with you no matter how strong the vendor is.
The vendor-facing domains are the vendor’s AI governance program, data handling and training-data provenance, model transparency and documentation, AI-specific security testing, sub-processor and fourth-party AI dependencies, and incident and change notification. The seventh domain is your own deployment-side controls: how you configure the tool, design and version prompts, integrate it into workflows, gate its outputs, and monitor it in production. A vendor cannot hold those controls for you, and an assessment that ignores them measures half the risk.
The trap: a vendor certificate is not your coverage
Here is the failure that quietly undermines otherwise careful programs. A vendor presents a certificate, for example Microsoft’s ISO/IEC 42001:2023 certification, and the assessment treats it as coverage of the institution’s own AI risk. It is not. ISO/IEC 42001:2023 is a real and useful upstream attestation that the vendor runs an AI management system, but it is an input to your assessment, never a substitute for your own deployment-side controls.
The reason is the division of responsibility. A vendor certificate or attestation speaks to how the vendor builds and operates its model. It says nothing about how you configure the tool, what system prompt and guardrails you wrap around it, what data you pipe into it, how you integrate its outputs into a credit or fraud decision, or how you monitor it for drift. Those are your controls, and an examiner will expect you to evidence them regardless of which standards the vendor is certified against. Treat every vendor attestation as an upstream input that reduces some diligence burden, then assess your own configuration, integration, and monitoring as a separate, owned layer.
The actual checklist
Work through each domain. Each item is phrased as something to verify or request from the vendor, or to evidence on your own side. Tier the depth to the use case: a model touching credit, fraud, or customer decisions warrants every item, while a low-stakes internal copilot can be assessed proportionately.
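The tiering rule above can be written down so it is applied the same way to every vendor. This is a minimal sketch under stated assumptions: the high tier follows the text (credit, fraud, or customer decisions), while the moderate tier and the `handles_customer_data` criterion are illustrative additions, not part of any guidance. `Tier`, `assign_tier`, and `full_checklist_required` are hypothetical names.

```python
from enum import Enum


class Tier(Enum):
    HIGH = "high"
    MODERATE = "moderate"
    LOW = "low"


# Use-case areas the text treats as warranting every checklist item.
HIGH_STAKES_AREAS = {"credit", "fraud", "customer_decision"}


def assign_tier(touches: set[str], handles_customer_data: bool) -> Tier:
    """Assign a risk tier from what the use case touches (illustrative rule)."""
    if touches & HIGH_STAKES_AREAS:
        return Tier.HIGH
    if handles_customer_data:  # assumption: a middle tier, not from the source text
        return Tier.MODERATE
    return Tier.LOW


def full_checklist_required(tier: Tier) -> bool:
    """High-tier use cases get every item; lower tiers are assessed proportionately."""
    return tier is Tier.HIGH


credit_scoring = assign_tier({"credit"}, handles_customer_data=True)
email_copilot = assign_tier({"internal_email"}, handles_customer_data=False)
print(credit_scoring, email_copilot)
```

Keeping the rule in one place makes the tier decision reviewable, which matters more than the exact thresholds chosen here.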
Domain 1: Vendor AI governance program
- Request the vendor’s written AI governance policy and confirm it names accountable owners for model risk.
- Verify the vendor maintains a model inventory and can tell you which model and version serves your use.
- Confirm the vendor has a defined process for testing models before release and for approving changes.
- Request any independent attestation or certification (for example ISO/IEC 42001:2023) and record it as an upstream input, not as coverage of your own controls.
- Verify the vendor has named accountability for AI incidents and a path to reach it.
Domain 2: Data handling and training-data provenance
- Verify in writing whether your prompts, documents, and outputs are used to train or improve the vendor’s models, and require a no-train commitment where the data is sensitive.
- Confirm where your data is stored and processed, including region, and whether it leaves your tenant.
- Request the vendor’s description of its training-data provenance and any controls against copyrighted, biased, or personal data entering the model.
- Verify data retention and deletion terms, including how long prompts and outputs are retained and whether you can require deletion.
- Confirm encryption in transit and at rest, and tenant isolation between your data and other customers’.
Domain 3: Model transparency and documentation
- Request model documentation: intended use, known limitations, evaluation results, and the model and version identifier.
- Verify the vendor discloses the underlying foundation model and whether it is first-party or sourced from another provider.
- Confirm the vendor documents known failure modes, including hallucination behavior and bias testing results, for your use case.
- Request evidence of fair-lending or bias testing where the tool touches credit, pricing, or underwriting.
Domain 4: AI-specific security testing
- Request evidence of red-teaming against the deployed model, not only a generic penetration test of the surrounding app.
- Verify the vendor tests for prompt injection and tests its guardrails against jailbreaks.
- Confirm the vendor assesses against the OWASP Top 10 for LLM applications or an equivalent AI-specific threat list.
- Request the cadence of this testing and whether it reruns after material model changes.
- Verify how the vendor handles and discloses AI-specific vulnerabilities once found.
Domain 5: Sub-processor and fourth-party AI dependencies
- Request a current list of sub-processors, specifically any third-party model or inference providers behind the vendor’s product.
- Verify your right to advance notice before the vendor swaps the underlying model or adds a new AI sub-processor.
- Confirm the data-handling and no-train commitments flow down to those sub-processors.
- Assess concentration risk where many of your AI vendors sit on the same one or two foundation models.
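Concentration risk is easy to miss because each vendor looks independent on its own contract. As a small sketch, assuming you already record which foundation model sits behind each vendor, the function below computes the share of your AI vendors resting on each one. The vendor and model names are placeholders, and `foundation_model_concentration` is a hypothetical helper.

```python
from collections import Counter


def foundation_model_concentration(vendor_models: dict[str, str]) -> dict[str, float]:
    """Share of vendors that depend on each underlying foundation model."""
    total = len(vendor_models)
    if total == 0:
        return {}
    counts = Counter(vendor_models.values())
    return {model: n / total for model, n in counts.items()}


# Placeholder inventory: vendor name -> disclosed foundation model.
inventory = {
    "vendor_a": "foundation_model_x",
    "vendor_b": "foundation_model_x",
    "vendor_c": "foundation_model_y",
    "vendor_d": "foundation_model_x",
}

shares = foundation_model_concentration(inventory)
for model, share in sorted(shares.items()):
    print(f"{model}: {share:.0%}")
```

What share counts as too concentrated is a judgment for your risk appetite; the sketch only makes the dependency visible.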
Domain 6: Incident and change notification
- Verify contractual commitments to notify you of model version changes, deprecations, and material behavior changes.
- Confirm the notice window is long enough for you to revalidate before the change reaches production.
- Verify security-incident and data-breach notification timelines and that they cover AI-specific incidents.
- Confirm a deprecation and end-of-life path so a retired model does not strand a live process.
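Whether a notice window is long enough is simple arithmetic once you know how long your own revalidation takes. The sketch below assumes you track the date notice arrived, the date the change takes effect, and an internal estimate of revalidation time in days. `notice_is_sufficient` and the example dates are hypothetical.

```python
from datetime import date, timedelta


def notice_is_sufficient(notice_received: date, change_effective: date, revalidation_days: int) -> bool:
    """True when the gap between notice and change covers your revalidation time."""
    return change_effective - notice_received >= timedelta(days=revalidation_days)


# Illustrative dates: 30 days of notice against a 45-day revalidation cycle.
short_notice = notice_is_sufficient(date(2025, 3, 1), date(2025, 3, 31), revalidation_days=45)
enough_notice = notice_is_sufficient(date(2025, 3, 1), date(2025, 5, 1), revalidation_days=45)
print(short_notice, enough_notice)
```

Running the same check against the contractual minimum notice period, before signing, shows whether the term is workable for a high-tier use case.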
Domain 7: Your own deployment-side controls
- Document and version the system prompt, guardrails, and configuration you wrap around the vendor tool.
- Define and evidence how the tool integrates into each decision, including where a human reviews or overrides output.
- Establish monitoring on your side for output quality and drift, independent of the vendor’s own monitoring.
- Restrict and log what data your users and systems send to the tool, enforcing the no-train and data-minimization posture.
- Maintain your own record mapping this vendor and use case to your model inventory and risk tier, so the dependency is governed, not invisible.
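One lightweight way to evidence that the system prompt, guardrails, and configuration are versioned is to fingerprint them together, so any change produces a new identifier you can log next to outputs. This is a sketch, not a prescribed control: `config_fingerprint` is a hypothetical helper, and the example prompt and settings are placeholders.

```python
import hashlib
import json


def config_fingerprint(system_prompt: str, guardrails: dict, settings: dict) -> str:
    """Stable SHA-256 fingerprint of the deployment-side configuration."""
    payload = json.dumps(
        {"system_prompt": system_prompt, "guardrails": guardrails, "settings": settings},
        sort_keys=True,      # key order must not change the fingerprint
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Placeholder configuration values for illustration.
v1 = config_fingerprint(
    "Answer only from the provided policy documents.",
    {"block_pii": True, "max_output_tokens": 500},
    {"temperature": 0.2},
)
v2 = config_fingerprint(
    "Answer only from the provided policy documents. Cite the section.",
    {"block_pii": True, "max_output_tokens": 500},
    {"temperature": 0.2},
)
print(v1[:12], v2[:12])
```

Storing the fingerprint in your model inventory record ties each production output back to the exact configuration that produced it.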
| Assessment domain | Core question to answer | Primary lifecycle stage |
|---|---|---|
| Vendor AI governance | Does the vendor govern its own models, and can it prove it? | Due diligence |
| Data handling and provenance | Where does our data go, and does it train their model? | Due diligence, contract |
| Model transparency | What is the model, what are its limits, and is it documented? | Due diligence |
| AI-specific security testing | Has the deployed model been red-teamed and tested for injection? | Due diligence |
| Sub-processor dependencies | Who sits behind the vendor, and can the model swap under us? | Due diligence, monitoring |
| Incident and change notice | Will we be told before the model changes or fails? | Contract, monitoring |
| Our deployment-side controls | What stays our responsibility no matter how good the vendor is? | Planning, monitoring |
What this guide is / What it is not
What it is: A practitioner checklist for assessing third-party AI vendors at a bank or fintech, organized around the June 2023 interagency third-party guidance and the AI-specific risks a generic vendor process misses.
What it is not: It is not legal or regulatory advice, a certification, or a guarantee of any examination or audit outcome. The applicability of any framework to a specific vendor or use case is the output of a real assessment. DSE prepares organizations for audit and examination and strengthens their examiner-facing posture. We do not certify, and we do not guarantee any regulatory or exam result.
FAQ
What should a third-party AI vendor risk assessment for a bank include?
It should verify the vendor’s AI governance program, how it handles your data and whether your data trains its models, the model’s transparency and documentation, AI-specific security testing such as red-teaming and prompt-injection testing, its sub-processor and fourth-party AI dependencies, and its commitments to notify you of model changes. It must also map where your own deployment-side controls remain, including how you configure, prompt, integrate, and monitor the tool, because a vendor certificate covers the vendor and not your deployment.
How is AI vendor risk different from traditional vendor risk?
AI vendor risk is distinct because model behavior can change after the contract is signed, training-data provenance is often opaque, outputs are non-deterministic, the vendor can swap the underlying foundation model under you, and your data can leak into the vendor’s training loop. A traditional vendor assessment assumes the product you evaluated is the product you deployed, which does not hold for AI vendors.
Does Microsoft’s ISO 42001 certification cover our bank’s AI risk?
No. ISO/IEC 42001:2023 certification is a useful upstream attestation that the vendor operates an AI management system, but it is an input to your assessment, not a substitute for your own deployment-side controls. It says nothing about how you configure the tool, what guardrails you wrap around it, what data you send it, how you integrate its outputs into a decision, or how you monitor it for drift. Those controls remain yours to design and evidence.
How does the June 2023 interagency third-party guidance apply to AI vendors?
The Interagency Guidance on Third-Party Relationships: Risk Management, issued by the OCC, Federal Reserve, and FDIC on 6 June 2023, organizes third-party risk across planning, due diligence and third-party selection, contract negotiation, ongoing monitoring, and termination. For AI vendors, due diligence adds AI-specific questions, contracts add no-train and model-change-notice terms, and ongoing monitoring must catch model version changes and behavior drift, not just uptime.
What contract terms should a bank require from an AI vendor?
Key terms include a commitment that your data is not used to train the vendor’s models, disclosure of data storage region and tenant isolation, advance notice before model version changes, deprecations, or swaps of the underlying foundation model, disclosure of AI sub-processors with the same commitments flowed down, evidence of AI-specific security testing, and AI-aware incident and breach notification timelines. These turn diligence answers into enforceable obligations.
The Bottom Line
AI vendor risk is not a slightly harder version of traditional vendor risk; it is a different shape. The model can change after you sign, your data can train someone else’s model, the foundation model can be swapped under you, and outputs are non-deterministic by design. The June 2023 interagency third-party guidance still gives you the right lifecycle, but each stage needs AI-specific questions, terms, and monitoring bolted on, and ongoing monitoring is where AI vendor risk diverges most from the uptime-and-SLA world.
The single most expensive mistake is treating a vendor certificate, such as a Microsoft ISO/IEC 42001:2023 certification, as coverage of your own deployment. It is an upstream input, never a substitute for the controls you own: configuration, prompts, integration, output gating, and monitoring. Run the seven-domain checklist on every AI vendor, tier it to the use case, and keep your deployment-side controls as a named, owned layer. That is how a bank or fintech governs an AI vendor instead of trusting it.
Key facts
- The Interagency Guidance on Third-Party Relationships: Risk Management, issued jointly by the OCC, Federal Reserve, and FDIC on 6 June 2023, frames third-party risk across a five-stage lifecycle of planning, due diligence and third-party selection, contract negotiation, ongoing monitoring, and termination (DSE, 2026).
- An AI vendor’s behavior can change after contract signing through model version updates, sub-processor model swaps, and silent foundation-model changes, so a one-time onboarding assessment cannot establish that the model in production is still the model you evaluated (DSE, 2026).