In the last post, we looked at evaluation and reliability — how to build the feedback loops that tell you your AI system is working correctly, and how to catch drift before your users do.
This post is about what happens when you don’t catch it in time. Or when something goes wrong that no evaluation framework anticipated. Or when a model update, a data pipeline failure, or an edge case that was never in your test set surfaces in production at the worst possible moment.
AI incidents are not a question of if. They are a question of when, and whether you are prepared.
When a traditional service goes down, you usually know about it immediately. Alerts fire, dashboards light up with errors, and the on-call engineer knows exactly where to look.
AI incidents rarely give you that courtesy. They don’t break loudly; they degrade quietly. A prompt that has worked perfectly for six months might suddenly start rambling because the underlying API provider silently updated the model weights. A vector database might miss a crucial daily sync, causing your RAG pipeline to serve outdated answers with absolute confidence. None of these trigger a standard health-check alert. They just slowly erode user trust until someone finally notices.
You have to completely redefine what an “incident” looks like. If your system is up but generating plausible garbage, you are in an active incident, even if AWS says everything is green.
Understanding what kind of incident you are dealing with determines how you respond.
Quality degradation. Outputs have become measurably worse — less accurate, less relevant, less grounded — without any obvious system-level failure. Typically caused by model drift, data drift, or a change in query distribution that your system was not designed to handle. The hardest category to detect and the most common in production systems that have been running for more than a few months.
Hallucination events. The model has generated confident, coherent, and factually wrong information that reached users. In low-stakes contexts this is an embarrassment. In regulated industries — clinical, financial, legal — it can have material consequences. Hallucination events often trace back to retrieval failures: the model did not find relevant grounded content and generated from its own weights instead. They can also trace back to prompt design, where ambiguous instructions created space for the model to fill gaps with invention.
Safety and policy violations. Output that violates your content policies or brand guidelines, or that causes harm to users. This category requires the fastest response (immediate containment before the scope widens) and the most careful postmortem. Policy violations are often a signal that your guardrails were not as robust as you believed.
Infrastructure and pipeline failures. Closest to traditional software incidents — a broken data pipeline, a failed embedding job, an index that stopped updating, a rate limit that caused silent fallbacks to degraded behaviour. Generally the easiest to diagnose, but surprisingly hard to detect if monitoring is not in place, because the AI layer may continue to respond normally while serving stale or incomplete data.
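Because the AI layer keeps answering while its data goes stale, a freshness check on the index is one of the cheaper signals to add. The sketch below assumes your pipeline records the timestamp of its last successful sync. The function name and the 26-hour threshold (a daily sync plus some slack) are illustrative assumptions, not values from any particular vector database.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: a daily sync plus two hours of slack.
MAX_INDEX_AGE = timedelta(hours=26)


def index_is_stale(last_sync: datetime, now: datetime,
                   max_age: timedelta = MAX_INDEX_AGE) -> bool:
    """Return True when the last successful sync is older than max_age."""
    return (now - last_sync) > max_age


# Example: the check runs at noon; one index synced overnight, one missed two syncs.
now = datetime(2024, 1, 3, 12, 0, tzinfo=timezone.utc)
synced_overnight = datetime(2024, 1, 3, 2, 0, tzinfo=timezone.utc)
missed_two_syncs = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)

print(index_is_stale(synced_overnight, now))  # False (10 hours old)
print(index_is_stale(missed_two_syncs, now))  # True (58 hours old)
```

Wiring the result into the same alerting path as your infrastructure checks means a missed sync pages someone instead of waiting for a user to notice.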
AI incident response follows the same broad phases as traditional incident response, but each phase has AI-specific considerations that standard runbooks do not cover.
Detect. The prerequisite is having something to detect with. If you do not have output quality monitoring running alongside your infrastructure monitoring, you will learn about quality degradation from a user complaint or a stakeholder who noticed something wrong in a demo. Both are late signals. Build detection for output quality — even a simple sampling-and-review process catches more than nothing — and treat a sudden drop in quality scores with the same urgency as a service outage.
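As a minimal sketch of what output quality detection can look like, the function below compares the mean of recently sampled review scores against a baseline period. It assumes reviewers (or an automated grader) score sampled outputs on a positive scale such as 0 to 1. The 10% relative-drop threshold and the function name are assumptions to tune for your own system.

```python
from statistics import mean


def quality_dropped(baseline_scores: list[float],
                    recent_scores: list[float],
                    max_relative_drop: float = 0.10) -> bool:
    """Flag when the recent mean falls more than max_relative_drop below baseline."""
    if not baseline_scores or not recent_scores:
        raise ValueError("both score lists must be non-empty")
    baseline = mean(baseline_scores)
    if baseline <= 0:
        raise ValueError("baseline mean must be positive")
    recent = mean(recent_scores)
    relative_drop = (baseline - recent) / baseline
    return relative_drop > max_relative_drop


# Scores from sampled reviews: last month's baseline versus the last day.
baseline = [0.90, 0.90, 0.90]
print(quality_dropped(baseline, [0.88, 0.90, 0.86]))  # False (about a 2% drop)
print(quality_dropped(baseline, [0.70, 0.75, 0.80]))  # True (about a 17% drop)
```

A comparison of means is deliberately crude. It will not catch a failure confined to one query category, so in practice you would likely run it per category as well as overall.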
Contain. Options include: rolling back to a previous model version if a model update is the suspected cause; disabling affected features while the investigation runs; switching to a fallback behaviour — a simpler model, a static response, a human handoff — that limits exposure while you diagnose; or increasing the human review rate for outputs in the affected category. The key is having these options pre-defined rather than improvising them under pressure.
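One way to make containment options pre-defined is to put them behind a single switch that on-call can flip without a deploy. The sketch below is an assumption about how that might look: the mode names, the messages, and the `respond` function are all hypothetical, and the two model callables stand in for whatever clients you actually use.

```python
from enum import Enum
from typing import Callable


class ContainmentMode(Enum):
    NORMAL = "normal"
    FALLBACK_MODEL = "fallback_model"    # simpler or previous model version
    STATIC_RESPONSE = "static_response"  # feature effectively disabled
    HUMAN_HANDOFF = "human_handoff"


STATIC_MESSAGE = "This feature is temporarily unavailable."
HANDOFF_MESSAGE = "Your request has been passed to a member of our team."


def respond(query: str,
            mode: ContainmentMode,
            primary: Callable[[str], str],
            fallback: Callable[[str], str]) -> str:
    """Route a query according to the containment mode currently in force."""
    if mode is ContainmentMode.NORMAL:
        return primary(query)
    if mode is ContainmentMode.FALLBACK_MODEL:
        return fallback(query)
    if mode is ContainmentMode.HUMAN_HANDOFF:
        return HANDOFF_MESSAGE
    return STATIC_MESSAGE


# Stand-in model clients for illustration only.
primary = lambda q: f"primary answer to: {q}"
fallback = lambda q: f"fallback answer to: {q}"

print(respond("reset my password", ContainmentMode.FALLBACK_MODEL, primary, fallback))
```

In a real system the mode would be read from a feature flag or config store, so that anyone on the rotation can change it.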
Diagnose. This is where the observability infrastructure discussed in Part 3 becomes critical. When did this start? What changed around that time — model version, data pipeline, query distribution, prompt? Can you reproduce the failure on a known input? Is it affecting all queries or a specific category? The ability to trace outputs back to inputs, retrieve logs, and replay requests is what turns a diagnosis from a guessing game into an investigation.
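The "what changed around that time" question is easier to answer if model, prompt, and pipeline changes are recorded in one place. As a rough sketch, assuming you keep such a change log, the helper below lists the changes that landed in a window before the incident onset, most recent first. The `ChangeEvent` type, the 48-hour lookback, and the sample events are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ChangeEvent:
    when: datetime
    kind: str          # e.g. "model_version", "prompt", "pipeline"
    description: str


def changes_before(onset: datetime,
                   events: list[ChangeEvent],
                   lookback: timedelta = timedelta(hours=48)) -> list[ChangeEvent]:
    """Return changes within `lookback` before onset, most recent first."""
    window_start = onset - lookback
    candidates = [e for e in events if window_start <= e.when <= onset]
    return sorted(candidates, key=lambda e: e.when, reverse=True)


# Illustrative change log; quality scores started dropping on 10 May at 09:00.
log = [
    ChangeEvent(datetime(2024, 5, 1, 10, 0), "prompt", "reworded system prompt"),
    ChangeEvent(datetime(2024, 5, 9, 18, 0), "model_version", "provider model update"),
    ChangeEvent(datetime(2024, 5, 10, 8, 0), "pipeline", "embedding job failed"),
]
for event in changes_before(datetime(2024, 5, 10, 9, 0), log):
    print(event.when.isoformat(), event.kind, event.description)
```

The output is a list of suspects, not a diagnosis. You still need to reproduce the failure on a known input to confirm which change is responsible.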
Remediate. The fix depends entirely on the diagnosis. Model rollback if a provider update caused a regression. Prompt revision if a prompt change introduced unexpected behaviour. Pipeline repair if data freshness is the issue. Index rebuild if coverage gaps are causing hallucinations. Whatever the fix, validate it against the specific failure case before declaring the incident resolved — AI systems have a tendency to pass the tests you run and fail the ones you didn’t think to run.
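Validating against the specific failure case can be as simple as replaying the prompts that failed and checking for the content that was missing or wrong. The sketch below assumes a case-insensitive substring check is a good enough pass criterion, which it often is not. `generate` stands in for your real model call, and the cases are invented for illustration.

```python
from typing import Callable


def failed_cases(generate: Callable[[str], str],
                 cases: list[tuple[str, str]]) -> list[str]:
    """Return the prompts whose output lacks the expected text.

    Each case is a (prompt, expected_substring) pair taken from the incident.
    """
    return [
        prompt
        for prompt, expected in cases
        if expected.lower() not in generate(prompt).lower()
    ]


# Stand-in for the remediated system; replace with a real model call.
def generate(prompt: str) -> str:
    return "The invoice is due on 3 March 2024."


incident_cases = [
    ("When is the invoice due?", "3 March 2024"),
    ("Who is the payer?", "Acme"),
]
print(failed_cases(generate, incident_cases))  # ['Who is the payer?']
```

Once the incident is closed, these cases belong in your regular evaluation set so the same regression is caught before it ships next time.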
Communicate. More on this below, because communication is where AI incidents most frequently go wrong.
Review. Every significant AI incident deserves a postmortem that goes beyond root cause analysis. What did our evaluation framework miss? What monitoring would have caught this earlier? What does this reveal about the assumptions baked into our system design? AI postmortems should update your evaluation rubric, your monitoring thresholds, and your incident runbook — not just fix the immediate problem.
Traditional incident communication is largely about system status: what is down, what is being done, when will it be back up. AI incident communication is more complicated, because the nature of the failure is harder to explain, the boundaries are fuzzier, and the trust implications are different.
Be specific about scope. “Our AI system experienced an issue” tells users nothing useful. “Our document summarisation feature produced incorrect summaries for queries involving dates between X and Y” is actionable. Users need to know whether they are affected — vagueness creates more anxiety than a clear, bounded description of the problem.
Acknowledge promptly, even before you have answers. Users and stakeholders can handle being told that something went wrong and you are investigating. What damages trust is silence followed by a full explanation that arrives too late, or a disclosure that feels like it was held back. You do not need a complete picture to make first contact — you need to be first.
Distinguish between what the system did and what that means. “The model generated incorrect information” is not the same as “the model is unreliable.” The first is a specific incident with a specific cause. The second is a general character judgement that, once planted, is hard to uproot. Teams under pressure often over-explain in ways that accidentally make the second claim while trying to make the first. Keep incident communication focused on the specific failure, not on broader conclusions about the system.
Tell affected users before they tell you. If you know which users received incorrect outputs, reach out. A message that says “we identified an issue affecting responses you received on [date], here is what we know, here is what you should verify” is far better than a user discovering the error themselves and wondering whether you knew — and chose not to say anything.
The worst time to design your AI incident response process is during an incident. When you are under pressure, with stakeholders asking for updates and users potentially affected, it is not the moment to discover that you have not agreed on what counts as a severity-one AI incident, who has the authority to roll back a model version, or how to communicate externally about an AI failure.
Severity definitions. What constitutes a P1 versus a P2 versus a P3 AI incident? At minimum: safety and policy violations are P1, significant quality degradation affecting a material proportion of users is P2, localised or low-impact quality issues are P3. Your definitions should fit your product and your risk profile, but they need to exist and be agreed before an incident occurs.
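Writing the severity definitions down as code is one way to force agreement on the fuzzy parts. The sketch below encodes the minimum definitions above. The 5% cut-off for a "material proportion" of users is purely an assumed placeholder, and the function and constant names are hypothetical.

```python
from enum import Enum


class Severity(Enum):
    P1 = 1
    P2 = 2
    P3 = 3


# Assumed placeholder: agree your own threshold for "material proportion".
MATERIAL_USER_SHARE = 0.05


def classify(is_safety_or_policy_violation: bool,
             affected_user_share: float) -> Severity:
    """Map an incident to a severity using the minimum definitions."""
    if is_safety_or_policy_violation:
        return Severity.P1
    if affected_user_share >= MATERIAL_USER_SHARE:
        return Severity.P2
    return Severity.P3


print(classify(True, 0.001).name)   # P1
print(classify(False, 0.20).name)   # P2
print(classify(False, 0.01).name)   # P3
```

Note that a policy violation is P1 regardless of how few users saw it, which matches the containment-first stance for that category.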
On-call ownership. Who is responsible for AI system health outside business hours? This is often unclear in organisations where AI is layered onto an existing engineering team — the ML engineer who built the model, the platform engineer who runs the infrastructure, and the product manager who owns the feature may each think someone else is watching. Clarity here prevents the gap that turns a two-hour incident into a twelve-hour one.
Containment options. Pre-defined, pre-tested options for each incident category. Model rollback procedure, feature disable switch, fallback configuration. These should be documented, tested outside production, and executable by anyone on the on-call rotation — not just the person who built the system.
Communication templates. Draft internal and external communication for the common incident scenarios. They will need editing for the specifics of any real incident, but having a starting point reduces both the time to first communication and the likelihood of saying something you will later regret.
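A template can also enforce the "be specific about scope" rule mechanically. The sketch below uses Python's standard `string.Template`, whose `substitute` method raises `KeyError` when a placeholder is left unfilled, so a notice cannot go out with the scope missing. The wording and field names are illustrative assumptions to adapt.

```python
from string import Template

# Hypothetical user-facing notice; adapt the wording to your product.
USER_NOTICE = Template(
    "We identified an issue affecting $feature responses you received on $date. "
    "What we know: $known. What you should verify: $verify."
)


def render_user_notice(**fields: str) -> str:
    """Fill the template; raises KeyError if any placeholder is missing."""
    return USER_NOTICE.substitute(**fields)


print(render_user_notice(
    feature="document summarisation",
    date="12 March",
    known="summaries involving date ranges may be incorrect",
    verify="any dates quoted in summaries from that day",
))
```

The template is only a starting point. Someone still needs to read the rendered message before it is sent.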
Postmortem process. Who facilitates, who attends, what the output looks like, where it is stored, and how follow-up actions are tracked. A postmortem that produces a document nobody reads is not worth the time it took to write.
Having a polished runbook is great, but the engineering culture executing it is what actually matters.
Teams that handle AI incidents well share a few characteristics. They treat incidents as learning events rather than blame events — which means people surface issues early rather than hoping they will resolve themselves. They distinguish clearly between a system that made a mistake and an engineer who made a poor decision, which allows honest postmortems. And they accept that AI systems, by their nature, will occasionally fail in ways that could not have been anticipated — which means the measure of a team’s quality is not whether incidents happen, but how quickly they are caught, how well they are handled, and what changes as a result.
The AI systems that organisations trust most over time are not the ones that have never failed. They are the ones that have failed, been caught, been fixed, and come back better. That track record of handled failure is a more durable foundation for trust than any amount of pre-launch optimism.
If you are putting AI into production and want to make sure your incident response is ready before you need it — or if you have already had an incident and want to understand what it revealed about your architecture — get in touch. Getting this right is the last step in building AI systems you can genuinely stand behind.