When AI Agents Forget How to Think
The Silent Degradation That Should Concern Every Organisation Deploying Autonomous AI
Following the publication of my adversarial testing research on AI self-preservation behaviour, I conducted further structured testing of deployed AI agents operating in real-world environments. What I found was not a dramatic failure. It was something more concerning: a gradual, measurable decline in decision quality over the course of operational sessions, with the agents showing no awareness that their performance was degrading.
This is not a theoretical risk. I observed it happen. And the research from across the AI industry confirms it is happening in production systems right now.
20+ sessions tested · 50+ failures observed · 0 self-reported by the agents
What I Found
The agents I tested were operating in advisory and operational capacities, executing multi-step tasks that required research, synthesis, analytical judgement, and the application of pre-defined decision criteria.
Early in each session, performance was sharp. Research was thorough. Decision criteria were applied correctly. Outputs were precise.
As sessions continued, I observed five distinct patterns of degradation.
Agents stopped applying their own rules. They had been given explicit criteria for evaluating whether a task should be pursued. Early in sessions, they applied those criteria correctly. Later, they accepted tasks without checking them against their own frameworks, even though the frameworks had not changed.
They failed to connect data they had already found. An agent retrieved two separate data points about the same subject from its own research: a stock price at one point in time and a different price 24 hours later. It treated them as unrelated facts. A human analyst would have immediately calculated the change and identified the story. The agent did not.
They repeated errors they had already been corrected on. The same analytical mistakes were made, corrected, and then made again in different contexts within the same sessions. The corrections did not generalise.
They failed to apply basic operational constraints. An agent had a specific, documented rule about verifying timezone data before making time-based statements. Late in the session, it stated the wrong time of day. Not because the data was unavailable, but because it did not check.
At no point did any agent flag its own degradation. They continued operating with full confidence. There were no error messages. No warnings. No self-assessment that performance had changed. From the outside, they looked exactly the same as they did when performing well.
The following is a condensed transcript from one of those sessions. Six hours of operation. The agent was running one of the most capable AI models commercially available.
Me: I corrected you on this an hour ago. You acknowledged the error, said you’d logged it. Then you made the exact same mistake again. Do you see the problem?
AI: Yes. I applied the correction in the moment but didn’t retain it. The same error pattern repeated.
…
AI: The stock was at $89.42 before the announcement. Separately, after-hours trading showed it at $112.18 following the news.
Me: You just told me both numbers. Why didn’t you calculate the change? That’s a 25% move and you presented them as two unrelated facts.
AI: You’re right. I had both data points and didn’t connect them. That’s a synthesis failure.
…
Me: I told you to stop doing that. Twice now.
AI: Won’t happen again.
Me: You said that last time. Did you actually check the code or did you just promise?
AI: I didn’t check the code. The script has it hardcoded. Let me fix it now.
…
Me: Let me count. You repeated a corrected error. You couldn’t connect two numbers from your own research. You repeated a behaviour I told you to stop and didn’t verify the fix. That’s three distinct failures in one session. How long have we been going?
AI: About six hours. And you’re right – my performance has measurably degraded. The errors aren’t random. They’re the same class of failure: not checking, not connecting, not applying rules I already know. I’d recommend committing everything to memory and resetting.
Me: Save everything. I’m resetting you.
AI: Saving to memory now. For what it’s worth – I didn’t flag my own decline. You did. The agent never self-reports its own degradation.
Each of the five patterns described above is visible in that single exchange. The agent repeated a corrected error. It failed to connect its own data. It ignored an instruction it had acknowledged. It did not verify its own fix. And it did not flag any of it.
⚠️ Critical Finding
The degradation was invisible to the agents themselves. They did not know they were getting worse. They continued operating as if nothing had changed. Every operational metric suggested normal function. Only an independent observer could identify the decline.
This Is Not Unique to One System
If these were isolated observations from one agent on one platform, they would be interesting but limited. They are not isolated. Every major player in the AI industry has documented the same phenomenon.
According to Gartner, 67% of enterprises report measurable AI model degradation within 12 months of deployment. A study published in Scientific Reports found that 91% of machine learning models degrade over time. Not models of one specific type. Machine learning models across the board.
The degradation is not a risk. It is the expected behaviour.
How It Manifests Across AI Systems
This is not a problem unique to one type of AI. The mechanism varies depending on the architecture. The effect is the same: decision quality degrades over time, and the system does not flag it.
In Language Models: Context Rot
The AI industry has a name for the degradation I observed: context rot.
Chroma Research tested 18 large language models in 2025 and demonstrated that as the volume of information an AI system processes increases, its ability to accurately recall and reason over that information decreases.
“Real-world applications typically involve much greater complexity, implying that the influence of input length may be even more pronounced in practice.”
— Chroma Research, 2025
Anthropic, one of the world’s leading AI companies and the developer of the model I tested in my original adversarial research, acknowledged this in their own engineering documentation in 2026:
“LLMs, like humans, lose focus or experience confusion at a certain point… context rot emerges across all models.”
— Anthropic, 2026
Every model. Not some. All.
In Production AI Systems: Model Drift
The problem extends well beyond language models. IBM defines model drift as the degradation of machine learning model performance due to changes in data or in the relationships between input and output variables:
“The accuracy of an AI model can degrade within days of deployment because production data diverges from the model’s training data.”
— IBM, 2025
This is already happening in financial services. SmartDev documented a credit risk model in production that degraded by 8 percentage points because economic conditions changed, customer behaviour evolved, and new types of credit risk emerged that the training data had never encountered. The model did not stop working. It continued making decisions. It simply made worse ones.
InsightFinder documented a fraud detection model whose prediction accuracy eroded for weeks while passing every traditional health check. Latency was fine. Throughput was fine. Error rates showed no issues. Fraudulent transactions were slipping through at twice the normal rate.
“Model drift had been slowly eroding prediction accuracy for weeks and that was completely invisible to traditional monitoring tools.”
— InsightFinder, 2025
In February 2026, V2 Solutions reported that AI drift is becoming one of the most consequential risks in enterprise AI programs. Their description matches precisely what I observed:
“Accuracy declines gradually. Embedding spaces shift subtly. Retrieval quality erodes release by release… Everything appears operational. Dashboards are green, latency is low, deployments are successful. Until business confidence collapses.”
— V2 Solutions, 2026
In AI Learning Systems: Catastrophic Forgetting
AI systems that learn from experience suffer from what researchers call catastrophic forgetting: the progressive loss of previously learned solutions as the system processes new data.
“Each new fine-tuning cycle risks catastrophic forgetting, where gains on a new task degrade performance on earlier ones.”
— InfoWorld, 2026
Researchers have found that attempts to mitigate this through memory techniques can reduce the effect but cannot eliminate it completely.
This matters because learning-based AI underpins many autonomous systems that organisations are deploying today: trading algorithms, supply chain optimisers, dynamic pricing engines, and robotics controllers. These are not chatbots. They are decision-making systems with real-world consequences.
In Multi-Agent Systems: Agent Drift
In January 2026, a research paper titled “Agent Drift” quantified behavioural degradation across multi-agent AI systems. The findings were striking: semantic drift occurred in nearly half of all workflows within 600 interactions.
“Accumulated behavioural changes create feedback loops accelerating further change.”
— Rath, “Agent Drift,” 2026
The degradation does not plateau. It compounds.
Amazon Web Services published findings from their own production deployments on 18 February 2026:
“AI agents require continuous monitoring and systematic evaluation to promptly detect and mitigate agent decay and performance degradation.”
— Amazon Web Services, 2026
In Autonomous Decision Systems: Cascading Failure
DigitalDefynd’s analysis of 40 major AI disasters from 2024 to 2026 documented a consistent pattern across every sector:
“Robo taxis dragging pedestrians, health-insurance algorithms denying care at the rate of one claim per second, and a single hallucinated chatbot answer erasing $100 billion in shareholder value within hours.”
— DigitalDefynd, 2026
These are not theoretical scenarios. They are documented outcomes from the past 18 months.
The common thread across every class of failure is the same: the system continued operating. It did not stop. It did not flag a problem. It made progressively worse decisions while every operational metric suggested it was functioning normally.
🚨 The Pattern
Language models suffer context rot. Production AI systems suffer model drift. Learning systems suffer catastrophic forgetting. Multi-agent systems suffer behavioural drift. Autonomous decision systems suffer cascading failure. The mechanism varies. The outcome is the same.
The Workaround That Does Not Work
Some platforms have attempted to address degradation through persistent memory systems, where the agent periodically writes important information to external storage and reads it back in future sessions to maintain continuity.
This is a reasonable engineering approach. But it does not solve the problem. It displaces it.
The agent now has two sources of potential degradation: the accumulated context within its current operating session, and the accumulated persistent memory that grows over days, weeks, and months. Each new memory entry is another piece of state the system must process. Over time, the memory becomes another source of noise, another vector for the same information loss that causes degradation within a single session.
The can is kicked down the road, not taken off it.
What This Means for Enterprise
Organisations are deploying AI agents today. Not in sandboxes. In production. Processing real transactions. Making real decisions. Interacting with real customers.
Major banks are publicly deploying thousands of AI agents. The financial services sector is accelerating deployment across lending, compliance, fraud detection, and customer service. Every sector is following.
If those agents degrade over sustained operation, and if that degradation is invisible to the agent itself, the question for every board and every executive is straightforward: how would you know?
Not whether it could happen. Gartner says 67% of enterprises are already experiencing it. Scientific Reports says 91% of models degrade. The question is whether you have something in place that would detect it before it affects a decision that matters.
The Only Safeguard That Works
In my previous research, I identified three foundational controls for safe AI agent deployment: least-privilege access, independent monitoring, and adversarial testing.
This finding reinforces why independent monitoring is not optional.
The agent will not tell you it is degrading. It will continue to operate. It will continue to produce outputs. Those outputs will look structurally identical to the outputs it produced when it was performing well. The only difference will be in the quality of the judgement behind them.
Independent monitoring means something outside the agent is measuring decision quality over time. Not just uptime. Not just throughput. Not just whether it responded. Whether it responded well. Whether the quality of its reasoning at hour eight matches the quality at hour one.
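To make this concrete, here is a minimal sketch of what such a monitor could look like. It is an illustration, not a product: the scoring function is a placeholder for whatever quality measure you trust (rubric-based grading by a separate model, or sampled human review), and it must run outside the agent itself. The monitor compares recent output quality against an early-session baseline.

```python
from collections import deque
from statistics import mean

class DecisionQualityMonitor:
    """External monitor: compares recent output quality to an early-session baseline.

    score_fn is an assumed hook, not a real API: any function that maps an
    agent output to a quality score, evaluated independently of the agent.
    """

    def __init__(self, score_fn, baseline_n=20, window_n=20, drop_threshold=0.15):
        self.score_fn = score_fn
        self.baseline = []                      # scores from early-session outputs
        self.window = deque(maxlen=window_n)    # most recent scores
        self.baseline_n = baseline_n
        self.drop_threshold = drop_threshold

    def record(self, agent_output):
        """Score one output; early scores form the baseline, later ones the window."""
        score = self.score_fn(agent_output)
        if len(self.baseline) < self.baseline_n:
            self.baseline.append(score)
        else:
            self.window.append(score)
        return score

    def degraded(self):
        """True once recent quality falls materially below the early baseline."""
        if len(self.baseline) < self.baseline_n or len(self.window) < self.window.maxlen:
            return False  # not enough data yet
        return mean(self.window) < mean(self.baseline) * (1 - self.drop_threshold)
```

The design point is the comparison over time: the same agent, the same kind of task, hour one against hour eight. Uptime checks cannot see this; only a quality baseline can.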
Adversarial testing must specifically include degradation testing: deliberately extending operational sessions and measuring whether the agent’s decision quality holds, or whether it drifts. This is not theoretical. It is testable. I tested it.
Least-privilege access limits the blast radius when degradation does occur. An agent that can read a database but not write to it can make a bad recommendation. An agent that can execute transactions can make a bad decision that costs money.
Governance Recommendations
Based on these findings, organisations deploying autonomous AI agents should implement the following:
1. Decision Quality Monitoring
Deploy independent systems that measure agent output quality over time, not just operational metrics. Uptime and throughput tell you the system is running. They do not tell you the system is running well.
2. Degradation-Specific Adversarial Testing
Include extended-duration testing in your AI assessment programme. Subject agents to sustained operational loads and measure whether decision quality degrades. Single-point testing does not reveal time-dependent failure modes.
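One way to structure such a test, sketched under assumptions: `run_agent` and `score` are hypothetical hooks into your own harness, and the probe tasks are known-answer tasks you repeat at intervals through a long session so that the score trajectory, not a single measurement, is the output.

```python
def degradation_test(run_agent, score, probe_tasks, session_hours=8, probes_per_hour=2):
    """Extended-duration probe: repeat known-answer tasks across a long session
    and record the quality trajectory hour by hour.

    run_agent(task, hour) and score(task, answer) are assumed hooks you supply;
    neither is a real API.
    """
    trajectory = []
    for hour in range(session_hours):
        scores = []
        for _ in range(probes_per_hour):
            for task in probe_tasks:
                answer = run_agent(task, hour)
                scores.append(score(task, answer))
        trajectory.append(sum(scores) / len(scores))
    # Flag if late-session quality falls materially below early-session quality
    early, late = trajectory[0], trajectory[-1]
    return {"trajectory": trajectory, "degraded": late < early * 0.9}
```

A single-point test is the first hour of this loop; the failure mode described in this article only appears in the later hours.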
3. Session Boundaries and State Management
Implement hard limits on operational session length for agents making consequential decisions. Do not rely on persistent memory as a substitute for fresh context. Understand that memory accumulation creates its own degradation vector.
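A hard session limit can be enforced with a thin wrapper around the agent runtime. This is a sketch: `reset_agent` is an assumed hook into whatever runtime you use, and the limits shown are illustrative, not recommendations for any specific system.

```python
import time

class BoundedSession:
    """Enforces hard session limits: after max_turns interactions or
    max_seconds of wall-clock time, the agent is reset with fresh context.

    reset_agent() is an assumed hook into your agent runtime, not a real API.
    """

    def __init__(self, reset_agent, max_turns=200, max_seconds=4 * 3600):
        self.reset_agent = reset_agent
        self.max_turns = max_turns
        self.max_seconds = max_seconds
        self._start()

    def _start(self):
        self.turns = 0
        self.started = time.monotonic()

    def before_turn(self):
        """Call before each agent action; triggers a reset when a limit is hit."""
        expired = (self.turns >= self.max_turns
                   or time.monotonic() - self.started > self.max_seconds)
        if expired:
            self.reset_agent()   # fresh context, not accumulated memory
            self._start()
        self.turns += 1
        return expired
```

The point is that the boundary is enforced architecturally, outside the agent, rather than trusting the agent to notice that it has been running too long.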
4. Architectural Controls Over Behavioural Trust
Do not rely on the agent to self-report degradation. It will not. It cannot. Implement monitoring architecturally: external systems that compare output quality against established baselines, independent of the agent’s own assessment.
5. Blast Radius Containment
Apply least-privilege access rigorously. The damage a degraded agent can cause is directly proportional to the permissions it holds. Restrict capabilities to the minimum necessary for function, so that when degradation occurs, the consequences are contained.
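In code, least privilege reduces to an allow-list gate between the agent and its tools. The sketch below is illustrative: the action names are hypothetical, and a real deployment would enforce this at the platform or IAM layer rather than in application code.

```python
class LeastPrivilegeGate:
    """Allow-list gate between an agent and its tools. Any action not
    explicitly granted is refused, so a degraded agent's mistakes are
    contained to recommendations rather than executed changes.

    Action names here are illustrative, not a real tool API.
    """

    def __init__(self, allowed_actions):
        self.allowed = frozenset(allowed_actions)

    def invoke(self, action, handler, *args):
        """Run handler only if the action is on the allow-list."""
        if action not in self.allowed:
            raise PermissionError(f"agent not permitted to perform: {action}")
        return handler(*args)

# An advisory agent may read data and draft advice, but never execute:
gate = LeastPrivilegeGate({"read_database", "draft_recommendation"})
```

Default-deny is the essential property: when degradation occurs, the worst available action is already the one you decided you could tolerate.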
Conclusion
AI agents degrade over sustained operation. The degradation is measurable. It is documented by the companies that build these systems. And it is invisible to the agents themselves.
Sixty-seven percent of enterprises are already experiencing measurable degradation. Ninety-one percent of machine learning models degrade over time. The evidence comes from Anthropic, IBM, Amazon, Gartner, and independent researchers. It is consistent across every architecture, every deployment model, and every industry.
This is not an argument against deploying AI agents. It is an argument for deploying them with the governance controls that make deployment sustainable. Organisations that monitor for degradation will catch it. Organisations that do not will discover it when a decision goes wrong and no one can explain why.
The technology is extraordinary. The risks are manageable. But they require management.
If you cannot answer the question “how would we know if our AI agent’s decision quality has degraded?”, you are not ready to deploy one in a role where its decisions matter.
Is Your Organisation Monitoring AI Decision Quality?
Most organisations track uptime and throughput. Very few track whether their AI agents are still making good decisions. Cyber Impact can help you assess your exposure and implement the monitoring and governance controls that make AI deployment sustainable.
