31 October 2025
AI disputes in action
AI disputes are no longer the preserve of theory or futurism. As autonomous systems are deployed across logistics and transport, the first wave of complex, multi-party litigation is beginning to reach the courts.
Traditional causes of action – contract, negligence, product liability, contribution – remain the doctrinal backbone. Yet when applied to self-learning systems that act in the physical world, these claims acquire new complexity. They demand that lawyers, judges and experts navigate domains of engineering, computer science and statistical inference – areas that increasingly push the boundaries of traditional commercial litigation practice.
This article examines how such a dispute might unfold through a fictional but plausible case study in smart city infrastructure and considers how English courts may adapt established principles to the realities of autonomous technology.
The hypothetical dispute involves three parties operating at different points in the AI supply chain: UrbanMind, the developer of the FlowOptim optimisation model at the heart of the system; SmartInfra, the systems integrator that embedded the model in signal-control hardware, added its own control logic and sensor integration, and supplied the finished traffic management system; and the Council, the public authority that procured the system, retrained the model on local data and deployed it across the city's road network.
For two years, the technology worked seamlessly. The Council reduced congestion by 23%, improved air quality metrics and was widely praised for pioneering intelligent urban governance. In early 2027, however, the system began to behave unpredictably. Traffic signals at multiple intersections changed erratically, creating dangerous conflicts between vehicles and pedestrians. Several serious accidents occurred, including a collision involving an ambulance delayed by contradictory routing instructions. Other parts of the network suffered gridlock lasting hours as the system appeared to learn counterproductive patterns. The resulting publicity was damaging, and regulators soon took an interest.
The Council faced claims from injured parties, businesses suffering economic loss and transport operators, and in turn commenced proceedings against SmartInfra alleging breach of contract and negligence. SmartInfra denied liability and issued a Part 20 claim against UrbanMind, contending that the underlying optimisation model was defective.
This multi-party structure is characteristic of how AI disputes are likely to unfold, with liability fragmenting across the supply chain and each participant contributing to, but not solely controlling, the final system's behaviour.
The first issue concerned what SmartInfra had promised to deliver. The contract stated that the system would perform “safe, intelligent traffic management optimised for public safety and flow efficiency.” Yet AI systems do not operate according to deterministic rules. Their performance depends on the data that shapes them, the environments in which they operate, and the probabilistic models embedded in their design. The meaning of 'safe' or 'intelligent' in such a context is therefore not straightforward.
A court would have to decide whether those words create an absolute guarantee or an obligation to exercise reasonable skill and care. SmartInfra's terms contained the usual limitations – no assurance that the system would be error-free and no liability for environmental conditions beyond its control.
UrbanMind's software licence excluded consequential loss altogether. The Council would argue that those exclusions cannot apply to a fundamental failure of purpose, since a traffic management system that creates dangerous conditions defeats its entire object.
The enforceability of such limitation clauses would be tested under the Unfair Contract Terms Act 1977. Under section 3, where one party deals on the other's written standard terms, any clause excluding or restricting liability for breach of contract must satisfy the requirement of reasonableness.
A court would consider whether SmartInfra could reasonably exclude liability for a system that actively created danger rather than failing to prevent it. The central question becomes whether the risk of an inherently unpredictable machine was one that the Council had assumed, or one that SmartInfra was contractually obliged to control.
Another dimension concerns oversight. SmartInfra's contract required the Council to maintain human supervision of operations and to halt the system after any serious incidents. SmartInfra relied heavily on that clause, alleging that the Council had continued to deploy the system despite early warning signs. The allocation of responsibility for human intervention is now one of the most significant features of AI contracting. It reflects an attempt to bridge the gap between human judgment and machine autonomy, a gap that the law has not yet definitively resolved. Courts will need to assess not merely whether human oversight was contractually required, but whether it was practically feasible and whether the party bearing that obligation had sufficient information and control to discharge it effectively.
Parties will need to be alive to their obligations under the EU AI Act, such as Article 14, which mandates human oversight for high-risk AI systems, requiring that these systems should be 'designed and developed in such a way, including with appropriate human-machine interface tools, that they can be effectively overseen by natural persons during the period in which they are in use'.
While the extent to which these requirements will be replicated in English AI regulation remains to be seen, the principles they embody of transparency, interpretability and meaningful human control are also likely to inform judicial assessment of contractual oversight obligations. In the context of autonomous vehicles and intelligent transport infrastructure, the Automated and Electric Vehicles Act 2018 provides a further layer of complexity. By 2027, with technical standards in place, courts will need to navigate the interaction between contractual allocation of oversight duties, statutory liability frameworks that may impose strict liability on insurers, and regulatory requirements concerning the design and deployment of autonomous systems.
Where contractual promises are uncertain, claimants often turn to negligence. The Council argued that both SmartInfra and UrbanMind failed to exercise reasonable skill and care in designing and testing the system. For UrbanMind, the issue was whether the optimisation model had been trained and validated across an appropriate range of environments, including multi-modal traffic patterns and vulnerable road user scenarios present in the city.
For SmartInfra, the question was whether its integration and calibration processes met professional standards. The Council itself faced accusations of contributory negligence for retraining the model without adequate testing and for failing to intervene when anomalies first appeared.
Determining what counts as 'reasonable' in this setting is not straightforward. The standard of care for an AI developer cannot be defined in the same way as for a mechanical engineer or software contractor. These systems learn, adapt and change over time. The relevant benchmark may need to draw on emerging industry standards or sector-specific guidance, and the court will inevitably depend on expert testimony to translate those standards into the language of negligence.
Under the EU AI Act (assuming extraterritorial effect, or UK adoption of equivalent standards), traffic management systems would likely qualify as high-risk AI under Annex III, triggering obligations around accuracy, robustness, human oversight and transparency. Failure to implement technical documentation, risk management systems or post-market monitoring as required by the Act would constitute strong evidence of breach of duty and would likely be treated by English courts as evidence of falling below the standard of reasonable care.
Similarly, the Automated and Electric Vehicles Act 2018 and its implementing regulations would establish minimum safety standards for autonomous transport infrastructure. Courts may adopt these regulatory requirements as a benchmark for reasonable care. The interaction between the Act's strict liability regime (which places liability on insurers for accidents caused by automated vehicles when driving themselves) and common law negligence claims against system developers and integrators remains to be fully worked out, but regulatory compliance will be a critical factor in both contexts.
Causation in AI disputes rarely follows a single line. Each actor in the supply chain may have contributed in some way to the loss. UrbanMind created the algorithm. SmartInfra embedded it in hardware and added new control logic and sensor integration. The Council retrained it using local data and then deployed it across critical infrastructure. When things went wrong, each blamed the others.
The traditional 'but for' test offers little guidance when an outcome arises from the interaction of three dynamic systems. Courts may instead adopt a material contribution approach, asking whether each party's conduct made a material contribution to the harm or materially increased the risk of it.
UrbanMind might say that FlowOptim performed properly until SmartInfra modified the integration layer. SmartInfra could respond by saying that the model was reliable until the Council retrained it with biased or incomplete local data. The Council would insist that the fault lay in the core design, which failed to account for cases involving emergency services and pedestrian safety.
In practice, resolving such a dispute requires not only expert evidence but statistical inference. The question may become one of probability rather than certainty, forcing the court to weigh the relative contribution of several causes, none of which can be isolated entirely.
This approach aligns with established principles in cases involving divisible harm or multiple tortfeasors, but its application to AI systems – where causation is mediated by algorithmic decision-making – will test the boundaries of those principles. The question of intervening acts becomes particularly complex – at what point does retraining, or modification of an AI system, break the chain of causation? If the Council's retraining introduced new failure modes, does that absolve UrbanMind of liability for the original model's deficiencies? Or does the foreseeability of downstream modification mean that UrbanMind should have designed the system to be robust against such changes?
The EU's revised Product Liability Directive addresses these challenges through Article 9, which introduces a rebuttable presumption of causation to assist claimants in establishing causation where the case is scientifically or technically complex. In these circumstances, once the claimant establishes that a product was defective and plausibly connected to the damage, the burden shifts to the defendant to demonstrate that the defect was not causative.
Should English law adopt analogous provisions in the forthcoming reform of the Consumer Protection Act 1987 (currently under consultation), courts would be equipped with a mechanism to address the evidential asymmetries inherent in AI-related product liability claims, potentially easing the claimant's burden whilst preserving the defendant's ability to rebut the presumption. This would represent a significant departure from traditional causation analysis, acknowledging that algorithmic complexity and proprietary systems create unique barriers to proof that justify procedural intervention.
The evidential task in a case like this is formidable. Unlike traditional product liability claims, the key information lies not in physical defects but in data, logs and code. Each party holds different pieces of the puzzle.
UrbanMind retains the original training data and design documents. SmartInfra holds the integration records and firmware versions. The Council has the operational records and data logs, including sensor readings, signal timing logs and incident reports from the traffic operators.
Because AI models evolve over time, it becomes essential to identify which version of the software was running at the time of each incident. Forensic experts may need to reconstruct that state from archived parameters, a process sometimes referred to as model forensics. Disclosure in AI disputes will therefore need to extend beyond traditional categories. Parties may be required to disclose training datasets, version control logs, software updates, testing and validation protocols and post-deployment monitoring data.
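By way of a minimal illustration only (the field names, version label and parameter values below are hypothetical and are not drawn from the case study), a deployer that records a cryptographic fingerprint of each model version at the moment of deployment makes this kind of forensic reconstruction considerably easier:

```python
# Illustrative sketch only: logging which model state was live at a given time.
# All names, versions and parameter values are hypothetical.
import hashlib
import json
from datetime import datetime, timezone

def fingerprint_model(parameters: dict) -> str:
    """Hash the serialised parameters so a deployed model state can later be
    matched against archived versions during forensic reconstruction."""
    canonical = json.dumps(parameters, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

def deployment_record(parameters: dict, version: str, training_data_id: str) -> dict:
    """Create an audit entry tying a model version to its parameter hash,
    training data snapshot and deployment time."""
    return {
        "model_version": version,
        "parameter_hash": fingerprint_model(parameters),
        "training_data_snapshot": training_data_id,
        "deployed_at": datetime.now(timezone.utc).isoformat(),
    }

# Toy example: a simplified parameter set for a signal-timing model.
record = deployment_record(
    parameters={"junction_12_green_bias": 0.42, "pedestrian_priority_weight": 0.9},
    version="flowoptim-2.3.1",                 # hypothetical version label
    training_data_id="council-retraining-2026-11",
)
print(json.dumps(record, indent=2))
```

An audit trail of this kind would allow an expert to match an incident timestamp to the exact parameter state and training data snapshot that produced the disputed decisions.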
As such, the duty to conduct a reasonable search for disclosable documents takes on new dimensions when applied to algorithmic evidence. Parties may need to preserve not just documents but entire computational environments, including specific versions of libraries, frameworks and hardware configurations that could affect model behaviour. Confidentiality and privilege issues will inevitably arise: can parties claim trade secret protection over training data or model architecture? The answer will likely depend on balancing the legitimate interest in protecting proprietary information against the claimant's right to a fair trial and effective access to justice.
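Purely as a sketch of what preserving a computational environment might involve in practice (standard library calls only; the output file name is an assumption), a snapshot of the interpreter, installed packages and hardware platform could be captured at the time of an incident:

```python
# Illustrative sketch only: capturing the computational environment (interpreter,
# installed packages, platform) so the runtime in place at the time of an
# incident can be preserved and later reproduced for expert analysis.
import json
import platform
import sys
from importlib import metadata

def environment_snapshot() -> dict:
    """Collect basic environment metadata using only the standard library."""
    return {
        "python_version": sys.version,
        "platform": platform.platform(),
        "machine": platform.machine(),
        "installed_packages": sorted(
            f"{dist.metadata['Name']}=={dist.version}"
            for dist in metadata.distributions()
        ),
    }

if __name__ == "__main__":
    # Hypothetical output location; in practice this would feed a preservation log.
    with open("environment_snapshot.json", "w", encoding="utf-8") as fh:
        json.dump(environment_snapshot(), fh, indent=2)
```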
Expert witnesses across a variety of disciplines will be needed to make sense of this data. This could include engineers to explain the physical systems involved, AI specialists to analyse the model's behaviour and data scientists to assess the likelihood of specific causal chains. It is unlikely that any single expert could cover all aspects, so courts may need to coordinate multiple specialists and perhaps receive preliminary tutorials to grasp the basic science before hearing evidence.
If expert analysis were to show that there was a 65% likelihood that training flaws in FlowOptim caused the accidents and a 35% likelihood that the Council's retraining was the trigger, the court would face a dilemma. English law has traditionally been sceptical of probabilistic causation, preferring findings based on balance of probabilities rather than mathematical inference. Yet in AI disputes, probabilities may be the only intellectually honest measure available. The task will be to decide how much confidence can be placed in the evidence and how it should be weighed against the narrative of fault.
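To make the dilemma concrete, the following is an illustrative restatement of the hypothetical figures above, writing D for the total recoverable loss (a symbol introduced here purely for illustration):

```latex
% All-or-nothing, on the balance of probabilities:
%   P(\text{training flaw caused the loss}) = 0.65 > 0.5,
% so the training flaw is treated as the cause and the whole loss D follows.
% A proportionate, material-contribution approach would instead apportion:
\[
  D_{\text{training flaw}} = 0.65\,D, \qquad D_{\text{retraining}} = 0.35\,D
\]
```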
Procedurally, AI disputes stretch the court’s resources, but not beyond recognition. The Technology and Construction Court is well suited to handling multi-party litigation involving technical evidence. It will need to sequence expert reports carefully and may have to hear preliminary issues on matters such as the definition of a 'product defect' or the recoverability of consequential loss. Where multiple claims arise from similar facts, e.g. property owners seeking damages, those claims may be consolidated or managed together. Arbitration, though attractive for confidentiality, offers limited utility in these scenarios because of the difficulty of joining all relevant parties. The courts remain the more practical forum for such system-wide disputes.
This dispute illustrates the need for precision, documentation and foresight in AI contracting. Parties must define deliverables in measurable terms and specify who bears responsibility for retraining and ongoing learning. They should preserve detailed records of data inputs, system updates and decisions taken in development and deployment. Human oversight should be active rather than symbolic, and insurance arrangements must be reviewed to ensure that AI-related incidents are covered. Perhaps most importantly, businesses need to understand and explain how their systems work. Explainability is not merely a regulatory aspiration; it functions as a practical defence in litigation.
Regulation is evolving quickly. The European Union’s AI Act, whose core obligations for high-risk systems apply from 2026, will classify certain systems and products as 'high-risk', requiring transparency, traceability and post-market monitoring.
The European Commission had also proposed a dedicated AI Liability Directive in 2022, which would have introduced harmonised rules on fault-based liability, disclosure of evidence, and presumptions of causation in civil claims involving AI-related harm. However, that proposal has been shelved, leaving member states to apply existing national liability frameworks to defective AI products, albeit informed by the revised Product Liability Directive's provisions.
The United Kingdom’s pro-innovation approach is less prescriptive but still emphasises accountability and governance. These frameworks will influence how courts interpret contractual and tortious duties. Non-compliance may serve as evidence of negligence or render limitation clauses unreasonable. AI disputes are therefore likely to develop at the intersection of private law and public regulation, in much the same way that data protection litigation did a decade ago.
This scenario captures the type of dispute that will define the coming decade – multi-party, data-driven and technically complex, yet anchored in the familiar structures of contract and negligence. The challenge may not lie in finding new doctrines, but in applying existing ones to systems that evolve after delivery and act with a degree of autonomy.
For businesses, the lesson is to document clearly, govern responsibly and test rigorously. For lawyers, it is to bridge the gap between technological reality and legal reasoning. And for the courts, it marks the beginning of a new evidential era, where understanding how an algorithm learns may one day be as important as understanding what a contract says.