Can AI be used to control safety critical systems? A U.K.-funded research program aims to find out

9 Min Read

Hello and welcome to Eye on AI. In this edition…Meta hires Scale AI founder for new ‘superintelligence’ drive…OpenAI on track for $10 billion in annual recurring revenue…Study says ‘reasoning models’ can’t really reason.

Today’s most advanced AI models are relatively useful for lots of things—writing software code, research, summarizing complex documents, writing business correspondence, editing, generating images and music, role-playing human interactions, the list goes on. But relatively is the key word here. As anyone who uses these models soon discovers, they remain frustratingly error-prone and erratic. So how could anyone think that these systems could be used to run critical infrastructure, such as electrical grids, air traffic control, communications networks, or transportation systems?

Yet that is exactly what a project funded by the U.K.’s Advanced Research and Invention Agency (ARIA) is hoping to do. ARIA was designed to be somewhat similar to the U.S. Defense Advanced Research Projects Agency (DARPA), with government funding for moonshot research that has potential governmental or strategic applications. The £59 million ($80 million) ARIA project, called The Safeguarded AI Program, aims to find a way to combine AI “world-models” with mathematical proofs that could guarantee that the system’s outputs were valid.

David Dalrymple, the machine learning researcher who is leading the ARIA effort, told me that the idea was to use advanced AI models to create a “production facility” that would churn out domain-specific control algorithms for critical infrastructure. These algorithms would be mathematically tested to ensure that they meet the required performance specifications. If the control algorithms pass this test, the controllers—but not the frontier AI models that developed them—would be deployed to help run critical infrastructure more efficiently.

Dalrymple (who is known by his social media handle Davidad) gives the example of the U.K.’s electricity grid. The grid’s operator currently acknowledges that if it could balance supply-and-demand on the grid more optimally, it could save £3 billion ($4 billion) that it spends each year essentially paying to have excess generation capacity up-and-running to avoid the possibility of a sudden blackout, he says. Better control algorithms could reduce those costs.

Besides the energy sector, ARIA is also looking at applications in supply chain logistics, biopharmaceutical manufacturing, self-driving vehicles, clinical trial design, and electric vehicle battery management.

AI to develop new control algorithms

Frontier AI models may be reaching the point now where they may be able to automate algorithmic research and development, Davidad says. “The idea is, let’s take that capability and turn it to narrow AI R&D,” he tells me. Narrow AI usually refers to AI systems that are designed to perform one particular, narrowly-defined task at superhuman levels, rather than an AI system that can perform many different kinds of tasks.

The challenge, even with these narrow AI systems, is then coming up with mathematical proofs to guarantee that their outputs will always meet the required technical specification. There’s an entire field known as “formal verification” that involves mathematically proving that software will always provide valid outputs under given conditions—but it’s notoriously difficult to apply to neural network-based AI systems. “Verifying even a narrow AI system is something that’s very labor intensive in terms of a cognitive effort required,” Davidad says. “And so it hasn’t been worthwhile historically to do that work of verifying except for really, really specialized applications like passenger aviation autopilots or nuclear power plant control.”

This kind of formally-verified software won’t fail because a bug causes an erroneous output. They can sometimes break down because they encounter conditions that fall outside their design specifications—for instance a load balancing algorithm for an electrical grid might not be able to handle an extreme solar storm that shorts out all of the grid’s transformers simultaneously. But even then, the software is usually designed to “fail safe” and revert back to manual control.

ARIA is hoping to show that frontier AI modes can be used to do the laborious formal verification of the narrow AI controller as well as develop the controller in the first place.

But will AI models cheat the verification tests?

But this raises another challenge. There’s a growing body of evidence that frontier AI models are very good at “reward hacking”—essentially finding ways to cheat to accomplish a goal—as well as at lying to their users about what they’ve actually done. The AI safety nonprofit METR (short for Model Evaluation & Threat Research) recently published a blog on all the ways OpenAI’s o3 model tried to cheat on various tasks.

ARIA says it is hoping to find a way around this issue too. “The frontier model needs to submit a proof certificate, which is something that is written in a formal language that we’re defining in another part of the program,” Davidad says. This “new language for proofs will hopefully be easy for frontier models to generate and then also easy for a deterministic, human audited algorithm to check.” ARIA has already awarded grants for work on this formal verification process.

Models for how this might work are starting to come into view. Google DeepMind recently developed an AI model called AlphaEvolve that is trained to search for new algorithms for applications such as managing data centers, designing new computer chips, and even figuring out ways to optimize the training of frontier AI models. Google DeepMind has also developed a system called AlphaProof that is trained to develop mathematical proofs and write them in a coding language called Lean that won’t run if the answer to the proof is incorrect.

ARIA is currently accepting applications from teams that want to run the core “AI production facility,” with the winner the £18 million grant to be announced on October 1. The facility, the location of which is yet to be determined, is supposed to be running by January 2026. ARIA is asking those applying to propose a new legal entity and governance structure for this facility. Davidad says ARIA does not want an existing university or a private company to run it. But the new organization, which might be a nonprofit, would partner with private entities in areas like energy, pharmaceuticals, and healthcare on specific controller algorithms. He said that in addition to the initial ARIA grant, the production facility could fund itself by charging industry for its work developing domain-specific algorithms.

It’s not clear if this plan will work. For every transformational DARPA project, many more fail. But ARIA’s bold bet here looks like one worth watching.

With that, here’s more AI news.

Jeremy Kahn
jeremy.kahn@fortune.com
@jeremyakahn

Want to know more about how to use AI to transform your business? Interested in what AI will mean for the fate of companies, and countries? Why not join me in Singapore on July 22 and 23 for Fortune Brainstorm AI Singapore. We will dive deep into the latest on AI agents, examine the data center build out in Asia, and talk to top leaders from government, board rooms, and academia in the region and beyond. You can apply to attend here
.

This story was originally featured on Fortune.com

Share This Article