Benchmark scores aren’t enough anymore
Patronus AI just announced a $50 million Series B, bringing its total funding to $70 million as it expands into another part of the AI stack. The company builds simulated environments where AI agents can learn, fail, and improve before they’re trusted with real work.
The round was led by Greenfield Partners with participation from Notable Capital, Lightspeed Venture Partners, Datadog, Samsung, Factorial Capital, Gokul Rajaram, and other investors.
Why AI testing is gaining traction
We’ve reached the point where building capable AI agents isn’t the only challenge. Companies also have to figure out how to evaluate those systems before they start navigating enterprise software, writing code, or handling other long-running tasks.
A benchmark score might show that an agent can solve a specific problem, but it says very little about how that agent behaves after hours of unpredictable work.
For managed service providers and systems integrators building AI-powered offerings, that reliability challenge is becoming increasingly important.
As partners deploy AI agents into customer environments for IT operations, security, customer support, and business automation, the ability to validate agent behavior before production could help reduce risk, improve customer confidence, and shorten deployment cycles.
Building digital replicas
Patronus calls its platform Digital World Models. The company creates realistic copies of websites, enterprise software, research workflows, and communication systems, where AI agents can train and be evaluated before deployment.
Autonomous vehicle companies such as Waymo offer a useful comparison. They relied on simulation long before putting cars on public roads. Patronus is applying the same approach to digital environments, giving AI agents a place to test-drive their skills and navigate unexpected situations without affecting real customers or business systems.
According to the company, revenue has grown more than 15x over the past year, and its technology is now used by the majority of the world’s leading frontier AI labs and hyperscalers.
The testing layer is getting more attention
Patronus plans to use the new funding to expand its research team, hire more engineers, and build out the computing infrastructure needed to support larger simulation environments.
As AI agents take on more complex work, companies are paying closer attention to how those systems are evaluated before they’re deployed. Patronus believes testing and simulation will become specialized infrastructure rather than remaining an in-house function at every AI lab.
“Patronus AI is tackling one of the most important infrastructure problems in artificial intelligence,” said Itay Inbar, partner at Greenfield Partners.
“The future of AI will depend on systems that can learn and operate reliably in complex environments, and simulations are becoming essential to making that possible.”
Dell is approaching the same challenge from a different angle. Alongside new infrastructure and agentic AI updates, the company recently expanded the Dell AI Data Platform to help organizations build, deploy, and manage AI workloads more effectively.





