
Red Hat AI 3 Puts Inference Front and Center

Red Hat AI 3 unifies OpenShift, Linux, and AI Inference Server to help enterprises scale, manage, and optimize AI workloads across environments.

Oct 15, 2025

Red Hat has rolled out Red Hat AI 3, the latest version of its enterprise AI platform. It’s built to help organizations move beyond experimentation and run AI workloads in production at scale. The release combines OpenShift AI, Enterprise Linux AI, and the AI Inference Server into one platform.

“As enterprises scale AI from experimentation to production, they face a new wave of complexity, cost and control challenges,” said Joe Fernandes, VP and GM of Red Hat’s AI business unit. “With Red Hat AI 3, we are providing an enterprise-grade, open source platform that minimizes these hurdles.”

At the core of Red Hat AI 3 is inference, the stage where a trained model stops learning and starts doing the work, answering queries and generating output. This stage consumes significant computing power, and demand for it is hard to predict, which is why Red Hat is putting so much emphasis here.

Distributed inference and model-as-a-service

One of the biggest updates is llm-d, a system that spreads the work of running large language models across many servers. Built on Kubernetes and the open-source vLLM project, it helps organizations handle the heavy, unpredictable demands of AI without wasting expensive hardware. 

“With llm-d, customers can adopt an intelligent AI platform that integrates seamlessly with Kubernetes,” said Steven Huels, VP of AI engineering, Red Hat. “Kubernetes scheduling helps maximize model performance and utilization of the underlying GPU hardware so they’re not sitting there idle.”
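llm-d's actual scheduling lives in Kubernetes and vLLM, but the core idea, steering each request to the replica that can serve it most cheaply, can be sketched in a few lines. The toy router below is purely illustrative, not llm-d's implementation: it prefers a replica that has already processed a matching prompt prefix (so it can reuse its KV cache) and otherwise picks the least-loaded one. All names are hypothetical.

```python
# Toy illustration of cache-aware request routing across model replicas.
# This is NOT llm-d's algorithm; it only conveys why smart routing keeps
# expensive GPU hardware from sitting idle.

from dataclasses import dataclass, field

@dataclass
class Replica:
    name: str
    active_requests: int = 0
    cached_prefixes: set = field(default_factory=set)

def route(prompt: str, replicas: list[Replica]) -> Replica:
    """Prefer a replica that has seen this prompt prefix (warm KV cache);
    otherwise fall back to the least-loaded replica."""
    prefix = prompt[:32]
    warm = [r for r in replicas if prefix in r.cached_prefixes]
    candidates = warm or replicas
    chosen = min(candidates, key=lambda r: r.active_requests)
    chosen.active_requests += 1
    chosen.cached_prefixes.add(prefix)
    return chosen

replicas = [Replica("gpu-0"), Replica("gpu-1")]
first = route("Summarize this contract: ...", replicas)   # least loaded
second = route("Summarize this contract: ...", replicas)  # warm cache wins
third = route("Translate to French: ...", replicas)       # least loaded
```

In a real deployment this decision is made by the inference gateway against live telemetry, but the trade-off is the same: reusing cached state beats raw load balancing.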

Alongside distributed inference, Red Hat AI 3 introduces Model-as-a-Service (MaaS): IT teams can offer models as flexible, on-demand services within their own environments. That helps them keep costs in check, track usage, and stay compliant, while avoiding the risks of depending on outside AI providers.
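In practice, a MaaS endpoint looks much like a hosted AI API, just behind the organization's own gateway. Red Hat's AI Inference Server is built on vLLM, which serves an OpenAI-compatible HTTP API, so an internal client would be a sketch like the following; the gateway URL and model name are placeholder assumptions, not product defaults.

```python
# Sketch of consuming an internal Model-as-a-Service endpoint.
# URL and model name are hypothetical placeholders; vLLM-based servers
# expose an OpenAI-compatible /v1/chat/completions route.

import json
from urllib import request

MAAS_URL = "https://maas.example.internal/v1/chat/completions"  # placeholder

def build_chat_request(model: str, user_message: str) -> dict:
    """Assemble an OpenAI-compatible chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": 256,
    }

def ask(model: str, user_message: str) -> str:
    """POST the payload to the internal gateway (not executed here)."""
    payload = build_chat_request(model, user_message)
    req = request.Request(
        MAAS_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

payload = build_chat_request("granite-3.1-8b", "Summarize our Q3 numbers.")
```

Because the interface matches the de facto standard, applications written against external AI providers can be pointed at the internal service with little change, which is the portability MaaS is selling.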

Flexibility for agents and open standards

Red Hat AI 3 is also preparing for what comes next: agent-based AI systems. These are more complex, autonomous applications that will nudge inference demands even higher. To make that easier, the platform includes a Unified API layer built on Llama Stack. It is among the first to adopt the Model Context Protocol (MCP), which standardizes how AI models plug into outside tools.

“AI platforms aren’t going to run a single model on a single inference server on a single machine,” Fernandes said. “You’re going to have multiple models across multiple inference servers across a distributed environment.”
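MCP standardizes tool access by framing every exchange as a JSON-RPC 2.0 message. As a rough sketch of what "plugging a model into outside tools" means on the wire, the snippet below builds an MCP-style `tools/call` request; the tool name and arguments are invented for illustration.

```python
# Sketch of an MCP-style tool invocation. MCP messages follow JSON-RPC 2.0;
# the tool name and arguments here are hypothetical examples.

import json

def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 'tools/call' request of the kind MCP uses."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

msg = mcp_tool_call(1, "search_tickets", {"query": "GPU quota"})
```

Because every tool speaks the same message shape, an agent can discover and invoke tools from different vendors without bespoke glue code for each one.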

Built for collaboration and control

Beyond raw performance, Red Hat AI 3 is designed to unify the entire AI lifecycle. The platform includes:

  • An AI hub for lifecycle management and governance
  • A generative AI studio for experimenting with models and prototyping applications
  • A catalog of tested, optimized models, including tools like Whisper for speech-to-text and Voxtral Mini for voice-driven agents

By centralizing tools, workflows, and governance, Red Hat AI 3 gives platform engineers and AI developers a common foundation. The result is a more predictable, cost-effective way to operationalize AI across data centers, public clouds, and edge environments.

Red Hat’s push into distributed inference lands alongside moves from others in the ecosystem. Pure Storage just introduced updates with Azure, Portworx, and NVIDIA, including a Key Value Accelerator that boosts inference speeds up to 20x while cutting costs and energy use, another sign that AI platforms and infrastructure are evolving with efficiency front and center.

Allison Francis

Allison is a contributing writer for Channel Insider, specializing in news for IT service providers. She has crafted diverse marketing, public relations, and online content for top B2B and B2C organizations through various roles. Allison has extensive experience with small to midsized B2B and channel companies, focusing on brand-building, content and education strategy, and community engagement. With over a decade in the industry, she brings deep insights and expertise to her work. In her personal life, Allison enjoys hiking, photography, and traveling to the far-flung places of the world.
