
Red Hat AI 3 Puts Inference Front and Center

Red Hat AI 3 unifies OpenShift, Linux, and AI Inference Server to help enterprises scale, manage, and optimize AI workloads across environments.

Oct 15, 2025

Red Hat has rolled out Red Hat AI 3, the latest version of its enterprise AI platform. It’s built to help organizations move beyond experimentation and run AI workloads in production at scale. The release combines OpenShift AI, Enterprise Linux AI, and the AI Inference Server into one platform.

“As enterprises scale AI from experimentation to production, they face a new wave of complexity, cost and control challenges,” said Joe Fernandes, VP and GM of Red Hat’s AI business unit. “With Red Hat AI 3, we are providing an enterprise-grade, open source platform that minimizes these hurdles.”

At the core of Red Hat AI 3 is inference, the stage where a trained model stops learning and starts doing the work, answering queries and generating output. This stage consumes significant computing power, and demand for it is hard to predict, which is why Red Hat is putting so much emphasis here.

Distributed inference and model-as-a-service

One of the biggest updates is llm-d, a system that spreads the work of running large language models across many servers. Built on Kubernetes and the open-source vLLM project, it helps organizations handle the heavy, unpredictable demands of AI without wasting expensive hardware. 

“With llm-d, customers can adopt an intelligent AI platform that integrates seamlessly with Kubernetes,” said Steven Huels, VP of AI engineering, Red Hat. “Kubernetes scheduling helps maximize model performance and utilization of the underlying GPU hardware so they’re not sitting there idle.”
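llm-d's actual scheduling lives in Kubernetes and vLLM, but the core idea, steering each request to the replica that can serve it most cheaply, can be sketched in a few lines. The toy router below is purely illustrative, not llm-d's implementation: it prefers a replica that has already processed a matching prompt prefix (so it can reuse its KV cache) and otherwise picks the least-loaded one. All names are hypothetical.

```python
# Toy illustration of cache-aware request routing across model replicas.
# This is NOT llm-d's algorithm; it only conveys why smart routing keeps
# expensive GPU hardware from sitting idle.

from dataclasses import dataclass, field

@dataclass
class Replica:
    name: str
    active_requests: int = 0
    cached_prefixes: set = field(default_factory=set)

def route(prompt: str, replicas: list[Replica]) -> Replica:
    """Prefer a replica that has seen this prompt prefix (warm KV cache);
    otherwise fall back to the least-loaded replica."""
    prefix = prompt[:32]
    warm = [r for r in replicas if prefix in r.cached_prefixes]
    candidates = warm or replicas
    chosen = min(candidates, key=lambda r: r.active_requests)
    chosen.active_requests += 1
    chosen.cached_prefixes.add(prefix)
    return chosen

replicas = [Replica("gpu-0"), Replica("gpu-1")]
first = route("Summarize this contract: ...", replicas)   # least loaded
second = route("Summarize this contract: ...", replicas)  # warm cache wins
third = route("Translate to French: ...", replicas)       # least loaded
```

In a real deployment this decision is made by the inference gateway against live telemetry, but the trade-off is the same: reusing cached state beats raw load balancing.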

Alongside distributed inference, Red Hat AI 3 introduces Model-as-a-Service (MaaS): IT teams can offer models as flexible, on-demand services within their own environments. That helps them keep costs in check, track usage, and stay compliant, while avoiding the risks of depending on outside AI providers.
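In practice, a MaaS endpoint looks much like a hosted AI API, just behind the organization's own gateway. Red Hat's AI Inference Server is built on vLLM, which serves an OpenAI-compatible HTTP API, so an internal client would be a sketch like the following; the gateway URL and model name are placeholder assumptions, not product defaults.

```python
# Sketch of consuming an internal Model-as-a-Service endpoint.
# URL and model name are hypothetical placeholders; vLLM-based servers
# expose an OpenAI-compatible /v1/chat/completions route.

import json
from urllib import request

MAAS_URL = "https://maas.example.internal/v1/chat/completions"  # placeholder

def build_chat_request(model: str, user_message: str) -> dict:
    """Assemble an OpenAI-compatible chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": 256,
    }

def ask(model: str, user_message: str) -> str:
    """POST the payload to the internal gateway (not executed here)."""
    payload = build_chat_request(model, user_message)
    req = request.Request(
        MAAS_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

payload = build_chat_request("granite-3.1-8b", "Summarize our Q3 numbers.")
```

Because the interface matches the de facto standard, applications written against external AI providers can be pointed at the internal service with little change, which is the portability MaaS is selling.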

Flexibility for agents and open standards

Red Hat AI 3 is also preparing for what comes next: agent-based AI systems. These are more complex, autonomous applications that will nudge inference demands even higher. To make that easier, the platform includes a Unified API layer built on Llama Stack. It is among the first to adopt the Model Context Protocol (MCP), which standardizes how AI models plug into outside tools.

“AI platforms aren’t going to run a single model on a single inference server on a single machine,” Fernandes said. “You’re going to have multiple models across multiple inference servers across a distributed environment.”
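MCP standardizes tool access by framing every exchange as a JSON-RPC 2.0 message. As a rough sketch of what "plugging a model into outside tools" means on the wire, the snippet below builds an MCP-style `tools/call` request; the tool name and arguments are invented for illustration.

```python
# Sketch of an MCP-style tool invocation. MCP messages follow JSON-RPC 2.0;
# the tool name and arguments here are hypothetical examples.

import json

def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 'tools/call' request of the kind MCP uses."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

msg = mcp_tool_call(1, "search_tickets", {"query": "GPU quota"})
```

Because every tool speaks the same message shape, an agent can discover and invoke tools from different vendors without bespoke glue code for each one.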

Built for collaboration and control

Beyond raw performance, Red Hat AI 3 is designed to unify the entire AI lifecycle. The platform includes:

  • An AI hub for lifecycle management and governance
  • A generative AI studio for experimenting with models and prototyping applications
  • A catalog of tested, optimized models, including tools like Whisper for speech-to-text and Voxtral Mini for voice-driven agents

By centralizing tools, workflows, and governance, Red Hat AI 3 gives platform engineers and AI developers a common foundation. The result is a more predictable, cost-effective way to operationalize AI across data centers, public clouds, and edge environments.

Red Hat’s push into distributed inference lands alongside moves from others in the ecosystem. Pure Storage just introduced updates with Azure, Portworx, and NVIDIA, including a Key Value Accelerator that boosts inference speeds up to 20x while cutting costs and energy use, another sign that AI platforms and infrastructure are evolving with efficiency front and center.

Allison Francis

Allison is a contributing writer for Channel Insider, specializing in news for IT service providers. She has crafted diverse marketing, public relations, and online content for top B2B and B2C organizations through various roles. Allison has extensive experience with small to midsized B2B and channel companies, focusing on brand-building, content and education strategy, and community engagement. With over a decade in the industry, she brings deep insights and expertise to her work. In her personal life, Allison enjoys hiking, photography, and traveling to the far-flung places of the world.
