OpenAI on Wednesday announced its first custom-built inference processor, designed and manufactured in collaboration with Broadcom. The new processor, named Jalapeño, was designed specifically for the unique needs of OpenAI’s inference systems. OpenAI’s proprietary AI model helped develop the chip, the company said.
The chip is still being tested, but OpenAI says early results show significantly better performance per watt than current state-of-the-art alternatives.
The partnership was officially announced in October, but OpenAI’s chip plans have long been rumored as a way to reduce the company’s dependence on Nvidia GPUs. Google and Amazon have both developed custom chips that serve similar purposes: silicon specifically designed to speed up machine learning workloads, often referred to as “AI accelerators.”
OpenAI President Greg Brockman explained the company’s approach to chip development in an internal podcast shortly after the Broadcom partnership was announced.
“We deeply understand the workload,” Brockman said during the episode. “We’ve been really looking at specific workloads that are underserved and thinking about how we can build something that can accelerate what’s possible.”
Jalapeño is designed specifically for inference, the process of running already-trained AI models in response to user requests. In its announcement, OpenAI highlighted the chip’s low operating costs when running real-time coding models. More compute-intensive tasks, such as pre-training, will probably still rely on Nvidia hardware, but even a small reduction in inference costs could go a long way toward improving the company’s bottom line.
Optimizing inference could prove to be a key element in the economics of future AI, and that optimization can happen at every level of the stack. OpenAI already builds agent products such as Codex, the models that power them, and the data centers that run those models. As the company explained in its announcement, moving to dedicated chips will let it push that optimization further.
“OpenAI doesn’t just develop frontier models and build products on top of them; we design the underlying infrastructure, including chip architectures, kernels, memory systems, networking, scheduling, deployment systems, and product experiences,” the company wrote. “OpenAI works across the stack, allowing us to optimize each layer with the same goal of making our models faster, more reliable, and more affordable for users.”
