OpenAI launches Jalapeño, a custom AI chip built with Broadcom

OpenAI

Deploy Jalapeño chips in your data centers to achieve higher LLM inference performance per watt.

What to do now

Evaluate Jalapeño chip specifications and plan procurement for upcoming LLM inference workloads.

Summary

OpenAI has announced Jalapeño, its first custom application-specific integrated circuit (ASIC), designed in partnership with Broadcom. The chip is built to accelerate large-language-model (LLM) inference for flagship products such as ChatGPT, Codex, the OpenAI API, and forthcoming agent-based services. Jalapeño pairs a near-reticle-size die with 216 GB of HBM3E memory, delivering 7.1–7.4 TB/s of memory bandwidth and 10 PFLOPS of FP4 compute. The design-to-tape-out cycle took just nine months, an unusually short timeline for a high-performance ASIC, and it underscores OpenAI's push to own more of the AI stack and reduce its reliance on merchant GPU supply chains.
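To put the quoted figures in context, the sketch below derives two back-of-envelope numbers from them: the roofline ridge point (how many FP4 operations a kernel must perform per byte of memory traffic before it stops being bandwidth-bound) and the time needed to read the full 216 GB once. It assumes decimal units, treats 10 PFLOPS and 7.1–7.4 TB/s as sustained peaks, and does not know whether the FP4 figure is dense or sparse. The function and constant names are illustrative and do not come from OpenAI or Broadcom.

```python
# Back-of-envelope roofline arithmetic from the specs quoted in the summary.
# Assumptions: decimal units (1 GB = 1e9 bytes), vendor peak numbers treated
# as achievable, FP4 figure taken at face value (dense vs. sparse unknown).

HBM_CAPACITY_BYTES = 216e9            # 216 GB of HBM3E
BANDWIDTH_RANGE_BPS = (7.1e12, 7.4e12)  # 7.1-7.4 TB/s
PEAK_FP4_FLOPS = 10e15                # 10 PFLOPS


def ridge_point(peak_flops: float, bandwidth_bps: float) -> float:
    """FLOPs per byte above which a kernel is compute-bound, not memory-bound."""
    return peak_flops / bandwidth_bps


def full_memory_sweep_seconds(capacity_bytes: float, bandwidth_bps: float) -> float:
    """Lower bound on the time to read every byte of HBM once."""
    return capacity_bytes / bandwidth_bps


def summarize() -> list[dict]:
    """Compute both figures at each end of the quoted bandwidth range."""
    rows = []
    for bw in BANDWIDTH_RANGE_BPS:
        rows.append({
            "bandwidth_tbps": bw / 1e12,
            "ridge_flops_per_byte": ridge_point(PEAK_FP4_FLOPS, bw),
            "sweep_ms": full_memory_sweep_seconds(HBM_CAPACITY_BYTES, bw) * 1e3,
        })
    return rows


if __name__ == "__main__":
    for row in summarize():
        print(
            f"{row['bandwidth_tbps']:.1f} TB/s: "
            f"ridge = {row['ridge_flops_per_byte']:.0f} FLOP/byte, "
            f"full sweep = {row['sweep_ms']:.1f} ms"
        )
```

Under these assumptions the ridge point is roughly 1,350 to 1,410 FLOPs per byte, and one pass over the whole memory takes about 29 to 30 ms. Token-by-token decoding typically does far less work per byte than that, so on paper the bandwidth figure, not the FP4 peak, bounds single-stream decode speed.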

OpenAI says early testing shows better performance per watt than the current state of the art, which would give the company tighter control over compute economics and product behaviour. By integrating the memory, networking, scheduling, and deployment layers, Jalapeño lets the company tailor inference pipelines to its own workloads. The announcement also highlighted parallel industry moves: Qualcomm's recent acquisition of Modular (Modular has confirmed that its Mojo language will remain open source) and NVIDIA's NeMo AutoModel, which boosts mixture-of-experts training throughput. SkyPilot's new Endpoints service further illustrates the trend toward unified inference platforms.

Strategically, Jalapeño positions OpenAI to offer on‑premises or edge inference options, potentially lowering cloud costs and reducing latency for end users. The move signals a broader shift toward vertically integrated inference stacks beyond the traditional NVIDIA/CUDA ecosystem. By controlling the hardware, software, and deployment layers, OpenAI aims to shape the future of AI services, ensuring that its models run efficiently and reliably across diverse environments.

Key changes

Jalapeño is OpenAI’s first Intelligence Processor designed from scratch for LLM inference
9‑month design‑to‑tape‑out accelerated by OpenAI models
Early testing shows performance per watt better than current state‑of‑the‑art
Architecture reduces data movement and balances compute, memory, networking
Broadcom’s Tomahawk networking silicon integrated
Planned deployment at gigawatt scale with Microsoft and partners
Jalapeño will support a multi‑generation compute platform for future LLMs

Affects

enterprise

Source angles · 2 perspectives

OpenAI Blog

Vendor angle

OpenAI and Broadcom unveil LLM-optimized inference chip

AI News

Independent angle

OpenAI Unveils Jalapeño Chip: Custom ASIC for LLM Inference
