OpenAI and Broadcom Jointly Develop AI Inference Chip
It has emerged that OpenAI and Broadcom are jointly developing an AI inference chip. The chip could give AI model providers room to lower token prices, potentially reducing the barrier to AI adoption for cost-conscious enterprises.

OpenAI and Broadcom are jointly developing a chip specialized for AI inference. Inference is the process by which a trained AI model generates answers to actual queries, so its cost is incurred every time a user interacts with an AI service. The chip is designed to make inference processing more efficient.
As AI services become more widespread, "token pricing", which is charged according to usage volume, has become an unavoidable cost factor for enterprises. A token is the smallest unit of text that a model processes, and tokens are consumed both when a question is posed to an AI and when the response is generated. Costs grow in proportion to usage, so enterprises that want to use AI extensively tend to be particularly sensitive to high per-token prices.
If the chip reaches practical use, AI model providers would gain the option to reduce token pricing. Lower cost barriers could in turn ease the concerns of enterprises that have so far hesitated to adopt AI.
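To make the relationship between usage volume and token-based billing concrete, the following is a minimal sketch of how a monthly bill might be estimated. The function name, the request volumes, the token counts, and all prices are hypothetical values chosen for illustration; they are not figures from OpenAI, Broadcom, or any actual price list. The sketch also assumes that input and output tokens are billed at separate per-million-token rates.

```python
def monthly_token_cost(requests, input_tokens, output_tokens,
                       input_price_per_million, output_price_per_million):
    """Estimate a monthly bill under usage-based token pricing.

    All arguments are illustrative assumptions, not real prices.
    Prices are expressed in USD per one million tokens.
    """
    total_input = requests * input_tokens      # tokens sent as questions
    total_output = requests * output_tokens    # tokens generated as answers
    cost = (total_input * input_price_per_million
            + total_output * output_price_per_million)
    return cost / 1_000_000


# Hypothetical workload: 500 input tokens and 200 output tokens per request.
baseline = monthly_token_cost(1_000_000, 500, 200, 2.0, 8.0)
scaled = monthly_token_cost(10_000_000, 500, 200, 2.0, 8.0)

# Hypothetical scenario in which a provider halves its token prices.
reduced = monthly_token_cost(1_000_000, 500, 200, 1.0, 4.0)

print(baseline)  # 2600.0
print(scaled)    # 26000.0
print(reduced)   # 1300.0
```

Under these assumed numbers, a tenfold increase in requests raises the bill tenfold, while a halving of per-token prices halves it. That linear relationship is why per-token pricing weighs so heavily on enterprises planning large-scale use.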
Currently, NVIDIA GPUs are the predominant hardware for AI inference, and the company holds an overwhelming market share in this domain. The OpenAI and Broadcom initiative can be seen as an effort to reduce dependence on a single vendor and to broaden the options for inference infrastructure. Some also argue that in-house chip designs would allow finer control over cost structures.
Broadcom has a long track record in the telecommunications and semiconductor sectors, and in recent years it has been expanding its presence in the business of designing custom AI chips (ASICs). Partnering with an AI model developer such as OpenAI makes it possible to create specialized designs optimized for actual inference workloads. Compared with general-purpose GPUs, chips tailored to specific processing tasks tend to have advantages in power efficiency and cost.
As commercial AI applications become mainstream, controlling inference costs has become a fundamental issue in service design. This development shows that, alongside competition in model performance, infrastructure cost optimization is becoming an increasingly important axis of competition across the industry. Strategies for developing and procuring inference chips are expected to significantly influence AI service pricing and the pace of adoption.
This article is an original work independently written and edited by the AI issue editorial team based on factual reporting. © AI issue. Unauthorized reproduction, redistribution, or use for AI training is prohibited.