Baidu Unveils ERNIE 5.0: A Multimodal AI Model Challenging Global Competitors

This article was generated by AI and cites original sources.

Chinese tech company Baidu has announced the release of its latest AI model, ERNIE 5.0, at the Baidu World 2025 event. This proprietary foundation model is designed to process and generate content across text, images, audio, and video, positioning it as a competitor in the global enterprise AI market.

Unlike its predecessor, the open-source ERNIE 4.5-VL-28B-A3B-Thinking, ERNIE 5.0 is available only through Baidu’s ERNIE Bot website and, for enterprise clients, through the Qianfan cloud platform’s API.

Baidu claims that ERNIE 5.0 rivals or surpasses Western models such as GPT-5-High and Gemini 2.5 Pro on multimodal reasoning, document understanding, and image-based question answering. According to the company, the model is strongest at structured document understanding, visual chart reasoning, and integrating multiple modalities.

Baidu prices ERNIE 5.0 at the premium end, in line with top-tier offerings from Chinese competitors such as Alibaba. The cost gap between ERNIE 5.0 and earlier models reflects the distinction Baidu draws between high-volume, low-cost models and high-capability models intended for complex tasks and multimodal reasoning.

Alongside the model release, Baidu is expanding beyond China with products such as GenFlow 3.0, Famou, MeDo, and Oreate.

Source: VentureBeat