Indian AI lab Sarvam has announced a new series of large language models, marking a strategic move toward smaller, cost-effective, open-source AI. The launch, revealed at the India AI Impact Summit, underscores Sarvam’s push to reduce dependency on foreign AI systems and to adapt models to local languages and requirements.
The latest lineup comprises a 30-billion-parameter model, a 105-billion-parameter model, a text-to-speech model, a speech-to-text model, and a vision model for document analysis. These models represent a significant advance over the 2-billion-parameter Sarvam 1 model introduced in 2024.
Both the 30B and 105B models employ a mixture-of-experts architecture, activating only a subset of their parameters for each token to lower computing costs. The 30B model supports a 32,000-token context window suited to real-time conversations, while the 105B model offers a 128,000-token window for complex, multi-step reasoning tasks.
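To illustrate the idea, here is a minimal sketch of top-k mixture-of-experts routing in PyTorch, showing why only a fraction of a model’s parameters runs for any given token. This is a generic illustration, not Sarvam’s actual architecture; the expert count, top_k value, and layer dimensions are arbitrary placeholder values.

```python
# Generic top-k mixture-of-experts sketch; NOT Sarvam's implementation.
# All sizes (d_model, d_ff, n_experts, top_k) are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Feed-forward layer split into experts; each token uses only top_k of them."""
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # gating network scores experts per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (n_tokens, d_model)
        scores = self.router(x)                         # (n_tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # keep only the top_k experts per token
        weights = F.softmax(weights, dim=-1)            # normalize the kept scores
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                   # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

layer = MoELayer()
tokens = torch.randn(4, 512)   # a batch of 4 token embeddings
print(layer(tokens).shape)     # torch.Size([4, 512])
```

The savings come from the routing step: although every expert’s weights sit in memory, each token is processed by only top_k of them, so the number of parameters active per token is a small fraction of the total.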
Notably, Sarvam’s new models were trained from scratch rather than fine-tuned from existing open-source systems. The 30B model was pre-trained on approximately 16 trillion text tokens, while the 105B model was trained on trillions of tokens spanning various Indian languages.
The models are designed to power voice-based assistants and chat systems in Indian languages, with an emphasis on real-time applications. Trained with compute from India’s IndiaAI Mission and infrastructure from Yotta, they represent a significant step forward for local-language processing.
Source: TechCrunch