Chitrarth-1: A Multilingual VLM by Krutrim AI Labs

India's AI landscape is evolving rapidly. Krutrim AI Labs, an Ola Group company, is a key player in this growth and recently unveiled Chitrarth-1, a Vision Language Model (VLM) designed for India's diverse linguistic and cultural context. Chitrarth-1 supports ten major Indian languages plus English, addressing a critical need for multilingual AI solutions. This article looks at Chitrarth-1 and its implications for India's expanding AI capabilities.

Table of Contents

  • What is Chitrarth-1?
  • Chitrarth-1 Architecture and Specifications
  • Training Data and Methodology
    • Phase 1: Adapter Pre-training
    • Phase 2: Instruction Tuning
  • Performance and Benchmarks
  • Accessing Chitrarth-1
  • Chitrarth-1 in Action
  • Conclusion

What is Chitrarth-1?

Chitrarth-1 (from "Chitra", image, and "Artha", meaning) is a 7.5-billion-parameter VLM that combines language and vision processing. Built for India's linguistic diversity, it supports Hindi, Bengali, Telugu, Tamil, Marathi, Gujarati, Kannada, Malayalam, Odia, Assamese, and English. The model reflects Krutrim's stated commitment to developing AI "for our country, of our country, and for our citizens." Training on a rich multilingual dataset is intended to reduce bias and support robust performance across Indic languages and English, promoting equitable AI access. Research on Chitrarth-1 has been presented at academic venues including NeurIPS and the Ninth Conference on Machine Translation.

Chitrarth-1 Architecture and Specifications

Chitrarth-1 uses the Krutrim-7B LLM as its foundation, paired with a vision encoder based on the SigLIP (siglip-so400m-patch14-384) model. The key elements of the design are:

  • A pre-trained SigLIP vision encoder for image feature extraction.
  • A trainable linear mapping layer that projects image features into the LLM's token space.
  • Fine-tuning on instruction-following image-text datasets for improved multimodal performance.
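To make the linear mapping layer concrete, here is a minimal pure-Python sketch of projecting per-patch image features into an LLM-sized embedding space and joining them with text embeddings. It is an illustration, not Krutrim's implementation: the tiny dimensions, the random weights, the helper names (`make_linear`, `project`, `build_llm_input`), and the choice to place image tokens before the text tokens are all assumptions made for readability.

```python
import random

# Hypothetical toy sizes; the real encoder and LLM use much larger dimensions.
NUM_PATCHES = 4   # assumed number of image patches
VISION_DIM = 6    # assumed width of each vision feature vector
LLM_DIM = 8       # assumed width of the LLM token embeddings


def make_linear(in_dim, out_dim, seed=0):
    """Create a randomly initialised weight matrix (out_dim x in_dim) and zero bias."""
    rng = random.Random(seed)
    weight = [[rng.gauss(0.0, 0.02) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias


def project(features, weight, bias):
    """Apply y = W x + b to every patch feature vector."""
    return [
        [sum(w * x for w, x in zip(row, feat)) + b for row, b in zip(weight, bias)]
        for feat in features
    ]


def build_llm_input(image_tokens, text_embeddings):
    """Concatenate projected image tokens with text embeddings (assumed ordering)."""
    return image_tokens + text_embeddings


# Stand-ins for vision encoder output and embedded prompt tokens.
rng = random.Random(1)
patch_features = [[rng.random() for _ in range(VISION_DIM)] for _ in range(NUM_PATCHES)]
text_embeddings = [[0.0] * LLM_DIM for _ in range(3)]

weight, bias = make_linear(VISION_DIM, LLM_DIM)
image_tokens = project(patch_features, weight, bias)
sequence = build_llm_input(image_tokens, text_embeddings)
print(len(sequence), len(sequence[0]))
```

In a real model the projection would be a trained layer in a deep learning framework; the point here is only that each patch vector becomes one extra "token" the LLM can attend to.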

Training Data and Methodology

Chitrarth-1's training involved two phases using a vast, multilingual dataset:

Phase 1: Adapter Pre-training

  • Pre-trained on a diverse dataset translated into multiple Indic languages using an open-source model.
  • Maintained a balanced representation of English and Indic languages to ensure equitable performance.
  • Designed to avoid bias towards any single language, optimizing for efficiency and robustness.
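One simple way to keep languages balanced during pre-training is to draw the same number of examples per language, oversampling smaller pools when needed. The sketch below assumes that approach; the article does not state the actual sampling ratios, and the pool contents, the `balanced_sample` helper, and the language codes are hypothetical placeholders.

```python
import random
from collections import Counter

# ISO 639-1 codes for English plus the ten supported Indic languages.
LANGUAGES = ["en", "hi", "bn", "te", "ta", "mr", "gu", "kn", "ml", "or", "as"]


def balanced_sample(pool_by_language, per_language, seed=0):
    """Draw the same number of examples from each language pool.

    Pools smaller than the quota are sampled with replacement (an assumption);
    empty pools are skipped.
    """
    rng = random.Random(seed)
    batch = []
    for lang, examples in pool_by_language.items():
        if not examples:
            continue
        if len(examples) >= per_language:
            picked = rng.sample(examples, per_language)
        else:
            picked = [rng.choice(examples) for _ in range(per_language)]
        batch.extend((lang, example) for example in picked)
    rng.shuffle(batch)
    return batch


# Hypothetical pools: English is much larger than each translated pool.
pools = {
    lang: [f"{lang}-caption-{i}" for i in range(100 if lang == "en" else 10)]
    for lang in LANGUAGES
}

batch = balanced_sample(pools, per_language=20)
counts = Counter(lang for lang, _ in batch)
print(counts)
```

Equal quotas are only one option; temperature-based or proportional sampling would also fit the description of "balanced representation".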

Phase 2: Instruction Tuning

  • Fine-tuned on a complex instruction dataset to enhance multimodal reasoning capabilities.
  • Utilized an English-based instruction-tuning dataset and its multilingual translations.
  • Included a vision-language dataset featuring diverse Indian imagery (personalities, monuments, artwork, cuisine).
  • Incorporated high-quality proprietary English text data for balanced domain representation.

Performance and Benchmarks

Chitrarth-1 has been evaluated against leading VLMs such as IDEFICS 2 (7B) and PALO 7B. It is reported to outperform them on several benchmarks while remaining competitive on tasks such as TextVQA and VizWiz, and to surpass LLaMA 3.2 11B Vision Instruct on key metrics. Krutrim also introduced BharatBench, a new evaluation suite covering ten under-resourced Indic languages across three tasks, which establishes a baseline for future research and shows how Chitrarth-1 handles these languages. Sample BharatBench results are shown below:

Language     POPE     LLaVA-Bench   MMVet
Telugu       79.9     54.8          43.76
Hindi        78.68    51.5          38.85
Bengali      83.24    53.7          33.24
Malayalam    85.29    55.5          25.36
Kannada      85.52    58.1          46.19
English      87.63    67.9          30.49
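As a quick way to read the sample scores listed above, the following sketch computes the per-task mean across the six listed languages and finds the top-scoring language for each task. The numbers are copied from the sample results; the helper names `task_means` and `best_language` are illustrative, and the averages describe only this sample, not the full BharatBench suite.

```python
from statistics import mean

# Sample BharatBench scores as listed in the article.
RESULTS = {
    "Telugu":    {"POPE": 79.9,  "LLaVA-Bench": 54.8, "MMVet": 43.76},
    "Hindi":     {"POPE": 78.68, "LLaVA-Bench": 51.5, "MMVet": 38.85},
    "Bengali":   {"POPE": 83.24, "LLaVA-Bench": 53.7, "MMVet": 33.24},
    "Malayalam": {"POPE": 85.29, "LLaVA-Bench": 55.5, "MMVet": 25.36},
    "Kannada":   {"POPE": 85.52, "LLaVA-Bench": 58.1, "MMVet": 46.19},
    "English":   {"POPE": 87.63, "LLaVA-Bench": 67.9, "MMVet": 30.49},
}


def task_means(results):
    """Average each task's score over all listed languages."""
    tasks = list(next(iter(results.values())).keys())
    return {task: mean(row[task] for row in results.values()) for task in tasks}


def best_language(results, task):
    """Return the language with the highest score on the given task."""
    return max(results, key=lambda lang: results[lang][task])


means = task_means(RESULTS)
for task, value in means.items():
    print(f"{task}: mean {value:.1f}, best {best_language(RESULTS, task)}")
```

In this sample, English leads on POPE and LLaVA-Bench, while Kannada has the highest MMVet score.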

Accessing Chitrarth-1

Chitrarth-1 is accessible through:

  • Hugging Face: for direct use or fine-tuning.
  • GitHub: code (provided in the original article).
  • Krutrim Cloud: hosted access.

Chitrarth-1 in Action

Examples of Chitrarth-1's capabilities include image analysis, image caption generation, and UI/UX screen analysis (images provided in the original article).

Conclusion

Krutrim AI Labs, a division of the Ola Group, is committed to building the future of AI computing. With Chitrarth-1 and other offerings such as GPU as a Service and AI Studio, it aims to set a new standard for inclusive, culturally sensitive AI and to foster a more equitable technological landscape.
