Microsoft Launches Project Gecko to End Generative AI’s Language Divide
Microsoft Research launched Project Gecko, an initiative dedicated to creating cost-effective and customizable AI systems for the global majority.
Generative AI tools are significantly boosting productivity worldwide, but their effectiveness is not uniform across all communities. Systems often perform poorly for populations under-represented online because their training data comes primarily from well-resourced languages and cultures. As a result, generative AI struggles in many local languages and fails to reflect diverse social and cultural realities. Infrastructure issues contribute, but even after accounting for factors like GDP and internet access, AI adoption remains lower in nations where low-resource languages dominate.
To address these inequities, Microsoft Research launched Project Gecko, an initiative dedicated to creating cost-effective and customizable AI systems for the global majority. This project focuses on delivering vital expertise using local languages, culturally sensitive content, and multimodal engagement through text, voice, and video. It is a collaborative effort involving researchers from Microsoft Research Africa (Nairobi), Microsoft Research India, and the Microsoft Research Accelerator in the United States, along with Digital Green—a global development organization focused on community-driven digital infrastructure for agriculture—and other partners in agri-tech, philanthropy, and academia.
A core innovation of Project Gecko is MMCTAgent, a multimodal critical thinking agent framework. The system analyzes speech, images, and videos to provide relevant, context-aware responses. MMCTAgent is currently available on Azure AI Foundry Labs, and its code can be accessed on GitHub. This work aligns with Microsoft’s mission to empower everyone globally: developing equitable generative AI that incorporates culturally nuanced experiences is key to advancing AI responsibly and inclusively.
Project Gecko chose to begin its work in agriculture because of the sector’s strategic role in advancing climate, health, and education outcomes simultaneously. The initial focus is on small farms in India and Kenya, where millions of people could benefit from technology to boost crop yields and resilience against volatile climates. The project is built on VeLLM (uniVersal Empowerment with LLMs), a Microsoft Research India platform that supports AI systems creating multilingual, multimodal content grounded in culturally relevant data. VeLLM uses community-contributed data to improve performance in non-English languages, exemplified by its use in developing the Shiksha copilot for teachers in rural India.
Agriculture is a critical economic driver in both target regions, accounting for a significant portion of GDP and employing millions of people, primarily smallholder farmers. Digital services exist to help farmers with challenges like weather and pests, but they rely on large language models (LLMs) trained mostly on English and other Western languages. Farmers therefore struggle to get accurate answers when they use local languages and domain-specific terms, and usage stays low. Kenya and India also have strong oral cultures in which voice and video are the preferred ways to share information, so multimodal approaches are necessary. Furthermore, limited connectivity means any system must run on low bandwidth and minimal computing power.
The project is working closely with FarmerChat, a speech-first AI assistant from Digital Green that advises millions of farmers with trusted agricultural recommendations. Digital Green has curated a library of over 10,000 videos in more than 40 languages and dialects over two decades, a vast, underutilized reservoir of local knowledge. Project Gecko’s goal was to evolve FarmerChat from a basic Q&A tool into a trusted farming companion. The team envisioned farmers submitting queries via speech or text and receiving actionable, step-by-step answers in their preferred language through text, voice, and a video that starts precisely at the relevant solution.
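The idea of a video that starts at the relevant solution can be illustrated with a minimal sketch. This is not FarmerChat's or Digital Green's implementation, which the article does not describe. The `Segment` type, the sample transcript, and the simple word-overlap scoring are all hypothetical and chosen only to show how a query could be matched to a timestamped transcript segment.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Segment:
    """Hypothetical timestamped slice of a video transcript."""
    video_id: str
    start_seconds: int
    text: str


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and strip basic punctuation from each word."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return {w for w in words if w}


def best_segment(query: str, segments: List[Segment]) -> Optional[Segment]:
    """Return the segment sharing the most words with the query, or None."""
    if not segments:
        return None
    query_words = tokenize(query)
    # Score each segment by word overlap; ties go to the earliest segment.
    scored = [(len(query_words & tokenize(s.text)), s) for s in segments]
    score, segment = max(scored, key=lambda pair: pair[0])
    return segment if score > 0 else None


# Illustrative data only, not taken from any real video library.
SAMPLE_SEGMENTS = [
    Segment("vid-1", 0, "Introduction to maize planting season"),
    Segment("vid-1", 95, "How to control armyworm pests on maize leaves"),
    Segment("vid-1", 240, "Storing the harvest in dry bags"),
]

match = best_segment("How do I control armyworm on my maize?", SAMPLE_SEGMENTS)
if match is not None:
    print(f"Play {match.video_id} from {match.start_seconds} seconds")
```

A production system would more plausibly use multilingual embeddings than word overlap, since overlap fails when the query and the transcript are in different languages. The sketch only shows the shape of the lookup.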
The MMCTAgent framework is central to achieving this by enhancing experimental frontier models through domain-specific tools. It processes various information types—audio, visual, and text—and breaks down complex questions. It uses techniques like NLP and computer vision to better understand the videos and transcripts in the Digital Green library, making them searchable and accessible. MMCTAgent adapts its reasoning and verifies its answers using a built-in critic to ensure accuracy and relevance. The resulting multimodal answers are culturally and linguistically appropriate because they are grounded in content created by local communities. Field studies in Kenya and India confirmed that the system provides better response quality, usability, and user trust compared to more generic state-of-the-art models.
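The propose-then-verify pattern implied by a built-in critic can be sketched as follows. This is an assumption-laden illustration, not MMCTAgent's actual API or architecture. The function names, the retry limit, and the toy proposer and critic are all hypothetical, and a real system would call models and tools where this sketch uses fixed strings.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signatures: a proposer drafts an answer (optionally using
# critic feedback), and a critic returns (accepted, feedback).
Proposer = Callable[[str, Optional[str]], str]
Critic = Callable[[str, str], Tuple[bool, str]]


def answer_with_critic(
    question: str,
    propose: Proposer,
    critique: Critic,
    max_rounds: int = 3,
) -> Optional[str]:
    """Draft an answer, let the critic check it, and retry with feedback."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        draft = propose(question, feedback)
        accepted, feedback = critique(question, draft)
        if accepted:
            return draft
    return None  # No draft passed the critic within the round limit.


# Toy stand-ins for illustration only.
EVIDENCE = "Spray neem extract on maize leaves early in the morning."


def toy_propose(question: str, feedback: Optional[str]) -> str:
    # First attempt is ungrounded; after feedback, answer from the evidence.
    if feedback is None:
        return "Water the crop more often."
    return "Spray neem extract on the leaves."


def toy_critique(question: str, draft: str) -> Tuple[bool, str]:
    # Accept only drafts that mention the remedy found in the evidence.
    grounded = "neem" in draft.lower() and "neem" in EVIDENCE.lower()
    message = "" if grounded else "Answer is not supported by the evidence."
    return grounded, message
```

Calling `answer_with_critic("How do I treat armyworm?", toy_propose, toy_critique)` rejects the first draft and returns the second. Returning `None` when every round fails leaves the caller free to fall back to a clarifying question instead of an unverified answer.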
To overcome the lack of training data and computational resources for low-resource languages, the Project Gecko team is building new models from scratch, including automatic speech recognition (ASR) and text-to-speech (TTS) models. They are also using small language models (SLMs), which require significantly less computing power than massive LLMs and are therefore easier to fine-tune for targeted domains and languages. The result is a set of tailored speech models and SLMs for languages like Kiswahili, Hindi, and Kikuyu that are locally adapted and continuously improved with user data. The team has expanded language support to six languages in Kenya by incorporating a large crowd-sourced dataset. They are also developing enhancements for FarmerChat, such as the ability to ask clarifying questions and support for peer-to-peer sharing.
Looking ahead, Project Gecko plans to expand its impact into other domains, including healthcare, education, and retail. By analyzing the successful design patterns and infrastructure used in agriculture, Microsoft aims to create generalizable solutions. The team will soon release a multilingual playbook to provide end-to-end guidance for developers on building domain-specific multilingual AI applications, drawing on the cross-cultural experiences of the Microsoft Research teams in India and Kenya. The ultimate goal is to ensure the next generation of AI is globally inclusive, culturally relevant, and shaped by the communities it is designed to serve.

