

Google Research Africa Launches WAXAL, Open Dataset Covering 21 African Languages

February 2, 2026
2 min read
Author: Akim Benamara

Developed over three years, WAXAL aims to empower researchers and support the creation of inclusive technology across Africa.

For many people around the world, speaking to devices has become second nature, whether to get directions, check the news, or dictate voice notes. That convenience often disappears when technology cannot understand local languages. This is the reality for hundreds of millions of people, particularly in Sub-Saharan Africa, where over 2,000 distinct languages are spoken. The main obstacle to developing inclusive voice technology for the region has been the lack of accessible, high-quality speech data.

To address this gap, Google Research Africa has introduced WAXAL, an open dataset named after the Wolof word for “speak.” Developed over three years, WAXAL aims to empower researchers and support the creation of inclusive technology across Africa. The dataset covers 21 languages, including Acholi, Hausa, Luganda, and Yoruba, and comprises over 11,000 hours of speech data from nearly two million recordings. It includes approximately 1,250 hours of transcribed speech for automatic speech recognition (ASR) and more than 20 hours of studio recordings for text-to-speech (TTS) applications.

The project is a collaborative effort led by African institutions and experts. Makerere University in Uganda and the University of Ghana collected data for 13 languages, while Digital Umuganda in Rwanda led data collection for five additional languages. High-quality studio recordings were produced in partnership with Media Trust and Loud n Clear, and the African Institute for Mathematical Sciences (AIMS) contributed multilingual datasets for future expansions. The framework ensures that partners retain ownership of the data they collected while making resources available to the global research community.

WAXAL is designed to capture authentic speech in an ethical way. It combines everyday language use, such as participants describing pictures in their native languages, with professional voice recordings for text-to-speech development. Beyond supporting AI innovation, WAXAL is expected to aid in the digital preservation of African languages. The full dataset is released under an open license and is available today on Hugging Face, with detailed methodology published in an accompanying research paper.
