Introducing Munsit

The World’s Most Accurate Arabic Speech-to-Text Model

Trained on 15,000 hours of diverse Arabic speech using weak supervision, Munsit-1 achieves a lower average word error rate (WER) than models from OpenAI, Meta and Microsoft across six standard Arabic benchmarks.

Benchmark

The leader in Arabic ASR Performance

Munsit-1 delivers the lowest average word error rate across six standard Arabic ASR benchmarks, ahead of top-tier models such as Whisper (OpenAI), SeamlessM4T (Meta) and NVIDIA Conformer.

Munsit V1: Arabic ASR models benchmark

Average Word Error Rate (WER) Comparison Across 6 Industry-Standard Datasets

Munsit-1 was tested on six key Arabic speech benchmarks: SADA, Common Voice, MASC (clean), MASC (noisy), Casablanca, and MGB-2. It records the lowest average WER among the compared models, 26.68%.
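For readers unfamiliar with the metric, the sketch below shows one common way WER is computed: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, then averaged across datasets. It is an illustration only, not the evaluation code behind the figures on this page. The function names, the toy sentences, and the choice of an unweighted mean across datasets are assumptions; any text normalization applied before scoring is not modeled here.

```python
from typing import Dict, Iterable, Sequence, Tuple


def word_errors(reference: Sequence[str], hypothesis: Sequence[str]) -> int:
    """Word-level Levenshtein distance between a reference and a hypothesis."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(pairs: Iterable[Tuple[str, str]]) -> float:
    """Corpus-level WER in percent: total word errors / total reference words."""
    pairs = list(pairs)
    errors = sum(word_errors(ref.split(), hyp.split()) for ref, hyp in pairs)
    words = sum(len(ref.split()) for ref, _ in pairs)
    if words == 0:
        raise ValueError("reference transcripts contain no words")
    return 100.0 * errors / words


def average_wer(per_dataset: Dict[str, float]) -> float:
    """Unweighted mean of per-dataset WERs (an assumed averaging scheme)."""
    return sum(per_dataset.values()) / len(per_dataset)


# Toy (reference, hypothesis) pairs, invented for illustration.
toy_pairs = [
    ("مرحبا بكم في دبي", "مرحبا بك في دبي"),  # one substituted word
    ("شكرا جزيلا", "شكرا"),                    # one deleted word
]
print(f"{wer(toy_pairs):.2f}")  # 2 errors over 6 reference words
```

Because WER counts insertions as errors, it can exceed 100% on very poor transcripts; lower is better.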

Model                              WER
CNTXT AI Munsit-1                  26.68%
NVIDIA Conformer-CTC Large V3      34.74%
OpenAI Whisper                     36.86%
Meta SeamlessM4T V2 Large          38.16%
ElevenLabs Scribe-V1               40.05%
OpenAI GPT-4 Transcribe            44.94%
Microsoft Azure STT                45.72%
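To put the gaps in the table in perspective, the short sketch below computes the relative WER reduction of Munsit-1 against each other model, using only the averages listed above. The helper name `relative_reduction` is hypothetical, and the calculation is a reader's convenience rather than a figure published with the benchmark.

```python
# Average WER (%) per model, copied from the table above.
average_wer_percent = {
    "CNTXT AI Munsit-1": 26.68,
    "NVIDIA Conformer-CTC Large V3": 34.74,
    "OpenAI Whisper": 36.86,
    "Meta SeamlessM4T V2 Large": 38.16,
    "ElevenLabs Scribe-V1": 40.05,
    "OpenAI GPT-4 Transcribe": 44.94,
    "Microsoft Azure STT": 45.72,
}


def relative_reduction(model_wer: float, other_wer: float) -> float:
    """Percentage by which model_wer is lower than other_wer."""
    return 100.0 * (other_wer - model_wer) / other_wer


munsit = average_wer_percent["CNTXT AI Munsit-1"]
for name, value in average_wer_percent.items():
    if name == "CNTXT AI Munsit-1":
        continue
    print(f"{name}: {relative_reduction(munsit, value):.1f}% fewer word errors")
```

For example, against the next-best model in the table (34.74%), the reduction is about 23% in relative terms, or 8.06 percentage points in absolute terms.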
Why Munsit

#1 for Arabic Speech: Why Munsit Leads the Way

Most ASR models struggle with Arabic. Munsit was built for it.

15,000 hours

Of diverse Arabic speech used for training

Measured for real-world accuracy

Benchmarked on six standard Arabic ASR datasets

Robust across 25+ dialects

From the Arabic-speaking world

Dialect-agnostic processing

That adapts to regional variations

Conformer architecture

Optimized for Arabic speech

Performs in clean and noisy conditions

With consistent accuracy across devices and recording environments