Microsoft AI Introduces MAI-Transcribe-1.5: 2.4% WER on Artificial Analysis, Best-in-Class FLEURS Accuracy, and Up to 5x Faster Long-Audio Transcription
English summary
Microsoft AI has announced MAI-Transcribe-1.5, an updated automatic speech recognition (ASR) model that covers 43 languages with a single system. It posts a 2.4% word error rate (WER) on the Artificial Analysis leaderboard, and Microsoft claims best-in-class accuracy on the FLEURS benchmark. Long-audio transcription is up to 5x faster, with an hour of audio transcribed in under 15 seconds. A new keyword biasing feature reduces errors on domain-specific terms by up to 30%. MAI-Transcribe-1.5 is integrated into Microsoft products such as Copilot, Teams, and Dynamics 365, and is available through Azure AI Foundry.
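For readers unfamiliar with the headline metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The sketch below illustrates that standard definition only. The lowercase-and-split normalization is an assumption made for illustration, and nothing here reflects the actual scoring pipeline used by Artificial Analysis or FLEURS.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution plus one deletion against six reference words.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.1%}")
```

On this scale, a 2.4% WER corresponds to roughly 24 word errors per 1,000 reference words.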
Key points
2.4% word error rate (WER) on the Artificial Analysis leaderboard, ranked third
Claimed best-in-class accuracy on the FLEURS benchmark across 43 languages
Up to 5x faster long-audio transcription; transcribes 1 hour in under 15 seconds
Keyword biasing feature with up to 30% WER reduction on domain-specific terms
Supports 43 languages, including new South Asian and European languages
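The speed and keyword-biasing figures above are easier to read as plain arithmetic. The following is a minimal sketch, not anything taken from Microsoft's tooling: both function names are hypothetical, and the 10% baseline error rate on domain terms is an invented illustration value, since the announcement gives only the relative "up to 30%" reduction.

```python
def realtime_speedup(audio_seconds: float, processing_seconds: float) -> float:
    """How many seconds of audio are transcribed per second of processing."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def apply_relative_reduction(baseline_rate: float, relative_reduction: float) -> float:
    """Error rate that remains after a relative (not absolute) reduction."""
    return baseline_rate * (1.0 - relative_reduction)


if __name__ == "__main__":
    # One hour of audio in 15 seconds, the upper bound quoted above.
    print(f"Speedup over real time: {realtime_speedup(3600, 15):.0f}x")

    # Hypothetical 10% baseline error rate on domain-specific terms.
    remaining = apply_relative_reduction(0.10, 0.30)
    print(f"Error rate after a 30% relative reduction: {remaining:.1%}")
```

Because the hour is transcribed in under 15 seconds, the throughput is at least 240 times real time. The 30% figure is relative, so the absolute gain depends entirely on the starting error rate.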