Microsoft launches MAI‑Voice‑1 and MAI‑1‑preview — two in‑house AI models
Microsoft has announced two AI models trained entirely in‑house: MAI‑Voice‑1, its first natural speech generation model, and MAI‑1‑preview, a text foundation model trained end‑to‑end. The company is already using MAI‑Voice‑1 in Copilot Daily and Podcasts, while MAI‑1‑preview is available for public testing on LMArena and will be previewed in select Copilot scenarios in the coming weeks.
Key facts
- MAI‑Voice‑1: Microsoft’s first natural speech generation model. It is engineered for efficiency: Microsoft says it can generate a full minute of audio in under a second on a single GPU. Currently used in Copilot Daily and Podcasts.
- MAI‑1‑preview: Text foundation model trained end‑to‑end in‑house. It was reportedly trained on roughly 15,000 Nvidia H100 GPUs and is now in public testing on LMArena.
- Training focus: Microsoft emphasizes efficiency and data curation — minimizing wasted compute and selecting high‑value training tokens, per Mustafa Suleyman.
- Strategic context: Microsoft’s Copilot still relies heavily on OpenAI technology, so building its own models signals a move toward greater independence and toward competing directly in the foundation‑model space.
Quotes
Mustafa Suleyman (CEO of Microsoft AI): “Increasingly, the art and craft of training models is selecting the perfect data and not wasting any of your flops on unnecessary tokens that didn’t actually teach your model very much.”
Where to read more
- LMArena (MAI‑1‑preview public tests): https://lmarena.ai/
- Coverage summary: Neowin article
- Analysis: StartupHub.ai
Notes
This post is based on Microsoft statements and reporting from technology outlets. It excludes links that point to RSS feeds of source articles. For public benchmarking and interaction with MAI‑1‑preview, see the LMArena page linked above.