Gemini 3.1 Flash TTS: the next generation of expressive AI speech
Summary
Google has released Gemini 3.1 Flash TTS, a new text-to-speech model (software that converts written text into spoken audio) that produces more natural-sounding speech and gives developers finer control over how it is delivered. Developers can use audio tags (special commands embedded in the text) to adjust vocal style, pace, and delivery in more than 70 languages. All generated audio is watermarked with SynthID (a hidden marker that identifies AI-generated content) to help prevent misinformation.
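To make the idea of tags embedded in text concrete, here is a minimal Python sketch that assembles a script string with inline tags. The square-bracket syntax, the tag names ("whispering", "fast"), and the helper functions are all assumptions made for illustration; the summary does not specify the real tag format, and nothing here calls the actual Gemini API.

```python
from typing import List, Optional, Tuple


def with_audio_tag(tag: str, text: str) -> str:
    """Prefix text with an inline tag. The [tag] syntax is hypothetical."""
    tag = tag.strip()
    if not tag:
        raise ValueError("tag must be non-empty")
    return f"[{tag}] {text.strip()}"


def build_script(segments: List[Tuple[Optional[str], str]]) -> str:
    """Join (tag, text) pairs into one string; a tag of None means untagged text."""
    parts = []
    for tag, text in segments:
        parts.append(with_audio_tag(tag, text) if tag else text.strip())
    return " ".join(parts)


if __name__ == "__main__":
    # Tag names below are illustrative placeholders, not documented values.
    script = build_script([
        (None, "Welcome back."),
        ("whispering", "Here is a secret."),
        ("fast", "Now, quickly, the rest of the news."),
    ])
    print(script)
```

Keeping tags as data in (tag, text) pairs, rather than hand-writing them into the string, would make it straightforward to swap in the real syntax once it is confirmed from the official documentation.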
Original source: https://deepmind.google/blog/gemini-3-1-flash-tts-the-next-generation-of-expressive-ai-speech/
First tracked: April 15, 2026 at 02:00 PM
Classified by LLM (prompt v3) · confidence: 92%