Gemini 3.1 Flash TTS
Summary
Google released Gemini 3.1 Flash TTS, a new text-to-speech model that generates audio from text through the standard Gemini API. Unlike typical text-to-speech models, it accepts detailed creative instructions in the prompt that control how the audio sounds, including vocal style, pace, accent, and emotional tone, so users can request speech with specific characteristics such as a particular regional accent or an energetic delivery.
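To make the prompt-driven control concrete, here is a minimal sketch of how a request might be assembled. It only builds the JSON body and does not call the network. Several things are assumptions not stated in this summary: the model ID `gemini-3.1-flash-tts-preview`, the voice name `Kore`, and the request shape, which is assumed to mirror the `generateContent` REST body used by earlier Gemini TTS models. The helper names `build_style_prompt` and `build_tts_request` are hypothetical.

```python
import json

# Assumption: placeholder model ID; check the official model list before use.
MODEL_ID = "gemini-3.1-flash-tts-preview"

# Assumption: the endpoint follows the existing Gemini generateContent pattern.
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL_ID}:generateContent"
)


def build_style_prompt(text, accent=None, pace=None, tone=None):
    """Prefix the text to be spoken with plain-language delivery directions."""
    directions = []
    if accent:
        directions.append(f"Accent: {accent}.")
    if pace:
        directions.append(f"Pace: {pace}.")
    if tone:
        directions.append(f"Tone: {tone}.")
    if not directions:
        return text
    return " ".join(directions) + f"\nSay the following: {text}"


def build_tts_request(prompt, voice_name="Kore"):
    """Build a request body (hypothetical shape, modeled on earlier Gemini TTS)."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            # Ask for audio rather than text in the response.
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "voiceConfig": {
                    # Assumption: this voice name is available for this model.
                    "prebuiltVoiceConfig": {"voiceName": voice_name}
                }
            },
        },
    }


prompt = build_style_prompt(
    "Welcome aboard, the tour starts in five minutes.",
    accent="a regional British accent",
    pace="brisk",
    tone="energetic and friendly",
)
payload = build_tts_request(prompt)
print(ENDPOINT)
print(json.dumps(payload, indent=2))
```

The body would be sent as a POST to the endpoint with an API key. The style directions live in the prompt text itself rather than in dedicated parameters, which is the behavior the summary describes.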
Original source: https://simonwillison.net/2026/Apr/15/gemini-31-flash-tts/#atom-everything
First tracked: April 15, 2026 at 02:00 PM
Classified by LLM (prompt v3) · confidence: 85%