Google unveils Gemini 2.5 Flash: A hybrid model with enhanced reasoning & thinking budget
Google has begun rolling out an early version of Gemini 2.5 Flash in preview through the Gemini API, accessible via Google AI Studio and Vertex AI. The release builds on the 2.0 Flash foundation, offering a major upgrade in reasoning capabilities while keeping the emphasis on speed and cost efficiency.
Gemini 2.5 Flash is Google's first fully hybrid reasoning model, allowing developers to toggle thinking on or off. It introduces a thinking budget, which lets developers balance quality, cost, and latency by capping the number of tokens the model may generate while reasoning. The budget is a ceiling rather than a target: the model adjusts how long it reasons based on task complexity, so a simple prompt does not consume the full budget.
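To make the thinking budget concrete, here is a minimal sketch of how a request body carrying a budget might be assembled. It makes no network call. The helper name build_request is hypothetical, and the field names (generationConfig, thinkingConfig, thinkingBudget) are assumed from the Gemini REST API's request layout, as is the convention that a budget of 0 switches thinking off; check the current API reference before relying on any of them.

```python
import json


def build_request(prompt: str, thinking_budget: int) -> dict:
    """Build a generateContent-style request body with a cap on thinking tokens.

    Hypothetical helper: the field names below are assumptions based on the
    Gemini REST API layout and are not confirmed by this article.
    """
    if thinking_budget < 0:
        raise ValueError("thinking_budget must be non-negative")
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            # Upper bound on reasoning tokens; the model may use fewer.
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


# Assumed convention: a budget of 0 turns thinking off for latency-sensitive calls.
fast = build_request("What is 2 + 2?", thinking_budget=0)

# A larger budget leaves room for multi-step reasoning on harder prompts.
careful = build_request("Plan a three-city rail itinerary.", thinking_budget=1024)

print(json.dumps(careful, indent=2))
```

Keeping the budget as an explicit per-request argument reflects the trade-off described above: the same application can send cheap, fast requests for simple prompts and allow more reasoning only where quality matters.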
Even with thinking disabled, developers still benefit from the model's speed improvements over 2.0 Flash. The preview of Gemini 2.5 Flash, including its reasoning capabilities, is now available via the Gemini API and in a dedicated dropdown within the Gemini app.
