Google Gemini 2.5 Flash-Lite launches with lower pricing and 1M context window
Gemini 2.5 Flash-Lite is now generally available after a month of early access. Designed for speed and affordability, it’s Google’s fastest Gemini model, now ready for production via Vertex AI and Google AI Studio. Compared to its predecessor, it offers quicker, more consistent performance in translation, basic coding, and multimodal input handling.
Google has also cut prices significantly: input tokens now cost $0.10 per million and output tokens $0.40 per million, with audio input rates reduced by 40%. That makes Flash-Lite cheaper than alternatives like OpenAI’s o4-mini and Anthropic’s Claude Sonnet 4. It supports a one-million-token context window, adjustable thinking budgets, code execution, and grounding with Google Search.
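To put the per-million-token rates in concrete terms, here is a minimal Python sketch that estimates the cost of a single request. It uses only the $0.10 input and $0.40 output figures quoted above; the function and constant names are illustrative, and the sketch ignores audio input, whose exact rate is not given here.

```python
# Rates quoted in the article, in USD per one million tokens.
INPUT_USD_PER_MILLION = 0.10
OUTPUT_USD_PER_MILLION = 0.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD.

    Illustrative helper (not part of any Google SDK). Audio input is
    billed at a different rate and is not modeled here.
    """
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 200,000-token prompt with a 5,000-token reply.
    print(f"${estimate_cost_usd(200_000, 5_000):.4f}")
```

At these rates, even a prompt that fills the entire one-million-token context window would cost about $0.10 in input tokens.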
Developers still using the preview version must switch to the gemini-2.5-flash-lite model name before the preview alias is retired on August 25.
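For teams that keep model names in configuration, the switch can be as small as a string substitution. The sketch below is one hedged way to do it in Python. The target name gemini-2.5-flash-lite comes from the announcement, but the preview prefix is an assumption (the exact preview alias is not stated here), so check it against the model names your project actually uses.

```python
# Stable model name, as given in the announcement.
GA_MODEL_ID = "gemini-2.5-flash-lite"

# ASSUMPTION: preview aliases start with this prefix. The exact preview
# name is not given in the article; verify it before relying on this.
PREVIEW_PREFIX = "gemini-2.5-flash-lite-preview"


def migrate_model_id(model_id: str) -> str:
    """Return the stable model name for a preview alias.

    Any other model name is returned unchanged. Illustrative helper,
    not part of any Google SDK.
    """
    if model_id.startswith(PREVIEW_PREFIX):
        return GA_MODEL_ID
    return model_id
```

Running every configured model name through a helper like this before the retirement date keeps the change in one place instead of scattered across call sites.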
