Google officially advanced its Gemini 2.5 model lineup on June 17, 2025: Gemini 2.5 Pro and Gemini 2.5 Flash are now stable and generally available, while Gemini 2.5 Flash‑Lite enters public preview. Pro offers top-tier reasoning, multimodal understanding, and coding capability, and handles a context of up to 1 million tokens, making it suited to complex, mission-critical tasks. Flash balances speed, cost efficiency, and robust reasoning, with simplified pricing of $0.30 per million input tokens and $2.50 per million output tokens. Flash‑Lite is the fastest and most economical option, optimized for high-throughput tasks such as translation and classification, with reasoning off by default and support for tool use. All three models share the million-token context window, adjustable “thinking” control, native multimodality, and tool use including grounding with Google Search, code execution, and function calling, and all are accessible through the Gemini app, Google AI Studio, Vertex AI, and more.
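
For a concrete sense of API access, the sketch below makes a single call to the generally available Flash model through the Gemini API with the google-genai Python SDK. The SDK name, environment variable, and model ID reflect Google's public documentation, but treat them as assumptions to verify against the current docs.

    # Minimal sketch: one call to Gemini 2.5 Flash via the google-genai Python SDK.
    # Assumes `pip install google-genai` and an API key exported as GEMINI_API_KEY
    # (or passed explicitly via api_key=...).
    from google import genai

    client = genai.Client()  # picks up the API key from the environment

    response = client.models.generate_content(
        model="gemini-2.5-flash",  # stable, generally available model ID
        contents="In one sentence, what does a 1 million-token context window allow?",
    )
    print(response.text)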

Here are some key features across the Gemini 2.5 models:

  • Hybrid Reasoning Models: Reasoning can be switched on, off, or budgeted per request, letting each model deliver strong quality while staying efficient on cost and speed.
  • “Thinking” Capabilities: The models can reason internally before responding, which improves accuracy on complex prompts. Developers control the “thinking budget” to balance quality, latency, and cost (see the sketch after this list).
  • Native Multimodality: Understands and processes inputs across various modalities including text, images, audio, and video.
  • Long Context Window: All three models feature a 1 million-token context length, allowing them to take in vast amounts of material and tackle complex problems that draw on multiple information sources.
  • Tool Integration: Models can connect to tools such as grounding with Google Search, code execution, and function calling.
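
As referenced in the list above, the following sketch shows how a developer might cap the thinking budget and attach Google Search grounding in a single request, again using the google-genai Python SDK. The field names (thinking_config, thinking_budget, google_search) are assumptions drawn from the SDK's documented types and may change between releases.

    # Sketch: bounded "thinking" plus Google Search grounding on Gemini 2.5 Flash.
    from google import genai
    from google.genai import types

    client = genai.Client()

    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents="Which Gemini 2.5 models reached general availability in June 2025?",
        config=types.GenerateContentConfig(
            # Cap internal reasoning tokens; 0 disables thinking, larger values
            # trade extra latency and cost for accuracy on harder prompts.
            thinking_config=types.ThinkingConfig(thinking_budget=1024),
            # Ground the answer in live Google Search results.
            tools=[types.Tool(google_search=types.GoogleSearch())],
        ),
    )
    print(response.text)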

Gemini 2.5 Pro:

  • Most Advanced Model: Excels at coding and highly complex tasks.
  • Enhanced Reasoning: State-of-the-art in key math and science benchmarks.
  • Advanced Coding: Capable of generating code for web development tasks and creating interactive simulations.

Gemini 2.5 Flash:

  • Fast Performance: Optimized for everyday tasks and large-scale processing.
  • Cost-Efficient: Balances price and performance.
  • Live API Native Audio: Produces natural, conversational audio output with improved voice quality and adaptability, including features such as Proactive Audio and Affective Dialog.

Gemini 2.5 Flash-Lite:

  • Most Cost-Efficient and Fastest: Designed for high-volume, latency-sensitive tasks such as translation and classification (a sketch follows this list).
  • Higher Quality: Outperforms 2.0 Flash-Lite on coding, math, science, reasoning, and multimodal benchmarks.
  • Lower Latency: Offers lower latency compared to 2.0 Flash-Lite and 2.0 Flash for a broad range of prompts.
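
As noted above, Flash-Lite targets high-volume, latency-sensitive work; the sketch below runs a small batch classification with thinking explicitly disabled. The preview model ID shown is an assumption and may differ from the one exposed in AI Studio or Vertex AI.

    # Sketch: high-throughput classification with Gemini 2.5 Flash-Lite (preview).
    from google import genai
    from google.genai import types

    client = genai.Client()

    LABELS = ["billing", "bug report", "feature request"]
    tickets = [
        "I was charged twice this month.",
        "The export button crashes the app on save.",
    ]

    for ticket in tickets:
        response = client.models.generate_content(
            # Preview model ID is an assumption; check the model list before use.
            model="gemini-2.5-flash-lite-preview-06-17",
            contents=f"Classify this support ticket as one of {LABELS}: {ticket}",
            config=types.GenerateContentConfig(
                # Flash-Lite ships with thinking off by default; setting the
                # budget to 0 makes that explicit for latency-sensitive batches.
                thinking_config=types.ThinkingConfig(thinking_budget=0),
            ),
        )
        print(f"{ticket} -> {response.text.strip()}")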

Gemini Technical Report