Google has unveiled Gemini 3.1 Flash-Lite, the fastest and most cost-efficient model in its Gemini 3 series, designed to support high-volume developer workloads and real-time AI applications at scale.
The new model is rolling out in preview for developers through the Gemini API in Google AI Studio and for enterprise users via Vertex AI. According to Google, the model is optimised for speed, affordability and high-frequency workflows while maintaining strong performance across reasoning and multimodal tasks.
Priced at $0.25 per million input tokens and $1.50 per million output tokens, Gemini 3.1 Flash-Lite offers significantly lower costs than larger AI models. Benchmark results from Artificial Analysis show the model reaching its first answer token 2.5 times faster and generating output 45 per cent faster than Gemini 2.5 Flash, while maintaining similar or better response quality.
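At those published rates, per-request cost is straightforward to estimate. The sketch below is illustrative only; the token counts in the example are invented, and only the two prices come from the announcement.

```python
# Estimating request cost at Gemini 3.1 Flash-Lite's published rates.
# Prices from the announcement: $0.25 per 1M input tokens, $1.50 per 1M output tokens.
INPUT_PRICE_PER_MILLION = 0.25   # USD per million input tokens
OUTPUT_PRICE_PER_MILLION = 1.50  # USD per million output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_MILLION \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_MILLION

# Hypothetical batch: 10,000 requests averaging 2,000 input and 500 output tokens each.
batch_cost = sum(estimate_cost(2_000, 500) for _ in range(10_000))
print(f"${batch_cost:.2f}")  # → $12.50
```

At these prices a high-volume workload of ten thousand mid-sized requests costs on the order of a few dollars, which is the economics the "Flash-Lite" tier targets.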
The model has also demonstrated strong benchmark performance, achieving an Elo score of 1432 on the Arena.ai Leaderboard and scoring 86.9 per cent on GPQA Diamond and 76.8 per cent on MMMU Pro, outperforming several models in its category and even surpassing earlier Gemini models in some evaluations.
Gemini 3.1 Flash-Lite includes adjustable thinking levels, enabling developers to control how much computational reasoning the model applies to a task. This feature allows teams to balance speed, cost and reasoning depth depending on the complexity of workloads.
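In practice, a team might route each request to a thinking level based on how demanding the task is. The announcement does not specify the API surface for this feature, so the function name, the level strings, and the keyword heuristic below are all hypothetical, purely to illustrate the speed-versus-depth trade-off.

```python
# Hypothetical sketch: choosing a thinking level per task.
# "high"/"low" labels and choose_thinking_level are illustrative,
# not actual Gemini API parameter names or values.

def choose_thinking_level(task: str) -> str:
    """Pick a reasoning depth: minimal for simple tasks, deeper for complex ones."""
    complex_markers = ("analyse", "plan", "prove", "multi-step", "debug")
    if any(marker in task.lower() for marker in complex_markers):
        return "high"  # more computational reasoning: slower, costlier, deeper
    return "low"       # minimal reasoning: fastest and cheapest

print(choose_thinking_level("Translate this sentence into French"))  # → low
print(choose_thinking_level("Plan a multi-step data migration"))     # → high
```

The point of the dial is that simple, high-frequency tasks (translation, moderation) run at the fast, cheap end, while the occasional complex request can borrow more reasoning without switching models.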
The model is designed for a wide range of enterprise and developer applications, including high-volume translation, content moderation, UI and dashboard generation, simulations, and instruction-based workflows. Its low latency also makes it suitable for building responsive real-time applications and large-scale automated systems.
Early-access developers using Google AI Studio and Vertex AI—including companies such as Latitude, Cartwheel and Whering—have already begun deploying the model for large-scale AI solutions. Testers reported that Gemini 3.1 Flash-Lite handles complex inputs with the precision of larger models while maintaining strong instruction adherence and efficiency.
With the launch of Gemini 3.1 Flash-Lite, Google aims to provide developers with a scalable and affordable AI model capable of powering high-throughput applications and next-generation AI-driven services.


