Posted by Alumni from TechCrunch
May 9, 2025
Google has shipped implicit caching in the Gemini API, automatically enabling a 75% cost saving with the Gemini 2.5 models whenever a request hits a cache. The company also lowered the minimum token count required to hit a cache to 1K on 2.5 Flash and 2K on 2.5 Pro.

Caching, a widely adopted practice in the AI industry, reuses frequently accessed or pre-computed data from models to cut down on compute requirements and cost. For example, caches can store answers to questions users often ask of a model, eliminating the need for the model to regenerate answers to the same request.

Google previously offered prompt caching, but only explicit prompt caching, meaning developers had to define their highest-frequency prompts themselves. While the cost savings were supposed to be guaranteed, explicit prompt caching typically involved a lot of manual work, and some developers weren't pleased with how the implementation behaved on Gemini 2.5 Pro, which they said could cause surprisingly large API bills.
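With implicit caching, hits happen automatically whenever a request shares a sufficiently long prefix with a recent one, so the savings depend on how prompts are structured: put the large, repeated context first and the varying part last. Below is a minimal sketch of that pattern (not from the article), assuming the google-genai Python SDK and a GEMINI_API_KEY environment variable; the file name and the ask helper are hypothetical.

```python
import os

from google import genai

# Assumes: pip install google-genai, and GEMINI_API_KEY set in the environment.
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

# Hypothetical shared document; on 2.5 Flash the prefix must reach roughly
# 1K tokens before the implicit cache can kick in (2K on 2.5 Pro).
with open("manual.txt", encoding="utf-8") as f:
    shared_context = f.read()

def ask(question: str) -> str:
    # Shared prefix first, varying question last: subsequent requests with
    # the same prefix can be served from the implicit cache at the discount.
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=f"{shared_context}\n\nQuestion: {question}",
    )
    # usage_metadata reports how many input tokens were read from cache.
    print("cached tokens:", response.usage_metadata.cached_content_token_count)
    return response.text

print(ask("What does error code 42 mean?"))
print(ask("How do I factory-reset the device?"))  # same prefix, likely a hit
```

By contrast, the older explicit approach required creating a cached-content resource up front and referencing it in each request, which is the manual bookkeeping the article describes.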