Google is already sending notices that the 2.5 models will be deprecated soon while all the 3.x models are in preview. It really is wild and peak Google.
Public Service Announcement!! I don't know why the hell google do this, but when the deprecate a model, the error you will see is a Rate Limit error. This has caught me out before and it is super annoying.
Yes, sorry - you are correct. Once removed, that's the error, which is incredibly confusing. I spent way too long troubleshooting usage when 2.0 was removed before I figured it out.
Well it’s not even performance (define that however you will), but behavior is definitely different model to model. So while whatever new model is released might get billed as an improvement, changing models can actually meaningfully impact the behavior of any app built on top of it.
the problem the price point is increasing sharply every time.
gemini 2 flash lite was $0.3 per 1Mtok output, gemini 2.5 flash lite is $0.4 per 1Mtok output, guess the pricing for gemini 3 flash lite now.
yes you guess it right, it is $1.5 per 1Mtok output. you can easily guest that because google did the same thing before: gemini 2 flash was $0.4, then 2.5 flash it jumps to $2.5.
and that is only the base price, in reality newer models are al thinking models, so it costs even more tokens for the sample task.
at some point it is stopped being viable to use gemini api for anything.
There's a whole universe of tasks that aren't "fix a Github issue" or even related to coding in the slightest. A large number of those tasks doesn't necessarily get better with model updates. In many cases, the performance is similar but with different behavior so you have to rewrite prompts to get the same. In some cases the performance is just worse. Model updates usually only really guarantee to be better at coding, and maybe image understanding.