Technology media outlet NeoWin reported in a blog post yesterday (October 4) that Google will soon make the Gemini 1.5 Flash 8B model commercially available, making it the cheapest of Google's AI models.
As reported in August, Google introduced three experimental Gemini models, including Gemini 1.5 Flash 8B, a smaller variant of Gemini 1.5 Flash with 8 billion parameters. It is designed for multimodal tasks, including high-volume workloads and long-text summarization.
Compared with the original Gemini 1.5 Flash, Gemini 1.5 Flash 8B has lower latency and is especially suited to chat, transcription, and long-text translation.
Another highlight of Gemini 1.5 Flash 8B is its affordable pricing. Billing takes effect on Monday, October 14, at the following rates:
- Input: $0.0375 per million tokens for contexts under 128K tokens (currently about 0.26 yuan)
- Output: $0.15 per million tokens for contexts under 128K tokens (currently about 1.1 yuan)
- Cached prompts: $0.01 per million tokens for contexts under 128K tokens (currently about 0.071 yuan)
For comparison, the original Gemini 1.5 Flash model costs $0.30 per million output tokens, a price that took effect on August 12, 2024. At $0.15 per million output tokens, the new Gemini 1.5 Flash 8B therefore costs half as much for output as the original.
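To make the per-million-token rates concrete, here is a minimal Python sketch that estimates the cost of a job from the prices quoted in this article. The function name `estimate_cost` and the sample token counts are hypothetical. The sketch also assumes cached tokens are simply billed at the cached rate, separately from uncached input, and it ignores any cache storage fee, which may not match how Google actually bills.

```python
# Prices in USD per million tokens for contexts under 128K tokens,
# as quoted in this article (not fetched from any official API).
FLASH_8B_PRICES = {"input": 0.0375, "output": 0.15, "cached": 0.01}
FLASH_OUTPUT_PRICE = 0.30  # original Gemini 1.5 Flash, per million output tokens

TOKENS_PER_MILLION = 1_000_000


def estimate_cost(input_tokens, output_tokens, cached_tokens=0,
                  prices=FLASH_8B_PRICES):
    """Estimate the USD cost of a job (hypothetical helper).

    Assumption: cached tokens are billed at the cached rate and are
    counted separately from uncached input tokens.
    """
    total = (input_tokens * prices["input"]
             + output_tokens * prices["output"]
             + cached_tokens * prices["cached"])
    return total / TOKENS_PER_MILLION


# Sample job: 2 million input tokens and 500,000 output tokens.
cost = estimate_cost(2_000_000, 500_000)
print(f"Estimated cost: ${cost:.4f}")

# Output price of Flash 8B relative to the original Flash.
ratio = FLASH_8B_PRICES["output"] / FLASH_OUTPUT_PRICE
print(f"Output price ratio: {ratio:.2f}")
```

Under these assumptions, the input and output portions of the sample job each come to $0.075, and the ratio confirms that the output price of Flash 8B is half that of the original Flash.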