Google Gemini — everything you need to know
Making sense of Google's tangled AI empire
Google launched the first Gemini model in December 2023 when its chatbot was still named Bard. Since then, the search giant has gradually adopted the name Gemini for almost everything it does related to AI.
The Bard chatbot was the first to fall, becoming simply Gemini earlier this year. This was soon followed by the Gemini Assistant largely replacing the previous assistant on Android. The company also uses Gemini in Docs and for developers.
After the initial flurry of activity, things seemed to slow down for Google. Rather than coining yet another name, as it had done in the past, the company doubled down on Gemini, adding it to ever more products and services.
Then, in December, Google released Gemini 2.0. CEO Sundar Pichai described its release as the start of the agentic era, in which AI models perform tasks on your behalf based on an initial set of instructions.
What is Gemini?
Gemini was trained not just on text but as a multimodal model, able to process images, video, audio and even computer code. This puts it in the same class as OpenAI's GPT-4o, and as of Gemini 2.0 it can generate output in those modalities too.
In line with Google’s typical mode of operation, the latest version of the model has been quietly developed over the past months and offers some features that more hyped products like ChatGPT have overlooked.
For example, there are now over 50,000 variations of Gemini's open-weight sibling, Gemma, on Hugging Face, covering a multitude of languages and uses.
Unfortunately, this variety has generated quite a bit of confusion. The latest flurry of Gemini launches has made things even worse, and so we thought it was time to lay out a clear map of the Gemini universe to make things easier to understand.
The first thing to realize is that Google likes to mix and match model technology and applications under variations of the same name. Once you get that clear, everything else starts to slot into place.
1. Models
In the beginning was DeepMind, the AI lab founded in London in 2010 and merged with Google Brain in 2023 to form Google DeepMind. Between them, these labs delivered the LaMDA, PaLM and Gato models to the world, and Gemini is the latest iteration of that family.
Version 1.0 of the Gemini model launched in three flavors: Ultra, Pro and Nano. As the names suggest, they ranged from high-powered down to compact versions designed to run on phones and other small devices.
Note that much of the confusion from the subsequent launches has come about because of Google's philosophical tussle between its search and AI businesses.
AI cannibalism of search has always been a sword hanging above the company’s head, and has contributed mightily to its ‘will they, won’t they’ attitude towards releasing AI products.
Gemini 1.5, released ten months ago, was an incremental improvement on the original model, incorporating a mixture-of-experts (MoE) architecture and a one-million-token context window. Since then we've seen the launch of Gemini 1.5 Flash, Gemini 1.5 Pro-002 and Gemini 1.5 Flash-002, the last of these released just three months ago.
At the same time the company made a surprising foray into open-model territory with the launch of the free Gemma family. These 2B and 7B parameter models were widely seen as a direct response to Meta's release of the Llama model family. Gemma 2 followed five months later.
Gemini 2.0 launched in December 2024, billed as a model for the agentic era. The first version released was Gemini 2.0 Flash Experimental, a high-performance multimodal model that supports native tool use, such as Google Search and code execution, as well as function calling.
Within weeks the company launched Gemini 2.0 Experimental Advanced, apparently the full version of the current generation. We say apparently because at this point in time nobody’s really sure what’s full and what’s early code.
What can be said with certainty is that Gemini 2.0 Flash Experimental is an extremely capable and performant AI model all round.
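Function calling works by handing the model a JSON-schema description of the tools it may invoke; instead of replying in prose, the model can return a structured call for your code to execute. A minimal sketch of what such a tool declaration looks like (the declaration shape follows Google's documented format, but the function name is hypothetical and the commented SDK wiring is an untested assumption):

```python
# Hedged sketch of a Gemini function-calling tool declaration.
# The schema shape follows Google's documented format; actually sending it
# requires the google-generativeai SDK and an API key (not shown here).

def make_weather_tool():
    """Build a tool declaration the model can choose to call."""
    return {
        "function_declarations": [
            {
                "name": "get_current_weather",  # hypothetical function name
                "description": "Look up the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string", "description": "City name"},
                    },
                    "required": ["city"],
                },
            }
        ]
    }

# With the Python SDK, the tool would be passed roughly like this (untested):
#   import google.generativeai as genai
#   model = genai.GenerativeModel("gemini-2.0-flash-exp",
#                                 tools=[make_weather_tool()])
#   response = model.generate_content("What's the weather in London?")
# The response then contains a function_call part naming get_current_weather,
# which your own code runs before returning the result to the model.
```

The key design point is that the model never executes anything itself: it only names a function and supplies arguments, and your application stays in control of what actually runs.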
Gemini models
- Gemini 1 Ultra – powerful flagship
- Gemini 1 Pro – mid-range
- Gemini 1 Nano – small, on-device
- Gemini 1.5 Flash – fast, cheaper
- Gemini 1.5 Pro – slower, more expensive
- Gemini 2.0 Flash Experimental – fast, multimodal
- Gemini 2.0 Flash Thinking – reasoning-focused
- Gemini 2.0 Experimental Advanced – full-power flagship
Gemma models (the Gemmaverse)
- Gemma 1 (2B, 7B parameters)
- Gemma 2 (2B, 9B, 27B): the 27B model trained from scratch.
- CodeGemma (2B, 7B): fine-tuned for code generation.
- RecurrentGemma (2B, 9B): Griffin-based rather than Transformer-based.
- PaliGemma 2 (3B, 10B, 28B): multilingual vision-language model accepting text and image inputs.
- DataGemma: models grounded in statistical data from Google's Data Commons.
- Gemma Scope: interpretability research toolkit.
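One practical detail for anyone trying these open models: Gemma's instruction-tuned checkpoints expect a specific turn-based prompt format, documented by Google, using `<start_of_turn>` and `<end_of_turn>` markers. A minimal sketch of formatting a single user turn (actually running a checkpoint from Hugging Face also requires the transformers library and a model download, not shown here):

```python
def format_gemma_prompt(user_message: str) -> str:
    """Wrap a single user message in Gemma's documented chat format.

    Instruction-tuned Gemma models expect turns delimited by
    <start_of_turn>/<end_of_turn> markers, with the model's turn
    left open so generation continues from that point.
    """
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = format_gemma_prompt("Explain mixture of experts in one sentence.")
```

In practice, the tokenizer's `apply_chat_template` method in Hugging Face transformers produces this format for you; the sketch just shows what that template expands to.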
2. Applications
Google is both a research and a product company. DeepMind and Google AI lead the research and release the models. The other side of Google takes those models and puts them into products. This includes hardware, software and services.
Chatbots
Chatbots lead the charge among Google's applications, as they do for most other foundation-model suppliers. Again, this being Google, things get a little fuzzy when it comes to names and functions.
Gemini chatbot. This used to be called Bard, and the brand is distinct from the Gemini model that powers it. Ten months ago Bard and Duet AI, another Google product, were merged under the Gemini brand alongside the launch of an Android app. Since then, Gemini chat has been integrated into more Google products, including Android's assistant, the Chrome browser, Google Photos and Google Workspace.
At the time of writing, the Gemini chatbot and the legacy Google Assistant are offered as dual options on the latest versions of Android. Gemini Live is Google's answer to OpenAI's low-latency Advanced Voice Mode, and is expected to roll out across Pixel smartphones in the near future.
Products
While the Gemini chatbot may get the newest models and the attention of AI aficionados, most people will encounter Gemini on mobile.
This comes in two forms, first through the Gemini App on iPhone and Android, and then through its deep integration into the Android operating system.
On Android, developers can even use the Gemini Nano model in their own apps, without having to call a costly cloud-based model for basic tasks.
The deep integration allows for system functions to be triggered from Gemini, as well as the use of Gemini Live — the AI voice assistant — to play songs and more.
Experiments
The latest Gemini model launch has been accompanied by a series of major Google application releases and previews tied to the new model. The list is long and impressive; highlights include:
- Project Astra: a spectacular demonstration of real-time visual understanding for AI assistants
- Project Mariner: a browser-based agent that showcases multimodal AI on real-world tasks
- NotebookLM: a striking new paradigm for research and study applications
- Deep Research: a powerful agentic research tool with deep search ability and huge context windows
3. Platforms
Outside of the mobile and web-based versions of Gemini there are some premium and developer focused products. These usually offer the most advanced models and features such as Deep Research in Gemini Advanced.
- Gemini Advanced: Google’s sophisticated subscription-based gateway to its AI products.
- Google Cloud: Pay-as-you-go on-ramp to the full range of Google's enterprise and consumer products
- AI Studio: Free AI playground to test and evaluate the Gemini range of AI models
- Vertex AI: AI development platform integrated as part of Google Cloud services
- Google One: Subscription-based cloud storage service for consumers
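For developers, AI Studio's main draw is that it issues free API keys for the Gemini API, which behind the scenes is a plain REST endpoint. A hedged sketch of the request body a developer would send (the endpoint path follows Google's public documentation, but the model name is illustrative and no request is actually transmitted here):

```python
import json

# Sketch of a Gemini API generateContent request, per Google's public
# REST docs. Sending it requires an AI Studio API key; this code only
# builds the JSON body and makes no network call.
GEMINI_ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-1.5-flash:generateContent"  # model name is illustrative
)

def build_request(prompt: str) -> str:
    """Return the JSON body for a single-turn generateContent call."""
    body = {
        "contents": [
            {"role": "user", "parts": [{"text": prompt}]}
        ]
    }
    return json.dumps(body)

payload = build_request("Summarize the Gemini model family.")
```

The same request shape works from any language or HTTP client, which is why the platform tiers above (AI Studio for experimenting, Vertex AI for production) can sit on top of one API.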
Nigel Powell is an author, columnist, and consultant with over 30 years of experience in the technology industry. He produced the weekly Don't Panic technology column in the Sunday Times newspaper for 16 years and is the author of the Sunday Times book of Computer Answers, published by Harper Collins. He has been a technology pundit on Sky Television's Global Village program and a regular contributor to BBC Radio Five's Men's Hour.
He has an Honours degree in law (LLB) and a Master's Degree in Business Administration (MBA), and his work has made him an expert in all things software, AI, security, privacy, mobile, and other tech innovations. Nigel currently lives in West London and enjoys spending time meditating and listening to music.