Yesterday, Google DeepMind released its long-awaited Gemini AI model. A lot of AI enthusiasts on X and Reddit are already claiming that it is the GPT killer.
Alphabet shares rose 5% today following yesterday's announcement, which represents a significant innovation on Google's part.
Google says that Gemini is its largest and most capable language model ever. In this short post, we'll look at how Gemini stacks up against GPT, what it can do, and how you can try it today for free.
Google's newest and most capable AI model
There's no better way to introduce you to Gemini than the official announcement video from Google below:
Gemini vs. GPT
Since the launch of ChatGPT, Google has been trying to catch up and it seems that all the effort has paid off. Google's Bard was having a hard time dethroning OpenAI's ChatGPT, but this may soon change.
On paper, according to the benchmarks and tests Google released, Gemini Ultra beats GPT-4 on a range of tasks.
According to Google, Gemini has 5 times the computational power of GPT. In those benchmarks, Ultra, the top Gemini model, outperformed GPT-4 across language, programming, image, video, and audio tasks.
For the complete benchmark results, see the official Gemini page.
In terms of cost, to use OpenAI's GPT-4 in ChatGPT, you need an active ChatGPT Plus subscription ($20/mo). We're not sure how much Gemini Ultra is going to cost, but you can take the Pro version for a spin right now for free (see how below).
Is Gemini the GPT killer?
It could be. However, we'll have to wait and see how Gemini performs in the real world and across different applications. How it handles everyday requests from users who want to brainstorm ideas, look up information, write code, and much more will reveal its capabilities and its impact on GPT.
The folks at Google published a nice hands-on demo showcasing Gemini's capabilities. Here's the official video (A must-watch):
Gemini is built for multimodality
That's a big deal! Google says that Gemini was built from the ground up for multimodality. You may be wondering: What is multimodality anyway?
Multimodality means that Gemini can understand, interpret, and respond to different types of information from multiple sources. As we've seen in the hands-on video above, Gemini can manage text, images, video, audio, and code at the same time seamlessly.
Using Gemini in your app
Google announced that Gemini models will be available to integrate into your application using Google AI Studio and Google Vertex AI. I'm going to test it out and post about this as soon as I get access, so stay tuned!
Gemini comes in three versions:
- Ultra: This is the largest and most capable version, designed for highly complex tasks, and the direct GPT-4 competitor.
- Pro: This version is ideal for scaling across a broad range of tasks, offering a balance of capability and versatility.
- Nano: This is the most efficient model, suited for on-device integration, and it's now available on the Pixel 8 Pro.
How to access Google Gemini now?
To try the new Gemini Pro model, you just need access to Bard. If you have it, head over to the website and start chatting. Bard now uses the Gemini Pro model by default.
Gemini Ultra is not available at the moment and will come later in 2024. The Pro version currently supports English only and, for now, is accessible exclusively within Google Bard.
This is all great news for us developers and AI enthusiasts, whether you're on team GPT or team Gemini. At the end of the day, we have more options.
It's also too early to decide whether Bard and Gemini will dethrone ChatGPT and GPT, particularly with Gemini Ultra yet to launch. Moreover, how Google will monetize these models is unclear, and the role developers will play in the ecosystem is yet to be defined.
FAQ: Frequently asked questions
What is the difference between Gemini and Bard?
Bard is a web application that uses an AI model (Gemini) to generate its answers. Before Gemini, PaLM 2 was the default LLM used by Bard. So Gemini and Bard are two different things, similar to how ChatGPT (the chat web interface) uses GPT-4 or GPT-3.5 (the LLM) to generate its answers.
Is Gemini replacing Bard?
No. A common misconception is that Gemini will replace Bard, when in fact Gemini is the model that powers Bard. You can think of Bard as the car's exterior and Gemini as its engine. What Gemini did replace is the PaLM 2 model that Bard was using.
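If you think in code, the car-and-engine picture can be sketched in a few lines of Python. This is only an illustration of the app/model split, not how Bard is actually built: `ChatApp`, `StubModel`, and `LanguageModel` are made-up names, and the stub just echoes the prompt instead of calling a real model.

```python
from dataclasses import dataclass
from typing import Protocol


class LanguageModel(Protocol):
    """Anything that turns a prompt into a reply (the "engine")."""
    name: str

    def generate(self, prompt: str) -> str: ...


@dataclass
class StubModel:
    """Hypothetical stand-in for a real LLM; it only echoes the prompt."""
    name: str

    def generate(self, prompt: str) -> str:
        return f"[{self.name}] reply to: {prompt}"


@dataclass
class ChatApp:
    """The user-facing product (the "car"); it delegates to a model."""
    title: str
    model: LanguageModel

    def ask(self, prompt: str) -> str:
        return self.model.generate(prompt)


# The app stays the same; only the model behind it is swapped.
bard = ChatApp(title="Bard", model=StubModel("PaLM 2"))
before = bard.ask("Hello")

bard.model = StubModel("Gemini Pro")
after = bard.ask("Hello")

print(before)  # [PaLM 2] reply to: Hello
print(after)   # [Gemini Pro] reply to: Hello
```

The product keeps its name and interface throughout. The only thing that changes is which model answers the prompt.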
Is Google Bard using Gemini?
Yes, Bard currently uses the Gemini Pro model by default. The more capable model, Gemini Ultra, will come to Bard, but the details are not yet clear.
Is Gemini better than GPT-4?
On paper, yes. Based on the benchmarks and tests published by Google, Gemini Ultra performs better than GPT-4 on a variety of tasks. Whether that holds up in day-to-day usage remains to be seen.
Is Gemini better than ChatGPT?
Comparing Gemini (an AI model) to ChatGPT (a web interface) isn't quite right. ChatGPT uses GPT-3.5 and GPT-4, so the comparison should be between GPT and Gemini, or between ChatGPT and Bard.