A new AI model outperforms ChatGPT in tests

Artificial Intelligence. Concept
image: @BlackJack3D | iStock

All you need to know about Google’s new AI model, Gemini, and why it outperforms ChatGPT, the model one year its senior

What is Gemini?

Google has revealed its latest artificial intelligence model, Gemini, which outperforms ChatGPT in a range of tests. Gemini, developed by DeepMind, a Google unit based in London, showcases advanced capabilities across multiple formats, including evaluating and grading a student’s physics homework.

Google claims that Gemini outshines ChatGPT’s most powerful model, GPT-4, on 30 out of 32 benchmark tests, demonstrating prowess in reasoning and image understanding.

Gemini comes in three versions and is a multimodal model, so it can simultaneously comprehend text, audio, images, video, and computer code. The model is set to integrate into various Google products, including the search engine. The initial release will be in more than 170 countries, excluding the UK and Europe, where regulatory clearance is pending.

The three versions of Gemini

The Guardian reported that Demis Hassabis, CEO of DeepMind, described Gemini as the most complex project undertaken by the company. The three versions of Gemini (Pro, Nano, and Ultra) will be released progressively, with Ultra, the most powerful iteration, set to be publicly available in early 2024.

Ultra has already achieved a milestone by outperforming human experts, scoring 90% on a multitask test covering diverse subjects such as mathematics, physics, law, medicine, and ethics.

Gemini’s Ultra model can also power AlphaCode 2, a new code-writing tool that can outperform 85% of competition-level human computer programmers. Google plans to submit Ultra to external “red team” testing to evaluate its security and safety.

Problems that still need to be overcome

While promotional videos showcased Gemini’s capabilities, including analysing handwritten physics homework and identifying drawings, concerns were raised about the model’s tendency towards “hallucinations”, or providing false answers. Eli Collins, the head of product at Google DeepMind, admitted that this remains an unresolved research problem.

Despite Gemini’s remarkable achievements, questions linger about Google’s collaboration with governments on testing, as discussed at the recent AI safety summit. Google revealed ongoing discussions with the UK government about tests conducted by the AI Safety Institute. The Pro and Nano versions will not be part of these tests, which are designed specifically for the most advanced, or frontier, models.
