
Google launches its ‘most capable’ multimodal AI model Gemini


Late on Wednesday, Alphabet-owned Google introduced Gemini, which it describes as its ‘most capable’ artificial intelligence model. The first version, Gemini 1.0, has been optimised for three sizes: Ultra for highly complex tasks, Pro for scaling across a range of tasks, and Nano for on-device tasks. As the names suggest, the three versions are in descending order of size. Google says Gemini is flexible and can run efficiently on everything from data centers to mobile devices.

One of the features of this AI model is that it has been built from the ground up to be multimodal, which means that it can understand, operate across and combine different types of input such as text, code, audio, image and video. “We designed Gemini to be natively multimodal, pre-trained from the start on different modalities. Then we fine-tuned it with additional multimodal data to further refine its effectiveness. This helps Gemini seamlessly understand and reason about all kinds of inputs from the ground up, far better than existing multimodal models,” the company blog said.

Google notes that Gemini Ultra’s performance exceeded current state-of-the-art results on 30 of the 32 widely used academic benchmarks in large language model (LLM) research and development. The Ultra model also scored 90% on MMLU (massive multitask language understanding), becoming the first model to outperform human experts on the benchmark, which tests world knowledge and problem-solving abilities. Google attributed the result to a new benchmark approach that lets Gemini Ultra think ‘more carefully’ before answering difficult questions, instead of just using its first impression.


Gemini 1.0 was trained on Google’s tensor processing units (TPUs) v4 and v5. Along with Gemini, Google also unveiled Cloud TPU v5p, which it calls its ‘most powerful, efficient and scalable TPU system to date’. It is expected to help developers and enterprises train large-scale generative AI models faster.

Lastly, using a specialised version of Gemini, Google has also created the code generation tool AlphaCode 2, which improves on the previous version in competitive problem solving. It goes beyond coding to handle problems involving complex math and theoretical computer science.

