Google just released Gemini, its most recent multimodal AI model, positioning it as a strong rival to OpenAI's GPT-4. However, controversy surrounds the demo video "Hands-on with Gemini: Interacting with multimodal AI," as Google admitted to staging portions of it. The responses shown in the video were sped up, and contrary to its title, there was no real-time voice interaction between the AI and the user. Google defended its approach, saying that all prompts and outputs in the video were real and that the video was meant to inspire developers.
What is Gemini capable of?
The model, praised for its "impressive multimodal capabilities," integrates various data types, such as text and images, to handle diverse applications. Google plans three sizes of Gemini: Nano (on-device), Pro (integrated with the Bard chatbot), and Ultra (expected in 2024).
Gemini Pro reportedly outperformed OpenAI's earlier GPT-3.5 model on six out of eight AI benchmarks. However, no direct comparison between Gemini Pro and OpenAI's GPT-4 model has been disclosed. The Ultra version, slated for 2024, reportedly surpassed competitors, including GPT-4, in benchmark tests. Still, it requires further testing and evaluation by select customers, developers, partners, and safety experts before reaching the market.
Why did Google postpone the launch?
Originally scheduled to launch during the holiday period, Gemini struggled with non-English queries, prompting Google CEO Sundar Pichai to postpone the release. Google canceled Gemini's launch events and opted to introduce the model in January amid industry scrutiny. The model is expected to enhance various Google products, including Bard and Google Assistant. Although Google has high expectations for Gemini, the company seems cautious, given OpenAI's existing market dominance.
Gemini, touted as Google's most powerful large language model, is described as an integrated multimodal AI capable of handling complex coding tasks. As Google navigates the complexities surrounding Gemini's launch, it aims to refine the model and test it thoroughly for safety, emphasizing its commitment to delivering a versatile and impactful AI solution.