Google I/O Round-up 2024

Wednesday, May 15, 2024

Google's developer keynote was dominated by AI advancements. The company unveiled new AI-powered chatbot tools, enhanced search capabilities, and numerous machine intelligence upgrades for Android.

HC

Written by

CEO & Founder

Google kicked off its annual I/O developer conference today. Traditionally, the Google I/O keynote showcases a variety of new software updates and occasional hardware releases. This year, no new hardware was introduced, as the new Pixel 8a phone had already been announced. Instead, today's presentation focused on a huge array of AI software updates, highlighting Google's ambition to lead the generative AI surge of recent years.

Here are the biggest announcements from I/O 2024:

The AI Search Evolution

Let's not forget what most people use Google for: search. Google's latest AI updates represent a significant transformation for its core product.

New capabilities include AI-organised search, which delivers more concise and readable results, along with improved responses to longer queries and photo-based searches.

Google also introduced AI Overviews, which are brief summaries that compile information from multiple sources to answer your query directly on the results page. These summaries appear at the top of the results, eliminating the need to visit a website for answers. This feature is already controversial, as publishers and websites worry that providing direct answers might harm their traffic, especially since they already struggle to rank in Google’s search results. Despite these concerns, the enhanced AI Overviews are being rolled out to all users in the US starting today.

Additionally, a new feature called Multi-Step Reasoning helps you find layered information about a topic, providing more contextual depth. For example, when planning a trip, Google Maps can help you find hotels and set up transit itineraries, suggest restaurants, and assist with meal planning. You can refine the search by specifying cuisine types or vegetarian options. All this information is presented in an organised manner.

Lastly, a quick demo showcased how users can rely on Google Lens to answer questions about whatever they point their camera at. Although similar to Project Astra, these capabilities are integrated into Lens differently. The demo featured a woman struggling with a "broken" turntable; Google Lens identified that the tonearm needed adjustment and provided video and text instructions on how to fix it. Impressively, it even correctly identified the make and model of the turntable through the camera.

Gemini Upgrades

Gemini Nano, Google's on-device mobile large language model, is receiving an upgrade and will now be called Gemini Nano with Multimodality. According to Google CEO Sundar Pichai, this enhanced version can "turn any input into any output." It can extract information from text, photos, audio, web or social videos, and live video from your phone's camera, then synthesise that input to provide summaries or answer questions. In a demonstration video, Google showed someone scanning a bookshelf with a camera, allowing Gemini Nano to record the titles and later recognise them in a database.

Better AI Searching of Photos

Google has enhanced Google Photos with robust visual search tools. A new feature called Ask Photos allows you to use Gemini to search your photos and get more detailed results. For instance, you can provide your license plate number, and it will use contextual clues to locate your car in all your photos.

In a Google blog post, Google Photos software engineer Jerem Selier says the feature does not use data from your photos to serve ads or to train Gemini AI models other than those used within Google Photos. Ask Photos will be available this summer.

More Gemini & Workspace Integrations

Google is integrating AI into its Workspace suite of office tools. Starting today, a button to toggle Google’s Gemini AI will be available in the side panel of various Google apps, including Gmail, Google Drive, Docs, Sheets, and Slides. This Gemini assistant can answer questions, help craft emails or documents, and provide summaries of lengthy documents or email threads.

To show that their AI advancements go beyond office tasks, Google highlighted features tailored for parents. These include AI chatbots that can assist students with homework or provide summaries of missed PTA meetings. Additionally, Google’s Circle to Search, introduced earlier this year, is being upgraded to aid students with schoolwork, such as explaining how to solve math problems.

Google has also integrated a Gemini-powered AI Teammate into apps like Docs and Gmail. This AI assistant, which you can name anything you want (in today's demo, it was named Chip), acts like a productivity buddy. The AI Teammate can help coordinate communications with coworkers, track project files, assemble to-do lists, and follow up on assignments, similar to a turbocharged Slackbot.

Additionally, Google demonstrated Gems, a new feature that allows you to set automated routines for tasks you want Gemini to perform regularly. You can configure it to manage various digital chores and activate these routines with a voice command or text prompt. Each routine is called a "Gem," playing on the Gemini name.

Creativity Tools

Google's creative AI efforts were highlighted with demos from Google Labs, the company's experimental AI division.

A key new feature is VideoFX, a generative video model based on Google DeepMind's video generator, Veo. It creates 1080p videos from text prompts, offering greater flexibility in the production process. Google also introduced improvements to ImageFX, a high-resolution image generator. The updated ImageFX reduces unwanted digital artifacts and better interprets user prompts to generate accurate text.

Additionally, Google unveiled DJ Mode in MusicFX, an AI music generator that allows musicians to create song loops and samples from prompts. DJ Mode was showcased during a vibrant performance by musician Marc Rebillet, which preceded the I/O keynote.