Google Gemini helps a robot navigate an office

08:12 AM July 17, 2024

Tech experts want to deploy robots into offices and other indoor spaces. However, these machines need sophisticated artificial intelligence to become more aware of their surroundings.

It seems Google Gemini is close to being that AI solution.

In an arXiv paper, Google DeepMind detailed how it used Gemini 1.5 Pro to teach a robot to respond to commands and navigate around an office.

As a result, the machine could lead a researcher to the nearest power outlet and remember where they left specific items. Google admits these feats are still mundane, so the system will need further testing before it yields practical applications.

How did Google Gemini help with robot navigation?

TechCrunch says the researchers led the robot on a specialized guided tour to help it familiarize itself with the DeepMind office.

The team calls this method “Multimodal Instruction Navigation with demonstration Tours (MINT).” It involved walking the robot around the office while pointing out different landmarks via speech.

This process allows the artificial intelligence to map the indoor environment based on what it “sees” with its cameras.

Next, the scientists instructed Google Gemini on how to translate user requests into navigational directions it must follow. 

Specifically, they used a hierarchical Vision-Language-Action (VLA) navigation policy to make it possible. It “combined the environment understanding and common sense reasoning.” 
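To picture how such a two-level setup might work, here is a minimal Python sketch. It is an illustration under stated assumptions, not DeepMind's code: the tour frames, the landmark labels, the shortcut edge, and the helper names `build_graph`, `find_goal_frame`, and `plan_path` are all hypothetical. In particular, a simple keyword match stands in for Gemini when choosing the goal, and a breadth-first search stands in for the low-level navigation step.

```python
from collections import deque

# Hypothetical tour data: (frame index, landmark named aloud at that frame, if any).
TOUR = [
    (0, "entrance"),
    (1, None),
    (2, "power outlet"),
    (3, None),
    (4, "whiteboard"),
]

def build_graph(tour, extra_edges=()):
    """Link consecutive tour frames; extra_edges adds shortcuts between nearby frames."""
    graph = {idx: set() for idx, _ in tour}
    for (a, _), (b, _) in zip(tour, tour[1:]):
        graph[a].add(b)
        graph[b].add(a)
    for a, b in extra_edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def find_goal_frame(request, tour):
    """Stand-in for the high-level model: pick the first frame whose label appears in the request."""
    text = request.lower()
    for idx, label in tour:
        if label and label in text:
            return idx
    return None

def plan_path(graph, start, goal):
    """Low-level step: breadth-first search for the shortest frame-to-frame path."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in sorted(graph[node]):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None

graph = build_graph(TOUR, extra_edges=[(0, 3)])  # assumed shortcut between frames 0 and 3
goal = find_goal_frame("Take me to the whiteboard", TOUR)
path = plan_path(graph, 0, goal)
print(goal, path)  # 4 [0, 3, 4]
```

The split mirrors the idea described above: one component decides where to go from the user's words, and a separate one works out how to get there from the map built during the tour.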

Digital Trends says the results were highly successful.

The Google Gemini bot achieved “86% and 90% end-to-end success rates on previously infeasible navigation tasks involving complex reasoning and multimodal user instructions in a large real-world environment.” 

The robot was able to lead researchers to the nearest power outlet and recall where to get cans of soda. Moreover, it led them to the DeepMind office whiteboard.

Google DeepMind admits the Google Gemini robot is still clunky.

The machine still can’t perform its office tour without assistance. Also, it takes 10 to 30 seconds to respond.

It will likely take a few more years before we have Gemini bots doing our chores.

Nevertheless, the researchers will continue honing the artificial intelligence to realize this application.

TOPICS: Robot, technology