Google Gemini helps a robot navigate an office
Tech experts want to deploy robots into offices and other indoor spaces. However, such robots need sophisticated artificial intelligence to be more aware of their surroundings.
It seems Google Gemini is close to being that AI solution.
Google DeepMind detailed in an arXiv paper how it used Gemini 1.5 Pro to teach a robot to respond to commands and navigate around an office.
As a result, the machine could lead a researcher to the nearest power outlet and remember where they left specific items. Google admits these features are still mundane, so they will need further testing to derive practical applications.
How did Google Gemini help with robot navigation?
TechCrunch says the researchers led the robot on a specialized guided tour to help it familiarize itself with the DeepMind office.
The team calls this method “Multimodal Instruction Navigation with demonstration Tours (MINT).” It involved walking the robot around the office while pointing out different landmarks via speech.
This process allows the artificial intelligence to map the indoor environment based on what it “sees” with its cameras.
Next, the scientists instructed Google Gemini on how to translate user requests into navigational directions it must follow.
Specifically, they used a hierarchical Vision-Language-Action (VLA) navigation policy to make this possible. It “combined the environment understanding and common sense reasoning.”
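To make the idea of a two-level navigation policy more concrete, here is a minimal Python sketch. It is not DeepMind's code, and everything in it is an assumption made for illustration: the `TourFrame` structure, the `pick_goal_frame` and `plan_path` helpers, the sample narrations, and the office graph are all hypothetical. Simple word matching stands in for Gemini in the high-level step, and a breadth-first search over a hand-written graph stands in for the low-level step that turns a chosen goal into a route.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class TourFrame:
    """One stop on the demonstration tour (hypothetical structure)."""
    index: int
    narration: str  # what the guide said at this stop


def pick_goal_frame(frames, request):
    """High-level step: choose the tour stop that best matches the request.

    Word overlap is a stand-in for the vision-language model; it is an
    assumption of this sketch, not the method described in the paper.
    """
    request_words = set(request.lower().split())

    def overlap(frame):
        return len(request_words & set(frame.narration.lower().split()))

    best = max(frames, key=overlap)
    return best.index if overlap(best) > 0 else None


def plan_path(edges, start, goal):
    """Low-level step: breadth-first search over a graph of tour stops."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through the predecessors to rebuild the route.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in edges.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None  # goal is unreachable from start


# Hypothetical tour: the guide names a landmark at each stop.
tour = [
    TourFrame(0, "entrance"),
    TourFrame(1, "power outlet near desks"),
    TourFrame(2, "fridge with cans of soda"),
    TourFrame(3, "whiteboard"),
]

# Hypothetical connectivity: neighboring stops, plus a shortcut from 0 to 3.
office_graph = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

goal = pick_goal_frame(tour, "take me to the whiteboard")
route = plan_path(office_graph, 0, goal)
print(goal, route)
```

The split mirrors the hierarchy in the description above: one component decides where to go based on the tour and the request, and a separate component works out how to get there.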
Digital Trends says the results were highly successful.
The Google Gemini bot achieved “86% and 90% end-to-end success rates on previously infeasible navigation tasks involving complex reasoning and multimodal user instructions in a large real-world environment.”
The robot was able to lead researchers to the nearest power outlet and recall where to get cans of soda. Moreover, it led them to the DeepMind office whiteboard.
Google DeepMind admits the Google Gemini robot is still clunky.
The machine still can’t perform its office tour without assistance. Also, it takes 10 to 30 seconds to respond.
It will likely take a few more years before we have Gemini bots doing our chores.
Nevertheless, the researchers will continue honing the artificial intelligence to realize this application.