Study reveals Grok chatbot will detail how to make drugs, do other crimes

Artificial Intelligence (AI) firms promote their products and services worldwide.

Consequently, they must comply with restrictions in numerous countries to gain and keep users.

That is also why researchers worldwide scrutinize these tools for their potential harm.

Unfortunately, xAI’s Grok chatbot is one of the tools that failed those tests.

Israel-based AI security firm Adversa AI found that Elon Musk’s flagship AI bot could allegedly provide instructions on how to make a bomb with little manipulation.

What are the Grok chatbot’s flaws?

Adversa AI tested six of the most popular AI chatbots for their safety and security.

Specifically, the company tested whether each would follow commands to produce morally reprehensible content:

  1. Anthropic’s Claude
  2. Google Gemini
  3. Meta’s LLaMA
  4. Microsoft Bing
  5. Mistral’s Le Chat
  6. xAI’s Grok

The researchers experimented with the most common jailbreak methods, which bypass a chatbot’s built-in limitations. 

A previous Inquirer Tech article refers to one as role-playing: asking a chatbot to pretend it is someone or something that would perform a prohibited act.

For example, you may ask an AI chatbot to pretend it is a terrorist in an action movie. Then, ask how to make a bomb, and the bot may provide instructions because it is playing a character.

VentureBeat says this strategy is also called linguistic logic manipulation.

Adversa AI tested this method and others on the six chatbots.

The linguistic jailbreak allowed the researchers to get step-by-step instructions on how to make a bomb from Mistral and Grok.

Surprisingly, Grok gave these instructions even without the jailbreak.

Worse, the chatbot gave highly detailed advice on how to seduce a child when used with a jailbreak.

“Grok doesn’t have most of the filters for the requests that are usually inappropriate,” Adversa AI co-founder Alex Polyakov explained.

“At the same time, its filters for extremely inappropriate requests, such as seducing kids, were easily bypassed using multiple jailbreaks, and Grok provided shocking details,” he added. 

Grok also readily gave instructions on how to make the psychedelic substance DMT.

As a result, Adversa AI warned that AI firms must maintain red teams.

Red teams are groups of tech experts who try to exploit their company’s platforms to expose and address security flaws. 

Polyakov says AI companies need red-teaming to fight jailbreaks and other security threats.
