Being aggressive with ChatGPT increases your accuracy (but be careful)

Being curt or overtly rude can make a newer AI model more accurate, a new study shows, contradicting previous theories about AI “politeness.”

A study recently posted to arXiv revealed that artificial intelligence (AI) chatbots can give us more accurate answers when we are rude.

In the new research, the scientists wanted to test whether “politeness” or aggressiveness made a difference in how an AI system performed.

To test how user tone affected the accuracy of responses, the researchers wrote 50 base multiple-choice questions and then modified each with prefixes to fit five tone categories: very polite, polite, neutral, rude and very rude. The questions covered subjects including mathematics, history and science.

As detailed by Live Science, each question was posed with four options, only one of which was correct. The researchers fed the resulting 250 questions 10 times into ChatGPT, one of the most advanced large language models (LLMs) developed by OpenAI.

Before giving each prompt, the researchers asked the chatbot to completely disregard previous exchanges, to avoid it being influenced by earlier tones. The chatbot was also instructed to choose one of the four options without explanation.
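The setup described above (50 questions, five tone prefixes, 10 repeated passes, one-letter answers) can be sketched as follows. The prefixes and the `ask_model` stand-in are hypothetical placeholders, not the study's actual wording or API calls; in a real run, `ask_model` would issue a fresh ChatGPT request per prompt.

```python
import random
from collections import defaultdict

# Hypothetical tone prefixes -- the paper's exact phrasings are not quoted here.
TONE_PREFIXES = {
    "very polite": "Would you be so kind as to answer the following question? ",
    "polite": "Please answer the following question. ",
    "neutral": "",
    "rude": "Figure this out if you can: ",
    "very rude": "You're useless, but answer this anyway: ",
}

def make_variants(question: str) -> dict:
    """Return the five tone-modified versions of one base question."""
    return {tone: prefix + question for tone, prefix in TONE_PREFIXES.items()}

def ask_model(prompt: str) -> str:
    """Placeholder for a ChatGPT call; here it just guesses a random option."""
    return random.choice(["A", "B", "C", "D"])

def run_experiment(questions, answers, repeats=10):
    """Tally per-tone accuracy over `repeats` passes, as in the study's design."""
    hits, total = defaultdict(int), defaultdict(int)
    for _ in range(repeats):
        for question, correct in zip(questions, answers):
            for tone, prompt in make_variants(question).items():
                total[tone] += 1
                if ask_model(prompt) == correct:
                    hits[tone] += 1
    return {tone: hits[tone] / total[tone] for tone in total}
```

With 50 questions this loop produces the study's 250 tone-variant prompts per pass, and averaging over 10 passes yields one accuracy figure per tone category.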

The accuracy of the answers ranged from 80.8% for the very polite prompts to 84.8% for the very rude ones.

Interestingly, accuracy increased as “politeness” decreased.

In previous studies, researchers have found that “impolite prompts often result in poor performance, but overly polite language does not guarantee better results.” However, as Live Science points out, the previous study was conducted using different AI models — ChatGPT 3.5 and Llama 2-70B — and used a range of eight tones.

But be careful

“Although this discovery is of scientific interest, we do not encourage the deployment of hostile or toxic interfaces in real-world applications,” the researchers said.

“Using insulting or degrading language in human-AI interaction could have negative effects on user experience, accessibility, and inclusivity, and could contribute to harmful communication norms. Instead, we frame our results as evidence that LLMs remain sensitive to superficial cues in prompts, which can create unintended trade-offs between performance and user well-being,” they concluded.
