AI tried to diagnose patients with conversations. It failed completely

(Image: ZAP // DALL-E 2)

AI is good at passing professional medical exams, but when it has to talk to “patients” as a chatbot, the story changes. What is it missing?

Researchers ran a test of AI’s ability to perform medical screenings. The “patients” were 2,000 medical cases taken primarily from US medical board professional examinations.

On the exam questions themselves, the models did well. The problem was everything else, according to the study published this Thursday in Nature.

“Although large language models show impressive results on multiple-choice tests, their accuracy drops significantly in dynamic conversations,” says Pranav Rajpurkar of Harvard University. “The models particularly struggle with open-ended diagnostic reasoning.”

Four leading large language models (OpenAI’s GPT-3.5 and GPT-4, Meta’s Llama-2-7b, and Mistral AI’s Mistral-v2-7b) performed considerably worse on conversation-based benchmarking than when they made diagnoses from written case summaries.

When multiple-choice options were provided, GPT-4 identified 82% of the diseases correctly; without them, its accuracy fell to 49%.

And when conversations between the patient and the chatbot were simulated, accuracy dropped even further, to 26%.

GPT-4 was the best-performing AI model in the study, with GPT-3.5 usually coming in second, the Mistral AI model sometimes second or third, and Meta’s Llama generally getting the lowest score.
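
For readers curious about what these three settings look like in practice, here is a minimal Python sketch. It is not the study’s actual pipeline: the query_model helper, the sample vignette, and the answer options below are all hypothetical placeholders for whichever model is under test.

# A minimal sketch (not the study's code) of the three evaluation settings
# described above: multiple-choice vignette, open-ended vignette, and
# simulated conversation. Everything here is an illustrative placeholder.

def query_model(prompt: str) -> str:
    """Placeholder: swap in a real API call to the LLM being evaluated."""
    return "(model output)"

case_summary = (
    "A 34-year-old presents with three days of fever, sore throat, "
    "and swollen cervical lymph nodes."  # illustrative, not from the study
)
options = [
    "A) Infectious mononucleosis",
    "B) Streptococcal pharyngitis",
    "C) Influenza",
    "D) Acute HIV infection",
]

# 1) Multiple-choice vignette: the setting where GPT-4 reached 82%.
mc_prompt = case_summary + "\n\nMost likely diagnosis?\n" + "\n".join(options)
mc_answer = query_model(mc_prompt)

# 2) Open-ended vignette: same summary, no options (49% in the study).
open_answer = query_model(case_summary + "\n\nWhat is the most likely diagnosis?")

# 3) Simulated consultation: the model must elicit the history itself,
#    turn by turn, from a simulated patient before committing to a diagnosis
#    (26% in the study).
dialogue = ["Patient: I've had a fever and a very sore throat for three days."]
for _ in range(5):  # a few turns of history-taking
    question = query_model("\n".join(dialogue) + "\nDoctor:")
    dialogue.append("Doctor: " + question)
    # a second model plays the patient and answers the doctor's question
    dialogue.append("Patient: " + query_model("\n".join(dialogue) + "\nPatient:"))
conv_answer = query_model("\n".join(dialogue) + "\n\nFinal diagnosis:")

The key difference in the third setting is that the model never sees a tidy case summary; it has to assemble one itself through questioning, which is where the study saw accuracy collapse.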

Rajpurkar points out that real-world medical practice is “messier” than simulations, and the technology does not yet seem ready for real life, where there are “complex social and systemic factors“.

“The strong performance on our benchmark suggests that AI can be a powerful tool to support clinical work, but not necessarily a substitute for holistic evaluation by experienced doctors”, the researcher concludes.

In 2021, the University of Porto’s decision to open a poetry course to Medicine students was questioned.

The course description states that “it is not the objective of the course for students to learn how to write a poem”, pointing instead to “interpretation” and “interactivity” as the discipline’s main objectives.

Should we, after all, give AI some poetry lessons so that it can better interpret its patients’ messages, or are there fields in which humans are truly irreplaceable?

Carolina Bastos Pereira, ZAP //
