Embracing the future: is artificial intelligence already better? A comparative study of artificial intelligence performance in diagnostic accuracy and decision-making.
Large language models (LLMs) such as ChatGPT are potentially useful tools for improving healthcare, particularly in diagnosis. In this study, researchers submitted 188 scenarios from the American Academy of Neurology's Question of the Day app to ChatGPT-3.5 and compared its mean success rate with that of the app's users. There was no statistically significant difference between the app users’ and ChatGPT’s success rates. Nevertheless, substantial research is still required before LLMs and other artificial intelligence applications can be used safely in clinical practice.