Study

Diagnostic accuracy of a large language model in pediatric case studies.

Barile J, Margolis A, Cason G, et al. Diagnostic accuracy of a large language model in pediatric case studies. JAMA Pediatr. 2024;178(3):313-315. doi:10.1001/jamapediatrics.2023.5750.

January 17, 2024

Clinicians and the public are increasingly interested in using chatbots such as ChatGPT to learn more about their care, particularly diagnoses. In this study, ChatGPT was asked to provide a differential diagnosis list and a final diagnosis for 100 pediatric case studies. ChatGPT had an overall error rate of 83%. Many of the incorrect diagnoses were clinically related to the final diagnosis but too broad to be classified as correct, and just over half involved the same organ system. Despite the error rate, the authors concluded that large language models (LLMs) could still be a helpful tool for clinicians and recommended further training of chatbots to improve diagnostic accuracy.
