New Study Finds OpenAI's o1-preview AI Model Outperforms Doctors in Diagnosing Tricky Medical Cases

Dec. 25, 2024 - A team of researchers from Harvard Medical School and Stanford University has conducted an in-depth evaluation of OpenAI's o1-preview model for medical diagnosis and found it to be better than human doctors at diagnosing tricky medical cases.

According to the study, o1-preview correctly diagnosed 78.3% of test cases, and reached an even higher accuracy of 88.6% on a set of 70 case-specific comparison tests, significantly better than its predecessor GPT-4's 72.9%.

Using R-IDEA, a standardized scale for assessing the quality of medical reasoning, o1-preview achieved perfect scores in 78 of 80 cases. By comparison, experienced physicians achieved perfect scores in only 28 cases, and resident physicians in only 16.

In complex cases designed by 25 experts, o1-preview scored as high as 86%, more than double the scores of physicians using GPT-4 (41%) and those using traditional tools (34%).

The researchers acknowledged several limitations of the evaluation: some of the test cases may have been included in o1-preview's training data; the study focused primarily on the system working alone and did not adequately consider scenarios in which it would work in concert with a human physician; and the diagnostic tests suggested by o1-preview are costly, which limits their practical application.
