Study finds GPT-4 outperforms doctors in clinical reasoning, but also makes mistakes more often

In a new study, scientists at Beth Israel Deaconess Medical Center (BIDMC) compared the clinical reasoning abilities of a large language model with those of human doctors. The researchers used the revised IDEA (r-IDEA) score, a commonly used tool for assessing clinical reasoning.

In the study, a GPT-4-powered chatbot, 21 attending physicians, and 18 residents were each given 20 clinical cases and asked to work through them and lay out their diagnostic reasoning. The answers from all three groups were then scored with r-IDEA. The researchers found that the chatbot actually earned the highest r-IDEA scores. However, the authors also found that the chatbot “often got it completely wrong”.

Lead author Dr. Stephanie Cabral explains that further research is needed to determine how large language models can best be integrated into clinical practice. In summary, the results showed sound reasoning by the chatbot, but also significant errors; this further supports the view that AI-driven systems, at their current level of maturity, are best suited as tools to augment physicians’ practice rather than replace their diagnostic abilities.

As medical leaders and technologists often explain, this is because the practice of medicine is not based solely on the output of rule-based algorithms, but on deep reasoning and clinical intuition, which is difficult for an LLM to replicate. However, tools like this that provide diagnostic or clinical support can still be extremely powerful assets in the physician’s workflow. For example, a system that can reasonably provide a “first-pass” or preliminary diagnostic suggestion may save doctors a lot of time in the diagnostic process. In addition, there may be opportunities to increase efficiency if these tools can enhance doctors’ workflow and improve their ability to process the large amounts of clinical information in medical records.

Many organizations are taking advantage of these potential clinical enhancements. For example, AI-driven transcription technologies that leverage natural language processing are helping physicians complete clinical documentation more efficiently. Enterprise search tools are integrating with organizational and electronic medical record systems to help physicians search large amounts of data, promote data interoperability, and gain faster and deeper insights into existing patient data. Other systems may even help provide preliminary diagnoses; for example, tools are emerging in the fields of radiology and dermatology that can suggest potential diagnoses by analyzing uploaded photos.

However, there is still much work to be done in this area. In short, although these AI systems are not yet ready for clinical diagnosis, the technology can still be used to enhance clinical workflows, especially when humans remain in control to ensure that the process is safe and accurate.
