A recent study has reported that two artificial intelligence chatbots, GPT-4o and Gemini 2.5 Pro, may be useful as supplementary learning tools in endodontic education. (Image: Garun Studios/Adobe Stock)
DALLAS, US: As artificial intelligence (AI) tools are increasingly explored in dental education and clinical learning, questions remain about how reliably they can support the development of diagnostic reasoning and treatment planning skills. A recent study has found that two AI chatbots, GPT-4o and Gemini 2.5 Pro, may be useful as adjunctive tools in clinical education, particularly in helping students and residents practise clinically relevant decision-making in endodontics.
Board-certified endodontist and educator Dr Poorya Jalali believes that, if used appropriately and under human supervision, artificial intelligence chatbots could support interactive learning, help students prepare for examinations and encourage more critical engagement with clinical scenarios. (Image: Dr Poorya Jalali)
“AI, and particularly large language models (LLMs), are already being used widely by students and educators. We wanted to understand not only whether they can answer questions but also how accurate and reliable they are in a clinically relevant setting,” lead author Dr Poorya Jalali, clinical associate professor and director of graduate endodontics at the Texas A&M College of Dentistry, told Dental Tribune International.
According to Dr Jalali, many of the previous studies on LLMs in endodontics have employed multiple-choice questions, which primarily assess recall. The American Board of Endodontics (ABE) oral examination, however, is designed to assess clinical reasoning, decision-making and the ability to justify clinical decisions with reference to the literature—skills that more closely reflect clinical practice. For this reason, the researchers sought to test the systems in a setting that better mirrors how endodontists approach diagnosis and treatment planning.
In the study, GPT-4o and Gemini 2.5 Pro were tested in a simulated ABE oral examination on three cases covering different endodontic scenarios, developed by two board-certified endodontists. For each case, the chatbots were given a detailed patient profile and asked 20 open-ended questions. The responses were independently assessed by the same two examiners for the clinical validity of the answers and for the accuracy and relevance of the supporting references. Each response was also given an overall performance score.
Both chatbots performed well, and most responses were rated as acceptable to excellent. On a 0–3 scale, Gemini 2.5 Pro achieved a mean overall score of 2.83 and GPT-4o 2.73. Statistical analysis, which accounted for differences between case scenarios, individual questions and examiners, showed no significant difference between the two models in the clinical validity of the responses or in overall performance. However, GPT-4o appeared to vary more by case type, whereas Gemini 2.5 Pro performed more consistently across the three scenarios.
“This suggests that current models can demonstrate structured clinical reasoning and answer oral board-style questions at a level comparable to advanced trainees,” Dr Jalali said. He cautioned, however, that the findings should not be over-interpreted. Although both chatbots showed strong potential, this does not mean that they would reliably pass a real ABE oral examination, which involves live, timed interaction with examiners and independent radiographic interpretation. In this study, the chatbots answered written prompts and were given descriptions of the radiographic findings.
“These systems cannot diagnose or plan treatment independently. They cannot perform clinical tests, examine the patient or interpret radiographs in a real clinical setting. They are LLMs, and their performance depends on the information provided to them,” he explained.
The findings suggest that AI chatbots could serve as educational aids in endodontics rather than as replacements for human expertise. They could, as Dr Jalali suggests, help students and residents practise answering clinical questions, test their knowledge and compare their reasoning with model answers. Used in this way, the technology could provide an interactive learning environment that supplements traditional teaching, particularly in preparation for board-style examinations.
The work forms part of a broader research series. In a previous study, the researchers evaluated LLMs using written board-style questions. According to Dr Jalali, the next step will be to explore whether these tools can help design high-quality examination questions, and initial pilot results in this area have been promising.