Evaluating Large Language Models performance in Endodontics: A clinical experimental study

DOI:

https://doi.org/10.33448/rsd-v15i1.50523

Keywords:

Artificial intelligence, Endodontics, Dental pulp diseases, Diagnosis, Machine learning.

Abstract

This study aimed to evaluate the diagnostic accuracy, consistency, and success rates of eight AI-based chatbots in endodontics. In this cross-sectional study, eight AI models, selected for architectural and developer heterogeneity and for clinical relevance, were tested on 12 validated fictitious endodontic cases aligned with American Association of Endodontists (AAE) guidelines. Ethical approval was waived because no human data were used, and STROBE guidelines were followed to ensure methodological rigor. Standardized prompts ensured uniformity, and each case was run three independent times to assess consistency. Responses were anonymized and evaluated by blinded, calibrated reviewers. Statistical analysis included Kruskal-Wallis and Dunn's tests, Fleiss' kappa, and chi-square tests to compare diagnostic and treatment accuracy and intramodel agreement. Diagnostic accuracy varied significantly among the AI models (p < 0.001), with ChatGPT o1 (97%), Claude (97%), and DeepSeek (90.9%) outperforming Gemini (54.5%). Treatment recommendations showed uniformly high accuracy (97–100%, p = 0.537). Multivariate regression confirmed ChatGPT o1 (OR = 32.7) and Claude (OR = 30.5) as superior, although complex diagnoses (e.g., acute apical abscess, asymptomatic irreversible pulpitis) reduced accuracy (OR = 0.01–0.3, p < 0.05). Stratified analysis identified model-specific vulnerabilities: Gemini failed in reversible pulpitis (0/3, p = 0.001) and chronic apical abscess (0/3, p = 0.001), while ChatGPT o1 struggled with acute apical abscess (0/3, p < 0.001). Overall agreement was 93%, with high intraclass reliability (ICC > 0.85) for the top models versus Gemini (ICC = 0.65). Fleiss' kappa indicated moderate agreement (κ = 0.28–0.45) in ambiguous cases, underscoring heterogeneous reliability. In conclusion, seven AI chatbots demonstrated high accuracy in endodontic cases and may be considered helpful tools to complement clinical practice.
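As an illustration of the consistency analysis mentioned in the abstract, the following is a minimal sketch of how Fleiss' kappa could be computed when each case is run three times by the same model, treating each run as one "rater". The abstract does not describe the authors' actual computation, so the function name, the diagnosis abbreviations, and the toy data below are all assumptions made for illustration and are not study results.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each rated the same number of times.

    `ratings` is a list of lists; each inner list holds the category labels
    assigned to one item (here: the diagnoses from repeated runs on one case).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    counts = [Counter(item) for item in ratings]
    categories = sorted({label for item in ratings for label in item})

    # Observed agreement: mean over items of the proportion of agreeing pairs.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    total = n_items * n_raters
    p_expected = sum(
        (sum(c[cat] for c in counts) / total) ** 2 for cat in categories
    )

    if p_expected == 1.0:
        # Only one category was ever used; agreement is trivially perfect.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data (NOT from the study): four cases, three runs each.
# Labels are illustrative abbreviations of diagnoses.
toy_runs = [
    ["RP", "RP", "RP"],
    ["SIP", "SIP", "AIP"],
    ["AAA", "AAA", "AAA"],
    ["CAA", "SAP", "CAA"],
]

kappa = fleiss_kappa(toy_runs)
print(round(kappa, 3))  # 0.586
```

In this toy example two of the four cases receive identical answers in all three runs and two receive a split answer, which yields a kappa of 17/29 (about 0.586). Values computed this way depend heavily on how many diagnostic categories appear and how evenly they are used, so they are not directly comparable to the κ range reported above.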

Published

2026-01-13

Section

Health Sciences

How to Cite

Evaluating Large Language Models performance in Endodontics: A clinical experimental study. Research, Society and Development, [S. l.], v. 15, n. 1, p. e2515150523, 2026. DOI: 10.33448/rsd-v15i1.50523. Available at: https://rsdjournal.org/rsd/article/view/50523. Accessed: 23 Jan. 2026.