Reliability of ChatGPT in the assessment of dental anomalies

Authors

DOI:

https://doi.org/10.33448/rsd-v15i5.51135

Keywords:

Generative Artificial Intelligence, Tooth Abnormalities, Dental Informatics.

Abstract

This study evaluated the content accuracy of ChatGPT's responses about dental anomalies. It is an observational, methodological study with a quantitative approach, based on expert content evaluation, and it followed relevant methodological standards for the human evaluation of content produced by Artificial Intelligence. Five researchers each received the same set of 19 ChatGPT responses and assessed them independently. Accuracy and truthfulness were rated on a 5-point Likert scale on which higher scores indicated better response quality: (1) very unsatisfactory; (2) unsatisfactory; (3) acceptable; (4) good; and (5) very good. The mean score assigned by the evaluators was 3.47 ± 0.063. Inter-rater reliability was assessed with two coefficients: the Intraclass Correlation Coefficient (ICC) and Cronbach’s Alpha. These results indicate that, according to the evaluators, ChatGPT showed moderate performance on stomatology topics related to dental anomalies.
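
As an illustration of the two reliability coefficients named above, the following minimal sketch computes Cronbach's Alpha and a two-way random-effects, absolute-agreement ICC from a matrix of ratings (one row per response, one column per rater). The abstract does not state which ICC form was used, so the ICC(2,1) and ICC(2,k) forms here are an assumption, and the ratings are hypothetical placeholder values, not the study's data. The function names `cronbach_alpha` and `icc_two_way` are illustrative only.

```python
from statistics import mean, variance


def cronbach_alpha(ratings):
    """Cronbach's Alpha, treating each rater (column) as an 'item'.

    ratings: list of rows, one per rated response; each row holds one
    score per rater.
    """
    k = len(ratings[0])
    # Sample variance of each rater's scores across responses.
    sum_rater_vars = sum(variance(col) for col in zip(*ratings))
    # Sample variance of the per-response total scores.
    total_var = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum_rater_vars / total_var)


def icc_two_way(ratings):
    """Return (ICC(2,1), ICC(2,k)): two-way random effects, absolute agreement.

    Assumption: this ICC form is chosen for illustration; the source does
    not specify which form the authors used.
    """
    n, k = len(ratings), len(ratings[0])
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(col) for col in zip(*ratings)]

    # Two-way ANOVA sums of squares (responses = rows, raters = columns).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-response mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    icc_single = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    icc_average = (msr - mse) / (msr + (msc - mse) / n)
    return icc_single, icc_average


# Hypothetical 1-5 Likert scores: 6 responses rated by 5 raters.
# These values are placeholders for illustration, not the study's data.
example_ratings = [
    [4, 3, 4, 4, 3],
    [3, 3, 4, 3, 3],
    [2, 3, 2, 3, 2],
    [5, 4, 4, 5, 4],
    [3, 4, 3, 3, 4],
    [4, 4, 5, 4, 4],
]

single, average = icc_two_way(example_ratings)
print(f"Cronbach's alpha: {cronbach_alpha(example_ratings):.3f}")
print(f"ICC(2,1): {single:.3f}  ICC(2,k): {average:.3f}")
```

For a design like the one described, the input would have 19 rows and 5 columns. Cronbach's Alpha computed this way measures consistency among raters and ignores systematic differences in rater leniency, whereas the absolute-agreement ICC penalizes them, which is one reason the two coefficients can differ on the same data.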

Published

2026-05-26

Issue

Section

Health Sciences

How to Cite

Reliability of ChatGPT in the assessment of dental anomalies. Research, Society and Development, [S. l.], v. 15, n. 5, p. e10015551135, 2026. DOI: 10.33448/rsd-v15i5.51135. Available at: https://rsdjournal.org/rsd/article/view/51135. Accessed on: 15 Jun. 2026.