The Role of ChatGPT in Dental Examination: A Study on Reliability and Efficiency in Automated Essay Scoring

  • Dr. Himaja. K, Student, Department of Public Health Dentistry, Mamata Dental College, Khammam, India.
  • Dr. K. V. R. Pratap, Professor and HOD, Department of Public Health Dentistry, Mamata Dental College, Khammam, India.
  • Dr. T. Madhavi Padma, Professor, Department of Public Health Dentistry, Mamata Dental College, Khammam, India.
  • Dr. Siva Kalyan, Reader, Department of Public Health Dentistry, Mamata Dental College, Khammam, India.
  • Dr. Surbhit Singh, Senior Lecturer, Department of Public Health Dentistry, Mamata Dental College, Khammam, India.
  • Dr. V. Srujan Kumar, Senior Lecturer, Department of Public Health Dentistry, Mamata Dental College, Khammam, India.
Keywords: Artificial Intelligence, ChatGPT, Dental-specific Terminologies, Intra-rater

Abstract

The integration of artificial intelligence (AI) in dental education and assessment has gained significant attention in recent years. This study evaluates the role of ChatGPT in dental examinations, specifically focusing on its reliability and efficiency in automated essay scoring. The research aims to assess how effectively ChatGPT can evaluate dental students’ written responses, considering factors such as accuracy, consistency, and grading bias.

A dataset of subjective dental examination answers was analyzed using ChatGPT’s natural language processing (NLP) capabilities. The AI-generated scores were compared with manual grading by dental educators, using metrics such as correlation with expert scores, intra-rater reliability, and time efficiency. Results indicate that ChatGPT demonstrates high consistency and efficiency in grading, significantly reducing the time required for evaluation. However, challenges such as contextual misinterpretation, grading fairness, and domain-specific limitations were observed.
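
The comparison metrics named above can be illustrated with a short sketch. It assumes that agreement with expert scores and intra-rater reliability are both summarized with a Pearson correlation coefficient; the abstract does not state which statistic was used, so this is one plausible choice rather than the study's method. The function name `pearson` and all score lists are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / (spread_x * spread_y)


# Hypothetical scores (out of 10) for five essays; illustrative only.
expert_scores = [6, 7, 8, 5, 9]   # manual grading by an educator
ai_pass_1 = [6, 8, 8, 5, 9]       # first AI scoring pass
ai_pass_2 = [6, 7, 8, 6, 9]       # second AI scoring pass on the same essays

# Agreement between AI and expert grading.
agreement_with_expert = pearson(expert_scores, ai_pass_1)

# Intra-rater reliability: the same rater (the AI) scoring the same essays twice.
intra_rater = pearson(ai_pass_1, ai_pass_2)

print(f"AI vs expert: r = {agreement_with_expert:.2f}")
print(f"AI pass 1 vs pass 2: r = {intra_rater:.2f}")
```

Values close to 1 would indicate that the two sets of scores rank the essays similarly. A correlation does not detect a constant offset between raters, so an agreement measure such as the intraclass correlation coefficient may be preferable where absolute marks matter.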

This study concludes that ChatGPT has promising potential in automated essay scoring for dental examinations, offering a scalable and time-saving solution. However, human oversight remains essential to ensure clinical relevance and fairness in assessment. Future research should focus on refining AI models to better understand dental-specific terminologies and reasoning for improved accuracy.

Aim

  1. Assess the accuracy of ChatGPT’s grading compared to manual scoring by dental educators.
  2. Analyze the consistency of AI-generated scores across different responses.
  3. Evaluate time efficiency, determining whether ChatGPT can reduce the time required for essay evaluation.
  4. Identify limitations and challenges, such as contextual misinterpretation or bias in grading.

Objective

  1. To analyze the accuracy of ChatGPT’s automated essay scoring in dental examinations by comparing AI-generated scores with those given by expert dental educators.
  2. To evaluate the reliability of ChatGPT in maintaining consistency across multiple essay responses.
  3. To measure the efficiency of ChatGPT in terms of time taken for evaluation compared to manual grading.

Method: A cross-sectional survey was conducted among 204 dental students (57 males and 147 females). The survey included 14 questions. Responses were analyzed by gender and year of study using chi-square tests to identify statistically significant differences.
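
As an illustration of the analysis described in the Method, the following is a minimal sketch of a chi-square test of independence on a 2×2 table of gender by response. It assumes a Pearson chi-square statistic without continuity correction; the text does not say whether a correction was applied. The function name `chi_square_2x2` is hypothetical. The row totals match the reported sample (57 males, 147 females), but the split of responses within each row is invented for illustration and is not taken from the study.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 contingency table.

    No continuity correction is applied. With 1 degree of freedom the
    p-value is P(Z^2 > stat) = erfc(sqrt(stat / 2)).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column must contain at least one response")
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Rows: males (57), females (147). Columns: "yes", "no" to one survey question.
# The yes/no counts are hypothetical.
responses = [[30, 27], [90, 57]]

stat, p_value = chi_square_2x2(responses)
print(f"chi-square = {stat:.3f}, p = {p_value:.3f}")
```

A p-value below the chosen significance level (commonly 0.05) would suggest that the response distribution differs between the two groups. Comparisons by year of study involve more than two groups and therefore more degrees of freedom, which this 2×2 sketch does not cover.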

Published
2025-03-16
How to Cite
K, D. H., Pratap, D. K. V. R., Padma, D. T. M., Kalyan, D. S., Singh, D. S., & Kumar, D. V. S. (2025). The Role of ChatGPT in Dental Examination A Study on Reliability and Efficiency in Automated Essay Scoring. International Journal Of Drug Research And Dental Science, 7(1), 86-94. https://doi.org/10.36437/ijdrd.2025.7.1.F
