Title: | ChatGPT as an automated essay scoring tool in the writing classrooms: how it compares with human scoring |
Author(s): | Ngoc My Bui; Jessie S. Barrot |
Keywords: | Generative AI; ChatGPT; Automated essay scoring; Automated writing evaluation; Argumentative essays |
Abstract: | With generative artificial intelligence (AI) tools' remarkable capabilities in understanding and generating meaningful content, intriguing questions have been raised about their potential as automated essay scoring (AES) systems. One such tool is ChatGPT, which can score any written work against predefined criteria. However, limited information is available about the reliability of this tool in scoring the different dimensions of writing quality. Thus, this study examines the relationship between the scores assigned by ChatGPT and those assigned by a human rater, as well as how consistent ChatGPT-assigned scores are when taken at multiple time points. The study employed a cross-sectional quantitative approach to analyze 50 argumentative essays from each of four proficiency levels (A2_0, B1_1, B1_2, and B2_0), totaling 200 essays, which were rated by ChatGPT and an experienced human rater. Correlational analysis revealed that ChatGPT's scores did not align closely with those of the experienced human rater (i.e., weak to moderate relationships) and were not consistent across two rounds of scoring (i.e., low intraclass correlation coefficient values). These results were primarily attributed to ChatGPT's scoring algorithm, training data, model updates, and inherent randomness. Implications for writing assessment and future studies are discussed. |
Issue Date: | 2024 |
Publisher: | Springer |
Series/Report no.: | Vol. 30 |
URI: | https://digital.lib.ueh.edu.vn/handle/UEH/74135 |
DOI: | https://doi.org/10.1007/s10639-024-12891-w |
ISSN: | 1360-2357 (Print), 1573-7608 (Online) |
Appears in Collections: | INTERNATIONAL PUBLICATIONS |
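The abstract above refers to two reliability statistics: the correlation between ChatGPT-assigned and human-assigned scores, and the intraclass correlation coefficient (ICC) across ChatGPT's repeated scoring rounds. The sketch below is an illustrative Python example of how such statistics could be computed; it is not the authors' code, and the score values, variable names, and the specific ICC form used here (ICC(2,1), two-way random effects, absolute agreement) are assumptions made for demonstration only.

```python
# Illustrative sketch only -- not the study's actual analysis code.
# Assumes ChatGPT and human scores for the essays are available as
# parallel numeric arrays (hypothetical example values below).
import numpy as np
from scipy.stats import pearsonr


def icc_2_1(scores: np.ndarray) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `scores` is an (n_subjects, k_raters) matrix, e.g. one column per
    scoring round or per rater.
    """
    n, k = scores.shape
    grand = scores.mean()
    row_means = scores.mean(axis=1)
    col_means = scores.mean(axis=0)

    ss_rows = k * ((row_means - grand) ** 2).sum()   # between-essay variation
    ss_cols = n * ((col_means - grand) ** 2).sum()   # between-rater variation
    ss_total = ((scores - grand) ** 2).sum()
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical scores: two ChatGPT scoring rounds and one human rater.
chatgpt_round1 = np.array([3.0, 4.0, 2.5, 3.5, 4.5, 2.0])
chatgpt_round2 = np.array([3.5, 3.0, 3.0, 4.0, 4.0, 2.5])
human = np.array([3.0, 4.5, 2.0, 3.5, 5.0, 2.5])

# Human-AI agreement: correlation between ChatGPT and the human rater.
r, p = pearsonr(chatgpt_round1, human)
print(f"Pearson r = {r:.2f} (p = {p:.3f})")

# ChatGPT's consistency across the two scoring rounds.
icc = icc_2_1(np.column_stack([chatgpt_round1, chatgpt_round2]))
print(f"ICC(2,1) = {icc:.2f}")
```

In this sketch, a high Pearson r would indicate close alignment with the human rater, and a high ICC would indicate consistent scores across rounds; the study reports weak-to-moderate correlations and low ICC values.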