Please use this identifier to cite or link to this item:
https://digital.lib.ueh.edu.vn/handle/UEH/74135
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Ngoc My Bui | - |
dc.contributor.author | Jessie S. Barrot | - |
dc.date.accessioned | 2025-02-20T04:09:54Z | - |
dc.date.available | 2025-02-20T04:09:54Z | - |
dc.date.issued | 2024 | - |
dc.identifier.issn | 1360-2357 (Print), 1573-7608 (Online) | - |
dc.identifier.uri | https://digital.lib.ueh.edu.vn/handle/UEH/74135 | - |
dc.description.abstract | With generative artificial intelligence (AI) tools' remarkable capabilities in understanding and generating meaningful content, intriguing questions have been raised about their potential as automated essay scoring (AES) systems. One such tool is ChatGPT, which can score any written work against predefined criteria. However, limited information is available about the reliability of this tool in scoring the different dimensions of writing quality. Thus, this study examines the relationship between the scores assigned by ChatGPT and a human rater, and how consistent ChatGPT-assigned scores are when taken at multiple time points. The study employed a cross-sectional quantitative approach, analyzing 50 argumentative essays from each of four proficiency levels (A2_0, B1_1, B1_2, and B2_0), for a total of 200 essays. These essays were rated by ChatGPT and an experienced human rater. Correlational analysis reveals that ChatGPT's scores did not align closely with those of the human rater (i.e., weak to moderate relationships) and were not consistent across two rounds of scoring (i.e., low intraclass correlation coefficient values). These results were attributed primarily to ChatGPT's scoring algorithm, training data, model updates, and inherent randomness. Implications for writing assessment and future studies are discussed. | en |
dc.language.iso | eng | - |
dc.publisher | Springer | - |
dc.relation.ispartof | Education and Information Technologies | - |
dc.relation.ispartofseries | Vol. 30 | - |
dc.rights | Springer Nature | - |
dc.subject | Generative AI | en |
dc.subject | ChatGPT | en |
dc.subject | Automated essay scoring | en |
dc.subject | Automated writing evaluation | en |
dc.subject | Argumentative essays | en |
dc.title | ChatGPT as an automated essay scoring tool in the writing classrooms: how it compares with human scoring | en |
dc.type | Journal Article | en |
dc.identifier.doi | https://doi.org/10.1007/s10639-024-12891-w | - |
dc.format.firstpage | 2041 | - |
dc.format.lastpage | 2058 | - |
ueh.JournalRanking | Scopus | - |
item.cerifentitytype | Publications | - |
item.languageiso639-1 | en | - |
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | - |
item.grantfulltext | none | - |
item.fulltext | Only abstracts | - |
item.openairetype | Journal Article | - |
Appears in Collections: INTERNATIONAL PUBLICATIONS
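For readers who want to see the shape of the analysis the abstract describes, below is a minimal sketch of the two reliability checks it names: a Pearson correlation between ChatGPT and human scores, and an intraclass correlation coefficient (ICC) between two ChatGPT scoring rounds. This is not the authors' code: the score vectors are hypothetical placeholders, and the ICC(2,1) formulation (two-way random effects, absolute agreement, single rating) is an assumption, since the record does not specify which ICC variant the study used.

```python
# Illustrative sketch only: hypothetical scores, assumed ICC(2,1) variant.
import numpy as np
from scipy.stats import pearsonr

# Hypothetical per-essay scores (placeholders, not study data).
human  = np.array([70, 65, 80, 75, 60, 85, 72, 68])
gpt_r1 = np.array([74, 60, 78, 70, 66, 80, 75, 62])  # ChatGPT, round 1
gpt_r2 = np.array([68, 64, 82, 73, 61, 77, 70, 66])  # ChatGPT, round 2

# Human-vs-ChatGPT alignment: Pearson correlation.
r, p = pearsonr(human, gpt_r1)
print(f"Pearson r (human vs. ChatGPT): {r:.3f} (p = {p:.3f})")

def icc2_1(ratings: np.ndarray) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rating
    (Shrout & Fleiss, 1979). `ratings` is an (n_subjects, k_raters) matrix."""
    n, k = ratings.shape
    grand = ratings.mean()
    row_means = ratings.mean(axis=1)  # per-essay means
    col_means = ratings.mean(axis=0)  # per-round means
    # Mean squares from the two-way ANOVA decomposition.
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)  # between essays
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)  # between rounds
    sse = (np.sum((ratings - grand) ** 2)
           - k * np.sum((row_means - grand) ** 2)
           - n * np.sum((col_means - grand) ** 2))
    mse = sse / ((n - 1) * (k - 1))                        # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Consistency of ChatGPT across two scoring rounds.
icc = icc2_1(np.column_stack([gpt_r1, gpt_r2]))
print(f"ICC(2,1) (ChatGPT round 1 vs. round 2): {icc:.3f}")
```

Under the conventional benchmarks (e.g., an ICC below 0.50 is often read as poor reliability), values in that range would correspond to the abstract's conclusion that ChatGPT's round-to-round scoring lacked consistency.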