The Effects of Inclusion of Native Speakers’ Writing Samples on the Domain Scoring Accuracy of Automated Essay Scoring of Writing Submitted by Taiwanese English Language Learners

While the scoring accuracy of automated scoring of essays written in English has been established, more research is needed with regard to domain scoring for English Language Learners (ELLs). This paper presents findings on the effects of training set composition on the domain (Focus and Meaning, Content and Development, Organization, Language Use and Style, and Mechanics and Conventions) scoring accuracy of essays submitted by Taiwanese students and scored by an automated essay scoring system. Each scoring model is typically trained on a set of previously scored essays. This study compares the accuracy of scoring the same set of essays written by Taiwanese students under two different models: one trained on a blended set of native and ELL essays and one trained on a set of entirely ELL essays. While both models yielded adjacent agreement rates of 98 to 100 percent across the domains, there were differences at the exact agreement level. Exact agreement for the model developed with the ELL training set ranged from 50 to 64 percent, while the blended training set yielded exact agreement ranging from 66 to 76 percent. Pearson correlations for the two models were very similar (.83 to .89 for the first and .84 to .90 for the second). This study supports the use of a blended training set.
