The Hebrew Language Project: Automated Essay Scoring & Readability Analysis

[featured_image]
  • Download 148
  • File Size 327.25 KB
  • File Count 1
  • Create Date August 2, 2018
  • Last Updated August 2, 2018

In 2000, NITE launched the Hebrew Language Project (HLP). The goal of the project is to develop computational tools for the analysis and evaluation of Hebrew texts. The current paper reports the results of two studies.

The first study examined the differential contribution of quantified text features to the automated scoring of essays elicited in three different contexts: essays written by 8th-grade native Hebrew speakers who took part in the Israeli National Assessment of Educational Progress (n=1413), essays written by 12th-grade native speakers in an instructional writing program (n=662), and essays written by applicants to higher education who took the YAEL test of Hebrew as a foreign language (n=980). The study also examined the effect of the size of the training sample used to develop the prediction model, and the effect of the text-feature clustering model, on the precision of the automated score.

The second study examined the feasibility of assessing the difficulty (readability) of reading-comprehension passages using statistical, morphological, and lexical text features. A total of 7 sets of 10 passages, taken from various tests administered by NITE, were used in the study. Each set was given to three expert judges, who were asked to evaluate the difficulty of the texts on a 1-10 scale. The average of the judges' difficulty estimates for each passage yielded a single difficulty measure. Next, a linear prediction model of passage difficulty was developed: 17 of the 50 text features examined were found to be significantly correlated (.23-.41) with the difficulty level obtained from the expert judges. The correlation between the predicted score and the difficulty measure was .80, indicating that about 64% of the variance in text difficulty can be explained by quantified text features.
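The second study's pipeline (average three expert ratings per passage, fit a linear model on quantified text features, then correlate predicted and judged difficulty) can be sketched as follows. This is a minimal illustration, not NITE's actual model: the ratings, the single feature (mean word length), and all numeric values below are invented for demonstration, and the real study used up to 50 features rather than one.

```python
# Hypothetical sketch of the readability-prediction pipeline described above.
# All passage data and feature values are invented for illustration only.

from statistics import mean

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def fit_linear(feature, target):
    """Least-squares slope and intercept for a single predictor."""
    mf, mt = mean(feature), mean(target)
    slope = (sum((f - mf) * (t - mt) for f, t in zip(feature, target))
             / sum((f - mf) ** 2 for f in feature))
    return slope, mt - slope * mf

# Invented ratings: three judges score each passage on a 1-10 scale.
judge_ratings = [
    (3, 4, 3), (5, 6, 5), (7, 8, 8), (2, 2, 3), (9, 8, 9), (6, 5, 6),
]
# Averaging the judges yields a single difficulty measure per passage.
difficulty = [mean(r) for r in judge_ratings]

# Invented quantified text feature (e.g., mean word length per passage).
mean_word_len = [3.9, 4.4, 5.1, 3.6, 5.6, 4.6]

# Fit the linear prediction model and correlate prediction with judgment.
slope, intercept = fit_linear(mean_word_len, difficulty)
predicted = [slope * f + intercept for f in mean_word_len]
r = pearson(predicted, difficulty)
print(f"correlation between predicted and judged difficulty: {r:.2f}")
```

With more features, the same idea extends to multiple linear regression, and squaring the resulting correlation gives the proportion of variance in judged difficulty explained by the text features.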

Attached Files

File: paper_4e1237ae.pdf