Combining Human and Automated Scoring Methods in Experimental Assessments of Writing: A Case Study Tutorial
This article presents a practical, research-based framework for combining human scoring with automated text analysis to evaluate writing outcomes in randomized controlled trials. Using a large-scale content literacy intervention as a case study, the authors demonstrate how machine learning and natural language processing tools can augment traditional human-coded assessments of student writing by generating a rich set of additional text-based outcomes. These automated measures—ranging from vocabulary use and discourse structure to psychological and linguistic features—help unpack how and why an intervention improved students’ argumentative writing, beyond what a single holistic score can reveal. The study shows that automated methods can efficiently identify treatment-driven changes in writing style, vocabulary, and reasoning, while also cautioning that machine-generated scores should supplement, not replace, human judgment. By offering a clear analytic workflow, open-source tools, and concrete examples, the article provides a scalable template for researchers seeking to deepen causal interpretations of writing outcomes in education research.
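The workflow summarized above (derive automated text-based outcomes from each essay, then compare them across randomized arms) can be illustrated with a minimal sketch. This is not the authors' pipeline: the three surface features, the function names `text_features` and `treatment_effects`, and the toy essays are all hypothetical placeholders chosen for illustration. The machine learning and natural language processing tools used in the article are not reproduced here. The sketch assumes simple random assignment, so a difference in means is an unbiased estimate of the treatment effect on each feature. A real analysis would add standard errors, account for clustering, and correct for multiple comparisons.

```python
from statistics import mean


def text_features(essay: str) -> dict[str, float]:
    """Compute a few simple, hypothetical text-based outcomes for one essay."""
    # Naive whitespace tokenization with surrounding punctuation stripped;
    # a real study would use a proper NLP tokenizer.
    tokens = [t.strip(".,;:!?\"'()").lower() for t in essay.split()]
    tokens = [t for t in tokens if t]
    n = len(tokens)
    return {
        "word_count": float(n),
        "type_token_ratio": len(set(tokens)) / n if n else 0.0,
        "mean_word_length": mean(len(t) for t in tokens) if n else 0.0,
    }


def treatment_effects(essays: list[str], treated: list[bool]) -> dict[str, float]:
    """Difference in means (treated minus control) for each automated feature.

    Assumes both arms contain at least one essay.
    """
    feats = [text_features(e) for e in essays]
    effects = {}
    for name in feats[0]:
        t_vals = [f[name] for f, z in zip(feats, treated) if z]
        c_vals = [f[name] for f, z in zip(feats, treated) if not z]
        effects[name] = mean(t_vals) - mean(c_vals)
    return effects


# Toy, made-up essays purely for illustration (not data from the study).
essays = [
    "Dogs are good because dogs are fun.",
    "I like cats.",
    "Recycling reduces waste; therefore, cities should fund it.",
    "Evidence shows that reading improves vocabulary and reasoning.",
]
treated = [False, False, True, True]

for feature, effect in treatment_effects(essays, treated).items():
    print(f"{feature}: {effect:+.3f}")
```

Reporting such feature-level effects alongside the human holistic score, rather than in place of it, reflects the article's caution that machine-generated measures should supplement human judgment.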