Machine Scoring of Student Essays: Truth and Consequences

PATRICIA FREITAG ERICSSON
RICHARD H. HASWELL
Copyright Date: 2006
Pages: 274
https://www.jstor.org/stable/j.ctt4cgq0p
    Book Description:

    The current trend toward machine scoring of student work, Ericsson and Haswell argue, has created an emerging issue with implications for higher education across the disciplines, but with particular importance for those in English departments and in administration. The academic community has been silent on the issue (some would say excluded from it), while the commercial entities that develop essay-scoring software have been very active.

    Machine Scoring of Student Essays is the first volume to seriously consider the educational mechanisms and consequences of this trend, and it offers important discussions from some of the leading scholars in writing assessment.

    Reading and evaluating student writing is a time-consuming process, yet it is a vital part of both student placement and coursework at post-secondary institutions. In recent years, commercial computer-evaluation programs have been developed to score student essays in both of these contexts. Two-year colleges have been especially drawn to these programs, but four-year institutions are moving to them as well because of the cost savings they promise. Unfortunately, the programs have largely been written, and institutions are installing them, with little attention to their instructional validity or adequacy.

    Since the education software companies are moving so rapidly into what they perceive as a promising new market, a wider discussion of machine scoring is vital if scholars hope to influence the development and implementation of the programs being created. What is needed, then, is a critical resource to help teachers and administrators evaluate programs they might be considering and to more fully envision the instructional consequences of adopting them. This is the resource Ericsson and Haswell provide here.

    eISBN: 978-0-87421-536-6
    Subjects: Education

Table of Contents

  1. Front Matter
    (pp. [i]-[iv])
  2. Table of Contents
    (pp. [v]-[vi])
  3. INTRODUCTION
    (pp. 1-7)
    Patricia Freitag Ericsson and Richard H. Haswell

    We’re in the fifth year of the twenty-first century and the Parliament of India, several universities in Italy, and four Catholic churches in Monterrey, Mexico, all have bought cell-phone jammers. Meanwhile in the State of Texas, USA, the State Board of Education has decided that students who fail the essay-writing part of the state’s college entrance examination can retake it either with ACT’s COMPASS tests using e-Write or with the College Board’s ACCUPLACER tests using WritePlacer Plus. Though dispersed geographically, these events have one thing in common. They illustrate how new technology can sneak in the back door and establish...

  4. 1 INTERESTED COMPLICITIES: The Dialectic of Computer-Assisted Writing Assessment
    (pp. 8-27)
    Ken S. McAllister and Edward M. White

    This excerpt from George Landow’s tongue-in-cheek short story about “Apprentice Author Austen” and her attempts to publish a story on the international computer network, thereby ensuring her promotion to “Author,” suggests a frightful future for writing and its assessment. The notion that a computer can deliver aesthetic judgments based on quantifiable linguistic determinants is abhorrent to many contemporary writing teachers, who usually treasure such CPU-halting literary features as ambiguity, punning, metaphor, and veiled reference. But Landow’s “Evaluator” may only be a few generations ahead of extant technologies like the Educational Testing Service’s e-rater, and recent developments in the fields of...

  5. 2 THE MEANING OF MEANING: Is a Paragraph More than an Equation?
    (pp. 28-37)
    Patricia Freitag Ericsson

    Several chapters in this collection allude to or deal briefly with issues of “meaning” in the controversy about the machine scoring of essays. This chapter’s intent is to explore extensively the “meaning of meaning,” arguing that, although they may appear to be esoteric, considerations of “meaning” are central to the controversy about the machine scoring of student essays and important to include as we make arguments about it. Foregrounding of the “meaning of meaning” in this chapter establishes a foundation for other chapters that may allude to the importance of meaning in the machine-scoring controversy. Discussion in this chapter can...

  6. 3 CAN'T TOUCH THIS: Reflections on the Servitude of Computers as Readers
    (pp. 38-56)
    Chris M. Anson

    Consider, for a moment, what’s going on. First, you’re in a multidimensional context where you and I, and this text, share a presence, a purpose, and knowledge that delimit the interpretive possibilities and let you begin fitting into boxes what little you’ve seen so far, or maybe shaping a box around it: academic genre, essay in a book, trusted editors, a focus on machines as readers, the common use of an opening quotation (lyrics, or a poem, or a proverb, or a line of text from a famous work). This one’s in a vernacular. Does its style provide the meaning...

  7. 4 AUTOMATONS AND AUTOMATED SCORING: Drudges, Black Boxes, and Dei Ex Machina
    (pp. 57-78)
    Richard H. Haswell

    Her name really is Nancy Drew. Like her fictional namesake, she is into saving people, although more as the author of a mystery than the hero of one. She teaches English at a high school in Corpus Christi, Texas, and according to the local newspaper (Beshur 2004), she has designed a software program that will grade student essays. The purpose is to help teachers save students from the Texas Essential Knowledge and Skills test, which must be passed for grade promotion and high school diploma. Her software will also save writing teachers from excessive labor, teachers who each have around...

  8. 5 TAKING A SPIN ON THE INTELLIGENT ESSAY ASSESSOR
    (pp. 79-92)
    Tim McGee

    The following narrative recounts an experiment I performed upon a particular essay-scoring machine, the Intelligent Essay Assessor (IEA), that was first brought to my attention by Anne Herrington and Charles Moran’s 2001 College English essay “What Happens When Machines Read Our Students’ Writing?” As a writing program administrator, I was experienced in the progressive waves of writing assessment historically performed by human readers and well versed in many aspects of computer-assisted instruction. At the time of my experiment, however, I was still a neophyte in the area of automated essay scoring. Nevertheless, despite serious misgivings about my qualifications in the...

  9. 6 ACCUPLACER'S ESSAY-SCORING TECHNOLOGY: When Reliability Does Not Equal Validity
    (pp. 93-113)
    Edmund Jones

    Placement of students in first-year writing courses is generally seen as a time-consuming but necessary exercise at most colleges and universities in the United States. Administrators have been concerned about the expense and inconvenience of testing, about the validity of the tests, and about the reliability of the scorers. Over the past decade, computer technology has developed to the point that a company like ACCUPLACER, under the auspices of the College Board, can plausibly offer computer programs that score student essays with the same reliability as expert scorers (Vantage Learning 2000). Under this system, schools need hire no faculty...

  10. 7 WRITEPLACER PLUS IN PLACE: An Exploratory Case Study
    (pp. 114-129)
    Anne Herrington and Charles Moran

    In 2001, we published an essay in College English entitled “What Happens When Machines Read Our Students’ Writing?” In it, we discussed two computer programs, then relatively new to the market, that were designed to evaluate student writing automatically: WritePlacer Plus, developed by Vantage Technology, and Intelligent Essay Assessor, developed by three University of Colorado faculty who incorporated as Knowledge Analysis Technologies to market it. At the time, ETS had also developed its own program, e-rater, and was using it to score essays for the Graduate Management Admissions Test.

    Flash forward to 2004, and a quick check of company Web...

  11. 8 E-WRITE AS A MEANS FOR PLACEMENT INTO THREE COMPOSITION COURSES: A Pilot Study
    (pp. 130-137)
    Richard N. Matzen Jr. and Colleen Sorensen

    In the fall of 2002 Utah Valley State College (UVSC) began institutional research into placement tests for first-year composition courses: two basic writing courses and a freshman composition course. UVSC researchers had previously presented evidence in the article “Basic Writing Placement with Holistically Scored Essays: Research Evidence” (Matzen and Hoyt 2004) that suggested the college’s multiple-choice tests—ACT, ACT COMPASS, and DRP (Degrees of Reading Power)—often misplaced UVSC students in composition courses. As an alternative to placement by these tests, a research team including people from the Department of Basic Composition, the Institutional Research and Management Studies Office, and...

  12. 9 COMPUTERIZED WRITING ASSESSMENT: Community College Faculty Find Reasons to Say "Not Yet"
    (pp. 138-146)
    William W. Ziegler

    Community colleges exist to provide educational opportunities to a fluid population, many of whom encounter sudden changes in their work, family lives, and financial situations. For this reason, community colleges often admit, place, and register a student for classes all within a little more than twenty-four hours. This need to respond promptly explains why Virginia’s community college administrators took notice when the computerized COMPASS placement test appeared in 1995. The test, published by ACT Inc., promised to gather demographic data as well as provide nearly instant placement recommendations in mathematics, reading, and writing. Several colleges piloted the test independently, and...

  13. 10 PILOTING THE COMPASS E-WRITE SOFTWARE AT JACKSON STATE COMMUNITY COLLEGE
    (pp. 147-153)
    Teri T. Maddox

    Placement issues are a major concern in higher education. Many states require students who do not have college-level scores on entrance exams such as the ACT or SAT to take precollege developmental classes. Without reliable placement testing, students may be put into classes that are too easy for them and become bored, or worse, they may be put into classes too hard for them and drop out because they don’t think they are “college material.” Correct placement should mean that students are put into appropriate classes for their ability level so that they will be properly challenged and supported. In...

  14. 11 THE ROLE OF THE WRITING COORDINATOR IN A CULTURE OF PLACEMENT BY ACCUPLACER
    (pp. 154-165)
    Gail S. Corso

    Placement processes into college writing and developmental writing courses include diverse options. Processes include use of external exams and externally set indicators, such as SAT or ACCUPLACER scores; use of locally designed essay topics for placement that are assessed by human readers using either holistic or analytic trait measures; use of directed self-placement (Royer and Gilles 2002) or informed self-placement (Bedore and Rossen-Knill 2004); and use of electronic portfolios (Raymond Walters College 2002). Whatever process for placement an institution uses needs to align with course and program outcomes, institutional assessment outcomes, the mission of the school, and, most significantly, the...

  15. 12 ALWAYS ALREADY: Automated Essay Scoring and Grammar-Checkers in College Writing Courses
    (pp. 166-176)
    Carl Whithaus

    Although Ken S. McAllister and Edward M. White call the development of automated essay scoring “a complex evolution driven by the dialectic among researchers, entrepreneurs, and teachers” (chapter 1 of this volume), within composition studies the established tradition points toward the rejection of machine-scoring software and other forms of computers as readers. This tradition culminates in the Conference on College Composition and Communication’s (2005) “Position Statement on Teaching, Learning, and Assessing Writing in Digital Environments,” where the penultimate sentence succinctly captures our discipline’s response: the committee writes, “We oppose the use of machine-scored writing in the assessment of writing” (789)....

  16. 13 AUTOMATED ESSAY GRADING IN THE SOCIOLOGY CLASSROOM: Finding Common Ground
    (pp. 177-198)
    Edward Brent and Martha Townsend

    This chapter describes an effort by one author, a sociologist, to introduce automated essay grading in the classroom, and the concerns raised by the other author, the director of a campuswide writing program, in evaluating the grading scheme for fulfillment of a writing-intensive (WI) requirement. Brent provides an overview of existing automated essay-grading programs, pointing out the ways these programs do not meet his needs for evaluating students’ understanding of sociology content. Then he describes the program he developed to meet those needs along with an assessment of the program’s use with six hundred students over three semesters. Townsend provides...

  17. 14 AUTOMATED WRITING INSTRUCTION: Computer-Assisted or Computer-Driven Pedagogies?
    (pp. 199-210)
    Beth Ann Rothermel

    Elsewhere in this collection William Condon (chapter 15) exposes the losses college writing programs may experience when employing machine scoring in the assessment process. Contrasting machine scoring with a host of other more “robust” forms of assessment, such as portfolio-based assessment, Condon reveals how machine scoring fails “to reach into the classroom”; using machine scoring for student and program assessment provides administrators and teachers with little of the data they need to engage in effective internal evaluation. This essay examines another recent application of commercial machine scoring—its use with Web-based writing-instruction programs currently marketed to K–16 writing...

  18. 15 WHY LESS IS NOT MORE: What We Lose by Letting a Computer Score Writing Samples
    (pp. 211-220)
    William Condon

    Earlier in this volume, Rich Haswell (chapter 4) questions the validity of machine scoring by tying it to holistic scoring methodologies—and I certainly agree with his critique of timed writings, holistically scored. However, I want to suggest that questions about machine scoring differ from questions about holistic readings. Machine scoring involves looking back even farther in the history of writing assessment—to indirect testing. Several essays in this collection describe the basic methodology behind machine scoring. The computer is programmed to recognize linguistic features of a text that correlate highly with score levels previously assigned a text by human...

  19. 16 MORE WORK FOR TEACHER? Possible Futures of Teaching Writing in the Age of Computerized Assessment
    (pp. 221-233)
    Bob Broad

    In her book More Work for Mother: The Ironies of Household Technology from the Open Hearth to the Microwave (1983), Ruth Schwartz Cowan presents a feminist history of modern household technology. As the title of her book emphasizes, her argument is that the hundreds of “gadgets” invented with the purpose of easing the labor of “housewives” achieved the net result of dramatically increasing the quantity and range of tasks for which women were responsible in the American home. For example, when the wood-burning stove replaced the open hearth as the home’s source of heat and the cooking apparatus, men and...

  20. 17 A BIBLIOGRAPHY OF MACHINE SCORING OF STUDENT WRITING, 1962–2005
    (pp. 234-243)
    Richard H. Haswell
  21. GLOSSARY OF TERMS
    (pp. 244-245)
  22. NOTES
    (pp. 246-250)
  23. REFERENCES
    (pp. 251-261)
  24. INDEX
    (pp. 262-266)
  25. Back Matter
    (pp. 267-268)