Essay tests are a type of constructed-response assessment in which test-takers display skills and knowledge by writing responses to an assigned task, question, or prompt. Instead of selecting answers to multiple-choice questions, the test-taker may be required to present an opinion on an issue, explain a chemical process, describe a scene in a foreign language, present conclusions drawn from a set of economic data, write a narrative, or demonstrate other skills and understandings in writing. The tasks may be simple or highly complex, discrete or multiphased; the response format may vary from a paragraph to a letter, a multiparagraph essay, a proposal, and so on.

For centuries, essay testing was the primary form of assessment. But with an emphasis on statistical rigor and the advent of rapid-scanning equipment and other technical advances in the mid-20th century, selected-response tests (primarily multiple-choice tests) emerged as a generally more objective, efficient, and cost-effective form for large-scale assessments. Selected-response tests usually sample a broader content domain: In an hour of testing time, test-takers might respond to 45 multiple-choice questions instead of only one or a few essay questions. Furthermore, the more questions in a test, the less a test-taker's performance on any one question influences the final score. These factors, in addition to the right-or-wrong nature of the questions, make selected-response tests more reliable; that is, test-takers' scores are substantially more consistent across different versions of the same test. As selected-response testing evolved, essay testing was often dismissed as more subjective and less reliable.
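The link between test length and score consistency can be illustrated with the Spearman-Brown formula, a standard psychometric result that this entry does not itself name. The sketch below assumes that added questions are comparable (parallel) to the existing ones, and the starting reliability of 0.50 is a hypothetical figure chosen only for illustration.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when a test is lengthened by `length_factor`.

    Assumes the added questions are parallel to the existing ones.
    """
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Hypothetical reliability of a test built from a single task.
single_task_reliability = 0.50

for k in (1, 2, 4, 8):
    predicted = spearman_brown(single_task_reliability, k)
    print(f"{k} comparable task(s): predicted reliability {predicted:.3f}")
```

Under these assumptions the predicted reliability rises with each added task but by smaller and smaller amounts, which is one reason a test with many short questions tends to yield more consistent scores than a test with one or two long ones.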

In recent decades, however, developments in cognitive psychology and an increased emphasis on more authentic assessment have helped spur interest in having test-takers display or perform certain skills. Essay testing has become a highly rigorous discipline supported with research, technological advancements, and published guidelines for best practices (although theorists continue to debate conceptual issues, including the extent to which test reliability should be emphasized or readers should set aside their personal reactions to essay responses).

Essay testing requires careful deliberation at every stage. The process of specifying the domain of content and skills to be assessed often includes surveys, focus-group sessions, committee deliberations, or reviews of professional and curriculum standards. The development process may also include prototype testing of various formats, directions, topics, and evaluation criteria, as well as questionnaires or interviews ascertaining test-takers' opinions about the tasks. Below are some important steps in designing an essay test:

  • Determine which skills and knowledge are best assessed via essay tasks.
  • Consider the extent to which writing fluency could unduly affect measurement of other skills or knowledge.
  • Include a sufficient number of test questions—possibly combining essay tasks and selected-response questions—for desired content coverage and reliable scores.
  • Provide adequate time for each task.
  • Consider how requiring test-takers to either handwrite or word-process their essay responses may affect their ability to display the skills being assessed.
  • Give test-takers ample opportunity to prepare by prepublishing information about the task types and scoring standards, including sample questions and sample responses with scores and commentary.
  • Determine the extent to which evidence supports the claim that the essay test is valid for its intended purpose—that is, that the inferences and actions based on test scores will be appropriate, meaningful, and useful. For example, conduct studies to investigate relationships between test-takers' performance on the essay test and demonstrations of the same skills in other contexts.
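As a rough illustration of the last step, one simple piece of validity evidence is the correlation between essay-test scores and an independent measure of the same skill. The sketch below computes a Pearson correlation by hand; the score lists are invented for illustration, and a real validity study would involve far larger samples and more than a single coefficient.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: essay-test scores and instructor ratings of
# the same five test-takers' writing in coursework.
essay_scores = [2, 3, 4, 5, 6]
coursework_ratings = [1, 3, 3, 5, 6]

print(f"r = {pearson_r(essay_scores, coursework_ratings):.2f}")
```

A high correlation would be consistent with the claim that the essay test measures the intended skill, though it would not by itself establish that the uses made of the scores are appropriate.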

In essay testing, how responses are scored becomes an integral part of test design. First, the scoring methodology determines what type of information will be obtained. One common approach is holistic scoring, in which a reader appraises the overall quality of a response to assign a single score. Holistic scoring can reflect the integrated nature of proficiencies such as writing, reading comprehension, or critical thinking; moreover, it is an efficient approach to scoring large volumes of responses. However, it does not provide the diagnostic feedback or detailed information that may be gained from more time-consuming analytic scoring, in which a reader either evaluates the different features or traits of a response separately or awards scores based on the presence or absence of specified components (e.g., particular facts about a chemical process). In practice, testing programs may employ variants of holistic or analytic scoring.
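The difference in the information each approach yields can be sketched as data structures. Everything in this sketch is hypothetical: the class names, trait names, and score values are invented, and the keyword match in `component_checklist` is only a crude stand-in for a trained reader's judgment.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HolisticScore:
    """One number summarizing the overall quality of a response."""
    overall: int


@dataclass
class TraitScores:
    """Analytic scoring by trait: one score per feature of the response."""
    traits: Dict[str, int]

    def total(self) -> int:
        return sum(self.traits.values())


def component_checklist(response: str, components: List[str]) -> Dict[str, bool]:
    """Analytic scoring by component: record presence or absence of each one.

    A simple keyword match is used here purely for illustration.
    """
    text = response.lower()
    return {c: c.lower() in text for c in components}


holistic = HolisticScore(overall=4)
by_trait = TraitScores(traits={"organization": 4, "development": 3, "language use": 5})
checklist = component_checklist(
    "Heat causes the water to evaporate; the vapor then condenses.",
    ["evaporate", "condenses", "precipitation"],
)

print(holistic.overall)                   # a single score, no diagnostic detail
print(by_trait.traits, by_trait.total())  # per-trait feedback plus a total
print(checklist)                          # which required components appeared
```

The holistic record is the cheapest to produce but carries no detail; the two analytic records show where a response is strong or weak, at the cost of more reader time per response.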

...
