Modern Classroom Assessment

Bruce B. Frey


    Preface

    As a college professor and researcher in education, I've used and reviewed dozens of classroom assessment textbooks. Some books cover the basic types and strategies of classroom assessment with a little theory and research thrown in, but don't provide enough concrete examples of what this looks like in today's schools. Other texts are essentially example after example of multiple-choice formats and performance-based assessments, without much guidance about why those examples work or are consistent with theories or research about how students think and learn. I wanted a book that does both. It's always seemed to me that there was a critical need for a book that covers all the major, research-based approaches to student-centered, teacher-designed assessment in today's modern world, while also sharing tons of detailed models of what teachers actually do. The goal for Modern Classroom Assessment is to go beyond simply listing the basic assessment formats by exploring five broad, up-to-date approaches (or philosophies) to assessment with the supporting scholarship and theory to guide their appropriate use. Most important, though, my mission in writing this book was to make these sometimes abstract concepts and guidelines clear and practical by including as many real-world illustrations and examples as I could fit between these covers.

    The five modern ways of approaching classroom assessment that form the heart of this book include almost everything a teacher needs to know about classroom assessment. These approaches are the following:

    • Formative Assessment

      Providing frequent feedback directly to students so they can monitor and control their own learning is the only assessment approach that has been found to affect learning and increase test scores. And formative assessment opens up the definition of what classroom assessment is and what its purpose should be. This is not your father's end-of-the-year high-stakes exam. This process for collecting and sharing information is a collaboration between students and their teacher and is covered in Chapter 4: Formative Assessment.

    • Traditional Paper-and-Pencil Assessment

      The tried-and-true, efficient, objectively scored approaches to quickly and reliably assessing achievement include multiple-choice questions, matching, true-false, short answer, and combinations of those approaches. In many contexts and for many purposes in the modern classroom, these approaches fall short of teachers' needs. Sometimes forgotten, however, is that even today these methods often are the best and fairest choice. Chapter 5: Summative Assessment: Traditional Paper-and-Pencil Tests focuses on this still most common of approaches.

    • Performance-Based Assessment

      Twenty-five years ago, this approach was new and quickly gained popularity. The idea was to go beyond the measurement of low-level knowledge by asking students to perform a skill or create a product, and then assessing student ability from that performance or product. By necessity, this approach led to new scoring options, such as subjective scoring rubrics, which can increase validity but also introduce unique reliability difficulties. Chapter 7: Performance-Based Assessment examines this important approach to classroom assessment.

    • Authentic Assessment

      A current and very modern best practice in the field of classroom assessment is to use assessment tasks that match real-world expectations. This approach increases the usefulness of classroom assessment across all ages—from preschool through graduate school and on the job. Assessment that is authentic is intrinsically interesting and focuses on the “important stuff” like critical thinking and transferable skills. Chapter 8: Authentic Assessment is dedicated to realistic assessment.

    • Universal Design of Assessment

      Modern methods of test design emphasize accessibility and fairness for all children, regardless of gender, first language, ethnicity, or disability. Basic standards exist that can and should be applied to classroom assessment in all contexts and at all levels. This book is unique in the focus it provides on universal design of assessment and what it means for the classroom teacher. Chapter 9: Universal Test Design provides that focus.

    Supporting the discussion of these five key assessment approaches are several other crucial topics:

    • Chapter 3: Basic Assessment Strategy: Categories of Learning, Objectives, and Backward Design provides a smart way for creating effective tests and assignments that work well for any of the broad assessment approaches.
    • Chapter 6: Constructed-Response Items and Scoring Rubrics focuses on the design and scoring of complex assessments, assignments, and tasks.
    • Chapter 10: Test Accommodations explores what classroom teachers can and should do to make any assessment fairer and its scores more meaningful for individual students.
    • Chapter 11: Understanding Scores From Classroom Assessments and Chapter 13: Standardized Tests handle the heavy lifting by covering the statistical and analytical methods for interpreting student performance. The tricks of the trade for talking about standardized test scores with parents are discussed.
    • Chapter 12: Making the Grade presents a variety of philosophies for designing grading systems and assigning that all-important letter grade. How to discuss grades with parents is explored.

    A Book for Teachers

    Modern Classroom Assessment was planned from the start for the future school teacher. The target audience is the undergraduate college student in a teacher education program. If that's you, I wrote this book with you in mind. If you're an experienced teacher or a graduate student, though, you'll still find Modern Classroom Assessment to be valuable. By design, the book is filled with hundreds of applied examples, applications, and authentic illustrations of what modern teachers do and the assessment choices they make, but the examples are always discussed in the context of theory and educational research. The primary hope, then, is that you will get clear guidance and ideas about today's best practices in the classroom. It may be that that is all you need. “Just tell me what good teachers do!” It might be, though, that you'd like to know more about the theory or scholarship that supports the claim that a practice is “best.” That's here, too. Perhaps most important, I've tried to provide a richness of detail in our discussions of different formats, purposes, and grand strategies of assessment so you can solve assessment problems and apply the broad assessment approaches emphasized here to the unique specifics of your own classroom and your own students.

    Special Features

    To help foster a depth of understanding about both application and theory in the world of classroom assessment, we have included a variety of unique features in each chapter. These include the following:

    Stories From the Classroom

    Many chapters begin with a story about a teacher with a problem. Part 1 ends with a cliff-hanger, with Part 2 appearing at the end of the chapter. I hope that seeing real-world dilemmas paired with real-world classroom assessment solutions will make the meaningful application of that chapter's ideas and suggestions much clearer.

    Good Question!

    These are questions that students may wish to ask their instructor, but for some reason often don't. You can think of these as Frequently Unasked Questions. I've taken the liberty of asking and answering these questions for you!

    Real-World Choices Teachers Make

    As professionals, classroom teachers routinely make choices among assessment options. They have been trained, know best practice, and often are aware of the theory and research behind some strategy or approach. In the real world, though, even knowing all that sometimes doesn't make the right choice clear. Wherever this section appears, we take a look at the issues and information that help real-world teachers make quality choices.

    Technology

    This feature spotlights computerized, electronic, or web-based resources, which in today's world make assessment easier and more useful.

    A Closer Look

    Sometimes a theory, study, or idea deserves closer inspection. Rather than slowing down the conversation, I've placed these discussions in their own spaces. They are there for those interested, but can be skipped by you or your instructor. Think of this feature as an HD or high-definition option for a higher resolution picture.

    There's a Stat for That!

    Scores, item responses, student performance, validity analyses, and reliability estimates all produce numbers. These focused presentations on useful statistical and mathematical procedures are designed to be readable, meaningful, and useful in the real world without expecting students to be statisticians.

    Organizational Tools in Each Chapter

    Every chapter includes these guideposts and outlines to make it clear what the purpose of each chapter is, what will be covered, and how to think about what you just read:

    • Chapter Objectives

      Think of these as the instructional objectives for each chapter.

    • Looking Ahead

      Here are the major points in what you're about to read. This section comes at the start of each chapter.

    • Looking Back

      These are reminders of the critical points that we just covered. This section comes at the end of each chapter.

    • Things to Think About

      These are questions for discussion or self-reflection to clarify your own thoughts about the key points in each chapter.

    • On the Web

      A brief list of websites that provides further support, examples, and insights is found at the end of each chapter.

    Ancillaries

    Two websites created especially for Modern Classroom Assessment provide all sorts of free resources to help both teachers and students:

    Instructor Teaching Site

    A password-protected site, available at www.sagepub.com/frey, features resources that have been designed to help instructors plan and teach their courses. These resources include an extensive test bank, chapter-specific PowerPoint presentations, lecture notes, class activities, sample syllabi for semester and quarter courses, and links to SAGE journal articles with accompanying review questions.

    Student Study Site

    An open-access study site is available at www.sagepub.com/frey. This site includes eFlashcards, web quizzes, web resources, additional rubrics, and links to SAGE journal articles.

    Acknowledgments

    A critical contribution to Modern Classroom Assessment was made by almost two dozen reviewers. To these college professors, experts in classroom assessment, and top teachers, I give my heartfelt thanks for their careful thought, evaluation, and suggestions. Without their input, this book would, quite honestly, not be very good.

    I'd especially like to single out among this group my colleague and friend Professor Robert Harrington, whose close consideration of a few key chapters was especially instructive. Thank you, Bob, for your yeoman's work!

    A team of SAGE editors provided strong guidance and support during the development of Modern Classroom Assessment. It began with acquisitions editor Diane McDaniel, who thought this book sounded like a good idea. Then Megan Krattli took the ball and ran with it. Theresa Accomazzo finished up, under the guidance of Reid Hester. These four folks are very good at their jobs and have been a pleasure to work with. Thanks, SAGE gang!

    Neil Salkind, prolific author and well-respected goofball, helped make this book happen and continues as my friend and guide. Two research assistants helped with some important components of this text. Stephani Howarter and Zach Conrad did all I asked and did it on time. They are both very smart. Thank you.

    I'd like to acknowledge the support of my wife, Dr. Bonnie Johnson. As always, I'd have accomplished little in life without her.

    Finally, the author and SAGE would like to acknowledge the contributions of the following reviewers:

    William Boone, Miami University

    Betsy Botts, University of West Florida

    Maureen Davin, Bethune Cookman College

    Cheryl Van De Mark, University of Central Florida

    Debra Dirksen, Western New Mexico University

    Carolyn Doolittle, Baker University

    Karen Eifler, University of Portland

    Robert Ferrera, Notre Dame de Namur University

    James Gasparino, Florida Gulf Coast University

    Marva Gavins, University of Houston–Clear Lake

    Ramona Hall, Cameron University

    Martha Jane Harris, Texas A & M University–Texarkana

    Susan Hibbard, Florida Gulf Coast University

    Adria Karle, Florida International University

    Patricia Lutz, Kutztown University

    Kathleen Makuch, Wilkes University

    Elda E. Martinez, University of The Incarnate Word

    Saramma Mathew, Troy University

    Nelson Maylone, Eastern Michigan University

    David McMullen, Bradley University

    Gayle Mindes, DePaul University

    Cindi Nixon, Francis Marion University

    Judith Presley, Tennessee State University

    Germaine Taggart, Fort Hays State University

    Jahnette Wilson, University of Houston

    Eunmi Yang, Stonehill College

    About the Author

    Bruce B. Frey, PhD, is an award-winning teacher and scholar at The University of Kansas. His areas of research include classroom assessment and instrument development. Dr. Frey is the author of the popular introductory statistics book Statistics Hacks and the co-editor of the Encyclopedia of Research Design. In his spare time, he collects comic books and is especially fond of 1960s DC stories wherein super-pets turn against their superhero masters.

  • Glossary

    Accommodations:

    Physical and procedural changes in testing conditions, such as a separate room, lighting changes, more time, and so on.

    Analytic Approach:

    A scoring method that evaluates each of the pieces or steps of a product or performance.

    Authentic Assessment:

    Assessment that includes tasks, content, expectations, and evaluation methods similar to those that are valued outside of the classroom.

    Classroom Assessment:

    The systematic collection of information about students designed, administered, and scored by teachers or students.

    Coefficient Alpha:

    A number generally ranging from 0 to 1 indicating the level of internal reliability for a group of test items. The closer to 1, the higher the reliability.
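    As a sketch of the arithmetic behind this definition, coefficient alpha (Cronbach's alpha) can be computed from a students-by-items score matrix using the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and tiny data set below are hypothetical, for illustration only; in practice a statistics package would do this work.

```python
import statistics

def coefficient_alpha(item_scores):
    # item_scores: one row per student, one column per item.
    k = len(item_scores[0])  # number of items
    # Variance of each item's scores across students.
    item_vars = [statistics.pvariance([row[i] for row in item_scores])
                 for i in range(k)]
    # Variance of students' total scores.
    total_var = statistics.pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Three students, two dichotomously scored items.
alpha = coefficient_alpha([[1, 1], [1, 0], [0, 0]])  # 2/3, about 0.67
```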

    Construct:

    The invisible trait that one wishes to assess. Pronounced CON-struct. In the classroom, constructs are typically knowledge, understanding, skills, attitudes, traits, and so on. In the broader world of educational and psychological measurement, constructs include things like intelligence, depression, learning disabilities, aptitude, and personality.

    Construct Validity:

    The broadest category of validity. A construct validity argument is that performance on the assessment reflects the underlying knowledge, skill, or trait that one intends to measure.

    Constructed-Response Items:

    Assessment tasks that ask students to create a complex written answer or a complex, frequently creative, product.

    Content Validity:

    A content validity argument is that the items on a test are a fair and representative sample of the items that could or should be on the test. For example, teachers may have a well-defined set of instructional objectives that an assessment should cover.

    Content Validity Ratio:

    A number ranging from 0 to 1 that indicates the extent to which an item is essential when covering a particular topic.

    Criterion-Based Validity:

    A criterion validity argument is that performance on an assessment is related to scores on another assessment in a way that makes sense.

    Criterion-Referenced:

    An approach to score interpretation that judges performance against some criterion (such as instructional objectives, percentage of points possible, and so on).

    Criterion Validity:

    A type of validity argument that provides evidence that the scores on one test correlate with the scores on some other measure.

    Cut Scores:

    Specific scores that define categories of performance.

    Dichotomous Scoring:

    A scoring system with only two possible scores.

    Difficulty Index:

    The proportion of students who answered a question correctly.

    Distribution:

    A set of scores and their associated frequencies.

    Effect Sizes:

    Numbers that represent the strength of relationships between variables. Effect sizes are used in educational research to judge, for example, the effectiveness of an instructional approach.

    Feedback Intervention Theory:

    A theory suggesting that formative assessment feedback is most effective when it is narrowly focused on specific tasks and behaviors related to success and least effective when it is broad (such as “Good work!”).

    Formative Assessment:

    Feedback produced while learning is occurring and concepts and knowledge bases are still being developed. It allows students and teachers to modify their behaviors and understanding before instruction has ended.

    Grade:

    Categories of quality or performance placed in some meaningful order.

    Grading Scale:

    A set of rules for assigning letter grades based on points or performance.

    Internal Reliability:

    Consistency in scores within the various items on a test.

    Inter-rater Reliability:

    Consistency in scores between two different scorers or raters.

    Interval Level:

    A level of measurement with equal intervals in meaning between any two adjacent scores.

    Item Difficulty Index:

    The proportion of students getting a question right. Technically, it is the average score on a single item, calculated by dividing the number of students who answered the item correctly by the total number of students who took the assessment.
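    Because the index is just the average of dichotomous (1 = correct, 0 = incorrect) item scores, the calculation is simple. The helper below is a hypothetical sketch of that arithmetic:

```python
def difficulty_index(item_responses):
    # Proportion of students answering the item correctly,
    # i.e., the mean of the 0/1 scores for that item.
    return sum(item_responses) / len(item_responses)

# Four students answered; three got the item right.
p = difficulty_index([1, 1, 0, 1])  # 0.75
```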

    Item Discrimination Index:

    A number indicating how well a single item discriminates between high scorers on a test and low scorers.

    Item Score:

    The number of points a student received for a single question or assessment task.

    Level of Measurement:

    The amount of information provided by a given scoring format. There are four levels: nominal, where numbers are used only as names for categories, is the least informative; ratio, where scores represent equally spaced quantities and there are no possible scores below 0, is the most informative.

    Mean:

    The arithmetic average. Calculated by adding all the scores together and dividing by the number of scores in the distribution.

    Median:

    The score right in the middle of a distribution. 50% of scores are greater; 50% are lesser.

    Mode:

    The most commonly occurring score in a distribution.
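    The three measures of central tendency just defined (mean, median, and mode) can be computed directly with Python's standard statistics module, shown here for a small hypothetical score distribution:

```python
import statistics

scores = [78, 85, 85, 90, 62]

mean = statistics.mean(scores)      # 80.0, the arithmetic average
median = statistics.median(scores)  # 85, the middle score when sorted
mode = statistics.mode(scores)      # 85, the most frequent score
```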

    Modifications:

    Changes made to a test for an individual student in order to increase validity for that student. For example, a different version might be used, or there may be different directions.

    Nominal Level:

    A level of measurement with numbers being used only as names or labels, not as quantities.

    Norm-Referenced:

    An approach to understanding scores by comparing scores with each other. The information in a score comes from referencing what is normal.

    Normal Curve:

    A very common shape of the distribution of scores. If one graphs a moderate number of scores from almost any assessment with the scores in order along the X axis and the frequency of the scores placed along the Y axis, then the distribution tends to be symmetrical around the mean with most scores close to the mean and very few scores far from the mean.

    Number Correct:

    A common scoring system where students get a point for each correct answer.

    Objective Scoring:

    A scoring system where no judgment is involved in assigning a score. If a computer can do the scoring, it is an objective scoring system (e.g., multiple-choice tests).

    Ordinal Level:

    A level of measurement where numbers are used to show some ranking (such as listing students in order of their performance) but there is not an expectation that the intervals between ranks are equal.

    Percent Correct:

    A common scoring system that indicates the percentage of points possible that a student received. Most commonly it is the percentage of questions answered correctly.

    Percentile Rank:

    The percentage of students scoring at or below a given score.
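    The percentile rank calculation follows directly from this definition. The helper below is a hypothetical sketch: count the scores at or below the given score and express that count as a percentage of all scores.

```python
def percentile_rank(score, scores):
    # Percentage of scores in the distribution at or below the given score.
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

# Four of five scores are at or below 85.
percentile_rank(85, [62, 78, 85, 85, 90])  # 80.0
```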

    Performance-Based Assessment:

    An approach to assessment that requires students to perform or produce something for evaluation. It is most commonly used to assess a skill or ability.

    Primary Trait Approach:

    A common approach to scoring performance-based assessment that involves the identification of a few major constructs or traits and then judging the level of each.

    Range:

    In a distribution, the distance between the highest score and the lowest score.

    Ratio Level:

    A level of measurement that includes the characteristics of interval level measurement, with the extra requirement that there is a “true zero”; one can literally have none of the trait of interest. No negative numbers are used in ratio scoring.

    Raw Scores:

    The actual scores that students receive on a test. They have not been altered or standardized.

    Reliability:

    Consistency and precision in scores. Scores that are very close to what a student would typically receive on a given test are reliable.

    Scoring Rubrics:

    A written set of scoring rules, often in the form of a table. They provide guidance for the assignment of scores.

    Selection Item:

    An item format where the answer is provided to students and they must select it or indicate it (such as multiple-choice or matching items).

    Self-Directed Learners:

    Students who are self-managing, self-monitoring, and self-modifying.

    Standardized Score:

    A score that has been modified from a raw score using known, standardized rules. Usually, standardized scores provide information on where a student performed in terms of standard deviations above or below the mean.

    Standardized Test:

    A test that is administered in a standard way. Sometimes the term is reserved only for large-scale “official” high-stakes tests that produce standardized scores.

    Stanines:

    Areas under a normal curve that has been sliced into nine convenient roughly equal levels of performance. The term is short for standard nines.

    Subjective Scoring:

    Scoring systems that require some human judgment.

    Subscale Scores:

    Scores of groups of items within a larger assessment that are focused on a single domain, skill, or trait.

    Summative Assessment:

    An assessment approach with the goal of summarizing student performance at the end of a period of instruction. Grades are usually assigned based on summative assessments.

    Supply Item:

    An item format where the correct answer is not provided; students must supply it.

    T Score:

    A standardized score with a mean of 50 and a standard deviation of 10.

    Table of Specifications:

    Typically, a matrix with columns and rows that provides guidance as to the nature of the items which should appear on a test in terms of content, for example, or level of Bloom's Taxonomy. These tables form the blueprint for an assessment.

    Test-Retest Reliability:

    Consistency in scores across time; scores are stable from one administration of a test to the next.

    Traditional Paper-and-Pencil Assessment:

    A very popular, efficient, objectively scored approach to assessment (such as multiple-choice questions, matching, true-false, fill-in-the-blank, and some short answer formats).

    Universal Design:

    The design of products and environments to be usable in a meaningful and similar way by all people.

    Universal Test Design:

    An approach to test design that emphasizes accessibility and fairness for all children, regardless of gender, first language, ethnicity, or disability.

    Validity:

    The characteristic of an assessment that measures what it is supposed to measure. Here, supposed means both that the assessment measures what you assume it does and that it measures what it is intended to measure.

    Z Score:

    A standardized score that transforms a raw score by subtracting the mean from it and then dividing by the standard deviation of the score's distribution. Z scores have a mean of 0 and a standard deviation of 1.
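    As a sketch of the arithmetic, the z-score transformation, together with the related T score defined earlier, can be written as follows. The function names and values are hypothetical illustrations of the formulas.

```python
def z_score(raw, mean, sd):
    # Distance of a raw score from the mean, in standard deviation units.
    return (raw - mean) / sd

def t_score(z):
    # T score: a z score rescaled to a mean of 50 and standard deviation of 10.
    return 50 + 10 * z

z = z_score(85, mean=80, sd=5)  # 1.0, one standard deviation above the mean
t = t_score(z)                  # 60.0
```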

