Standardized Score

It is sometimes important to compare scores that are measured in different units, or to compare scores within a sample or a population. One common example is that students typically want to understand how their test score compares with the scores of the rest of the class. Or they might want to understand how their scores compare across different classes, each of which could be scored using a somewhat different method. In either case, the challenge is somewhat like attempting to compare apples with oranges. A standardized score is calculated on an arbitrary (but universal) scale, which has the effect of turning the apples and oranges into pears: scores that can be more easily evaluated and interpreted. In other words, a raw score can be converted from one measurement scale to another to facilitate data comparison. This entry discusses comparative and standardization methods.

Comparative Methods

Several different methods are used to create scores that are comparable with one another; however, some of the more familiar methods can have limited comparative accuracy. For example, suppose 23 students are each taking mathematics, history, and psychology midterm exams. One student, Ben, scores a 53 in math, 77 in history, and 88 in psychology. Comparing these scores, it seems to Ben that he is not very good at math. But these are raw scores, which tell us only the number of questions that were correctly answered on each examination. Without a frame of reference, it is not possible to determine Ben's standing in each class relative to the other students or to understand the relationship of his performance across his different classes. One way to understand Ben's performance compared with the other students is to use the range of examination scores within each class. In this example, the scores of all the students who took the math examination ranged from a low of 48 to a high of 57. This means Ben's score was just above the middle of that range, which might indicate his score is close to average when compared with the rest of the class. The history exam scores ranged from 75 to 99, so Ben's score was nearly at the bottom of that range, seeming to be a very poor score compared with the other students. The psychology scores ranged from 86 to 90, which puts Ben's score precisely in the middle of the range of his classmates' scores.
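
The range comparison above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not a method prescribed by this entry: the function name position_in_range and the 0-to-1 scale are assumptions introduced here, while the scores and ranges are the ones given in the example.

```python
def position_in_range(score, low, high):
    """Return where a score falls between the lowest and highest
    class scores, as a fraction from 0 (bottom) to 1 (top).

    Hypothetical helper for illustration only.
    """
    if high == low:
        raise ValueError("the range has zero width")
    return (score - low) / (high - low)


# (Ben's raw score, lowest class score, highest class score)
exams = {
    "math": (53, 48, 57),
    "history": (77, 75, 99),
    "psychology": (88, 86, 90),
}

positions = {name: position_in_range(*values) for name, values in exams.items()}

for name, pos in positions.items():
    print(f"{name}: {pos:.2f}")
# math: 0.56
# history: 0.08
# psychology: 0.50
```

A value near 0.5 means the score sits near the midpoint of the range. Note that this uses only the two extreme scores in each class and says nothing about how the other students' scores are distributed between them.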

A somewhat more useful way of calculating scores for comparative purposes would be to calculate the percentage scores for each examination (i.e., number of correct answers divided by total possible correct answers, then multiplied by 100). This type of calculation might provide more equitable scores for a comparison within and between classes. Because the percentage scores (not to be confused with percentile scores) are dependent on the total number of questions on the examination, converting Ben's raw score of 53 on his math examination to a percentage might show that his score compares differently both within and between his classes than when using simple range values for comparison. Some might immediately think that calculating percentages solves the comparison problem. It does not necessarily do so and, most likely, it is not adequate for accurately understanding Ben's performance within each class or his overall performance across his classes. Ben's math exam had a total of 150 questions, so his raw score of 53 converts to 35%. His history test had 100 questions, so his raw score of 77 remains at 77%. His psychology test had 90 questions, turning his raw score of 88 into 98%. When comparing scores calculated as a percentage of the total instead of comparing raw scores, it seems that Ben's math score is even worse than he originally thought. But Ben's math instructor creates very difficult exams, and on this exam everyone in the class scored from 32% to 38%. Consistent with the comparison of the range of scores, his score of 35% falls precisely in the center of the class percentages, which could mean he is actually an average student. On the history exam, the other students scored from 75% to 99%. Ben's score of 77% is well below average, and it is still consistent with his ranking using the range of scores. But on the psychology exam the other students scored from 92% to 100%, which indicates Ben's score of 98% is close to the top of the class, a fairly substantial increase from the comparison using the range of scores. Because the two comparison methods give different results, it is very difficult to determine which set of comparisons is correct. Furthermore, although the upper and lower ends of either the raw or the percentage scores indicate the spread of the scores on each exam, neither method provides information about the variability of the scores. If only one student scored lower than Ben on the math exam and the rest scored very close to the top of the range, Ben's score would not be average after all. Although his score would be numerically in the middle of both the range and the percentage scores, he would actually have the second-lowest score of all 23 students. And because the range and the percentage scores are determined only within each group, both types of scores are insufficient for comparing Ben's scores across his classes. These simple examples illustrate some of the important pitfalls of these familiar methods of comparison.
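
The percentage calculation and the variability pitfall described above can be sketched as follows. Ben's raw scores and the exam lengths come from the example; the list math_scores is hypothetical data invented here to match the scenario in which one student scores below Ben and everyone else scores near the top of the 48 to 57 range. The names percentage_score and math_scores are likewise assumptions for illustration.

```python
def percentage_score(correct, total):
    """Number of correct answers divided by total questions, times 100."""
    return 100 * correct / total


# (Ben's raw score, total questions on the exam), from the example
ben = {"math": (53, 150), "history": (77, 100), "psychology": (88, 90)}

percentages = {
    name: round(percentage_score(correct, total))
    for name, (correct, total) in ben.items()
}
print(percentages)  # {'math': 35, 'history': 77, 'psychology': 98}

# Hypothetical math class of 23 raw scores (invented for illustration):
# one student at 48, Ben at 53, and the other 21 at 56 or 57.
math_scores = [48, 53] + [56] * 10 + [57] * 11

midpoint = (min(math_scores) + max(math_scores)) / 2
rank_from_bottom = sorted(math_scores).index(53) + 1

print(midpoint)          # 52.5
print(rank_from_bottom)  # 2
```

In this invented class, Ben's 53 lies just above the midpoint of the range, yet only one of the 23 students scored lower. The range and the percentage scores are identical to those in the example, which is why neither can reveal how the scores are spread between the extremes.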
