Samples are often collected in such a way that the exact value of one or more cases is unknown. Such missing information is referred to as censored data. In one source of such censored data, values are known only to exceed or fall below some limit. Often, for example, in a study based on the survival times of laboratory animals, a protocol requires that all data be collected within a specified period of time. For an appreciably sized subset of subjects, an exact survival time may not be known, simply because the study ends before the event of interest (such as the animal's death) could be observed. Because the exact survival times for those who lived longer than the length of the study are not known, this illustrates the generation of right-censored data. Another common example occurs when the age of study participants is recorded in exact years for most subjects, but the ages of those younger than 18 are all recorded as less than 18 years. Because the exact ages of persons younger than 18 years are not known, this is an example of left-censored data. In general, when some, but not all, values are recorded on a continuous scale of measurement, special techniques are required that differ from many of the common procedures based on what is called maximum likelihood (ML) estimation.
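The two kinds of censoring described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the survival times, the study length of 730 days, and the ages are all hypothetical and are not taken from any study discussed here.

```python
def right_censor(times, study_end):
    """Record each survival time as (value, exact).

    A time beyond study_end is recorded as study_end with exact=False,
    because all that is known is that the true time exceeds the limit.
    """
    return [(min(t, study_end), t <= study_end) for t in times]


def left_censor(values, limit):
    """Record each value as (value, exact).

    A value below limit is recorded as limit with exact=False,
    because all that is known is that the true value falls below the limit.
    """
    return [(max(v, limit), v >= limit) for v in values]


# Hypothetical true survival times in days; the study stops at day 730.
survival = right_censor([120, 450, 800, 910], study_end=730)

# Hypothetical ages; anyone younger than 18 is recorded only as "< 18".
ages = left_censor([12, 17, 18, 25, 40], limit=18)

print(survival)
print(ages)
```

Each record pairs the recorded value with a flag saying whether that value is exact, which is the minimum information a censoring-aware method needs.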

Work with censored data is facilitated by a notational convention for observed values called order statistics. The sample median is an example of an order statistic. Unlike the mean of a size $n$ sample, which is represented by placing a bar above a letter, as in $\bar{X}$, the subscripted symbol $X_{((n+1)/2)}$, when $n$ is an odd number, or $[X_{(n/2)} + X_{(n/2+1)}]/2$, when $n$ is an even number, designates the median. More generally, by calling on parenthesized subscripts to denote ascending variate value magnitudes, the order of variate values can be designated. For example, the smallest and largest values are denoted respectively as $X_{(1)}$ and $X_{(n)}$. (In case of a tied value, the counterpart of a coin toss distinguishes between variate designations, say, between $X_{(i)}$ and $X_{(i+1)}$.)
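As a rough illustration of this notation, the following Python sketch computes the median from the sorted sample and assigns ranks with random tie-breaking. The function names and the `seed` parameter are assumptions made for the example; the random key is just one way to mimic the coin toss for tied values.

```python
import random


def median_from_order_statistics(sample):
    """Return the median using the odd/even order-statistic rule."""
    xs = sorted(sample)            # xs[i - 1] plays the role of X_(i)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must not be empty")
    if n % 2 == 1:
        return xs[(n + 1) // 2 - 1]               # X_((n+1)/2)
    return (xs[n // 2 - 1] + xs[n // 2]) / 2      # [X_(n/2) + X_(n/2+1)] / 2


def ranks_with_random_ties(sample, seed=None):
    """Return the rank (1 = smallest) of each observation.

    Tied values are ordered by a random secondary key, so each tied
    observation still receives a distinct rank.
    """
    rng = random.Random(seed)
    order = sorted(range(len(sample)), key=lambda i: (sample[i], rng.random()))
    ranks = [0] * len(sample)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


print(median_from_order_statistics([5, 1, 3]))       # odd n
print(median_from_order_statistics([4, 1, 3, 2]))    # even n
print(ranks_with_random_ties([10, 30, 20]))
```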

Censored and Incomplete Ordered Measurements

The value placed within some ith order statistic's parenthesized subscript denotes a quantity called a rank. Nonparametric inferential methods that are based on ranks often help trade off statistical power and/or efficiency, on the one hand, to gain robustness, in other words, insensitivity to departures from assumptions, on the other. In the context of the many applications that make use of censored samples, order statistic-based methods have been developed that are both robust and efficient. This is accomplished by statistical approaches, such as the well-known Kaplan-Meier (KM) estimator, that differ from ML procedures.
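For readers unfamiliar with the KM estimator, a minimal sketch of its usual product-limit form may help: at each distinct event time, the running survival estimate is multiplied by one minus the number of events divided by the number of subjects still at risk. The function name, the input layout of (time, event_observed) pairs, and the toy data below are assumptions for illustration only.

```python
def kaplan_meier(observations):
    """Product-limit survival estimate from right-censored data.

    observations: iterable of (time, event_observed) pairs, where
    event_observed is False for a right-censored time.
    Returns a list of (time, survival) at each distinct event time.
    """
    obs = sorted(observations)     # ordering the times is the order-statistic step
    at_risk = len(obs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(obs):
        t = obs[i][0]
        events = 0
        leaving = 0
        # Gather every subject whose recorded time equals t.
        while i < len(obs) and obs[i][0] == t:
            if obs[i][1]:
                events += 1
            leaving += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical toy data: the subject recorded at time 2 is right-censored.
toy = [(1, True), (2, False), (3, True), (4, True)]
for time, s in kaplan_meier(toy):
    print(time, round(s, 3))
```

Note that the censored subject contributes no drop in the curve but does reduce the number at risk for later event times, which is how the incomplete observation still informs the estimate.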

The title of Kaplan and Meier's classic paper, ‘Nonparametric Estimation From Incomplete Observations,’ refers to incomplete observations, not censored observations. This distinction is central to an understanding of the epidemiological roles that order statistics often play and can be illustrated by a common laboratory experiment. In toxicological studies, measurements are often entered in a three-column spreadsheet as shown in Figure 1. All but the last entry (across from Day 734 in Column A) record the dates when one or more animals die. The last entries record when animals are sacrificed to determine if the targeted tumor is present at the end of the study's data-gathering stage, Xs. In Column B, across from days-survived column entries, the daily number of natural or sacrifice deaths where, at autopsy, the targeted tumor is found is recorded, here as 7. Column C records the number of deaths that day deemed to be non-toxic-substance attributable (because at autopsy there is no trace left of the targeted tumor), here as 14.

...
