Cell Suppression

Under certain circumstances, it is considered necessary to withhold or suppress data in certain cells in a published statistical table. This is often done when particular estimates are statistically unreliable or when the information contained could result in public disclosure of confidential identifiable information. Suppression for reasons of statistical reliability involves consideration of sampling error as well as the number of cases upon which the cell estimate is based. Suppression to avoid the disclosure of confidential information in tabular presentations involves many additional considerations.

Cell suppression may involve primary suppression, in which the contents of a sensitive cell are withheld, or, if the value for that cell can be derived from other cells in the same or other tables, secondary (complementary) suppression. In the latter instance, the contents of nonsensitive cells as well as those of the sensitive cells are suppressed. Sensitive cells are identified as those containing fewer than some minimum number of cases. In an establishment survey, for example, a cell size of 2 would be regarded as sensitive because it could reveal to one sample establishment (included in the tabulation and knowing its own contribution to an estimate reported in the table) the value of a variable reported by another establishment known to have participated in the survey. Often, the cell size below which suppression is applied is considerably higher than 2, depending upon such factors as total sample size, sampling ratio, and potential harm to survey participants resulting from disclosure.
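The need for complementary suppression can be illustrated with a minimal sketch. It assumes a table of case counts, a hypothetical threshold of 3 cases (so a cell of size 2 is sensitive), and published row totals; the function names and the data are invented for illustration. When only the sensitive cell is withheld, subtracting the published cells from the row total recovers it exactly.

```python
from typing import Optional

MIN_CELL_COUNT = 3  # hypothetical threshold: cells with fewer cases are sensitive


def primary_suppress(counts: list[list[int]],
                     threshold: int = MIN_CELL_COUNT) -> list[list[Optional[int]]]:
    """Return a copy of the table with sensitive cells replaced by None."""
    return [[c if c >= threshold else None for c in row] for row in counts]


def recover_from_row_totals(published, row_totals):
    """Fill in any cell that is the only suppressed cell in its row."""
    recovered = [row[:] for row in published]
    for i, row in enumerate(recovered):
        missing = [j for j, c in enumerate(row) if c is None]
        if len(missing) == 1:
            # Row total minus the published cells equals the withheld value.
            row[missing[0]] = row_totals[i] - sum(c for c in row if c is not None)
    return recovered


counts = [[12, 2, 30], [25, 14, 9]]        # illustrative data only
row_totals = [sum(row) for row in counts]  # assumed to be published
published = primary_suppress(counts)
recovered = recover_from_row_totals(published, row_totals)
```

Here `published` hides the cell of size 2, yet `recovered` equals the original `counts`, which is why at least one further cell in the same row would also have to be suppressed.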

Once sensitive cells have been identified, there are four options for protecting them from disclosure: (a) restructure the table by collapsing rows or columns until no sensitive cells remain, (b) use cell suppression, (c) apply some other disclosure limitation method, or (d) suppress the entire planned table.
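Option (a) can be sketched as a simple greedy procedure, under the assumptions that rows are ordered categories that may sensibly be merged with a neighbor and that a cell is sensitive when it holds fewer than a hypothetical threshold of 3 cases. The function name, labels, and data are illustrative, and this is only one of many possible collapsing strategies.

```python
def collapse_rows(labels, counts, threshold=3):
    """Greedily merge a row containing a sensitive cell into an adjacent row."""
    labels = list(labels)
    counts = [row[:] for row in counts]
    while len(counts) > 1:
        # Index of the first row that still contains a sensitive cell, if any.
        bad = next((i for i, row in enumerate(counts)
                    if any(c < threshold for c in row)), None)
        if bad is None:
            break
        # Merge with the following row, or the preceding one for the last row.
        other = bad + 1 if bad + 1 < len(counts) else bad - 1
        lo, hi = sorted((bad, other))
        counts[lo] = [a + b for a, b in zip(counts[lo], counts[hi])]
        labels[lo] = f"{labels[lo]} + {labels[hi]}"
        del counts[hi], labels[hi]
    return labels, counts


new_labels, new_counts = collapse_rows(
    ["0-17", "18-44", "45-64", "65+"],
    [[40, 35], [22, 2], [18, 11], [9, 6]],  # illustrative data only
)
```

With these data the "18-44" and "45-64" rows are combined, leaving three rows and no cell below the threshold. If the loop runs down to a single row, sensitive cells may still remain and another option would be needed.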

When primary and complementary suppressions are used in a table, the suppression pattern should be audited to check whether it permits the suppressed cell values to be estimated to within too narrow a range. The suppression pattern should also minimize the amount of data lost, as measured by an appropriate criterion such as the minimum number of suppressed cells or the minimum total value suppressed. If the information loss from cell suppression is too high, it undermines the utility of the data and the ability to make correct inferences from them. Cell suppression also creates missing data in tables in a nonrandom fashion, which harms the utility of the data.
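A simplified audit of a suppression pattern might look like the following sketch. It assumes a two-way table of nonnegative values with published row and column totals, and it derives a range for each suppressed cell from those totals alone. The bounds are valid but not guaranteed to be the tightest possible; a full audit would typically use linear programming. The function name, the minimum acceptable width, and the data are hypothetical.

```python
def audit_bounds(published, row_totals, col_totals):
    """Return {(i, j): (low, high)} for each suppressed (None) cell."""
    rows, cols = range(len(published)), range(len(published[0]))
    hidden = [(i, j) for i in rows for j in cols if published[i][j] is None]
    # Amount in each row and column not accounted for by published cells.
    row_slack = {i: row_totals[i] - sum(c for c in published[i] if c is not None)
                 for i in rows}
    col_slack = {j: col_totals[j] - sum(published[i][j] for i in rows
                                        if published[i][j] is not None)
                 for j in cols}
    # A nonnegative cell cannot exceed the slack of its row or its column.
    upper = {(i, j): min(row_slack[i], col_slack[j]) for i, j in hidden}
    bounds = {}
    for i, j in hidden:
        # Whatever the other hidden cells in the line cannot absorb must be here.
        row_others = sum(upper[c] for c in hidden if c[0] == i and c != (i, j))
        col_others = sum(upper[c] for c in hidden if c[1] == j and c != (i, j))
        lower = max(0, row_slack[i] - row_others, col_slack[j] - col_others)
        bounds[(i, j)] = (lower, upper[(i, j)])
    return bounds


# Illustrative pattern: (0, 1) is the primary suppression, the rest complementary.
published = [[None, None, 30], [None, None, 9], [7, 20, 16]]
bounds = audit_bounds(published, row_totals=[44, 48, 43], col_totals=[44, 36, 55])

MIN_WIDTH = 5  # hypothetical rule: ranges narrower than this are "too close"
too_narrow = [cell for cell, (low, high) in bounds.items() if high - low < MIN_WIDTH]
```

In this example every suppressed cell has a range of width 14, so `too_narrow` is empty and the pattern would pass the hypothetical rule.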

In general, for small tables, it is possible to select cells for complementary suppression manually and to apply audit procedures to verify that the selected cells adequately protect the sensitive cells. However, for large-scale survey publications with many interrelated, higher-dimensional tables, selecting an optimal set of complementary suppression cells is an extremely complex problem. Optimality in cell suppression is achieved by selecting the smallest number of cells to suppress (to decrease information loss) while ensuring that confidential information is protected from disclosure.
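For a very small table, the search for the smallest complementary set can be sketched by brute force. The protection check used here is a deliberately simplified assumption: a pattern is accepted when no row or column contains exactly one suppressed cell. That condition is necessary but not sufficient in general, so a range audit would still be required. The function names are illustrative.

```python
from itertools import combinations


def is_protected(suppressed, n_rows, n_cols):
    """Simplified check: no row or column has exactly one suppressed cell."""
    for i in range(n_rows):
        if sum(1 for (r, _) in suppressed if r == i) == 1:
            return False
    for j in range(n_cols):
        if sum(1 for (_, c) in suppressed if c == j) == 1:
            return False
    return True


def smallest_complement(primary, n_rows, n_cols):
    """Try complementary sets in order of increasing size; return the first that passes."""
    candidates = [(i, j) for i in range(n_rows) for j in range(n_cols)
                  if (i, j) not in primary]
    for k in range(len(candidates) + 1):
        for extra in combinations(candidates, k):
            if is_protected(set(primary) | set(extra), n_rows, n_cols):
                return set(extra)
    return None


primary = {(0, 1)}  # one sensitive cell in a 3 x 3 table
extra = smallest_complement(primary, n_rows=3, n_cols=3)
```

The search examines subsets of the nonsensitive cells, whose number doubles with every additional cell, which gives a sense of why the problem becomes intractable by hand for large, linked tables.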

Stephen J. Blumberg

Further Readings

Gonzalez, J. R., and Cox, L. H. Software for tabular data protection. Statistics in Medicine, 24 (2005).

...
