
Data cleaning, also called data cleansing or data scrubbing, is the process of improving the quality of data by correcting or removing inaccurate records in a record set. The term specifically refers to detecting and then modifying, replacing, or deleting incomplete, incorrect, improperly formatted, duplicated, or irrelevant records, otherwise referred to as “dirty data,” within a database.
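As a rough illustration of the detect-then-modify-or-delete process described above, the following minimal Python sketch cleans a small record set. The field names (`name`, `email`), the sample records, and the function `clean_records` are hypothetical and chosen only for this example. The rules it applies (trim whitespace, drop records with an empty required field, treat records with the same lowercased email as duplicates) are assumptions, not a standard procedure.

```python
def clean_records(records, required=("name", "email"), key="email"):
    """Return a cleaned copy of `records`, a list of dicts.

    Assumes `key` is one of the `required` fields.
    """
    cleaned = []
    seen = set()
    for record in records:
        # Improperly formatted values: trim stray whitespace from strings.
        row = {k: v.strip() if isinstance(v, str) else v
               for k, v in record.items()}
        # Incomplete records: delete any row with a missing or empty required field.
        if any(not row.get(field) for field in required):
            continue
        # Normalize case so the same key typed differently still matches.
        row[key] = str(row[key]).lower()
        # Duplicated records: keep only the first row seen for each key value.
        if row[key] in seen:
            continue
        seen.add(row[key])
        cleaned.append(row)
    return cleaned


# Hypothetical "dirty" input: inconsistent formatting, a duplicate, a missing value.
raw = [
    {"name": " Ada Lovelace ", "email": "ADA@example.org"},
    {"name": "Ada Lovelace", "email": "ada@example.org "},
    {"name": "Grace Hopper", "email": ""},
    {"name": "Alan Turing", "email": "alan@example.org"},
]
result = clean_records(raw)
for row in result:
    print(row)
```

In this sketch, incomplete records are deleted rather than repaired. Whether to delete, replace, or modify a dirty record depends on the data set and the research question.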

Data provided for communication research often rely on manual data entry and are therefore subject to human error, so the data require cleaning. The need for such cleaning increases when data come from multiple sources that do not share a standard schema. The goal of data cleaning is to provide a data set that is ...
