Who Creates Datasets, and How?

Overview

While we may not consciously think about the information we use to make decisions, the information we gather throughout our daily lives guides our choices, no matter how big or small the question. This information can easily be turned into data: distinct pieces of information that have been formatted in a certain way. Data can be collected in many ways, such as observation, direct conversation with subjects, and anonymous large-scale surveys, to weave together a holistic record of our world. Data can be mentally formatted (lists we keep in our minds) or physically formatted (tally marks on a piece of paper or numbers in a spreadsheet). The ways in which we gather or record this information can determine the success or consistency of our choices.

We can also gather information as the basis for analyses that answer even bigger questions, such as those that guide research proposals or projects.

Data may be everywhere but, like oil, it is a resource that is most useful once it has been refined. Many different people and organizations take the raw data that exists in the world and package it into manageable, well-documented bundles that are easy to use. This section will guide you through the theory of collecting data, the ways individuals and organizations work to build datasets, and the motivations for bringing that information together.

A picture shows the surface of the Earth as seen from outer space.

Source: Photo by NASA on Unsplash.

If you are reading this, you have likely worked with a dataset before. However, while many people have used a curated dataset to complete an analysis, not everyone is familiar with the process of collecting data: the steps we as data creators can follow to make the process a success, how to hone the focus of the data, and how to decide what kind of data is even needed. This lesson will go through each of these steps and discuss how the recipe for a dataset may change based on the institution gathering the information.

This Skill will help you start from the very beginning, with the decisions and processes that go into generating data, and will describe recommended practices for using, managing, storing, and sharing that information.

Suggested Readings
McKenney, N. R., & Bennett, C. E. (1994). Issues regarding data on race and ethnicity: The Census Bureau experience. Public Health Reports, 109(1), 16–25.
Morrow, J. (2021). Be data literate: The data literacy skills everyone needs to succeed. Kogan Page Limited.
Olsen, W. (2012). Data collection: Key debates and methods in social research. Sage Publications.
Roberts, L. D. (2015). Ethical issues in conducting qualitative research in online communities. Qualitative Research in Psychology, 12(3), 314–325.