The Apache Software Foundation is a decentralized community of developers. Apache Hadoop is open-source software developed by Apache that provides an inexpensive way to analyze huge amounts of information, or “big data” sets. Other software allows similar analysis, but Apache Hadoop is the most widely used representative of this class of software. This class of software is important because of its connection to emerging questions about big data, whose potential uses and misuses are already evident and will grow in significance as mass data sets come into wider use.

Hadoop implements a framework for distributed parallel processing of data, best suited to large amounts of unstructured, unmodeled data (that is, big data). The term distributed refers to the ...
