
Dzmitry Sukhavarau, 123RF
Is Hadoop the new HPC?
Where Worlds Collide
Apache Hadoop [1] has been generating a lot of headlines lately. For readers new to it, Hadoop is an open source project that provides a distributed filesystem and a MapReduce framework for processing massive amounts of data. It runs primarily on clusters of commodity servers. File sizes can easily be in the petabyte range, and processing them can involve hundreds or thousands of compute servers.
Hadoop also has many components that live on top of the core Hadoop Distributed File System (HDFS) and MapReduce mechanism. Interestingly, high-performance computing (HPC) and Hadoop clusters share some features, but how much crossover you will see between the two disciplines depends on the application. Hadoop's strengths lie in the sheer size of data it can process, its high redundancy, and its tolerance of node failures without halting user jobs.
Many organizations use Hadoop on a daily basis, including Yahoo!, Facebook, American Airlines, eBay, and others. Hadoop is designed to allow users to manipulate large unstructured or unrelated data sets. It is not intended to be a replacement for a relational database management system (RDBMS). For example, Hadoop can be used to scan weblogs, online transaction data, or web content, all of which are growing each year.
MapReduce
To many HPC users, MapReduce is a methodology used by Google to process large amounts of web data. Indeed, the now famous Google MapReduce paper [2] was the inspiration for Hadoop.
The MapReduce idea is quite simple and, when used in parallel, can provide extremely powerful search and compute capabilities. Two major steps constitute the MapReduce process. As the name suggests, they are the "Map" step followed by a "Reduce" step. Some people are surprised to learn that mapping is done all ...
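To make the two steps concrete, here is a minimal single-process sketch in plain Python that uses a word count, the customary illustration of the idea. It does not use the Hadoop API; the function names (map_step, shuffle, reduce_step, map_reduce) and the sample log lines are hypothetical and chosen only for illustration. A real framework would run the map calls on many servers and handle the grouping between the steps itself.

```python
from collections import defaultdict


def map_step(line):
    """Map: emit one (word, 1) pair for every word in a line."""
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    """Group intermediate values by key, standing in for the work a
    framework does between the map and reduce steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_step(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def map_reduce(lines):
    # Each line is mapped independently of the others, which is the
    # property that allows the map work to be spread across servers.
    mapped = [pair for line in lines for pair in map_step(line)]
    grouped = shuffle(mapped)
    return dict(reduce_step(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical weblog-style input, invented for this sketch.
    sample = ["GET /index.html", "GET /about.html", "POST /login"]
    print(map_reduce(sample))
```

Because map_step looks at one line at a time and reduce_step looks at one key at a time, neither function needs to know about the rest of the data, which is why the pattern parallelizes so readily.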
