MapReduce - Distribution and Reliability

Distribution and Reliability

MapReduce achieves reliability by parceling out operations on the data set to each node in the network. Each node is expected to report back periodically with completed work and status updates. If a node falls silent for longer than that interval, the master node (similar to the master server in the Google File System) records the node as dead and reassigns that node's work to other nodes. Individual operations name their output files with atomic operations, which guards against conflicting parallel threads producing the same output. When files are renamed, they can also be copied to another name in addition to the name of the task (allowing for side effects).
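The failure handling and output naming described above can be sketched in a few lines of Python. This is only an illustrative toy, not the code of any real MapReduce implementation: the `Master` class, its method names, the explicit `now` timestamps, and the `commit_output` helper are all assumptions made for the example. The only library behavior relied on is that `os.replace` renames a file atomically within one filesystem.

```python
import os
import tempfile


class Master:
    """Toy master that tracks worker heartbeats (all names are hypothetical)."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence before a worker is declared dead
        self.last_seen = {}      # worker id -> time of its last heartbeat
        self.assigned = {}       # worker id -> set of task ids still in progress
        self.pending = []        # tasks waiting to be handed to another worker

    def assign(self, worker, task, now):
        """Record that a task was handed to a worker."""
        self.last_seen.setdefault(worker, now)
        self.assigned.setdefault(worker, set()).add(task)

    def heartbeat(self, worker, now, completed=()):
        """Process a status report: refresh the timestamp, drop finished tasks."""
        self.last_seen[worker] = now
        self.assigned.setdefault(worker, set()).difference_update(completed)

    def reap(self, now):
        """Declare silent workers dead and queue their unfinished tasks."""
        dead = [w for w, seen in self.last_seen.items() if now - seen > self.timeout]
        for worker in dead:
            self.pending.extend(sorted(self.assigned.pop(worker, set())))
            del self.last_seen[worker]
        return dead


def commit_output(final_path, data):
    """Write to a temporary file, then atomically rename it to the final name.

    If two executions of the same task both commit, the final file holds the
    complete output of exactly one of them, never a mixture.
    """
    directory = os.path.dirname(os.path.abspath(final_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(data)
    os.replace(tmp_path, final_path)  # atomic within a single filesystem
```

A real master would read the clock itself and run the reaping step on a timer; passing `now` explicitly just keeps the sketch deterministic and easy to test.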

The reduce operations operate much the same way. Because they lend themselves less well to parallel execution, the master node attempts to schedule each reduce operation on the node that holds the data being operated on, or on a node in the same rack. This is desirable because it conserves bandwidth across the backbone network of the datacenter.
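The placement preference described above (same node first, then same rack, then anywhere) can be illustrated with a small Python sketch. The function name `pick_worker`, the `rack_of` mapping, and the three-level distance are assumptions chosen for the example and do not reflect any particular scheduler.

```python
def pick_worker(data_node, idle_workers, rack_of):
    """Return the idle worker closest to the data, or None if none are idle.

    data_node    -- id of the node that holds the input data
    idle_workers -- list of worker ids currently free to take a task
    rack_of      -- dict mapping every node id to its rack id (hypothetical)
    """
    def distance(worker):
        if worker == data_node:
            return 0  # data is local: no network transfer needed
        if rack_of[worker] == rack_of[data_node]:
            return 1  # same rack: traffic stays off the backbone
        return 2      # different rack: traffic crosses the backbone

    return min(idle_workers, key=distance, default=None)
```

For example, with `rack_of = {"n1": "r1", "n2": "r1", "n3": "r2"}` and data on `n1`, the call `pick_worker("n1", ["n3", "n2"], rack_of)` chooses `n2`, because it shares a rack with the data while `n3` does not.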

Implementations are not necessarily highly reliable. For example, in Hadoop versions that predate the high-availability NameNode, the NameNode is a single point of failure for the distributed filesystem.
