Overview
In the previous post we went through the basics of big data processing and Hadoop. In this post I describe the basic structure of a MapReduce program, using the example of counting words across a large number of books. I have configured a two-node cluster with a distributed file system (HDFS). We use three classes:
- Mapper class
- Reducer class
- Main word count (driver) class
Assume I have 10 documents containing only the words “hello” and “world”. The documents will be split between the two nodes; how many documents each node processes is decided by the Hadoop framework.
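To make the three phases concrete before looking at the real Hadoop classes, here is a minimal plain-Java sketch that simulates what the framework does: the map step emits a (word, 1) pair per word, the shuffle step groups pairs by key (Hadoop performs this for you between the Mapper and Reducer), and the reduce step sums the counts. The class and method names here are my own illustration, not Hadoop's API; a real job would extend `org.apache.hadoop.mapreduce.Mapper` and `Reducer` instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word in one document.
    static List<Map.Entry<String, Integer>> map(String document) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : document.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group the emitted values by key.
    // In Hadoop this grouping happens inside the framework, between map and reduce.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new HashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the list of counts for each word.
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int value : entry.getValue()) {
                sum += value;
            }
            counts.put(entry.getKey(), sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two small "documents" using only the words from the example above.
        List<String> documents = List.of("hello world hello", "world world");

        // On a cluster, each node would run map() over its own share of documents.
        List<Map.Entry<String, Integer>> allPairs = new ArrayList<>();
        for (String doc : documents) {
            allPairs.addAll(map(doc));
        }

        Map<String, Integer> counts = reduce(shuffle(allPairs));
        System.out.println(counts);
    }
}
```

In a real job, each node runs the map step only over the documents it holds locally, and the reducers receive the already-grouped pairs over the network; the sketch collapses all of that into a single process purely to show the data flow.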