Showing posts with label Hadoop. Show all posts

Sunday, December 8, 2013

Simple MapReduce Program for Hadoop

Overview

The previous post went through the basics of big data processing and Hadoop. This post describes the basic structure of a MapReduce program, using the example of counting words in a large number of books. I have configured a two-node cluster with a distributed file system. The program uses three classes:
  1. Mapper class
  2. Reducer class
  3. Main word count class
Assume I have 10 documents containing only the words “hello” and “world”. The documents will be shared between the two nodes. How many documents each node receives is decided by the Hadoop framework.
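
To make the three roles concrete, here is a minimal sketch of the word count flow in plain Java. It does not use the Hadoop API: the class name `WordCountSketch` and its `map`, `reduce` and `run` methods are hypothetical names chosen for illustration, and the grouping step inside `run` only imitates what the framework does between the map and reduce phases. The 10 sample documents are also an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a single-process imitation of map, shuffle and reduce.
// None of these names come from Hadoop itself.
public class WordCountSketch {

    // Mapper role: emit a (word, 1) pair for every word in one document.
    static List<Map.Entry<String, Integer>> map(String document) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : document.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word.toLowerCase(), 1));
            }
        }
        return pairs;
    }

    // Reducer role: sum all the counts emitted for a single word.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Main class role: map every document, group the pairs by word
    // (the framework's shuffle step), then reduce each group.
    static Map<String, Integer> run(List<String> documents) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String document : documents) {
            for (Map.Entry<String, Integer> pair : map(document)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            totals.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return totals;
    }

    public static void main(String[] args) {
        // Assumed sample input: 10 documents containing only "hello" and "world".
        List<String> documents = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            documents.add("hello world hello");
        }
        System.out.println(run(documents)); // prints {hello=20, world=10}
    }
}
```

On a real cluster the map calls would run on whichever node holds each document, and the framework would move the emitted pairs across the network so that all counts for one word reach the same reducer.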

Thursday, November 28, 2013

Big Data processing with Apache Hadoop

Overview

Earlier, software applications were developed to run on a single computer. Some examples are calculators, word processing packages and drawing applications. With the introduction of client-server architecture, a vast number of software systems were developed with databases. Most web applications and business systems are built on databases, and concurrency and transaction handling are some of the terms introduced with this architecture. Now the world has moved into a new computing era built on the concepts of high performance computing.