
What is MapReduce? How Does It Work

What is MapReduce? 

We have seen how Hadoop splits data into blocks and stores them in HDFS. The next step is to process that data, and that is where MapReduce comes into the picture.

MapReduce is a programming model for processing large amounts of data in parallel across a cluster.

It works in two phases: the map phase and the reduce phase.

The map phase splits the input and maps it to key-value pairs, whereas the reduce phase aggregates the values for each key.

A few more important steps surround these phases: key-value pair generation (by the InputFormat, via its RecordReader), shuffle, sort, and group-by-key.

 

How MapReduce Works:

Assume we have a file with some content in it, and we want to count how many times each word occurs.

We first create a mapper Java class with a map method. This method receives the data as key-value pairs supplied by the InputFormat (through its RecordReader):

map(k,v) 
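To make the key-value input concrete, here is a minimal plain-Java sketch of the records a line-oriented reader hands to map(). It assumes Hadoop's default TextInputFormat behavior, where the key is the byte offset of a line and the value is the line text. The class LineRecordSketch and the method readRecords are hypothetical helpers written for illustration, not part of Hadoop.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: mimics the (offset, line) records
// that a line-oriented record reader passes to map().
class LineRecordSketch {

    // Returns one entry per line: key = byte offset of the line, value = line text.
    static List<Map.Entry<Long, String>> readRecords(String fileContent) {
        List<Map.Entry<Long, String>> records = new ArrayList<>();
        long offset = 0;
        // Assumption: lines end with "\n" and the content is UTF-8 encoded.
        for (String line : fileContent.split("\n")) {
            records.add(new SimpleEntry<>(offset, line));
            // +1 accounts for the newline byte that split() removed.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        for (Map.Entry<Long, String> r : readRecords("apple ball\nball apple apple\n")) {
            System.out.println("key=" + r.getKey() + " value=" + r.getValue());
        }
        // prints:
        // key=0 value=apple ball
        // key=11 value=ball apple apple
    }
}
```

Each of these entries corresponds to one call of map(k,v), which is why the mapper's input types are a number for the key and text for the value.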

When map receives an input record as a key-value pair, the next step is to split the value into words and emit (word, 1) for every word it finds, once per occurrence.

 

Mapper Code:

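import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MyMapper extends Mapper<LongWritable, Text, Text, IntWritable>
{
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
          // With the default TextInputFormat, key is the byte offset of the line
          // and value is the line itself
          String line = value.toString();
          System.out.println("key= " + key);
          System.out.println("value= " + value);

          // Split the line on single spaces and emit (word, 1) for every word
          String[] words = line.split(" ");
          for (String word : words) {
               context.write(new Text(word), new IntWritable(1));
          }
     }
}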

 

Once the mapper has run, we get many (word, 1) pairs, which are then shuffled, sorted, and grouped by key.

Example mapper output, before shuffling:

apple 1
apple 1
ball 1
ball 1
apple 1
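The shuffle, sort, and group steps can be imitated in memory to see what the reducer will receive. The sketch below is only an illustration under that assumption: ShuffleSketch and groupByKey are hypothetical names, and a TreeMap stands in for the framework's distributed sort and grouping.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the shuffle, sort, and group steps.
class ShuffleSketch {

    // Collects all values that share a key; TreeMap keeps the keys sorted.
    static Map<String, List<Integer>> groupByKey(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        // The mapper output from the example above, in the same order.
        List<Map.Entry<String, Integer>> mapperOutput = new ArrayList<>();
        mapperOutput.add(new SimpleEntry<>("apple", 1));
        mapperOutput.add(new SimpleEntry<>("apple", 1));
        mapperOutput.add(new SimpleEntry<>("ball", 1));
        mapperOutput.add(new SimpleEntry<>("ball", 1));
        mapperOutput.add(new SimpleEntry<>("apple", 1));

        System.out.println(groupByKey(mapperOutput));
        // prints: {apple=[1, 1, 1], ball=[1, 1]}
    }
}
```

Each entry of the resulting map corresponds to one call of the reducer: the word is the key and the list plays the role of the iterable of values.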

Next comes the reduce phase, where the reducer receives each word as the key and all of its values as an Iterable, for example (apple, [1, 1, 1]) and (ball, [1, 1]). Its job is to add those 1's together.

Reducer Code:

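import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable>
{
     @Override
     protected void reduce(Text word, Iterable<IntWritable> counts, Context context) throws IOException, InterruptedException {
        // Sum all the 1's emitted for this word
        int total = 0;
        for (IntWritable iw : counts) {
           total = total + iw.get();
        }
        // Emit (word, total)
        context.write(word, new IntWritable(total));
     }
}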

 Hence, the final output would be:

apple, 3
ball, 2
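For readers who want to check this result without a Hadoop cluster, the whole flow can be approximated with Java streams. This is a rough single-machine sketch, not how Hadoop executes a job: WordCountSketch and wordCount are hypothetical names, the two input lines are made up to reproduce the example pairs, and Collectors.groupingBy collapses the shuffle, sort, group, and reduce steps into one call.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical single-machine imitation of the word-count job.
class WordCountSketch {

    static Map<String, Integer> wordCount(String... lines) {
        return Arrays.stream(lines)
            // Map step: split every line on single spaces into words.
            .flatMap(line -> Arrays.stream(line.split(" ")))
            // Group by word in sorted key order and add 1 per occurrence,
            // standing in for shuffle, sort, group, and reduce.
            .collect(Collectors.groupingBy(
                word -> word,
                TreeMap::new,
                Collectors.summingInt(word -> 1)));
    }

    public static void main(String[] args) {
        // Assumed sample input that yields apple, apple, ball, ball, apple.
        System.out.println(wordCount("apple apple ball", "ball apple"));
        // prints: {apple=3, ball=2}
    }
}
```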

 

 

 

 

Published By : Sumeet Vishwakarma
