Engineering, 07.03.2020 05:43 wbrandi118

JAVA HADOOP MAPREDUCE

Modify the WordCount program so that it outputs the word count for each distinct word in each file. The output of this DocWordCount program should be of the form ‘word<delimiter>filename<tab>count’, where <delimiter> is the delimiter string between word and filename and a tab separates filename and count. Submit your source code in a file named DocWordCount.java.

Explanation: Consider two simple files, file1.txt and file2.txt:

$ echo "Hadoop is yellow Hadoop" > file1.txt
$ echo "yellow Hadoop is an elephant" > file2.txt

Running ‘DocWordCount.java’ on these two files will give output similar to that below, where <delimiter> stands for the delimiter between word and filename.

Output of DocWordCount.java

yellow<delimiter>file2.txt 1
Hadoop<delimiter>file2.txt 1
is<delimiter>file2.txt 1
elephant<delimiter>file2.txt 1
yellow<delimiter>file1.txt 1
Hadoop<delimiter>file1.txt 2
is<delimiter>file1.txt 1
an<delimiter>file2.txt 1

Initial code that needs to be modified:

package org.myorg;

import java.io.IOException;
import java.util.regex.Pattern;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.log4j.Logger;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

public class WordCount extends Configured implements Tool {

    private static final Logger LOG = Logger.getLogger(WordCount.class);

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new WordCount(), args);
        System.exit(res);
    }

    public int run(String[] args) throws Exception {
        Job job = Job.getInstance(getConf(), "wordcount");
        job.setJarByClass(this.getClass());

        // args[0]: comma-separated input paths; args[1]: output directory
        FileInputFormat.addInputPaths(job, args[0]);
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    // Input: (byte offset, line of text); output: (word, 1)
    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        private static final Pattern WORD_BOUNDARY = Pattern.compile("\\s*\\b\\s*");

        @Override
        public void map(LongWritable offset, Text lineText, Context context)
                throws IOException, InterruptedException {

            String line = lineText.toString();
            Text currentWord = new Text();

            // Split the line at word boundaries and emit (word, 1) for each token
            for (String word : WORD_BOUNDARY.split(line)) {
                if (word.isEmpty()) {
                    continue;
                }
                currentWord = new Text(word);
                context.write(currentWord, one);
            }
        }
    }

    // Input: (word, [1, 1, ...]); output: (word, total count)
    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text word, Iterable<IntWritable> counts, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable count : counts) {
                sum += count.get();
            }
            context.write(word, new IntWritable(sum));
        }
    }
}
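
One possible way to approach the modification is to make the mapper emit a composite key (word, delimiter, file name) instead of the bare word, and leave the reducer's summing logic as it is. The sketch below illustrates that key scheme in plain Java without Hadoop, so it can be run on its own; it is not a complete DocWordCount.java. The class name DocWordCountSketch, the helper names, and the delimiter value "#####" are all assumptions made for illustration (the actual delimiter is missing from the question text above). In a real Hadoop mapper, the current file name can typically be read inside map() with ((FileSplit) context.getInputSplit()).getPath().getName(), where FileSplit comes from org.apache.hadoop.mapreduce.lib.input.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Plain-Java sketch (no Hadoop required) of the DocWordCount key scheme.
final class DocWordCountSketch {
    // Assumed delimiter: the real one is missing from the question text.
    static final String DELIMITER = "#####";

    // Same tokenizer pattern as the WordCount mapper.
    static final Pattern WORD_BOUNDARY = Pattern.compile("\\s*\\b\\s*");

    // "Map" step: the composite key emitted for one word of one file.
    static String compositeKey(String word, String fileName) {
        return word + DELIMITER + fileName;
    }

    // "Map" and "reduce" simulated in memory.
    // files maps a file name to that file's contents.
    static Map<String, Integer> count(Map<String, String> files) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, String> file : files.entrySet()) {
            for (String line : file.getValue().split("\\R")) {
                for (String word : WORD_BOUNDARY.split(line)) {
                    if (word.isEmpty()) {
                        continue;
                    }
                    // Summing per composite key plays the role of the reducer.
                    counts.merge(compositeKey(word, file.getKey()), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("file1.txt", "Hadoop is yellow Hadoop");
        files.put("file2.txt", "yellow Hadoop is an elephant");
        // Print one "key<tab>count" line per distinct (word, file) pair.
        count(files).forEach((key, n) -> System.out.println(key + "\t" + n));
    }
}
```

Run on the two example files, this produces one line per distinct (word, file) pair, with Hadoop counted twice for file1.txt, which agrees with the sample output in the explanation apart from line order.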
