Write a program using Hadoop MapReduce

In each case, write a program using Hadoop (map/reduce) and the language of your choice, to:

  1. Find the distribution of bigrams for your dataset (only digits, not the decimal point). A bigram is 2 successive digits/letters/etc. For example, the string 938193 has 3 bigrams (93, 81, 93). The distribution would be: 93 – 2, and 81 – 1. Assume that the data set is large enough that bigrams at the boundaries of nodes are not significant (most likely you will have only 1 mapper in any case, since this is a very small dataset, so it won’t be an issue).
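As a quick check of the bigram definition, here is a minimal Python sketch that reproduces the 938193 example without Hadoop. It assumes the pairs do not overlap, which is how the example counts them (a sliding window would instead give 93, 38, 81, 19, 93), and it assumes a trailing unpaired digit is simply dropped. The function names `bigrams` and `distribution` are illustrative, not part of the assignment.

```python
from collections import Counter


def bigrams(text):
    """Return non-overlapping digit pairs, skipping any non-digit characters."""
    digits = [ch for ch in text if ch.isdigit()]
    # Step by 2 so pairs do not overlap; a trailing unpaired digit is dropped.
    return [digits[i] + digits[i + 1] for i in range(0, len(digits) - 1, 2)]


def distribution(text):
    """Count how many times each bigram occurs in the text."""
    return Counter(bigrams(text))


if __name__ == "__main__":
    for pair, count in sorted(distribution("938193").items()):
        print(f"{pair} - {count}")
```

Because non-digits are filtered out before pairing, the decimal point in "3.14..." is ignored rather than treated as part of a bigram.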

Your submission should be copied into MS Word and should include (in one file):

  1. Your “mapper” program
  2. The K/V pairs emitted by your mapper
  3. Your “reducer” program
  4. The K/V pairs emitted by your reducer
  5. The Answers
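One possible shape for the mapper and reducer is sketched below in Python, written for Hadoop Streaming, which passes records through stdin and stdout as text lines with a tab between key and value and sorts the mapper output by key before the reducer sees it. This is only a sketch under stated assumptions: pairs are non-overlapping, pairing restarts on every input line (so an unpaired digit at the end of a line is dropped, in keeping with the note that boundary bigrams are not significant), and the `map`/`reduce` command-line argument and the function names are hypothetical choices made so that both steps fit in one file.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit 'bigram<TAB>1' for each non-overlapping digit pair in a line."""
    for line in lines:
        # Keep digits only, so the decimal point and newlines are ignored.
        digits = [ch for ch in line if ch.isdigit()]
        for i in range(0, len(digits) - 1, 2):
            yield f"{digits[i]}{digits[i + 1]}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts per key; input must arrive sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if line.strip())
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(value) for _, value in group)}"


# Hypothetical dispatch: run with "map" or "reduce" as the first argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

The lines printed by the map step are the mapper's K/V pairs, and the lines printed by the reduce step are the reducer's K/V pairs. For a local dry run without a cluster, the shuffle can be imitated by sorting the mapper output before feeding it to the reduce step.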

Data:

First 1000 Digits of Pi:

3.14159265358979323846264338327950288419716939937510

58209749445923078164062862089986280348253421170679

82148086513282306647093844609550582231725359408128

48111745028410270193852110555964462294895493038196

44288109756659334461284756482337867831652712019091

45648566923460348610454326648213393607260249141273

72458700660631558817488152092096282925409171536436

78925903600113305305488204665213841469519415116094

33057270365759591953092186117381932611793105118548

07446237996274956735188575272489122793818301194912

98336733624406566430860213949463952247371907021798

60943702770539217176293176752384674818467669405132

00056812714526356082778577134275778960917363717872

14684409012249534301465495853710507922796892589235

42019956112129021960864034418159813629774771309960

51870721134999999837297804995105973173281609631859

50244594553469083026425223082533446850352619311881

71010003137838752886587533208381420617177669147303

59825349042875546873115956286388235378759375195778

1857780532171226806613001927876611195909216420198