Machine Learning [ECNU] Assignment 1

Use a crawler to get at least 20 webpages from a website.
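One step of such a crawler, collecting same-site links from a page that has already been downloaded, could be sketched as follows. This is only an illustration under stated assumptions: the class name LinkExtractorSketch, the method extractLinks, and the example.com URLs are hypothetical, and the regular expression is a simplification rather than a real HTML parser.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the assignment: pulls same-host links out of one page.
class LinkExtractorSketch {
    // Captures the href value of an <a> tag, stopping before any #fragment.
    // Assumption: quoted href attributes only; a regex will miss unusual markup.
    private static final Pattern HREF = Pattern.compile(
            "<a\\s[^>]*?href\\s*=\\s*[\"']([^\"'#]+)[\"'#]",
            Pattern.CASE_INSENSITIVE);

    // Returns up to `limit` distinct absolute URLs that share the host of baseUrl.
    static Set<String> extractLinks(String baseUrl, String html, int limit) {
        URI base = URI.create(baseUrl);
        Set<String> links = new LinkedHashSet<>(); // keeps discovery order, drops duplicates
        Matcher m = HREF.matcher(html);
        while (links.size() < limit && m.find()) {
            URI resolved;
            try {
                // Turn relative links such as "/about.html" into absolute URLs.
                resolved = base.resolve(m.group(1).trim());
            } catch (IllegalArgumentException e) {
                continue; // skip hrefs that are not valid URIs
            }
            // Keep only links on the same host (mailto: and external links are dropped).
            if (base.getHost() != null
                    && base.getHost().equalsIgnoreCase(resolved.getHost())) {
                links.add(resolved.toString());
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"/about.html\">About</a> "
                + "<a href=\"https://other.org/x\">Elsewhere</a>";
        // example.com is a placeholder; substitute the site you actually crawl.
        System.out.println(extractLinks("https://example.com/index.html", html, 20));
    }
}
```

Calling extractLinks with a limit of 20 on the start page gives a candidate list of pages to download; fetching the pages themselves is left out of this sketch.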

Count the occurrences of words in the webpages on Hadoop.
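The counting logic can be checked locally on a small input before it is moved to the cluster. The following is a minimal plain-Java sketch of the map and reduce steps of a word count; it deliberately uses no Hadoop classes, so it is not the Hadoop job itself. The class name WordCountSketch and the tokenization rule (lowercase, split on anything that is not a letter or digit) are assumptions made for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local stand-in for a MapReduce word count; no Hadoop dependency.
class WordCountSketch {
    // Map step: emit one (word, 1) pair for every token in a line of text.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        // Assumed tokenization: lowercase, split on runs of non-letter, non-digit characters.
        for (String token : line.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                pairs.add(new SimpleEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // Reduce step: sum the counts of all pairs that share the same word.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by word
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Runs the map step on every line, then reduces all emitted pairs together.
    static Map<String, Integer> countWords(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        return reduce(emitted);
    }

    public static void main(String[] args) {
        System.out.println(countWords(Arrays.asList("Hadoop counts words", "words on Hadoop")));
        // prints {counts=1, hadoop=2, on=1, words=2}
    }
}
```

On Hadoop the same two steps would live in a mapper and a reducer, with the framework grouping the pairs by word between them; the local result is useful for comparing against the cluster output on a small sample.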

Hand in:

  1. Each student or group should crawl a different website. List the website URL as well as the URLs of the crawled webpages.
  2. Count the word occurrences on Hadoop, with code in both Java and another language such as Pig Latin. Print out your code.
  3. Print out your result.

Homework due: 4/12

You are allowed to form a group of no more than 4 fellow students.