Using Nutch to crawl Microsoft's dataset
Hey guys, I am new to Nutch. I am part of an IR research team and need to create a setup in which I crawl Microsoft's dataset with Nutch. After googling for a while, I couldn't find any tutorial or help. Can anyone guide me with this? I am using Nutch 1.4 on Ubuntu 11.10 with Eclipse 3.7. So far I am able to crawl the public web from my Nutch setup integrated with Eclipse. Is there any tutorial or wiki explaining how I can crawl this dataset, or any other dataset kept on the file system? If not, could you please help me? Thanks in advance.
Cheers!!! - Varun