Code Project

Use Nutch to crawl Microsoft Dataset

    varunpandeyengg wrote:

    Hey guys, I am new to Nutch. I am part of an IR research team and need to create a setup in which I crawl Microsoft's dataset with Nutch. After googling for a while, I could not find any tutorial or help. Can anyone guide me? I am using Nutch 1.4 on Ubuntu 11.10 with Eclipse 3.7. So far I am able to crawl the public network from my Nutch setup integrated with Eclipse. Is there any tutorial or wiki explaining how I can crawl this dataset, or any other dataset kept on the file system? If not, can you please help me? Thanks in advance.

    Cheers!!! - Varun
