Recrawl script for nutch

Note: The information contained in this post may be outdated!

Here’s a small shell script that runs the recrawl process in Nutch. You may need to adjust a few lines because I customized it for my own setup, but it should work for you too 🙂

recrawl.sh

2 Comments

  1. It looks like you run nutch parse, but your fetcher isn’t called with -noParsing, so the separate call to parse isn’t needed: the fetcher parses by default.

  2. I want to run automatic crawling in nutch-1.0 on a timer.
    How can I use your recrawl script?
    Please give me a few hints.
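
Both comments can be illustrated with a short sketch. This is not the recrawl.sh from the post, just a minimal example under stated assumptions: `NUTCH` points at the `bin/nutch` launcher of a Nutch 1.0 install, the fetcher parses by default as the first comment says, and the names `fetch_with_parsing`, `fetch_then_parse`, and `run_locked` as well as every path shown are hypothetical. The first two functions show the two ways to handle parsing; the third is a small lock wrapper so that a timer (cron here) never starts a second recrawl while the previous one is still running.

```shell
#!/bin/sh
# Sketch only: names and paths are assumptions, not taken from the original script.

# Assumed location of the Nutch launcher; override via the environment.
NUTCH=${NUTCH:-bin/nutch}

# Variant A: let the fetcher parse while fetching (the default behaviour
# described in the first comment). No separate parse step is needed.
fetch_with_parsing() {
    "$NUTCH" fetch "$1"
}

# Variant B: disable parsing during the fetch, then parse as its own step.
fetch_then_parse() {
    "$NUTCH" fetch "$1" -noParsing && "$NUTCH" parse "$1"
}

# Run a command only if no earlier run still holds the lock.
# mkdir is atomic: it fails if the lock directory already exists.
run_locked() {
    lock_dir=$1
    shift
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "previous run still in progress, skipping" >&2
        return 1
    fi
    "$@"
    status=$?
    rmdir "$lock_dir"
    return $status
}

# Example crontab entry (assumed paths), starting a recrawl every night at 02:00:
#   0 2 * * * /opt/nutch/recrawl-cron.sh >> /var/log/nutch-recrawl.log 2>&1
#
# where recrawl-cron.sh defines run_locked as above and then calls:
#   run_locked /tmp/nutch-recrawl.lock /opt/nutch/recrawl.sh
```

The lock directory matters because a recrawl can take longer than the interval between two timer runs; without it, two overlapping runs could work on the same crawl data at once.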
