org.apache.nutch.crawl
Class LinkDbFilter

java.lang.Object
  extended by org.apache.nutch.crawl.LinkDbFilter
All Implemented Interfaces:
Closeable, JobConfigurable, Mapper

public class LinkDbFilter
extends Object
implements Mapper

This class provides a way to separate the URL normalization and filtering steps from the rest of the LinkDb manipulation code.
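
The separation described above can be illustrated with a minimal, self-contained sketch. It does not use the Nutch or Hadoop APIs: the class name UrlCleanupSketch, the normalizer and filter lambdas, and the "null means skip" convention are all assumptions made for illustration, not the actual implementation of this class.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

/** Illustrative stand-in, not the Nutch class: cleans one URL at a time. */
final class UrlCleanupSketch {
    private final UnaryOperator<String> normalizer; // null means "skip normalization"
    private final Predicate<String> filter;         // null means "skip filtering"

    UrlCleanupSketch(UnaryOperator<String> normalizer, Predicate<String> filter) {
        this.normalizer = normalizer;
        this.filter = filter;
    }

    /** Returns the cleaned URL, or an empty Optional if the URL was rejected. */
    Optional<String> clean(String url) {
        String result = url;
        // Step 1: normalize first, so the filter sees the canonical form.
        if (result != null && normalizer != null) {
            result = normalizer.apply(result);
        }
        // Step 2: drop the URL if the filter does not accept it.
        if (result != null && filter != null && !filter.test(result)) {
            result = null;
        }
        return Optional.ofNullable(result);
    }

    public static void main(String[] args) {
        UrlCleanupSketch cleaner = new UrlCleanupSketch(
            u -> u.trim().toLowerCase(Locale.ROOT), // toy normalizer (assumption)
            u -> u.startsWith("http://"));          // toy filter (assumption)
        System.out.println(cleaner.clean("  HTTP://Example.COM/ ")); // Optional[http://example.com/]
        System.out.println(cleaner.clean("ftp://example.com/"));     // Optional.empty
    }
}
```

Keeping both steps behind one small object means the code that merges or inverts links never needs to know which normalizers or filters are active.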

Author:
Andrzej Bialecki

Field Summary
static org.apache.commons.logging.Log LOG
           
static String URL_FILTERING
           
static String URL_NORMALIZING
           
static String URL_NORMALIZING_SCOPE
           
 
Constructor Summary
LinkDbFilter()
           
 
Method Summary
 void close()
           
 void configure(JobConf job)
           
 void map(WritableComparable key, Writable value, OutputCollector output, Reporter reporter)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

URL_FILTERING

public static final String URL_FILTERING
See Also:
Constant Field Values

URL_NORMALIZING

public static final String URL_NORMALIZING
See Also:
Constant Field Values

URL_NORMALIZING_SCOPE

public static final String URL_NORMALIZING_SCOPE
See Also:
Constant Field Values

LOG

public static final org.apache.commons.logging.Log LOG

Constructor Detail

LinkDbFilter

public LinkDbFilter()

Method Detail

configure

public void configure(JobConf job)
Specified by:
configure in interface JobConfigurable
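
As a rough sketch of what a configuration step of this kind might look like, the example below reads two on/off switches and a scope name from a java.util.Properties object, which stands in for JobConf. The property names, the defaults, and the assumption that URL_FILTERING and URL_NORMALIZING act as boolean switches are illustrative guesses; the real constant values are listed under Constant Field Values and are not reproduced here.

```java
import java.util.Properties;

/** Illustrative stand-in for reading filter settings; not the Nutch implementation. */
final class FilterSettingsSketch {
    // Hypothetical property names, chosen for this example only.
    static final String FILTERING_KEY = "example.linkdb.filtering";
    static final String NORMALIZING_KEY = "example.linkdb.normalizing";
    static final String SCOPE_KEY = "example.linkdb.normalizing.scope";

    final boolean filtering;   // whether URLs should be filtered
    final boolean normalizing; // whether URLs should be normalized
    final String scope;        // which normalization scope to use

    FilterSettingsSketch(Properties conf) {
        // Both switches default to "off" in this sketch (an assumption).
        filtering = Boolean.parseBoolean(conf.getProperty(FILTERING_KEY, "false"));
        normalizing = Boolean.parseBoolean(conf.getProperty(NORMALIZING_KEY, "false"));
        scope = conf.getProperty(SCOPE_KEY, "default");
    }
}
```

Reading the switches once, at configuration time, keeps the per-record work in map free of repeated configuration lookups.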

close

public void close()
Specified by:
close in interface Closeable

map

public void map(WritableComparable key,
                Writable value,
                OutputCollector output,
                Reporter reporter)
         throws IOException
Specified by:
map in interface Mapper
Throws:
IOException
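
One plausible shape for a map step of this kind is sketched below, using plain Java collections in place of WritableComparable, Writable, and OutputCollector. The class LinkMapSketch, its toy clean method, and the rule that a target with no surviving inlinks is not emitted are assumptions for illustration; this page does not document the method's actual behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Illustrative stand-in for a filtering map step; not the Nutch implementation. */
final class LinkMapSketch {
    /** Toy cleaning step (assumption): returns the cleaned URL, or null to drop it. */
    static String clean(String url) {
        if (url == null) {
            return null;
        }
        String u = url.trim().toLowerCase(Locale.ROOT);
        return (u.startsWith("http://") || u.startsWith("https://")) ? u : null;
    }

    /**
     * Cleans the target URL and each inlink source URL, then writes the
     * surviving inlinks to the output map under the cleaned target URL.
     */
    static void map(String targetUrl, List<String> inlinks,
                    Map<String, List<String>> output) {
        String target = clean(targetUrl);
        if (target == null) {
            return; // target rejected: emit nothing for this record
        }
        List<String> kept = new ArrayList<>();
        for (String from : inlinks) {
            String cleaned = clean(from);
            if (cleaned != null) {
                kept.add(cleaned);
            }
        }
        if (!kept.isEmpty()) {
            output.put(target, kept); // emit only when at least one inlink survives
        }
    }
}
```

For example, calling map with the target "HTTP://a.example/" and the inlinks "http://b.example/" and "mailto:x@y" would store the single inlink "http://b.example/" under the key "http://a.example/".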


Copyright © 2006 The Apache Software Foundation