org.apache.nutch.parse
Interface Parser
- All Superinterfaces: Configurable, Pluggable
- All Known Implementing Classes: ExtParser, HtmlParser, JSParseFilter, MSBaseParser, MSExcelParser, MSPowerPointParser, MSWordParser, OOParser, PdfParser, RSSParser, SWFParser, TextParser, ZipParser
public interface Parser extends Pluggable, Configurable
A parser for content generated by a Protocol
implementation. This interface is implemented by extensions. Nutch's core
contains no page parsing code.
X_POINT_ID
static final String X_POINT_ID
- The name of the extension point.
getParse
Parse getParse(Content c)
- Creates the Parse for the given Content.
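The contract of getParse can be illustrated with a minimal sketch. It does not use the real Nutch classes: Content, Parse, and Parser below are simplified, hypothetical stand-ins declared inside the example so that it compiles on its own, and the Pluggable and Configurable superinterfaces are omitted. PlainTextParser and the UTF-8 assumption are also illustrative only and are not taken from any Nutch plugin.

```java
import java.nio.charset.StandardCharsets;

final class ParserSketch {

    // Hypothetical stand-in for Nutch's Content: the raw bytes a Protocol fetched.
    static final class Content {
        final String url;
        final String contentType;
        final byte[] bytes;

        Content(String url, String contentType, byte[] bytes) {
            this.url = url;
            this.contentType = contentType;
            this.bytes = bytes;
        }
    }

    // Hypothetical stand-in for Nutch's Parse: here it exposes only the extracted text.
    interface Parse {
        String getText();
    }

    // Simplified version of the interface documented above
    // (assumption: Pluggable and Configurable are left out for brevity).
    interface Parser {
        Parse getParse(Content c);
    }

    // Illustrative extension: treats the content as UTF-8 plain text.
    static final class PlainTextParser implements Parser {
        @Override
        public Parse getParse(Content c) {
            // Decode the fetched bytes; a real parser would honor the declared charset.
            final String text = new String(c.bytes, StandardCharsets.UTF_8);
            return () -> text;
        }
    }

    public static void main(String[] args) {
        Content content = new Content(
                "http://example.com/a.txt",
                "text/plain",
                "hello nutch".getBytes(StandardCharsets.UTF_8));

        Parser parser = new PlainTextParser();
        Parse parse = parser.getParse(content);
        System.out.println(parse.getText());
    }
}
```

In this sketch the caller depends only on the Parser interface, which mirrors the statement above that parsing code lives in extensions rather than in the core.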
Copyright © 2006 The Apache Software Foundation