Name | Description |
ArticleExtractor | A full-text extractor tuned towards news articles. |
DefaultExtractor | A fairly generic full-text extractor. |
ExtractorBase | The base class of all extractors. |
KeepEverythingExtractor | Marks everything as content. |
KeepEverythingWithMinKWordsExtractor | Marks every text block that has at least k words as content. |
LargestContentExtractor | A full-text extractor which extracts the largest text component of a page. |
NumWordsRulesExtractor | A fairly generic full-text extractor based solely on the number of words per block (the current, the previous and the next block). |
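
To make the difference between some of the strategies listed above more concrete, here is a minimal, self-contained sketch of three of them: keeping everything, keeping blocks with a minimum number of words, and keeping only the largest block. It is not the library's implementation or API. The class `BlockSelectionSketch` and all of its method names are hypothetical, a "block" is assumed to be a plain string, and "largest" is assumed to mean "most whitespace-separated words".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names belong to the library.
public class BlockSelectionSketch {

    // Counts whitespace-separated words in a text block.
    static int wordCount(String block) {
        String trimmed = block.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // "Keep everything": every block is treated as content.
    static List<String> keepEverything(List<String> blocks) {
        return new ArrayList<>(blocks);
    }

    // "Keep everything with min k words": drop blocks shorter than k words.
    static List<String> keepWithMinWords(List<String> blocks, int k) {
        List<String> kept = new ArrayList<>();
        for (String block : blocks) {
            if (wordCount(block) >= k) {
                kept.add(block);
            }
        }
        return kept;
    }

    // "Largest content": return the single block with the most words
    // (the first one wins on a tie; an empty input yields an empty string).
    static String largestBlock(List<String> blocks) {
        String best = "";
        int bestCount = -1;
        for (String block : blocks) {
            int count = wordCount(block);
            if (count > bestCount) {
                best = block;
                bestCount = count;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> blocks = Arrays.asList(
                "Home News Contact",
                "The council approved the new budget on Tuesday after a long debate.",
                "Share this article");

        System.out.println(keepEverything(blocks).size());
        System.out.println(keepWithMinWords(blocks, 5));
        System.out.println(largestBlock(blocks));
    }
}
```

On a typical page the navigation and sharing blocks are short, so a minimum-word filter or a largest-block rule drops them, while the keep-everything strategy returns them along with the article text.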