C# Class NBoilerpipe.Filters.Heuristics.KeepLargestBlockFilter

Keeps only the largest NBoilerpipe.Document.TextBlock (measured by number of words). If several blocks have the same number of words, the first one is chosen. All discarded blocks are marked "not content" and flagged as NBoilerpipe.Labels.DefaultLabels.MIGHT_BE_CONTENT. Note that, by default, only TextBlocks marked as "content" are taken into consideration.
Inheritance: BoilerpipeFilter
Project: oganix/NBoilerpipe
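
The selection rule described above can be illustrated with a minimal, self-contained sketch. It does not use the NBoilerpipe types: SketchBlock, KeepLargestSketch and the MightBeContent string are hypothetical stand-ins for TextBlock, the filter and DefaultLabels.MIGHT_BE_CONTENT, and the meaning given to the bool return value (whether any content block was found) is an assumption. The tag-level expansion controlled by expandToSameLevelText is not modelled.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for NBoilerpipe.Document.TextBlock (not the library type).
public sealed class SketchBlock
{
    public string Text { get; }
    public int NumWords { get; }
    public bool IsContent { get; set; }
    public HashSet<string> Labels { get; } = new HashSet<string>();

    public SketchBlock(string text, bool isContent)
    {
        Text = text;
        // Count whitespace-separated words; the real library may count differently.
        NumWords = text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length;
        IsContent = isContent;
    }
}

public static class KeepLargestSketch
{
    // Placeholder for DefaultLabels.MIGHT_BE_CONTENT; the string value is an assumption.
    public const string MightBeContent = "MIGHT_BE_CONTENT";

    // Keeps the content block with the most words and demotes every other block.
    // Returns false when no block is marked as content (assumed semantics).
    public static bool Process(IList<SketchBlock> blocks)
    {
        SketchBlock largest = null;
        int maxWords = -1;

        foreach (var block in blocks)
        {
            // Strict '>' means the first block wins when word counts are tied.
            if (block.IsContent && block.NumWords > maxWords)
            {
                largest = block;
                maxWords = block.NumWords;
            }
        }

        if (largest == null)
        {
            return false;
        }

        foreach (var block in blocks)
        {
            if (ReferenceEquals(block, largest))
            {
                continue;
            }
            // Discarded blocks are marked "not content" and labelled.
            block.IsContent = false;
            block.Labels.Add(MightBeContent);
        }
        return true;
    }
}
```

In this sketch, blocks that were already "not content" before the call never compete for the largest slot, which mirrors the default behaviour described above. With the real library, the equivalent call would go through the shared INSTANCE property and Process(doc).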

Public Properties

Property Type Description
INSTANCE NBoilerpipe.Filters.Heuristics.KeepLargestBlockFilter
INSTANCE_EXPAND_TO_SAME_TAGLEVEL NBoilerpipe.Filters.Heuristics.KeepLargestBlockFilter

Public Methods

Method Description
KeepLargestBlockFilter ( bool expandToSameLevelText )
Process ( NBoilerpipe.Document.TextDocument doc ) : bool

Method Details

KeepLargestBlockFilter() public method

public KeepLargestBlockFilter ( bool expandToSameLevelText )
expandToSameLevelText bool

Process() public method

public Process ( NBoilerpipe.Document.TextDocument doc ) : bool
doc NBoilerpipe.Document.TextDocument
return bool

Property Details

INSTANCE public static property

public static NBoilerpipe.Filters.Heuristics.KeepLargestBlockFilter INSTANCE
return NBoilerpipe.Filters.Heuristics.KeepLargestBlockFilter

INSTANCE_EXPAND_TO_SAME_TAGLEVEL public static property

public static NBoilerpipe.Filters.Heuristics.KeepLargestBlockFilter INSTANCE_EXPAND_TO_SAME_TAGLEVEL
return NBoilerpipe.Filters.Heuristics.KeepLargestBlockFilter