NBoilerpipe.Filters.Simple Namespace (C#)

Classes

Name                        Description
BoilerplateBlockFilter      Removes NBoilerpipe.Document.TextBlock instances that have explicitly been marked as "not content".
InvertedFilter              Inverts the "isContent" flag of every NBoilerpipe.Document.TextBlock.
LabelToContentFilter        Marks all blocks that contain a given label as "content".
MinClauseWordsFilter        Keeps only blocks that have at least one segment fragment ("clause") with at least k words (default: 5).
MinWordsFilter              Keeps only those content blocks that contain at least k words.
SplitParagraphBlocksFilter  Splits TextBlocks at paragraph boundaries.
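
The behaviour described for MinWordsFilter, InvertedFilter, and BoilerplateBlockFilter can be illustrated with a small self-contained sketch. It does not use the NBoilerpipe API: Block is a hypothetical stand-in for NBoilerpipe.Document.TextBlock, SimpleFilterSketch and its method names are invented for illustration, and the convention of returning true when something changed is an assumption rather than something this page states.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for NBoilerpipe.Document.TextBlock (not the library type).
public sealed class Block
{
    public string Text { get; }
    public bool IsContent { get; set; }

    public Block(string text, bool isContent)
    {
        Text = text;
        IsContent = isContent;
    }

    // Number of whitespace-separated tokens in the block.
    public int WordCount =>
        Text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length;
}

// Illustrative re-implementations; the names and signatures are assumptions.
public static class SimpleFilterSketch
{
    // In the spirit of MinWordsFilter: content blocks with fewer than
    // minWords words lose their "content" flag.
    public static bool MinWords(List<Block> blocks, int minWords)
    {
        bool changed = false;
        foreach (Block b in blocks)
        {
            if (b.IsContent && b.WordCount < minWords)
            {
                b.IsContent = false;
                changed = true;
            }
        }
        return changed;
    }

    // In the spirit of InvertedFilter: flip the flag on every block.
    public static bool Invert(List<Block> blocks)
    {
        foreach (Block b in blocks)
        {
            b.IsContent = !b.IsContent;
        }
        return blocks.Count > 0;
    }

    // In the spirit of BoilerplateBlockFilter: drop blocks not marked as content.
    public static bool RemoveBoilerplate(List<Block> blocks)
    {
        return blocks.RemoveAll(b => !b.IsContent) > 0;
    }

    public static void Main()
    {
        var blocks = new List<Block>
        {
            new Block("Home | About | Contact", false),
            new Block("Read more", true),
            new Block("The article body contains the sentences a reader actually came for.", true),
        };

        MinWords(blocks, 5);        // "Read more" is too short and is unmarked
        RemoveBoilerplate(blocks);  // only the article body remains

        foreach (Block b in blocks)
        {
            Console.WriteLine(b.Text);
        }
    }
}
```

Chaining the steps in this order mirrors how such filters are typically composed: flag-setting filters run first, and the removal step runs last so that it acts on the final "content" decisions.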