C# (CSharp) NBoilerpipe.Filters.Heuristics Namespace

Classes

Name Description
AddPrecedingLabelsFilter Adds the labels of the preceding block to the current block, optionally adding a prefix.
ArticleMetadataFilter
ContentFusion
DocumentTitleMatchClassifier Marks NBoilerpipe.Document.TextBlock instances that contain parts of the HTML <TITLE> tag, using heuristics that are quite specific to the news domain.
KeepLargestBlockFilter Keeps only the largest NBoilerpipe.Document.TextBlock (by number of words). If several blocks have the same number of words, the first one is chosen. All discarded blocks are marked "not content" and flagged as NBoilerpipe.Labels.DefaultLabels.MIGHT_BE_CONTENT. Note that, by default, only TextBlocks marked as "content" are taken into consideration.
LabelFusion Fuses adjacent blocks if their labels are equal.
SimpleBlockFusionProcessor Merges two consecutive blocks if their text densities are equal.
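
The behaviour described for KeepLargestBlockFilter can be illustrated with a small, self-contained sketch. It does not use the NBoilerpipe API: SketchBlock, KeepLargestSketch and the "MIGHT_BE_CONTENT" string are hypothetical stand-ins for NBoilerpipe.Document.TextBlock, the filter class and NBoilerpipe.Labels.DefaultLabels.MIGHT_BE_CONTENT. It also assumes that every block other than the winner counts as "discarded"; the library's exact behaviour may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for NBoilerpipe.Document.TextBlock (not the real type).
public sealed class SketchBlock
{
    public string Text { get; }
    public bool IsContent { get; set; }
    public HashSet<string> Labels { get; } = new HashSet<string>();

    public SketchBlock(string text, bool isContent)
    {
        Text = text;
        IsContent = isContent;
    }

    // Number of whitespace-separated words in the block.
    public int NumWords =>
        Text.Split(new[] { ' ', '\t', '\r', '\n' },
                   StringSplitOptions.RemoveEmptyEntries).Length;
}

public static class KeepLargestSketch
{
    // Hypothetical label name mirroring DefaultLabels.MIGHT_BE_CONTENT.
    public const string MightBeContent = "MIGHT_BE_CONTENT";

    // Keeps the largest "content" block and demotes all the others.
    // Returns false when the list has no content block to keep.
    public static bool Process(IList<SketchBlock> blocks)
    {
        int largestIndex = -1;
        int maxWords = -1;

        // Only blocks already marked as content are candidates.
        // The strict '>' means the first block wins on a tie.
        for (int i = 0; i < blocks.Count; i++)
        {
            if (blocks[i].IsContent && blocks[i].NumWords > maxWords)
            {
                largestIndex = i;
                maxWords = blocks[i].NumWords;
            }
        }

        if (largestIndex < 0)
        {
            return false;
        }

        // Assumption: every block except the winner is treated as discarded.
        for (int i = 0; i < blocks.Count; i++)
        {
            if (i == largestIndex)
            {
                continue;
            }
            blocks[i].IsContent = false;
            blocks[i].Labels.Add(MightBeContent);
        }
        return true;
    }
}
```

Given four blocks with 2, 4, 3 and 3 words, where the 4-word block is not marked as content, this sketch keeps the first 3-word block: the 4-word block is never a candidate, and the tie between the two 3-word blocks goes to the earlier one.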