C# Class NBoilerpipe.Filters.English.KeepLargestFulltextBlockFilter

Keeps only the largest NBoilerpipe.Document.TextBlock (by number of words). If several blocks have the same number of words, the first one is chosen. All discarded blocks are marked "not content" and flagged as NBoilerpipe.Labels.DefaultLabels.MIGHT_BE_CONTENT. As opposed to NBoilerpipe.Filters.Heuristics.KeepLargestBlockFilter, the number of words is computed using HeuristicFilterBase.GetNumFullTextWords(NBoilerpipe.Document.TextBlock), which only counts words that occur in text elements with at least 9 words and are thus believed to be full text.

NOTE: Without language-specific fine-tuning (i.e., running the default instance), this filter may lead to suboptimal results. Consider using NBoilerpipe.Filters.Heuristics.KeepLargestBlockFilter instead, which works at the level of number-of-words instead of text densities.
Inheritance: NBoilerpipe.Filters.English.HeuristicFilterBase, BoilerpipeFilter
Project: oganix/NBoilerpipe
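
The selection rule described above can be sketched in a few lines. This is a minimal, self-contained illustration and not the NBoilerpipe source: SketchBlock is a hypothetical stand-in for NBoilerpipe.Document.TextBlock, the label is modeled as a plain string, and the full-text word count follows the wording of the description (words in text elements with at least 9 words), which may differ from how HeuristicFilterBase.GetNumFullTextWords is actually implemented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for NBoilerpipe.Document.TextBlock (not the real class).
public class SketchBlock
{
    public List<string> Elements = new List<string>();      // text elements inside the block
    public bool IsContent = true;                            // "content" flag
    public HashSet<string> Labels = new HashSet<string>();   // labels modeled as strings
}

public static class KeepLargestFulltextSketch
{
    // Threshold quoted in the class description.
    private const int MinWordsPerElement = 9;

    private static int CountWords(string text) =>
        text.Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries).Length;

    // Counts only the words of elements that have at least MinWordsPerElement words.
    public static int NumFullTextWords(SketchBlock block) =>
        block.Elements.Select(CountWords).Where(n => n >= MinWordsPerElement).Sum();

    // Keeps the block with the most full-text words and demotes all others.
    // In this sketch the return value is true whenever a block was selected.
    public static bool Process(List<SketchBlock> blocks)
    {
        SketchBlock largest = null;
        int max = -1;
        foreach (var block in blocks)
        {
            int words = NumFullTextWords(block);
            if (words > max)   // strict '>' keeps the first block on ties
            {
                max = words;
                largest = block;
            }
        }
        if (largest == null) return false;   // empty input: nothing to do

        foreach (var block in blocks)
        {
            if (ReferenceEquals(block, largest))
            {
                block.IsContent = true;
                continue;
            }
            block.IsContent = false;                 // marked "not content"
            block.Labels.Add("MIGHT_BE_CONTENT");    // flagged for later filters
        }
        return true;
    }
}
```

The strict comparison is what produces the "first block is chosen" behavior on ties: a later block with an equal count never replaces the current candidate.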

Public Properties

Property Type Description
INSTANCE KeepLargestFulltextBlockFilter

Public Methods

Method Description
Process ( NBoilerpipe.Document.TextDocument doc ) : bool

Method Details

Process() public method

public Process ( NBoilerpipe.Document.TextDocument doc ) : bool
doc NBoilerpipe.Document.TextDocument
Returns bool

Property Details

INSTANCE public static property

public static NBoilerpipe.Filters.English.KeepLargestFulltextBlockFilter INSTANCE
Returns KeepLargestFulltextBlockFilter