C# Class org.apache.lucene.analysis.miscellaneous.HyphenatedWordsFilter

When plain text is extracted from documents, many words end up hyphenated and broken across two lines. This is common in documents that use narrow text columns, such as newsletters. To increase search efficiency, this filter joins hyphenated words that were broken across two lines back into single terms. Use this filter at index time only. Example field definition in schema.xml:
<fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"/>
    <filter class="solr.HyphenatedWordsFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldtype>
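The joining behaviour described above can be illustrated without Lucene. The following is a minimal sketch, not the library's implementation: the class HyphenJoinSketch and the method joinHyphenated are hypothetical names, tokens are plain strings as a whitespace tokenizer would produce them, and keeping the hyphen on a dangling fragment at the end of input is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: joins tokens split by an end-of-line hyphen.
final class HyphenJoinSketch {
    static List<String> joinHyphenated(List<String> tokens) {
        List<String> out = new ArrayList<>();
        StringBuilder pending = new StringBuilder(); // fragments collected so far
        boolean joining = false;                     // true while inside a broken word
        for (String token : tokens) {
            if (!token.isEmpty() && token.charAt(token.length() - 1) == '-') {
                // Fragment ends with a hyphen: store it without the hyphen and wait for the rest.
                pending.append(token, 0, token.length() - 1);
                joining = true;
            } else if (joining) {
                // Final fragment of a broken word: emit the joined term.
                out.add(pending.append(token).toString());
                pending.setLength(0);
                joining = false;
            } else {
                // Ordinary token: pass through unchanged.
                out.add(token);
            }
        }
        if (joining) {
            // Assumption: a dangling fragment at the end of input keeps its hyphen.
            out.add(pending.append('-').toString());
        }
        return out;
    }
}
```

With the tokens eco-, logical, news, develop-, ment this sketch yields ecological, news, development. It also shows why the filter belongs in the index-time analyzer only: line-break hyphens are an artifact of extracted document text and do not normally occur in queries.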
Inheritance: TokenFilter

Public Methods

Method Description
HyphenatedWordsFilter ( TokenStream @in )

Creates a new HyphenatedWordsFilter

incrementToken ( ) : bool

{@inheritDoc}

reset ( ) : void

{@inheritDoc}

Private Methods

Method Description
unhyphenate ( ) : void

Writes the joined unhyphenated term

Method Details

HyphenatedWordsFilter() public method

Creates a new HyphenatedWordsFilter
public HyphenatedWordsFilter ( TokenStream @in )
@in TokenStream

incrementToken() public method

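{@inheritDoc} (documentation inherited from TokenStream: advances the stream to the next token and returns false at the end of the stream)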
public incrementToken ( ) : bool
Returns bool

reset() public method

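{@inheritDoc} (documentation inherited from TokenStream: called by a consumer before it starts calling incrementToken(); resets the stream to a clean state)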
public reset ( ) : void
Returns void
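
The incrementToken() / reset() pair can be sketched as a small pull-style class. This is an illustration under stated assumptions, not the Lucene or Lucene.NET code: PullJoinSketch, term() and clearPending() are hypothetical names, a list of strings stands in for the wrapped TokenStream, and attributes such as offsets are left out.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a token filter; it mimics only the incrementToken()/reset() contract.
final class PullJoinSketch {
    private final List<String> source;   // stands in for the wrapped TokenStream
    private Iterator<String> input;
    private final StringBuilder pending = new StringBuilder();
    private boolean joining;             // true while a broken word is being collected
    private String term;                 // current term, valid after incrementToken() returns true

    PullJoinSketch(List<String> source) {
        this.source = source;
        reset();
    }

    // Advances to the next output term; returns false when the input is exhausted.
    boolean incrementToken() {
        while (input.hasNext()) {
            String token = input.next();
            if (!token.isEmpty() && token.charAt(token.length() - 1) == '-') {
                // Hyphen-terminated fragment: buffer it and keep pulling from the input.
                pending.append(token, 0, token.length() - 1);
                joining = true;
            } else if (joining) {
                // Last fragment of a broken word: expose the joined term.
                term = pending.append(token).toString();
                clearPending();
                return true;
            } else {
                term = token;
                return true;
            }
        }
        if (joining) {
            // Assumption: a trailing fragment at end of input keeps its hyphen.
            term = pending.append('-').toString();
            clearPending();
            return true;
        }
        return false;
    }

    // Rewinds the input and drops any half-collected word so the instance can be reused.
    void reset() {
        input = source.iterator();
        clearPending();
        term = null;
    }

    String term() {
        return term;
    }

    private void clearPending() {
        pending.setLength(0);
        joining = false;
    }
}
```

A consumer would call reset() once, then loop on incrementToken() and read term() until the method returns false. Clearing the buffered fragment in reset() matters because a fragment left over from one pass would otherwise be glued onto the first token of the next.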