C# Class Lucene.Net.Analysis.Compound.HyphenationCompoundWordTokenFilter

A TokenFilter that decomposes compound words found in many Germanic languages.

"Donaudampfschiff" becomes Donau, dampf, schiff so that you can find "Donaudampfschiff" even when you only enter "schiff". It uses a hyphenation grammar and a word dictionary to achieve this.

You must specify the required Version compatibility when creating CompoundWordTokenFilterBase:

  • As of 3.1, CompoundWordTokenFilterBase correctly handles Unicode 4.0 supplementary characters in strings and char arrays provided as compound word dictionaries.
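The description above can be illustrated with a minimal usage sketch. It assumes Lucene.NET 4.8 with the analysis-common package referenced. The file name `de_hyphenation.xml`, the three-word dictionary, and the class name `CompoundDemo` are hypothetical and only serve the illustration; which tokens are actually emitted depends on the hyphenation grammar you load.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Compound;
using Lucene.Net.Analysis.Compound.Hyphenation;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class CompoundDemo
{
    public static void Main()
    {
        const LuceneVersion version = LuceneVersion.LUCENE_48;

        // Hypothetical path: an XML hyphenation grammar for German.
        HyphenationTree hyphenator =
            HyphenationCompoundWordTokenFilter.GetHyphenationTree("de_hyphenation.xml");

        // Illustrative dictionary; ignoreCase = true so that "Donau" matches "donau".
        var dictionary = new CharArraySet(version, new[] { "donau", "dampf", "schiff" }, true);

        Tokenizer tokenizer = new WhitespaceTokenizer(version, new StringReader("Donaudampfschiff"));
        using (TokenStream filter =
            new HyphenationCompoundWordTokenFilter(version, tokenizer, hyphenator, dictionary))
        {
            ICharTermAttribute term = filter.AddAttribute<ICharTermAttribute>();

            // Standard TokenStream consumer workflow: Reset, IncrementToken loop, End.
            filter.Reset();
            while (filter.IncrementToken())
            {
                Console.WriteLine(term.ToString());
            }
            filter.End();
        }
    }
}
```

The same filter would normally be wired into an Analyzer so that indexing and querying share the decomposition; the stand-alone loop here is only meant to make the emitted tokens visible.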

Inheritance: CompoundWordTokenFilterBase

Public methods

Method Description
GetHyphenationTree ( FileInfo hyphenationFile ) : HyphenationTree

Create a hyphenator tree

GetHyphenationTree ( FileInfo hyphenationFile, Encoding encoding ) : HyphenationTree

Create a hyphenator tree

GetHyphenationTree ( Stream hyphenationSource ) : HyphenationTree

Create a hyphenator tree

GetHyphenationTree ( Stream hyphenationSource, Encoding encoding ) : HyphenationTree

Create a hyphenator tree

GetHyphenationTree ( string hyphenationFilename ) : HyphenationTree

Create a hyphenator tree

GetHyphenationTree ( string hyphenationFilename, Encoding encoding ) : HyphenationTree

Create a hyphenator tree

HyphenationCompoundWordTokenFilter ( LuceneVersion matchVersion, TokenStream input, HyphenationTree hyphenator )

Create a HyphenationCompoundWordTokenFilter with no dictionary.

Calls HyphenationCompoundWordTokenFilter(matchVersion, input, hyphenator, DEFAULT_MIN_WORD_SIZE, DEFAULT_MIN_SUBWORD_SIZE, DEFAULT_MAX_SUBWORD_SIZE).

HyphenationCompoundWordTokenFilter ( LuceneVersion matchVersion, TokenStream input, HyphenationTree hyphenator, CharArraySet dictionary )

Creates a new HyphenationCompoundWordTokenFilter instance.

HyphenationCompoundWordTokenFilter ( LuceneVersion matchVersion, TokenStream input, HyphenationTree hyphenator, CharArraySet dictionary, int minWordSize, int minSubwordSize, int maxSubwordSize, bool onlyLongestMatch )

Creates a new HyphenationCompoundWordTokenFilter instance.

HyphenationCompoundWordTokenFilter ( LuceneVersion matchVersion, TokenStream input, HyphenationTree hyphenator, int minWordSize, int minSubwordSize, int maxSubwordSize )

Create a HyphenationCompoundWordTokenFilter with no dictionary.

Calls HyphenationCompoundWordTokenFilter(matchVersion, input, hyphenator, null, minWordSize, minSubwordSize, maxSubwordSize, false), passing null as the dictionary.

Protected methods

Method Description
Decompose ( ) : void

Method details

Decompose() protected method

protected Decompose ( ) : void
Returns void

GetHyphenationTree() public static method

Create a hyphenator tree
Throws IOException if there is a low-level I/O error.
public static GetHyphenationTree ( FileInfo hyphenationFile ) : HyphenationTree
hyphenationFile System.IO.FileInfo the file of the XML grammar to load
Returns Lucene.Net.Analysis.Compound.Hyphenation.HyphenationTree

GetHyphenationTree() public static method

Create a hyphenator tree
Throws IOException if there is a low-level I/O error.
public static GetHyphenationTree ( FileInfo hyphenationFile, Encoding encoding ) : HyphenationTree
hyphenationFile System.IO.FileInfo the file of the XML grammar to load
encoding Encoding
Returns Lucene.Net.Analysis.Compound.Hyphenation.HyphenationTree

GetHyphenationTree() public static method

Create a hyphenator tree
Throws IOException if there is a low-level I/O error.
public static GetHyphenationTree ( Stream hyphenationSource ) : HyphenationTree
hyphenationSource System.IO.Stream the input stream containing the XML grammar
Returns Lucene.Net.Analysis.Compound.Hyphenation.HyphenationTree

GetHyphenationTree() public static method

Create a hyphenator tree
Throws IOException if there is a low-level I/O error.
public static GetHyphenationTree ( Stream hyphenationSource, Encoding encoding ) : HyphenationTree
hyphenationSource System.IO.Stream the input stream containing the XML grammar
encoding Encoding
Returns Lucene.Net.Analysis.Compound.Hyphenation.HyphenationTree

GetHyphenationTree() public static method

Create a hyphenator tree
Throws IOException if there is a low-level I/O error.
public static GetHyphenationTree ( string hyphenationFilename ) : HyphenationTree
hyphenationFilename string the filename of the XML grammar to load
Returns Lucene.Net.Analysis.Compound.Hyphenation.HyphenationTree

GetHyphenationTree() public static method

Create a hyphenator tree
Throws IOException if there is a low-level I/O error.
public static GetHyphenationTree ( string hyphenationFilename, Encoding encoding ) : HyphenationTree
hyphenationFilename string the filename of the XML grammar to load
encoding Encoding
Returns Lucene.Net.Analysis.Compound.Hyphenation.HyphenationTree

HyphenationCompoundWordTokenFilter() public constructor

Create a HyphenationCompoundWordTokenFilter with no dictionary.

Calls HyphenationCompoundWordTokenFilter(matchVersion, input, hyphenator, DEFAULT_MIN_WORD_SIZE, DEFAULT_MIN_SUBWORD_SIZE, DEFAULT_MAX_SUBWORD_SIZE).

public HyphenationCompoundWordTokenFilter ( LuceneVersion matchVersion, TokenStream input, HyphenationTree hyphenator )
matchVersion LuceneVersion
input TokenStream
hyphenator Lucene.Net.Analysis.Compound.Hyphenation.HyphenationTree

HyphenationCompoundWordTokenFilter() public constructor

Creates a new HyphenationCompoundWordTokenFilter instance.
public HyphenationCompoundWordTokenFilter ( LuceneVersion matchVersion, TokenStream input, HyphenationTree hyphenator, CharArraySet dictionary )
matchVersion LuceneVersion Lucene version to enable correct Unicode 4.0 behavior in the dictionaries if Version > 3.0. See CompoundWordTokenFilterBase for details.
input TokenStream the TokenStream to process
hyphenator Lucene.Net.Analysis.Compound.Hyphenation.HyphenationTree the hyphenation pattern tree to use for hyphenation
dictionary CharArraySet the word dictionary to match against

HyphenationCompoundWordTokenFilter() public constructor

Creates a new HyphenationCompoundWordTokenFilter instance.
public HyphenationCompoundWordTokenFilter ( LuceneVersion matchVersion, TokenStream input, HyphenationTree hyphenator, CharArraySet dictionary, int minWordSize, int minSubwordSize, int maxSubwordSize, bool onlyLongestMatch )
matchVersion LuceneVersion Lucene version to enable correct Unicode 4.0 behavior in the dictionaries if Version > 3.0. See CompoundWordTokenFilterBase for details.
input TokenStream the TokenStream to process
hyphenator Lucene.Net.Analysis.Compound.Hyphenation.HyphenationTree the hyphenation pattern tree to use for hyphenation
dictionary CharArraySet the word dictionary to match against
minWordSize int only words longer than this get processed
minSubwordSize int only subwords longer than this get to the output stream
maxSubwordSize int only subwords shorter than this get to the output stream
onlyLongestMatch bool add only the longest matching subword to the stream
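To make the interplay of minSubwordSize, maxSubwordSize, dictionary, and onlyLongestMatch more concrete, here is a simplified, self-contained sketch. It is not the library's implementation: the class `DecomposeSketch`, the method `Candidates`, and the explicit `breaks` array (standing in for the hyphenation points a HyphenationTree would produce) are hypothetical, and the size limits are treated as inclusive bounds, which is an assumption of this sketch.

```csharp
using System;
using System.Collections.Generic;

public static class DecomposeSketch
{
    // Returns the subwords of `word` that lie between two break positions and
    // pass the size and dictionary checks. `breaks` must be sorted ascending
    // and should include 0 and word.Length.
    public static List<string> Candidates(
        string word, int[] breaks, ISet<string> dictionary,
        int minSubwordSize, int maxSubwordSize, bool onlyLongestMatch)
    {
        var result = new List<string>();
        for (int i = 0; i < breaks.Length; i++)
        {
            string longest = null;
            for (int j = i + 1; j < breaks.Length; j++)
            {
                int length = breaks[j] - breaks[i];
                if (length > maxSubwordSize) break;    // parts only get longer as j grows
                if (length < minSubwordSize) continue; // too short, try the next break
                string part = word.Substring(breaks[i], length);

                // A null dictionary accepts every part; lookups are lower-cased here.
                if (dictionary != null && !dictionary.Contains(part.ToLowerInvariant()))
                    continue;

                if (onlyLongestMatch)
                    longest = part;   // later hits are longer, so keep the last one
                else
                    result.Add(part);
            }
            if (longest != null) result.Add(longest);
        }
        return result;
    }
}
```

Calling `Candidates("Donaudampfschiff", new[] { 0, 5, 10, 16 }, dict, 2, 15, false)` with a dictionary containing "donau", "dampf", and "schiff" yields Donau, dampf, schiff. Adding "dampfschiff" to the dictionary and passing `true` for onlyLongestMatch replaces dampf with dampfschiff, since only the longest hit per start position is kept.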

HyphenationCompoundWordTokenFilter() public constructor

Create a HyphenationCompoundWordTokenFilter with no dictionary.

Calls HyphenationCompoundWordTokenFilter(matchVersion, input, hyphenator, null, minWordSize, minSubwordSize, maxSubwordSize, false), passing null as the dictionary.

public HyphenationCompoundWordTokenFilter ( LuceneVersion matchVersion, TokenStream input, HyphenationTree hyphenator, int minWordSize, int minSubwordSize, int maxSubwordSize )
matchVersion LuceneVersion
input TokenStream
hyphenator Lucene.Net.Analysis.Compound.Hyphenation.HyphenationTree
minWordSize int
minSubwordSize int
maxSubwordSize int