C# Class Lucene.Net.Analysis.Th.ThaiWordBreaker

LUCENENET-specific class to patch the behavior of the ICU BreakIterator. Corrects the breaking of words by finding transitions between Thai and non-Thai characters.

This logic assumes that the Java BreakIterator also separates Thai numerals from Arabic numerals (1, 2, 3, etc.). That is, it assumes that in Lucene the first test below passes and the second test fails (not attempted).

    ThaiAnalyzer analyzer = new ThaiAnalyzer(TEST_VERSION_CURRENT, CharArraySet.EMPTY_SET);
    AssertAnalyzesTo(analyzer, "๑๒๓456", new string[] { "๑๒๓", "456" });
    AssertAnalyzesTo(analyzer, "๑๒๓456", new string[] { "๑๒๓456" });
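The transition idea described above can be illustrated with a minimal, self-contained sketch. This is not the ThaiWordBreaker implementation and it does not use ICU. The class name `ThaiTransitionSketch`, the helper `IsThai`, and the method `SplitAtTransitions` are hypothetical names introduced only for illustration, and the sketch assumes that "Thai" means any character in the Unicode Thai block (U+0E00 to U+0E7F), which includes the Thai digits.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; not part of Lucene.NET.
public static class ThaiTransitionSketch
{
    // Assumption: a character counts as Thai if it lies in the
    // Unicode Thai block, U+0E00 through U+0E7F.
    private static bool IsThai(char c)
    {
        return c >= '\u0E00' && c <= '\u0E7F';
    }

    // Splits the text at every position where a Thai character is
    // followed by a non-Thai character, or the other way around.
    public static List<string> SplitAtTransitions(string text)
    {
        var parts = new List<string>();
        if (string.IsNullOrEmpty(text))
        {
            return parts;
        }

        int start = 0;
        for (int i = 1; i < text.Length; i++)
        {
            // A boundary exists where the Thai/non-Thai status changes.
            if (IsThai(text[i]) != IsThai(text[i - 1]))
            {
                parts.Add(text.Substring(start, i - start));
                start = i;
            }
        }

        // Add the final run after the last boundary.
        parts.Add(text.Substring(start));
        return parts;
    }
}
```

Calling `ThaiTransitionSketch.SplitAtTransitions("๑๒๓456")` yields two parts, "๑๒๓" and "456", which matches the first test above. A real word breaker would apply this kind of check on top of the boundaries already reported by the underlying BreakIterator rather than on the raw string alone.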
Project: apache/lucenenet

Public Methods

Method Description
Current ( ) : int
Next ( ) : int
SetText ( string text ) : void
ThaiWordBreaker ( ICU4NET.BreakIterator wordBreaker )

Private Methods

Method Description
GetNext ( ) : int

Method Details

Current() public method

public Current ( ) : int
Returns int

Next() public method

public Next ( ) : int
Returns int

SetText() public method

public SetText ( string text ) : void
text string
Returns void

ThaiWordBreaker() public constructor

public ThaiWordBreaker ( ICU4NET.BreakIterator wordBreaker )
wordBreaker ICU4NET.BreakIterator