C# Class Lucene.Net.Analysis.Th.ThaiTokenizer

Tokenizer that uses a BreakIterator to tokenize Thai text.

WARNING: this tokenizer may not be supported by all JREs. It is known to work with Sun/Oracle and Harmony JREs. If your application needs to be fully portable, consider using ICUTokenizer instead, which uses an ICU Thai BreakIterator that will always be available.

Inheritance: Lucene.Net.Analysis.Util.SegmentingTokenizerBase
Project: apache/lucenenet

Public Properties

Property Type Description
DBBI_AVAILABLE bool

Public Methods

Method Description
ThaiTokenizer ( AttributeFactory factory, TextReader reader )

Creates a new ThaiTokenizer, supplying the AttributeFactory

ThaiTokenizer ( TextReader reader )

Creates a new ThaiTokenizer

Protected Methods

Method Description
IncrementWord ( ) : bool
SetNextSentence ( int sentenceStart, int sentenceEnd ) : void

Private Methods

Method Description
ThaiTokenizer ( )

Method Details

IncrementWord() protected method

protected IncrementWord ( ) : bool
return bool

SetNextSentence() protected method

protected SetNextSentence ( int sentenceStart, int sentenceEnd ) : void
sentenceStart int
sentenceEnd int
return void

ThaiTokenizer() public method

Creates a new ThaiTokenizer, supplying the AttributeFactory
public ThaiTokenizer ( AttributeFactory factory, TextReader reader )
factory AttributeFactory
reader System.IO.TextReader

ThaiTokenizer() public method

Creates a new ThaiTokenizer
public ThaiTokenizer ( TextReader reader )
reader System.IO.TextReader
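
Usage sketch

The following is a minimal sketch of how the single-argument constructor might be used to split a Thai string into terms. It assumes a Lucene.Net 4.8-style API in which ICharTermAttribute lives in Lucene.Net.Analysis.TokenAttributes and token streams are disposable; older releases may use different namespaces. The class ThaiTokenizerExample and its Tokenize helper are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis.Th;
using Lucene.Net.Analysis.TokenAttributes; // assumed namespace (Lucene.Net 4.8)

// Hypothetical helper class, not part of Lucene.Net.
public static class ThaiTokenizerExample
{
    // Runs the text through ThaiTokenizer and collects each term as a string.
    public static IList<string> Tokenize(string text)
    {
        var tokens = new List<string>();
        using (var reader = new StringReader(text))
        using (var tokenizer = new ThaiTokenizer(reader))
        {
            // The term attribute exposes the text of the current token.
            ICharTermAttribute term = tokenizer.AddAttribute<ICharTermAttribute>();

            tokenizer.Reset();                  // required before the first IncrementToken()
            while (tokenizer.IncrementToken())  // advances to the next token, false at end of input
            {
                tokens.Add(term.ToString());
            }
            tokenizer.End();                    // finalizes end-of-stream state
        }
        return tokens;
    }

    public static void Main()
    {
        foreach (string token in Tokenize("การที่ได้ต้องแสดงว่างานดี"))
        {
            Console.WriteLine(token);
        }
    }
}
```

The Reset / IncrementToken / End sequence is the usual token stream consumer workflow. The exact word boundaries depend on the break iterator available at run time, so no expected output is shown.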

Property Details

DBBI_AVAILABLE public static property

True if the JRE supports a working dictionary-based BreakIterator for Thai. If this is false, this tokenizer will not work at all!
public static bool DBBI_AVAILABLE
return bool
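
Because the tokenizer does not work when DBBI_AVAILABLE is false, callers may want to check the flag before constructing one. The sketch below shows one possible guard. The class ThaiTokenizerGuard and its Create method are hypothetical names, and throwing NotSupportedException is an illustrative choice rather than the library's own behavior.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis.Th;

// Hypothetical helper class, not part of Lucene.Net.
public static class ThaiTokenizerGuard
{
    // Returns a ThaiTokenizer only when dictionary-based Thai segmentation is reported as available.
    public static ThaiTokenizer Create(TextReader reader)
    {
        if (reader == null)
        {
            throw new ArgumentNullException(nameof(reader));
        }

        if (!ThaiTokenizer.DBBI_AVAILABLE)
        {
            // Assumption: fail fast instead of producing unsegmented output.
            throw new NotSupportedException(
                "No working dictionary-based break iterator for Thai is available.");
        }

        return new ThaiTokenizer(reader);
    }
}
```

An application that needs to keep running without Thai segmentation could substitute a different tokenizer at this point instead of throwing.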