C# Class Lucene.Net.Analysis.Cn.ChineseFilter

A TokenFilter with a stop word table.
  • Numeric tokens are removed.
  • English tokens must be longer than one character.
  • Each Chinese character is treated as one Chinese word.
TO DO:
  1. Add Chinese stop words, such as \ue400
  2. Dictionary based Chinese word extraction
  3. Intelligent Chinese word extraction
Inheritance: Lucene.Net.Analysis.TokenFilter
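
The filtering rules listed above can be sketched as a standalone predicate. This is a minimal illustration and not the Lucene.Net implementation: the class name ChineseFilterSketch, the method Keep, and the three-word stop list are hypothetical, and the sketch assumes that a token's kind can be decided from the Unicode category of its first character and that stop word matching is case-sensitive.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class ChineseFilterSketch
{
    // Hypothetical stop list for illustration only; it does not
    // reproduce the contents of ChineseFilter.STOP_WORDS.
    static readonly HashSet<string> StopWords =
        new HashSet<string> { "and", "the", "of" };

    // Returns true if the token would survive the three documented rules.
    static bool Keep(string term)
    {
        if (string.IsNullOrEmpty(term) || StopWords.Contains(term))
            return false;

        // Assumption: classify the token by its first character.
        switch (char.GetUnicodeCategory(term[0]))
        {
            case UnicodeCategory.LowercaseLetter:
            case UnicodeCategory.UppercaseLetter:
                // English tokens must be longer than one character.
                return term.Length > 1;
            case UnicodeCategory.OtherLetter:
                // A single Chinese character counts as one word.
                return true;
            default:
                // Numeric tokens (and anything else) are removed.
                return false;
        }
    }

    static void Main()
    {
        foreach (var t in new[] { "the", "a", "lucene", "42", "中" })
            Console.WriteLine($"{t}: {Keep(t)}");
    }
}
```

In this sketch "the" is dropped as a stop word, "a" is dropped for being a single-letter English token, "42" is dropped as numeric, and "lucene" and "中" are kept.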
Project: apache/lucenenet

Public Properties

Property Type Description
STOP_WORDS string[]

Public Methods

Method Description
ChineseFilter ( TokenStream @in )
IncrementToken ( ) : bool

Method Details

ChineseFilter() public constructor

public ChineseFilter ( TokenStream @in )
@in Lucene.Net.Analysis.TokenStream

IncrementToken() public method

public IncrementToken ( ) : bool
return bool

Property Details

STOP_WORDS public static property

public static string[] STOP_WORDS
return string[]