C# Class Lucene.Net.Analysis.MockTokenizer

Tokenizer for testing.

This tokenizer is a replacement for the WHITESPACE, SIMPLE, and KEYWORD tokenizers. If you are writing a component such as a TokenFilter, it is a good idea to test it by wrapping this tokenizer instead, for the extra checks. This tokenizer has the following behavior:

  • An internal state machine checks that the consumer calls the tokenizer consistently. These checks can be disabled (setEnableChecks(boolean) in the original Java API).
  • For convenience, optionally lowercases terms that it outputs.
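The consumer workflow that the state machine checks can be sketched as follows. This is a minimal sketch, not taken from the Lucene.NET sources: it assumes the Lucene.Net test framework assembly is referenced and that ICharTermAttribute lives in Lucene.Net.Analysis.TokenAttributes (older snapshots spell the namespace Tokenattributes). The class MockTokenizerSketch and its Tokenize helper are hypothetical names introduced only for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
// Assumption: current releases use "TokenAttributes"; older snapshots use "Tokenattributes".
using Lucene.Net.Analysis.TokenAttributes;

public static class MockTokenizerSketch
{
    // Hypothetical helper (not part of Lucene.Net): returns the terms that
    // MockTokenizer emits for the given text, split on whitespace and lowercased.
    public static List<string> Tokenize(string text)
    {
        var terms = new List<string>();
        var tokenizer = new MockTokenizer(new StringReader(text), MockTokenizer.WHITESPACE, true);

        // Attributes are requested once, before consumption starts.
        ICharTermAttribute term = tokenizer.AddAttribute<ICharTermAttribute>();

        tokenizer.Reset();                  // must come before the first IncrementToken()
        while (tokenizer.IncrementToken())  // consume until the stream reports exhaustion
        {
            terms.Add(term.ToString());
        }
        tokenizer.End();                    // signal end-of-stream after the last token
        tokenizer.Dispose();                // release the underlying reader

        return terms;
    }
}
```

With checks enabled, skipping Reset() or calling End() before the stream is exhausted should trip the tokenizer's consistency checks, which is what makes it useful for testing a TokenFilter's consumer logic.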
Inherits: Tokenizer
Project: apache/lucenenet

Public Properties

Property Type Description
DEFAULT_MAX_TOKEN_LENGTH int
KEYWORD CharacterRunAutomaton
SIMPLE CharacterRunAutomaton
WHITESPACE CharacterRunAutomaton

Public Methods

Method Description
Dispose ( ) : void
End ( ) : void
IncrementToken ( ) : bool
MockTokenizer ( AttributeFactory factory, TextReader input )

Calls MockTokenizer(AttributeFactory, TextReader, WHITESPACE, true).

MockTokenizer ( AttributeFactory factory, TextReader input, CharacterRunAutomaton runAutomaton, bool lowerCase )
MockTokenizer ( AttributeFactory factory, TextReader input, CharacterRunAutomaton runAutomaton, bool lowerCase, int maxTokenLength )
MockTokenizer ( TextReader input )

Calls MockTokenizer(TextReader, WHITESPACE, true).

MockTokenizer ( TextReader input, CharacterRunAutomaton runAutomaton, bool lowerCase )
MockTokenizer ( TextReader input, CharacterRunAutomaton runAutomaton, bool lowerCase, int maxTokenLength )
Reset ( ) : void

Protected Methods

Method Description
IsTokenChar ( int c ) : bool
Normalize ( int c ) : int
ReadChar ( ) : int
ReadCodePoint ( ) : int

Private Methods

Method Description
SetReaderTestPoint ( ) : bool

Method Details

Dispose() Public Method

public Dispose ( ) : void
Returns void

End() Public Method

public End ( ) : void
Returns void

IncrementToken() Public Sealed Method

public sealed IncrementToken ( ) : bool
Returns bool

IsTokenChar() Protected Method

protected IsTokenChar ( int c ) : bool
c int
Returns bool

MockTokenizer() Public Method

Calls MockTokenizer(AttributeFactory, TextReader, WHITESPACE, true).
public MockTokenizer ( AttributeFactory factory, TextReader input )
factory AttributeFactory
input System.IO.TextReader

MockTokenizer() Public Method

public MockTokenizer ( AttributeFactory factory, TextReader input, CharacterRunAutomaton runAutomaton, bool lowerCase )
factory AttributeFactory
input System.IO.TextReader
runAutomaton CharacterRunAutomaton
lowerCase bool

MockTokenizer() Public Method

public MockTokenizer ( AttributeFactory factory, TextReader input, CharacterRunAutomaton runAutomaton, bool lowerCase, int maxTokenLength )
factory AttributeFactory
input System.IO.TextReader
runAutomaton CharacterRunAutomaton
lowerCase bool
maxTokenLength int

MockTokenizer() Public Method

Calls MockTokenizer(TextReader, WHITESPACE, true).
public MockTokenizer ( TextReader input )
input System.IO.TextReader

MockTokenizer() Public Method

public MockTokenizer ( TextReader input, CharacterRunAutomaton runAutomaton, bool lowerCase )
input System.IO.TextReader
runAutomaton CharacterRunAutomaton
lowerCase bool

MockTokenizer() Public Method

public MockTokenizer ( TextReader input, CharacterRunAutomaton runAutomaton, bool lowerCase, int maxTokenLength )
input System.IO.TextReader
runAutomaton CharacterRunAutomaton
lowerCase bool
maxTokenLength int
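
The TextReader overloads above might be used as in the sketch below. It assumes the Lucene.Net test framework assembly is referenced and that the automata are exposed as static members named WHITESPACE, SIMPLE, and KEYWORD, as listed on this page. The class MockTokenizerOverloads and its method names are hypothetical and exist only for illustration.

```csharp
using System.IO;
using Lucene.Net.Analysis;

public static class MockTokenizerOverloads
{
    // Shortest overload: per the description above, this delegates to
    // (input, WHITESPACE, true), i.e. whitespace splitting with lowercasing.
    public static MockTokenizer Default(string text) =>
        new MockTokenizer(new StringReader(text));

    // KEYWORD automaton without lowercasing: behaves similarly to KeywordTokenizer.
    public static MockTokenizer WholeInput(string text) =>
        new MockTokenizer(new StringReader(text), MockTokenizer.KEYWORD, false);

    // SIMPLE automaton with lowercasing and an explicit maxTokenLength argument.
    public static MockTokenizer Capped(string text, int maxTokenLength) =>
        new MockTokenizer(new StringReader(text), MockTokenizer.SIMPLE, true, maxTokenLength);
}
```

Each returned tokenizer still has to be consumed with the usual Reset(), IncrementToken(), End(), Dispose() sequence.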

Normalize() Protected Method

protected Normalize ( int c ) : int
c int
Returns int

ReadChar() Protected Method

protected ReadChar ( ) : int
Returns int

ReadCodePoint() Protected Method

protected ReadCodePoint ( ) : int
Returns int

Reset() Public Method

public Reset ( ) : void
Returns void

Property Details

DEFAULT_MAX_TOKEN_LENGTH Public Static Property

public static int DEFAULT_MAX_TOKEN_LENGTH
Returns int

KEYWORD Public Static Property

Acts similarly to KeywordTokenizer. TODO: Keyword returns an "empty" token for an empty reader...
public static CharacterRunAutomaton KEYWORD
Returns CharacterRunAutomaton

SIMPLE Public Static Property

Acts like LetterTokenizer.
public static CharacterRunAutomaton SIMPLE
Returns CharacterRunAutomaton

WHITESPACE Public Static Property

Acts similarly to WhitespaceTokenizer.
public static CharacterRunAutomaton WHITESPACE
Returns CharacterRunAutomaton