Property | Type | Description
---|---|---
DoNext | bool |
FlushBigram | void |
FlushUnigram | void |
Refill | void |

Method | Description
---|---
CJKBigramFilter ( TokenStream @in ) | Calls CJKBigramFilter(@in, HAN \| HIRAGANA \| KATAKANA \| HANGUL).
CJKBigramFilter ( TokenStream @in, int flags ) | Calls CJKBigramFilter(@in, flags, false).
CJKBigramFilter ( TokenStream @in, int flags, bool outputUnigrams ) | Creates a new CJKBigramFilter, specifying which writing systems should be bigrammed and whether unigrams should also be output.
IncrementToken ( ) : bool |
Reset ( ) : void |

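
The constructor descriptions above form a delegation chain: each shorter overload fills in a default and calls the next one. The following is a minimal sketch of that pattern using a stand-in class, not the Lucene.Net type itself. The class name `BigramConfigSketch`, the `IsBigrammed` helper, and the numeric flag values are assumptions made for illustration; the listing names the flags but does not give their values, and the `TokenStream` argument is omitted to keep the sketch self-contained.

```csharp
using System;

// Stand-in type for illustration only; this is NOT the Lucene.Net class.
public sealed class BigramConfigSketch
{
    // Assumed bit values: the listing names these flags but not their values.
    public const int HAN = 1, HIRAGANA = 2, KATAKANA = 4, HANGUL = 8;

    public int Flags { get; }
    public bool OutputUnigrams { get; }

    // Mirrors CJKBigramFilter(@in): all four writing systems are selected.
    public BigramConfigSketch() : this(HAN | HIRAGANA | KATAKANA | HANGUL) { }

    // Mirrors CJKBigramFilter(@in, flags): unigram output defaults to false.
    public BigramConfigSketch(int flags) : this(flags, false) { }

    // Mirrors CJKBigramFilter(@in, flags, outputUnigrams).
    public BigramConfigSketch(int flags, bool outputUnigrams)
    {
        Flags = flags;
        OutputUnigrams = outputUnigrams;
    }

    // Hypothetical helper: true when the given writing system is selected.
    public bool IsBigrammed(int script) => (Flags & script) != 0;
}

public static class Program
{
    public static void Main()
    {
        var all = new BigramConfigSketch();
        var hanOnly = new BigramConfigSketch(BigramConfigSketch.HAN);

        Console.WriteLine(all.IsBigrammed(BigramConfigSketch.HANGUL));     // True
        Console.WriteLine(hanOnly.IsBigrammed(BigramConfigSketch.HANGUL)); // False
        Console.WriteLine(hanOnly.OutputUnigrams);                         // False
    }
}
```

Because the flags are combined with bitwise OR, a caller can select any subset of writing systems in a single `int` argument.
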
Method | Description
---|---
DoNext ( ) : bool | Looks at the next input token, returning false if none is available.
FlushBigram ( ) : void | Flushes a bigram token to output from our buffer. This is the normal case, e.g. ABC -> AB BC.
FlushUnigram ( ) : void | Flushes a unigram token to output from our buffer. This happens when we encounter an isolated CJK character: either the whole CJK string is a single character, or the character is surrounded by spaces, punctuation, English text, etc., with no other CJK character beside it.
Refill ( ) : void | Refills buffers with new data from the current token.

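
The two flush cases described above can be sketched on a plain string. This is a simplified model, not the filter's actual buffer logic: the `BigramSketch` class and its `Tokenize` method are hypothetical names, the input is assumed to be one uninterrupted run of CJK characters, and each character is assumed to fit in a single UTF-16 code unit.

```csharp
using System;
using System.Collections.Generic;

public static class BigramSketch
{
    // Hypothetical helper: tokens for one uninterrupted run of CJK characters.
    // Assumes one UTF-16 code unit per character (no surrogate pairs).
    public static List<string> Tokenize(string run)
    {
        var output = new List<string>();
        if (run.Length == 1)
        {
            // Isolated character: corresponds to the FlushUnigram case.
            output.Add(run);
            return output;
        }
        // Normal case: overlapping pairs, corresponding to FlushBigram.
        for (int i = 0; i + 1 < run.Length; i++)
        {
            output.Add(run.Substring(i, 2));
        }
        return output;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", Tokenize("ABC"))); // AB BC
        Console.WriteLine(string.Join(" ", Tokenize("A")));   // A
    }
}
```
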
public CJKBigramFilter ( TokenStream @in )

Parameter | Type | Description
---|---|---
@in | TokenStream |

public CJKBigramFilter ( TokenStream @in, int flags )

Parameter | Type | Description
---|---|---
@in | TokenStream |
flags | int |

public CJKBigramFilter ( TokenStream @in, int flags, bool outputUnigrams )

Parameter | Type | Description
---|---|---
@in | TokenStream |
flags | int | OR'ed set from
outputUnigrams | bool | true if unigrams for the selected writing systems should also be output. When this is false, unigrams are output only when there are no adjacent characters to form a bigram.
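
The effect of `outputUnigrams` can be illustrated with a small model that works on a plain string. It is a sketch under stated assumptions, not the filter's implementation: `UnigramFlagSketch` and `Tokenize` are hypothetical names, the input is one run of adjacent CJK characters with one UTF-16 code unit each, and the ordering (each unigram emitted before the bigram that starts at the same character) is an assumption rather than something this listing specifies.

```csharp
using System;
using System.Collections.Generic;

public static class UnigramFlagSketch
{
    // Hypothetical helper: tokens for one run of adjacent CJK characters.
    // Assumes one UTF-16 code unit per character (no surrogate pairs).
    public static List<string> Tokenize(string run, bool outputUnigrams)
    {
        var output = new List<string>();
        for (int i = 0; i < run.Length; i++)
        {
            bool hasNext = i + 1 < run.Length;

            // A unigram is emitted when requested, or when the run is a
            // single character and no bigram can be formed.
            if (outputUnigrams || run.Length == 1)
            {
                output.Add(run.Substring(i, 1));
            }

            // A bigram is emitted whenever a following character exists.
            if (hasNext)
            {
                output.Add(run.Substring(i, 2));
            }
        }
        return output;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", Tokenize("ABC", false))); // AB BC
        Console.WriteLine(string.Join(" ", Tokenize("ABC", true)));  // A AB B BC C
        Console.WriteLine(string.Join(" ", Tokenize("A", false)));   // A
    }
}
```

With `outputUnigrams` set to false, single characters appear in the output only as a fallback for isolated characters; with it set to true, every character is also indexed on its own.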