C# class Lucene.Net.Analysis.Pattern.PatternTokenizerFactory

Factory for PatternTokenizer. This tokenizer uses regex pattern matching to construct distinct tokens for the input stream. It takes two arguments: "pattern" and "group".

  • "pattern" is the regular expression.
  • "group" says which group to extract into tokens.

group=-1 (the default) is equivalent to "split". In this case, the tokens are the same as the output of String#split(java.lang.String), with empty tokens omitted.
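
The split behavior can be illustrated with a minimal sketch that uses only System.Text.RegularExpressions. It emulates the rule described above rather than reproducing the PatternTokenizer implementation: the helper name SplitTokens, the sample pattern, and the sample input are all hypothetical. It walks the matches by hand because Regex.Split would also return the text of any capturing groups in the pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of Lucene.Net): emulates group = -1 by
// emitting the text between matches and skipping zero-length pieces.
static List<string> SplitTokens(string text, Regex separator)
{
    var result = new List<string>();
    int start = 0; // index where the next token would begin
    foreach (Match m in separator.Matches(text))
    {
        if (m.Index > start)
        {
            result.Add(text.Substring(start, m.Index - start));
        }
        start = m.Index + m.Length;
    }
    if (start < text.Length)
    {
        result.Add(text.Substring(start)); // trailing text after the last match
    }
    return result;
}

// Sample data chosen for illustration only.
var splitDemo = SplitTokens("a, b,,c ,", new Regex(@"\s*,\s*"));
Console.WriteLine(string.Join("|", splitDemo)); // prints "a|b|c"
```

The empty piece between the two adjacent commas and the empty piece after the final comma are both dropped, which matches the "without empty tokens" wording.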

Using group >= 0 selects the matching group as the token. For example, if you have:

 pattern = \'([^\']+)\'
 group = 0
 input = aaa 'bbb' 'ccc'

the output will be two tokens: 'bbb' and 'ccc' (including the ' marks). With the same input but using group=1, the output would be: bbb and ccc (no ' marks).
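
The group selection in this example can be reproduced with plain System.Text.RegularExpressions. This is a sketch of the documented behavior, not the PatternTokenizer source; the helper name GroupTokens is hypothetical, and the pattern is written without the backslash escapes because a C# verbatim string does not need them.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of Lucene.Net): emulates group >= 0 by
// emitting the chosen capture group of every match, skipping empty captures.
static List<string> GroupTokens(string text, Regex regex, int group)
{
    var result = new List<string>();
    foreach (Match m in regex.Matches(text))
    {
        string captured = m.Groups[group].Value;
        if (captured.Length > 0)
        {
            result.Add(captured);
        }
    }
    return result;
}

var quoted = new Regex(@"'([^']+)'");
string sample = "aaa 'bbb' 'ccc'";
Console.WriteLine(string.Join(" ", GroupTokens(sample, quoted, 0))); // prints "'bbb' 'ccc'"
Console.WriteLine(string.Join(" ", GroupTokens(sample, quoted, 1))); // prints "bbb ccc"
```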

NOTE: This Tokenizer does not output tokens that are of zero length.

 <fieldType name="text_ptn" class="solr.TextField" positionIncrementGap="100">
   <analyzer>
     <tokenizer class="solr.PatternTokenizerFactory" pattern="\'([^\']+)\'" group="1"/>
   </analyzer>
 </fieldType>
Inherits: Lucene.Net.Analysis.Util.TokenizerFactory
Project: apache/lucenenet

Protected Properties

Property   Type                                    Description
group      int
pattern    System.Text.RegularExpressions.Regex

Public Methods

Method Description
Create ( Lucene.Net.Util.AttributeSource.AttributeFactory factory, TextReader input ) : Tokenizer

Splits the input using the configured pattern.

PatternTokenizerFactory ( IDictionary<string, string> args )

Creates a new PatternTokenizerFactory.

Method Details

Create() public method

Splits the input using the configured pattern.
public Create ( Lucene.Net.Util.AttributeSource.AttributeFactory factory, TextReader input ) : Tokenizer
factory Lucene.Net.Util.AttributeSource.AttributeFactory
input System.IO.TextReader
Returns Tokenizer
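
A usage sketch for the factory follows, assuming the Lucene.Net 4.8 analysis packages are referenced. It mirrors the Solr configuration shown earlier (same pattern, group=1). Two things here are assumptions not shown on this page: the single-argument Create(TextReader) overload inherited from TokenizerFactory, and the ICharTermAttribute type from Lucene.Net.Analysis.TokenAttributes used to read each token. The variable names are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Pattern;
using Lucene.Net.Analysis.TokenAttributes;

// Same settings as the Solr example: pattern '([^']+)' and group 1.
var settings = new Dictionary<string, string>
{
    { "pattern", "'([^']+)'" },
    { "group", "1" },
};
var factory = new PatternTokenizerFactory(settings);

// Assumption: Create(TextReader) is inherited from TokenizerFactory and
// supplies the default attribute factory.
using (Tokenizer tokenizer = factory.Create(new StringReader("aaa 'bbb' 'ccc'")))
{
    ICharTermAttribute term = tokenizer.AddAttribute<ICharTermAttribute>();
    tokenizer.Reset();
    while (tokenizer.IncrementToken())
    {
        Console.WriteLine(term.ToString()); // one token per line
    }
    tokenizer.End();
}
```

The Reset, IncrementToken, End sequence is the usual token stream consumption contract; skipping Reset typically causes an exception on the first IncrementToken call.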

PatternTokenizerFactory() public method

Creates a new PatternTokenizerFactory.
public PatternTokenizerFactory ( IDictionary<string, string> args )
args IDictionary<string, string>

Property Details

group protected property

protected int group
Returns int

pattern protected property

protected System.Text.RegularExpressions.Regex pattern
Returns System.Text.RegularExpressions.Regex