C# Class Lucene.Net.Analysis.Pattern.PatternTokenizerFactory

Factory for PatternTokenizer. This tokenizer uses regex pattern matching to construct distinct tokens for the input stream. It takes two arguments: "pattern" and "group".

  • "pattern" is the regular expression.
  • "group" says which group to extract into tokens.

group=-1 (the default) is equivalent to "split". In this case, the tokens are the same as the output of Java's String#split(java.lang.String), except that empty tokens are omitted.
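
The split behavior can be approximated with plain System.Text.RegularExpressions. The following is a minimal sketch, not the Lucene.Net implementation: the class SplitModeSketch, the helper SplitTokens, and the sample pattern and input are all hypothetical names chosen for illustration. It assumes a pattern without capturing groups, because Regex.Split also returns captured text when the pattern contains them.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SplitModeSketch
{
    // Hypothetical helper approximating group = -1:
    // split the input on every match of the pattern and drop zero-length pieces.
    // Assumes the pattern has no capturing groups (Regex.Split would
    // otherwise include the captured text in its result).
    public static List<string> SplitTokens(string input, Regex pattern)
    {
        var tokens = new List<string>();
        foreach (string piece in pattern.Split(input))
        {
            if (piece.Length > 0)
            {
                tokens.Add(piece);
            }
        }
        return tokens;
    }

    public static void Main()
    {
        // Illustrative delimiter: a comma with optional surrounding whitespace.
        var pattern = new Regex(@"\s*,\s*");
        List<string> tokens = SplitTokens("aaa, bbb,, ccc ,", pattern);
        Console.WriteLine(string.Join("|", tokens)); // prints: aaa|bbb|ccc
    }
}
```

The doubled comma and the trailing comma both produce empty pieces in the raw split result; filtering them out mirrors the "without empty tokens" behavior described above.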

Using group >= 0 selects the matching group as the token. For example, if you have:

 pattern = \'([^\']+)\'
 group = 0
 input = aaa 'bbb' 'ccc'

the output will be two tokens: 'bbb' and 'ccc' (including the ' marks). With the same input but using group=1, the output would be: bbb and ccc (no ' marks).
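
The effect of group = 0 versus group = 1 can be reproduced with plain System.Text.RegularExpressions. This is a minimal sketch under stated assumptions, not the Lucene.Net implementation: the class GroupModeSketch and the helper GroupTokens are hypothetical, and the pattern is written without the backslash escapes because a C# verbatim string does not need them.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GroupModeSketch
{
    // Hypothetical helper approximating group >= 0:
    // each match contributes the text of the selected group as one token,
    // and zero-length values are skipped.
    public static List<string> GroupTokens(string input, Regex pattern, int group)
    {
        var tokens = new List<string>();
        foreach (Match match in pattern.Matches(input))
        {
            string value = match.Groups[group].Value;
            if (value.Length > 0)
            {
                tokens.Add(value);
            }
        }
        return tokens;
    }

    public static void Main()
    {
        var pattern = new Regex(@"'([^']+)'");
        string input = "aaa 'bbb' 'ccc'";

        // Group 0 is the whole match, so the quote marks are kept.
        Console.WriteLine(string.Join(" ", GroupTokens(input, pattern, 0))); // prints: 'bbb' 'ccc'

        // Group 1 is the text inside the parentheses, so the quote marks are dropped.
        Console.WriteLine(string.Join(" ", GroupTokens(input, pattern, 1))); // prints: bbb ccc
    }
}
```

Note that aaa never appears in either result: in group mode only text covered by a match becomes a token.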

NOTE: This Tokenizer does not output tokens that are of zero length.

 <fieldType name="text_ptn" class="solr.TextField" positionIncrementGap="100">
   <analyzer>
     <tokenizer class="solr.PatternTokenizerFactory" pattern="\'([^\']+)\'" group="1"/>
   </analyzer>
 </fieldType>
Inheritance: Lucene.Net.Analysis.Util.TokenizerFactory

Protected properties

Property Type Description
group int
pattern System.Text.RegularExpressions.Regex

Public methods

Method Description
Create ( Lucene.Net.Util.AttributeSource factory, TextReader input ) : Tokenizer

Splits the input using the configured pattern

PatternTokenizerFactory ( IDictionary<string, string> args )

Creates a new PatternTokenizerFactory

Method details

Create() public method

Splits the input using the configured pattern
public Create ( Lucene.Net.Util.AttributeSource factory, TextReader input ) : Tokenizer
factory Lucene.Net.Util.AttributeSource
input System.IO.TextReader
Returns Tokenizer

PatternTokenizerFactory() public method

Creates a new PatternTokenizerFactory
public PatternTokenizerFactory ( IDictionary<string, string> args )
args IDictionary<string, string>

Property details

group protected property

protected int group
Returns int

pattern protected property

protected System.Text.RegularExpressions.Regex pattern
Returns System.Text.RegularExpressions.Regex