Lucene.Net.Analysis.Standard.Std31 |
Lucene.Net.Analysis.Standard.Std34 |
Lucene.Net.Analysis.Standard.Std36 |
Lucene.Net.Analysis.Standard.Std40 |
Name | Description |
---|---|
ClassicFilter | Normalizes tokens extracted with ClassicTokenizer. |
ClassicFilterFactory | Factory for ClassicFilter. <fieldType name="text_clssc" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.ClassicTokenizerFactory"/> <filter class="solr.ClassicFilterFactory"/> </analyzer> </fieldType> |
ClassicTokenizer | A grammar-based tokenizer constructed with JFlex. This should be a good tokenizer for most European-language documents. Many applications have specific tokenizer needs. If this tokenizer does not suit your application, please consider copying this source code directory to your project and maintaining your own grammar-based tokenizer. ClassicTokenizer was named StandardTokenizer in Lucene versions prior to 3.1. As of 3.1, StandardTokenizer implements Unicode text segmentation, as specified by UAX#29. |
ClassicTokenizerFactory | Factory for ClassicTokenizer. <fieldType name="text_clssc" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.ClassicTokenizerFactory" maxTokenLength="120"/> </analyzer> </fieldType> |
ClassicTokenizerImpl | This class implements the classic Lucene StandardTokenizer as it existed up to version 3.0. |
StandardAnalyzer | Filters StandardTokenizer with StandardFilter, LowerCaseFilter and StopFilter, using a list of English stop words. |
StandardAnalyzer.SavedStreams | |
StandardAnalyzer.TokenStreamComponentsAnonymousInnerClassHelper | |
StandardFilter | Normalizes tokens extracted with StandardTokenizer. |
StandardFilterFactory | Factory for StandardFilter. <fieldType name="text_stndrd" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.StandardFilterFactory"/> </analyzer> </fieldType> |
StandardTokenizer | A grammar-based tokenizer constructed with JFlex. This should be a good tokenizer for most European-language documents. Many applications have specific tokenizer needs. If this tokenizer does not suit your application, please consider copying this source code directory to your project and maintaining your own grammar-based tokenizer. |
StandardTokenizerFactory | Factory for StandardTokenizer. <fieldType name="text_stndrd" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory" maxTokenLength="255"/> </analyzer> </fieldType> |
StandardTokenizerImpl | This class is a scanner generated by JFlex 1.4.1 from the specification file StandardTokenizerImpl.jflex. |
TestStandardFactories | Simple tests to ensure the standard Lucene factories are working. |
TestUAX29URLEmailTokenizerFactory | A few tests based on org.apache.lucene.analysis.TestUAX29URLEmailTokenizer |
UAX29URLEmailAnalyzer | Filters org.apache.lucene.analysis.standard.UAX29URLEmailTokenizer with StandardFilter, LowerCaseFilter and StopFilter, using a list of English stop words. You must specify the required org.apache.lucene.util.Version compatibility when creating a UAX29URLEmailAnalyzer. |
UAX29URLEmailAnalyzer.TokenStreamComponentsAnonymousInnerClassHelper | |
UAX29URLEmailTokenizerFactory | Factory for UAX29URLEmailTokenizer. <fieldType name="text_urlemail" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.UAX29URLEmailTokenizerFactory" maxTokenLength="255"/> </analyzer> </fieldType> |
UAX29URLEmailTokenizerImpl | This class implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. URLs and email addresses are also tokenized according to the relevant RFCs. Tokens produced are of the following types: |
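
The factory entries above each show a single-element Solr analyzer. As a sketch, the tokenizer and filter factories from this table can be combined into one chain that mirrors what StandardAnalyzer is described as doing (tokenize, normalize with StandardFilter, lower-case, remove stop words). The field type name and the `stopwords.txt` file are illustrative assumptions; `solr.LowerCaseFilterFactory` and `solr.StopFilterFactory` are standard Solr factories not listed in this table:

```xml
<!-- Illustrative field type combining the factories described above;
     name and stopword file are placeholders, not from this listing. -->
<fieldType name="text_standard_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory" maxTokenLength="255"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
  </analyzer>
</fieldType>
```

Filters run in the order declared, so lower-casing happens before stop-word removal, matching the filter order given in the StandardAnalyzer description.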