C# Class Lucene.Net.Analysis.Ar.ArabicAnalyzer

Analyzer for Arabic.

This analyzer implements light stemming as specified in "Light Stemming for Arabic Information Retrieval": http://www.mtholyoke.edu/~lballest/Pubs/arab_stem05.pdf

The analysis package contains three primary components:

  • ArabicNormalizationFilter: Arabic orthographic normalization.
  • ArabicStemFilter: Arabic light stemming.
  • Arabic stop words file: a set of default Arabic stop words.
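To see the combined effect of these components, the following is a minimal sketch that runs a short Arabic phrase through the analyzer and prints the resulting terms. It assumes the Lucene.Net 4.8 API (`GetTokenStream`, `ICharTermAttribute`, `LuceneVersion.LUCENE_48`); the field name "body" and the sample text are illustrative only.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Ar;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

public static class ArabicAnalyzerDemo
{
    public static void Main()
    {
        // Assumption: LUCENE_48 is the compatibility version in use.
        using (Analyzer analyzer = new ArabicAnalyzer(LuceneVersion.LUCENE_48))
        // "body" is a hypothetical field name; the text is sample input.
        using (TokenStream stream = analyzer.GetTokenStream("body", new StringReader("الكتب والمكتبات")))
        {
            ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();

            stream.Reset();                      // required before the first IncrementToken()
            while (stream.IncrementToken())
            {
                // Each term has been normalized, stop-word filtered and light-stemmed.
                Console.WriteLine(term.ToString());
            }
            stream.End();                        // finalize end-of-stream state
        }
    }
}
```

The same analyzer instance should be used at both index time and query time so that indexed terms and query terms are normalized and stemmed identically.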

Inheritance: Lucene.Net.Analysis.Util.StopwordAnalyzerBase

Public Methods

Method Description
ArabicAnalyzer ( LuceneVersion matchVersion )

Builds an analyzer with the default stop words: DEFAULT_STOPWORD_FILE.

ArabicAnalyzer ( LuceneVersion matchVersion, CharArraySet stopwords )

Builds an analyzer with the given stop words.

ArabicAnalyzer ( LuceneVersion matchVersion, CharArraySet stopwords, CharArraySet stemExclusionSet )

Builds an analyzer with the given stop words. If a non-empty stem exclusion set is provided, this analyzer adds a SetKeywordMarkerFilter before ArabicStemFilter, so that the listed terms are not stemmed.

CreateComponents ( string fieldName, TextReader reader ) : TokenStreamComponents

Creates the TokenStreamComponents used to tokenize all the text in the provided TextReader.

Method Details

ArabicAnalyzer() public method

Builds an analyzer with the default stop words: DEFAULT_STOPWORD_FILE.
public ArabicAnalyzer ( LuceneVersion matchVersion )
matchVersion LuceneVersion /// Lucene compatibility version

ArabicAnalyzer() public method

Builds an analyzer with the given stop words.
public ArabicAnalyzer ( LuceneVersion matchVersion, CharArraySet stopwords )
matchVersion LuceneVersion /// Lucene compatibility version
stopwords CharArraySet /// a stopword set

ArabicAnalyzer() public method

Builds an analyzer with the given stop words. If a non-empty stem exclusion set is provided, this analyzer adds a SetKeywordMarkerFilter before ArabicStemFilter, so that the listed terms are not stemmed.
public ArabicAnalyzer ( LuceneVersion matchVersion, CharArraySet stopwords, CharArraySet stemExclusionSet )
matchVersion LuceneVersion /// Lucene compatibility version
stopwords CharArraySet /// a stopword set
stemExclusionSet CharArraySet /// a set of terms not to be stemmed
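
As an illustration of the three-argument constructor, here is a minimal sketch that builds an analyzer with a custom stop word set and a stem exclusion set. It assumes the Lucene.Net 4.8 `CharArraySet` constructor that takes a version, a collection of strings and an ignore-case flag; the class name `ArabicAnalyzerFactory` and the word lists are hypothetical examples, not recommended defaults.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Ar;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

// Hypothetical helper class, for illustration only.
public static class ArabicAnalyzerFactory
{
    public static Analyzer Create()
    {
        // Assumption: LUCENE_48 is the compatibility version in use.
        var version = LuceneVersion.LUCENE_48;

        // Example stop words; ignoreCase is false because Arabic script has no letter case.
        var stopwords = new CharArraySet(version, new[] { "في", "من", "على" }, false);

        // Example terms that should reach the index unstemmed.
        var stemExclusions = new CharArraySet(version, new[] { "القاهرة" }, false);

        // Because the exclusion set is non-empty, the analyzer marks these terms
        // as keywords before the stem filter runs, so they are left unstemmed.
        return new ArabicAnalyzer(version, stopwords, stemExclusions);
    }
}
```

Passing an empty exclusion set should behave like the two-argument constructor, since the keyword marker filter is only added for a non-empty set.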

CreateComponents() public method

Creates the TokenStreamComponents used to tokenize all the text in the provided TextReader.
public CreateComponents ( string fieldName, TextReader reader ) : TokenStreamComponents
fieldName string
reader System.IO.TextReader
Returns TokenStreamComponents
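
To make the role of CreateComponents more concrete, the sketch below shows a hypothetical custom analyzer that assembles a comparable chain by hand. The page above only confirms the three Arabic components and that SetKeywordMarkerFilter is placed before ArabicStemFilter; the choice of StandardTokenizer, the LowerCaseFilter step and the exact filter order are assumptions. The override is declared `protected` here, which is how a subclass in another assembly would typically override the method in Lucene.Net 4.8; treat that as an assumption as well.

```csharp
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Ar;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Miscellaneous;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

// Hypothetical class, for illustration; not the library's own implementation.
public sealed class SketchArabicAnalyzer : Analyzer
{
    private readonly LuceneVersion matchVersion;
    private readonly CharArraySet stopwords;
    private readonly CharArraySet stemExclusionSet;

    public SketchArabicAnalyzer(LuceneVersion matchVersion,
                                CharArraySet stopwords,
                                CharArraySet stemExclusionSet)
    {
        this.matchVersion = matchVersion;
        this.stopwords = stopwords;
        this.stemExclusionSet = stemExclusionSet;
    }

    protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
    {
        // Assumed tokenizer and lowercasing step (useful for Latin-script terms in mixed text).
        Tokenizer source = new StandardTokenizer(matchVersion, reader);
        TokenStream result = new LowerCaseFilter(matchVersion, source);

        // Remove stop words, then apply Arabic orthographic normalization.
        result = new StopFilter(matchVersion, result, stopwords);
        result = new ArabicNormalizationFilter(result);

        // Protect excluded terms from stemming only when the set is non-empty.
        if (stemExclusionSet.Count > 0)
        {
            result = new SetKeywordMarkerFilter(result, stemExclusionSet);
        }

        // Light stemming runs last, skipping tokens marked as keywords.
        result = new ArabicStemFilter(result);
        return new TokenStreamComponents(source, result);
    }
}
```

The method is called by the analysis framework rather than by application code; callers normally obtain a token stream through the analyzer itself.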