C# Class Lucene.Net.Analysis.Br.BrazilianStemmer

A stemmer for Brazilian Portuguese words.
Project: apache/lucenenet

Public Methods

Method Description
BrazilianStemmer ( )
Log ( ) : string

For logging and debugging purposes.

Protected Methods

Method Description
Stem ( string term ) : string

Stems the given term to a unique discriminator.

Private Methods

Method Description
ChangeTerm ( string value ) : string

Normalizes a term: 1) converts it to lowercase; 2) removes accents; 3) maps ã -> a and õ -> o; 4) maps ç -> c.
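The normalization steps listed for ChangeTerm can be sketched roughly as follows. This is an illustration only, not the Lucene.NET source: the class name `ChangeTermSketch` and the explicit character table are assumptions, and the table covers only the accented letters common in Portuguese.

```csharp
using System.Text;

// Hypothetical helper illustrating the steps described for ChangeTerm;
// not the actual Lucene.NET implementation.
public static class ChangeTermSketch
{
    public static string ChangeTerm(string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            return value;
        }

        var result = new StringBuilder(value.Length);
        foreach (char c in value)
        {
            // Fold accents first (both cases), then lowercase whatever is left.
            result.Append(char.ToLowerInvariant(Fold(c)));
        }
        return result.ToString();
    }

    // Maps an accented letter to its unaccented lowercase base letter.
    private static char Fold(char c)
    {
        switch (c)
        {
            case 'á': case 'à': case 'â': case 'ã':
            case 'Á': case 'À': case 'Â': case 'Ã':
                return 'a';
            case 'é': case 'ê': case 'É': case 'Ê':
                return 'e';
            case 'í': case 'Í':
                return 'i';
            case 'ó': case 'ô': case 'õ':
            case 'Ó': case 'Ô': case 'Õ':
                return 'o';
            case 'ú': case 'ü': case 'Ú': case 'Ü':
                return 'u';
            case 'ç': case 'Ç':
                return 'c';
            default:
                return c;
        }
    }
}
```

With this sketch, `ChangeTermSketch.ChangeTerm("Coração")` yields `coracao`.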

CreateCT ( string term ) : void

Creates CT (the changed term), replacing 'ã' and 'õ' with 'a~' and 'o~'.

GetR1 ( string value ) : string

Gets R1. R1 is the region after the first non-vowel following a vowel, or the null region at the end of the word if there is no such non-vowel.
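The R1 definition can be read as a single left-to-right scan. The sketch below is one possible reading, assuming the input has already been normalized (lowercase, no accents) and that the null region is represented by an empty string; the class name `R1Sketch` is hypothetical and the library's own handling of the "not found" case may differ.

```csharp
// Hypothetical illustration of the R1 region; not the Lucene.NET source.
public static class R1Sketch
{
    // Mirrors the documented IsVowel check: only 'a', 'e', 'i', 'o', 'u'.
    private static bool IsVowel(char c)
    {
        return c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u';
    }

    public static string GetR1(string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            return string.Empty;
        }

        // The first non-vowel that directly follows a vowel ends the prefix;
        // everything after that character is R1.
        for (int i = 1; i < value.Length; i++)
        {
            if (IsVowel(value[i - 1]) && !IsVowel(value[i]))
            {
                return value.Substring(i + 1);
            }
        }

        // No such non-vowel: R1 is the null region at the end of the word.
        return string.Empty;
    }
}
```

For example, `R1Sketch.GetR1("trabalho")` gives `alho`, because 'b' is the first non-vowel that follows a vowel.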

GetRV ( string value ) : string

Gets RV. If the second letter is a consonant, RV is the region after the next following vowel; if the first two letters are vowels, RV is the region after the next consonant; otherwise (the consonant-vowel case), RV is the region after the third letter. If these positions cannot be found, RV is the end of the word.
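The three RV cases can be made concrete with a short sketch. It assumes a normalized input (lowercase, no accents) and uses an empty string for "the end of the word"; the class name `RvSketch` is hypothetical and this is not the library's code.

```csharp
// Hypothetical illustration of the RV region; not the Lucene.NET source.
public static class RvSketch
{
    // Mirrors the documented IsVowel check: only 'a', 'e', 'i', 'o', 'u'.
    private static bool IsVowel(char c)
    {
        return c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u';
    }

    public static string GetRV(string value)
    {
        if (value == null || value.Length < 2)
        {
            return string.Empty;
        }

        if (!IsVowel(value[1]))
        {
            // Second letter is a consonant: RV starts after the next vowel.
            for (int i = 2; i < value.Length; i++)
            {
                if (IsVowel(value[i]))
                {
                    return value.Substring(i + 1);
                }
            }
            return string.Empty;
        }

        if (IsVowel(value[0]))
        {
            // First two letters are vowels: RV starts after the next consonant.
            for (int i = 2; i < value.Length; i++)
            {
                if (!IsVowel(value[i]))
                {
                    return value.Substring(i + 1);
                }
            }
            return string.Empty;
        }

        // Consonant-vowel case: RV starts after the third letter.
        return value.Length >= 3 ? value.Substring(3) : string.Empty;
    }
}
```

Under these assumptions, `macho` gives `ho` (consonant-vowel case), `oliva` gives `va` (second letter is a consonant), and `aureo` gives `eo` (first two letters are vowels).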

IsIndexable ( string term ) : bool

Checks whether a term can be processed and indexed.

IsStemmable ( string term ) : bool

Checks whether a term can be processed correctly.

IsVowel ( char value ) : bool

Checks whether the character is one of 'a', 'e', 'i', 'o', 'u'.

RemoveSuffix ( string value, string toRemove ) : string

Removes a suffix from a string.

ReplaceSuffix ( string value, string toReplace, string changeTo ) : string

Replaces a suffix of a string with another.

Step1 ( ) : bool

Standard suffix removal. Search for the longest among the following suffixes, and perform the following actions:

Step2 ( ) : bool

Verb suffixes. Search for the longest among the following suffixes in RV, and if found, delete.

Step3 ( ) : void

Deletes the suffix 'i' if it is in RV and preceded by 'c'.

Step4 ( ) : void

Residual suffix. If the word ends with one of the suffixes (os a i o á í ó) in RV, delete it.

Step5 ( ) : void

If the word ends with one of (e é ê) in RV, delete it, and if it is preceded by 'gu' (or 'ci') with the 'u' (or 'i') in RV, delete the 'u' (or 'i'). Otherwise, if the word ends with 'ç', remove the cedilla.

Suffix ( string value, string suffix ) : bool

Checks whether a string ends with a suffix.

SuffixPreceded ( string value, string suffix, string preceded ) : bool

Checks whether a suffix is preceded by a given string.
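The four suffix helpers listed above (Suffix, SuffixPreceded, RemoveSuffix, ReplaceSuffix) compose naturally. The following sketch shows one way they might fit together; the class name `SuffixSketch` is hypothetical, and returning the input unchanged when the suffix is absent is an assumption rather than documented library behavior.

```csharp
using System;

// Hypothetical illustration of the suffix helpers; not the Lucene.NET source.
public static class SuffixSketch
{
    // True when value ends with the given, non-empty suffix.
    public static bool Suffix(string value, string suffix)
    {
        return !string.IsNullOrEmpty(value)
            && !string.IsNullOrEmpty(suffix)
            && value.EndsWith(suffix, StringComparison.Ordinal);
    }

    // Strips the suffix if present; otherwise returns value unchanged.
    public static string RemoveSuffix(string value, string toRemove)
    {
        return Suffix(value, toRemove)
            ? value.Substring(0, value.Length - toRemove.Length)
            : value;
    }

    // Swaps one suffix for another if present; otherwise returns value unchanged.
    public static string ReplaceSuffix(string value, string toReplace, string changeTo)
    {
        return Suffix(value, toReplace)
            ? RemoveSuffix(value, toReplace) + changeTo
            : value;
    }

    // True when value ends with suffix and the text before it ends with preceded.
    public static bool SuffixPreceded(string value, string suffix, string preceded)
    {
        return Suffix(value, suffix)
            && Suffix(RemoveSuffix(value, suffix), preceded);
    }
}
```

For instance, `SuffixSketch.ReplaceSuffix("psicologia", "logia", "log")` gives `psicolog`, and `SuffixSketch.SuffixPreceded("negativa", "iva", "at")` is true because `negat` ends with `at`.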

Method Details

BrazilianStemmer() public method

public BrazilianStemmer ( )

Log() public method

For logging and debugging purposes.
public Log ( ) : string
return string

Stem() protected method

Stems the given term to a unique discriminator.
protected Stem ( string term ) : string
term string The term that should be stemmed.
return string