C# Class Lucene.Net.Analysis.Lv.LatvianStemmer

Light stemmer for Latvian.

This is a light version of the algorithm in Karlis Kreslin's PhD thesis "A stemming algorithm for Latvian", with the following modifications:

  • Only explicitly stems noun and adjective morphology
  • Stricter length/vowel checks for the resulting stems (suffix stripping for verbs and other parts of speech is removed)
  • Removes only the primary inflectional suffixes: case and number for nouns; case, number, gender, and definiteness for adjectives.
  • Palatalization is only handled when a declension II, V, or VI noun suffix is removed.
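
The modifications above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: the suffix list, the minimum stem length of 3, the vowel set, and all names (`LightStemSketch`, `CountVowels`, `Stem`) are assumptions chosen for the example, and palatalization is not handled.

```csharp
using System;

public static class LightStemSketch
{
    // Hypothetical, deliberately tiny suffix list (longer suffixes first).
    // The real stemmer uses its own, much larger table.
    private static readonly string[] Suffixes = { "iem", "as", "us", "a", "s" };

    // Assumed minimum length of the stem that remains after stripping.
    private const int MinStemLength = 3;

    // Assumed vowel set for this sketch.
    private const string Vowels = "aeiouāēīū";

    // Counts vowels in the first len characters of s.
    public static int CountVowels(char[] s, int len)
    {
        int n = 0;
        for (int i = 0; i < len; i++)
        {
            if (Vowels.IndexOf(s[i]) >= 0) n++;
        }
        return n;
    }

    // True if the first len characters of s end with suffix.
    private static bool EndsWith(char[] s, int len, string suffix)
    {
        if (suffix.Length > len) return false;
        for (int i = 0; i < suffix.Length; i++)
        {
            if (s[len - suffix.Length + i] != suffix[i]) return false;
        }
        return true;
    }

    // Returns the new length after stripping at most one suffix.
    // A strip is accepted only if the remaining stem is long enough
    // and still contains at least one vowel.
    public static int Stem(char[] s, int len)
    {
        foreach (string suffix in Suffixes)
        {
            int stemLen = len - suffix.Length;
            if (stemLen >= MinStemLength
                && EndsWith(s, len, suffix)
                && CountVowels(s, stemLen) >= 1)
            {
                return stemLen;
            }
        }
        return len; // nothing stripped
    }
}
```

The caller keeps the original buffer and reads only the first returned-length characters, for example `new string(buffer, 0, LightStemSketch.Stem(buffer, buffer.Length))`.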

Project: apache/lucenenet

Public Methods

Method Description
Stem ( char[] s, int len ) : int

Stems a Latvian word and returns the new adjusted length.

Private Methods

Method Description
NumVowels ( char[] s, int len ) : int

Counts the vowels in the string. At least one vowel must remain in the stem for it to be accepted.

Unpalatalize ( char[] s, int len ) : int

Most cases of palatalization are handled, except for the ambiguous ones:

  • s -> š
  • t -> š
  • d -> ž
  • z -> ž
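
To see why these four rules are called ambiguous, it helps to invert them: `š` can come from either `s` or `t`, and `ž` from either `d` or `z`, so the original consonant cannot be recovered from the stem alone. The sketch below only illustrates that point. The class and method names are hypothetical, and the three unambiguous rules (`c -> č`, `l -> ļ`, `n -> ņ`) are included as illustrative assumptions for contrast, not taken from this page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PalatalizationAmbiguity
{
    // Forward rules (plain -> palatalized).
    // The first four are the ambiguous rules listed above;
    // the last three are assumed examples of unambiguous rules.
    private static readonly (char Plain, char Palatalized)[] Rules =
    {
        ('s', 'š'), ('t', 'š'), ('d', 'ž'), ('z', 'ž'),
        ('c', 'č'), ('l', 'ļ'), ('n', 'ņ'),
    };

    // Inverts the rules: palatalized consonant -> all possible plain consonants.
    public static Dictionary<char, char[]> ReverseCandidates()
    {
        return Rules
            .GroupBy(r => r.Palatalized)
            .ToDictionary(g => g.Key, g => g.Select(r => r.Plain).ToArray());
    }

    // A palatalized consonant is ambiguous if more than one plain
    // consonant maps to it, so it cannot be reversed safely.
    public static bool IsAmbiguous(char palatalized)
    {
        return ReverseCandidates().TryGetValue(palatalized, out var plain)
               && plain.Length > 1;
    }
}
```

With these rules, `IsAmbiguous('š')` and `IsAmbiguous('ž')` are true, while `IsAmbiguous('č')` is false, which matches the description: a stemmer can reverse the unambiguous cases and leave the ambiguous ones alone.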

Method Details

Stem() public method

Stems a Latvian word and returns the new adjusted length.
public Stem ( char[] s, int len ) : int
s char[]
len int
Returns int
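
A minimal usage sketch, assuming the project references the Lucene.Net.Analysis.Common package and that `LatvianStemmer` can be created with a parameterless constructor (neither is stated on this page). The wrapper class `LatvianStemmerUsage` and its `StemWord` method are hypothetical helpers, and the input is assumed to be lowercased already.

```csharp
using Lucene.Net.Analysis.Lv;

public static class LatvianStemmerUsage
{
    // Hypothetical helper: stems one word and returns the stem as a string.
    public static string StemWord(string word)
    {
        var stemmer = new LatvianStemmer();

        // Stem works on a character buffer and reports how many
        // leading characters of that buffer form the stem.
        char[] buffer = word.ToCharArray();
        int newLength = stemmer.Stem(buffer, buffer.Length);

        // Only the first newLength characters are meaningful afterwards.
        return new string(buffer, 0, newLength);
    }
}
```

Because the method returns a length instead of a new string, the same buffer can be reused across many tokens without allocating, which is presumably why the signature takes `char[]` and `len` separately.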