C# class Lucene.Net.Analysis.De.GermanStemmer

A stemmer for German words.

The algorithm is based on the report "A Fast and Simple Stemming Algorithm for German Words" by Jörg Caumanns (joerg.caumanns at isst.fhg.de).

Project: synhershko/lucene.net

Protected properties

Property	Type	Description
substCount	int	Number of characters removed by Substitute() while stemming.

Protected methods

Method	Description
Substitute ( StringBuilder buffer ) : void	Performs substitutions on the term to reduce overstemming: - Substitutes umlauts with their corresponding vowel: äöü -> aou; "ß" is substituted by "ss". - Substitutes the second char of a pair of equal characters with an asterisk: ?? -> ?*. - Substitutes some common character combinations with a token: sch/ch/ei/ie/ig/st -> $/§/%/&/#/!

Private methods

Method	Description
IsStemmable ( String term ) : bool	Checks whether a term can be stemmed.
Optimize ( StringBuilder buffer ) : void	Performs some optimizations on the term. These optimizations are contextual.
RemoveParticleDenotion ( StringBuilder buffer ) : void	Removes a particle denotation ("ge") from a term.
Resubstitute ( StringBuilder buffer ) : void	Undoes the changes made by Substitute(), namely the character pairs and character combinations. Umlauts remain as their corresponding vowel, and "ß" remains as "ss".
Stem ( String term ) : String	Stems the given term to a unique discriminator.
Strip ( StringBuilder buffer ) : void	Suffix stripping (stemming) on the current term. The stripping is reduced to the seven "base" suffixes "e", "s", "n", "t", "em", "er" and "nd", from which all regular suffixes are built. The simplification causes some overstemming and many more irregular stems, but still provides unique discriminators in most of those cases. The algorithm is context free, except for the length restrictions.
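
The suffix stripping described for Strip() can be illustrated with a minimal sketch. This is not the library's source: the class name StripSketch, passing substCount as a parameter, and the exact length thresholds are assumptions made for illustration. Only the seven base suffixes and the existence of length restrictions come from the description above.

```csharp
using System.Text;

// Illustrative sketch only; names and thresholds are assumptions, not the Lucene.Net source.
public static class StripSketch
{
    // True when the buffer ends with the given suffix.
    private static bool EndsWith(StringBuilder buffer, string suffix)
    {
        if (buffer.Length < suffix.Length) return false;
        for (int i = 0; i < suffix.Length; i++)
        {
            if (buffer[buffer.Length - suffix.Length + i] != suffix[i]) return false;
        }
        return true;
    }

    // Repeatedly removes one of the base suffixes "nd", "em", "er", "e", "s", "n", "t".
    // substCount stands for the characters removed earlier by a Substitute() step,
    // so the length restrictions refer to the original term length.
    public static void Strip(StringBuilder buffer, int substCount)
    {
        bool doMore = true;
        while (doMore && buffer.Length > 3)
        {
            int effectiveLength = buffer.Length + substCount;
            if (effectiveLength > 5 && EndsWith(buffer, "nd"))
            {
                buffer.Length -= 2; // shrinking Length truncates the buffer
            }
            else if (effectiveLength > 4 && (EndsWith(buffer, "em") || EndsWith(buffer, "er")))
            {
                buffer.Length -= 2;
            }
            else if ("esnt".IndexOf(buffer[buffer.Length - 1]) >= 0)
            {
                buffer.Length -= 1;
            }
            else
            {
                doMore = false; // no base suffix left to strip
            }
        }
    }
}
```

Under these assumptions, "kindern" is reduced to "kind" ("n", then "er"), while "hauses" is reduced to "hau", which shows the kind of overstemming the description mentions.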

Method details

Substitute() protected method

Performs substitutions on the term to reduce overstemming:
- Substitutes umlauts with their corresponding vowel: äöü -> aou; "ß" is substituted by "ss".
- Substitutes the second char of a pair of equal characters with an asterisk: ?? -> ?*.
- Substitutes some common character combinations with a token: sch/ch/ei/ie/ig/st -> $/§/%/&/#/!

protected Substitute ( StringBuilder buffer ) : void
buffer	System.Text.StringBuilder
Returns	void
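
The three substitution rules can be sketched as follows. This is a minimal illustration written from the description above, not the library's implementation: the class name SubstituteSketch, the method name Apply, and returning the net number of removed characters (instead of updating a substCount field) are assumptions. Non-ASCII characters are written as Unicode escapes so the sketch does not depend on the source file encoding.

```csharp
using System.Text;

// Illustrative sketch only; names are hypothetical, not the Lucene.Net source.
public static class SubstituteSketch
{
    // Rewrites the buffer in place and returns the net number of characters removed.
    public static int Apply(StringBuilder buffer)
    {
        int originalLength = buffer.Length;
        for (int c = 0; c < buffer.Length; c++)
        {
            char ch = buffer[c];
            // Second char of a pair of equal characters becomes an asterisk.
            if (c > 0 && ch == buffer[c - 1]) buffer[c] = '*';
            else if (ch == '\u00E4') buffer[c] = 'a';   // a-umlaut -> a
            else if (ch == '\u00F6') buffer[c] = 'o';   // o-umlaut -> o
            else if (ch == '\u00FC') buffer[c] = 'u';   // u-umlaut -> u
            else if (ch == '\u00DF')                    // sharp s -> ss
            {
                buffer[c] = 's';
                buffer.Insert(c + 1, 's');
            }

            // Mask common character combinations with a single token.
            if (c < buffer.Length - 1)
            {
                char a = buffer[c];
                char b = buffer[c + 1];
                if (c < buffer.Length - 2 && a == 's' && b == 'c' && buffer[c + 2] == 'h')
                {
                    buffer[c] = '$';
                    buffer.Remove(c + 1, 2);
                }
                else if (a == 'c' && b == 'h') { buffer[c] = '\u00A7'; buffer.Remove(c + 1, 1); } // section sign
                else if (a == 'e' && b == 'i') { buffer[c] = '%'; buffer.Remove(c + 1, 1); }
                else if (a == 'i' && b == 'e') { buffer[c] = '&'; buffer.Remove(c + 1, 1); }
                else if (a == 'i' && b == 'g') { buffer[c] = '#'; buffer.Remove(c + 1, 1); }
                else if (a == 's' && b == 't') { buffer[c] = '!'; buffer.Remove(c + 1, 1); }
            }
        }
        return originalLength - buffer.Length;
    }
}
```

With this sketch, a lower-cased "schön" becomes "$on" (two characters removed), "mutter" becomes "mut*er", and "stein" becomes "!%n". Because each token stands for a fixed character combination, a later step such as Resubstitute() can expand the tokens again.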

Property details

substCount protected property

Number of characters removed by Substitute() while stemming.

protected int substCount
Returns	int