C# Class Lucene.Net.Analysis.Ar.ArabicNormalizer

Normalizer for Arabic.

Normalization is done in place for efficiency, operating directly on a term buffer.

Normalization is defined as:

  • Normalization of hamza with alef seat to a bare alef.
  • Normalization of teh marbuta to heh.
  • Normalization of dotless yeh (alef maksura) to yeh.
  • Removal of Arabic diacritics (the harakat).
  • Removal of tatweel (stretching character).
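
The rules above can be sketched as a single in-place pass over a char buffer. This is a minimal, self-contained illustration and not the Lucene.Net source: the class name ArabicNormalizerSketch and the constant names are hypothetical, the harakat are assumed to be the contiguous range U+064B to U+0652, and folding alef with madda (U+0622) to a bare alef is an assumption that the list does not state.

```csharp
using System;

public static class ArabicNormalizerSketch
{
    // Code points for the characters named in the rules (names are illustrative).
    const char Alef = '\u0627';
    const char AlefMadda = '\u0622';       // assumption: folded like the hamza forms
    const char AlefHamzaAbove = '\u0623';
    const char AlefHamzaBelow = '\u0625';
    const char TehMarbuta = '\u0629';
    const char Heh = '\u0647';
    const char DotlessYeh = '\u0649';      // alef maksura
    const char Yeh = '\u064A';
    const char Tatweel = '\u0640';
    const char FirstHaraka = '\u064B';     // fathatan
    const char LastHaraka = '\u0652';      // sukun

    // Normalizes s[0..len) in place and returns the new logical length.
    public static int Normalize(char[] s, int len)
    {
        int write = 0;
        for (int read = 0; read < len; read++)
        {
            char c = s[read];

            // Removal rules: skip tatweel and diacritics without writing them back.
            if (c == Tatweel || (c >= FirstHaraka && c <= LastHaraka))
                continue;

            // Replacement rules: map each variant to its normalized form.
            switch (c)
            {
                case AlefMadda:
                case AlefHamzaAbove:
                case AlefHamzaBelow:
                    c = Alef;
                    break;
                case TehMarbuta:
                    c = Heh;
                    break;
                case DotlessYeh:
                    c = Yeh;
                    break;
            }

            s[write++] = c;
        }
        return write;
    }
}
```

For example, the seven-character input U+0623 U+064E U+062D U+0652 U+0645 U+064E U+062F (a vocalized name starting with hamza on alef) shrinks to the four characters U+0627 U+062D U+0645 U+062F, and the method returns 4. The write index never overtakes the read index, so no second buffer is needed.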

Project: apache/lucenenet

Public Methods

Method                                   Description
Normalize ( char[] s, int len ) : int    Normalize an input buffer of Arabic text.

Method Details

Normalize() public method

Normalize an input buffer of Arabic text.
public Normalize ( char[] s, int len ) : int
s char[] input buffer
len int length of input buffer
return int length of input buffer after normalization
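
Because the buffer is modified in place and only the returned length is valid afterwards, a caller should build the result from the first returned-length characters and not from the whole array. The helper below is a hypothetical sketch of that calling pattern: TermBufferHelper and Apply are illustrative names, and the normalizer is passed in as a delegate so the example does not depend on any library.

```csharp
using System;

public static class TermBufferHelper
{
    // Copies the term into a scratch buffer, lets the supplied normalizer
    // rewrite it in place, and keeps only the prefix the normalizer reports
    // as valid. Characters past the returned length are stale and ignored.
    public static string Apply(string term, Func<char[], int, int> normalize)
    {
        char[] buffer = term.ToCharArray();
        int newLen = normalize(buffer, buffer.Length);
        return new string(buffer, 0, newLen);
    }
}
```

With Lucene.Net referenced, a call such as TermBufferHelper.Apply(text, new ArabicNormalizer().Normalize) should fit this shape, assuming the class exposes a public parameterless constructor.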