C# Class Lucene.Net.Analysis.Util.CharacterUtils

CharacterUtils provides a unified interface to character-related operations, implementing backwards-compatible character handling based on a Version instance. This class is marked @lucene.internal.
Project: paulirwin/lucene.net

Public Methods

Method Description
CodePointAt ( char[] chars, int offset, int limit ) : int

Returns the code point at the given index of the char array, using only elements with an index less than limit. Depending on the Version passed to CharacterUtils.GetInstance(Version), this method mimics the behavior of Java's Character#codePointAt(char[], int) as it would have been available on a Java 1.4 JVM or on a later virtual machine version.

CodePointAt ( string seq, int offset ) : int

Returns the code point at the given index of the character sequence (a string in this port). Depending on the Version passed to CharacterUtils.GetInstance(Version), this method mimics the behavior of Java's Character#codePointAt(CharSequence, int) as it would have been available on a Java 1.4 JVM or on a later virtual machine version.

CodePointCount ( string seq ) : int

Returns the number of characters (Unicode code points) in seq.

Fill ( CharacterBuffer buffer, System.IO.TextReader reader ) : bool

Convenience method that calls Fill(buffer, reader, buffer.buffer.length), that is, with numChars set to the full size of the buffer's internal char array.

Fill ( CharacterBuffer buffer, System.IO.TextReader reader, int numChars ) : bool

Fills the CharacterBuffer with characters read from the given reader. This method tries to read numChars characters into the CharacterBuffer; each call to Fill starts filling the buffer from offset 0 up to numChars. Because a code point can span two chars, this method may fill only numChars - 1 characters so that it does not split a surrogate pair, even if characters remain in the reader.

Depending on the Version passed to CharacterUtils.GetInstance(Version), this method implements supplementary character awareness when filling the given buffer. For all Version > 3.0, Fill(CharacterBuffer, TextReader, int) guarantees that the given CharacterBuffer never contains a high surrogate character as its last element unless that character is the last one available in the reader. In other words, high and low surrogate pairs are always preserved across buffer borders.

A return value of false means that this method call exhausted the reader. Some characters may still have been read, which can be verified by checking whether buffer.getLength() > 0.

GetInstance ( Lucene.Net.Util.Version matchVersion ) : CharacterUtils

Returns a CharacterUtils implementation according to the given Version instance.

NewCharacterBuffer ( int bufferSize ) : CharacterBuffer

Creates a new CharacterBuffer and allocates a char[] of the given bufferSize.

OffsetByCodePoints ( char[] buf, int start, int count, int index, int offset ) : int

Returns the index within buf[start:start+count] that is offset code points away from index.

ToLower ( char[] buffer, int offset, int limit ) : void

Converts each Unicode code point to lowercase via Character#toLowerCase(int), starting at the given offset.

ToUpper ( char[] buffer, int offset, int limit ) : void

Converts each Unicode code point to uppercase via Character#toUpperCase(int), starting at the given offset.

toChars ( int[] src, int srcOff, int srcLen, char[] dest, int destOff ) : int

Converts a sequence of Unicode code points to a sequence of UTF-16 chars.

toCodePoints ( char[] src, int srcOff, int srcLen, int[] dest, int destOff ) : int

Converts a sequence of UTF-16 chars to a sequence of Unicode code points.

Private Methods

Method Description
ReadFully ( System.IO.TextReader reader, char[] dest, int offset, int len ) : int

Method Details

CodePointAt() public abstract method

Returns the code point at the given index of the char array, using only elements with an index less than limit. Depending on the Version passed to CharacterUtils.GetInstance(Version), this method mimics the behavior of Java's Character#codePointAt(char[], int) as it would have been available on a Java 1.4 JVM or on a later virtual machine version.
Throws if the array is null, or if offset is negative or not less than the length of the char array.
public abstract CodePointAt ( char[] chars, int offset, int limit ) : int
chars char[] a character array
offset int the offset to the char values in the chars array to be converted
limit int the index after the last element that should be used to calculate the code point
return int
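
The role of limit is easiest to see with a surrogate pair that straddles it. The following is a minimal sketch of the supplementary-aware behavior described above, written against the .NET base class library only; the class name CodePointAtSketch and the choice of exception types are assumptions for illustration, not the Lucene.Net implementation.

```csharp
using System;

// Hypothetical illustration; not the Lucene.Net source.
public static class CodePointAtSketch
{
    public static int CodePointAt(char[] chars, int offset, int limit)
    {
        if (chars == null) throw new ArgumentNullException(nameof(chars));
        if (offset < 0 || offset >= limit || limit > chars.Length)
            throw new ArgumentOutOfRangeException(nameof(offset));

        char first = chars[offset];
        // Combine the pair only when the low surrogate lies before the limit.
        if (char.IsHighSurrogate(first) && offset + 1 < limit
            && char.IsLowSurrogate(chars[offset + 1]))
        {
            return char.ConvertToUtf32(first, chars[offset + 1]);
        }
        return first; // BMP char, or a surrogate with no usable partner
    }
}
```

With the array { 'a', '\uD83D', '\uDE00' }, a call with offset 1 and limit 3 yields the supplementary code point 0x1F600, while the same call with limit 2 yields the lone high surrogate 0xD83D because the low half is out of range.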

CodePointAt() public abstract method

Returns the code point at the given index of the character sequence (a string in this port). Depending on the Version passed to CharacterUtils.GetInstance(Version), this method mimics the behavior of Java's Character#codePointAt(CharSequence, int) as it would have been available on a Java 1.4 JVM or on a later virtual machine version.
Throws if the sequence is null, or if offset is negative or not less than the length of the character sequence.
public abstract CodePointAt ( string seq, int offset ) : int
seq string a character sequence
offset int the offset to the char values in the sequence to be converted
return int

CodePointCount() public abstract method

Returns the number of characters (Unicode code points) in seq.
public abstract CodePointCount ( string seq ) : int
seq string
return int
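
For a string containing supplementary characters, the code point count differs from the char count. A minimal sketch of the supplementary-aware counting, using only .NET base class library calls; CodePointCountSketch is a hypothetical name and this is not the Lucene.Net implementation.

```csharp
using System;

// Hypothetical illustration; not the Lucene.Net source.
public static class CodePointCountSketch
{
    public static int CodePointCount(string seq)
    {
        if (seq == null) throw new ArgumentNullException(nameof(seq));
        int count = 0;
        for (int i = 0; i < seq.Length; i++)
        {
            // A valid surrogate pair counts once: skip its low half.
            if (char.IsSurrogatePair(seq, i)) i++;
            count++;
        }
        return count;
    }
}
```

For "a\uD83D\uDE00b" the string length is 4 chars, but the sketch counts 3 code points.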

Fill() public method

Convenience method that calls Fill(buffer, reader, buffer.buffer.length), that is, with numChars set to the full size of the buffer's internal char array.
public Fill ( CharacterBuffer buffer, System.IO.TextReader reader ) : bool
buffer CharacterBuffer
reader System.IO.TextReader
return bool

Fill() public abstract method

Fills the CharacterBuffer with characters read from the given reader. This method tries to read numChars characters into the CharacterBuffer; each call to Fill starts filling the buffer from offset 0 up to numChars. Because a code point can span two chars, this method may fill only numChars - 1 characters so that it does not split a surrogate pair, even if characters remain in the reader.

Depending on the Version passed to CharacterUtils.GetInstance(Version), this method implements supplementary character awareness when filling the given buffer. For all Version > 3.0, Fill(CharacterBuffer, TextReader, int) guarantees that the given CharacterBuffer never contains a high surrogate character as its last element unless that character is the last one available in the reader. In other words, high and low surrogate pairs are always preserved across buffer borders.

A return value of false means that this method call exhausted the reader. Some characters may still have been read, which can be verified by checking whether buffer.getLength() > 0.

Throws if the underlying reader throws an exception.
public abstract Fill ( CharacterBuffer buffer, System.IO.TextReader reader, int numChars ) : bool
buffer CharacterBuffer the buffer to fill.
reader System.IO.TextReader the reader to read characters from.
numChars int the number of chars to read
return bool
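
The surrogate-preserving contract described above can be sketched as follows. This is an illustration under stated assumptions, not the Lucene.Net code: SketchBuffer is a hypothetical stand-in for CharacterBuffer, its field names are invented, and carrying a trailing high surrogate over to the next call is one plausible way to satisfy the guarantee.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for CharacterBuffer; field names are assumptions.
public sealed class SketchBuffer
{
    public readonly char[] Chars;
    public int Length;
    public char PendingHighSurrogate; // '\0' when nothing is carried over
    public SketchBuffer(int size) { Chars = new char[size]; }
}

public static class FillSketch
{
    public static bool Fill(SketchBuffer buffer, TextReader reader, int numChars)
    {
        if (numChars < 2 || numChars > buffer.Chars.Length)
            throw new ArgumentException("numChars must be >= 2 and <= the buffer size");

        // Re-install a high surrogate held back by the previous call.
        int offset = 0;
        if (buffer.PendingHighSurrogate != '\0')
        {
            buffer.Chars[0] = buffer.PendingHighSurrogate;
            buffer.PendingHighSurrogate = '\0';
            offset = 1;
        }

        // TextReader.Read may return fewer chars than requested, so loop.
        int read = 0;
        while (offset + read < numChars)
        {
            int n = reader.Read(buffer.Chars, offset + read, numChars - offset - read);
            if (n <= 0) break; // end of input
            read += n;
        }

        buffer.Length = offset + read;
        if (buffer.Length < numChars) return false; // reader exhausted

        // Never end a full buffer on a high surrogate: hold it for next time.
        if (char.IsHighSurrogate(buffer.Chars[buffer.Length - 1]))
        {
            buffer.PendingHighSurrogate = buffer.Chars[--buffer.Length];
        }
        return true;
    }
}
```

Reading "ab\uD83D\uDE00cd" three chars at a time, the first call returns true with Length 2 (the high surrogate is held back), the second returns true with the complete pair followed by 'c', and the third returns false with Length 1.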

GetInstance() public static method

Returns a CharacterUtils implementation according to the given Version instance.
public static GetInstance ( Lucene.Net.Util.Version matchVersion ) : CharacterUtils
matchVersion Lucene.Net.Util.Version a version instance
return CharacterUtils

NewCharacterBuffer() public static method

Creates a new CharacterBuffer and allocates a char[] of the given bufferSize.
public static NewCharacterBuffer ( int bufferSize ) : CharacterBuffer
bufferSize int the internal char buffer size, must be >= 2
return CharacterBuffer

OffsetByCodePoints() public abstract method

Returns the index within buf[start:start+count] that is offset code points away from index.
public abstract OffsetByCodePoints ( char[] buf, int start, int count, int index, int offset ) : int
buf char[]
start int
count int
index int
offset int
return int
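
Since the parameters are undocumented here, a small sketch may help. It assumes the semantics of Java's Character.offsetByCodePoints(char[], int, int, int, int): start and count delimit the usable window, index is the starting position, and a negative offset moves backwards. OffsetSketch is a hypothetical name and the exception types are assumptions.

```csharp
using System;

// Hypothetical illustration; not the Lucene.Net source.
public static class OffsetSketch
{
    public static int OffsetByCodePoints(char[] buf, int start, int count, int index, int offset)
    {
        int end = start + count;
        if (start < 0 || count < 0 || end > buf.Length || index < start || index > end)
            throw new ArgumentOutOfRangeException(nameof(index));

        int i = index;
        if (offset >= 0)
        {
            for (int n = 0; n < offset; n++)
            {
                if (i >= end) throw new ArgumentOutOfRangeException(nameof(offset));
                // Step over both halves of a surrogate pair in one move.
                if (char.IsHighSurrogate(buf[i]) && i + 1 < end && char.IsLowSurrogate(buf[i + 1])) i++;
                i++;
            }
        }
        else
        {
            for (int n = offset; n < 0; n++)
            {
                if (i <= start) throw new ArgumentOutOfRangeException(nameof(offset));
                i--;
                // Moving backwards: land on the high half of a pair.
                if (char.IsLowSurrogate(buf[i]) && i > start && char.IsHighSurrogate(buf[i - 1])) i--;
            }
        }
        return i;
    }
}
```

For the four chars of "a\uD83D\uDE00b", moving 2 code points forward from index 0 gives index 3, and moving 1 code point back from index 3 gives index 1.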

ToLower() public method

Converts each Unicode code point to lowercase via Character#toLowerCase(int), starting at the given offset.
public ToLower ( char[] buffer, int offset, int limit ) : void
buffer char[] the char buffer to lowercase
offset int the offset to start at
limit int the max char in the buffer to lowercase
return void
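
A rough sketch of in-place, code-point-wise lowercasing follows. Several things here are assumptions: limit is treated as an exclusive end index (consistent with CodePointAt above), ToLowerInvariant stands in for Java's Character#toLowerCase(int), and the code relies on .NET's invariant case mapping preserving the number of chars. LowerSketch is a hypothetical name.

```csharp
using System;

// Hypothetical illustration; not the Lucene.Net source.
public static class LowerSketch
{
    public static void ToLower(char[] buffer, int offset, int limit)
    {
        for (int i = offset; i < limit; )
        {
            // Treat a valid surrogate pair as one unit of width 2.
            bool pair = char.IsHighSurrogate(buffer[i]) && i + 1 < limit
                        && char.IsLowSurrogate(buffer[i + 1]);
            int width = pair ? 2 : 1;

            // Assumption: invariant lowercasing keeps the same number of chars.
            string lowered = new string(buffer, i, width).ToLowerInvariant();
            lowered.CopyTo(0, buffer, i, width);
            i += width;
        }
    }
}
```

Applied to the chars of "HeLLo" with offset 1 and limit 4, only the middle three chars are touched and the result is "Hello".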

ToUpper() public method

Converts each Unicode code point to uppercase via Character#toUpperCase(int), starting at the given offset.
public ToUpper ( char[] buffer, int offset, int limit ) : void
buffer char[] the char buffer to uppercase
offset int the offset to start at
limit int the max char in the buffer to uppercase
return void

toChars() public method

Converts a sequence of Unicode code points to a sequence of UTF-16 chars.
public toChars ( int[] src, int srcOff, int srcLen, char[] dest, int destOff ) : int
src int[]
srcOff int
srcLen int
dest char[]
destOff int
return int

toCodePoints() public method

Converts a sequence of UTF-16 chars to a sequence of Unicode code points.
public toCodePoints ( char[] src, int srcOff, int srcLen, int[] dest, int destOff ) : int
src char[]
srcOff int
srcLen int
dest int[]
destOff int
return int
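
The two conversions are inverses of each other, which a round trip makes concrete. The sketch below assumes that both methods return the number of elements written to dest, which this page does not state; ConvertSketch and its method names are hypothetical, and the surrogate arithmetic is the standard UTF-16 encoding rather than code taken from Lucene.Net.

```csharp
using System;

// Hypothetical illustration; not the Lucene.Net source.
public static class ConvertSketch
{
    // chars -> code points; returns the number of code points written (assumed).
    public static int ToCodePoints(char[] src, int srcOff, int srcLen, int[] dest, int destOff)
    {
        int written = 0;
        int end = srcOff + srcLen;
        for (int i = srcOff; i < end; )
        {
            int cp;
            if (char.IsHighSurrogate(src[i]) && i + 1 < end && char.IsLowSurrogate(src[i + 1]))
            {
                cp = char.ConvertToUtf32(src[i], src[i + 1]);
                i += 2;
            }
            else
            {
                cp = src[i]; // BMP char or unpaired surrogate, copied as-is
                i++;
            }
            dest[destOff + written++] = cp;
        }
        return written;
    }

    // code points -> chars; returns the number of chars written (assumed).
    public static int ToChars(int[] src, int srcOff, int srcLen, char[] dest, int destOff)
    {
        int written = 0;
        for (int i = 0; i < srcLen; i++)
        {
            int cp = src[srcOff + i];
            if (cp >= 0x10000)
            {
                // Standard UTF-16 encoding of a supplementary code point.
                cp -= 0x10000;
                dest[destOff + written++] = (char)(0xD800 + (cp >> 10));
                dest[destOff + written++] = (char)(0xDC00 + (cp & 0x3FF));
            }
            else
            {
                dest[destOff + written++] = (char)cp;
            }
        }
        return written;
    }
}
```

Converting the three chars of "a\uD83D\uDE00" produces two code points (0x61 and 0x1F600), and converting those back produces the original three chars. Callers need a dest array large enough for the worst case: one code point per char in one direction, two chars per code point in the other.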