C# Class Lucene.Net.Analysis.Util.CharacterUtils

CharacterUtils provides a unified interface to character-related operations, implementing backwards-compatible behavior selected by a Version instance. This class is marked @lucene.internal.

Project: paulirwin/lucene.net

Public Methods

Method	Description
CodePointAt ( char[] chars, int offset, int limit ) : int	Returns the code point at the given index of the char array, using only elements with an index less than the limit. Depending on the Version passed to CharacterUtils#getInstance(Version), this method mimics the behavior of Character#codePointAt(char[], int) as it would have been available on a Java 1.4 JVM or on a later virtual machine version.
CodePointAt ( string seq, int offset ) : int	Returns the code point at the given index of the CharSequence. Depending on the Version passed to CharacterUtils#getInstance(Version), this method mimics the behavior of Character#codePointAt(CharSequence, int) as it would have been available on a Java 1.4 JVM or on a later virtual machine version.
CodePointCount ( string seq ) : int	Returns the number of Unicode code points in `seq`.
Fill ( CharacterBuffer buffer, System.IO.TextReader reader ) : bool	Convenience method which calls `fill(buffer, reader, buffer.buffer.length)`.
Fill ( CharacterBuffer buffer, System.IO.TextReader reader, int numChars ) : bool	Fills the CharacterBuffer with characters read from the given reader. This method tries to read `numChars` characters into the CharacterBuffer; each call to fill starts filling the buffer from offset `0` up to `numChars`. When code points can span two Java characters, this method may fill only `numChars - 1` characters so as not to split a surrogate pair, even if there are remaining characters in the reader. Depending on the Version passed to CharacterUtils#getInstance(Version), this method implements supplementary character awareness when filling the given buffer. For all Version > 3.0, #fill(CharacterBuffer, Reader, int) guarantees that the given CharacterBuffer will never contain a high surrogate character as its last element unless it is the last available character in the reader. In other words, high and low surrogate pairs are always preserved across buffer borders. A return value of `false` means that this method call exhausted the reader, but some characters may still have been read, which can be verified by checking whether `buffer.getLength() > 0`.
GetInstance ( Lucene.Net.Util.Version matchVersion ) : CharacterUtils	Returns a CharacterUtils implementation according to the given Version instance.
NewCharacterBuffer ( int bufferSize ) : CharacterBuffer	Creates a new CharacterBuffer and allocates a `char[]` of the given bufferSize.
OffsetByCodePoints ( char[] buf, int start, int count, int index, int offset ) : int	Returns the index within `buf[start:start+count]` that is offset from `index` by `offset` code points.
ToLower ( char[] buffer, int offset, int limit ) : void	Converts each Unicode code point to lowercase via Character#toLowerCase(int), starting at the given offset.
ToUpper ( char[] buffer, int offset, int limit ) : void	Converts each Unicode code point to uppercase via Character#toUpperCase(int), starting at the given offset.
toChars ( int[] src, int srcOff, int srcLen, char[] dest, int destOff ) : int	Converts a sequence of Unicode code points to a sequence of Java characters (UTF-16 chars).
toCodePoints ( char[] src, int srcOff, int srcLen, int[] dest, int destOff ) : int	Converts a sequence of Java characters (UTF-16 chars) to a sequence of Unicode code points.

Private Methods

Method	Description
ReadFully ( System.IO.TextReader reader, char[] dest, int offset, int len ) : int

Method Details

CodePointAt() public abstract method

Returns the code point at the given index of the char array where only elements with index less than the limit are used. Depending on the Version passed to CharacterUtils#getInstance(Version) this method mimics the behavior of Character#codePointAt(char[], int) as it would have been available on a Java 1.4 JVM or on a later virtual machine version.

An exception is thrown if the array is null, or if the value of offset is negative or not less than the length of the char array.

public abstract CodePointAt ( char[] chars, int offset, int limit ) : int
chars	char[]	a character array
offset	int	the offset to the char values in the chars array to be converted
limit	int	the index after the last element that should be used to calculate the code point
return	int
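
The role of the limit parameter is easiest to see in code. The following is a minimal sketch of the surrogate-aware variant, written against the .NET standard library only; it is not the Lucene.Net implementation, and the class name `CodePointAtSketch` and the exception types are assumptions made for illustration.

```csharp
using System;

public static class CodePointAtSketch
{
    // Hypothetical helper: decodes the code point that starts at chars[offset]
    // without ever reading an element at or beyond limit.
    public static int CodePointAt(char[] chars, int offset, int limit)
    {
        if (chars == null) throw new ArgumentNullException(nameof(chars));
        if (offset < 0 || offset >= limit || limit > chars.Length)
            throw new ArgumentOutOfRangeException(nameof(offset));

        char first = chars[offset];
        // Combine a surrogate pair only when the low half lies below the limit.
        if (char.IsHighSurrogate(first) && offset + 1 < limit
            && char.IsLowSurrogate(chars[offset + 1]))
        {
            return char.ConvertToUtf32(first, chars[offset + 1]);
        }
        return first; // BMP char or unpaired surrogate, returned as-is
    }
}
```

For the pair `{ '\uD83D', '\uDE00' }`, a limit of 2 yields the supplementary code point U+1F600, while a limit of 1 hides the low surrogate and yields the lone high surrogate 0xD83D.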

CodePointAt() public abstract method

Returns the code point at the given index of the CharSequence. Depending on the Version passed to CharacterUtils#getInstance(Version), this method mimics the behavior of Character#codePointAt(CharSequence, int) as it would have been available on a Java 1.4 JVM or on a later virtual machine version.

An exception is thrown if the sequence is null, or if the value of offset is negative or not less than the length of the character sequence.

public abstract CodePointAt ( string seq, int offset ) : int
seq	string	a character sequence
offset	int	the offset to the char values in the character sequence to be converted
return	int

CodePointCount() public abstract method

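Returns the number of Unicode code points in seq. A surrogate pair counts as one code point, so the result can be smaller than the length of seq in chars.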

public abstract CodePointCount ( string seq ) : int
seq	string
return	int
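
As an illustration of how a code point count differs from the char length, here is a minimal sketch using only the .NET standard library. The class name `CodePointCountSketch` is hypothetical, and the sketch assumes that an unpaired surrogate counts as one code point.

```csharp
using System;

public static class CodePointCountSketch
{
    // Hypothetical helper: counts code points rather than UTF-16 chars.
    public static int CodePointCount(string seq)
    {
        if (seq == null) throw new ArgumentNullException(nameof(seq));
        int count = 0;
        for (int i = 0; i < seq.Length; i++)
        {
            // A well-formed surrogate pair counts once; skip its low half.
            if (char.IsSurrogatePair(seq, i)) i++;
            count++;
        }
        return count;
    }
}
```

For the string `"a\uD83D\uDE00b"`, which is four chars long, this sketch returns 3.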

Fill() public method

Convenience method which calls fill(buffer, reader, buffer.buffer.length).

public Fill ( CharacterBuffer buffer, System.IO.TextReader reader ) : bool
buffer	CharacterBuffer
reader	System.IO.TextReader
return	bool

Fill() public abstract method

Fills the CharacterBuffer with characters read from the given reader. This method tries to read numChars characters into the CharacterBuffer; each call to fill starts filling the buffer from offset 0 up to numChars. When code points can span two Java characters, this method may fill only numChars - 1 characters so as not to split a surrogate pair, even if there are remaining characters in the reader.

Depending on the Version passed to CharacterUtils#getInstance(Version), this method implements supplementary character awareness when filling the given buffer. For all Version > 3.0, #fill(CharacterBuffer, Reader, int) guarantees that the given CharacterBuffer will never contain a high surrogate character as its last element unless it is the last available character in the reader. In other words, high and low surrogate pairs are always preserved across buffer borders.

A return value of false means that this method call exhausted the reader, but some characters may still have been read, which can be verified by checking whether buffer.getLength() > 0.

An exception is thrown if the reader throws one.

public abstract Fill ( CharacterBuffer buffer, System.IO.TextReader reader, int numChars ) : bool
buffer	CharacterBuffer	the buffer to fill
reader	System.IO.TextReader	the reader to read characters from
numChars	int	the number of chars to read
return	bool
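
The surrogate-preserving guarantee described above can be sketched as follows. This is an illustration under stated assumptions, not the Lucene.Net code: `SketchBuffer` is a hypothetical stand-in for CharacterBuffer, its field names are invented, and holding back a trailing high surrogate for the next call is one plausible way to satisfy the documented contract.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for CharacterBuffer; all member names are assumptions.
public sealed class SketchBuffer
{
    public readonly char[] Chars;
    public int Length;
    public char PendingHighSurrogate; // '\0' means nothing is pending
    public SketchBuffer(int size) { Chars = new char[size]; }
}

public static class FillSketch
{
    public static bool Fill(SketchBuffer buffer, TextReader reader, int numChars)
    {
        if (numChars < 2 || numChars > buffer.Chars.Length)
            throw new ArgumentException("numChars must be >= 2 and <= the buffer size");

        int offset = 0;
        // Re-install a high surrogate that the previous call held back.
        if (buffer.PendingHighSurrogate != '\0')
        {
            buffer.Chars[0] = buffer.PendingHighSurrogate;
            buffer.PendingHighSurrogate = '\0';
            offset = 1;
        }

        // TextReader.Read may return fewer chars than requested, so loop.
        int read = 0;
        while (offset + read < numChars)
        {
            int n = reader.Read(buffer.Chars, offset + read, numChars - offset - read);
            if (n <= 0) break; // end of input
            read += n;
        }
        buffer.Length = offset + read;

        bool filled = buffer.Length == numChars;
        // Hold back a trailing high surrogate so the pair is never split.
        if (filled && char.IsHighSurrogate(buffer.Chars[buffer.Length - 1]))
        {
            buffer.PendingHighSurrogate = buffer.Chars[--buffer.Length];
        }
        return filled;
    }
}
```

Reading `"ab\uD83D\uDE00c"` through a three-char buffer, the first call returns true with only two chars filled (the high surrogate is held back), the second call returns true with the intact pair followed by `c`, and the third call returns false with a length of 0.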

GetInstance() public static method

Returns a CharacterUtils implementation according to the given Version instance.

public static GetInstance ( Lucene.Net.Util.Version matchVersion ) : CharacterUtils
matchVersion	Lucene.Net.Util.Version	a version instance
return	CharacterUtils

NewCharacterBuffer() public static method

Creates a new CharacterBuffer and allocates a char[] of the given bufferSize.

public static NewCharacterBuffer ( int bufferSize ) : CharacterBuffer
bufferSize	int	the internal char buffer size, must be `>= 2`
return	CharacterBuffer

OffsetByCodePoints() public abstract method

Returns the index within buf[start:start+count] that is offset from index by offset code points.

public abstract OffsetByCodePoints ( char[] buf, int start, int count, int index, int offset ) : int
buf	char[]
start	int
count	int
index	int
offset	int
return	int
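
A worked sketch may make the index arithmetic clearer. It assumes semantics like those of Java's Character.offsetByCodePoints (positive offsets move forward, negative offsets move backward, and running out of code points inside the window is an error); the class name and the exception type are hypothetical, and this is not the Lucene.Net implementation.

```csharp
using System;

public static class OffsetByCodePointsSketch
{
    // Hypothetical helper: moves `offset` code points away from `index`,
    // staying inside the window buf[start .. start + count).
    public static int OffsetByCodePoints(char[] buf, int start, int count, int index, int offset)
    {
        int limit = start + count;
        if (start < 0 || count < 0 || limit > buf.Length || index < start || index > limit)
            throw new ArgumentOutOfRangeException(nameof(index));

        int x = index;
        if (offset >= 0)
        {
            int moved = 0;
            for (; x < limit && moved < offset; moved++)
            {
                // Step over one char, or two if it starts a surrogate pair.
                if (char.IsHighSurrogate(buf[x++]) && x < limit && char.IsLowSurrogate(buf[x])) x++;
            }
            if (moved < offset) throw new ArgumentOutOfRangeException(nameof(offset));
        }
        else
        {
            int remaining = offset;
            for (; x > start && remaining < 0; remaining++)
            {
                // Step back one char, or two if it ends a surrogate pair.
                if (char.IsLowSurrogate(buf[--x]) && x > start && char.IsHighSurrogate(buf[x - 1])) x--;
            }
            if (remaining < 0) throw new ArgumentOutOfRangeException(nameof(offset));
        }
        return x;
    }
}
```

With `buf` holding `"a\uD83D\uDE00b"`, moving 2 code points forward from index 0 lands on index 3, and moving 1 code point backward from index 3 lands on index 1, the start of the surrogate pair.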

ToLower() public method

Converts each Unicode code point to lowercase via Character#toLowerCase(int), starting at the given offset.

public ToLower ( char[] buffer, int offset, int limit ) : void
buffer	char[]	the char buffer to lowercase
offset	int	the offset to start at
limit	int	the max char in the buffer to lowercase
return	void
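
The point of lowercasing by code point rather than by char is that a surrogate pair is treated as one unit. The sketch below shows the idea with `System.Text.Rune` (available from .NET Core 3.0 onward), which is a swapped-in technique: the description above refers to Java's Character#toLowerCase(int), and the class name `ToLowerSketch`, the invariant-culture casing, and the treatment of limit as an exclusive end index are all assumptions.

```csharp
using System;
using System.Buffers;
using System.Text;

public static class ToLowerSketch
{
    // Hypothetical helper: lowercases buffer[offset .. limit) in place,
    // one code point at a time.
    public static void ToLower(char[] buffer, int offset, int limit)
    {
        int i = offset;
        while (i < limit)
        {
            var rest = new ReadOnlySpan<char>(buffer, i, limit - i);
            // Decode one code point; a surrogate pair consumes two chars.
            OperationStatus status = Rune.DecodeFromUtf16(rest, out Rune rune, out int consumed);
            if (status == OperationStatus.Done)
            {
                Rune lowered = Rune.ToLowerInvariant(rune);
                // Write back only if the UTF-16 width is unchanged, so the
                // in-place update can never shift neighboring chars.
                if (lowered.Utf16SequenceLength == consumed)
                    lowered.EncodeToUtf16(new Span<char>(buffer, i, consumed));
            }
            // Unpaired surrogates are left untouched and skipped.
            i += consumed;
        }
    }
}
```

Calling `ToLower(text, 0, 5)` on the chars of `"HeLLo WORLD"` lowercases only the first five chars and leaves `WORLD` as it is.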

ToUpper() public method

Converts each Unicode code point to uppercase via Character#toUpperCase(int), starting at the given offset.

public ToUpper ( char[] buffer, int offset, int limit ) : void
buffer	char[]	the char buffer to uppercase
offset	int	the offset to start at
limit	int	the max char in the buffer to uppercase
return	void

toChars() public method

Converts a sequence of Unicode code points to a sequence of Java characters (UTF-16 chars). The return value is the number of chars written to dest.

public toChars ( int[] src, int srcOff, int srcLen, char[] dest, int destOff ) : int
src	int[]
srcOff	int
srcLen	int
dest	char[]
destOff	int
return	int
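
A minimal sketch of this conversion, using only the .NET standard library, is shown below. The class name `ToCharsSketch` and the exception type are hypothetical, and the sketch assumes that dest is large enough (at most two chars per code point).

```csharp
using System;

public static class ToCharsSketch
{
    // Hypothetical helper: encodes srcLen code points from src as UTF-16
    // chars in dest and returns the number of chars written.
    public static int ToChars(int[] src, int srcOff, int srcLen, char[] dest, int destOff)
    {
        if (srcLen < 0) throw new ArgumentException("srcLen must be >= 0");
        int written = 0;
        for (int i = 0; i < srcLen; i++)
        {
            // ConvertFromUtf32 yields one char for a BMP code point and a
            // surrogate pair (two chars) for a supplementary one.
            string utf16 = char.ConvertFromUtf32(src[srcOff + i]);
            utf16.CopyTo(0, dest, destOff + written, utf16.Length);
            written += utf16.Length;
        }
        return written;
    }
}
```

Converting the two code points `{ 0x61, 0x1F600 }` writes three chars, `a` followed by the surrogate pair `\uD83D\uDE00`.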

toCodePoints() public method

Converts a sequence of Java characters (UTF-16 chars) to a sequence of Unicode code points. The return value is the number of code points written to dest.

public toCodePoints ( char[] src, int srcOff, int srcLen, int[] dest, int destOff ) : int
src	char[]
srcOff	int
srcLen	int
dest	int[]
destOff	int
return	int
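
The reverse direction can be sketched in the same way. The class name `ToCodePointsSketch` and the exception type are hypothetical, and the sketch assumes that an unpaired surrogate is copied through as its own code point value.

```csharp
using System;

public static class ToCodePointsSketch
{
    // Hypothetical helper: decodes src[srcOff .. srcOff + srcLen) into code
    // points in dest and returns the number of code points written.
    public static int ToCodePoints(char[] src, int srcOff, int srcLen, int[] dest, int destOff)
    {
        if (srcLen < 0) throw new ArgumentException("srcLen must be >= 0");
        int end = srcOff + srcLen;
        int count = 0;
        for (int i = srcOff; i < end; )
        {
            int codePoint = src[i];
            int width = 1;
            // Merge a surrogate pair that lies fully inside the window.
            if (char.IsHighSurrogate(src[i]) && i + 1 < end && char.IsLowSurrogate(src[i + 1]))
            {
                codePoint = char.ConvertToUtf32(src[i], src[i + 1]);
                width = 2;
            }
            dest[destOff + count++] = codePoint;
            i += width;
        }
        return count;
    }
}
```

Decoding the four chars of `"a\uD83D\uDE00b"` produces the three code points 0x61, 0x1F600, and 0x62.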