C# Class Lucene.Net.Util.NumericUtils

This is a helper class that generates prefix-encoded representations of numerical values and supplies converters that represent float/double values as sortable integers/longs.

To execute range queries quickly in Apache Lucene, a range is divided recursively into multiple intervals for searching: the center of the range is searched only with the lowest possible precision in the trie, while the boundaries are matched more exactly. This reduces the number of terms dramatically.

This class generates the terms needed to achieve this. First, the numerical integer values need to be converted to bytes. To do this, integer values (32 bit or 64 bit) are made unsigned and their bits are packed into ASCII chars of 7 bits each. The resulting byte[] sorts in the same order as the original integer value (even using UTF-8 sort order). Each value is also prefixed (in the first char) by the shift value (the number of bits removed) used during encoding.
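The encoding described above can be sketched in standalone C#. This is a minimal illustration rather than the library's source: the class and method names are hypothetical, a plain `byte[]` stands in for `BytesRef`, and the `0x20` offset added to the shift byte is an assumption taken from the Lucene 4.x Java sources.

```csharp
using System;

// Hypothetical standalone sketch; not part of Lucene.Net.
public static class PrefixCodingSketch
{
    // Assumed offset of the shift byte for 64 bit terms.
    public const byte ShiftStartLong = 0x20;

    // Encodes val with the lowest `shift` bits removed.
    public static byte[] LongToPrefixCoded(long val, int shift)
    {
        if (shift < 0 || shift > 63)
            throw new ArgumentOutOfRangeException(nameof(shift), "shift must be in 0..63");

        // Number of 7 bit groups needed for the remaining (64 - shift) bits.
        int nChars = (63 - shift) / 7 + 1;
        var bytes = new byte[nChars + 1];

        // The first byte carries the shift value.
        bytes[0] = (byte)(ShiftStartLong + shift);

        // Flipping the sign bit makes unsigned order match signed numeric order.
        ulong sortableBits = unchecked((ulong)val) ^ 0x8000000000000000UL;
        sortableBits >>= shift;

        // Fill from the last byte backwards, 7 bits per byte (always < 0x80).
        for (int i = nChars; i > 0; i--)
        {
            bytes[i] = (byte)(sortableBits & 0x7F);
            sortableBits >>= 7;
        }
        return bytes;
    }

    // Plain unsigned lexicographic comparison of two byte arrays.
    public static int CompareBytes(byte[] a, byte[] b)
    {
        int n = Math.Min(a.Length, b.Length);
        for (int i = 0; i < n; i++)
        {
            if (a[i] != b[i]) return a[i].CompareTo(b[i]);
        }
        return a.Length.CompareTo(b.Length);
    }
}
```

Under these assumptions, `LongToPrefixCoded(1000L, 8)` yields nine bytes: `0x28` (the offset plus shift 8), then `0x40`, six zero bytes, and `0x03`. With shift 0, the encoding of -5 compares lower than the encoding of 3, mirroring the numeric order.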

To also index floating point numbers, this class supplies two methods that convert them to integer values by changing their bit layout: DoubleToSortableLong and FloatToSortableInt. Converting floating point numbers to integers and back causes no precision loss (although the integer form is not usable as a number in its own right). Other data types, such as dates, can easily be converted to longs or ints (e.g. date to long: java.util.Date#getTime).
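The bit-layout conversion can be sketched in standalone C#. This is a minimal illustration, not the library's source: the class name is hypothetical, the XOR mask is assumed from the Lucene 4.x Java sources, and NaN is canonicalized explicitly here so that it sorts above positive infinity whatever NaN bit pattern the runtime produces.

```csharp
using System;

// Hypothetical standalone sketch; not part of Lucene.Net.
public static class SortableDoubleSketch
{
    public static long DoubleToSortableLong(double val)
    {
        // Assumption: map every NaN to one positive quiet-NaN bit pattern.
        long bits = double.IsNaN(val)
            ? 0x7FF8000000000000L
            : BitConverter.DoubleToInt64Bits(val);

        // Negative doubles have the sign bit set; flipping the other 63 bits
        // reverses their order so that signed long comparison matches.
        if (bits < 0) bits ^= 0x7FFFFFFFFFFFFFFFL;
        return bits;
    }

    public static double SortableLongToDouble(long val)
    {
        // The same XOR undoes the transformation.
        if (val < 0) val ^= 0x7FFFFFFFFFFFFFFFL;
        return BitConverter.Int64BitsToDouble(val);
    }
}
```

The round trip is lossless because only the bit pattern is rearranged. A `float` variant would follow the same shape with a 32 bit mask of `0x7FFFFFFF`.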

For easy usage, the trie algorithm is implemented for indexing inside NumericTokenStream that can index int, long, float, and double. For querying, NumericRangeQuery and NumericRangeFilter implement the query part for the same data types.

This class can also be used to generate lexicographically sortable (according to BytesRef#getUTF8SortedAsUTF16Comparator()) representations of numeric data types for other usages (e.g. sorting). This is an internal API (@lucene.internal), available since 2.9; the API changed in a non-backwards-compatible way in 4.0.

Public Methods

Method	Description
DoubleToSortableLong ( double val ) : long	Converts a `double` value to a sortable signed `long`. The value is converted by taking its IEEE 754 floating-point "double format" bit layout and swapping some bits so that the result can be compared as a long. The precision is not reduced, and the value can easily be used as a long. The sort order (including Double#NaN) is defined by Double#compareTo; `NaN` is greater than positive infinity.
FilterPrefixCodedInts ( TermsEnum termsEnum ) : TermsEnum	Filters the given TermsEnum by accepting only prefix coded 32 bit terms with a shift value of `0`.
FilterPrefixCodedLongs ( TermsEnum termsEnum ) : TermsEnum	Filters the given TermsEnum by accepting only prefix coded 64 bit terms with a shift value of `0`.
FloatToSortableInt ( float val ) : int	Converts a `float` value to a sortable signed `int`. The value is converted by taking its IEEE 754 floating-point "float format" bit layout and swapping some bits so that the result can be compared as an int. The precision is not reduced, and the value can easily be used as an int. The sort order (including Float#NaN) is defined by Float#compareTo; `NaN` is greater than positive infinity.
GetPrefixCodedIntShift ( BytesRef val ) : int	Returns the shift value from a prefix encoded `int`.
GetPrefixCodedLongShift ( BytesRef val ) : int	Returns the shift value from a prefix encoded `long`.
IntToPrefixCoded ( int val, int shift, BytesRef bytes ) : void	Writes the prefix coded bits to `bytes` after reducing the precision by `shift` bits. This method is used by NumericTokenStream. After encoding, `bytes.offset` will always be 0.
IntToPrefixCodedBytes ( int val, int shift, BytesRef bytes ) : void	Writes the prefix coded bits to `bytes` after reducing the precision by `shift` bits. This method is used by NumericTokenStream. After encoding, `bytes.offset` will always be 0.
LongToPrefixCoded ( long val, int shift, BytesRef bytes ) : void	Writes the prefix coded bits to `bytes` after reducing the precision by `shift` bits. This method is used by NumericTokenStream. After encoding, `bytes.offset` will always be 0.
LongToPrefixCodedBytes ( long val, int shift, BytesRef bytes ) : void	Writes the prefix coded bits to `bytes` after reducing the precision by `shift` bits. This method is used by NumericTokenStream. After encoding, `bytes.offset` will always be 0.
PrefixCodedToInt ( BytesRef val ) : int	Returns an int from prefix coded bytes. The rightmost bits will be zero for lower precision codes. This method can be used to decode a term's value.
PrefixCodedToLong ( BytesRef val ) : long	Returns a long from prefix coded bytes. The rightmost bits will be zero for lower precision codes. This method can be used to decode a term's value.
SortableIntToFloat ( int val ) : float	Converts a sortable `int` back to a `float`.
SortableLongToDouble ( long val ) : double	Converts a sortable `long` back to a `double`.
SplitIntRange ( IntRangeBuilder builder, int precisionStep, int minBound, int maxBound ) : void	Splits an int range recursively. You may implement a builder that adds clauses to a Lucene.Net.Search.BooleanQuery for each call to its IntRangeBuilder#addRange(BytesRef,BytesRef) method. This method is used by NumericRangeQuery.
SplitLongRange ( LongRangeBuilder builder, int precisionStep, long minBound, long maxBound ) : void	Splits a long range recursively. You may implement a builder that adds clauses to a Lucene.Net.Search.BooleanQuery for each call to its LongRangeBuilder#addRange(BytesRef,BytesRef) method. This method is used by NumericRangeQuery.

Private Methods

Method	Description
AddRange ( object builder, int valSize, long minBound, long maxBound, int shift ) : void	Helper that delegates to the correct range builder.
NumericUtils ( ) : Lucene.Net.Documents
SplitRange ( object builder, int valSize, int precisionStep, long minBound, long maxBound ) : void	This helper does the splitting for both 32 and 64 bit.

Method Details

DoubleToSortableLong() public static Method

Converts a double value to a sortable signed long. The value is converted by taking its IEEE 754 floating-point "double format" bit layout and swapping some bits so that the result can be compared as a long. The precision is not reduced, and the value can easily be used as a long. The sort order (including Double#NaN) is defined by Double#compareTo; `NaN` is greater than positive infinity.

public static DoubleToSortableLong ( double val ) : long
val	double
Result	long

FilterPrefixCodedInts() public static Method

Filters the given TermsEnum by accepting only prefix coded 32 bit terms with a shift value of 0.

public static FilterPrefixCodedInts ( TermsEnum termsEnum ) : TermsEnum
termsEnum	TermsEnum	the terms enum to filter
Result	TermsEnum

FilterPrefixCodedLongs() public static Method

Filters the given TermsEnum by accepting only prefix coded 64 bit terms with a shift value of 0.

public static FilterPrefixCodedLongs ( TermsEnum termsEnum ) : TermsEnum
termsEnum	TermsEnum	the terms enum to filter
Result	TermsEnum

FloatToSortableInt() public static Method

Converts a float value to a sortable signed int. The value is converted by taking its IEEE 754 floating-point "float format" bit layout and swapping some bits so that the result can be compared as an int. The precision is not reduced, and the value can easily be used as an int. The sort order (including Float#NaN) is defined by Float#compareTo; `NaN` is greater than positive infinity.

public static FloatToSortableInt ( float val ) : int
val	float
Result	int

GetPrefixCodedIntShift() public static Method

Returns the shift value from a prefix encoded `int`.

Throws if the supplied BytesRef is not correctly prefix encoded.

public static GetPrefixCodedIntShift ( BytesRef val ) : int
val	BytesRef
Result	int

GetPrefixCodedLongShift() public static Method

Returns the shift value from a prefix encoded `long`.

Throws if the supplied BytesRef is not correctly prefix encoded.

public static GetPrefixCodedLongShift ( BytesRef val ) : int
val	BytesRef
Result	int

IntToPrefixCoded() public static Method

Writes the prefix coded bits to `bytes` after reducing the precision by shift bits. This method is used by NumericTokenStream. After encoding, `bytes.offset` will always be 0.

public static IntToPrefixCoded ( int val, int shift, BytesRef bytes ) : void
val	int	the numeric value
shift	int	how many bits to strip from the right
bytes	BytesRef	will contain the encoded value
Result	void

IntToPrefixCodedBytes() public static Method

Writes the prefix coded bits to `bytes` after reducing the precision by shift bits. This method is used by NumericTokenStream. After encoding, `bytes.offset` will always be 0.

public static IntToPrefixCodedBytes ( int val, int shift, BytesRef bytes ) : void
val	int	the numeric value
shift	int	how many bits to strip from the right
bytes	BytesRef	will contain the encoded value
Result	void

LongToPrefixCoded() public static Method

Writes the prefix coded bits to `bytes` after reducing the precision by shift bits. This method is used by NumericTokenStream. After encoding, `bytes.offset` will always be 0.

public static LongToPrefixCoded ( long val, int shift, BytesRef bytes ) : void
val	long	the numeric value
shift	int	how many bits to strip from the right
bytes	BytesRef	will contain the encoded value
Result	void

LongToPrefixCodedBytes() public static Method

Writes the prefix coded bits to `bytes` after reducing the precision by shift bits. This method is used by NumericTokenStream. After encoding, `bytes.offset` will always be 0.

public static LongToPrefixCodedBytes ( long val, int shift, BytesRef bytes ) : void
val	long	the numeric value
shift	int	how many bits to strip from the right
bytes	BytesRef	will contain the encoded value
Result	void

PrefixCodedToInt() public static Method

Returns an int from prefix coded bytes. The rightmost bits will be zero for lower precision codes. This method can be used to decode a term's value.

Throws if the supplied BytesRef is not correctly prefix encoded.

public static PrefixCodedToInt ( BytesRef val ) : int
val	BytesRef
Result	int

PrefixCodedToLong() public static Method

Returns a long from prefix coded bytes. The rightmost bits will be zero for lower precision codes. This method can be used to decode a term's value.

Throws if the supplied BytesRef is not correctly prefix encoded.

public static PrefixCodedToLong ( BytesRef val ) : long
val	BytesRef
Result	long
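
Decoding can be sketched in standalone C# to show why the rightmost bits come back as zero for lower precision codes. This is an illustration only: the class and method names are hypothetical, a plain `byte[]` stands in for `BytesRef`, the `0x20` shift offset is an assumption taken from the Lucene 4.x Java sources, and `FormatException` is chosen here for the error case without claiming it is the type the library throws.

```csharp
using System;

// Hypothetical standalone sketch; not part of Lucene.Net.
public static class PrefixDecodingSketch
{
    // Assumed offset of the shift byte for 64 bit terms.
    public const byte ShiftStartLong = 0x20;

    // Reads the shift value stored in the first byte.
    public static int GetShift(byte[] term)
    {
        int shift = term[0] - ShiftStartLong;
        if (shift < 0 || shift > 63)
            throw new FormatException("Invalid shift value in prefix coded bytes");
        return shift;
    }

    public static long PrefixCodedToLong(byte[] term)
    {
        ulong sortableBits = 0UL;
        for (int i = 1; i < term.Length; i++)
        {
            // Every payload byte carries 7 bits, so the high bit must be clear.
            if (term[i] > 0x7F)
                throw new FormatException("Invalid byte in prefix coded term");
            sortableBits = (sortableBits << 7) | term[i];
        }
        // Shifting back re-inserts zeros for the stripped bits;
        // the XOR restores the original sign bit.
        ulong bits = (sortableBits << GetShift(term)) ^ 0x8000000000000000UL;
        return unchecked((long)bits);
    }
}
```

Under these assumptions, the term `{ 0x28, 0x40, 0, 0, 0, 0, 0, 0, 0x03 }` (shift 8) decodes to 768, which is 1000 with its lowest 8 bits cleared.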

SortableIntToFloat() public static Method

Converts a sortable int back to a float.

public static SortableIntToFloat ( int val ) : float
val	int
Result	float

SortableLongToDouble() public static Method

Converts a sortable long back to a double.

public static SortableLongToDouble ( long val ) : double
val	long
Result	double

SplitIntRange() public static Method

Splits an int range recursively. You may implement a builder that adds clauses to a Lucene.Net.Search.BooleanQuery for each call to its IntRangeBuilder#addRange(BytesRef,BytesRef) method.

This method is used by NumericRangeQuery.

public static SplitIntRange ( IntRangeBuilder builder, int precisionStep, int minBound, int maxBound ) : void
builder	IntRangeBuilder
precisionStep	int
minBound	int
maxBound	int
Result	void

SplitLongRange() public static Method

Splits a long range recursively. You may implement a builder that adds clauses to a Lucene.Net.Search.BooleanQuery for each call to its LongRangeBuilder#addRange(BytesRef,BytesRef) method.

This method is used by NumericRangeQuery.

public static SplitLongRange ( LongRangeBuilder builder, int precisionStep, long minBound, long maxBound ) : void
builder	LongRangeBuilder
precisionStep	int
minBound	long
maxBound	long
Result	void