C# Class Lucene.Net.Util.BytesRefHash

BytesRefHash is a special-purpose, hash-map-like data structure optimized for BytesRef instances. It maps byte arrays to ids (conceptually Map<BytesRef,int>) and stores the hashed bytes efficiently in contiguous storage. The mapping to the id is encapsulated inside BytesRefHash, and ids are guaranteed to increase with each added BytesRef.

Note: A BytesRef instance passed to Add(BytesRef) must not be longer than ByteBlockPool#BYTE_BLOCK_SIZE-2. The internal storage is limited to 2GB of total byte storage.
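
The idea described above can be illustrated with a small, self-contained sketch. This is not the Lucene.Net implementation: the class name TinyBytesHash, the use of a Dictionary as the lookup index, and the plain List<byte> buffer are all assumptions made for illustration. The -(id + 1) return value for an already-present entry is likewise an assumed convention, borrowed from upstream Lucene.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical teaching sketch, not the Lucene.Net implementation.
// Maps byte sequences to ids 0, 1, 2, ... and keeps all bytes in one buffer.
public sealed class TinyBytesHash
{
    private readonly List<byte> pool = new List<byte>();   // contiguous byte storage
    private readonly List<int> starts = new List<int>();   // id -> start offset in pool
    private readonly List<int> lengths = new List<int>();  // id -> length in bytes
    // Stand-in for the real hash table: keyed by a Base64 string of the bytes.
    private readonly Dictionary<string, int> index = new Dictionary<string, int>();

    // Number of distinct byte sequences added so far.
    public int Size()
    {
        return starts.Count;
    }

    // Returns the new id, or -(id + 1) if the bytes were already present
    // (assumed convention, see the lead-in).
    public int Add(byte[] bytes)
    {
        string key = Convert.ToBase64String(bytes);
        if (index.TryGetValue(key, out int existing))
        {
            return -(existing + 1);
        }
        int id = starts.Count;      // ids grow by one per new entry
        starts.Add(pool.Count);
        lengths.Add(bytes.Length);
        pool.AddRange(bytes);
        index[key] = id;
        return id;
    }

    // Returns the id for the bytes, or -1 if they were never added.
    public int Find(byte[] bytes)
    {
        return index.TryGetValue(Convert.ToBase64String(bytes), out int id) ? id : -1;
    }

    // Copies the bytes stored for the given id out of the shared buffer.
    public byte[] Get(int id)
    {
        return pool.GetRange(starts[id], lengths[id]).ToArray();
    }
}
```

Adding the same bytes twice to a TinyBytesHash yields one new id and then a negative value, while Find never modifies the structure.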

@lucene.internal

Public Methods

Method Description
Add ( BytesRef bytes ) : int

Adds a new BytesRef

AddByPoolOffset ( int offset ) : int

Adds an "arbitrary" int offset instead of a BytesRef term. This is used in the indexer to hold the hash for term vectors, because they do not redundantly store the byte[] term directly and instead reference the byte[] term already stored by the postings BytesRefHash. See add(int textStart) in TermsHashPerField.

ByteStart ( int bytesID ) : int

Returns the bytesStart offset into the internally used ByteBlockPool for the given bytesID

BytesRefHash ( )

Creates a new BytesRefHash with a ByteBlockPool using a DirectAllocator.

BytesRefHash ( Lucene.Net.Util.ByteBlockPool pool )

Creates a new BytesRefHash

BytesRefHash ( Lucene.Net.Util.ByteBlockPool pool, int capacity, BytesStartArray bytesStartArray )

Creates a new BytesRefHash

Clear ( ) : void
Clear ( bool resetPool ) : void

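Clears all entries from this BytesRefHash. For Clear(bool resetPool), the resetPool flag indicates whether the underlying ByteBlockPool is reset as well.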

Close ( ) : void

Closes the BytesRefHash and releases all internally used memory

Compact ( ) : int[]

Returns the ids array in arbitrary order. Valid ids start at offset 0 and end at Size() - 1.

Note: This is a destructive operation. Clear() must be called in order to reuse this BytesRefHash instance.

Find ( BytesRef bytes ) : int

Returns the id of the given BytesRef, or -1 if there is no mapping for the given bytes.

Get ( int bytesID, BytesRef @ref ) : BytesRef

Populates and returns a BytesRef with the bytes for the given bytesID.

Note: The given bytesID must be a non-negative integer less than the current size (Size()).

Reinit ( ) : void

Reinitializes the BytesRefHash after a previous Clear() call. If Clear() has not been called previously, this method has no effect.

Size ( ) : int

Returns the number of BytesRef values in this BytesRefHash.

Sort ( IComparer comp ) : int[]

Returns the values array sorted by the referenced byte values.

Note: This is a destructive operation. Clear() must be called in order to reuse this BytesRefHash instance.

Private Methods

Method Description
DoHash ( byte bytes, int offset, int length ) : int
Equals ( int id, BytesRef b ) : bool
FindHash ( BytesRef bytes ) : int
Rehash ( int newSize, bool hashOnData ) : void

Called when hash is too small (> 50% occupied) or too large (< 20% occupied).

Shrink ( int targetSize ) : bool
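
The growth rule in the Rehash description (grow once the table is more than 50% occupied) can be sketched with a small open-addressing table. This is a hypothetical illustration, not the BytesRefHash code: the IdTable class, its linear probing, and the use of -1 as the empty-slot marker are assumptions, and the shrink path for a table that is too large is left out.

```csharp
using System;

// Hypothetical sketch: an open-addressing table of non-negative ints
// that doubles its size once it is more than 50% occupied.
public sealed class IdTable
{
    private int[] slots;   // each slot holds a key, or -1 when empty
    private int count;     // number of occupied slots

    // capacity must be a power of two so that "& mask" works as a modulo.
    public IdTable(int capacity)
    {
        slots = NewSlots(capacity);
    }

    public int Capacity()
    {
        return slots.Length;
    }

    // Inserts a non-negative key, then grows the table if it is > 50% full.
    public void Insert(int key)
    {
        if (Place(slots, key))
        {
            count++;
        }
        if (count * 2 > slots.Length)
        {
            Rehash(slots.Length * 2);
        }
    }

    public bool Contains(int key)
    {
        int mask = slots.Length - 1;
        int pos = key & mask;
        while (slots[pos] != -1)
        {
            if (slots[pos] == key) return true;
            pos = (pos + 1) & mask;   // linear probing
        }
        return false;
    }

    // Re-inserts every occupied slot into a fresh table of the new size.
    private void Rehash(int newSize)
    {
        int[] bigger = NewSlots(newSize);
        foreach (int key in slots)
        {
            if (key != -1) Place(bigger, key);
        }
        slots = bigger;
    }

    // Returns true if the key was newly placed, false if it was already there.
    private static bool Place(int[] table, int key)
    {
        int mask = table.Length - 1;
        int pos = key & mask;
        while (table[pos] != -1)
        {
            if (table[pos] == key) return false;
            pos = (pos + 1) & mask;
        }
        table[pos] = key;
        return true;
    }

    private static int[] NewSlots(int size)
    {
        int[] s = new int[size];
        for (int i = 0; i < s.Length; i++) s[i] = -1;
        return s;
    }
}
```

Because the table grows right after the insert that crosses the 50% mark, at least half of the slots are always empty, so the probe loops always terminate.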

Method Details

Add() public method

Adds a new BytesRef
Throws an exception if the given bytes are longer than ByteBlockPool#BYTE_BLOCK_SIZE-2.
public Add ( BytesRef bytes ) : int
bytes BytesRef the bytes to hash
return int

AddByPoolOffset() public method

Adds an "arbitrary" int offset instead of a BytesRef term. This is used in the indexer to hold the hash for term vectors, because they do not redundantly store the byte[] term directly and instead reference the byte[] term already stored by the postings BytesRefHash. See add(int textStart) in TermsHashPerField.
public AddByPoolOffset ( int offset ) : int
offset int
return int

ByteStart() public method

Returns the bytesStart offset into the internally used ByteBlockPool for the given bytesID
public ByteStart ( int bytesID ) : int
bytesID int the id to look up
return int

BytesRefHash() public method

Creates a new BytesRefHash with a ByteBlockPool using a DirectAllocator.
public BytesRefHash ( )

BytesRefHash() public method

Creates a new BytesRefHash
public BytesRefHash ( Lucene.Net.Util.ByteBlockPool pool )
pool Lucene.Net.Util.ByteBlockPool

BytesRefHash() public method

Creates a new BytesRefHash
public BytesRefHash ( Lucene.Net.Util.ByteBlockPool pool, int capacity, BytesStartArray bytesStartArray )
pool Lucene.Net.Util.ByteBlockPool
capacity int
bytesStartArray BytesStartArray

Clear() public method

public Clear ( ) : void
return void

Clear() public method

Clears all entries from this BytesRefHash. The resetPool flag indicates whether the underlying ByteBlockPool is reset as well.
public Clear ( bool resetPool ) : void
resetPool bool
return void

Close() public method

Closes the BytesRefHash and releases all internally used memory
public Close ( ) : void
return void

Compact() public method

Returns the ids array in arbitrary order. Valid ids start at offset 0 and end at Size() - 1.

Note: This is a destructive operation. Clear() must be called in order to reuse this BytesRefHash instance.

public Compact ( ) : int[]
return int[]
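
A rough sketch of why compaction is destructive, assuming the ids live in a hash-slot array where empty slots are marked with -1 (an assumption for illustration; the helper name CompactInPlace is hypothetical). Packing the used slots to the front discards the slot positions that lookups depend on.

```csharp
using System;

public static class CompactSketch
{
    // Moves every used slot (value != -1) to the front of the array, in place,
    // and returns the number of used slots. Slots left behind are set to -1.
    public static int CompactInPlace(int[] ids)
    {
        int upto = 0;   // next free position at the front
        for (int i = 0; i < ids.Length; i++)
        {
            if (ids[i] != -1)
            {
                if (upto < i)
                {
                    ids[upto] = ids[i];
                    ids[i] = -1;
                }
                upto++;
            }
        }
        return upto;
    }
}
```

After this pass an id no longer sits at the slot its hash points to, which is consistent with the note that the instance must be cleared before it is reused.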

Find() public method

Returns the id of the given BytesRef, or -1 if there is no mapping for the given bytes.
public Find ( BytesRef bytes ) : int
bytes BytesRef the bytes to look for
return int

Get() public method

Populates and returns a BytesRef with the bytes for the given bytesID.

Note: The given bytesID must be a non-negative integer less than the current size (Size()).

public Get ( int bytesID, BytesRef @ref ) : BytesRef
bytesID int the id
@ref BytesRef
return BytesRef

Reinit() public method

Reinitializes the BytesRefHash after a previous Clear() call. If Clear() has not been called previously, this method has no effect.
public Reinit ( ) : void
return void

Size() public method

Returns the number of BytesRef values in this BytesRefHash.
public Size ( ) : int
return int

Sort() public method

Returns the values array sorted by the referenced byte values.

Note: This is a destructive operation. Clear() must be called in order to reuse this BytesRefHash instance.

public Sort ( IComparer comp ) : int[]
comp IComparer the comparer used for sorting
return int[]
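
Sorting ids "by the referenced byte values" can be sketched as follows. The code works on plain byte[] values rather than BytesRef, and the names UnsignedBytesComparer and SortSketch.SortIds are hypothetical. It shows the idea of ordering ids by the bytes they refer to, not the actual Sort implementation.

```csharp
using System;
using System.Collections.Generic;

// Compares byte arrays lexicographically; C# bytes are unsigned (0..255).
public sealed class UnsignedBytesComparer : IComparer<byte[]>
{
    public int Compare(byte[] a, byte[] b)
    {
        int n = Math.Min(a.Length, b.Length);
        for (int i = 0; i < n; i++)
        {
            int diff = a[i] - b[i];
            if (diff != 0) return diff;
        }
        return a.Length - b.Length;   // on a shared prefix, the shorter array sorts first
    }
}

public static class SortSketch
{
    // values[id] holds the bytes for that id; returns the ids ordered by their bytes.
    public static int[] SortIds(IList<byte[]> values, IComparer<byte[]> comp)
    {
        int[] ids = new int[values.Count];
        for (int i = 0; i < ids.Length; i++) ids[i] = i;
        Array.Sort(ids, (x, y) => comp.Compare(values[x], values[y]));
        return ids;
    }
}
```

The returned array holds ids, not bytes, so a caller would still look up each id to read the sorted values.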