C# Class Lucene.Net.Facet.Taxonomy.WriterCache.CompactLabelToOrdinal

This is a very efficient LabelToOrdinal implementation that uses a CharBlockArray to store all labels and a configurable number of HashArrays to reference the labels.

Since the HashArrays don't handle collisions, a CollisionMap is used to store the colliding labels.

This data structure grows by adding a new HashArray whenever the number of collisions in the CollisionMap exceeds loadFactor * getMaxOrdinal(). Growing also reinserts all colliding labels into the HashArrays, which may reduce the number of collisions. To set loadFactor, see the constructor CompactLabelToOrdinal(int, float, int).
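The lookup and growth scheme described above can be sketched with a much simpler model. The following is a hypothetical illustration, not the Lucene.NET implementation: the class name TinyLabelToOrdinal, the InvalidOrdinal sentinel, plain string labels in place of FacetLabel, a Dictionary in place of the CollisionMap, and the choice to double the size of each added hash array are all assumptions made for the sketch.

```csharp
using System;
using System.Collections.Generic;

// Simplified, hypothetical model of the scheme described above.
// NOT the Lucene.NET implementation: labels are plain strings, and each
// hash array stores the label itself instead of an offset into a CharBlockArray.
public class TinyLabelToOrdinal
{
    public const int InvalidOrdinal = -1; // sentinel chosen for this sketch

    private readonly List<string[]> labels = new List<string[]>(); // one slot array per hash array
    private readonly List<int[]> ordinals = new List<int[]>();     // ordinals parallel to the slots
    private readonly Dictionary<string, int> collisionMap = new Dictionary<string, int>();
    private readonly float loadFactor;
    private int maxOrdinal;

    public TinyLabelToOrdinal(int initialCapacity, float loadFactor, int numHashArrays)
    {
        if (initialCapacity < 1 || numHashArrays < 1)
            throw new ArgumentOutOfRangeException();
        this.loadFactor = loadFactor;
        for (int i = 0; i < numHashArrays; i++)
            AppendHashArray(initialCapacity);
    }

    public int HashArrayCount { get { return labels.Count; } }
    public int CollisionCount { get { return collisionMap.Count; } }

    public void AddLabel(string label, int ordinal)
    {
        maxOrdinal = Math.Max(maxOrdinal, ordinal);
        // Growth rule from the text: too many collisions relative to the ordinals seen.
        if (collisionMap.Count > loadFactor * maxOrdinal)
            Grow();
        if (!TryPlace(label, ordinal))
            collisionMap[label] = ordinal; // every candidate slot was taken
    }

    public int GetOrdinal(string label)
    {
        int hash = label.GetHashCode();
        for (int i = 0; i < labels.Count; i++)
        {
            int slot = SlotFor(hash, labels[i].Length);
            if (labels[i][slot] == label)
                return ordinals[i][slot];
        }
        int found;
        return collisionMap.TryGetValue(label, out found) ? found : InvalidOrdinal;
    }

    // Tries the label's single slot in each hash array, in order; no probing.
    private bool TryPlace(string label, int ordinal)
    {
        int hash = label.GetHashCode();
        for (int i = 0; i < labels.Count; i++)
        {
            int slot = SlotFor(hash, labels[i].Length);
            if (labels[i][slot] == null || labels[i][slot] == label)
            {
                labels[i][slot] = label;
                ordinals[i][slot] = ordinal;
                return true;
            }
        }
        return false;
    }

    private void Grow()
    {
        // Add one hash array, twice as large as the most recently added one,
        // then give every colliding label another chance to find a free slot.
        AppendHashArray(labels[labels.Count - 1].Length * 2);
        var pending = new List<KeyValuePair<string, int>>(collisionMap);
        collisionMap.Clear();
        foreach (var entry in pending)
        {
            if (!TryPlace(entry.Key, entry.Value))
                collisionMap[entry.Key] = entry.Value;
        }
    }

    private void AppendHashArray(int size)
    {
        labels.Add(new string[size]);
        ordinals.Add(new int[size]);
    }

    private static int SlotFor(int hash, int length)
    {
        return (hash & int.MaxValue) % length; // clear the sign bit, then wrap
    }
}
```

In this sketch, creating `new TinyLabelToOrdinal(16, 0.15f, 2)`, calling `AddLabel("Author/Mark Twain", 7)` and then `GetOrdinal("Author/Mark Twain")` yields 7, while a label that was never added yields `InvalidOrdinal`. Because each hash array offers a label exactly one slot, a lookup costs at most one probe per hash array plus one collision-map lookup.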

This data structure has a much lower memory footprint (~30%) compared to a Java HashMap<String, Integer>. It also uses only a small fraction of the objects a HashMap would use, which limits GC overhead. Ingestion speed was also ~50% faster compared to a HashMap for 3M unique labels.

This API is experimental (@lucene.experimental).

Inheritance: Lucene.Net.Facet.Taxonomy.WriterCache.LabelToOrdinal
Project: apache/lucenenet

Public Methods

Method Description
AddLabel ( FacetLabel label, int ordinal ) : void
CompactLabelToOrdinal ( int initialCapacity, float loadFactor, int numHashArrays )

Sole constructor.

GetOrdinal ( FacetLabel label ) : int

Private Methods

Method Description
AddLabel ( HashArray a, FacetLabel label, int hash, int ordinal ) : bool
AddLabelOffset ( int hash, int cid, int knownOffset ) : void
AddLabelOffsetToHashArray ( HashArray a, int hash, int ordinal, int knownOffset ) : bool
CompactLabelToOrdinal ( )
DetermineCapacity ( int minCapacity, int initialCapacity ) : int
Flush ( Stream stream ) : void
GetMemoryUsage ( ) : int

Returns an estimate of the amount of memory used by this table. Called only in this package. Memory is consumed mainly by three structures: the hash arrays, label repository and collision map.

GetOrdinal ( HashArray a, FacetLabel label, int hash ) : int
Grow ( ) : void
IndexFor ( int h, int length ) : int

Returns index for hash code h.

Init ( ) : void
Open ( FileInfo file, float loadFactor, int numHashArrays ) : CompactLabelToOrdinal

Opens the file and reloads the CompactLabelToOrdinal from it. The expected file is one written by the Flush(Stream) method.

StringHashCode ( Lucene.Net.Facet.Taxonomy.WriterCache.CharBlockArray labelRepository, int offset ) : int
StringHashCode ( FacetLabel label ) : int
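
IndexFor is described above only as returning the index for hash code h, and DetermineCapacity is listed without a description. A common way to implement such a pair, assumed here and not confirmed by this page, is to keep table lengths at powers of two so that the index can be computed with a bit mask. The class name HashIndexSketch and both method bodies are hypothetical.

```csharp
using System;

// Hypothetical helpers; the bodies are assumptions, not taken from this page.
public static class HashIndexSketch
{
    // Doubles minCapacity until it is at least initialCapacity.
    // If minCapacity is a power of two, so is the result.
    public static int DetermineCapacity(int minCapacity, int initialCapacity)
    {
        if (minCapacity < 1 || initialCapacity > (1 << 30))
            throw new ArgumentOutOfRangeException(); // avoids an endless loop or overflow
        int capacity = minCapacity;
        while (capacity < initialCapacity)
            capacity <<= 1;
        return capacity;
    }

    // Maps hash code h to a slot in [0, length).
    // Requires length to be a power of two, so that (length - 1) is a mask
    // of the low bits; the mask also handles negative hash codes.
    public static int IndexFor(int h, int length)
    {
        return h & (length - 1);
    }
}
```

With these definitions, `DetermineCapacity(16, 100)` gives 128 and `IndexFor(37, 16)` gives 5. Masking avoids the division of a modulo operation, at the cost of restricting table lengths to powers of two.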

Method Details

AddLabel() public method

public AddLabel ( FacetLabel label, int ordinal ) : void
label FacetLabel
ordinal int
return void

CompactLabelToOrdinal() public method

Sole constructor.
public CompactLabelToOrdinal ( int initialCapacity, float loadFactor, int numHashArrays )
initialCapacity int
loadFactor float
numHashArrays int

GetOrdinal() public method

public GetOrdinal ( FacetLabel label ) : int
label FacetLabel
return int