Name | Description |
---|---|
CategoryPathUtils | Utilities for use of FacetLabel by CompactLabelToOrdinal. |
CompactLabelToOrdinal | An efficient LabelToOrdinal implementation that uses a CharBlockArray to store all labels and a configurable number of HashArrays to reference them. Because the HashArrays do not handle collisions, a CollisionMap stores the colliding labels. The structure grows by adding a new HashArray whenever the number of collisions in the CollisionMap exceeds `loadFactor` * `getMaxOrdinal()`. Growing also reinserts all colliding labels into the HashArrays, which may reduce the number of collisions. The `loadFactor` is set through the constructor `CompactLabelToOrdinal(int, float, int)`. Compared to a Java `HashMap<String, Integer>`, this structure has a much lower memory footprint (~30%) and uses only a small fraction of the objects, which limits GC overhead. Ingestion was also ~50% faster than with a HashMap for 3M unique labels. Experimental API (`@lucene.experimental`). |
CompactLabelToOrdinal.HashArray | |
TestCharBlockArray | |
TestCompactLabelToOrdinal | |
TestCompactLabelToOrdinal.LabelToOrdinalMap | |
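
The growth rule described for CompactLabelToOrdinal can be illustrated with a small, self-contained sketch. This is not Lucene's implementation: the class `ToyLabelToOrdinal`, its method names, the per-array hash mixing, and the fixed array size are all hypothetical, and a plain `ArrayList` stands in for the CharBlockArray. It only shows the idea of collision-free hash arrays backed by a collision map, with a new array added once collisions exceed `loadFactor` times the number of labels.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy illustration only: this is NOT Lucene's CompactLabelToOrdinal. */
public class ToyLabelToOrdinal {
    // ordinal -> label; a plain list stands in for the CharBlockArray
    private final List<String> labels = new ArrayList<>();
    // each slot holds an ordinal, or -1 when empty; slots are never chained
    private final List<int[]> hashArrays = new ArrayList<>();
    // labels that found no free slot in any hash array
    private final Map<String, Integer> collisionMap = new HashMap<>();
    private final int slotsPerArray;
    private final float loadFactor;

    public ToyLabelToOrdinal(int slotsPerArray, float loadFactor, int numHashArrays) {
        this.slotsPerArray = slotsPerArray;
        this.loadFactor = loadFactor;
        for (int i = 0; i < numHashArrays; i++) {
            hashArrays.add(newArray());
        }
    }

    private int[] newArray() {
        int[] a = new int[slotsPerArray];
        Arrays.fill(a, -1);
        return a;
    }

    // Assumption: mix the array index into the hash so each array maps a label differently.
    private int slot(String label, int arrayIndex) {
        int h = label.hashCode() ^ (arrayIndex * 0x9E3779B9);
        h ^= (h >>> 16);
        return Math.floorMod(h, slotsPerArray);
    }

    // Puts the ordinal into the first hash array whose slot is free; false if none is.
    private boolean tryInsert(String label, int ordinal) {
        for (int i = 0; i < hashArrays.size(); i++) {
            int[] a = hashArrays.get(i);
            int s = slot(label, i);
            if (a[s] == -1) {
                a[s] = ordinal;
                return true;
            }
        }
        return false;
    }

    /** Returns the ordinal for the label, or -1 if it was never added. */
    public int getOrdinal(String label) {
        for (int i = 0; i < hashArrays.size(); i++) {
            int ord = hashArrays.get(i)[slot(label, i)];
            if (ord != -1 && labels.get(ord).equals(label)) {
                return ord;
            }
        }
        return collisionMap.getOrDefault(label, -1);
    }

    /** Adds the label if it is new and returns its ordinal. */
    public int addLabel(String label) {
        int existing = getOrdinal(label);
        if (existing != -1) {
            return existing;
        }
        int ordinal = labels.size();
        labels.add(label);
        if (!tryInsert(label, ordinal)) {
            collisionMap.put(label, ordinal);
            if (collisionMap.size() > loadFactor * labels.size()) {
                grow();
            }
        }
        return ordinal;
    }

    // Adds one hash array, then retries every colliding label against all arrays.
    private void grow() {
        hashArrays.add(newArray());
        collisionMap.entrySet().removeIf(e -> tryInsert(e.getKey(), e.getValue()));
    }

    public int hashArrayCount() { return hashArrays.size(); }

    public int collisionCount() { return collisionMap.size(); }

    public static void main(String[] args) {
        ToyLabelToOrdinal map = new ToyLabelToOrdinal(1024, 0.15f, 3);
        for (int i = 0; i < 10_000; i++) {
            map.addLabel("category/" + i);
        }
        System.out.println("hash arrays: " + map.hashArrayCount()
                + ", colliding labels: " + map.collisionCount());
    }
}
```

In this sketch a lookup probes one slot per hash array and falls back to the collision map, so keeping the collision map small relative to the label count is what keeps most lookups on the array path.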