C# Class StopGuessing.DataStructures.BinomialLadderFilter

A binomial ladder filter maps each element's key to H indexes in an n-bit table, whose bits are initially set at random with probability 0.5. Each index represents a rung on the element's ladder. The indexes set to 1 (true) are rungs below the element on the ladder, and the indexes set to 0 (false) are rungs above the element, which it has yet to climb. An element that has not been seen before will map to, on average, H/2 indexes that are set to 1 (true) and H/2 indexes that are set to 0 (false). This means that half the rungs are below the element and half are above it, so the element is halfway up the ladder.

If the subset of an element's indexes that are set to 0 is non-empty, observing that element (the Step() method) will cause a random member of that subset to be changed from 0 to 1, moving the element one rung up the ladder. To ensure that the fraction of bits set to 1 stays constant, the step operation will simultaneously clear a random bit selected from the subset of indexes in the entire sketch that have value 1. The more times an element has been observed, the higher the expected number of its indexes that have been set, and the higher it moves up the ladder until it reaches the top (when all of its bits are set). To count the subset of a key's indexes that have been set, one can call the GetRungsBelow method.

Natural variance will cause some never-before-seen keys to have more than H/2 bits set, and keys that have been observed very rarely (or only in the distant past) to have fewer than H/2 bits set. An element that is observed only a few times (with ~ H/2 + epsilon bits set) will be among the ~ .5^{epsilon} fraction of false-positive keys that have as many bits set by chance. Thus, until an element has been observed a more substantial number of times, it is hard to differentiate from false positives. For the odds of a false positive to fall below one in a million, it takes an average of 20 observations.
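The figures in the description can be checked with a short calculation. The block below is a sketch under one stated assumption that the text implies but does not spell out: each of the H bits that a never-seen key maps to is independently 1 with probability 1/2. Reading the "20 observations" figure as epsilon = 20 in the text's .5^{epsilon} rule of thumb is also an interpretation, not something the source states.

```latex
% Assumption: a never-seen key's H bits are independent and each is 1 with probability 1/2,
% so its height X (the number of rungs below it) is binomially distributed.
\[
X \sim \mathrm{Binomial}\!\left(H, \tfrac{1}{2}\right), \qquad
\Pr[X = j] = \binom{H}{j}\, 2^{-H}, \qquad
\mathbb{E}[X] = \frac{H}{2}
\]

% The text's rule of thumb, (0.5)^{\varepsilon}, evaluated at \varepsilon = 20:
\[
2^{-20} = \frac{1}{1\,048\,576} \approx 9.5 \times 10^{-7} < 10^{-6}
\]
```

Under that reading, 20 rungs above the halfway point is the first height at which the rule-of-thumb false-positive fraction drops below one in a million.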
Inheritance: FilterArray, IBinomialLadderFilter
Project: Microsoft/StopGuessing

Public Methods

Method Description
BinomialLadderFilter ( int numberOfRungsInArray, int maxLadderHeightInRungs ) : System

Construct a binomial sketch, in which a set of k hash functions (k = maxLadderHeightInRungs) will map any element to k points within an array of n bits (n = numberOfRungsInArray). When one adds an element to a binomial sketch, a random bit among the subset of the element's k bits that are currently 0 will be set to 1. To ensure roughly half the bits remain zero, a random index from the subset of all n bits in the array that are currently 1 will be set to 0. Over time, popular keys will have almost all of their bits set, and unpopular keys will be expected to have roughly half their bits set.

GetHeight ( string key, int heightOfLadderInRungs = null ) : int
GetHeightAsync ( string element, int heightOfLadderInRungs = null, System.TimeSpan timeout = null, CancellationToken cancellationToken = new CancellationToken() ) : Task
Step ( string key, int heightOfLadderInRungs = null ) : int

When one adds an element to a binomial sketch, a random bit among the subset of the element's k bits that are currently 0 (false) will be set to 1 (true). To ensure roughly half the bits remain zero at all times, a random index from the subset of all n bits in the array that are currently 1 (true) will be set to 0 (false).

StepAsync ( string key, int heightOfLadderInRungs = null, System.TimeSpan timeout = null, CancellationToken cancellationToken = new CancellationToken() ) : Task

Protected Methods

Method Description
GetIndexOfRandomBitOfDesiredValue ( bool desiredValueOfElement ) : int

Method Details

BinomialLadderFilter() public method

Construct a binomial sketch, in which a set of k hash functions (k = maxLadderHeightInRungs) will map any element to k points within an array of n bits (n = numberOfRungsInArray). When one adds an element to a binomial sketch, a random bit among the subset of the element's k bits that are currently 0 will be set to 1. To ensure roughly half the bits remain zero, a random index from the subset of all n bits in the array that are currently 1 will be set to 0. Over time, popular keys will have almost all of their bits set, and unpopular keys will be expected to have roughly half their bits set.
public BinomialLadderFilter ( int numberOfRungsInArray, int maxLadderHeightInRungs ) : System
numberOfRungsInArray int The total number of bits to maintain in the table. In theoretical discussions of Bloom filters and sketches, this is usually referred to by the letter n.
maxLadderHeightInRungs int The number of indexes to map each element to, each of which is assigned a unique pseudorandom hash function. This is typically referred to by the letter k.
return System
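
The construction described above can be sketched as a small, self-contained toy. This is not the StopGuessing implementation: the type name ToyLadderArray, the seed parameter, and the salted SHA-256 scheme for deriving the k indexes are all assumptions made for illustration. It shows only the two ideas the constructor description names: an array of n bits that starts out roughly half set, and k hash functions that map a key to k positions in it.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical toy type for illustration only; not the library's BinomialLadderFilter.
public sealed class ToyLadderArray
{
    public bool[] Bits { get; }
    public int MaxLadderHeightInRungs { get; }

    public ToyLadderArray(int numberOfRungsInArray, int maxLadderHeightInRungs, int seed)
    {
        if (numberOfRungsInArray <= 0)
            throw new ArgumentOutOfRangeException(nameof(numberOfRungsInArray));
        if (maxLadderHeightInRungs <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxLadderHeightInRungs));

        MaxLadderHeightInRungs = maxLadderHeightInRungs;
        Bits = new bool[numberOfRungsInArray];

        // Set each of the n bits to 1 (true) with probability 0.5.
        var rng = new Random(seed);
        for (int i = 0; i < Bits.Length; i++)
            Bits[i] = rng.Next(2) == 1;
    }

    // Map a key to k indexes, one per "hash function".
    // Assumption: hash function i is SHA-256 over the key prefixed with i.
    // Two hash functions may occasionally land on the same index.
    public int[] GetIndexes(string key)
    {
        var indexes = new int[MaxLadderHeightInRungs];
        using (var sha = SHA256.Create())
        {
            for (int i = 0; i < indexes.Length; i++)
            {
                byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(i + ":" + key));
                ulong value = BitConverter.ToUInt64(digest, 0);
                indexes[i] = (int)(value % (ulong)Bits.Length);
            }
        }
        return indexes;
    }
}
```

Because the indexes depend only on the key and the array size, the same key always maps to the same rungs, which is what lets repeated observations of a key accumulate on one ladder.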

GetHeight() public method

public GetHeight ( string key, int heightOfLadderInRungs = null ) : int
key string
heightOfLadderInRungs int
return int

GetHeightAsync() public method

public GetHeightAsync ( string element, int heightOfLadderInRungs = null, System.TimeSpan timeout = null, CancellationToken cancellationToken = new CancellationToken() ) : Task
element string
heightOfLadderInRungs int
timeout System.TimeSpan
cancellationToken System.Threading.CancellationToken
return Task

GetIndexOfRandomBitOfDesiredValue() protected method

protected GetIndexOfRandomBitOfDesiredValue ( bool desiredValueOfElement ) : int
desiredValueOfElement bool
return int

Step() public method

When one adds an element to a binomial sketch, a random bit among the subset of the element's k bits that are currently 0 (false) will be set to 1 (true). To ensure roughly half the bits remain zero at all times, a random index from the subset of all n bits in the array that are currently 1 (true) will be set to 0 (false).
public Step ( string key, int heightOfLadderInRungs = null ) : int
key string The element to add to the set.
heightOfLadderInRungs int Set if using a ladder shorter than the default for this sketch
return int
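
The step operation described above can be sketched over a plain bool[] as follows. This is a minimal illustration, not the library's code: the class name ToyLadderStep, the explicit indexes and Random parameters, and the use of rejection sampling to find a random set bit are assumptions. Returning the height before the step is also an assumption about what the int return value means, since the source does not say.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not the library's implementation.
public static class ToyLadderStep
{
    // Count how many of the key's indexes are set: the rungs below the key.
    public static int GetHeight(bool[] bits, int[] indexes)
    {
        int height = 0;
        foreach (int index in indexes)
            if (bits[index]) height++;
        return height;
    }

    // Pick a random index in the whole array whose bit equals desiredValue.
    // Assumes at least one such bit exists; otherwise this loop never ends.
    public static int GetIndexOfRandomBitOfDesiredValue(bool[] bits, bool desiredValue, Random rng)
    {
        while (true)
        {
            int index = rng.Next(bits.Length);
            if (bits[index] == desiredValue) return index;
        }
    }

    // Move the key one rung up its ladder, if it is not already at the top.
    // Returns the key's height before the step (an assumed convention).
    public static int Step(bool[] bits, int[] indexes, Random rng)
    {
        var rungsAbove = new List<int>();
        foreach (int index in indexes)
            if (!bits[index]) rungsAbove.Add(index);

        int heightBefore = indexes.Length - rungsAbove.Count;
        if (rungsAbove.Count == 0) return heightBefore; // already at the top

        // Choose the bit to clear before setting the new rung, so the two
        // choices can never be the same index and the count of 1s is unchanged.
        int indexToClear = GetIndexOfRandomBitOfDesiredValue(bits, true, rng);
        int indexToSet = rungsAbove[rng.Next(rungsAbove.Count)];
        bits[indexToClear] = false;
        bits[indexToSet] = true;
        return heightBefore;
    }
}
```

Pairing every set with a clear is what keeps the fraction of 1 bits constant. The cleared bit may belong to another key's ladder, or to this key's own, which is why rarely observed keys drift back toward the halfway point over time.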

StepAsync() public method

public StepAsync ( string key, int heightOfLadderInRungs = null, System.TimeSpan timeout = null, CancellationToken cancellationToken = new CancellationToken() ) : Task
key string
heightOfLadderInRungs int
timeout System.TimeSpan
cancellationToken System.Threading.CancellationToken
return Task