C# (CSharp) Redzen.Numerics Namespace

Classes

Name Description
BoxMullerGaussianSampler Source of random values sampled from a Gaussian distribution. Uses the polar form of the Box-Muller method. http://en.wikipedia.org/wiki/Box_Muller_transform
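
A minimal sketch of the polar Box-Muller technique itself, written against System.Random; the class and member names below are illustrative only and are not the Redzen API:

    using System;

    // Polar Box-Muller: rejection-sample a point inside the unit circle, then
    // transform it into a pair of independent standard Gaussian samples.
    public class PolarBoxMullerSketch
    {
        readonly Random _rng = new Random();
        double? _spare; // samples come in pairs; keep the second for the next call

        public double NextGaussian(double mean = 0.0, double stdDev = 1.0)
        {
            if (_spare.HasValue)
            {
                double s = _spare.Value;
                _spare = null;
                return mean + (s * stdDev);
            }

            double x, y, sqr;
            do
            {   // Random point in [-1,1]^2; reject points outside (or at the centre of) the unit circle.
                x = (2.0 * _rng.NextDouble()) - 1.0;
                y = (2.0 * _rng.NextDouble()) - 1.0;
                sqr = (x * x) + (y * y);
            }
            while (sqr >= 1.0 || sqr == 0.0);

            double fac = Math.Sqrt(-2.0 * Math.Log(sqr) / sqr);
            _spare = y * fac;                 // second sample of the pair
            return mean + (x * fac * stdDev);
        }
    }
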
DiscreteDistribution Represents a distribution over a discrete set of possible states. The total probability over all states must add up to 1.0. This class was previously called RouletteWheelLayout.
DiscreteDistributionUtils Static methods for roulette wheel selection from a set of choices with predefined probabilities.
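
As a rough illustration of roulette wheel selection over such a set of predefined probabilities (the helper below is a sketch, not the DiscreteDistributionUtils signature):

    using System;

    public static class RouletteWheelSketch
    {
        // Select the index of one state, given probabilities that sum to 1.0.
        public static int Sample(double[] probabilities, Random rng)
        {
            // Walk the cumulative distribution until the random threshold is crossed.
            double threshold = rng.NextDouble();
            double acc = 0.0;
            for (int i = 0; i < probabilities.Length; i++)
            {
                acc += probabilities[i];
                if (threshold < acc) {
                    return i;
                }
            }
            // Guard against floating point rounding leaving the sum fractionally below 1.0.
            return probabilities.Length - 1;
        }
    }

E.g. Sample(new[] { 0.1, 0.2, 0.7 }, rng) returns index 2 roughly 70% of the time.
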
HistogramData Histogram data; frequency counts arranged into bins.
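
A sketch of the underlying idea, counting values into equal-width bins; the method below is illustrative, not the HistogramData API, and assumes all values lie within [min, max]:

    public static class HistogramSketch
    {
        // Count how many values fall into each of binCount equal-width bins spanning [min, max].
        public static int[] BuildFrequencyCounts(double[] values, int binCount, double min, double max)
        {
            int[] counts = new int[binCount];
            double binWidth = (max - min) / binCount;
            foreach (double v in values)
            {
                int idx = (int)((v - min) / binWidth);
                if (idx == binCount) { idx--; } // a value equal to max belongs in the last bin
                counts[idx]++;
            }
            return counts;
        }
    }
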
NumericsUtils
XorShiftRandom A fast random number generator for .NET. Colin Green, January 2005. Note: a forked version of this class exists in Math.Net at the time of writing (the XorShift class). Key points:

1) Based on a simple and fast xor-shift pseudo-random number generator (RNG) specified in: Marsaglia, George. (2003). Xorshift RNGs. http://www.jstatsoft.org/v08/i14/paper This particular implementation of xorshift has a period of 2^128-1. See the above paper for how this can easily be extended if you need a longer period. At the time of writing no information could be found on the period of System.Random for comparison.

2) Faster than System.Random: up to 8x faster, depending on which methods are called.

3) A direct replacement for System.Random. This class implements all of the methods that System.Random does, plus some additional methods. The like-named methods are functionally equivalent.

4) Allows fast re-initialisation with a seed, unlike System.Random, which accepts a seed at construction time and then executes a relatively expensive initialisation routine. This provides a vast speed improvement if you need to reset the pseudo-random number sequence many times, e.g. if you want to re-generate the same sequence of random numbers many times. An alternative might be to cache random numbers in an array, but that approach is limited by memory capacity and by the fact that you may also want a large number of different sequences cached. Each sequence can be represented by a single seed value (int) when using FastRandom.
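
The xorshift core described in point 1 is only a few lines. A sketch of the Marsaglia (2003) xor128 generator with period 2^128-1; the three fixed state constants below are arbitrary non-zero values chosen for this sketch, and the System.Random-compatible method surface of the real class is omitted:

    // Sketch of the xorshift128 core (Marsaglia 2003). Not the full XorShiftRandom class.
    public class XorShift128Sketch
    {
        uint _x, _y, _z, _w;

        public XorShift128Sketch(uint seed)
        {
            Reinitialise(seed);
        }

        // Re-seeding is cheap: just reset the four 32-bit state words.
        // The three fixed values are arbitrary non-zero constants for this sketch.
        public void Reinitialise(uint seed)
        {
            _x = seed;
            _y = 842502087u;
            _z = 3579807591u;
            _w = 273326509u;
        }

        // One xorshift step; the shift triple (11, 19, 8) is from the paper.
        public uint NextUInt()
        {
            uint t = _x ^ (_x << 11);
            _x = _y; _y = _z; _z = _w;
            return _w = (_w ^ (_w >> 19)) ^ (t ^ (t >> 8));
        }
    }
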
ZigguratGaussianSampler A fast Gaussian distribution sampler for .NET. Colin Green, 11/09/2011. An implementation of the Ziggurat algorithm for random sampling from a Gaussian distribution. See:

- Wikipedia: Ziggurat algorithm (http://en.wikipedia.org/wiki/Ziggurat_algorithm).
- The Ziggurat Method for Generating Random Variables, George Marsaglia and Wai Wan Tsang (http://www.jstatsoft.org/v05/i08/paper).
- An Improved Ziggurat Method to Generate Normal Random Samples, Jurgen A. Doornik (http://www.doornik.com/research/ziggurat.pdf).

Ziggurat Algorithm Overview
===========================

Consider the right-hand side of the Gaussian probability density function (for x >= 0) as described by y = f(x). This half of the distribution is covered by a series of stacked horizontal rectangles.

[ASCII diagram omitted: a stack of rectangles R0 (widest, at the bottom) through R6 (narrowest, at the top) covering the right-hand half of the curve, each rectangle R[i] paired with a segment S[i]; the x axis runs along the bottom of R0.]

The Basics
----------

(1) Each rectangle is assigned a number (the R numbers shown above).

(2) The right-hand edge of each rectangle is placed so that it just covers the distribution; that is, the bottom-right corner is on the curve, and therefore some of the area in the top right of the rectangle is outside of the distribution (points with y greater than f(x)), except for R0 (see next point). The rectangles taken together therefore cover an area slightly larger than the distribution curve.

(3) R0 is a special case. The tail of the Gaussian extends to x = infinity, asymptotically approaching zero, so we do not cover the tail with a rectangle. Instead we define a cut-off point (x = 3.442619855899 in this implementation). R0's right-hand edge is at the cut-off point, with its top-right corner on the distribution curve. The tail is then defined as the part of the distribution with x > tailCutOff, and is combined with R0 to form segment S0. Note that the whole of R0 is within the distribution, unlike the other rectangles.

(4) Segments. Each rectangle is also referred to as a segment, with the exception of R0, which is a special case as explained above. Essentially S[i] == R[i], except for S[0] == R[0] + tail.

(5) Each segment has identical area A. This also applies to the special segment S0, thus the area of R0 is A minus the area represented by the tail. For all other segments the segment area is the same as the rectangle area.

(6) R[i] has right-hand edge x[i]. From drawing the rectangles over the distribution curve it is clear that the region of R[i] to the left of x[i+1] is entirely within the distribution curve, whereas the region greater than x[i+1] is partially above the distribution curve.

(7) R[i] has top edge y[i].

Operation
---------

(1) Randomly select a segment to sample from; call this S[i]. This amounts to a low-resolution random y coordinate. Because the segments have equal area we can select from them with equal probability. (Also see the special notes below.)

(2) Segment 0 is a special case. If S0 is selected then generate a random area value w between 0 and A. If w is less than or equal to the area of R0 then we are sampling a point from within R0 (step 2A), otherwise we are sampling from the tail (step 2B).

(2A) Sampling from R0. R0 is entirely within the distribution curve and we have already generated a random area value w. Convert w to an x value that we can return by dividing w by the height of R0 (y[0]).

(2B) Sampling from the tail. To sample from the tail we fall back to a slow implementation using logarithms; see: Generating a Variable from the Tail of the Normal Distribution, George Marsaglia (1963). (http://www.dtic.mil/cgi-bin/GetTRDoc?AD=AD423993&Location=U2&doc=GetTRDoc.pdf) The area represented by the tail is relatively small, so this execution pathway is avoided for a significant proportion of samples generated.

(3) Sampling from all rectangles/segments other than R0/S0. Randomly select x from within R[i]. If x is less than x[i+1] then x is within the curve; return x. If x is greater than or equal to x[i+1] then generate a random y value from within R[i] (this amounts to producing a high-resolution y coordinate, a refinement of the low-resolution y coordinate we effectively produced by selecting a rectangle/segment). If y is below f(x) then return x, otherwise we disregard the sample point and return to step 1. We specifically do *not* re-attempt to sample from R[i] until a valid point is found (see special note 1).

(4) Finally, all of the above describes sampling from the positive half of the distribution (x greater than or equal to zero), so to obtain a symmetrical distribution we need one more random bit to decide whether to flip the sign of the returned x.

Special notes
-------------

(Note 1) Segments have equal area and are thus selected with equal probability. However, the area under the distribution curve covered by each segment/rectangle differs where rectangles overlap the edge of the distribution curve. It has therefore been suggested that, to avoid sampling bias, the segments should be selected with a probability that reflects the area of the distribution curve they cover rather than their total area. This is an incorrect approach for the algorithm as described above and implemented in this class. To explain why, consider an extreme case: say that rectangle R1 covers an area entirely within the distribution curve, and R2 covers an area only 10% within the curve. Both rectangles are chosen with equal probability, so the argument is that R2 will be 10x over-represented (it will generate sample points as often as R1 despite covering a much smaller proportion of the area under the distribution curve). In reality, sample points within R2 will be rejected 90% of the time, and we disregard the attempt to sample from R2 and go back to step 1 (select a segment to sample from). If instead we re-attempted sampling from R2 until a valid point was found then R2 would indeed become over-represented; hence we do not do this, and the algorithm therefore does not exhibit any such bias.

(Note 2) George Marsaglia's original implementation used a single random number (a 32-bit unsigned integer) both for selecting the segment and for producing the x coordinate within the chosen segment. The segment index was taken from the least significant bits (the least significant 7 bits if using 128 segments). This effectively created a peculiar type of bias in which all x coordinates produced within a given segment would have identical least significant 7 bits, albeit prior to casting to a floating-point value. The bias is perhaps small, especially in comparison to the performance gain (one less call to the RNG). This implementation avoids that bias by not re-using random bits in such a way. For more info see: An Improved Ziggurat Method to Generate Normal Random Samples, Jurgen A. Doornik (http://www.doornik.com/research/ziggurat.pdf).

Optimizations
-------------

(Optimization 1) On selecting a segment/rectangle we generate a random x value within the range of the rectangle (or the range of the area of S0). This requires multiplying a random number with range [0,1] by the required x range before performing the first test for x being within the 'certain' left-hand side of the rectangle. We avoid this multiplication, and indeed the conversion of a random integer into a float with range [0,1], thus allowing the first comparison to be performed using integer arithmetic. Instead of using the x coordinate of R[i+1] to test whether a randomly generated point within R[i] is within the 'certain' left-hand part of the distribution, we precalculate the probability of a random x coordinate being within the safe part for each rectangle. Furthermore we store this probability as a UInt with range [0, 0xffffffff], allowing direct comparison with randomly generated UInts from the RNG, so the comparison can be performed using integer arithmetic. If the test succeeds then we continue to convert the random value into an appropriate x sample value.

(Optimization 2) Simple collapsing of calculations into precomputed values where possible. This affects readability, but hopefully the above explanations will help in understanding the code if necessary.

(Optimization 3) The Gaussian probability density function (PDF) contains terms for the distribution mean and standard deviation. We remove all excess terms and denormalise the function to obtain a simpler equation with the same shape. This simplified equation is no longer a PDF, as the total area under the curve is no longer 1.0 (a key property of PDFs); however, as it has the same overall shape it remains suitable for sampling from a Gaussian using rejection methods such as the Ziggurat algorithm (it's the shape of the curve that matters, not the absolute area under the curve).
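
The tail fallback mentioned in step (2B) is short enough to sketch. Assuming r is the tail cut-off (3.442619855899 above), the Marsaglia (1963) method draws two uniforms, takes logarithms and rejects until the candidate point falls under the tail; the names below are illustrative and are not the Redzen implementation:

    using System;

    public static class GaussianTailSketch
    {
        // Sample x > r from the standard Gaussian tail, where r is the tail cut-off.
        public static double SampleTail(Random rng, double r)
        {
            double x, y;
            do
            {
                // 1 - NextDouble() keeps the log arguments in (0, 1].
                x = -Math.Log(1.0 - rng.NextDouble()) / r;
                y = -Math.Log(1.0 - rng.NextDouble());
            }
            while (y + y < x * x); // reject while 2y < x^2
            return r + x;          // the accepted sample lies beyond the cut-off
        }
    }
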