C# Class Lucene.Net.Misc.SweetSpotSimilarity

A similarity with a lengthNorm that provides for a "plateau" of equally good lengths, and tf helper functions.

For lengthNorm, a min/max can be specified to define the plateau of lengths that should all have a norm of 1.0. Below the min and above the max, the lengthNorm drops off along a 1/sqrt curve.

For tf, baselineTf and hyperbolicTf functions are provided, which subclasses can choose between.

Inheritance: Lucene.Net.Search.Similarities.DefaultSimilarity
Project: apache/lucenenet

Public Methods

Method Description
BaselineTf ( float freq ) : float

Implemented as: (x <= min) ? base : Math.Sqrt(x+(base**2)-min) ...but with a special case check for 0.

This degrades to Math.Sqrt(x) when min and base are both 0.

ComputeLengthNorm ( int numTerms ) : float

Implemented as: 1/Math.Sqrt( steepness * (Math.Abs(x-min) + Math.Abs(x-max) - (max-min)) + 1 ).

This degrades to 1/Math.Sqrt(x) when min and max are both 1 and steepness is 0.5.

:TODO: potential optimization is to just flat out return 1.0f if numTerms is between min and max.

HyperbolicTf ( float freq ) : float

Uses a hyperbolic tangent function that allows for a hard max:

tf(x) = min + (max-min)/2 * (((base**(x-xoffset) - base**-(x-xoffset)) / (base**(x-xoffset) + base**-(x-xoffset))) + 1)

This code is provided as a convenience for subclasses that want to use a hyperbolic tf function.

LengthNorm ( Lucene.Net.Index.FieldInvertState state ) : float

Implemented as state.Boost * ComputeLengthNorm(numTokens), where numTokens excludes overlap tokens if discountOverlaps is true, either by default or for this specific field.

SetBaselineTfFactors ( float @base, float min ) : void

Sets the baseline and minimum function variables for BaselineTf.

SetHyperbolicTfFactors ( float min, float max, double @base, float xoffset ) : void

Sets the function variables for the HyperbolicTf function.

SetLengthNormFactors ( int min, int max, float steepness, bool discountOverlaps ) : void

Sets the default function variables used by LengthNorm when no field-specific variables have been set.

SweetSpotSimilarity ( )
Tf ( float freq ) : float

Delegates to BaselineTf.

Method Details

BaselineTf() public method

Implemented as: (x <= min) ? base : Math.Sqrt(x+(base**2)-min) ...but with a special case check for 0.

This degrades to Math.Sqrt(x) when min and base are both 0.

public BaselineTf ( float freq ) : float
freq float
return float
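
The piecewise rule above can be sketched as a standalone function. This is an illustration of the documented formula, not the Lucene.Net source: the class name BaselineTfSketch and the parameters tfBase and tfMin are hypothetical stand-ins for the values set through SetBaselineTfFactors, and the special case is assumed to return 0 for a frequency of 0.

```csharp
using System;

// Standalone sketch of the documented baseline tf curve (not the Lucene.Net source).
public static class BaselineTfSketch
{
    // tfBase and tfMin are hypothetical names for the values that
    // SetBaselineTfFactors(@base, min) would store.
    public static float BaselineTf(float freq, float tfBase, float tfMin)
    {
        // Assumed special case: a frequency of 0 scores 0.
        if (freq == 0.0f)
        {
            return 0.0f;
        }

        // Flat at tfBase up to tfMin, then square-root growth.
        return freq <= tfMin
            ? tfBase
            : (float)Math.Sqrt(freq + tfBase * tfBase - tfMin);
    }
}
```

At freq == tfMin the square-root branch would also evaluate to tfBase, so the two pieces join without a jump. With tfBase and tfMin both 0, BaselineTfSketch.BaselineTf(4f, 0f, 0f) reduces to Math.Sqrt(4).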

ComputeLengthNorm() public method

Implemented as: 1/Math.Sqrt( steepness * (Math.Abs(x-min) + Math.Abs(x-max) - (max-min)) + 1 ).

This degrades to 1/Math.Sqrt(x) when min and max are both 1 and steepness is 0.5.

:TODO: potential optimization is to just flat out return 1.0f if numTerms is between min and max.

public ComputeLengthNorm ( int numTerms ) : float
numTerms int
return float
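
To make the plateau visible, here is a minimal standalone sketch of the formula above. It is not the Lucene.Net source: the class name LengthNormSketch is hypothetical, and min, max and steepness are passed as arguments here instead of being read from the values set by SetLengthNormFactors.

```csharp
using System;

// Standalone sketch of the documented length norm formula (not the Lucene.Net source).
public static class LengthNormSketch
{
    public static float ComputeLengthNorm(int numTerms, int min, int max, float steepness)
    {
        // Twice the distance from numTerms to the nearest plateau edge;
        // exactly 0 anywhere inside [min, max].
        int outside = Math.Abs(numTerms - min) + Math.Abs(numTerms - max) - (max - min);

        // On the plateau this is 1/sqrt(1) = 1; outside it decays.
        return (float)(1.0 / Math.Sqrt(steepness * outside + 1.0));
    }
}
```

With min = 3, max = 10 and steepness = 0.5, every length from 3 to 10 yields 1.0, while lengths 0 and 13 (both three terms outside the plateau) yield the same reduced norm. With min = max = 1 and steepness = 0.5, a length of 4 gives 1/Math.Sqrt(4).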

HyperbolicTf() public method

Uses a hyperbolic tangent function that allows for a hard max:

tf(x) = min + (max-min)/2 * (((base**(x-xoffset) - base**-(x-xoffset)) / (base**(x-xoffset) + base**-(x-xoffset))) + 1)

This code is provided as a convenience for subclasses that want to use a hyperbolic tf function.

public HyperbolicTf ( float freq ) : float
freq float
return float
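
The formula above can be sketched as a standalone function. This is not the Lucene.Net source: the class name HyperbolicTfSketch is hypothetical, the four factors are passed as arguments, and the ratio of powers is rewritten with the identity (b**y - b**-y) / (b**y + b**-y) = tanh(y * ln b). That rewrite is a substitution made here to avoid overflow for large frequencies; it is mathematically equivalent to the documented expression.

```csharp
using System;

// Standalone sketch of the documented hyperbolic tf curve (not the Lucene.Net source).
public static class HyperbolicTfSketch
{
    public static float HyperbolicTf(float freq, float min, float max, double @base, float xoffset)
    {
        // Distance of the frequency from the midpoint of the curve.
        double y = freq - xoffset;

        // Equivalent to (base**y - base**-y) / (base**y + base**-y).
        double tanh = Math.Tanh(y * Math.Log(@base));

        // tanh ranges over (-1, 1), so the result ranges over (min, max).
        return (float)(min + (max - min) / 2.0 * (tanh + 1.0));
    }
}
```

With min = 0, max = 2 and xoffset = 10 (the documented defaults) and an illustrative base of 1.3, a frequency equal to xoffset lands on the midpoint between min and max, and very large frequencies approach max without exceeding it.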

LengthNorm() public method

Implemented as state.Boost * ComputeLengthNorm(numTokens), where numTokens excludes overlap tokens if discountOverlaps is true, either by default or for this specific field.

public LengthNorm ( Lucene.Net.Index.FieldInvertState state ) : float
state Lucene.Net.Index.FieldInvertState
return float
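
The overlap handling described above can be sketched without a Lucene.Net dependency. Everything here is an assumption made for illustration: FieldStats is a hypothetical stand-in for the few FieldInvertState values the formula needs (its Length, NumOverlap and Boost members are assumed names), and the length norm function is passed in as a delegate.

```csharp
using System;

// Standalone sketch of the documented LengthNorm rule (not the Lucene.Net source).
public static class FieldLengthNormSketch
{
    // Hypothetical stand-in for FieldInvertState; member names are assumptions.
    public sealed class FieldStats
    {
        public int Length;          // total tokens in the field
        public int NumOverlap;      // tokens with a position increment of 0
        public float Boost = 1.0f;  // index-time boost
    }

    public static float LengthNorm(FieldStats state, bool discountOverlaps,
                                   Func<int, float> computeLengthNorm)
    {
        // Overlap tokens are left out of the count only when discounting is on.
        int numTokens = discountOverlaps
            ? state.Length - state.NumOverlap
            : state.Length;

        return state.Boost * computeLengthNorm(numTokens);
    }
}
```

For a field with 6 tokens of which 2 are overlaps, discounting makes the norm function see a length of 4 instead of 6, so a field padded with synonyms at the same position is not penalized as a longer document.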

SetBaselineTfFactors() public method

Sets the baseline and minimum function variables for BaselineTf.

public SetBaselineTfFactors ( float @base, float min ) : void
@base float
min float
return void

SetHyperbolicTfFactors() public method

Sets the function variables for the HyperbolicTf function.

public SetHyperbolicTfFactors ( float min, float max, double @base, float xoffset ) : void
min float the minimum tf value to ever be returned (default: 0.0)
max float the maximum tf value to ever be returned (default: 2.0)
@base double
xoffset float the midpoint of the hyperbolic function (default: 10.0)
return void

SetLengthNormFactors() public method

Sets the default function variables used by LengthNorm when no field-specific variables have been set.

public SetLengthNormFactors ( int min, int max, float steepness, bool discountOverlaps ) : void
min int
max int
steepness float
discountOverlaps bool
return void

SweetSpotSimilarity() public method

public SweetSpotSimilarity ( )

Tf() public method

Delegates to BaselineTf.

public Tf ( float freq ) : float
freq float
return float