C# Class Lucene.Net.Queries.CommonTermsQuery

A query that executes high-frequency terms in an optional sub-query to prevent slow queries caused by "common" terms such as stopwords. This query builds two queries from the terms added via Add(Term): low-frequency terms are added to a required boolean clause and high-frequency terms are added to an optional boolean clause. The optional clause is only executed if the required "low-frequency" clause matches. Scores produced by this query will differ slightly from those of a plain BooleanQuery scorer, mainly because Similarity.Coord(int, int) sees a different number of leaf queries in the required boolean clause. In most cases, high-frequency terms are unlikely to contribute significantly to the document score unless at least one of the low-frequency terms is matched. Where it applies, this query can improve query execution times significantly.

CommonTermsQuery has several advantages over stopword filtering at index or query time: a term is "classified" based on its actual document frequency in the index, so the query can prevent slow queries even across domains without specialized stopword files.
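
The classification described above can be illustrated with a small, self-contained sketch. It does not use Lucene.Net: the class TermClassificationSketch, the method Classify and the sample document frequencies are hypothetical, and the threshold rule (values below 1 are treated as a fraction of maxDoc, values of 1 or more as an absolute document count) is an assumption based on the maxTermFrequency parameter description, not the library's actual implementation.

```csharp
using System;
using System.Collections.Generic;

public static class TermClassificationSketch
{
    // Hypothetical helper, not part of Lucene.Net: splits terms into
    // low- and high-frequency groups based on their document frequency.
    public static (List<string> Low, List<string> High) Classify(
        IReadOnlyDictionary<string, int> docFreq, int maxDoc, float maxTermFrequency)
    {
        // Assumption: a value below 1 is a fraction of maxDoc,
        // a value of 1 or more is an absolute document count.
        double threshold = maxTermFrequency >= 1f
            ? maxTermFrequency
            : Math.Ceiling(maxTermFrequency * (double)maxDoc);

        var low = new List<string>();
        var high = new List<string>();
        foreach (var entry in docFreq)
        {
            // Terms above the threshold are treated as "common".
            if (entry.Value > threshold) high.Add(entry.Key);
            else low.Add(entry.Key);
        }
        return (low, high);
    }

    public static void Main()
    {
        // Made-up document frequencies for an index of 1000 documents.
        var docFreq = new Dictionary<string, int>
        {
            ["the"] = 950, ["of"] = 900, ["lucene"] = 12
        };
        var (low, high) = Classify(docFreq, maxDoc: 1000, maxTermFrequency: 0.5f);
        Console.WriteLine("low-frequency:  " + string.Join(", ", low));
        Console.WriteLine("high-frequency: " + string.Join(", ", high));
    }
}
```

With these sample numbers the threshold is 500 documents, so "the" and "of" fall into the high-frequency group and "lucene" into the low-frequency group, without any stopword list being involved.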

Note: if the query contains only high-frequency terms, it is rewritten into a plain conjunction query, i.e., all high-frequency terms must match for a document to match.
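
The matching behaviour, including the fallback in the note above, can be sketched as follows. This is a simplified model rather than Lucene.Net code: CommonTermsMatchSketch and Matches are hypothetical names, documents are reduced to plain sets of terms, scoring is ignored, and the sketch assumes that both occur settings are SHOULD, so the required low-frequency clause is satisfied by any one of its terms.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CommonTermsMatchSketch
{
    // Hypothetical helper, not part of Lucene.Net. Decides only whether a
    // document matches; it does not model scoring.
    public static bool Matches(
        ISet<string> docTerms,
        IReadOnlyCollection<string> lowFreq,
        IReadOnlyCollection<string> highFreq)
    {
        if (lowFreq.Count == 0)
        {
            // Only high-frequency terms: plain conjunction,
            // every high-frequency term has to be present.
            return highFreq.Count > 0 && highFreq.All(docTerms.Contains);
        }

        // Assumption: the low-frequency clause is required and is satisfied
        // by any one of its terms. High-frequency terms are optional here,
        // so they do not affect whether the document matches.
        return lowFreq.Any(docTerms.Contains);
    }

    public static void Main()
    {
        var doc = new HashSet<string> { "the", "quick", "fox" };

        // A low-frequency term ("fox") is present, so the document matches
        // even though the high-frequency term "of" is missing.
        Console.WriteLine(Matches(doc, new[] { "fox" }, new[] { "the", "of" }));

        // No low-frequency terms: all high-frequency terms are required,
        // and "of" is missing from the document.
        Console.WriteLine(Matches(doc, new string[0], new[] { "the", "of" }));
    }
}
```

The second call shows why the fallback matters: without any low-frequency term to narrow the candidate set, the query is made stricter by requiring every high-frequency term.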

Inheritance: Lucene.Net.Search.Query
Project: paulirwin/lucene.net

Protected Properties

Property Type Description
disableCoord bool
highFreqBoost float
highFreqOccur Lucene.Net.Search.BooleanClause.Occur
lowFreqBoost float
lowFreqOccur Lucene.Net.Search.BooleanClause.Occur
maxTermFrequency float
terms IList<Term>

Public Methods

Method Description
Add ( Lucene.Net.Index.Term term ) : void

Adds a term to the CommonTermsQuery

CollectTermContext ( IndexReader reader, IList<AtomicReaderContext> leaves, TermContext[] contextArray, Lucene.Net.Index.Term[] queryTerms ) : void
CommonTermsQuery ( BooleanClause.Occur highFreqOccur, BooleanClause.Occur lowFreqOccur, float maxTermFrequency )

Creates a new CommonTermsQuery

CommonTermsQuery ( BooleanClause.Occur highFreqOccur, BooleanClause.Occur lowFreqOccur, float maxTermFrequency, bool disableCoord )

Creates a new CommonTermsQuery

Equals ( object obj ) : bool
ExtractTerms ( ISet<Term> terms ) : void
GetHashCode ( ) : int
Rewrite ( IndexReader reader ) : Query
ToString ( string field ) : string

Protected Methods

Method Description
BuildQuery ( int maxDoc, TermContext[] contextArray, Lucene.Net.Index.Term[] queryTerms ) : Query
CalcHighFreqMinimumNumberShouldMatch ( int numOptional ) : int
CalcLowFreqMinimumNumberShouldMatch ( int numOptional ) : int
NewTermQuery ( Lucene.Net.Index.Term term, TermContext context ) : Query

Builds a new TermQuery instance.

This is intended for subclasses that wish to customize the generated queries.

Private Methods

Method Description
MinNrShouldMatch ( float minNrShouldMatch, int numOptional ) : int

Method Details

Add() public method

Adds a term to the CommonTermsQuery
public Add ( Lucene.Net.Index.Term term ) : void
term Lucene.Net.Index.Term the term to add
return void

BuildQuery() protected method

protected BuildQuery ( int maxDoc, TermContext[] contextArray, Lucene.Net.Index.Term[] queryTerms ) : Query
maxDoc int
contextArray Lucene.Net.Index.TermContext[]
queryTerms Lucene.Net.Index.Term[]
return Lucene.Net.Search.Query

CalcHighFreqMinimumNumberShouldMatch() protected method

protected CalcHighFreqMinimumNumberShouldMatch ( int numOptional ) : int
numOptional int
return int

CalcLowFreqMinimumNumberShouldMatch() protected method

protected CalcLowFreqMinimumNumberShouldMatch ( int numOptional ) : int
numOptional int
return int

CollectTermContext() public method

public CollectTermContext ( IndexReader reader, IList<AtomicReaderContext> leaves, TermContext[] contextArray, Lucene.Net.Index.Term[] queryTerms ) : void
reader Lucene.Net.Index.IndexReader
leaves IList<AtomicReaderContext>
contextArray Lucene.Net.Index.TermContext[]
queryTerms Lucene.Net.Index.Term[]
return void

CommonTermsQuery() public method

Creates a new CommonTermsQuery
Throws an exception if BooleanClause.Occur.MUST_NOT is passed as lowFreqOccur or highFreqOccur.
public CommonTermsQuery ( BooleanClause.Occur highFreqOccur, BooleanClause.Occur lowFreqOccur, float maxTermFrequency )
highFreqOccur Lucene.Net.Search.BooleanClause.Occur used for high-frequency terms
lowFreqOccur Lucene.Net.Search.BooleanClause.Occur used for low-frequency terms
maxTermFrequency float a value in [0..1) (or an absolute number >= 1) representing the maximum document frequency a term may have and still be considered a low-frequency term

CommonTermsQuery() public method

Creates a new CommonTermsQuery
Throws an exception if BooleanClause.Occur.MUST_NOT is passed as lowFreqOccur or highFreqOccur.
public CommonTermsQuery ( BooleanClause.Occur highFreqOccur, BooleanClause.Occur lowFreqOccur, float maxTermFrequency, bool disableCoord )
highFreqOccur Lucene.Net.Search.BooleanClause.Occur used for high-frequency terms
lowFreqOccur Lucene.Net.Search.BooleanClause.Occur used for low-frequency terms
maxTermFrequency float a value in [0..1) (or an absolute number >= 1) representing the maximum document frequency a term may have and still be considered a low-frequency term
disableCoord bool disables Similarity.Coord(int, int) in scoring for the low- and high-frequency sub-queries

Equals() public method

public Equals ( object obj ) : bool
obj object
return bool

ExtractTerms() public method

public ExtractTerms ( ISet<Term> terms ) : void
terms ISet<Term>
return void

GetHashCode() public method

public GetHashCode ( ) : int
return int

NewTermQuery() protected method

Builds a new TermQuery instance.

This is intended for subclasses that wish to customize the generated queries.

protected NewTermQuery ( Lucene.Net.Index.Term term, TermContext context ) : Query
term Lucene.Net.Index.Term term
context Lucene.Net.Index.TermContext the TermContext to be used to create the low level term query. Can be null.
return Lucene.Net.Search.Query

Rewrite() public method

public Rewrite ( IndexReader reader ) : Query
reader Lucene.Net.Index.IndexReader
return Lucene.Net.Search.Query

ToString() public method

public ToString ( string field ) : string
field string
return string

Property Details

disableCoord protected property

protected bool disableCoord
return bool

highFreqBoost protected property

protected float highFreqBoost
return float

highFreqOccur protected property

protected Lucene.Net.Search.BooleanClause.Occur highFreqOccur
return Lucene.Net.Search.BooleanClause.Occur

lowFreqBoost protected property

protected float lowFreqBoost
return float

lowFreqOccur protected property

protected Lucene.Net.Search.BooleanClause.Occur lowFreqOccur
return Lucene.Net.Search.BooleanClause.Occur

maxTermFrequency protected property

protected float maxTermFrequency
return float

terms protected property

protected IList<Term> terms
return IList<Term>