C# Class Lucene.Net.Index.MergePolicy

Expert: a MergePolicy determines the sequence of primitive merge operations to be used for overall merge and optimize operations.

Whenever the segments in an index have been altered by IndexWriter, whether by the addition of a newly flushed segment, the addition of many segments from AddIndexes* calls, or a previous merge that may now need to cascade, IndexWriter invokes FindMerges to give the MergePolicy a chance to pick merges that are now required. This method returns a MergeSpecification instance describing the set of merges that should be done, or null if no merges are necessary. When IndexWriter.ForceMerge is called, it calls FindForcedMerges and the MergePolicy should then return the necessary merges.

Note that the policy can return more than one merge at a time. In this case, if the writer is using SerialMergeScheduler, the merges will be run sequentially, but if it is using ConcurrentMergeScheduler, they will be run concurrently.

The default MergePolicy is LogByteSizeMergePolicy.

NOTE: This API is new and still experimental (subject to change suddenly in the next release).

Protected Properties

Property Type Description
DEFAULT_MAX_CFS_SEGMENT_SIZE long
MaxCFSSegmentSize long
NoCFSRatio_Renamed double
Writer SetOnce<IndexWriter>

Public Methods

Method Description
Clone ( ) : object
Dispose ( ) : void

Release all resources for the policy.

FindForcedDeletesMerges ( SegmentInfos segmentInfos ) : MergeSpecification

Determine what set of merge operations is necessary in order to expunge all deletes from the index.

FindForcedMerges ( SegmentInfos segmentInfos, int maxSegmentCount, IDictionary<SegmentCommitInfo, bool?> segmentsToMerge ) : MergeSpecification

Determine what set of merge operations is necessary in order to merge to <= the specified segment count. IndexWriter calls this when its IndexWriter.ForceMerge method is called. This call is always synchronized on the IndexWriter instance, so only one thread at a time will call this method.

FindMerges ( MergeTrigger mergeTrigger, SegmentInfos segmentInfos ) : MergeSpecification

Determine what set of merge operations is now necessary on the index. IndexWriter calls this whenever there is a change to the segments. This call is always synchronized on the IndexWriter instance, so only one thread at a time will call this method.

MergePolicy ( )

Creates a new merge policy instance. Note that if you intend to use it without passing it to IndexWriter, you should call SetIndexWriter(IndexWriter).

UseCompoundFile ( SegmentInfos infos, Lucene.Net.Index.SegmentCommitInfo mergedInfo ) : bool

Returns true if a new segment (regardless of its origin) should use the compound file format. The default implementation returns true if and only if the size of the given mergedInfo is less than or equal to the maximum CFS segment size (MaxCFSSegmentSize) and also less than or equal to the total index size multiplied by the no-CFS ratio (NoCFSRatio_Renamed); otherwise it returns false.

Protected Methods

Method Description
IsMerged ( SegmentInfos infos, Lucene.Net.Index.SegmentCommitInfo info ) : bool

Returns true if this single info is already fully merged (has no pending deletes, is in the same directory as the writer, and matches the current compound file setting).

MergePolicy ( double defaultNoCFSRatio, long defaultMaxCFSSegmentSize )

Creates a new merge policy instance with default settings for noCFSRatio and maxCFSSegmentSize. This constructor should be used by subclasses that need different defaults than MergePolicy.

Size ( Lucene.Net.Index.SegmentCommitInfo info ) : long

Returns the byte size of the provided SegmentCommitInfo, pro-rated by the percentage of non-deleted documents.

Method Details

Clone() public method

public Clone ( ) : object
return object

Dispose() public abstract method

Release all resources for the policy.
public abstract Dispose ( ) : void
return void

FindForcedDeletesMerges() public abstract method

Determine what set of merge operations is necessary in order to expunge all deletes from the index.
public abstract FindForcedDeletesMerges ( SegmentInfos segmentInfos ) : MergeSpecification
segmentInfos SegmentInfos the total set of segments in the index
return MergeSpecification

FindForcedMerges() public abstract method

Determine what set of merge operations is necessary in order to merge to <= the specified segment count. IndexWriter calls this when its IndexWriter.ForceMerge method is called. This call is always synchronized on the IndexWriter instance, so only one thread at a time will call this method.
public abstract FindForcedMerges ( SegmentInfos segmentInfos, int maxSegmentCount, IDictionary<SegmentCommitInfo, bool?> segmentsToMerge ) : MergeSpecification
segmentInfos SegmentInfos the total set of segments in the index
maxSegmentCount int requested maximum number of segments in the index (currently this is always 1)
segmentsToMerge IDictionary<SegmentCommitInfo, bool?> contains the specific SegmentCommitInfo instances that must be merged away. This may be a subset of all SegmentInfos. If the value is true for a given segment, that segment was an original segment present in the to-be-merged index; otherwise, it was a segment produced by a cascaded merge.
return MergeSpecification
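
To illustrate how the segmentsToMerge dictionary might be read, here is a minimal sketch that does not use Lucene.Net types. Segment names (strings) stand in for SegmentCommitInfo instances, and SelectEligible and IsOriginal are hypothetical helpers written for this example only; they are not part of the MergePolicy API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: keeps only the segments that appear as keys in
// segmentsToMerge, i.e. the ones that must be merged away.
static List<string> SelectEligible(IList<string> allSegments,
                                   IDictionary<string, bool?> segmentsToMerge)
{
    var eligible = new List<string>();
    foreach (string segment in allSegments)
    {
        if (segmentsToMerge.ContainsKey(segment))
        {
            eligible.Add(segment);
        }
    }
    return eligible;
}

// Hypothetical helper: true means the segment was present in the original
// to-be-merged index; any other value means it came from a cascaded merge.
static bool IsOriginal(IDictionary<string, bool?> segmentsToMerge, string segment)
{
    return segmentsToMerge.TryGetValue(segment, out bool? flag) && flag == true;
}

var allSegments = new List<string> { "_0", "_1", "_2", "_3" };
var segmentsToMerge = new Dictionary<string, bool?>
{
    ["_0"] = true,   // original segment
    ["_2"] = true,   // original segment
    ["_3"] = false   // produced by a cascaded merge
};

Console.WriteLine(string.Join(",", SelectEligible(allSegments, segmentsToMerge))); // _0,_2,_3
Console.WriteLine(IsOriginal(segmentsToMerge, "_3")); // False
```

A real policy would then group the eligible segments into merges until at most maxSegmentCount segments remain; that grouping strategy is policy-specific and is not shown here.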

FindMerges() public abstract method

Determine what set of merge operations is now necessary on the index. IndexWriter calls this whenever there is a change to the segments. This call is always synchronized on the IndexWriter instance, so only one thread at a time will call this method.
public abstract FindMerges ( MergeTrigger mergeTrigger, SegmentInfos segmentInfos ) : MergeSpecification
mergeTrigger MergeTrigger the event that triggered the merge
segmentInfos SegmentInfos the total set of segments in the index
return MergeSpecification

IsMerged() protected method

Returns true if this single info is already fully merged (has no pending deletes, is in the same directory as the writer, and matches the current compound file setting).
protected IsMerged ( SegmentInfos infos, Lucene.Net.Index.SegmentCommitInfo info ) : bool
infos SegmentInfos
info Lucene.Net.Index.SegmentCommitInfo
return bool

MergePolicy() public method

Creates a new merge policy instance. Note that if you intend to use it without passing it to IndexWriter, you should call SetIndexWriter(IndexWriter).
public MergePolicy ( )

MergePolicy() protected method

Creates a new merge policy instance with default settings for noCFSRatio and maxCFSSegmentSize. This constructor should be used by subclasses that need different defaults than MergePolicy.
protected MergePolicy ( double defaultNoCFSRatio, long defaultMaxCFSSegmentSize )
defaultNoCFSRatio double
defaultMaxCFSSegmentSize long

Size() protected method

Returns the byte size of the provided SegmentCommitInfo, pro-rated by the percentage of non-deleted documents.
protected Size ( Lucene.Net.Index.SegmentCommitInfo info ) : long
info Lucene.Net.Index.SegmentCommitInfo
return long
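
The pro-rating described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: ProRatedSize is a hypothetical function, and the raw byte size, document count, and deleted-document count are passed in directly rather than read from a SegmentCommitInfo and its IndexWriter.

```csharp
using System;

// Hypothetical stand-in for Size(): scales the raw byte size by the
// fraction of documents in the segment that are not deleted.
static long ProRatedSize(long byteSize, int docCount, int deletedDocCount)
{
    // Assumption: an empty segment is reported at its raw size.
    if (docCount <= 0)
    {
        return byteSize;
    }
    double liveRatio = 1.0 - (double)deletedDocCount / docCount;
    return (long)(byteSize * liveRatio);
}

// 25 of 100 documents are deleted, so 75% of the bytes count.
Console.WriteLine(ProRatedSize(1000, 100, 25)); // 750
```

Pro-rating makes a segment with many deletions look smaller to the policy, since the deleted documents will be dropped when the segment is merged.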

UseCompoundFile() public method

Returns true if a new segment (regardless of its origin) should use the compound file format. The default implementation returns true if and only if the size of the given mergedInfo is less than or equal to the maximum CFS segment size (MaxCFSSegmentSize) and also less than or equal to the total index size multiplied by the no-CFS ratio (NoCFSRatio_Renamed); otherwise it returns false.
public UseCompoundFile ( SegmentInfos infos, Lucene.Net.Index.SegmentCommitInfo mergedInfo ) : bool
infos SegmentInfos
mergedInfo Lucene.Net.Index.SegmentCommitInfo
return bool
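
The two conditions in the default decision can be sketched as a standalone function. This is a simplified illustration of the rule as described here, not the actual implementation: ShouldUseCompoundFile and its parameters are hypothetical, all sizes are assumed to be in bytes, and any special-case handling in the real method is omitted.

```csharp
using System;

// Hypothetical stand-in for the default UseCompoundFile decision.
static bool ShouldUseCompoundFile(long mergedSize, long totalIndexSize,
                                  double noCfsRatio, long maxCfsSegmentSize)
{
    // Condition 1: the merged segment must not exceed the absolute size cap.
    if (mergedSize > maxCfsSegmentSize)
    {
        return false;
    }
    // Condition 2: it must not exceed the allowed fraction of the whole index.
    return mergedSize <= noCfsRatio * totalIndexSize;
}

// With a ratio of 0.1 and a 1000-byte index, the threshold is 100 bytes.
Console.WriteLine(ShouldUseCompoundFile(50, 1000, 0.1, long.MaxValue));  // True
Console.WriteLine(ShouldUseCompoundFile(200, 1000, 0.1, long.MaxValue)); // False
```

The effect of the ratio is that small segments are packed into a single compound file, while segments that make up a large share of the index stay in non-compound format.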

Property Details

DEFAULT_MAX_CFS_SEGMENT_SIZE protected static property

Default maximum segment size in order to use the compound file system. Set to long.MaxValue.
protected static long DEFAULT_MAX_CFS_SEGMENT_SIZE
return long

MaxCFSSegmentSize protected property

If the size of the merged segment exceeds this value then it will not use compound file format.
protected long MaxCFSSegmentSize
return long

NoCFSRatio_Renamed protected property

If the size of the merged segment exceeds this ratio of the total index size, then it will remain in non-compound format.
protected double NoCFSRatio_Renamed
return double

Writer protected property

IndexWriter that contains this instance.
protected SetOnce<IndexWriter> Writer
return SetOnce<IndexWriter>