C# Class Lucene.Net.Facet.Taxonomy.Directory.DirectoryTaxonomyWriter

TaxonomyWriter which uses a Directory to store the taxonomy information on disk, and keeps an additional in-memory cache of some or all categories.

In addition to the permanently-stored information in the Directory, efficiency dictates that we also keep an in-memory cache of recently seen or all categories, so that we do not need to go back to disk for every category addition to see which ordinal this category already has, if any. A TaxonomyWriterCache object determines the specific caching algorithm used.
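The caching idea described above can be sketched roughly as follows. This is a simplified model, not the Lucene.Net implementation: the class `CachedOrdinalLookup`, its `StoreLookups` counter, and the dictionary standing in for the on-disk Directory are all hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model, not the Lucene.Net implementation: an in-memory
// cache sits in front of a slower "on-disk" store of category ordinals.
public sealed class CachedOrdinalLookup
{
    private readonly Dictionary<string, int> cache = new Dictionary<string, int>();
    private readonly Dictionary<string, int> store; // stands in for the Directory

    // Counts how often the slow store had to be consulted.
    public int StoreLookups { get; private set; }

    public CachedOrdinalLookup(Dictionary<string, int> store)
    {
        this.store = store;
    }

    // Returns the ordinal of the category, or -1 if it is unknown.
    public int FindCategory(string categoryPath)
    {
        if (cache.TryGetValue(categoryPath, out int cached))
        {
            return cached; // cache hit: no trip to the store
        }

        StoreLookups++;
        if (store.TryGetValue(categoryPath, out int ordinal))
        {
            cache[categoryPath] = ordinal; // remember for next time
            return ordinal;
        }

        return -1; // the category does not exist yet
    }
}
```

In this sketch, a second lookup of the same known category is answered from the cache and leaves `StoreLookups` unchanged, which is the saving the paragraph above describes.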

This class offers some hooks for extending classes to control the IndexWriter instance that is used. See OpenIndexWriter. This API is experimental.

Inheritance: TaxonomyWriter
Open project: paulirwin/lucene.net

Private Properties

Property Type Description
AddCategoryDocument int
AddToCache void
CombinedCommitData IDictionary<string, string>
DoClose void
InitReaderManager void
InternalAddCategory int
PerhapsFillCache void
ReadCommitData IDictionary<string, string>
RefreshReaderManager void

Public Methods

Method Description
AddCategory ( FacetLabel categoryPath ) : int
AddTaxonomy ( Directory taxoDir, OrdinalMap map ) : void

Takes the categories from the given taxonomy directory, and adds the missing ones to this taxonomy. Additionally, it fills the given OrdinalMap with a mapping from the original ordinal to the new ordinal.

Commit ( ) : void
DefaultTaxonomyWriterCache ( ) : TaxonomyWriterCache

Defines the default TaxonomyWriterCache to use in constructors which do not specify one.

The current default is Cl2oTaxonomyWriterCache constructed with the parameters (1024, 0.15f, 3), i.e., the entire taxonomy is cached in memory while building it.

DirectoryTaxonomyWriter ( Directory directory, OpenMode openMode = OpenMode.CREATE_OR_APPEND )

Creates a new instance with a default cache as defined by DefaultTaxonomyWriterCache().

DirectoryTaxonomyWriter ( Directory directory, OpenMode openMode, TaxonomyWriterCache cache )

Constructs a taxonomy writer.

Dispose ( ) : void

Frees used resources and closes the underlying IndexWriter, which commits any changes made to it to the underlying Directory.

GetParent ( int ordinal ) : int
PrepareCommit ( ) : void

Prepares most of the work needed for a two-phase commit. See IndexWriter.PrepareCommit.

ReplaceTaxonomy ( Directory taxoDir ) : void

Replaces the current taxonomy with the given one. This method should generally be called in conjunction with IndexWriter.AddIndexes to replace both the taxonomy and the search index content.

Rollback ( ) : void

Rolls back changes to the taxonomy writer and closes the instance. After this method is called, the instance is unusable: calling any of its API methods throws an AlreadyClosedException.

Unlock ( Directory directory ) : void

Forcibly unlocks the taxonomy in the named directory.

Caution: this should only be used by failure recovery code, when it is known that no other process or thread is currently accessing this taxonomy.

This method is unnecessary if your Directory uses a NativeFSLockFactory instead of the default SimpleFSLockFactory. When the "native" lock is used, a lock does not stay behind forever when the process using it dies.

Protected Methods

Method Description
CloseResources ( ) : void

A hook for extending classes to close additional resources that were used. The default implementation closes the IndexReader as well as the TaxonomyWriterCache instances that were used.
NOTE: if you override this method, you should include a base.CloseResources() call in your implementation.

CreateIndexWriterConfig ( OpenMode openMode ) : IndexWriterConfig

Creates the IndexWriterConfig that is used for opening the internal index writer.
Extensions can configure the IndexWriter as they see fit, including setting a merge scheduler (Lucene.Net.Index.MergeScheduler), a deletion policy (Lucene.Net.Index.IndexDeletionPolicy), a different RAM size, etc.

NOTE: internal docids of the configured index must not be altered. For that reason, categories are never deleted from the taxonomy index. In addition, the merge policy in effect must not merge non-adjacent segments.

EnsureOpen ( ) : void

Verifies that this instance has not been closed; throws AlreadyClosedException if it has.

FindCategory ( FacetLabel categoryPath ) : int

Looks up the given category in the cache and/or the on-disk storage, returning the category's ordinal, or a negative number if the category does not yet exist in the taxonomy.

OpenIndexWriter ( Directory directory, IndexWriterConfig config ) : IndexWriter

Opens the internal index writer, which contains the taxonomy data.

Extensions may provide their own IndexWriter implementation or instance.
NOTE: the instance this method returns will be closed when Dispose() is called.
NOTE: the merge policy in effect must not merge non-adjacent segments. See the comment in CreateIndexWriterConfig(OpenMode) for the logic behind this.

Private Methods

Method Description
AddCategoryDocument ( FacetLabel categoryPath, int parent ) : int

Note that the methods calling AddCategoryDocument() are synchronized, so this method is effectively synchronized as well.

AddToCache ( FacetLabel categoryPath, int id ) : void
CombinedCommitData ( IDictionary<string, string> commitData ) : IDictionary<string, string>

Combine original user data with the taxonomy epoch.

DoClose ( ) : void
InitReaderManager ( ) : void

Opens a ReaderManager from the internal IndexWriter.

InternalAddCategory ( FacetLabel cp ) : int

Add a new category into the index (and the cache), and return its new ordinal.

Actually, we might also need to add some of the category's ancestors before we can add the category itself (while keeping the invariant that a parent is always added to the taxonomy before its child). We do this by recursion.

PerhapsFillCache ( ) : void
ReadCommitData ( Directory dir ) : IDictionary<string, string>

Reads the commit data from a Directory.

RefreshReaderManager ( ) : void
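
The recursion described for InternalAddCategory (adding missing ancestors first, so that a parent is always added before its child) might look roughly like the following. This is a minimal in-memory sketch under stated assumptions, not the Lucene.Net code: `TinyTaxonomy` is a hypothetical class, paths are joined with "/" for simplicity, and the root is assumed to take ordinal 0 with parent -1.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory model of a taxonomy; not the Lucene.Net implementation.
public sealed class TinyTaxonomy
{
    private readonly Dictionary<string, int> ordinals = new Dictionary<string, int>();
    private readonly List<int> parents = new List<int>();

    public TinyTaxonomy()
    {
        // Assumption: the root (empty path) takes ordinal 0 and has no parent (-1).
        ordinals[""] = 0;
        parents.Add(-1);
    }

    public int Size => parents.Count;

    public int GetParent(int ordinal) => parents[ordinal];

    // Adds the category and any missing ancestors; returns the category's ordinal.
    // Simplification: components are joined with "/", so they must not contain it.
    public int AddCategory(params string[] components)
    {
        string path = string.Join("/", components);
        if (ordinals.TryGetValue(path, out int existing))
        {
            return existing; // already present (this also ends the recursion at the root)
        }

        // Recurse on the parent path first, so that the parent always
        // receives a smaller ordinal than its child.
        string[] parentComponents = new string[components.Length - 1];
        Array.Copy(components, parentComponents, parentComponents.Length);
        int parent = AddCategory(parentComponents);

        int ordinal = parents.Count;
        ordinals[path] = ordinal;
        parents.Add(parent);
        return ordinal;
    }
}
```

Adding `("Author", "Mark Twain")` to an empty `TinyTaxonomy` first creates `Author` and then the full path, so the child's ordinal is always larger than its parent's.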

Method Details

AddCategory() public method

public AddCategory ( FacetLabel categoryPath ) : int
categoryPath FacetLabel
return int

AddTaxonomy() public method

Takes the categories from the given taxonomy directory, and adds the missing ones to this taxonomy. Additionally, it fills the given OrdinalMap with a mapping from the original ordinal to the new ordinal.
public AddTaxonomy ( Directory taxoDir, OrdinalMap map ) : void
taxoDir Directory
map OrdinalMap
return void
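
To illustrate what the ordinal mapping contains, here is a small sketch of the merge described above. It is a simplified model, not the Lucene.Net implementation: `TaxonomyMerge` is a hypothetical helper, each taxonomy is represented as a list of category paths whose list index is the ordinal, and the mapping is returned as a plain array instead of being written into an OrdinalMap.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; not the Lucene.Net implementation.
public static class TaxonomyMerge
{
    // Adds every category of 'source' that is missing from 'target' and
    // returns map, where map[originalOrdinal] is the ordinal in 'target'.
    public static int[] AddTaxonomy(List<string> target, IReadOnlyList<string> source)
    {
        // Index the categories the target taxonomy already has.
        var index = new Dictionary<string, int>();
        for (int i = 0; i < target.Count; i++)
        {
            index[target[i]] = i;
        }

        var map = new int[source.Count];
        for (int original = 0; original < source.Count; original++)
        {
            string path = source[original];
            if (!index.TryGetValue(path, out int ordinal))
            {
                ordinal = target.Count; // missing category: append it
                target.Add(path);
                index[path] = ordinal;
            }
            map[original] = ordinal; // original ordinal -> new ordinal
        }
        return map;
    }
}
```

Categories that already exist in the target keep their existing ordinal there, and only the missing ones are appended, so ordinals recorded against the source taxonomy can be translated through the returned array.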

CloseResources() protected method

A hook for extending classes to close additional resources that were used. The default implementation closes the IndexReader as well as the TaxonomyWriterCache instances that were used.
NOTE: if you override this method, you should include a base.CloseResources() call in your implementation.
protected CloseResources ( ) : void
return void

Commit() public method

public Commit ( ) : void
return void

CreateIndexWriterConfig() protected method

Creates the IndexWriterConfig that is used for opening the internal index writer.
Extensions can configure the IndexWriter as they see fit, including setting a merge scheduler (Lucene.Net.Index.MergeScheduler), a deletion policy (Lucene.Net.Index.IndexDeletionPolicy), a different RAM size, etc.

NOTE: internal docids of the configured index must not be altered. For that reason, categories are never deleted from the taxonomy index. In addition, the merge policy in effect must not merge non-adjacent segments.
protected CreateIndexWriterConfig ( OpenMode openMode ) : IndexWriterConfig
openMode OpenMode
return IndexWriterConfig

DefaultTaxonomyWriterCache() public static method

Defines the default TaxonomyWriterCache to use in constructors which do not specify one.

The current default is Cl2oTaxonomyWriterCache constructed with the parameters (1024, 0.15f, 3), i.e., the entire taxonomy is cached in memory while building it.

public static DefaultTaxonomyWriterCache ( ) : TaxonomyWriterCache
return TaxonomyWriterCache

DirectoryTaxonomyWriter() public method

Creates a new instance with a default cache as defined by DefaultTaxonomyWriterCache().
public DirectoryTaxonomyWriter ( Directory directory, OpenMode openMode = OpenMode.CREATE_OR_APPEND )
directory Directory
openMode OpenMode

DirectoryTaxonomyWriter() public method

Constructs a taxonomy writer.
Throws an exception if the taxonomy is corrupted, if the taxonomy is locked by another writer, or if another error occurred. If it is known that no other concurrent writer is active, the lock might have been left behind by an old dead process and should be removed using Unlock(Directory).
public DirectoryTaxonomyWriter ( Directory directory, OpenMode openMode, TaxonomyWriterCache cache )
directory Directory The Directory in which to store the taxonomy. Note that the taxonomy is written directly to that directory (not to a subdirectory of it).
openMode OpenMode Specifies how to open a taxonomy for writing: APPEND means open an existing index for append (failing if the index does not yet exist). CREATE means create a new index (first deleting the old one if it already existed). CREATE_OR_APPEND appends to an existing index if there is one, otherwise it creates a new index.
cache TaxonomyWriterCache A TaxonomyWriterCache implementation that determines the in-memory caching policy. If null or missing, DefaultTaxonomyWriterCache() is used.

Dispose() public method

Frees used resources and closes the underlying IndexWriter, which commits any changes made to it to the underlying Directory.
public Dispose ( ) : void
return void

EnsureOpen() protected method

Verifies that this instance has not been closed; throws AlreadyClosedException if it has.
protected EnsureOpen ( ) : void
return void

FindCategory() protected method

Looks up the given category in the cache and/or the on-disk storage, returning the category's ordinal, or a negative number if the category does not yet exist in the taxonomy.
protected FindCategory ( FacetLabel categoryPath ) : int
categoryPath FacetLabel
return int

GetParent() public method

public GetParent ( int ordinal ) : int
ordinal int
return int

OpenIndexWriter() protected method

Opens the internal index writer, which contains the taxonomy data.

Extensions may provide their own IndexWriter implementation or instance.
NOTE: the instance this method returns will be closed when Dispose() is called.
NOTE: the merge policy in effect must not merge non-adjacent segments. See the comment in CreateIndexWriterConfig(OpenMode) for the logic behind this.

protected OpenIndexWriter ( Directory directory, IndexWriterConfig config ) : IndexWriter
directory Directory The Directory on top of which an IndexWriter should be opened.
config IndexWriterConfig Configuration for the internal index writer.
return IndexWriter

PrepareCommit() public method

Prepares most of the work needed for a two-phase commit. See IndexWriter.PrepareCommit.
public PrepareCommit ( ) : void
return void

ReplaceTaxonomy() public method

Replaces the current taxonomy with the given one. This method should generally be called in conjunction with IndexWriter.AddIndexes to replace both the taxonomy and the search index content.
public ReplaceTaxonomy ( Directory taxoDir ) : void
taxoDir Directory
return void

Rollback() public method

Rolls back changes to the taxonomy writer and closes the instance. After this method is called, the instance is unusable: calling any of its API methods throws an AlreadyClosedException.
public Rollback ( ) : void
return void

Unlock() public static method

Forcibly unlocks the taxonomy in the named directory.

Caution: this should only be used by failure recovery code, when it is known that no other process or thread is currently accessing this taxonomy.

This method is unnecessary if your Directory uses a NativeFSLockFactory instead of the default SimpleFSLockFactory. When the "native" lock is used, a lock does not stay behind forever when the process using it dies.

public static Unlock ( Directory directory ) : void
directory Directory
return void