C# Class Phamhilator.NLP.BagOfWords

Project: ArcticEcho/Phamhilator

Public Methods

Method Description
AddDocument ( uint documentID, IDictionary<…, ushort> termTFs ) : void
BagOfWords ( )
BagOfWords ( IDictionary<…, Term> terms )
BagOfWords ( IEnumerable terms )
ContainsDocument ( uint docID ) : bool
GetSimilarity ( IEnumerable terms, ushort maxDocsToReturn ) : Dictionary<…, float>

Calculates the cosine similarity of the given strings (normally words) compared to the current collection of Terms.

RecalculateIDFs ( ) : void
RemoveDocument ( uint documentID, IDictionary<…, ushort> termTFs ) : void

Private Methods

Method Description
CalculateDocumentLength ( uint docID, List terms ) : float
CalculateQueryLength ( Dictionary<…, float> queryVector ) : float
CalculateQueryTfIdfVector ( IEnumerable terms ) : Dictionary<…, float>
GetDocument ( uint docID ) : List

Method Details

AddDocument() public method

public AddDocument ( uint documentID, IDictionary<…, ushort> termTFs ) : void
documentID uint
termTFs IDictionary<…, ushort>
return void

BagOfWords() public method

public BagOfWords ( )

BagOfWords() public method

public BagOfWords ( IDictionary<…, Term> terms )
terms IDictionary<…, Term>

BagOfWords() public method

public BagOfWords ( IEnumerable terms )
terms IEnumerable

ContainsDocument() public method

public ContainsDocument ( uint docID ) : bool
docID uint
return bool

GetSimilarity() public method

Calculates the cosine similarity of the given strings (normally words) compared to the current collection of Terms.
public GetSimilarity ( IEnumerable terms, ushort maxDocsToReturn ) : Dictionary<…, float>
terms IEnumerable A collection of tokens (i.e., words) for a given string.
maxDocsToReturn ushort
return Dictionary<…, float>
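
The summary for GetSimilarity names cosine similarity but does not show the calculation. The following is a minimal sketch of that technique, not the Phamhilator implementation: the class CosineSketch, its method names, the string term keys, and the uint document keys in the result are all assumptions made for illustration. It scores a query vector against each stored document vector and keeps the highest-scoring documents, which is the role maxDocsToReturn appears to play.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Phamhilator. Vectors are sparse maps
// from a term to its weight (for example, a TF-IDF weight).
public static class CosineSketch
{
    // Euclidean length of a sparse term-weight vector.
    public static float Length(IDictionary<string, float> vector)
    {
        double sumOfSquares = 0;
        foreach (var weight in vector.Values)
        {
            sumOfSquares += (double)weight * weight;
        }
        return (float)Math.Sqrt(sumOfSquares);
    }

    // Cosine similarity: dot product divided by the product of the lengths.
    public static float Cosine(IDictionary<string, float> a, IDictionary<string, float> b)
    {
        float lenA = Length(a), lenB = Length(b);
        if (lenA == 0f || lenB == 0f) return 0f; // an empty vector matches nothing

        double dot = 0;
        foreach (var pair in a)
        {
            float weight;
            // Only terms present in both vectors contribute to the dot product.
            if (b.TryGetValue(pair.Key, out weight)) dot += (double)pair.Value * weight;
        }
        return (float)(dot / ((double)lenA * lenB));
    }

    // Scores every document against the query and keeps the best matches.
    // The uint key is assumed to be a document ID.
    public static Dictionary<uint, float> TopMatches(
        IDictionary<string, float> query,
        IDictionary<uint, Dictionary<string, float>> docs,
        ushort maxDocsToReturn)
    {
        return docs
            .Select(d => new KeyValuePair<uint, float>(d.Key, Cosine(query, d.Value)))
            .OrderByDescending(p => p.Value)
            .Take(maxDocsToReturn)
            .ToDictionary(p => p.Key, p => p.Value);
    }
}
```

Two vectors that share no terms score 0, and a vector compared with itself scores 1 up to floating-point rounding. Dividing by both lengths keeps long documents from outranking short ones simply because they contain more terms.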

RecalculateIDFs() public method

public RecalculateIDFs ( ) : void
return void
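
The page gives no description for RecalculateIDFs. Judging by the name alone, it refreshes inverse document frequencies, which depend on the total document count and so go stale whenever documents are added or removed. The sketch below shows one common IDF formula, ln(N / df), under that assumption. The class IdfSketch, the method ComputeIdfs, and the string term keys are hypothetical, and the weighting Phamhilator actually uses is not shown in this listing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Phamhilator.
public static class IdfSketch
{
    // docs maps an assumed document ID to that document's term-frequency table.
    // Returns idf(t) = ln(N / df(t)), where N is the number of documents
    // and df(t) is the number of documents that contain term t.
    public static Dictionary<string, float> ComputeIdfs(
        IDictionary<uint, Dictionary<string, ushort>> docs)
    {
        // Count how many documents each term appears in.
        var docFreq = new Dictionary<string, int>();
        foreach (var doc in docs.Values)
        {
            foreach (var term in doc.Keys)
            {
                int count;
                docFreq.TryGetValue(term, out count); // count stays 0 for unseen terms
                docFreq[term] = count + 1;
            }
        }

        var idfs = new Dictionary<string, float>();
        foreach (var pair in docFreq)
        {
            idfs[pair.Key] = (float)Math.Log((double)docs.Count / pair.Value);
        }
        return idfs;
    }
}
```

With this formula a term that occurs in every document gets an IDF of 0, so it contributes nothing to a similarity score, while rarer terms receive larger weights.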

RemoveDocument() public method

public RemoveDocument ( uint documentID, IDictionary<…, ushort> termTFs ) : void
documentID uint
termTFs IDictionary<…, ushort>
return void