C# Class QACExperimenter.Approaches.Text.Tokenizer

Provides string tokenization and cleaning
Project: stewhir/recent-robust-qac

Public Methods

Method Description
TokenizeString ( string inputText, bool returnFirstNgramOnly = false ) : List

Simple whitespace tokenizer. TODO: fix this to do filtering etc.

Private Methods

Method Description
NormalizeQueryOrTitle ( string inputText ) : string

Normalizes a query or title to the same format, with punctuation removed.

StripPunctuation ( string inputString ) : string

Removes any punctuation (high-performance single-pass method).
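The bodies of the two private methods are not shown on this page, so the following is only a minimal sketch of what a single-pass punctuation strip and a normalization step built on it might look like. The class name TextCleaningSketch is hypothetical, the use of char.IsPunctuation as the punctuation test is an assumption, and the trimming and lowercasing in the normalization step are likewise assumptions rather than documented behavior.

```csharp
using System;

// Hypothetical illustration; not the project's actual implementation.
public static class TextCleaningSketch
{
    // Single pass: copy every non-punctuation character into a
    // preallocated buffer, then build the result string once.
    // Assumption: "punctuation" means char.IsPunctuation(c) is true.
    public static string StripPunctuation(string inputString)
    {
        if (string.IsNullOrEmpty(inputString))
        {
            return string.Empty;
        }

        var buffer = new char[inputString.Length];
        int length = 0;

        foreach (char c in inputString)
        {
            if (!char.IsPunctuation(c))
            {
                buffer[length++] = c;
            }
        }

        return new string(buffer, 0, length);
    }

    // Assumption: normalization strips punctuation, trims the ends,
    // and lowercases, so queries and titles compare in one format.
    public static string NormalizeQueryOrTitle(string inputText)
    {
        return StripPunctuation(inputText ?? string.Empty)
            .Trim()
            .ToLowerInvariant();
    }
}
```

Writing into a char array avoids the repeated allocations of string concatenation, which is one plausible reading of "high-performance single pass". Note that char.IsPunctuation does not cover symbol characters such as '$' or '+', so a stricter filter would need an additional check.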

Method Details

TokenizeString() public static method

Simple whitespace tokenizer. TODO: fix this to do filtering etc.
public static TokenizeString ( string inputText, bool returnFirstNgramOnly = false ) : List
inputText string
returnFirstNgramOnly bool
return List
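As a rough illustration of the signature above, a whitespace tokenizer could be sketched as follows. This is not the project's code: the class name TokenizerSketch is hypothetical, the element type List<string> is an assumption (the page shows only "List"), and treating returnFirstNgramOnly as "return only the first token" is a guess at the parameter's meaning.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration; not the project's actual implementation.
public static class TokenizerSketch
{
    // Assumption: the return type is List<string>.
    public static List<string> TokenizeString(
        string inputText, bool returnFirstNgramOnly = false)
    {
        var tokens = new List<string>();
        if (string.IsNullOrWhiteSpace(inputText))
        {
            return tokens;
        }

        // A null separator array makes Split treat any whitespace as a
        // delimiter; RemoveEmptyEntries collapses runs of whitespace.
        string[] parts = inputText.Split(
            (char[])null, StringSplitOptions.RemoveEmptyEntries);

        foreach (string part in parts)
        {
            tokens.Add(part);

            // Assumption: the flag means "stop after the first token".
            if (returnFirstNgramOnly)
            {
                break;
            }
        }

        return tokens;
    }
}
```

Under these assumptions, TokenizerSketch.TokenizeString("new  york times") yields the three tokens "new", "york", and "times", and passing true for returnFirstNgramOnly yields only "new".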