C# Class ABB.Swum.SamuraiIdSplitter

Used to split the identifiers in a program into their constituent words.
Inheritance: IdSplitter

Public Methods

Method Description
CountProgramWords ( ISrcMLArchive archive ) : Dictionary<string, int>

Counts the number of occurrences of words within the identifiers in the given srcml files.

SamuraiIdSplitter ( Dictionary<string, int> programWordCount )

Creates a new IdentifierSplitter using the specified word count dictionary.

SamuraiIdSplitter ( string programWordCountPath )

Creates a new IdentifierSplitter using the specified program word count file.

Split ( string identifier ) : string[]

Splits a program identifier into its constituent words.

Split ( string identifier, bool printSplitTrace ) : string[]

Splits a program identifier into its constituent words.

Private Methods

Method Description
IncludeIdentifier ( string word, int count ) : bool
Initialize ( Dictionary<string, int> programWordCount ) : void

Reads the necessary data files and initializes the member variables.

IsPrefix ( string word ) : bool

Checks whether the supplied word is a known prefix.

IsSuffix ( string word ) : bool

Checks whether the supplied word is a known suffix.

Score ( string word ) : double
SplitOnUppercaseToLowercase ( string word ) : string[]

Splits a word where an uppercase letter is followed by a lowercase letter. The word is split only once, at the first matching location. This method assumes the input consists of zero-or-more uppercase letters followed by zero-or-more lowercase letters.

SplitSameCase ( string word ) : string[]

Splits a word into subwords. The word should be either (1) all lowercase, (2) all uppercase, or (3) a single uppercase letter followed by lowercase letters.

SplitSameCase ( string word, double noSplitScore ) : string[]

Splits a word into subwords. The word should be either (1) all lowercase, (2) all uppercase, or (3) a single uppercase letter followed by lowercase letters.
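
The description of SplitOnUppercaseToLowercase above leaves open on which side of the uppercase letter the cut falls. The following is a minimal, self-contained sketch of the idea, not the library's implementation: the class name UppercaseToLowercaseSketch and the cutBeforeUppercase parameter are hypothetical, introduced only to show both candidate cuts at the first uppercase-to-lowercase transition.

```csharp
using System;

public static class UppercaseToLowercaseSketch
{
    // Index of the first uppercase letter that is immediately followed
    // by a lowercase letter, or -1 if the word has no such position.
    public static int FindTransition(string word)
    {
        for (int i = 0; i + 1 < word.Length; i++)
        {
            if (char.IsUpper(word[i]) && char.IsLower(word[i + 1]))
            {
                return i;
            }
        }
        return -1;
    }

    // Splits the word once, at the first transition.
    // ASSUMPTION (hypothetical parameter): cutBeforeUppercase selects whether
    // the uppercase letter starts the second part or ends the first part.
    public static string[] Split(string word, bool cutBeforeUppercase)
    {
        int i = FindTransition(word);
        if (i < 0)
        {
            return new[] { word };      // no transition: leave the word whole
        }
        int cut = cutBeforeUppercase ? i : i + 1;
        if (cut == 0)
        {
            return new[] { word };      // the cut would leave an empty first part
        }
        return new[] { word.Substring(0, cut), word.Substring(cut) };
    }
}
```

For the input "GPSstate", Split(word, true) yields "GP" and "Sstate", while Split(word, false) yields "GPS" and "state". A scoring function such as the Score method listed above could be used to choose between the two candidates, but how the class actually decides is not stated here.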

Method Details

CountProgramWords() public static method

Counts the number of occurrences of words within the identifiers in the given srcml files.
public static CountProgramWords ( ISrcMLArchive archive ) : Dictionary<string, int>
archive ISrcMLArchive An archive containing the srcml files to analyze.
Returns Dictionary<string, int>
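
To illustrate the kind of dictionary this method produces, here is a self-contained sketch that counts words across a plain list of identifiers. It is not the library's implementation: it does not read srcML, and the class name WordCountSketch, the conservative splitting rule (non-letters and lowercase-to-uppercase transitions), and the lowercasing of keys are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WordCountSketch
{
    // ASSUMPTION: a conservative boundary rule. Break on runs of non-letter
    // characters and at every lowercase-to-uppercase transition.
    private static readonly Regex Boundary =
        new Regex(@"[^A-Za-z]+|(?<=[a-z])(?=[A-Z])");

    // Counts how often each word occurs across the given identifiers.
    public static Dictionary<string, int> CountWords(IEnumerable<string> identifiers)
    {
        var counts = new Dictionary<string, int>();
        foreach (string id in identifiers)
        {
            foreach (string part in Boundary.Split(id))
            {
                if (part.Length == 0)
                {
                    continue;           // skip empty pieces, e.g. from a leading underscore
                }
                // ASSUMPTION: keys are lowercased so "Name" and "name" share a count.
                string word = part.ToLowerInvariant();
                counts.TryGetValue(word, out int current);
                counts[word] = current + 1;
            }
        }
        return counts;
    }
}
```

Calling CountWords on the identifiers "getUserName", "user_name", and "setName" gives get = 1, user = 2, name = 3, and set = 1. A dictionary of this shape is what the SamuraiIdSplitter constructor accepts as programWordCount.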

SamuraiIdSplitter() public constructor

Creates a new IdentifierSplitter using the specified word count dictionary.
public SamuraiIdSplitter ( Dictionary<string, int> programWordCount )
programWordCount Dictionary<string, int> A dictionary containing the local program word counts.

SamuraiIdSplitter() public constructor

Creates a new IdentifierSplitter using the specified program word count file.
public SamuraiIdSplitter ( string programWordCountPath )
programWordCountPath string The path to the file containing the local program word counts.

Split() public method

Splits a program identifier into its constituent words.
public Split ( string identifier ) : string[]
identifier string The identifier to split.
Returns string[]

Split() public method

Splits a program identifier into its constituent words.
public Split ( string identifier, bool printSplitTrace ) : string[]
identifier string The identifier to split.
printSplitTrace bool Whether or not to print a trace of the splitting process.
Returns string[]