C# Class org.apache.lucene.analysis.miscellaneous.PatternAnalyzer

Inherits: Analyzer
Project: paulirwin/lucene.net

Public Properties

Property Type Description
DEFAULT_ANALYZER PatternAnalyzer
EXTENDED_ANALYZER PatternAnalyzer
NON_WORD_PATTERN Pattern
WHITESPACE_PATTERN Pattern

Public Methods

Method Description
Equals ( object other ) : bool

Indicates whether some other object is "equal to" this one.

GetHashCode ( ) : int

Returns a hash code value for the object.

PatternAnalyzer ( System.Version matchVersion, Pattern pattern, bool toLowerCase, CharArraySet stopWords )

Constructs a new instance with the given parameters.

createComponents ( string fieldName, Reader reader ) : TokenStreamComponents

Creates a token stream that tokenizes all the text in the given Reader. This implementation forwards to createComponents(string, Reader, string) and is less efficient than calling that overload directly.

createComponents ( string fieldName, Reader reader, string text ) : TokenStreamComponents

Creates a token stream that tokenizes the given string into token terms (aka words).

Private Methods

Method Description
ToString ( Reader input ) : string

Reads until end-of-stream and returns all read chars, finally closes the stream.

eq ( object o1, object o2 ) : bool

Null-safe equality check; o1 and/or o2 may be null.

eqPattern ( Pattern p1, Pattern p2 ) : bool

Compares two patterns; assumes p1 and p2 are not null.
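
The two private equality helpers are only described in one line each, so a small sketch may help. It is written in Java because the documented members are typed with Java's Pattern, and it is an assumption about how such helpers could look, not the library's code. The class name EqualityHelpersSketch is hypothetical, and comparing patterns by source string and flags is an assumed strategy (java.util.regex.Pattern does not override equals, so some explicit comparison is needed).

```java
import java.util.regex.Pattern;

class EqualityHelpersSketch {
    /** Null-safe equality: true if both are null, the same reference, or o1.equals(o2). */
    static boolean eq(Object o1, Object o2) {
        return o1 == o2 || (o1 != null && o1.equals(o2));
    }

    /**
     * Pattern equality, assuming p1 and p2 are not null.
     * Assumption: two patterns are "equal" when their regex source and flags match.
     */
    static boolean eqPattern(Pattern p1, Pattern p2) {
        return p1 == p2
            || (p1.flags() == p2.flags() && p1.pattern().equals(p2.pattern()));
    }

    public static void main(String[] args) {
        System.out.println(eq(null, null));                                         // true
        System.out.println(eq(null, "a"));                                          // false
        System.out.println(eqPattern(Pattern.compile("\\s+"), Pattern.compile("\\s+"))); // true
    }
}
```

Helpers like these would let Equals compare the pattern, the lower-casing flag, and a possibly null stop set without scattering null checks.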

Method Details

Equals() public method

Indicates whether some other object is "equal to" this one.
public Equals ( object other ) : bool
other object The reference object with which to compare.
Returns bool

GetHashCode() public method

Returns a hash code value for the object.
public GetHashCode ( ) : int
Returns int

PatternAnalyzer() public method

Constructs a new instance with the given parameters.
public PatternAnalyzer ( System.Version matchVersion, Pattern pattern, bool toLowerCase, CharArraySet stopWords )
matchVersion System.Version Currently does nothing.
pattern Pattern A regular expression delimiting tokens.
toLowerCase bool If true, returns tokens after applying String.toLowerCase().
stopWords CharArraySet If non-null, ignores all tokens that are contained in the given stop set (after toLowerCase() has been applied, if applicable). For example, a stop set loaded as in WordlistLoader.getWordSet(new File("samples/fulltext/stopwords.txt")), or built from other stop word lists.
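
The parameter descriptions above imply a three-step pipeline: split on the delimiter pattern, optionally lower-case, then drop stop words. The following is a minimal Java sketch of that order of operations using only java.util.regex. It is not the analyzer's implementation: the class PatternTokenizeSketch and its tokenize method are hypothetical, a plain Set<String> stands in for CharArraySet, and it returns a list of strings rather than a Lucene token stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

class PatternTokenizeSketch {
    /**
     * Splits text on the delimiter pattern, optionally lower-cases each token,
     * then drops tokens found in stopWords (which may be null).
     */
    static List<String> tokenize(Pattern pattern, boolean toLowerCase,
                                 Set<String> stopWords, String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : pattern.split(text)) {
            if (piece.isEmpty()) {
                continue; // a delimiter at the very start yields an empty first piece
            }
            // Lower-casing happens before the stop-word lookup, as the docs describe.
            String token = toLowerCase ? piece.toLowerCase(Locale.ROOT) : piece;
            if (stopWords != null && stopWords.contains(token)) {
                continue;
            }
            tokens.add(token);
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> stops = new HashSet<>(Arrays.asList("the", "a"));
        Pattern nonWord = Pattern.compile("\\W+");
        // prints [quick, brown, fox]
        System.out.println(tokenize(nonWord, true, stops, "The quick, brown fox!"));
    }
}
```

Because the stop-word check runs after lower-casing, "The" is removed by a stop set that contains only "the".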

createComponents() public method

Creates a token stream that tokenizes all the text in the given Reader. This implementation forwards to createComponents(string, Reader, string) and is less efficient than calling that overload directly.
public createComponents ( string fieldName, Reader reader ) : TokenStreamComponents
fieldName string The name of the field to tokenize (currently ignored).
reader Reader The reader delivering the text.
Returns TokenStreamComponents
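
One plausible reason the Reader-based overload is less efficient is that a regex split needs the whole text in memory, so the reader has to be drained before any token can be produced. The Java sketch below illustrates that forwarding shape under this assumption; it is not the library's code. ReaderForwardingSketch, readFully, splitText, and splitReader are hypothetical names, and readFully mirrors the documented behaviour of the private ToString(Reader) helper (read to end-of-stream, then close).

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.regex.Pattern;

class ReaderForwardingSketch {
    /** Reads until end-of-stream, returns all chars read, and closes the reader. */
    static String readFully(Reader input) {
        try (Reader in = input) {
            StringBuilder sb = new StringBuilder();
            char[] buffer = new char[1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** String-based entry point: the text is already in memory. */
    static String[] splitText(Pattern pattern, String text) {
        return pattern.split(text);
    }

    /** Reader-based entry point: buffers the whole stream first, then forwards. */
    static String[] splitReader(Pattern pattern, Reader reader) {
        return splitText(pattern, readFully(reader));
    }

    public static void main(String[] args) {
        Pattern ws = Pattern.compile("\\s+");
        // prints [alpha, beta, gamma]
        System.out.println(Arrays.toString(splitReader(ws, new StringReader("alpha beta\tgamma"))));
    }
}
```

When the caller already holds the text as a string, passing it directly avoids the extra copy made while draining the reader.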

createComponents() public method

Creates a token stream that tokenizes the given string into token terms (aka words).
public createComponents ( string fieldName, Reader reader, string text ) : TokenStreamComponents
fieldName string The name of the field to tokenize (currently ignored).
reader Reader Reader (e.g. a char filter) of the original text. Can be null.
text string The string to tokenize.
Returns TokenStreamComponents

Property Details

DEFAULT_ANALYZER public static property

A lower-casing word analyzer with English stop words (can be shared freely across threads without harm); global per class loader.
public static PatternAnalyzer DEFAULT_ANALYZER
Returns PatternAnalyzer

EXTENDED_ANALYZER public static property

A lower-casing word analyzer with extended English stop words (can be shared freely across threads without harm); global per class loader. The stop words are borrowed from http://thomas.loc.gov/home/stopwords.html, see http://thomas.loc.gov/home/all.about.inquery.html
public static PatternAnalyzer EXTENDED_ANALYZER
Returns PatternAnalyzer

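NON_WORD_PATTERN public static property

"\\W+"; divides text at runs of non-word characters. \W matches anything other than a letter, digit, or underscore, so this only approximates NOT Character.isLetter(c): digits and underscores stay inside tokens.
public static Pattern NON_WORD_PATTERN
Returns Pattern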
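
To make the effect of "\\W+" concrete, here is a small Java sketch using java.util.regex directly. The class NonWordPatternSketch is hypothetical and only demonstrates the regex, not the analyzer itself.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class NonWordPatternSketch {
    // Same regex source as the documented NON_WORD_PATTERN.
    static final Pattern NON_WORD = Pattern.compile("\\W+");

    /** Splits text at runs of non-word characters. */
    static String[] split(String text) {
        return NON_WORD.split(text);
    }

    public static void main(String[] args) {
        // Underscores and digits count as word characters, so they are kept.
        // prints [snake_case, v2, 0, rocks]
        System.out.println(Arrays.toString(split("snake_case, v2.0 rocks!")));
    }
}
```

Note that the "." in "v2.0" is a delimiter, so version numbers and decimals are broken apart.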

WHITESPACE_PATTERN public static property

"\\s+"; divides text at runs of whitespace (roughly Character.isWhitespace(c)).
public static Pattern WHITESPACE_PATTERN
Returns Pattern
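
For comparison with the non-word pattern, this Java sketch shows what splitting on "\\s+" keeps. WhitespacePatternSketch is a hypothetical class that exercises only the regex.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class WhitespacePatternSketch {
    // Same regex source as the documented WHITESPACE_PATTERN.
    static final Pattern WHITESPACE = Pattern.compile("\\s+");

    /** Splits text at runs of whitespace; punctuation stays attached to tokens. */
    static String[] split(String text) {
        return WHITESPACE.split(text);
    }

    public static void main(String[] args) {
        // Tabs, repeated spaces, and newlines all act as a single delimiter.
        // prints [don't, stop, me-now]
        System.out.println(Arrays.toString(split("don't\tstop  me-now\n")));
    }
}
```

Apostrophes and hyphens survive here, whereas the non-word pattern would split on them.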