C# class RTools.Util.StreamTokenizer

A StreamTokenizer similar to Java's. This breaks an input stream (coming from a TextReader) into Tokens based on various settings. The settings are stored in the TokenizerSettings property, which is a StreamTokenizerSettings instance.

This is configurable: you can modify the TokenizerSettings.CharTypes[] array to specify which characters are of which type, along with other settings such as whether to look for comments.

WARNING: This is not internationalized. This treats all characters beyond the 7-bit ASCII range (decimal 127) as Word characters.
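To make the warning concrete, here is a minimal sketch of a character type table that covers only 7-bit ASCII and classifies everything above it as a Word character. The names CharKind, CharTable, Build and Classify are hypothetical and are not part of RTools.Util; the table size of 128 is an assumption inferred from the "7-bit ASCII" wording above.

```csharp
using System;

// Hypothetical stand-in for a tokenizer's character type table.
// None of these names come from RTools.Util.
public enum CharKind : byte { Word, Whitespace, Digit, Other }

public static class CharTable
{
    // Assumption: the table covers 7-bit ASCII only (codes 0..127).
    public const int NChars = 128;

    // Build a default table, one entry per ASCII code.
    public static byte[] Build()
    {
        var types = new byte[NChars];
        for (int c = 0; c < NChars; c++)
        {
            char ch = (char)c;
            if (char.IsWhiteSpace(ch)) types[c] = (byte)CharKind.Whitespace;
            else if (ch >= '0' && ch <= '9') types[c] = (byte)CharKind.Digit;
            else if (char.IsLetter(ch) || ch == '_') types[c] = (byte)CharKind.Word;
            else types[c] = (byte)CharKind.Other;
        }
        return types;
    }

    // Look up a character code. Codes at or above NChars fall outside the
    // table and are treated as Word characters. Callers are expected to
    // handle end of input (-1) before calling this.
    public static CharKind Classify(byte[] types, int c)
    {
        if (c < 0) throw new ArgumentOutOfRangeException(nameof(c));
        if (c >= NChars) return CharKind.Word;
        return (CharKind)types[c];
    }
}
```

With a table like this, reclassifying a character is a single assignment, for example `types['#'] = (byte)CharKind.Word;`, which is the kind of change the CharTypes[] array is described as allowing.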

There are two main ways to use this: 1) parse the entire stream at once and get a List of Tokens (see the Tokenize* methods), and 2) call NextToken() successively. This reads from a TextReader, which you can set directly, and it also provides convenience methods to parse files and strings. It returns an Eof token when the end of the input is reached.

Here's an example of the NextToken() style of use:

    StreamTokenizer tokenizer = new StreamTokenizer();
    tokenizer.GrabWhitespace = true;
    tokenizer.Verbosity = VerbosityLevel.Debug; // just for debugging
    tokenizer.TextReader = File.OpenText(fileName);
    Token token;
    while (tokenizer.NextToken(out token))
        log.Info("Token = '{0}'", token);

Here's an example of the Tokenize... style of use:

    StreamTokenizer tokenizer = new StreamTokenizer("some string");
    List<Token> tokens = new List<Token>();
    if (!tokenizer.Tokenize(tokens))
    {
        // error handling
    }
    foreach (Token t in tokens)
        Console.WriteLine("t = {0}", t);

Comment delimiters are hardcoded (// and /*) and are not affected by the character type table.

This sets line numbers in the tokens it produces. Each number is normally the line on which the token starts. There is one known caveat: when the GrabWhitespace setting is true and a whitespace token contains a newline, that token's line number is set to the following line rather than the line on which the token started.

Project: PaulMineau/AIMA.Net

Public Properties

Property Type Description
NChars int The number of characters in the character table.

Public Methods

Method Description
Display ( ) : void

Display the state of this object.

Display ( string prefix ) : void

Display the state of this object, with a per-line prefix.

NextToken ( Token &token ) : bool

Get the next token. The last token will be an EofToken unless there is an unterminated quote or unterminated block comment and Settings.DoUntermCheck is true, in which case this throws a StreamTokenizerUntermException or a subclass of it.

SpeedTest ( ) : bool

Speed test. This tests the speed of the parse.

StreamTokenizer ( )

Default constructor.

StreamTokenizer ( TextReader sr )

Construct and set this object's TextReader to the one specified.

StreamTokenizer ( string str )

Construct and set a string to tokenize.

TestSelf ( ) : bool

Simple self test. See StreamTokenizerTestCase for full tests.

Tokenize ( List tokens ) : bool

Parse the rest of the stream and put all the tokens in the input List. This resets the line number to 1.

TokenizeFile ( string fileName ) : RTools.Util.Token[]

Tokenize a file completely and return the tokens in a Token[].

TokenizeFile ( string fileName, List tokens ) : bool

Parse all tokens from the specified file, put them into the input List.

TokenizeReader ( TextReader tr, List tokens ) : bool

Parse all tokens from the specified TextReader, put them into the input List.

TokenizeStream ( Stream s, List tokens ) : bool

Parse all tokens from the specified Stream, put them into the input List.

TokenizeString ( string str, List tokens ) : bool

Parse all tokens from the specified string, put them into the input List.

Protected Methods

Method Description
SpeedTestParse ( StreamTokenizer tokenizer, Stream stream ) : double

Use the supplied tokenizer to tokenize the specified stream and time it.

Private Methods

Method Description
GetNextChar ( ) : int

Read the next character from the stream, or from backString if we backed up.

GrabInt ( CharBuffer sb, bool allowPlus, char &thisChar ) : bool

Starting from current stream location, scan forward over an int. Determine whether it's an integer or not. If so, push the integer characters to the specified CharBuffer. If not, put them in backString (essentially leave the stream as it was) and return false.

If it was an int, the stream is left 1 character after the end of the int, and that character is output in the thisChar parameter.

The formats for integers are: 1, +1, and -1

The + and - signs are included in the output buffer.

Initialize ( ) : void

Utility function, things common to constructors.

InitializeStream ( ) : void

Clear the stream settings.

PickNextState ( byte ctype, int c ) : NextTokenState

Pick the next state given just a single character. This is used at the start of a new token.

PickNextState ( byte ctype, int c, NextTokenState excludeState ) : NextTokenState

Pick the next state given just a single character. This is used at the start of a new token.
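The pushback behaviour described for GetNextChar and GrabInt can be sketched as follows. This is an illustration under stated assumptions, not the class's actual private code: PushbackReader is a hypothetical name, a StringBuilder stands in for CharBuffer, and thisChar is an int here (rather than a char) so that it can carry -1 at end of input.

```csharp
using System;
using System.IO;
using System.Text;

// Illustrative only: PushbackReader is a hypothetical name, not an
// RTools.Util type.
public sealed class PushbackReader
{
    private readonly TextReader reader;
    private string backString = "";   // characters waiting to be re-read

    public PushbackReader(TextReader reader) { this.reader = reader; }

    // Read from backString first, then from the underlying reader.
    // Returns -1 at end of input.
    public int GetNextChar()
    {
        if (backString.Length > 0)
        {
            int c = backString[0];
            backString = backString.Substring(1);
            return c;
        }
        return reader.Read();
    }

    // Scan an optionally signed run of digits (1, +1, -1).
    // On success the characters, sign included, are appended to sb and
    // thisChar holds the character just after the int.
    // On failure everything consumed goes back into backString and the
    // value of thisChar should be ignored.
    public bool GrabInt(StringBuilder sb, bool allowPlus, out int thisChar)
    {
        var scanned = new StringBuilder();
        thisChar = GetNextChar();

        if (thisChar == '-' || (allowPlus && thisChar == '+'))
        {
            scanned.Append((char)thisChar);
            thisChar = GetNextChar();
        }

        int digits = 0;
        while (thisChar >= '0' && thisChar <= '9')
        {
            scanned.Append((char)thisChar);
            digits++;
            thisChar = GetNextChar();
        }

        if (digits == 0)
        {
            // Not an int: restore the stream to where it was.
            if (thisChar != -1) scanned.Append((char)thisChar);
            backString = scanned.ToString() + backString;
            return false;
        }

        sb.Append(scanned.ToString());
        return true;
    }
}
```

For the input "-42x" this sketch returns true, leaves "-42" in the buffer and reports 'x' as thisChar. For "+a" it returns false, and the next GetNextChar() call yields '+' again.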

Method Details

Display() public method

Display the state of this object.
public Display ( ) : void
Returns void

Display() public method

Display the state of this object, with a per-line prefix.
public Display ( string prefix ) : void
prefix string The per-line prefix.
Returns void

NextToken() public method

Get the next token. The last token will be an EofToken unless there is an unterminated quote or unterminated block comment and Settings.DoUntermCheck is true, in which case this throws a StreamTokenizerUntermException or a subclass of it.
public NextToken ( Token &token ) : bool
token Token The output token.
Returns bool
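The contract above (a false return at end of input, or an exception for an unterminated quote when the check is enabled) can be illustrated with a self-contained miniature. MiniTokenizer and UntermQuoteException are hypothetical names invented for this sketch; it handles only whitespace-separated words and double-quoted strings, and a null token stands in for the Eof token.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical miniature of the contract; not RTools.Util types.
public class UntermQuoteException : Exception
{
    public UntermQuoteException(string message) : base(message) { }
}

public class MiniTokenizer
{
    private readonly TextReader reader;
    public bool DoUntermCheck = true;

    public MiniTokenizer(TextReader reader) { this.reader = reader; }

    // Returns false once the end of input is reached; token is then null.
    public bool NextToken(out string token)
    {
        // Skip leading whitespace.
        while (reader.Peek() != -1 && char.IsWhiteSpace((char)reader.Peek()))
            reader.Read();

        if (reader.Peek() == -1) { token = null; return false; }

        var sb = new StringBuilder();
        if (reader.Peek() == '"')
        {
            // Quoted string: read up to and including the closing quote.
            sb.Append((char)reader.Read());
            while (true)
            {
                int c = reader.Read();
                if (c == -1)
                {
                    if (DoUntermCheck)
                        throw new UntermQuoteException("Unterminated quote: " + sb);
                    break; // check disabled: return what was read
                }
                sb.Append((char)c);
                if (c == '"') break;
            }
        }
        else
        {
            // Plain word: read until whitespace, a quote, or end of input.
            while (reader.Peek() != -1
                   && !char.IsWhiteSpace((char)reader.Peek())
                   && reader.Peek() != '"')
                sb.Append((char)reader.Read());
        }

        token = sb.ToString();
        return true;
    }
}
```

A caller that wants to survive malformed input would wrap the `while (tokenizer.NextToken(out token))` loop in a try/catch for the unterminated exception, or turn the check off and accept a truncated final token.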

SpeedTest() public static method

Speed test. This tests the speed of the parse.
public static SpeedTest ( ) : bool
Returns bool

SpeedTestParse() protected static method

Use the supplied tokenizer to tokenize the specified stream and time it.
protected static SpeedTestParse ( StreamTokenizer tokenizer, Stream stream ) : double
tokenizer StreamTokenizer
stream Stream
Returns double
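The source does not say what unit the returned double is in or how the timing is done. As a rough sketch of the general idea, assuming elapsed seconds and a seekable stream, a parse can be timed with System.Diagnostics.Stopwatch. ParseTimer and TimeParse are hypothetical names, and the parse itself is passed in as a delegate so the example stays independent of the tokenizer.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;

public static class ParseTimer
{
    // Runs the supplied parse action over the stream and returns the
    // elapsed time in seconds. Assumes the stream is seekable.
    public static double TimeParse(Action<TextReader> parse, Stream stream)
    {
        stream.Position = 0; // rewind so repeated runs see the same input

        // leaveOpen: true keeps the caller's stream usable afterwards.
        using (var reader = new StreamReader(stream, Encoding.UTF8, true, 4096, leaveOpen: true))
        {
            var watch = Stopwatch.StartNew();
            parse(reader);
            watch.Stop();
            return watch.Elapsed.TotalSeconds;
        }
    }
}
```

Because the stream is rewound and left open, the same MemoryStream can be timed several times to average out noise.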

StreamTokenizer() public method

Default constructor.
public StreamTokenizer ( )

StreamTokenizer() public method

Construct and set this object's TextReader to the one specified.
public StreamTokenizer ( TextReader sr )
sr TextReader The TextReader to read from.

StreamTokenizer() public method

Construct and set a string to tokenize.
public StreamTokenizer ( string str )
str string The string to tokenize.

TestSelf() public static method

Simple self test. See StreamTokenizerTestCase for full tests.
public static TestSelf ( ) : bool
Returns bool

Tokenize() public method

Parse the rest of the stream and put all the tokens in the input List. This resets the line number to 1.
public Tokenize ( List tokens ) : bool
tokens List The List to append to.
Returns bool

TokenizeFile() public method

Tokenize a file completely and return the tokens in a Token[].
public TokenizeFile ( string fileName ) : RTools.Util.Token[]
fileName string The file to tokenize.
Returns RTools.Util.Token[]

TokenizeFile() public method

Parse all tokens from the specified file and put them into the input List.
public TokenizeFile ( string fileName, List tokens ) : bool
fileName string The file to read.
tokens List The List to put tokens in.
Returns bool

TokenizeReader() public method

Parse all tokens from the specified TextReader and put them into the input List.
public TokenizeReader ( TextReader tr, List tokens ) : bool
tr TextReader The TextReader to read from.
tokens List The List to append to.
Returns bool

TokenizeStream() public method

Parse all tokens from the specified Stream and put them into the input List.
public TokenizeStream ( Stream s, List tokens ) : bool
s Stream
tokens List The List to put tokens in.
Returns bool

TokenizeString() public method

Parse all tokens from the specified string and put them into the input List.
public TokenizeString ( string str, List tokens ) : bool
str string
tokens List The List to put tokens in.
Returns bool
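Since the class reads from a TextReader, the file, stream and string variants above can all be thought of as thin wrappers that build a reader and delegate to one routine. The sketch below shows that pattern using standard .NET readers only. ReaderSources and its methods are hypothetical, a whitespace split stands in for real tokenizing, and whether the actual Tokenize* methods are implemented this way is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ReaderSources
{
    // Common entry point: collect whitespace-separated words from any
    // TextReader. A stand-in for a real tokenizing loop.
    public static bool ReadWords(TextReader reader, List<string> words)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // A null separator array means "split on whitespace".
            words.AddRange(line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries));
        }
        return true;
    }

    // String input: wrap in a StringReader.
    public static bool ReadWordsFromString(string str, List<string> words)
    {
        using (var reader = new StringReader(str))
            return ReadWords(reader, words);
    }

    // Stream input: wrap in a StreamReader. The reader is not disposed
    // here, so the caller keeps ownership of the stream.
    public static bool ReadWordsFromStream(Stream s, List<string> words)
    {
        return ReadWords(new StreamReader(s), words);
    }

    // File input: File.OpenText returns a StreamReader for the file.
    public static bool ReadWordsFromFile(string fileName, List<string> words)
    {
        using (var reader = File.OpenText(fileName))
            return ReadWords(reader, words);
    }
}
```

Keeping the TextReader overload as the single place where parsing happens means each input source needs only one line of adapter code.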

Property Details

NChars public static property

This is the number of characters in the character table.
public static int NChars
Returns int