C# Class Antlr.Runtime.BaseRecognizer

A generic base class for recognizers generated from lexer, parser, and tree grammars. It contains essentially all of the parsing support code; most of it deals with error recovery and backtracking.

Project: antlr/antlrcs

Protected Properties

Property	Type	Description
state	RecognizerSharedState

Public Methods

Method	Description
AlreadyParsedRule ( IIntStream input, int ruleIndex ) : bool	Has this rule already parsed input at the current index in the input stream? Return the stop token index or MEMO_RULE_UNKNOWN. If we attempted but failed to parse properly before, return MEMO_RULE_FAILED. This method has a side effect: if we have seen this input for this rule and successfully parsed before, then seek ahead to 1 past the stop token matched for this rule last time.
BaseRecognizer ( )
BaseRecognizer ( Antlr.Runtime.RecognizerSharedState state )
BeginResync ( ) : void	A hook to listen in on the token consumption during error recovery. The DebugParser subclasses this to fire events to the listener.
ConsumeUntil ( IIntStream input, BitSet set ) : void	Consume tokens until one matches the given token set
ConsumeUntil ( IIntStream input, int tokenType ) : void
DisplayRecognitionError ( string tokenNames, RecognitionException e ) : void
EmitErrorMessage ( string msg ) : void	Override this method to change where error messages go
EndResync ( ) : void
GetErrorHeader ( RecognitionException e ) : string	What is the error header, normally line/character position information?
GetErrorMessage ( RecognitionException e, string tokenNames ) : string	What error message should be generated for the various exception types? Not very object-oriented code, but I like having all error message generation within one method rather than spread among all of the exception classes. This also makes it much easier for the exception handling because the exception classes do not have to have pointers back to this object to access utility routines and so on. Also, changing the message for an exception type would be difficult because you would have to subclass the exception, but then somehow get ANTLR to make those kinds of exception objects instead of the default. This looks weird, but trust me: it makes the most sense in terms of flexibility. For grammar debugging, you will want to override this to add more information such as the stack frame with getRuleInvocationStack(e, this.getClass().getName()) and, for no viable alts, the decision description and state, etc. Override this to change the message generated for one or more exception types.
GetRuleInvocationStack ( ) : IList	Return an IList of the rules in your parser instance leading up to a call to this method. You could override if you want more details such as the file/line info of where in the parser code a rule is invoked. This is very useful for error messages and for context-sensitive error recovery.
GetRuleInvocationStack ( System.Diagnostics.StackTrace trace ) : IList	A more general version of GetRuleInvocationStack where you can pass in the StackTrace of, for example, a RecognitionException to get its rule stack trace.
GetRuleMemoization ( int ruleIndex, int ruleStartIndex ) : int	Given a rule number and a start token index number, return MEMO_RULE_UNKNOWN if the rule has not parsed input starting from the start index. If this rule has parsed input starting from the start index before, then return where the rule stopped parsing, that is, the index of the last token matched by the rule. For now we use a hashtable, and just the slow Object-based one. Later, we can make a special one for ints and also one that tosses out data after we commit past input position i.
GetRuleMemoizationCacheSize ( ) : int	Return the total number of rule/input-index pairs.
GetTokenErrorDisplay ( IToken t ) : string	How should a token be displayed in an error message? The default is to display just the text, but during development you might want to have a lot of information spit out. Override in that case to use t.ToString() (which, for CommonToken, dumps everything about the token). This is better than forcing you to override a method in your token objects because you don't have to go modify your lexer so that it creates a new type.
Match ( IIntStream input, int ttype, BitSet follow ) : object	Match current input symbol against ttype. Attempt single token insertion or deletion error recovery. If that fails, throw MismatchedTokenException. To turn off single token insertion or deletion error recovery, override RecoverFromMismatchedToken() and have it throw an exception. See TreeParser.RecoverFromMismatchedToken(). This way any error in a rule will cause an exception and immediate exit from the rule. The rule would recover by resynchronizing to the set of symbols that can follow the rule reference.
MatchAny ( IIntStream input ) : void	Match the wildcard: accept and consume any single input symbol.
Memoize ( IIntStream input, int ruleIndex, int ruleStartIndex ) : void	Record whether or not this rule parsed the input at this position successfully. Use a standard hashtable for now.
MismatchIsMissingToken ( IIntStream input, BitSet follow ) : bool
MismatchIsUnwantedToken ( IIntStream input, int ttype ) : bool
Recover ( IIntStream input, RecognitionException re ) : void	Recover from an error found on the input stream. This is for NoViableAlt and mismatched symbol exceptions. If you enable single token insertion and deletion, this will usually not handle mismatched symbol exceptions, but there could be a mismatched token that the Match() routine could not recover from.
RecoverFromMismatchedSet ( IIntStream input, RecognitionException e, BitSet follow ) : object
ReportError ( RecognitionException e ) : void	Report a recognition problem. This method sets errorRecovery to indicate the parser is recovering, not parsing. Once in recovery mode, no errors are generated. To get out of recovery mode, the parser must successfully match a token (after a resync). So it will go: (1) error occurs; (2) enter recovery mode, report error; (3) consume until token found in resynch set; (4) try to resume parsing; (5) next Match() will reset errorRecovery mode. If you override, make sure to update syntaxErrors if you care about that.
Reset ( ) : void	Reset the parser's state; subclasses must rewind the input stream.
SetState ( RecognizerSharedState value ) : void
ToStrings ( ICollection tokens ) : List	A convenience method for use most often with template rewrites. Converts a list of IToken objects to a list of strings.
TraceIn ( string ruleName, int ruleIndex, object inputSymbol ) : void
TraceOut ( string ruleName, int ruleIndex, object inputSymbol ) : void

Protected Methods

Method	Description
CombineFollows ( bool exact ) : BitSet
ComputeContextSensitiveRuleFOLLOW ( ) : BitSet	Compute the context-sensitive FOLLOW set for the current rule. This is the set of token types that can follow a specific rule reference given a specific call chain. You get the set of viable tokens that can possibly come next (lookahead depth 1) given the current call chain. Contrast this with plain FOLLOW for rule r, which is defined over every context in which r can be referenced.
ComputeErrorRecoverySet ( ) : BitSet
GetCurrentInputSymbol ( IIntStream input ) : object	Match needs to return the current input symbol, which gets put into the label for the associated token ref; e.g., x=ID. Token and tree parsers need to return different objects. Rather than test for input stream type or change the IntStream interface, I use a simple method to ask the recognizer to tell me what the current input symbol is. This is ignored for lexers.
GetMissingSymbol ( IIntStream input, RecognitionException e, int expectedTokenType, BitSet follow ) : object	Conjure up a missing token during error recovery. The recognizer attempts to recover from single missing symbols. But actions might refer to that missing symbol. For example, x=ID {f($x);}. The action clearly assumes that there has been an identifier matched previously and that $x points at that token. If that token is missing, but the next token in the stream is what we want, we assume that this token is missing and we keep going. Because we have to return some token to replace the missing token, we have to conjure one up. This method gives the user control over the tokens returned for missing tokens. Mostly, you will want to create something special for identifier tokens. For literals such as '{' and ',', the default action in the parser or tree parser works. It simply creates a CommonToken of the appropriate type. The text will be the token. If you change what tokens must be created by the lexer, override this method to create the appropriate tokens.
InitDFAs ( ) : void
PopFollow ( ) : void
PushFollow ( BitSet fset ) : void	Push a rule's follow set using our own hardcoded stack
RecoverFromMismatchedToken ( IIntStream input, int ttype, BitSet follow ) : object	Attempt to recover from a single missing or extra token.

Private Methods

Method	Description
DebugBeginBacktrack ( int level ) : void
DebugEndBacktrack ( int level, bool successful ) : void
DebugEnterAlt ( int alt ) : void
DebugEnterDecision ( int decisionNumber, bool couldBacktrack ) : void
DebugEnterRule ( string grammarFileName, string ruleName ) : void
DebugEnterSubRule ( int decisionNumber ) : void
DebugExitDecision ( int decisionNumber ) : void
DebugExitRule ( string grammarFileName, string ruleName ) : void
DebugExitSubRule ( int decisionNumber ) : void
DebugLocation ( int line, int charPositionInLine ) : void
DebugRecognitionException ( RecognitionException ex ) : void
DebugSemanticPredicate ( bool result, string predicate ) : void

Method Details

AlreadyParsedRule() public method

Has this rule already parsed input at the current index in the input stream? Return the stop token index or MEMO_RULE_UNKNOWN. If we attempted but failed to parse properly before, return MEMO_RULE_FAILED.

This method has a side effect: if we have seen this input for this rule and successfully parsed before, then seek ahead to 1 past the stop token matched for this rule last time.

public AlreadyParsedRule ( IIntStream input, int ruleIndex ) : bool
input	IIntStream
ruleIndex	int
return	bool

BaseRecognizer() public method

public BaseRecognizer ( )

BaseRecognizer() public method

public BaseRecognizer ( Antlr.Runtime.RecognizerSharedState state )
state	Antlr.Runtime.RecognizerSharedState

BeginResync() public method

A hook to listen in on the token consumption during error recovery. The DebugParser subclasses this to fire events to the listener.

public BeginResync ( ) : void
return	void

CombineFollows() protected method

protected CombineFollows ( bool exact ) : BitSet
exact	bool
return	BitSet

ComputeContextSensitiveRuleFOLLOW() protected method

Compute the context-sensitive FOLLOW set for the current rule. This is the set of token types that can follow a specific rule reference given a specific call chain. You get the set of viable tokens that can possibly come next (lookahead depth 1) given the current call chain. Contrast this with plain FOLLOW for rule r, which is defined over every context in which r can be referenced.

protected ComputeContextSensitiveRuleFOLLOW ( ) : BitSet
return	BitSet
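As a rough illustration of how a FOLLOW set can depend on the call chain, the sketch below keeps a hand-rolled stack of follow sets (compare PushFollow and PopFollow) and combines them from the innermost rule outward. Everything here is an assumption made for illustration: the class name FollowStackSketch, the use of HashSet<int> in place of BitSet, the EndOfRule marker value, and the exact stopping rule are not taken from this page.

```csharp
using System;
using System.Collections.Generic;

// Illustrative model only; not the ANTLR runtime implementation.
public sealed class FollowStackSketch
{
    // Assumed marker meaning "the end of the enclosing rule is reachable".
    public const int EndOfRule = 1;

    private HashSet<int>[] _following = new HashSet<int>[4];
    private int _top = -1;

    // Push the follow set of the rule reference that is about to be invoked.
    public void PushFollow(HashSet<int> fset)
    {
        if (_top + 1 >= _following.Length)
            Array.Resize(ref _following, _following.Length * 2);
        _following[++_top] = fset;
    }

    // Discard the innermost follow set when the rule returns.
    public void PopFollow()
    {
        if (_top >= 0) _top--;
    }

    // exact == false: union of every set on the stack (a resynchronization set).
    // exact == true: walk outward only while the end of the rule is visible,
    // which yields a FOLLOW set specific to the current call chain.
    public HashSet<int> CombineFollows(bool exact)
    {
        var result = new HashSet<int>();
        for (int i = _top; i >= 0; i--)
        {
            HashSet<int> local = _following[i];
            result.UnionWith(local);
            if (!exact) continue;
            if (local.Contains(EndOfRule))
            {
                // Keep the marker only for the outermost (start) rule.
                if (i > 0) result.Remove(EndOfRule);
            }
            else
            {
                break; // cannot see past this rule reference
            }
        }
        return result;
    }
}
```

In this model, the non-exact combination plays the role of an error recovery set, while the exact combination corresponds to the context-sensitive set described above.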

ComputeErrorRecoverySet() protected method

protected ComputeErrorRecoverySet ( ) : BitSet
return	BitSet

ConsumeUntil() public method

Consume tokens until one matches the given token set

public ConsumeUntil ( IIntStream input, BitSet set ) : void
input	IIntStream
set	BitSet
return	void
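A minimal sketch of the consume-until idea, assuming a token stream modelled as a plain list of token types and a cursor position. The names ConsumeUntilSketch, Eof, and the list-based stream are hypothetical stand-ins, not the IIntStream or BitSet types used by the runtime.

```csharp
using System.Collections.Generic;

// Illustrative model only; not the ANTLR runtime implementation.
public static class ConsumeUntilSketch
{
    // Assumed end-of-input marker for this sketch.
    public const int Eof = -1;

    // Advance from `position` until the token type under the cursor is a
    // member of `stopSet`, or the input is exhausted. Returns the new position.
    public static int ConsumeUntil(IReadOnlyList<int> tokenTypes, int position, ISet<int> stopSet)
    {
        while (position < tokenTypes.Count
               && tokenTypes[position] != Eof
               && !stopSet.Contains(tokenTypes[position]))
        {
            position++; // skip a token that cannot resume the parse
        }
        return position;
    }
}
```

The matching token itself is not consumed, so parsing can resume with it as the next input symbol.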

ConsumeUntil() public method

public ConsumeUntil ( IIntStream input, int tokenType ) : void
input	IIntStream
tokenType	int
return	void

DisplayRecognitionError() public method

public DisplayRecognitionError ( string tokenNames, RecognitionException e ) : void
tokenNames	string
e	RecognitionException
return	void

EmitErrorMessage() public method

Override this method to change where error messages go

public EmitErrorMessage ( string msg ) : void
msg	string
return	void

EndResync() public method

public EndResync ( ) : void
return	void

GetCurrentInputSymbol() protected method

Match needs to return the current input symbol, which gets put into the label for the associated token ref; e.g., x=ID. Token and tree parsers need to return different objects. Rather than test for input stream type or change the IntStream interface, I use a simple method to ask the recognizer to tell me what the current input symbol is.

This is ignored for lexers.

protected GetCurrentInputSymbol ( IIntStream input ) : object
input	IIntStream
return	object

GetErrorHeader() public method

What is the error header, normally line/character position information?

public GetErrorHeader ( RecognitionException e ) : string
e	RecognitionException
return	string

GetErrorMessage() public method

What error message should be generated for the various exception types?

Not very object-oriented code, but I like having all error message generation within one method rather than spread among all of the exception classes. This also makes it much easier for the exception handling because the exception classes do not have to have pointers back to this object to access utility routines and so on. Also, changing the message for an exception type would be difficult because you would have to subclass the exception, but then somehow get ANTLR to make those kinds of exception objects instead of the default. This looks weird, but trust me: it makes the most sense in terms of flexibility.

For grammar debugging, you will want to override this to add more information such as the stack frame with getRuleInvocationStack(e, this.getClass().getName()) and, for no viable alts, the decision description and state, etc.

Override this to change the message generated for one or more exception types.

public GetErrorMessage ( RecognitionException e, string tokenNames ) : string
e	RecognitionException
tokenNames	string
return	string

GetMissingSymbol() protected method

Conjure up a missing token during error recovery.

The recognizer attempts to recover from single missing symbols. But actions might refer to that missing symbol. For example, x=ID {f($x);}. The action clearly assumes that there has been an identifier matched previously and that $x points at that token. If that token is missing, but the next token in the stream is what we want, we assume that this token is missing and we keep going. Because we have to return some token to replace the missing token, we have to conjure one up. This method gives the user control over the tokens returned for missing tokens. Mostly, you will want to create something special for identifier tokens. For literals such as '{' and ',', the default action in the parser or tree parser works. It simply creates a CommonToken of the appropriate type. The text will be the token. If you change what tokens must be created by the lexer, override this method to create the appropriate tokens.

protected GetMissingSymbol ( IIntStream input, RecognitionException e, int expectedTokenType, BitSet follow ) : object
input	IIntStream
e	RecognitionException
expectedTokenType	int
follow	BitSet
return	object
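To make the "conjure one up" step concrete, here is a small sketch that builds a placeholder token for a missing symbol. The SketchToken class, the "<missing NAME>" text format, and the choice to copy the line number from the current token are assumptions for illustration; a real override would create whatever token type the lexer normally produces.

```csharp
// Illustrative model only; SketchToken is a hypothetical stand-in for a token class.
public sealed class SketchToken
{
    public int Type { get; set; }
    public string Text { get; set; }
    public int Line { get; set; }
}

public static class MissingSymbolSketch
{
    // Build a placeholder token of the expected type so that actions such as
    // f($x) still receive a non-null token after recovery.
    public static SketchToken Conjure(int expectedTokenType, string[] tokenNames, SketchToken current)
    {
        string name = expectedTokenType >= 0 && expectedTokenType < tokenNames.Length
            ? tokenNames[expectedTokenType]
            : "?";
        return new SketchToken
        {
            Type = expectedTokenType,
            Text = "<missing " + name + ">",
            // Borrow position information from the token we are looking at, if any.
            Line = current?.Line ?? 0
        };
    }
}
```

A grammar with special identifier handling might return a token whose text is a generated unique name instead of the placeholder text used here.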

GetRuleInvocationStack() public method

Return an IList of the rules in your parser instance leading up to a call to this method. You could override if you want more details such as the file/line info of where in the parser code a rule is invoked.

This is very useful for error messages and for context-sensitive error recovery.

public GetRuleInvocationStack ( ) : IList
return	IList

GetRuleInvocationStack() public static method

A more general version of GetRuleInvocationStack where you can pass in the StackTrace of, for example, a RecognitionException to get its rule stack trace.

public static GetRuleInvocationStack ( System.Diagnostics.StackTrace trace ) : IList
trace	System.Diagnostics.StackTrace
return	IList
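One way to picture deriving a rule stack from a StackTrace is to keep only the frames whose methods are declared by the recognizer class. The sketch below does that with the standard System.Diagnostics API; the helper name RuleNames, the FakeParserSketch class, and the filtering rule are assumptions for illustration, and the runtime's own filtering may differ. Frames can also disappear when the JIT inlines methods, so the result is best-effort.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Runtime.CompilerServices;

// Illustrative model only; not the ANTLR runtime implementation.
public static class InvocationStackSketch
{
    // Collect the names of methods on `trace` declared by `recognizerType`,
    // ordered from the outermost caller to the innermost.
    public static IList<string> RuleNames(StackTrace trace, Type recognizerType)
    {
        var rules = new List<string>();
        StackFrame[] frames = trace.GetFrames() ?? new StackFrame[0];
        for (int i = frames.Length - 1; i >= 0; i--) // frame 0 is the innermost
        {
            var method = frames[i].GetMethod();
            if (method != null && method.DeclaringType == recognizerType)
                rules.Add(method.Name);
        }
        return rules;
    }
}

// Hypothetical parser with two "rule" methods, used only to exercise the helper.
public class FakeParserSketch
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    public IList<string> Program()
    {
        IList<string> inner = Statement();
        return new List<string>(inner); // copy so the call above is not a tail call
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public IList<string> Statement()
    {
        return InvocationStackSketch.RuleNames(new StackTrace(), typeof(FakeParserSketch));
    }
}
```

Calling new FakeParserSketch().Program() returns the rule-like method names that were active when the trace was captured.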

GetRuleMemoization() public method

Given a rule number and a start token index number, return MEMO_RULE_UNKNOWN if the rule has not parsed input starting from the start index. If this rule has parsed input starting from the start index before, then return where the rule stopped parsing, that is, the index of the last token matched by the rule.

For now we use a hashtable, and just the slow Object-based one. Later, we can make a special one for ints and also one that tosses out data after we commit past input position i.

public GetRuleMemoization ( int ruleIndex, int ruleStartIndex ) : int
ruleIndex	int
ruleStartIndex	int
return	int

GetRuleMemoizationCacheSize() public method

Return the total number of rule/input-index pairs.

public GetRuleMemoizationCacheSize ( ) : int
return	int

GetTokenErrorDisplay() public method

How should a token be displayed in an error message? The default is to display just the text, but during development you might want to have a lot of information spit out. Override in that case to use t.ToString() (which, for CommonToken, dumps everything about the token). This is better than forcing you to override a method in your token objects because you don't have to go modify your lexer so that it creates a new type.

public GetTokenErrorDisplay ( IToken t ) : string
t	IToken
return	string

InitDFAs() protected method

protected InitDFAs ( ) : void
return	void

Match() public method

Match current input symbol against ttype. Attempt single token insertion or deletion error recovery. If that fails, throw MismatchedTokenException.

To turn off single token insertion or deletion error recovery, override RecoverFromMismatchedToken() and have it throw an exception. See TreeParser.RecoverFromMismatchedToken(). This way any error in a rule will cause an exception and immediate exit from the rule. The rule would recover by resynchronizing to the set of symbols that can follow the rule reference.

public Match ( IIntStream input, int ttype, BitSet follow ) : object
input	IIntStream
ttype	int
follow	BitSet
return	object

MatchAny() public method

Match the wildcard: accept and consume any single input symbol.

public MatchAny ( IIntStream input ) : void
input	IIntStream
return	void

Memoize() public method

Record whether or not this rule parsed the input at this position successfully. Use a standard hashtable for now.

public Memoize ( IIntStream input, int ruleIndex, int ruleStartIndex ) : void
input	IIntStream
ruleIndex	int
ruleStartIndex	int
return	void
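The memoization methods in this class (AlreadyParsedRule, GetRuleMemoization, Memoize) fit together roughly as in the following sketch. It is a simplified model under stated assumptions: SketchStream stands in for a seekable input stream, the sentinel values -1 and -2 are chosen for illustration, and the explicit ruleFailed parameter replaces the shared recognizer state that the real method would consult.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a seekable input stream; not IIntStream.
public sealed class SketchStream
{
    public int Index { get; private set; }
    public void Seek(int index) { Index = index; }
}

// Illustrative model only; not the ANTLR runtime implementation.
public sealed class MemoSketch
{
    public const int MemoRuleUnknown = -1; // rule never attempted at this start index
    public const int MemoRuleFailed = -2;  // rule attempted at this start index and failed

    // One table per rule: start index -> stop index (or MemoRuleFailed).
    private readonly Dictionary<int, int>[] _memo;

    public bool Failed { get; private set; }

    public MemoSketch(int ruleCount)
    {
        _memo = new Dictionary<int, int>[ruleCount];
    }

    public int GetRuleMemoization(int ruleIndex, int ruleStartIndex)
    {
        Dictionary<int, int> table = _memo[ruleIndex];
        if (table != null && table.TryGetValue(ruleStartIndex, out int stop))
            return stop;
        return MemoRuleUnknown;
    }

    // Record the outcome of a rule that started at ruleStartIndex: the index
    // of the last token it matched on success, MemoRuleFailed otherwise.
    public void Memoize(SketchStream input, int ruleIndex, int ruleStartIndex, bool ruleFailed)
    {
        int stop = ruleFailed ? MemoRuleFailed : input.Index - 1;
        if (_memo[ruleIndex] == null)
            _memo[ruleIndex] = new Dictionary<int, int>();
        _memo[ruleIndex][ruleStartIndex] = stop;
    }

    // Side effect: on a remembered success, seek to one past the stop token.
    public bool AlreadyParsedRule(SketchStream input, int ruleIndex)
    {
        int stop = GetRuleMemoization(ruleIndex, input.Index);
        if (stop == MemoRuleUnknown) return false;
        if (stop == MemoRuleFailed) Failed = true;
        else input.Seek(stop + 1);
        return true;
    }
}
```

During backtracking, a rule method would call AlreadyParsedRule on entry and skip its body when the answer is already known, then call Memoize on exit.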

MismatchIsMissingToken() public method

public MismatchIsMissingToken ( IIntStream input, BitSet follow ) : bool
input	IIntStream
follow	BitSet
return	bool

MismatchIsUnwantedToken() public method

public MismatchIsUnwantedToken ( IIntStream input, int ttype ) : bool
input	IIntStream
ttype	int
return	bool

PopFollow() protected method

protected PopFollow ( ) : void
return	void

PushFollow() protected method

Push a rule's follow set using our own hardcoded stack

protected PushFollow ( BitSet fset ) : void
fset	BitSet
return	void

Recover() public method

Recover from an error found on the input stream. This is for NoViableAlt and mismatched symbol exceptions. If you enable single token insertion and deletion, this will usually not handle mismatched symbol exceptions, but there could be a mismatched token that the Match() routine could not recover from.

public Recover ( IIntStream input, RecognitionException re ) : void
input	IIntStream
re	RecognitionException
return	void

RecoverFromMismatchedSet() public method

public RecoverFromMismatchedSet ( IIntStream input, RecognitionException e, BitSet follow ) : object
input	IIntStream
e	RecognitionException
follow	BitSet
return	object

RecoverFromMismatchedToken() protected method

Attempt to recover from a single missing or extra token.

protected RecoverFromMismatchedToken ( IIntStream input, int ttype, BitSet follow ) : object
input	IIntStream
ttype	int
follow	BitSet
return	object
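The decision between treating a mismatch as an extra token or a missing token (compare MismatchIsUnwantedToken and MismatchIsMissingToken) can be sketched as follows. The list-based token stream, the RecoveryAction enum, and the plain set-membership test for the missing-token case are simplifying assumptions, not the runtime's implementation.

```csharp
using System.Collections.Generic;

public enum RecoveryAction { DeleteUnwantedToken, InsertMissingToken, Fail }

// Illustrative model only; not the ANTLR runtime implementation.
public static class MismatchRecoverySketch
{
    // tokenTypes[position] is the token that failed to match expectedType;
    // follow is the set of token types that may come after expectedType.
    public static RecoveryAction Decide(
        IReadOnlyList<int> tokenTypes, int position, int expectedType, ISet<int> follow)
    {
        // If the token after the current one is what we wanted, the current
        // token is most likely spurious: delete it and carry on.
        if (position + 1 < tokenTypes.Count && tokenTypes[position + 1] == expectedType)
            return RecoveryAction.DeleteUnwantedToken;

        // If the current token could legally follow the expected one, the
        // expected token is most likely missing: pretend it was there.
        if (position < tokenTypes.Count && follow.Contains(tokenTypes[position]))
            return RecoveryAction.InsertMissingToken;

        // Neither single-token repair applies; the caller reports a mismatch.
        return RecoveryAction.Fail;
    }
}
```

In the insertion case the recognizer still needs a token object to hand back to the rule, which is where a conjured missing symbol comes in.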

ReportError() public method

Report a recognition problem.

This method sets errorRecovery to indicate the parser is recovering, not parsing. Once in recovery mode, no errors are generated. To get out of recovery mode, the parser must successfully match a token (after a resync). So it will go:

1. error occurs
2. enter recovery mode, report error
3. consume until token found in resynch set
4. try to resume parsing
5. next Match() will reset errorRecovery mode

If you override, make sure to update syntaxErrors if you care about that.

public ReportError ( RecognitionException e ) : void
e	RecognitionException
return	void
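The recovery-mode behaviour described for ReportError can be modelled with a single flag and a counter. This is a minimal sketch under assumptions: the class and member names are hypothetical, error messages are collected in a list instead of being emitted, and OnSuccessfulMatch stands in for whatever a successful match does to leave recovery mode.

```csharp
using System.Collections.Generic;

// Illustrative model only; not the ANTLR runtime implementation.
public sealed class ErrorReportingSketch
{
    public bool ErrorRecovery { get; private set; }
    public int SyntaxErrors { get; private set; }
    public List<string> Messages { get; } = new List<string>();

    public void ReportError(string message)
    {
        // Already recovering: suppress follow-on errors caused by the same problem.
        if (ErrorRecovery) return;

        SyntaxErrors++;       // keep the count in sync, as an override should
        ErrorRecovery = true; // enter recovery mode
        Messages.Add(message);
    }

    // A successful match after resynchronization leaves recovery mode.
    public void OnSuccessfulMatch()
    {
        ErrorRecovery = false;
    }
}
```

With this model, two errors reported back to back produce one message, and a further error reported after a successful match produces a second one.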

Reset() public method

Reset the parser's state; subclasses must rewind the input stream.

public Reset ( ) : void
return	void

SetState() public method

public SetState ( RecognizerSharedState value ) : void
value	RecognizerSharedState
return	void

ToStrings() public method

A convenience method for use most often with template rewrites. Converts a list of IToken objects to a list of strings.

public ToStrings ( ICollection tokens ) : List
tokens	ICollection
return	List

TraceIn() public method

public TraceIn ( string ruleName, int ruleIndex, object inputSymbol ) : void
ruleName	string
ruleIndex	int
inputSymbol	object
return	void

TraceOut() public method

public TraceOut ( string ruleName, int ruleIndex, object inputSymbol ) : void
ruleName	string
ruleIndex	int
inputSymbol	object
return	void

Property Details

state protected property

The state of a lexer, parser, or tree parser is collected into a state object so that it can be shared. This sharing is needed to have one grammar import others and share the same error variables and other state variables. It's a kind of explicit multiple inheritance via delegation of methods and shared state.

protected RecognizerSharedState state
return	RecognizerSharedState