C# Class Antlr.Runtime.BaseRecognizer

A generic recognizer that can handle recognizers generated from lexer, parser, and tree grammars. This is essentially all of the parsing support code; most of it is error recovery and backtracking.
Project: antlr/antlrcs

Protected Properties

Property Type Description
state RecognizerSharedState

Public Methods

Method Description
AlreadyParsedRule ( IIntStream input, int ruleIndex ) : bool

Has this rule already parsed input at the current index in the input stream? Return the stop token index or MEMO_RULE_UNKNOWN. If we attempted but failed to parse properly before, return MEMO_RULE_FAILED.

This method has a side effect: if we have seen this input for this rule and successfully parsed before, then seek ahead to 1 past the stop token matched for this rule last time.

BaseRecognizer ( )
BaseRecognizer ( Antlr.Runtime.RecognizerSharedState state )
BeginResync ( ) : void

A hook to listen in on the token consumption during error recovery. The DebugParser subclasses this to fire events to the listener.

ConsumeUntil ( IIntStream input, BitSet set ) : void

Consume tokens until one matches the given token set

ConsumeUntil ( IIntStream input, int tokenType ) : void
DisplayRecognitionError ( string tokenNames, RecognitionException e ) : void
EmitErrorMessage ( string msg ) : void

Override this method to change where error messages go

EndResync ( ) : void
GetErrorHeader ( RecognitionException e ) : string

What is the error header, normally line/character position information?

GetErrorMessage ( RecognitionException e, string tokenNames ) : string

What error message should be generated for the various exception types?

Not very object-oriented code, but I like having all error message generation within one method rather than spread among all of the exception classes. This also makes it much easier for the exception handling because the exception classes do not have to have pointers back to this object to access utility routines and so on. Also, changing the message for an exception type would be difficult because you would have to subclass the exception, but then somehow get ANTLR to make those kinds of exception objects instead of the default. This looks weird, but trust me--it makes the most sense in terms of flexibility.

For grammar debugging, you will want to override this to add more information such as the stack frame with getRuleInvocationStack(e, this.getClass().getName()) and, for no viable alts, the decision description and state etc.

Override this to change the message generated for one or more exception types.

GetRuleInvocationStack ( ) : IList

Return IList{T} of the rules in your parser instance leading up to a call to this method. You could override if you want more details such as the file/line info of where in the parser Java code a rule is invoked.

This is very useful for error messages and for context-sensitive error recovery.

GetRuleInvocationStack ( System.Diagnostics.StackTrace trace ) : IList

A more general version of GetRuleInvocationStack where you can pass in the StackTrace of, for example, a RecognitionException to get its rule stack trace.

GetRuleMemoization ( int ruleIndex, int ruleStartIndex ) : int

Given a rule number and a start token index number, return MEMO_RULE_UNKNOWN if the rule has not parsed input starting from start index. If this rule has parsed input starting from the start index before, then return where the rule stopped parsing. It returns the index of the last token matched by the rule.

For now we use a hashtable and just the slow Object-based one. Later, we can make a special one for ints and also one that tosses out data after we commit past input position i.

GetRuleMemoizationCacheSize ( ) : int

Return how many rule/input-index pairs there are in total.

GetTokenErrorDisplay ( IToken t ) : string

How should a token be displayed in an error message? The default is to display just the text, but during development you might want to have a lot of information spit out. Override in that case to use t.ToString() (which, for CommonToken, dumps everything about the token). This is better than forcing you to override a method in your token objects because you don't have to go modify your lexer so that it creates a new Java type.

Match ( IIntStream input, int ttype, BitSet follow ) : object

Match current input symbol against ttype. Attempt single token insertion or deletion error recovery. If that fails, throw MismatchedTokenException.

To turn off single token insertion or deletion error recovery, override RecoverFromMismatchedToken() and have it throw an exception. See TreeParser.RecoverFromMismatchedToken(). This way any error in a rule will cause an exception and immediate exit from the rule. The rule would recover by resynchronizing to the set of symbols that can follow the rule reference.

MatchAny ( IIntStream input ) : void

Match the wildcard: consume the next input symbol, whatever it is.

Memoize ( IIntStream input, int ruleIndex, int ruleStartIndex ) : void

Record whether or not this rule parsed the input at this position successfully. Use a standard Java hashtable for now.

MismatchIsMissingToken ( IIntStream input, BitSet follow ) : bool
MismatchIsUnwantedToken ( IIntStream input, int ttype ) : bool
Recover ( IIntStream input, RecognitionException re ) : void

Recover from an error found on the input stream. This is for NoViableAlt and mismatched symbol exceptions. If you enable single token insertion and deletion, this will usually not handle mismatched symbol exceptions, but there could be a mismatched token that the Match() routine could not recover from.

RecoverFromMismatchedSet ( IIntStream input, RecognitionException e, BitSet follow ) : object
ReportError ( RecognitionException e ) : void

Report a recognition problem.

This method sets errorRecovery to indicate the parser is recovering, not parsing. Once in recovery mode, no errors are generated. To get out of recovery mode, the parser must successfully match a token (after a resync). So it will go:

1. error occurs
2. enter recovery mode, report error
3. consume until token found in resynch set
4. try to resume parsing
5. next Match() will reset errorRecovery mode

If you override, make sure to update syntaxErrors if you care about that.

Reset ( ) : void

Reset the parser's state; subclasses must rewind the input stream.

SetState ( RecognizerSharedState value ) : void
ToStrings ( ICollection tokens ) : List

A convenience method for use most often with template rewrites. Convert a list of IToken to a list of string.

TraceIn ( string ruleName, int ruleIndex, object inputSymbol ) : void
TraceOut ( string ruleName, int ruleIndex, object inputSymbol ) : void

Protected Methods

Method Description
CombineFollows ( bool exact ) : BitSet
ComputeContextSensitiveRuleFOLLOW ( ) : BitSet

Compute the context-sensitive FOLLOW set for the current rule. This is the set of token types that can follow a specific rule reference given a specific call chain. You get the set of viable tokens that can possibly come next (lookahead depth 1) given the current call chain. Contrast this with the definition of plain FOLLOW for rule r.

ComputeErrorRecoverySet ( ) : BitSet
GetCurrentInputSymbol ( IIntStream input ) : object

Match needs to return the current input symbol, which gets put into the label for the associated token ref; e.g., x=ID. Token and tree parsers need to return different objects. Rather than test for input stream type or change the IntStream interface, I use a simple method to ask the recognizer to tell me what the current input symbol is.

This is ignored for lexers.

GetMissingSymbol ( IIntStream input, RecognitionException e, int expectedTokenType, BitSet follow ) : object

Conjure up a missing token during error recovery.

The recognizer attempts to recover from single missing symbols. But, actions might refer to that missing symbol. For example, x=ID {f($x);}. The action clearly assumes that there has been an identifier matched previously and that $x points at that token. If that token is missing, but the next token in the stream is what we want, we assume that this token is missing and we keep going. Because we have to return some token to replace the missing token, we have to conjure one up. This method gives the user control over the tokens returned for missing tokens. Mostly, you will want to create something special for identifier tokens. For literals such as '{' and ',', the default action in the parser or tree parser works. It simply creates a CommonToken of the appropriate type. The text will be the token. If you change what tokens must be created by the lexer, override this method to create the appropriate tokens.

InitDFAs ( ) : void
PopFollow ( ) : void
PushFollow ( BitSet fset ) : void

Push a rule's follow set using our own hardcoded stack

RecoverFromMismatchedToken ( IIntStream input, int ttype, BitSet follow ) : object

Attempt to recover from a single missing or extra token.

Private Methods

Method Description
DebugBeginBacktrack ( int level ) : void
DebugEndBacktrack ( int level, bool successful ) : void
DebugEnterAlt ( int alt ) : void
DebugEnterDecision ( int decisionNumber, bool couldBacktrack ) : void
DebugEnterRule ( string grammarFileName, string ruleName ) : void
DebugEnterSubRule ( int decisionNumber ) : void
DebugExitDecision ( int decisionNumber ) : void
DebugExitRule ( string grammarFileName, string ruleName ) : void
DebugExitSubRule ( int decisionNumber ) : void
DebugLocation ( int line, int charPositionInLine ) : void
DebugRecognitionException ( RecognitionException ex ) : void
DebugSemanticPredicate ( bool result, string predicate ) : void

Method Details

AlreadyParsedRule() public method

Has this rule already parsed input at the current index in the input stream? Return the stop token index or MEMO_RULE_UNKNOWN. If we attempted but failed to parse properly before, return MEMO_RULE_FAILED.
This method has a side effect: if we have seen this input for this rule and successfully parsed before, then seek ahead to 1 past the stop token matched for this rule last time.
public AlreadyParsedRule ( IIntStream input, int ruleIndex ) : bool
input IIntStream
ruleIndex int
return bool
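
The memoization check and its seek-ahead side effect can be sketched as follows. This is a minimal, self-contained illustration and not the antlrcs source: the class name `MemoSketch`, the sentinel values, the `Position` and `Failed` fields, and the three-argument `Memoize` that takes the stop index directly are all assumptions made for the example. Following the documented signature, `AlreadyParsedRule` here returns a bool and leaves the stop index in the memo table.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the recognizer's memoization state.
public class MemoSketch
{
    // Sentinel values are assumptions for this sketch.
    public const int MemoRuleUnknown = -1;
    public const int MemoRuleFailed = -2;

    // ruleIndex -> (ruleStartIndex -> stop token index, or MemoRuleFailed)
    private readonly Dictionary<int, Dictionary<int, int>> memo =
        new Dictionary<int, Dictionary<int, int>>();

    public int Position;  // current index in an imaginary input stream
    public bool Failed;   // set when a memoized failure is replayed

    // Record the outcome of a rule attempt that started at ruleStartIndex.
    public void Memoize(int ruleIndex, int ruleStartIndex, int stopIndexOrFailed)
    {
        Dictionary<int, int> perRule;
        if (!memo.TryGetValue(ruleIndex, out perRule))
        {
            perRule = new Dictionary<int, int>();
            memo[ruleIndex] = perRule;
        }
        perRule[ruleStartIndex] = stopIndexOrFailed;
    }

    // Look up a previous attempt; MemoRuleUnknown means "never tried here".
    public int GetRuleMemoization(int ruleIndex, int ruleStartIndex)
    {
        Dictionary<int, int> perRule;
        int stop;
        if (memo.TryGetValue(ruleIndex, out perRule) &&
            perRule.TryGetValue(ruleStartIndex, out stop))
        {
            return stop;
        }
        return MemoRuleUnknown;
    }

    // True if the rule was attempted at the current position before.
    // Side effect: on a memoized success, jump to one past the stop token.
    public bool AlreadyParsedRule(int ruleIndex)
    {
        int stop = GetRuleMemoization(ruleIndex, Position);
        if (stop == MemoRuleUnknown) return false;
        if (stop == MemoRuleFailed) Failed = true;
        else Position = stop + 1;
        return true;
    }
}
```

A caller would typically test `AlreadyParsedRule` at the top of a backtracking rule and return immediately when it yields true, so the rule body is not re-executed on the same input.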

BaseRecognizer() public method

public BaseRecognizer ( )

BaseRecognizer() public method

public BaseRecognizer ( Antlr.Runtime.RecognizerSharedState state )
state Antlr.Runtime.RecognizerSharedState

BeginResync() public method

A hook to listen in on the token consumption during error recovery. The DebugParser subclasses this to fire events to the listener.
public BeginResync ( ) : void
return void

CombineFollows() protected method

protected CombineFollows ( bool exact ) : BitSet
exact bool
return BitSet

ComputeContextSensitiveRuleFOLLOW() protected method

Compute the context-sensitive FOLLOW set for the current rule. This is the set of token types that can follow a specific rule reference given a specific call chain. You get the set of viable tokens that can possibly come next (lookahead depth 1) given the current call chain. Contrast this with the definition of plain FOLLOW for rule r.
protected ComputeContextSensitiveRuleFOLLOW ( ) : BitSet
return BitSet

ComputeErrorRecoverySet() protected method

protected ComputeErrorRecoverySet ( ) : BitSet
return BitSet

ConsumeUntil() public method

Consume tokens until one matches the given token set
public ConsumeUntil ( IIntStream input, BitSet set ) : void
input IIntStream
set BitSet
return void
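
As a rough illustration of the resynchronization loop, the sketch below skips token types until one is a member of the given set or the input ends. It is not the runtime's implementation: `ResyncSketch`, the `Eof` value, and the use of a plain list of token types with an explicit index (instead of IIntStream and BitSet) are assumptions made to keep the example self-contained.

```csharp
using System.Collections.Generic;

// Hypothetical helper modelling "consume until a token in the set is seen".
public static class ResyncSketch
{
    public const int Eof = -1; // assumed end-of-input token type

    // Advances from 'index' until the current token type is in 'set',
    // is Eof, or the list is exhausted. Returns the new index.
    public static int ConsumeUntil(IList<int> tokenTypes, int index, ISet<int> set)
    {
        while (index < tokenTypes.Count
               && tokenTypes[index] != Eof
               && !set.Contains(tokenTypes[index]))
        {
            index++; // "consume" one token
        }
        return index;
    }
}
```

The token that stops the loop is left unconsumed, so parsing can resume by matching it.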

ConsumeUntil() public method

public ConsumeUntil ( IIntStream input, int tokenType ) : void
input IIntStream
tokenType int
return void

DisplayRecognitionError() public method

public DisplayRecognitionError ( string tokenNames, RecognitionException e ) : void
tokenNames string
e RecognitionException
return void

EmitErrorMessage() public method

Override this method to change where error messages go
public EmitErrorMessage ( string msg ) : void
msg string
return void
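
One way to picture the override point is a subclass that collects messages in memory instead of printing them. The sketch below uses a hypothetical stand-in base class (`ReporterSketch`) rather than BaseRecognizer itself, and the "line N:M" header format and the `DisplayError` helper are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the recognizer's error-reporting pipeline.
public class ReporterSketch
{
    // Default destination in this sketch: standard error.
    public virtual void EmitErrorMessage(string msg)
    {
        Console.Error.WriteLine(msg);
    }

    // Builds "header message" and hands it to EmitErrorMessage.
    public void DisplayError(int line, int charPositionInLine, string message)
    {
        string header = "line " + line + ":" + charPositionInLine;
        EmitErrorMessage(header + " " + message);
    }
}

// Redirects messages into a list, e.g. for showing them in a UI or a test.
public class CollectingReporterSketch : ReporterSketch
{
    public readonly List<string> Errors = new List<string>();

    public override void EmitErrorMessage(string msg)
    {
        Errors.Add(msg);
    }
}
```

Because only the final emit step is overridden, the header and message formatting stay unchanged and only the destination differs.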

EndResync() public method

public EndResync ( ) : void
return void

GetCurrentInputSymbol() protected method

Match needs to return the current input symbol, which gets put into the label for the associated token ref; e.g., x=ID. Token and tree parsers need to return different objects. Rather than test for input stream type or change the IntStream interface, I use a simple method to ask the recognizer to tell me what the current input symbol is.
This is ignored for lexers.
protected GetCurrentInputSymbol ( IIntStream input ) : object
input IIntStream
return object

GetErrorHeader() public method

What is the error header, normally line/character position information?
public GetErrorHeader ( RecognitionException e ) : string
e RecognitionException
return string

GetErrorMessage() public method

What error message should be generated for the various exception types?
Not very object-oriented code, but I like having all error message generation within one method rather than spread among all of the exception classes. This also makes it much easier for the exception handling because the exception classes do not have to have pointers back to this object to access utility routines and so on. Also, changing the message for an exception type would be difficult because you would have to subclass the exception, but then somehow get ANTLR to make those kinds of exception objects instead of the default. This looks weird, but trust me--it makes the most sense in terms of flexibility.
For grammar debugging, you will want to override this to add more information such as the stack frame with getRuleInvocationStack(e, this.getClass().getName()) and, for no viable alts, the decision description and state etc.
Override this to change the message generated for one or more exception types.
public GetErrorMessage ( RecognitionException e, string tokenNames ) : string
e RecognitionException
tokenNames string
return string

GetMissingSymbol() protected method

Conjure up a missing token during error recovery.
The recognizer attempts to recover from single missing symbols. But, actions might refer to that missing symbol. For example, x=ID {f($x);}. The action clearly assumes that there has been an identifier matched previously and that $x points at that token. If that token is missing, but the next token in the stream is what we want, we assume that this token is missing and we keep going. Because we have to return some token to replace the missing token, we have to conjure one up. This method gives the user control over the tokens returned for missing tokens. Mostly, you will want to create something special for identifier tokens. For literals such as '{' and ',', the default action in the parser or tree parser works. It simply creates a CommonToken of the appropriate type. The text will be the token. If you change what tokens must be created by the lexer, override this method to create the appropriate tokens.
protected GetMissingSymbol ( IIntStream input, RecognitionException e, int expectedTokenType, BitSet follow ) : object
input IIntStream
e RecognitionException
expectedTokenType int
follow BitSet
return object

GetRuleInvocationStack() public method

Return IList{T} of the rules in your parser instance leading up to a call to this method. You could override if you want more details such as the file/line info of where in the parser Java code a rule is invoked.
This is very useful for error messages and for context-sensitive error recovery.
public GetRuleInvocationStack ( ) : IList
return IList

GetRuleInvocationStack() public static method

A more general version of GetRuleInvocationStack where you can pass in the StackTrace of, for example, a RecognitionException to get its rule stack trace.
public static GetRuleInvocationStack ( System.Diagnostics.StackTrace trace ) : IList
trace System.Diagnostics.StackTrace
return IList

GetRuleMemoization() public method

Given a rule number and a start token index number, return MEMO_RULE_UNKNOWN if the rule has not parsed input starting from start index. If this rule has parsed input starting from the start index before, then return where the rule stopped parsing. It returns the index of the last token matched by the rule.
For now we use a hashtable and just the slow Object-based one. Later, we can make a special one for ints and also one that tosses out data after we commit past input position i.
public GetRuleMemoization ( int ruleIndex, int ruleStartIndex ) : int
ruleIndex int
ruleStartIndex int
return int

GetRuleMemoizationCacheSize() public method

Return how many rule/input-index pairs there are in total.
public GetRuleMemoizationCacheSize ( ) : int
return int

GetTokenErrorDisplay() public method

How should a token be displayed in an error message? The default is to display just the text, but during development you might want to have a lot of information spit out. Override in that case to use t.ToString() (which, for CommonToken, dumps everything about the token). This is better than forcing you to override a method in your token objects because you don't have to go modify your lexer so that it creates a new Java type.
public GetTokenErrorDisplay ( IToken t ) : string
t IToken
return string
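
A display routine of this kind might look like the sketch below, which shows only the token text and makes whitespace visible. The fallback to the token type for a token without text, the escaping of newline, carriage return, and tab, and the surrounding single quotes are assumptions for this example, as is the name `TokenDisplaySketch`; it takes the text and type directly rather than an IToken.

```csharp
// Hypothetical formatter for showing a token inside an error message.
public static class TokenDisplaySketch
{
    public static string Display(string text, int type)
    {
        // Fall back to the token type when there is no text (assumption).
        string s = text ?? ("<" + type + ">");

        // Make control characters visible instead of breaking the message.
        s = s.Replace("\n", "\\n").Replace("\r", "\\r").Replace("\t", "\\t");

        return "'" + s + "'";
    }
}
```

During development, an override could return the token's full ToString() output instead of this short form.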

InitDFAs() protected method

protected InitDFAs ( ) : void
return void

Match() public method

Match current input symbol against ttype. Attempt single token insertion or deletion error recovery. If that fails, throw MismatchedTokenException.
To turn off single token insertion or deletion error recovery, override RecoverFromMismatchedToken() and have it throw an exception. See TreeParser.RecoverFromMismatchedToken(). This way any error in a rule will cause an exception and immediate exit from the rule. The rule would recover by resynchronizing to the set of symbols that can follow the rule reference.
public Match ( IIntStream input, int ttype, BitSet follow ) : object
input IIntStream
ttype int
follow BitSet
return object
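
The match-then-repair idea can be sketched in a simplified, self-contained form. This is not the runtime's code: `MatchSketch` works on an array of token types, treats "the next token is the expected one" as the test for an extra token, treats "the current token is in the follow set" as the test for a missing token, and throws InvalidOperationException where the real method would throw MismatchedTokenException. All of these simplifications are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of matching with single-token deletion and insertion.
public class MatchSketch
{
    public const int Eof = -1;   // assumed end-of-input token type
    private readonly int[] tokens;
    private int p;               // index of the current token
    public readonly List<string> Repairs = new List<string>();

    public MatchSketch(int[] tokenTypes) { tokens = tokenTypes; }

    public int Index { get { return p; } }

    // Lookahead: type of the k-th token from the current position (k >= 1).
    private int LA(int k)
    {
        int i = p + k - 1;
        return i < tokens.Length ? tokens[i] : Eof;
    }

    // Returns the matched (or conjured) token type.
    public int Match(int ttype, ISet<int> follow)
    {
        if (LA(1) == ttype) { p++; return ttype; }

        // Single-token deletion: the token after the current one is expected,
        // so treat the current token as extra and skip it.
        if (LA(2) == ttype)
        {
            Repairs.Add("deleted " + LA(1));
            p += 2; // skip the extra token, then consume the expected one
            return ttype;
        }

        // Single-token insertion: the current token could follow the expected
        // one, so pretend the expected token was there and consume nothing.
        if (follow.Contains(LA(1)))
        {
            Repairs.Add("inserted " + ttype);
            return ttype;
        }

        throw new InvalidOperationException(
            "mismatched token: expected " + ttype + ", found " + LA(1));
    }
}
```

Turning the repairs off, as the description suggests, corresponds to replacing the two repair branches with an unconditional throw.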

MatchAny() public method

Match the wildcard: consume the next input symbol, whatever it is.
public MatchAny ( IIntStream input ) : void
input IIntStream
return void

Memoize() public method

Record whether or not this rule parsed the input at this position successfully. Use a standard Java hashtable for now.
public Memoize ( IIntStream input, int ruleIndex, int ruleStartIndex ) : void
input IIntStream
ruleIndex int
ruleStartIndex int
return void

MismatchIsMissingToken() public method

public MismatchIsMissingToken ( IIntStream input, BitSet follow ) : bool
input IIntStream
follow BitSet
return bool

MismatchIsUnwantedToken() public method

public MismatchIsUnwantedToken ( IIntStream input, int ttype ) : bool
input IIntStream
ttype int
return bool

PopFollow() protected method

protected PopFollow ( ) : void
return void

PushFollow() protected method

Push a rule's follow set using our own hardcoded stack
protected PushFollow ( BitSet fset ) : void
fset BitSet
return void

Recover() public method

Recover from an error found on the input stream. This is for NoViableAlt and mismatched symbol exceptions. If you enable single token insertion and deletion, this will usually not handle mismatched symbol exceptions, but there could be a mismatched token that the Match() routine could not recover from.
public Recover ( IIntStream input, RecognitionException re ) : void
input IIntStream
re RecognitionException
return void

RecoverFromMismatchedSet() public method

public RecoverFromMismatchedSet ( IIntStream input, RecognitionException e, BitSet follow ) : object
input IIntStream
e RecognitionException
follow BitSet
return object

RecoverFromMismatchedToken() protected method

Attempt to recover from a single missing or extra token.
protected RecoverFromMismatchedToken ( IIntStream input, int ttype, BitSet follow ) : object
input IIntStream
ttype int
follow BitSet
return object

ReportError() public method

Report a recognition problem.
This method sets errorRecovery to indicate the parser is recovering, not parsing. Once in recovery mode, no errors are generated. To get out of recovery mode, the parser must successfully match a token (after a resync). So it will go:
1. error occurs
2. enter recovery mode, report error
3. consume until token found in resynch set
4. try to resume parsing
5. next Match() will reset errorRecovery mode
If you override, make sure to update syntaxErrors if you care about that.
public ReportError ( RecognitionException e ) : void
e RecognitionException
return void
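
The recovery-mode behavior described for ReportError can be modelled with a single flag. The sketch below is an assumption-laden illustration, not the runtime's code: `RecoveryModeSketch`, its field names, the string-based error, and the `OnSuccessfulMatch` hook (standing in for a successful match resetting the flag) are hypothetical.

```csharp
using System.Collections.Generic;

// Hypothetical model of "report once, then stay quiet until a token matches".
public class RecoveryModeSketch
{
    public bool ErrorRecovery;  // true while the parser is resynchronizing
    public int SyntaxErrors;    // count of errors actually reported
    public readonly List<string> Reported = new List<string>();

    public void ReportError(string message)
    {
        if (ErrorRecovery) return; // already recovering: suppress follow-on errors
        SyntaxErrors++;
        ErrorRecovery = true;      // enter recovery mode
        Reported.Add(message);
    }

    // Stands in for a successful match after resync, which leaves recovery mode.
    public void OnSuccessfulMatch()
    {
        ErrorRecovery = false;
    }
}
```

The flag keeps one bad token from producing a cascade of messages: errors raised while resynchronizing are dropped, and reporting resumes only after a token has been matched successfully.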

Reset() public method

Reset the parser's state; subclasses must rewind the input stream.
public Reset ( ) : void
return void

SetState() public method

public SetState ( RecognizerSharedState value ) : void
value RecognizerSharedState
return void

ToStrings() public method

A convenience method for use most often with template rewrites. Convert a list of IToken to a list of string.
public ToStrings ( ICollection tokens ) : List
tokens ICollection
return List
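
The conversion amounts to collecting each token's text into a new list. The sketch below uses a hypothetical `ITextToken` interface and `SimpleToken` class in place of the runtime's IToken so that it compiles on its own; returning null for a null input is also an assumption.

```csharp
using System.Collections.Generic;

// Hypothetical minimal token interface; the runtime's IToken has more members.
public interface ITextToken
{
    string Text { get; }
}

public class SimpleToken : ITextToken
{
    private readonly string text;
    public SimpleToken(string text) { this.text = text; }
    public string Text { get { return text; } }
}

public static class ToStringsSketch
{
    // Returns the text of each token, in order.
    public static List<string> ToStrings(ICollection<ITextToken> tokens)
    {
        if (tokens == null) return null;
        var strings = new List<string>(tokens.Count);
        foreach (ITextToken t in tokens)
        {
            strings.Add(t.Text);
        }
        return strings;
    }
}
```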

TraceIn() public method

public TraceIn ( string ruleName, int ruleIndex, object inputSymbol ) : void
ruleName string
ruleIndex int
inputSymbol object
return void

TraceOut() public method

public TraceOut ( string ruleName, int ruleIndex, object inputSymbol ) : void
ruleName string
ruleIndex int
inputSymbol object
return void

Property Details

state protected property

The state of a lexer, parser, or tree parser is collected into a state object so the state can be shared. This sharing is needed to have one grammar import others and share the same error variables and other state variables. It's a kind of explicit multiple inheritance via delegation of methods and shared state.
protected RecognizerSharedState state
return RecognizerSharedState