C# Class Tidy.Core.Lexer

Lexer for the HTML parser. (c) 1998-2000 (W3C) MIT, INRIA, Keio University. See Tidy.cs for the copyright notice. Derived from HTML Tidy Release 4 Aug 2000.

Given a file stream fp, the lexer returns a sequence of tokens:
- GetToken(fp) gets the next token.
- UngetToken(fp) provides one level of undo.

This description still names the stream argument fp. In this class the stream is passed to the constructor as a StreamIn, and the methods are listed below as GetToken(short mode) and UngetToken().

The tags include an attribute list:
- a linked list of attribute/value nodes
- each node has 2 null-terminated strings
- entities are replaced in attribute values

White space is compacted unless the lexer is in preformatted mode: leading white space is discarded and each subsequent white space sequence is compacted to a single space character.

If XmlTags is no, tag names are folded to upper case and attribute names to lower case.

Not yet done:
- Doctype subset and marked sections
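The white space, case folding, and one-level undo rules described above can be illustrated with a small standalone sketch. It does not call Tidy.Core: LexerSketch, CompactWhiteSpace, FoldName, and PushbackTokens are hypothetical names invented for this example, and tokens are plain strings rather than Node objects. The Waswhite and Pushed properties listed below suggest that the real class tracks similar flags, but that is an assumption, not something this page confirms.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch: none of these types are part of Tidy.Core.
public static class LexerSketch
{
    // Drops leading white space and compacts each later run of white
    // space to one space. Preformatted text is returned unchanged.
    public static string CompactWhiteSpace(string text, bool preformatted)
    {
        if (preformatted)
        {
            return text;
        }

        var result = new StringBuilder();
        bool wasWhite = true; // starts true so leading white space is skipped
        foreach (char c in text)
        {
            if (char.IsWhiteSpace(c))
            {
                if (!wasWhite)
                {
                    result.Append(' ');
                }
                wasWhite = true;
            }
            else
            {
                result.Append(c);
                wasWhite = false;
            }
        }
        return result.ToString();
    }

    // Folds a name to upper case (tag names) or lower case (attribute
    // names). Names are left untouched when xmlTags is true.
    public static string FoldName(string name, bool toCaps, bool xmlTags)
    {
        if (xmlTags)
        {
            return name;
        }
        return toCaps ? name.ToUpperInvariant() : name.ToLowerInvariant();
    }
}

// Hypothetical one-level undo over a sequence of string tokens.
public sealed class PushbackTokens
{
    private readonly IEnumerator<string> _source;
    private string _last;
    private bool _pushed;

    public PushbackTokens(IEnumerable<string> tokens)
    {
        _source = tokens.GetEnumerator();
    }

    // Returns the next token, or null at end of input.
    public string GetToken()
    {
        if (_pushed)
        {
            _pushed = false;
            return _last;
        }
        _last = _source.MoveNext() ? _source.Current : null;
        return _last;
    }

    // Makes the next GetToken call return the most recent token again.
    // Calling it twice in a row still undoes only one token.
    public void UngetToken()
    {
        _pushed = true;
    }
}
```

A single boolean is enough for the undo because only one level is promised: a parser that reads a token it cannot handle in its current state can push it back and let the caller read it again.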
Project: r1pper/TidyNetPortable

Public Properties

Property Type Description
BadAccess int
BadChars int
BadDoctype bool
BadForm int
BadLayout int
Columns int
Doctype HtmlVersion
ExcludeBlocks bool
Exiled bool
Inode Node
Input StreamIn
Insert int
Insertspace bool
Istack Stack
Istackbase int
Isvoyager bool
Lexbuf byte[]
Lexlength int
Lexsize int
Lines int
Messages TidyMessageCollection
Options TidyOptions
Pushed bool
State short
Styles Style
Token Node
Txtend int
Txtstart int
Versions HtmlVersion
Waswhite bool

Protected Properties

Property Type Description
SeenBodyEndTag int

Public Methods

Method Description
AddByte ( int c ) : void
AddCharToLexer ( int c ) : void
AddGenerator ( Node root ) : bool
AddStringLiteral ( string str ) : void
AddStringToLexer ( string str ) : void
ApparentVersion ( ) : HtmlVersion
CanPrune ( Node element ) : bool
ChangeChar ( byte c ) : void
CheckDocTypeKeyWords ( Node doctype ) : bool
CloneAttributes ( AttVal attrs ) : AttVal
CloneNode ( Node node ) : Node
DeferDup ( ) : void
EndOfInput ( ) : bool
ExpectsContent ( Node node ) : bool
FindGivenVersion ( Node doctype ) : HtmlVersion
FixDocType ( Node root ) : bool
FixHtmlNameSpace ( Node root, string profile ) : void
FixId ( Node node ) : void
FixXmlPi ( Node root ) : bool
FoldCase ( char c, bool tocaps, bool xmlTags ) : char
GetBytes ( string str ) : byte[]
GetCdata ( Node container ) : Node
GetHtmlVersion ( ) : HtmlVersion
GetString ( byte[] bytes, int offset, int length ) : string
GetToken ( short mode ) : Node
HtmlVersionName ( ) : string
InferredTag ( string name ) : Node
InlineDup ( Node node ) : int
InsertedToken ( ) : Node
IsPushed ( Node node ) : bool
IsValidAttrName ( string attr ) : bool
Lexer ( StreamIn input, TidyOptions options ) (constructor)
NewLineNode ( ) : Node
NewNode ( ) : Node
NewNode ( short type, byte[] textarray, int start, int end ) : Node
NewNode ( short type, byte[] textarray, int start, int end, string element ) : Node
ParseAsp ( ) : Node
ParseAttribute ( MutableBoolean isempty, MutableObject asp, MutableObject php ) : string
ParseAttrs ( MutableBoolean isempty ) : AttVal
ParseEntity ( short mode ) : void
ParsePhp ( ) : Node
ParseServerInstruction ( ) : int
ParseTagName ( ) : char
ParseValue ( string name, bool foldCase, MutableBoolean isempty, MutableInteger pdelim ) : string
PopInline ( Node node ) : void
PushInline ( Node node ) : void
SetXhtmlDocType ( Node root ) : bool
UngetToken ( ) : void

Protected Methods

Method Description
UpdateNodeTextArrays ( byte[] oldtextarray, byte[] newtextarray ) : void

Private Methods

Method Description
FindBadSubString ( string s, string p, int len ) : bool
Lexer ( ) (constructor)
Map ( char c ) : short
MapStr ( string str, int code ) : void

Method Details

AddByte() public method

public AddByte ( int c ) : void
c int
return void

AddCharToLexer() public method

public AddCharToLexer ( int c ) : void
c int
return void

AddGenerator() public method

public AddGenerator ( Node root ) : bool
root Node
return bool

AddStringLiteral() public method

public AddStringLiteral ( string str ) : void
str string
return void

AddStringToLexer() public method

public AddStringToLexer ( string str ) : void
str string
return void

ApparentVersion() public method

public ApparentVersion ( ) : HtmlVersion
return HtmlVersion

CanPrune() public method

public CanPrune ( Node element ) : bool
element Node
return bool

ChangeChar() public method

public ChangeChar ( byte c ) : void
c byte
return void

CheckDocTypeKeyWords() public method

public CheckDocTypeKeyWords ( Node doctype ) : bool
doctype Node
return bool

CloneAttributes() public method

public CloneAttributes ( AttVal attrs ) : AttVal
attrs AttVal
return AttVal

CloneNode() public method

public CloneNode ( Node node ) : Node
node Node
return Node

DeferDup() public method

public DeferDup ( ) : void
return void

EndOfInput() public method

public EndOfInput ( ) : bool
return bool

ExpectsContent() public static method

public static ExpectsContent ( Node node ) : bool
node Node
return bool

FindGivenVersion() public method

public FindGivenVersion ( Node doctype ) : HtmlVersion
doctype Node
return HtmlVersion

FixDocType() public method

public FixDocType ( Node root ) : bool
root Node
return bool

FixHtmlNameSpace() public method

public FixHtmlNameSpace ( Node root, string profile ) : void
root Node
profile string
return void

FixId() public method

public FixId ( Node node ) : void
node Node
return void

FixXmlPi() public method

public FixXmlPi ( Node root ) : bool
root Node
return bool

FoldCase() public static method

public static FoldCase ( char c, bool tocaps, bool xmlTags ) : char
c char
tocaps bool
xmlTags bool
return char

GetBytes() public static method

public static GetBytes ( string str ) : byte[]
str string
return byte[]

GetCdata() public method

public GetCdata ( Node container ) : Node
container Node
return Node

GetHtmlVersion() public method

public GetHtmlVersion ( ) : HtmlVersion
return HtmlVersion

GetString() public static method

public static GetString ( byte[] bytes, int offset, int length ) : string
bytes byte[]
offset int
length int
return string

GetToken() public method

public GetToken ( short mode ) : Node
mode short
return Node

HtmlVersionName() public method

public HtmlVersionName ( ) : string
return string

InferredTag() public method

public InferredTag ( string name ) : Node
name string
return Node

InlineDup() public method

public InlineDup ( Node node ) : int
node Node
return int

InsertedToken() public method

public InsertedToken ( ) : Node
return Node

IsPushed() public method

public IsPushed ( Node node ) : bool
node Node
return bool

IsValidAttrName() public static method

public static IsValidAttrName ( string attr ) : bool
attr string
return bool

Lexer() public constructor

public Lexer ( StreamIn input, TidyOptions options )
input StreamIn
options TidyOptions

NewLineNode() public method

public NewLineNode ( ) : Node
return Node

NewNode() public method

public NewNode ( ) : Node
return Node

NewNode() public method

public NewNode ( short type, byte[] textarray, int start, int end ) : Node
type short
textarray byte[]
start int
end int
return Node

NewNode() public method

public NewNode ( short type, byte[] textarray, int start, int end, string element ) : Node
type short
textarray byte[]
start int
end int
element string
return Node

ParseAsp() public method

public ParseAsp ( ) : Node
return Node

ParseAttribute() public method

public ParseAttribute ( MutableBoolean isempty, MutableObject asp, MutableObject php ) : string
isempty MutableBoolean
asp MutableObject
php MutableObject
return string

ParseAttrs() public method

public ParseAttrs ( MutableBoolean isempty ) : AttVal
isempty MutableBoolean
return AttVal

ParseEntity() public method

public ParseEntity ( short mode ) : void
mode short
return void

ParsePhp() public method

public ParsePhp ( ) : Node
return Node

ParseServerInstruction() public method

public ParseServerInstruction ( ) : int
return int

ParseTagName() public method

public ParseTagName ( ) : char
return char

ParseValue() public method

public ParseValue ( string name, bool foldCase, MutableBoolean isempty, MutableInteger pdelim ) : string
name string
foldCase bool
isempty MutableBoolean
pdelim MutableInteger
return string

PopInline() public method

public PopInline ( Node node ) : void
node Node
return void

PushInline() public method

public PushInline ( Node node ) : void
node Node
return void

SetXhtmlDocType() public method

public SetXhtmlDocType ( Node root ) : bool
root Node
return bool

UngetToken() public method

public UngetToken ( ) : void
return void

UpdateNodeTextArrays() protected method

protected UpdateNodeTextArrays ( byte[] oldtextarray, byte[] newtextarray ) : void
oldtextarray byte[]
newtextarray byte[]
return void

Property Details

BadAccess public property

public int BadAccess
return int

BadChars public property

public int BadChars
return int

BadDoctype public property

public bool BadDoctype
return bool

BadForm public property

public int BadForm
return int

BadLayout public property

public int BadLayout
return int

Columns public property

public int Columns
return int

Doctype public property

public HtmlVersion Doctype
return HtmlVersion

ExcludeBlocks public property

public bool ExcludeBlocks
return bool

Exiled public property

public bool Exiled
return bool

Inode public property

public Node Inode
return Node

Input public property

public StreamIn Input
return StreamIn

Insert public property

public int Insert
return int

Insertspace public property

public bool Insertspace
return bool

Istack public property

public Stack Istack
return Stack

Istackbase public property

public int Istackbase
return int

Isvoyager public property

public bool Isvoyager
return bool

Lexbuf public property

public byte[] Lexbuf
return byte[]

Lexlength public property

public int Lexlength
return int

Lexsize public property

public int Lexsize
return int

Lines public property

public int Lines
return int

Messages public property

public TidyMessageCollection Messages
return TidyMessageCollection

Options public property

public TidyOptions Options
return TidyOptions

Pushed public property

public bool Pushed
return bool

SeenBodyEndTag protected property

protected int SeenBodyEndTag
return int

State public property

public short State
return short

Styles public property

public Style Styles
return Style

Token public property

public Node Token
return Node

Txtend public property

public int Txtend
return int

Txtstart public property

public int Txtstart
return int

Versions public property

public HtmlVersion Versions
return HtmlVersion

Waswhite public property

public bool Waswhite
return bool