DOMTreeBuilder
class DOMTreeBuilder implements EventHandler
Create an HTML5 DOM tree from events.
This builds a DOM from the events emitted by a parser. It attempts (but does not guarantee) to up-convert older HTML documents to HTML5 by applying HTML5's rules, but it will not change the architecture of the document itself.
Many of the error correction and quirks features suggested in the specification are implemented herein; however, not all of them are. Since we do not assume a graphical user agent, no presentation-specific logic is conducted during tree building.
FIXME: The present tree builder does not exactly follow the state machine rules for insertion modes as outlined in the HTML5 spec. The processor needs to be rewritten to accommodate this. See, for example, the Go language HTML5 parser.
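To make the event-driven idea concrete, here is a minimal sketch of a handler that turns start-tag, end-tag, and text events into a DOM using PHP's built-in DOM extension. The class `SketchTreeBuilder` is hypothetical and is not the library's implementation: it applies none of the HTML5 error correction, quirks handling, or insertion modes described above, and it only borrows the method names listed on this page.

```php
<?php
// Hypothetical, simplified stand-in for an event-driven tree builder.
// It only illustrates the idea; it is NOT the library's DOMTreeBuilder.
class SketchTreeBuilder
{
    private $doc;
    private $current;

    public function __construct()
    {
        $this->doc = new DOMDocument();
        // Start with an <html> root and treat it as the current open element.
        $this->current = $this->doc->appendChild($this->doc->createElement('html'));
    }

    public function startTag($name, array $attributes = array(), $selfClosing = false)
    {
        $el = $this->doc->createElement($name);
        foreach ($attributes as $key => $value) {
            $el->setAttribute($key, $value);
        }
        $this->current->appendChild($el);
        if (!$selfClosing) {
            // Later events are inserted inside this element.
            $this->current = $el;
        }
    }

    public function endTag($name)
    {
        // Walk up to the nearest open element with this name, then step out of it.
        $node = $this->current;
        while ($node instanceof DOMElement && $node->tagName !== $name) {
            $node = $node->parentNode;
        }
        if ($node instanceof DOMElement && $node->parentNode instanceof DOMElement) {
            $this->current = $node->parentNode;
        }
    }

    public function text($data)
    {
        $this->current->appendChild($this->doc->createTextNode($data));
    }

    public function document()
    {
        return $this->doc;
    }
}

// Events as a tokenizer might emit them for <body><p class="intro">Hello</p></body>.
$builder = new SketchTreeBuilder();
$builder->startTag('body');
$builder->startTag('p', array('class' => 'intro'));
$builder->text('Hello');
$builder->endTag('p');
$builder->endTag('body');
$doc = $builder->document();
```

The sketch keeps a single "current element" pointer. A spec-compliant builder would instead maintain a stack of open elements and an insertion mode, which is the gap the FIXME above refers to.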
Constants
NAMESPACE_HTML | Defined in http://www.w3.org/TR/html51/infrastructure.html#html-namespace-0
NAMESPACE_MATHML |
NAMESPACE_SVG |
NAMESPACE_XLINK |
NAMESPACE_XML |
NAMESPACE_XMLNS |
IM_INITIAL | Defined in 8.2.5.
IM_BEFORE_HTML |
IM_BEFORE_HEAD |
IM_IN_HEAD |
IM_IN_HEAD_NOSCRIPT |
IM_AFTER_HEAD |
IM_IN_BODY |
IM_TEXT |
IM_IN_TABLE |
IM_IN_TABLE_TEXT |
IM_IN_CAPTION |
IM_IN_COLUMN_GROUP |
IM_IN_TABLE_BODY |
IM_IN_ROW |
IM_IN_CELL |
IM_IN_SELECT |
IM_IN_SELECT_IN_TABLE |
IM_AFTER_BODY |
IM_IN_FRAMESET |
IM_AFTER_FRAMESET |
IM_AFTER_AFTER_BODY |
IM_AFTER_AFTER_FRAMESET |
IM_IN_SVG |
IM_IN_MATHML |
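As a rough illustration of what the namespace constants are for, the sketch below creates namespaced elements with PHP's built-in DOM extension. The URIs are the standard XHTML, SVG, and MathML namespace URIs. That the `NAMESPACE_*` constants hold exactly these values is an assumption, since this page does not list them, so the sketch uses a local array instead of the constants.

```php
<?php
// Standard namespace URIs. Assumption: the NAMESPACE_* constants
// listed above carry these values; this page does not show them.
$ns = array(
    'html' => 'http://www.w3.org/1999/xhtml',
    'svg'  => 'http://www.w3.org/2000/svg',
    'math' => 'http://www.w3.org/1998/Math/MathML',
);

$doc = new DOMDocument();

// HTML elements live in the XHTML namespace.
$html = $doc->appendChild($doc->createElementNS($ns['html'], 'html'));

// Foreign content (here SVG) is created in its own namespace.
$svg = $html->appendChild($doc->createElementNS($ns['svg'], 'svg'));
$circle = $svg->appendChild($doc->createElementNS($ns['svg'], 'circle'));
```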
Methods
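__construct() | No description
document() | Get the document.
fragment() | Get the DOM fragment for the body.
setInstructionProcessor() | Provide an instruction processor.
doctype() | A doctype declaration.
startTag() | Process the start tag.
endTag() | An end-tag.
comment() | A comment section (unparsed character data).
text() | A unit of parsed character data.
eof() | Indicates that the document has been entirely processed.
parseError() | Emitted when the parser encounters an error condition.
getErrors() | No description
cdata() | A CDATA section.
processingInstruction() | This is a holdover from the XML spec.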
Details
at line 150
__construct($isFragment = false, array $options = array())
at line 183
document()
Get the document.
at line 198
DOMDocumentFragment
fragment()
Get the DOM fragment for the body.
This returns a DOMDocumentFragment rather than a single node, because a fragment may have zero or more DOMNodes at its root.
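The "zero or more root nodes" point can be seen with PHP's built-in `DOMDocumentFragment` alone. The following is a minimal sketch that does not use the tree builder; the element names and text are made up for illustration.

```php
<?php
$doc = new DOMDocument();
$frag = $doc->createDocumentFragment();

// A freshly created fragment is valid and has zero root nodes.
$emptyCount = $frag->childNodes->length;

// A fragment may also hold several siblings, including bare text.
$frag->appendChild($doc->createElement('p', 'first'));
$frag->appendChild($doc->createTextNode(' loose text '));
$frag->appendChild($doc->createElement('p', 'second'));

// Collect the node names of the fragment's root-level children.
$rootNames = array();
foreach ($frag->childNodes as $node) {
    $rootNames[] = $node->nodeName;
}

// Appending the fragment to an element inserts its children, not the fragment itself.
$body = $doc->appendChild($doc->createElement('body'));
$body->appendChild($frag);
```

Because a fragment is not limited to a single root, callers should iterate over its `childNodes` rather than assume there is a single element.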
at line 209
setInstructionProcessor(InstructionProcessor $proc)
Provide an instruction processor.
This is used for handling processing instructions (PIs) as they are inserted. If omitted, PIs are inserted directly into the DOM tree.
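The two behaviours described here (delegate to a processor, or insert the PI directly) can be sketched with the built-in DOM extension. The function `insertInstruction` and the callable it accepts are hypothetical and stand in for the library's `InstructionProcessor`, whose exact interface is not shown on this page.

```php
<?php
// Hypothetical helper, not part of the library: shows the two code paths only.
function insertInstruction(DOMElement $current, $name, $data, $processor = null)
{
    if ($processor !== null) {
        // A processor was provided: let it decide what to do with the PI.
        // Assumption: it returns the element that later events should target.
        return call_user_func($processor, $current, $name, $data);
    }

    // No processor: insert the PI directly into the DOM tree.
    $pi = $current->ownerDocument->createProcessingInstruction($name, (string) $data);
    $current->appendChild($pi);

    return $current;
}

$doc = new DOMDocument();
$root = $doc->appendChild($doc->createElement('div'));

// Default path: the PI becomes a node in the tree.
insertInstruction($root, 'xml-stylesheet', 'href="style.css"');

// Processor path: this made-up processor records the PI name as an attribute
// instead of inserting a PI node.
$stamp = function (DOMElement $el, $name, $data) {
    $el->setAttribute('data-pi', $name);
    return $el;
};
insertInstruction($root, 'stamp', '', $stamp);
```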
at line 214
doctype(string $name, int $idType, string $id = null, boolean $quirks = false)
A doctype declaration.
at line 236
numeric
startTag(string $name, array $attributes = array(), boolean $selfClosing = false)
Process the start tag.
at line 428
endTag($name)
An end-tag.
at line 508
comment($cdata)
A comment section (unparsed character data).
at line 515
text($data)
A unit of parsed character data.
Entities in this text are already decoded.
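Since the text arrives already decoded, a handler should store it as-is and leave escaping to serialization. The sketch below uses only PHP built-ins, and `html_entity_decode` stands in for whatever decoding the tokenizer performs (an assumption for illustration).

```php
<?php
// Simulate what the handler receives: the tokenizer has already
// turned "&amp;" into "&" before calling text().
$received = html_entity_decode('Fish &amp; chips');

$doc = new DOMDocument();
$p = $doc->appendChild($doc->createElement('p'));

// Store the decoded text directly; createTextNode() does no entity parsing.
$p->appendChild($doc->createTextNode($received));

// Serialization re-escapes the ampersand, so decoding again in the
// handler would be a mistake.
$serialized = $doc->saveXML($p);
```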
at line 536
eof()
Indicates that the document has been entirely processed.
at line 541
parseError($msg, $line, $col)
Emitted when the parser encounters an error condition.
at line 546
getErrors()
at line 551
cdata(string $data)
A CDATA section.
at line 557
processingInstruction(string $name, string $data = null)
This is a holdover from the XML spec.
While user agents do not receive PIs, server-side code does.