DOMTreeBuilder
class DOMTreeBuilder implements EventHandler
Create an HTML5 DOM tree from events.
This class builds a DOM from the events emitted by a parser. It attempts (but does not guarantee) to up-convert older HTML documents to HTML5 by applying HTML5's rules, but it will not change the architecture of the document itself.
Many, but not all, of the error correction and quirks features suggested in the specification are implemented here. Since we do not assume a graphical user agent, no presentation-specific logic is performed during tree building.
FIXME: The present tree builder does not exactly follow the state machine rules for insertion modes as outlined in the HTML5 spec. The processor needs to be rewritten to accommodate this. See, for example, the Go language HTML5 parser.
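To make the event-driven idea concrete, here is a minimal sketch of a builder that turns start-tag, end-tag, and text events into a DOM. It is not this class's implementation: `MiniTreeBuilder` is a hypothetical stand-in that relies only on PHP's bundled DOM extension, and it ignores doctypes, namespaces, insertion modes, and error correction.

```php
<?php
// Hypothetical stand-in for illustration; NOT the library's DOMTreeBuilder.
final class MiniTreeBuilder
{
    private DOMDocument $doc;
    private DOMNode $current; // node that receives the next child

    public function __construct()
    {
        $this->doc = new DOMDocument('1.0', 'UTF-8');
        $this->current = $this->doc;
    }

    public function startTag(string $name, array $attributes = [], bool $selfClosing = false): void
    {
        $el = $this->doc->createElement($name);
        foreach ($attributes as $attr => $value) {
            $el->setAttribute($attr, $value);
        }
        $this->current->appendChild($el);
        if (!$selfClosing) {
            $this->current = $el; // descend into the new element
        }
    }

    public function endTag(string $name): void
    {
        // Close the nearest open element with this name; ignore stray end tags.
        for ($node = $this->current; $node instanceof DOMElement; $node = $node->parentNode) {
            if ($node->nodeName === $name) {
                $this->current = $node->parentNode;
                return;
            }
        }
    }

    public function text(string $data): void
    {
        $this->current->appendChild($this->doc->createTextNode($data));
    }

    public function document(): DOMDocument
    {
        return $this->doc;
    }
}

// Events in the order a tokenizer might emit them.
$b = new MiniTreeBuilder();
$b->startTag('html');
$b->startTag('body');
$b->startTag('p', ['class' => 'intro']);
$b->text('Hello & welcome');
$b->endTag('p');
$b->startTag('br', [], true);
$b->endTag('body');
$b->endTag('html');

$doc = $b->document();
```

The builder keeps a single "current node" pointer; the real class tracks considerably more state in order to apply HTML5's rules.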
Constants
NAMESPACE_HTML | Defined in http://www.w3.org/TR/html51/infrastructure.html#html-namespace-0
NAMESPACE_MATHML |
NAMESPACE_SVG |
NAMESPACE_XLINK |
NAMESPACE_XML |
NAMESPACE_XMLNS |
OPT_DISABLE_HTML_NS |
OPT_TARGET_DOC |
OPT_IMPLICIT_NS |
IM_INITIAL | Defined in 8.2.5.
IM_BEFORE_HTML |
IM_BEFORE_HEAD |
IM_IN_HEAD |
IM_IN_HEAD_NOSCRIPT |
IM_AFTER_HEAD |
IM_IN_BODY |
IM_TEXT |
IM_IN_TABLE |
IM_IN_TABLE_TEXT |
IM_IN_CAPTION |
IM_IN_COLUMN_GROUP |
IM_IN_TABLE_BODY |
IM_IN_ROW |
IM_IN_CELL |
IM_IN_SELECT |
IM_IN_SELECT_IN_TABLE |
IM_AFTER_BODY |
IM_IN_FRAMESET |
IM_AFTER_FRAMESET |
IM_AFTER_AFTER_BODY |
IM_AFTER_AFTER_FRAMESET |
IM_IN_SVG |
IM_IN_MATHML |
Methods
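__construct() | No description
document() | Get the document.
fragment() | Get the DOM fragment for the body.
setInstructionProcessor() | Provide an instruction processor.
doctype() | A doctype declaration.
startTag() | Process the start tag.
endTag() | An end-tag.
comment() | A comment section (unparsed character data).
text() | A unit of parsed character data.
eof() | Indicates that the document has been entirely processed.
parseError() | Emitted when the parser encounters an error condition.
getErrors() | No description
cdata() | A CDATA section.
processingInstruction() | This is a holdover from the XML spec.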
Details
at line 162
__construct($isFragment = false, array $options = array())
at line 206
document()
Get the document.
at line 221
DOMDocumentFragment
fragment()
Get the DOM fragment for the body.
This returns a DOMDocumentFragment because a fragment may have zero or more DOMNodes at its root.
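A document fragment can hold several sibling nodes with no single root element, which is what makes it a natural container for partial markup. The sketch below uses only PHP's DOM extension to show this; the variable names are illustrative and nothing here calls this class.

```php
<?php
$doc = new DOMDocument('1.0', 'UTF-8');

// A fragment may carry zero or more top-level nodes.
$frag = $doc->createDocumentFragment();
$frag->appendChild($doc->createElement('p', 'one'));
$frag->appendChild($doc->createTextNode(' and '));
$frag->appendChild($doc->createElement('p', 'two'));

$rootCount = $frag->childNodes->length;

// Appending the fragment moves its children into the target element.
$body = $doc->appendChild($doc->createElement('body'));
$body->appendChild($frag);

$bodyCount = $body->childNodes->length;
$leftInFragment = $frag->childNodes->length;
```

After the append, the fragment itself is empty, so callers that need the nodes more than once should clone them first.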
at line 232
setInstructionProcessor(InstructionProcessor $proc)
Provide an instruction processor.
This is used for handling processing instructions (PIs) as they are inserted. If omitted, PIs are inserted directly into the DOM tree.
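The two behaviours described above (delegate to a processor if one is set, otherwise insert the PI node as-is) can be sketched as follows. `PiHandler`, `handlePi`, and `GreetHandler` are hypothetical names defined here for illustration; the library's own `InstructionProcessor` contract may differ, so check its definition before implementing it.

```php
<?php
// Hypothetical interface for illustration only.
interface PiHandler
{
    // Receives the current element and returns the element to continue from.
    public function process(DOMElement $current, string $name, string $data): DOMElement;
}

// Dispatch one processing instruction.
function handlePi(
    DOMDocument $doc,
    DOMElement $current,
    string $name,
    string $data,
    ?PiHandler $handler = null
): DOMElement {
    if ($handler !== null) {
        // A processor is set: let it decide what ends up in the tree.
        return $handler->process($current, $name, $data);
    }
    // No processor: insert the PI directly into the DOM tree.
    $current->appendChild($doc->createProcessingInstruction($name, $data));
    return $current;
}

// Example handler: replaces a PI named "greet" with a text node.
final class GreetHandler implements PiHandler
{
    public function process(DOMElement $current, string $name, string $data): DOMElement
    {
        if ($name === 'greet') {
            $text = $current->ownerDocument->createTextNode('Hello, ' . $data);
            $current->appendChild($text);
        }
        return $current;
    }
}

$doc = new DOMDocument('1.0', 'UTF-8');
$root = $doc->appendChild($doc->createElement('div'));

handlePi($doc, $root, 'php', 'echo 1;');                     // inserted as a PI node
handlePi($doc, $root, 'greet', 'World', new GreetHandler()); // handled by the processor
```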
at line 237
doctype(string $name, int $idType, string $id = null, boolean $quirks = false)
A doctype declaration.
at line 259
int
startTag(string $name, array $attributes = array(), boolean $selfClosing = false)
Process the start tag.
at line 460
endTag($name)
An end-tag.
at line 540
comment($cdata)
A comment section (unparsed character data).
at line 547
text($data)
A unit of parsed character data.
Entities in this text are already decoded.
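Because the text arrives already decoded, a handler should store it verbatim and leave escaping to serialization. A small sketch with PHP's DOM extension (independent of this class, with illustrative variable names) shows that a decoded string round-trips safely:

```php
<?php
$doc = new DOMDocument('1.0', 'UTF-8');
$p = $doc->appendChild($doc->createElement('p'));

// The event payload is plain text: "&lt;" has already become "<".
$decoded = 'a < b & c';
$p->appendChild($doc->createTextNode($decoded));

// The DOM stores the raw characters...
$stored = $p->textContent;

// ...and the serializer re-escapes them on output.
$markup = $doc->saveXML($p); // <p>a &lt; b &amp; c</p>
```

Decoding the payload a second time would corrupt text that legitimately contains sequences such as `&amp;`.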
at line 568
eof()
Indicates that the document has been entirely processed.
at line 573
parseError($msg, $line, $col)
Emitted when the parser encounters an error condition.
at line 578
getErrors()
at line 583
cdata(string $data)
A CDATA section.
at line 589
processingInstruction(string $name, string $data = null)
This is a holdover from the XML spec.
While user agents do not receive PIs, server-side code does.