Masterminds\Html5\Parser\DOMTreeBuilder

class DOMTreeBuilder implements EventHandler

Create an HTML5 DOM tree from events.

This class attempts to build a DOM from the events emitted by a parser. It also attempts (without guarantee) to up-convert older HTML documents to HTML5 by applying HTML5's rules, but it will not change the architecture of the document itself.

Many of the error correction and quirks features suggested in the specification are implemented herein; however, not all of them are. Since we do not assume a graphical user agent, no presentation-specific logic is conducted during tree building.

FIXME: The present tree builder does not exactly follow the state machine rules for insertion modes as outlined in the HTML5 spec. The processor needs to be rewritten to accommodate this. See, for example, the Go language HTML5 parser.
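As a rough illustration of the event-driven design described above, the sketch below feeds the builder by hand rather than through a tokenizer. It assumes the package is installed through Composer and that the classes autoload as `Masterminds\HTML5\Parser\DOMTreeBuilder` and `Masterminds\HTML5\Parser\EventHandler`; the autoload path and the sample markup are illustrative only.

```php
<?php
// Assumption: Composer autoloader for the masterminds/html5 package.
require 'vendor/autoload.php';

use Masterminds\HTML5\Parser\DOMTreeBuilder;
use Masterminds\HTML5\Parser\EventHandler;

$builder = new DOMTreeBuilder();

// Events are delivered in document order, as a tokenizer would emit them.
$builder->doctype('html', EventHandler::DOCTYPE_NONE);
$builder->startTag('html');
$builder->startTag('body');
$builder->startTag('p', array('class' => 'intro'));
$builder->text('Hello');
$builder->endTag('p');
$builder->endTag('body');
$builder->endTag('html');
$builder->eof();

// Once the events are consumed, the finished DOM is available.
$doc = $builder->document();
$p = $doc->getElementsByTagName('p')->item(0);
echo $p->textContent, "\n";
```

In normal use a tokenizer drives these calls; invoking them directly is mainly useful for tests or for understanding what each event contributes to the tree.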

Constants

NAMESPACE_HTML	Defined in http://www.w3.org/TR/html51/infrastructure.html#html-namespace-0
NAMESPACE_MATHML
NAMESPACE_SVG
NAMESPACE_XLINK
NAMESPACE_XML
NAMESPACE_XMLNS
OPT_DISABLE_HTML_NS
OPT_TARGET_DOC
OPT_IMPLICIT_NS
IM_INITIAL	Defined in section 8.2.5 of the HTML5 specification.
IM_BEFORE_HTML
IM_BEFORE_HEAD
IM_IN_HEAD
IM_IN_HEAD_NOSCRIPT
IM_AFTER_HEAD
IM_IN_BODY
IM_TEXT
IM_IN_TABLE
IM_IN_TABLE_TEXT
IM_IN_CAPTION
IM_IN_COLUMN_GROUP
IM_IN_TABLE_BODY
IM_IN_ROW
IM_IN_CELL
IM_IN_SELECT
IM_IN_SELECT_IN_TABLE
IM_AFTER_BODY
IM_IN_FRAMESET
IM_AFTER_FRAMESET
IM_AFTER_AFTER_BODY
IM_AFTER_AFTER_FRAMESET
IM_IN_SVG
IM_IN_MATHML

Methods

__construct($isFragment = false, array $options = array())

No description

document()

Get the document.

DOMDocumentFragment fragment()

Get the DOM fragment for the body.

setInstructionProcessor( InstructionProcessor $proc)

Provide an instruction processor.

doctype( string $name, int $idType, string $id = null, boolean $quirks = false)

A doctype declaration.

int startTag( string $name, array $attributes = array(), boolean $selfClosing = false)

Process the start tag.

endTag($name)

An end-tag.

comment($cdata)

A comment section (unparsed character data).

text($data)

A unit of parsed character data.

eof()

Indicates that the document has been entirely processed.

parseError($msg, $line, $col)

Emitted when the parser encounters an error condition.

getErrors()

No description

cdata( string $data)

A CDATA section.

processingInstruction( string $name, string $data = null)

This is a holdover from the XML spec.

Details

at line 162
`__construct($isFragment = false, array $options = array())`

Parameters

	$isFragment
array	$options
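The options array is keyed by the `OPT_*` constants listed above. The following is a minimal sketch of one of them, `OPT_TARGET_DOC`, under the assumption (inferred from the constant's name, not stated on this page) that it accepts an existing `DOMDocument` for the builder to populate. The autoload path is also an assumption.

```php
<?php
// Assumption: Composer autoloader for the masterminds/html5 package.
require 'vendor/autoload.php';

use Masterminds\HTML5\Parser\DOMTreeBuilder;
use Masterminds\HTML5\Parser\EventHandler;

// Assumption: OPT_TARGET_DOC expects an existing DOMDocument to build into.
$target = new DOMDocument('1.0', 'UTF-8');

$builder = new DOMTreeBuilder(false, array(
    DOMTreeBuilder::OPT_TARGET_DOC => $target,
));

$builder->doctype('html', EventHandler::DOCTYPE_NONE);
$builder->startTag('html');
$builder->startTag('body');
$builder->endTag('body');
$builder->endTag('html');
$builder->eof();

// If the assumption holds, the builder hands back the same document object.
var_dump($builder->document() === $target);
```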

at line 206
`document()`

Get the document.

at line 221
`DOMDocumentFragment fragment()`

Get the DOM fragment for the body.

This returns a DOMDocumentFragment because a fragment may have zero or more DOMNodes at its root.

Return Value

DOMDocumentFragment
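A fragment with more than one root node might be built as in the sketch below. It assumes that passing `true` as `$isFragment` puts the builder into fragment mode and that the classes autoload under `Masterminds\HTML5\Parser`; the markup is illustrative.

```php
<?php
// Assumption: Composer autoloader for the masterminds/html5 package.
require 'vendor/autoload.php';

use Masterminds\HTML5\Parser\DOMTreeBuilder;

// Assumption: true for $isFragment switches the builder to fragment mode.
$builder = new DOMTreeBuilder(true);

// Two sibling elements, neither wrapped in <html> or <body>.
$builder->startTag('em');
$builder->text('one');
$builder->endTag('em');
$builder->startTag('strong');
$builder->text('two');
$builder->endTag('strong');
$builder->eof();

// Each top-level element ends up as a direct child of the fragment.
$fragment = $builder->fragment();
foreach ($fragment->childNodes as $node) {
    echo $node->nodeName, ': ', $node->textContent, "\n";
}
```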

at line 232
`setInstructionProcessor( InstructionProcessor $proc)`

Provide an instruction processor.

This is used for handling processing instructions (PIs) as they are inserted. If omitted, PIs are inserted directly into the DOM tree.

Parameters

InstructionProcessor	$proc
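A processor could look roughly like the sketch below. It assumes the interface is `Masterminds\HTML5\InstructionProcessor` with a single `process(\DOMElement $element, $name, $data)` method whose return value becomes the builder's current element. `CommentingProcessor` is a hypothetical class written for this example.

```php
<?php
// Assumption: Composer autoloader for the masterminds/html5 package.
require 'vendor/autoload.php';

use Masterminds\HTML5\InstructionProcessor;
use Masterminds\HTML5\Parser\DOMTreeBuilder;
use Masterminds\HTML5\Parser\EventHandler;

// Hypothetical processor: records each PI and leaves a comment in its place.
final class CommentingProcessor implements InstructionProcessor
{
    public $seen = array();

    public function process(\DOMElement $element, $name, $data)
    {
        $this->seen[] = $name;
        $comment = $element->ownerDocument->createComment('pi: ' . $name);
        $element->appendChild($comment);

        // Returning the parent keeps the builder's insertion point unchanged.
        return $element;
    }
}

$processor = new CommentingProcessor();
$builder = new DOMTreeBuilder();
$builder->setInstructionProcessor($processor);

$builder->doctype('html', EventHandler::DOCTYPE_NONE);
$builder->startTag('html');
$builder->startTag('body');
$builder->processingInstruction('php', 'echo 1;');
$builder->endTag('body');
$builder->endTag('html');
$builder->eof();

print_r($processor->seen);
```

The instruction is emitted inside `body` here so that the processor receives an element as its parent.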

at line 237
`doctype( string $name, int $idType, string $id = null, boolean $quirks = false)`

A doctype declaration.

Parameters

string	$name	The name of the root element.
int	$idType	One of DOCTYPE_NONE, DOCTYPE_PUBLIC, or DOCTYPE_SYSTEM.
string	$id	The identifier. For DOCTYPE_PUBLIC, this is the public ID. For DOCTYPE_SYSTEM, this is the system ID.
boolean	$quirks	Indicates whether the builder should enter quirks mode.

at line 259
`int startTag( string $name, array $attributes = array(), boolean $selfClosing = false)`

Process the start tag.

Parameters

string	$name	The tag name.
array	$attributes	An array with all of the tag's attributes.
boolean	$selfClosing	Whether this tag is self-closing, as in <foo/>.

Return Value

int	One of the Tokenizer::TEXTMODE_* constants.

at line 460
`endTag($name)`

An end-tag.

Parameters

$name

at line 540
`comment($cdata)`

A comment section (unparsed character data).

Parameters

$cdata

at line 547
`text($data)`

A unit of parsed character data.

Entities in this text are already decoded.

Parameters

$data

at line 568
`eof()`

Indicates that the document has been entirely processed.

at line 573
`parseError($msg, $line, $col)`

Emitted when the parser encounters an error condition.

Parameters

	$msg
	$line
	$col

at line 578
`getErrors()`
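This page gives no description for `getErrors()`. The sketch below assumes it returns, as an array, the messages accumulated through `parseError()`; the exact message format is not specified here, so the example only prints whatever comes back. The error text and position are made up.

```php
<?php
// Assumption: Composer autoloader for the masterminds/html5 package.
require 'vendor/autoload.php';

use Masterminds\HTML5\Parser\DOMTreeBuilder;

$builder = new DOMTreeBuilder();

// A tokenizer would normally report this; here it is reported by hand.
$builder->parseError('Unexpected character', 3, 14);

// Assumption: getErrors() returns the collected messages as an array.
foreach ($builder->getErrors() as $error) {
    echo $error, "\n";
}
```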

at line 583
`cdata( string $data)`

A CDATA section.

Parameters

string	$data	The unparsed character data.

at line 589
`processingInstruction( string $name, string $data = null)`

This is a holdover from the XML spec.

Although user agents do not receive PIs, server-side processors do.

Parameters

string	$name	The name of the processor (e.g. 'php').
string	$data	The unparsed data.
