DOM Class Interface Reference

Since DOM is becoming the interface of choice in the Perl-XML world, it deserves more elaboration. The following sections describe class interfaces individually, listing their properties, methods, and intended purposes.

WARNING: The DOM specification calls for UTF-16 as the standard encoding. However, most Perl implementations assume UTF-8. Because of limitations in Perl, working with characters that are not 8 bits wide is difficult. This is expected to change in a future version of Perl, when encodings like UTF-16 will be supported more readily.

Document

The Document class controls the overall document, creating new objects when requested and maintaining high-level information such as references to the document type declaration and the root element.
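The factory role described above can be sketched briefly. This example uses Python's standard xml.dom.minidom module, an assumption made here only because it implements the same W3C DOM interfaces; a Perl DOM module follows the same specification but may differ in detail. The element names are invented for illustration.

```python
from xml.dom.minidom import getDOMImplementation

# Ask the DOM implementation for a new document whose root element is <inventory>.
impl = getDOMImplementation()
doc = impl.createDocument(None, "inventory", None)

# The Document object is the factory for the other node types.
root = doc.documentElement
item = doc.createElement("item")
item.appendChild(doc.createTextNode("widget"))
root.appendChild(item)

serialized = root.toxml()
```

Nodes created this way belong to the document that made them (see ownerDocument), even before they are attached to the tree.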

Properties

Methods

DocumentFragment

The DocumentFragment class holds a document fragment. Its children are zero or more nodes representing the tops of XML trees. This class contrasts with Document, which has at most one child element, the document root, plus metadata like the document type. In this respect, DocumentFragment's content is not well-formed, though it must obey the XML well-formedness rules in all other respects (no illegal characters in text, and so on).

No specific methods or properties are defined; use the generic node methods to access data.
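As a minimal sketch of a fragment holding several top-level nodes, the following uses Python's xml.dom.minidom (an assumed stand-in for a Perl DOM module; the interfaces come from the same specification). The element names are hypothetical.

```python
from xml.dom.minidom import Document

doc = Document()
root = doc.createElement("list")
doc.appendChild(root)

# A fragment may hold several top-level nodes, unlike a Document.
frag = doc.createDocumentFragment()
for label in ("one", "two", "three"):
    entry = doc.createElement("entry")
    entry.appendChild(doc.createTextNode(label))
    frag.appendChild(entry)

count_before = frag.childNodes.length  # nodes currently held by the fragment

# Appending the fragment moves its children into the tree; the fragment
# itself is not inserted and is left empty afterwards.
root.appendChild(frag)
count_after = frag.childNodes.length
```

This makes a fragment a convenient carrier for moving a group of sibling nodes in one call.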

DocumentType

This class holds the information from the document type declaration at the beginning of the document, except the specifics about an external DTD. Thus, it names the root element and lists any entities or notations declared in the internal subset.

No specific methods are defined for this class; its properties are public but read-only.
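To make the DocumentType properties concrete, here is a small sketch using Python's xml.dom.minidom, assumed here as a convenient DOM implementation rather than the Perl module a reader may be using. The sample document, entity, and notation are invented.

```python
from xml.dom.minidom import parseString

XML = """<!DOCTYPE memo [
  <!ENTITY company "Example Corp">
  <!NOTATION gif SYSTEM "image/gif">
]>
<memo>From &company;</memo>"""

doc = parseString(XML)
doctype = doc.doctype

root_name = doctype.name                              # name given in <!DOCTYPE ...>
company = doctype.entities.getNamedItem("company")    # an Entity node
gif = doctype.notations.getNamedItem("gif")           # a Notation node
```

Both entities and notations are NamedNodeMap-style collections, so lookups by name return None when nothing matches.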

Properties

Node

All node types inherit from the class Node. Any properties or methods common to all node types can be accessed through this class. A few properties, such as the value of the node, are undefined for some node types, like Element. The generic methods of this class are useful in some programming contexts, such as when writing code that processes nodes of different types. At other times, you'll know in advance what type you're working with, and you should use the specific class's methods instead.

All properties but nodeValue and prefix are read-only.
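The kind of type-agnostic processing mentioned above might look like the following sketch, written against Python's xml.dom.minidom as an assumed DOM implementation. The helper name describe is hypothetical.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

doc = parseString("<note><!-- draft --><to>Ann</to></note>")

def describe(node, depth=0, out=None):
    """Collect (depth, nodeName, nodeValue) for a node and its descendants."""
    if out is None:
        out = []
    out.append((depth, node.nodeName, node.nodeValue))
    for child in node.childNodes:
        describe(child, depth + 1, out)
    return out

rows = describe(doc.documentElement)

# nodeType lets generic code branch on the kind of node it is holding.
elements = [n.nodeName for n in doc.documentElement.childNodes
            if n.nodeType == Node.ELEMENT_NODE]
```

Note that nodeValue comes back empty (None in this implementation) for the element nodes, while the comment and text nodes report their content.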

Properties

Methods

NodeList

This class is a container for an ordered list of nodes. It is "live," meaning that changes to the document tree are reflected in the list immediately, and any changes made to the nodes it references appear in the document immediately.
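A short sketch of this behavior, using Python's xml.dom.minidom as an assumed DOM implementation, with an invented sample document:

```python
from xml.dom.minidom import parseString

doc = parseString("<queue><job/></queue>")
root = doc.documentElement

children = root.childNodes          # a NodeList
length_before = children.length

# Modify the tree without touching the list variable.
root.appendChild(doc.createElement("job"))
length_after = children.length      # the same list object sees the new child

first = children.item(0)            # positional access
past_end = children.item(5)         # an out-of-range index yields None
```

One implementation caveat: minidom builds the result of getElementsByTagName at call time, so that particular list is a snapshot rather than a live view. Other DOM implementations may behave differently, so it is worth checking before relying on liveness.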

Properties

Methods

NamedNodeMap

This unordered set of nodes is designed to allow access to nodes by name. An alternate access by index is also provided for enumerations, but no order is implied.
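The two access styles can be sketched as follows, here with an element's attribute map in Python's xml.dom.minidom (an assumed DOM implementation; the sample markup is invented):

```python
from xml.dom.minidom import parseString

doc = parseString('<img src="logo.png" alt="Logo" width="32"/>')
attrs = doc.documentElement.attributes   # a NamedNodeMap

# Access by name is the primary interface.
src = attrs.getNamedItem("src").value
missing = attrs.getNamedItem("height")   # None when no such node exists

# Access by index is for enumeration only; sort if a stable order matters.
names = sorted(attrs.item(i).name for i in range(attrs.length))
```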

Properties

Methods

CharacterData

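This class extends Node to facilitate access to nodes that contain character data, such as Text, CDATASection, and Comment. Specific classes like Text inherit from this class. In the W3C DOM Level 1 and Level 2 specifications, ProcessingInstruction extends Node directly rather than CharacterData, although it exposes its content through a similar data property.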
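The string-editing operations that CharacterData adds can be sketched on a Text node. The example assumes Python's xml.dom.minidom as the DOM implementation; the method names are those of the W3C interface.

```python
from xml.dom.minidom import Document

doc = Document()
text = doc.createTextNode("Hello")

text.appendData(", world")            # data is now "Hello, world"
greeting = text.substringData(0, 5)   # offset 0, count 5
text.replaceData(7, 5, "DOM")         # replace "world" with "DOM"

final = text.data
final_length = text.length
```

Offsets and counts are measured in characters from the start of the node's data.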

Properties

Methods

Element

This is the most common type of node you will encounter. An element can contain other nodes and has attribute nodes.
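A minimal sketch of working with an element's attributes and child elements, assuming Python's xml.dom.minidom as the DOM implementation and an invented sample document:

```python
from xml.dom.minidom import parseString

doc = parseString('<book id="b1"><title>Perl</title><title>XML</title></book>')
book = doc.documentElement

# Attributes are read and written through the element.
book.setAttribute("lang", "en")
book_id = book.getAttribute("id")
has_isbn = book.hasAttribute("isbn")
book.removeAttribute("id")

# Descendant elements can be searched by tag name.
titles = [t.firstChild.data for t in book.getElementsByTagName("title")]
```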

Properties

Methods

Attr

Properties

Text

Methods

CDATASection

A CDATASection is like a text node, but protects its contents from being parsed as markup. It may contain markup characters (<, &) that would otherwise have to be escaped in ordinary text. Use the generic Node methods to access the data.
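The difference shows up most clearly on serialization. The sketch below assumes Python's xml.dom.minidom as the DOM implementation and stores the same (invented) string once as text and once as a CDATA section.

```python
from xml.dom.minidom import Document

doc = Document()
snippet = "if (a < b && b < c)"

as_text = doc.createElement("code")
as_text.appendChild(doc.createTextNode(snippet))

as_cdata = doc.createElement("code")
as_cdata.appendChild(doc.createCDATASection(snippet))

text_xml = as_text.toxml()     # markup characters are escaped on output
cdata_xml = as_cdata.toxml()   # content is wrapped in a CDATA section instead

# Inside the tree, both nodes report the same unescaped value.
same_value = as_text.firstChild.nodeValue == as_cdata.firstChild.nodeValue
```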

ProcessingInstruction

Properties

Comment

This is a class representing comment nodes. Use the generic Node methods to access the data.

EntityReference

This is a reference to an entity defined by an Entity node. Sometimes the parser will be configured to resolve all entity references into their values for you. If that option is disabled, the parser should create an EntityReference node instead. No explicit methods force resolution, but some actions on the node may have that side effect.
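The first case, where the parser resolves references for you, can be sketched with Python's xml.dom.minidom. This is an assumed implementation chosen for illustration; it expands entity references during parsing and does not implement the EntityReference interface, so the tree contains only the replacement text. The entity name and value are invented.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

XML = """<!DOCTYPE memo [
  <!ENTITY org "Example Org">
]>
<memo>From &org;</memo>"""

doc = parseString(XML)
memo = doc.documentElement
memo.normalize()   # merge adjacent text nodes, in case the parser produced several

child_types = [n.nodeType for n in memo.childNodes]
text = memo.firstChild.data
```

A parser configured to preserve references would instead leave a node of type ENTITY_REFERENCE_NODE between the surrounding text nodes.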

Entity

This class provides access to an entity in the document, based on information in an entity declaration in the DTD.
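As a sketch of reading an entity's declaration details, the following declares an unparsed entity in the internal subset and inspects it through the doctype. It assumes Python's xml.dom.minidom as the DOM implementation; the entity, file name, and notation are hypothetical.

```python
from xml.dom.minidom import parseString

XML = """<!DOCTYPE page [
  <!NOTATION gif SYSTEM "image/gif">
  <!ENTITY logo SYSTEM "logo.gif" NDATA gif>
]>
<page/>"""

doc = parseString(XML)
logo = doc.doctype.entities.getNamedItem("logo")

system_id = logo.systemId      # system identifier from the declaration
notation = logo.notationName   # set only for unparsed (NDATA) entities
public_id = logo.publicId      # no PUBLIC identifier was declared here
```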

Properties

Notation

Notation represents a notation declaration appearing in the DTD.

Properties