Since DOM is becoming the interface of choice in the Perl-XML world, it deserves more elaboration. The following sections describe class interfaces individually, listing their properties, methods, and intended purposes.
WARNING: The DOM specification calls for UTF-16 as the standard encoding. However, most Perl implementations assume a UTF-8 encoding. Due to limitations in Perl, working with characters of lengths other than 8 bits is difficult. This will change in a future version, and encodings like UTF-16 will be supported more readily.
The Document class controls the overall document, creating new objects when requested and maintaining high-level information such as references to the document type declaration and the root element.
Generates a new node object.
Generates a new element or attribute node object with a specified namespace qualifier.
Creates a container object for a document's subtree.
Returns a NodeList of all elements having a given tag name at any level of the document.
Returns a NodeList of all elements having a given namespace URI and local name. The asterisk character (*) matches any local name or any namespace, so you can, for example, find all elements in a given namespace.
Returns a reference to the element whose ID attribute has a specified value.
Creates a new node that is the copy of a node from another document. Acts like a "copy to the clipboard" operation for importing markup.
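The factory and lookup behavior described above can be sketched as follows. This is a minimal illustration that assumes Python's standard xml.dom.minidom module as a stand-in DOM implementation; the method names (createElement, createTextNode, createComment, getElementsByTagName, importNode) are the W3C DOM names, and the element names "inventory", "item", and "stock" are made up for the example. A Perl DOM module may differ in calling conventions.

```python
from xml.dom import minidom

# Build an empty document whose root element is <inventory> (hypothetical name).
impl = minidom.getDOMImplementation()
doc = impl.createDocument(None, "inventory", None)
root = doc.documentElement

# The Document object is the factory for every new node.
item = doc.createElement("item")
item.appendChild(doc.createTextNode("hammer"))
root.appendChild(item)
root.appendChild(doc.createComment("end of list"))

# Copy a node from another document; the second argument requests a deep copy.
other = minidom.parseString("<stock><item>saw</item></stock>")
copied = doc.importNode(other.documentElement.firstChild, True)
root.insertBefore(copied, root.lastChild)

# Search the whole tree by tag name.
names = [n.firstChild.data for n in doc.getElementsByTagName("item")]
print(names)
```

The imported copy belongs to the new document, while the original node stays in place in the document it came from.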
The DocumentFragment class is used to contain a document fragment. Its children are zero or more nodes representing the tops of XML trees. This class contrasts with Document, which has at most one child element, the document root, plus metadata like the document type. In this respect, a DocumentFragment's content need not be well-formed, though it must obey the XML well-formedness rules in all other respects (no illegal characters in text, and so on).
No specific methods or properties are defined; use the generic node methods to access data.
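As a rough sketch of how a fragment holds several top-level nodes, the following assumes Python's standard xml.dom.minidom as a stand-in implementation; the "list" and "entry" element names are invented for illustration. It uses only createDocumentFragment and the generic node methods.

```python
from xml.dom import minidom

doc = minidom.parseString("<list><entry>one</entry></list>")
root = doc.documentElement

# Unlike a Document, a fragment may hold several top-level nodes.
frag = doc.createDocumentFragment()
for label in ("two", "three"):
    entry = doc.createElement("entry")
    entry.appendChild(doc.createTextNode(label))
    frag.appendChild(entry)

held = frag.childNodes.length

# Appending the fragment moves its children into the tree;
# the fragment node itself is not inserted.
root.appendChild(frag)
result = root.toxml()
print(result)
```

In this implementation the fragment is empty after the append, since its children now live under the root element.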
This class holds all the information in the document type declaration at the beginning of the document, except the specifics about an external DTD. Thus, it names the root element and any declared entities or notations in the internal subset.
No specific methods are defined for this class, but the properties are public, though read-only.
The name of the root element.
A NamedNodeMap of entity declarations.
A NamedNodeMap of notation declarations.
The internal subset of the DTD represented as a string.
The public identifier of the DTD's external subset.
The system identifier of the DTD's external subset.
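A minimal sketch of reading these properties, assuming Python's standard xml.dom.minidom as a stand-in implementation. The property names name, publicId, and systemId come from the W3C DOM; the "memo" document and its identifiers are hypothetical, and the parser used here does not fetch the external DTD.

```python
from xml.dom import minidom

# A hypothetical document type declaration with public and system identifiers.
source = (
    '<!DOCTYPE memo PUBLIC "-//Example//DTD Memo//EN" "memo.dtd">'
    "<memo>hello</memo>"
)
doc = minidom.parseString(source)
doctype = doc.doctype

print(doctype.name)      # name of the root element
print(doctype.publicId)  # public identifier of the external subset
print(doctype.systemId)  # system identifier of the external subset
```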
All node types inherit from the class Node. Any properties or methods common to all node types can be accessed through this class. A few properties, such as the value of the node, are undefined for some node types, like Element. The generic methods of this class are useful in some programming contexts, such as when writing code that processes nodes of different types. At other times, you'll know in advance what type you're working with, and you should use the specific class's methods instead.
All properties but nodeValue and prefix are read-only.
A property that is defined for elements, attributes, and entities. In the context of elements, this property is the tag's name.
A property defined for attributes, text nodes, CDATA nodes, PIs, and comments.
One of the following types of nodes: Element, Attr, Text, CDATASection, EntityReference, Entity, ProcessingInstruction, Comment, Document, DocumentType, DocumentFragment, or Notation.
A reference to the parent of this node.
An ordered list of references to children of this node (if any).
References to the first and last of the node's children (if any).
The node immediately preceding or following this one, respectively.
An unordered list (NamedNodeMap) of nodes that are attributes of this one (if any).
A reference to the object containing the whole document -- useful when you need to generate a new node.
A namespace URI if this node has a namespace prefix; otherwise it is null.
The namespace prefix associated with this node.
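The generic properties above can be explored with a short sketch. It assumes Python's standard xml.dom.minidom as a stand-in implementation and uses the W3C property names (nodeName, nodeValue, nodeType, parentNode, childNodes, firstChild, lastChild, previousSibling, nextSibling, attributes, ownerDocument); the sample markup is invented.

```python
from xml.dom import minidom, Node

doc = minidom.parseString(
    '<book id="b1"><title>DOM</title><!--note--><p>text</p></book>'
)
book = doc.documentElement
title = book.firstChild          # first child of <book>
comment = title.nextSibling      # the node right after <title>
para = book.lastChild            # last child of <book>

# nodeName is the tag name for elements; nodeValue is undefined for them.
print(book.nodeName, book.nodeValue)

# Text and comment nodes carry their content in nodeValue.
print(title.firstChild.nodeValue, comment.nodeValue)

# nodeType distinguishes the kinds of node.
kinds = [child.nodeType for child in book.childNodes]

# Every node can reach its parent and the owning document.
same_parent = title.parentNode is book
same_doc = para.ownerDocument is doc
```

Code that handles mixed node types would typically branch on nodeType before touching type-specific properties.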
Inserts a node before a reference child node.
Swaps a child node with a new one you supply, giving you the old one in return.
Adds a new node to the end of this node's list of children.
Returns true if this node has children; otherwise, returns false.
Returns a copy of this node. It provides an alternate way to generate nodes. All properties will be identical except for parentNode, which will be undefined, and, for a shallow copy, childNodes, which will be empty. Cloned elements will have the same attributes as the original. If the argument deep is set to true, then the node and all its descendants will be copied.
Returns true if this node has defined attributes.
Returns true if this implementation supports a specific feature.
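The tree-editing methods above might be exercised as in the following sketch, which assumes Python's standard xml.dom.minidom as a stand-in implementation. The method names (insertBefore, replaceChild, appendChild, hasChildNodes, cloneNode, hasAttributes, isSupported) are the W3C DOM names; the list markup is made up.

```python
from xml.dom import minidom

doc = minidom.parseString('<ul class="x"><li>a</li><li>c</li></ul>')
ul = doc.documentElement
first, last = ul.firstChild, ul.lastChild

def make_li(label):
    # Helper for this example only: builds <li>label</li>.
    li = doc.createElement("li")
    li.appendChild(doc.createTextNode(label))
    return li

# insertBefore: place a new node ahead of the reference child.
ul.insertBefore(make_li("b"), last)

# replaceChild: swap in a new node and receive the old one back.
old = ul.replaceChild(make_li("z"), first)

# appendChild: the returned node is detached, so it can be added at the end.
ul.appendChild(old)

# cloneNode: a shallow copy keeps attributes but not children.
shallow = ul.cloneNode(False)
deep = ul.cloneNode(True)

print(ul.toxml())
print(shallow.hasChildNodes(), shallow.hasAttributes())
print(ul.isSupported("Core", "2.0"))
```

Neither clone has a parent until you insert it somewhere, which makes cloneNode a convenient way to stamp out copies of a template subtree.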
This class is a container for an ordered list of nodes. It is "live," meaning that any changes to the document are reflected immediately in the list.
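A small sketch of this liveness, assuming Python's standard xml.dom.minidom as a stand-in implementation; the "queue" and "job" names are invented. Note one caveat specific to this stand-in: its childNodes list tracks the tree, but the lists it returns from getElementsByTagName are static snapshots, so other implementations may behave differently.

```python
from xml.dom import minidom

doc = minidom.parseString("<queue><job/><job/></queue>")
queue = doc.documentElement

children = queue.childNodes   # a NodeList
before = children.length

# Change the document; the list obtained earlier sees the new node.
queue.appendChild(doc.createElement("job"))
after = children.length

third = children.item(2)      # access by position
beyond = children.item(5)     # out of range yields no node
print(before, after)
```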
This unordered set of nodes is designed to allow access to nodes by name. An alternate access by index is also provided for enumerations, but no order is implied.
Retrieves or adds a node using the node's nodeName property as the key.
Takes a node with the specified name out of the set and returns it.
Given an integer value n, returns a reference to the nth node in the set. Note that this method does not imply any order and is provided only for unique enumeration.
Retrieves a node based on a namespace-qualified name (a namespace URI and local name).
Takes a node out of the set and returns it, based on its namespace-qualified name.
Adds a node to the set using its namespace-qualified name.
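Name-based access might look like the following sketch, which assumes Python's standard xml.dom.minidom as a stand-in implementation. It uses the W3C names getNamedItem, setNamedItem, removeNamedItem, item, and length on an element's attribute map; the "img" markup is hypothetical.

```python
from xml.dom import minidom

doc = minidom.parseString('<img src="a.png" alt="logo"/>')
img = doc.documentElement
attrs = img.attributes        # a NamedNodeMap

# Look up a node by name.
src = attrs.getNamedItem("src")

# Enumerate by index; sort because no order is implied.
names = sorted(attrs.item(i).nodeName for i in range(attrs.length))

# Add a node; its nodeName serves as the key.
title = doc.createAttribute("title")
title.value = "Company logo"
attrs.setNamedItem(title)

# Remove a node by name and get it back.
removed = attrs.removeNamedItem("alt")
print(names, removed.value)
```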
This class extends Node to facilitate access to certain types of nodes that contain character data, such as Text, CDATASection, and Comment. Specific classes like Text inherit from this class.
Appends a string of character data to the end of the data property.
Extracts and returns a segment of the data property from offset to offset + count.
Inserts a string inside the data property at the location given by offset.
Sets the data property to an empty string.
Replaces the contents of the data property with a new string that you provide.
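The string-editing methods can be sketched as follows, assuming Python's standard xml.dom.minidom as a stand-in implementation. The method names (appendData, substringData, insertData, deleteData, replaceData) and their offset and count arguments follow the W3C DOM; in that interface, deleteData and replaceData operate on a range, so emptying the data means deleting the full range.

```python
from xml.dom import minidom

doc = minidom.parseString("<p>Hello</p>")
text = doc.documentElement.firstChild   # a Text node

text.appendData(", world")              # data is now "Hello, world"
segment = text.substringData(7, 5)      # 5 characters starting at offset 7
text.insertData(5, " there")            # insert at offset 5
text.deleteData(0, 6)                   # drop the first 6 characters
text.replaceData(0, 5, "THERE")         # overwrite a 5-character range
final = text.data

# Deleting the whole range leaves an empty string.
text.deleteData(0, text.length)
emptied = text.data
print(segment, final)
```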
This is the most common type of node you will encounter. An element can contain other nodes and can have attribute nodes.
Returns the value of an attribute, or a reference to the attribute node, with a given name.
Adds a new attribute to the element's list or replaces an existing attribute of the same name.
Returns the value of an attribute and removes it from the element's list.
Returns a NodeList of descendant elements that match a given name.
Collapses adjacent text nodes. You should use this method whenever you add new text nodes to ensure that the structure of the document remains the same, without erroneous extra children.
Retrieves an attribute value based on its qualified name (the namespace URI plus the local name).
Gets an attribute's node by using its qualified name.
Returns a NodeList of elements among this element's descendants that match a qualified name.
Returns true if this element has an attribute with a given name.
Returns true if this element has an attribute with a given qualified name.
Removes and returns an attribute node from this element's list, based on its namespace-qualified name.
Adds a new attribute to the element's list, given a namespace-qualified name and a value.
Adds a new attribute node to the element's list with a namespace-qualified name.
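The element methods above might be used as in this sketch, which assumes Python's standard xml.dom.minidom as a stand-in implementation. Method names are the W3C DOM ones (getAttribute, setAttribute, removeAttribute, hasAttribute, getElementsByTagName, normalize, and the NS variants); the namespace URI "urn:example:x" and the markup are invented.

```python
from xml.dom import minidom

NS = "urn:example:x"  # hypothetical namespace URI
doc = minidom.parseString(
    '<doc xmlns:x="urn:example:x">'
    '<x:sec x:level="1"><para>one</para></x:sec>'
    "<para>two</para></doc>"
)
root = doc.documentElement
sec = root.firstChild

# Plain attribute access by name.
root.setAttribute("lang", "en")
root.setAttribute("lang", "fr")          # replaces the existing value
lang = root.getAttribute("lang")
root.removeAttribute("lang")
still_there = root.hasAttribute("lang")

# Descendant search by tag name, from two different starting points.
all_paras = root.getElementsByTagName("para").length
sec_paras = sec.getElementsByTagName("para").length

# Namespace-aware variants take a namespace URI and a local name.
level = sec.getAttributeNS(NS, "level")
sec.setAttributeNS(NS, "x:kind", "intro")
has_kind = sec.hasAttributeNS(NS, "kind")
in_ns = [e.nodeName for e in root.getElementsByTagNameNS(NS, "*")]

# normalize collapses adjacent text nodes left behind by edits.
para = root.lastChild
para.appendChild(doc.createTextNode(" and"))
para.appendChild(doc.createTextNode(" three"))
pieces_before = para.childNodes.length
para.normalize()
pieces_after = para.childNodes.length
print(para.firstChild.data)
```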
The attribute's name.
If the program or the document explicitly set the attribute, this property is true. If it was set in the DTD as a default and not reset anywhere else, then it will be false.
The attribute's value, represented as a text node.
The element to which this attribute belongs.
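A brief sketch of these attribute properties, assuming Python's standard xml.dom.minidom as a stand-in implementation; name, value, and ownerElement are W3C DOM property names, and the link markup is invented. The specified property is left out here because demonstrating it needs DTD-supplied defaults.

```python
from xml.dom import minidom

doc = minidom.parseString('<a href="index.html">home</a>')
link = doc.documentElement

# Fetch the attribute node itself rather than just its value.
href = link.getAttributeNode("href")
name = href.name
original = href.value
owner_ok = href.ownerElement is link

# Changing the node's value changes what the element reports.
href.value = "about.html"
updated = link.getAttribute("href")
print(name, original, updated)
```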
Breaks the text node into two adjacent text nodes, each with part of the original text content. Content in the first node is from the beginning of the original up to, but not including, a character whose position is given by offset. The second node has the rest of the original node's content. This method is useful for inserting a new element inside a span of text.
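Wrapping part of a text run in a new element, as described above, could be sketched like this. The example assumes Python's standard xml.dom.minidom as a stand-in implementation; splitText is the W3C DOM method name, and the "p" and "em" elements are illustrative.

```python
from xml.dom import minidom

doc = minidom.parseString("<p>DOM is great</p>")
p = doc.documentElement
text = p.firstChild

# Split at offset 7: the original keeps "DOM is ", the new node gets "great".
tail = text.splitText(7)

# Move the second half into a new element, then place that element
# right after the first half.
em = doc.createElement("em")
em.appendChild(tail)
p.insertBefore(em, text.nextSibling)

result = p.toxml()
print(result)
```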
A CDATA section is like a text node, but its contents are protected from being parsed as markup. It may contain markup characters (<, &) that would be illegal in text nodes. Use generic Node methods to access data.
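The difference from an ordinary text node shows up when the tree is serialized. The following sketch assumes Python's standard xml.dom.minidom as a stand-in implementation; createCDATASection is the W3C DOM factory method, and the "code" element and snippet are made up.

```python
from xml.dom import minidom, Node

doc = minidom.parseString("<code/>")
code = doc.documentElement
snippet = "if (a < b && c) return;"

# Inside a CDATA section the markup characters are written as-is.
cdata = doc.createCDATASection(snippet)
code.appendChild(cdata)
as_cdata = code.toxml()

# In a plain text node the same characters must be escaped on output.
plain = doc.createElement("code")
plain.appendChild(doc.createTextNode(snippet))
as_text = plain.toxml()

print(as_cdata)
print(as_text)
```

Either way, the data property of the node returns the same unescaped string.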
This is a class representing comment nodes. Use the generic Node methods to access the data.
This is a reference to an entity defined by an Entity node. Sometimes the parser will be configured to resolve all entity references into their values for you. If that option is disabled, the parser should create this node. No explicit methods force resolution, but some operations on the node may have that side effect.
This class provides access to an entity in the document, based on information in an entity declaration in the DTD.
Copyright © 2002 O'Reilly & Associates. All rights reserved.