XML DOM Processing
Processing XML DOM with JavaScript
The XML DOM is used to process XML data: creating or modifying data from an in-memory copy of XML data. The DOM provides a hierarchy of Node objects, with the Document node sitting at the top. The structure may be represented as:
- Document Node
-
- Document Type Node
- Processing Instruction Node
- Comment Node
- Document Element Node
-
- Comment Node
- Processing Instruction Node
- Text Node
- CDATASection Node
- Element Node
- Element Node
- Comment Node
- Processing Instruction Node
- Text Node
- CDATASection Node
- Entity Node
- EntityReference Node
- Entity Node
-
- Element Node
- Comment Node
- Processing Instruction Node
- Text Node
- CDATASection Node
- EntityReference Node
- EntityReference Node
-
- Element Node
- Comment Node
- Processing Instruction Node
- Text Node
- CDATASection Node
- EntityReference Node
Attr Nodes are not considered part of the DOM hierarchy, but are said to be associated with an Element node. The text content of an Attr Node is contained in a Text Node.
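The hierarchy above can be explored with a short recursive walk. The following is a minimal sketch, not part of the original page: describeTree and NODE_TYPE_NAMES are hypothetical names, and the function assumes only that each node exposes the standard nodeType, nodeName and childNodes properties.

```javascript
// Names for the numeric nodeType values defined by the DOM specification.
var NODE_TYPE_NAMES = {
    1: "Element",
    2: "Attr",
    3: "Text",
    4: "CDATASection",
    5: "EntityReference",
    6: "Entity",
    7: "ProcessingInstruction",
    8: "Comment",
    9: "Document",
    10: "DocumentType",
    11: "DocumentFragment",
    12: "Notation"
};

// Hypothetical helper: returns an array with one line per node,
// indented by two spaces for each level of depth in the tree.
function describeTree(oNode, depth) {
    depth = depth || 0;
    var indent = new Array(depth + 1).join("  ");
    var typeName = NODE_TYPE_NAMES[oNode.nodeType] || "Unknown";
    var lines = [indent + typeName + ": " + oNode.nodeName];
    var children = oNode.childNodes || [];
    for (var i = 0; i < children.length; i++) {
        lines = lines.concat(describeTree(children[i], depth + 1));
    }
    return lines;
}
```

Joining the returned lines with "\n" gives a printable outline of a document. Attr nodes do not appear in it, because they are reached through an element's attributes property rather than through childNodes.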
The JavaScript in the sections below uses the zxml library developed by Nicholas C. Zakas to provide cross-browser support for XML processing.
The Document Element
The Document node interface exposes three properties:
- documentElement
- doctype
- implementation
...and 14 methods:
- createAttribute(name)
- createAttributeNS(namespaceURI, name)
- createCDATASection(data)
- createComment(data)
- createDocumentFragment()
- createElement(tagName)
- createElementNS(namespaceURI, tagName)
- createEntityReference(name)
- createProcessingInstruction(target,data)
- createTextNode(data)
- getElementById(ID)
- getElementsByTagName(tagname)
- getElementsByTagNameNS(namespaceURI, localName)
- importNode(importedNode, deep)
We'll see the create...() methods in later sections. This example uses the documentElement property of the XML document, together with getElementById() on the HTML page to reach the textareas. The nodeName property also features, but it is a property available on all nodes.
The documentElement property returns the Document Element Node object for an XML data file.
The form below consists of two textareas and four buttons. The first textarea is used as a container for our source XML data. The 'Show Source File' button calls the showSourceFile function with two parameters: the name of the source XML data file and the id value of the first textarea element. We use the zXmlHttp.createRequest() method to create an HTTP request for the XML source file, and collect the response in an object representing the XML DOM. We then use getElementById to access the first textarea element and set its value property to the serialized XML content of our XML DOM object.
The 'Get Document Element' button uses the documentElement property to get an object corresponding to the Document Element. We then read the name of the Document Element from its nodeName property, and display it in the second textarea element.
The 'Clear Source Field' and 'Clear Results Field' buttons use getElementById to set the value of their respective textareas to ''.
The script looks like this:
// Fetch an XML file synchronously and return its parsed DOM document.
function loadXMLSource(filename) {
    var oReq = zXmlHttp.createRequest();
    oReq.open("GET", filename, false);  // false = synchronous request
    oReq.send(null);
    return oReq.responseXML;
}

// Display the serialized XML source in the element with the given id.
function showSourceFile(filename, container) {
    var oDom = loadXMLSource(filename);
    var oInput = document.getElementById(container);
    oInput.value = oDom.xml;
}

// Append a line of text to the element with the given id.
function showResults(text, container) {
    var oResults = document.getElementById(container);
    oResults.value += text + "\n";
}

// Display the name of the document element of the XML file.
function getDocumentElement(filename, container) {
    var oDom = loadXMLSource(filename);
    showResults("Document Element: " + oDom.documentElement.nodeName, container);
}
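The script above sends a synchronous request, which blocks the page until the file arrives. As a hedged alternative, the sketch below shows an asynchronous variant. loadXMLSourceAsync is a hypothetical name, not part of the original script or of zxml. The request factory is passed in as a parameter, on the assumption that it returns an object with the usual XMLHttpRequest members (open, send, onreadystatechange, readyState, status, responseXML).

```javascript
// Hypothetical asynchronous variant of loadXMLSource.
// createRequest is a function returning an XMLHttpRequest-like object;
// on the page above, zXmlHttp.createRequest is assumed to be suitable.
// onLoaded is called with the XML DOM document, or with null on failure.
function loadXMLSourceAsync(filename, createRequest, onLoaded) {
    var oReq = createRequest();
    oReq.open("GET", filename, true);    // true = asynchronous request
    oReq.onreadystatechange = function () {
        if (oReq.readyState == 4) {      // 4 = request complete
            onLoaded(oReq.status == 200 ? oReq.responseXML : null);
        }
    };
    oReq.send(null);
}
```

With this variant, the code that displays the result moves into the callback, for example a function that receives oDom and passes oDom.documentElement.nodeName to showResults.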
Node Properties
Now that we can access Nodes using documentElement and getElementById, we can use various properties of node objects to further process our XML data. Commonly used node properties are:
- attributes
- childNodes
- firstChild
- lastChild
- nextSibling
- previousSibling
- parentNode
- nodeName
- nodeType
- nodeValue
- ownerDocument
- localName
- namespaceURI
- prefix
The following example uses childNodes, nodeName, nodeType, nodeValue and firstChild to display some properties of the Document Element node children.
Note that where we used childNodes, we only appear to be printing the odd-numbered child nodes (1, 3, 5, ...). This is because the even-numbered child nodes are text nodes holding the line breaks and indentation between elements. To avoid nasty surprises, before we try to access the nodeName or firstChild.nodeValue of any childNodes entry, we should test for nodeType == 1 (an element node).
The additional function used by the 'Show Child Nodes' button is:
function displayNodes(filename, container) {
    var oDom = loadXMLSource(filename);
    var sHeader = "Child\tGrandChild\tGreatGrandChild";
    showResults(sHeader, container);
    var commandElements = oDom.documentElement.childNodes;
    for (var i = 0; i < commandElements.length; i++) {
        // Whitespace text nodes have no children, so the inner loop skips them.
        var commandChildElements = commandElements[i].childNodes;
        for (var j = 0; j < commandChildElements.length; j++) {
            var oCommandChild = commandChildElements[j];
            if (oCommandChild.nodeType == 1) {  // element nodes only
                var oCommandChildText = oCommandChild.firstChild;
                // firstChild is null for an empty element
                var sValue = oCommandChildText ? oCommandChildText.nodeValue : "";
                var text = i + "\t(" + j + ")" + oCommandChild.nodeName + "\t\t" + sValue;
                showResults(text, container);
            }
        }
    }
}
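The nodeType test described above can be factored into small helpers so that it is not repeated in every loop. This is a sketch only: getElementChildren and getFirstChildText are hypothetical names, and both assume nothing beyond the standard childNodes, nodeType, firstChild and nodeValue properties.

```javascript
// Hypothetical helper: returns an array holding only the element
// children of a node, skipping whitespace text nodes, comments
// and every other node type.
function getElementChildren(oNode) {
    var ELEMENT_NODE = 1;
    var result = [];
    var children = oNode.childNodes;
    for (var i = 0; i < children.length; i++) {
        if (children[i].nodeType == ELEMENT_NODE) {
            result.push(children[i]);
        }
    }
    return result;
}

// Hypothetical helper: returns the nodeValue of an element's first
// child, or "" when the element has no children at all.
function getFirstChildText(oElement) {
    var oFirst = oElement.firstChild;
    return oFirst ? oFirst.nodeValue : "";
}
```

Looping over getElementChildren(oDom.documentElement) visits only elements, so the indexes run 0, 1, 2, ... rather than skipping over the whitespace nodes.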
Node Methods
We have used node properties to access the data in our XML source file. We can also use methods to alter the XML data. Commonly used node methods are:
- appendChild(newChild)
- insertBefore(newChild,currentChild)
- removeChild(oldChild)
- replaceChild(newChild,oldChild)
- cloneNode(deep)
- hasAttributes()
- hasChildNodes()
The following example uses appendChild to add a new node and removeChild to remove a node. We also rely on some methods that belong to the Document node:
- createElement(tagName)
- createTextNode(data)
- createAttribute(name)
And one method belonging to the attributes NamedNodeMap:
- setNamedItem(attrObject)
Note that when deleting a node we check that nodeType == 1 to ensure that we delete an element node and not a whitespace text node. The new functions used for the two new buttons are:
// Build a new <command> element and append it to the document element.
function addNode(filename, container) {
    var oDom = loadXMLSource(filename);
    var oNewCommand = oDom.createElement("command");
    var oNewName = oDom.createElement("name");
    var oNewDescription = oDom.createElement("description");
    var oNewNameText = oDom.createTextNode("pwd");
    var oNewDescriptionText = oDom.createTextNode("print name of current working directory");
    oNewDescription.appendChild(oNewDescriptionText);
    oNewName.appendChild(oNewNameText);
    oNewCommand.appendChild(oNewName);
    oNewCommand.appendChild(oNewDescription);
    oDom.documentElement.appendChild(oNewCommand);
    showResults(oDom.xml, container);
}

// Remove the first element child of the document element.
function deleteNode(filename, container) {
    var oDom = loadXMLSource(filename);
    var commands = oDom.documentElement.childNodes;
    for (var i = 0; i < commands.length; i++) {
        if (commands[i].nodeType == 1) {  // skip whitespace text nodes
            var oToDelete = commands[i];
            oToDelete.parentNode.removeChild(oToDelete);
            break;  // stop after the first element
        }
    }
    showResults(oDom.xml, container);
}
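The functions above do not show the createAttribute and setNamedItem calls listed earlier in this section. One way they might be combined is sketched below. addAttribute is a hypothetical helper, not part of the original script. It assumes the standard DOM behaviour that createAttribute returns an Attr node with a writable value property, and that the element's attributes property is a NamedNodeMap.

```javascript
// Hypothetical helper: creates an attribute on the given document,
// sets its value, and attaches it to the element through the
// element's attributes NamedNodeMap.
function addAttribute(oDom, oElement, name, value) {
    var oAttr = oDom.createAttribute(name);   // new, unattached Attr node
    oAttr.value = value;                      // set the attribute's text
    oElement.attributes.setNamedItem(oAttr);  // attach it to the element
    return oAttr;
}
```

Inside addNode, a call such as addAttribute(oDom, oNewCommand, "category", "filesystem") made before the final appendChild would add an attribute to the new command element. The attribute name and value here are illustrative only.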
Attribute Methods and Properties
The attributes property for a node object returns a NamedNodeMap object. A NamedNodeMap object has one property:
- length
...and seven methods:
- getNamedItem(name)
- getNamedItemNS(namespaceURI, localName)
- item(index)
- removeNamedItem(name)
- removeNamedItemNS(namespaceURI, localName)
- setNamedItem(attrObject)
- setNamedItemNS(attrObject)
We've already used setNamedItem in the Node Methods example. removeNamedItem() and getNamedItem() are similarly called by first creating an 'elementObject.attributes' NamedNodeMap:
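A minimal sketch of that pattern follows. getAttributeValue and removeAttributeIfPresent are hypothetical helper names. The sketch relies only on the NamedNodeMap methods listed above, and checks for the attribute before removing it because the DOM specification has removeNamedItem raise an exception when no attribute of that name exists.

```javascript
// Hypothetical helper: returns the value of the named attribute,
// or null when the element has no such attribute.
function getAttributeValue(oElement, name) {
    var oAttributes = oElement.attributes;       // NamedNodeMap
    var oAttr = oAttributes.getNamedItem(name);  // null if absent
    return oAttr ? oAttr.nodeValue : null;
}

// Hypothetical helper: removes the named attribute if it exists.
// Returns true when an attribute was removed, false otherwise.
function removeAttributeIfPresent(oElement, name) {
    var oAttributes = oElement.attributes;
    if (oAttributes.getNamedItem(name)) {
        oAttributes.removeNamedItem(name);
        return true;
    }
    return false;
}
```

Both helpers take the element itself, so the 'elementObject.attributes' lookup stays in one place.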
