XPath Version 1.0 Notes
Using XPath 1.0 to Select XML Data
The current specification can be found at XPath 2.0. Specifications for XPath 2.0 functions can be found at XPath 2.0 functions.
XPath is the XML Path Language, used to select parts of an XML instance document for processing by other XML technologies such as XSLT, XPointer, XQuery and XForms. XPath models an XML instance document as a tree of nodes, with axes used to navigate from one node to another. A number of functions are available for processing the contents of nodes, and predicates can be used to further filter the selected node-sets.
<!-- A single step location path -->
child::Phone[@type="Home"]
<!-- A two step location path -->
child::Person/child::Phone[@type="Home"]
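In XPath, the context includes the node currently being processed (the context node) together with its position in, and the size of, the node-set being processed (the context position and context size). Relative location paths are evaluated in relation to the current context node.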
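As a rough illustration of how the context node affects a relative location path, the sketch below uses Python's standard xml.etree.ElementTree module. That module supports only a limited subset of XPath's abbreviated syntax and does not accept explicit axis names such as child::, so the abbreviated forms are used. The sample document, including the People element and the phone numbers, is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; names and numbers are invented for illustration.
SAMPLE = """
<People>
  <Person name="Ann">
    <Phone type="Home">555-0100</Phone>
    <Phone type="Work">555-0101</Phone>
  </Person>
  <Person name="Bob">
    <Phone type="Home">555-0200</Phone>
  </Person>
</People>
"""

root = ET.fromstring(SAMPLE)

# Two-step relative path with <People> as the context node.
# Abbreviated form of child::Person/child::Phone[@type="Home"].
all_home = [p.text for p in root.findall("Person/Phone[@type='Home']")]

# Single-step relative path with the first <Person> as the context node.
# Abbreviated form of child::Phone[@type="Home"].
first_person = root.find("Person")
ann_home = [p.text for p in first_person.findall("Phone[@type='Home']")]

print(all_home)  # ['555-0100', '555-0200']
print(ann_home)  # ['555-0100']
```

The same predicate selects different nodes in the two calls only because the context node differs.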
XPath Nodes
In XPath version 1.0 there are seven types of nodes:
- Root Node
- An XML document has one and only one root element, sometimes called the document element. The root element is a child of the root node. The root node may also have comment nodes and processing instruction nodes as children. In XPath, the root node represents the document itself. The string-value of the root node is the concatenation of the string-values of all descendant text nodes, in document order.
- Element Node
- Each element in an XML instance document is represented as an element node. Each element node has an expanded name consisting of a namespace URI (which may be null) and the local part of its name; in the document the name is written as a QName, with the namespace prefix and the local part separated by a colon. The string-value of an element node is the concatenation of the string-values of all descendant text nodes, in document order.
- Attribute Node
- Each attribute of an element is represented as an attribute node. Although the element to which it belongs is its parent, the attribute node is not a child of that element. Attribute nodes therefore cannot be accessed via the 'child' axis of their parent element, and must be selected using the 'attribute' axis instead. The 'parent' axis can, however, be used to reach the parent element from an attribute node.
- Text Node
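- Character data in an element is represented as text nodes. Adjacent characters are grouped into a single text node, so a text node never has an immediately preceding or following text node sibling.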
- Namespace Node
- All in-scope namespaces of an element are represented as namespace nodes. For a namespace node, the name() function returns the namespace prefix, and the string-value of 'self::node()' (or '.') is the namespace URI.
- Comment Node
- Comment nodes represent comments in the XPath data model.
- Processing Instruction Node
- Processing instruction nodes represent processing instructions in the XPath data model.
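Two of the points above, that an element's string-value is the concatenation of its descendant text and that attributes are not children of their element, can be sketched with Python's standard xml.etree.ElementTree module. ElementTree's data model is not the XPath data model (it stores text in .text and .tail properties rather than as separate text nodes), so this is only an approximation. The sample fragment is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment, invented for illustration.
person = ET.fromstring('<Person id="p1">Ann <Nick>Annie</Nick> Smith</Person>')

# String-value of an element: all descendant text, concatenated in document order.
string_value = "".join(person.itertext())

# Iterating an element yields its child elements only;
# the id attribute does not appear among them.
child_names = [child.tag for child in person]

# Attributes are reached separately (XPath: attribute::id or @id).
person_id = person.get("id")

print(string_value)  # Ann Annie Smith
print(child_names)   # ['Nick']
print(person_id)     # p1
```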
XPath Axes
XPath axes are used to navigate the node tree of the XPath data model. There are 13 axes available in XPath version 1.0:
- child axis
- The default axis in XPath. Selects the immediate children of the context node. Because child is the default axis, location steps can be written as either 'child::itemname' or simply 'itemname'. 'child::*' or '*' selects all child nodes that have a name (that is, elements only) of the context node. To select child nodes of any type use 'child::node()' or simply 'node()'. To select only text node children use 'child::text()' or 'text()'.
- attribute axis
- Selects the attribute nodes associated with an element node. 'attribute::*' can be abbreviated to '@*'. To select a specific attribute use 'attribute::attname' or '@attname'.
- ancestor axis
- Selects the parent of the context node, the parent's parent, and so on, up to and including the root node.
- ancestor-or-self axis
- Returns all ancestor nodes plus the context node.
- descendant axis
- Returns the children of the context node, their children, and so on. Attribute and namespace nodes are never included.
- descendant-or-self axis
- Returns all descendant nodes plus the current context node.
- following axis
- Returns all nodes that come after the context node in document order, but excludes descendant, attribute and namespace nodes associated with the context node.
- following-sibling axis
- Returns all following nodes that share the same parent as the context node.
- namespace axis
- Returns all in-scope namespace nodes for the context node. The axis is empty unless the context node is an element.
- parent axis
- Returns the parent node for the context node.
- preceding axis
- Returns all nodes that come before the context node in document order, excluding ancestor, attribute and namespace nodes.
- preceding-sibling axis
- Returns all preceding nodes that share the same parent as the context node.
- self axis
- Returns the context node. Can be specified as 'self::node()' or simply '.'.
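To make a few of these axes concrete, the following sketch emulates the ancestor, following-sibling, preceding-sibling and descendant-or-self axes in plain Python on top of xml.etree.ElementTree. It is not an XPath engine: it handles element nodes only, and the ancestor walk stops at the document element because ElementTree has no root node above it. The helper names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for illustration.
root = ET.fromstring("<a><b><c/><d/><e/></b><f/></a>")

# ElementTree elements have no parent pointer, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}

def ancestors(node):
    """ancestor axis (elements only): parent, grandparent, ... nearest first."""
    result = []
    while node in parent_of:
        node = parent_of[node]
        result.append(node)
    return result

def following_siblings(node):
    """following-sibling axis: later children of the same parent."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[siblings.index(node) + 1:]

def preceding_siblings(node):
    """preceding-sibling axis: earlier children of the same parent, in document order."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[:siblings.index(node)]

def tags(nodes):
    """Element names of a sequence of nodes, for easy printing."""
    return [n.tag for n in nodes]

d = root.find("b/d")
print(tags(ancestors(d)))              # ['b', 'a']
print(tags(following_siblings(d)))     # ['e']
print(tags(preceding_siblings(d)))     # ['c']
print(tags(root.find("b").iter()))     # descendant-or-self of <b>: ['b', 'c', 'd', 'e']
```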
Functions
A built-in function library is defined as part of the XPath specification. Its functions can be used in predicates to add further filtering to an XPath expression.
Boolean Functions
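- boolean()
- Converts its argument to a boolean: true for a non-zero, non-NaN number, a non-empty node-set or a non-empty string, otherwise false
- false()
- Returns false
- lang()
- Returns true if the language of the context node, as specified by xml:lang attributes, is the same as or a sublanguage of the string argument
- not()
- Returns the opposite boolean value of its argument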
- true()
- Returns true
Node-Set Functions
- count()
- Returns number of nodes in node-set
- id()
- Returns the node-set of elements whose unique ID (an attribute declared as type ID in the DTD) equals its argument
- last()
- Returns context size
- local-name()
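- Returns the local part of the name of the first node (in document order) in the node-set argument, or of the context node if no argument is given
- name()
- Returns the qualified name of the first node in the node-set argument, or of the context node if no argument is given, in prefix:localpart format
- namespace-uri()
- Returns the namespace URI of the first node in the node-set argument, or of the context node if no argument is given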
- position()
- Returns value equal to context position
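The context position and context size are most often seen in positional predicates. The sketch below uses Python's standard xml.etree.ElementTree module, which accepts the abbreviated predicates [1], [last()] and [last()-1] but has no count() function, so len() on the result list stands in for it. The sample document is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for illustration.
root = ET.fromstring(
    "<People>"
    "<Person>Ann</Person><Person>Bob</Person><Person>Cy</Person>"
    "</People>"
)

people = root.findall("Person")
count = len(people)                               # stands in for count(Person)
first = root.find("Person[1]").text               # Person[position() = 1]
last = root.find("Person[last()]").text           # Person[position() = last()]
second_last = root.find("Person[last()-1]").text  # Person[position() = last() - 1]

print(count, first, last, second_last)  # 3 Ann Cy Bob
```

Note that positions are counted from 1, not from 0.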
Numeric Functions
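- ceiling()
- Returns the smallest integer that is not less than the numeric argument
- floor()
- Returns the largest integer that is not greater than the numeric argument
- number()
- Returns the numeric value of its argument, or NaN if the argument cannot be converted to a number
- round()
- Rounds its argument to the nearest integer; a value exactly halfway between two integers is rounded toward positive infinity
- sum()
- Returns the sum of the values obtained by converting the string-value of each node in its node-set argument to a number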
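The rounding rules are easy to get wrong for negative numbers and for ties, so here is a minimal Python sketch of floor(), ceiling() and round() as XPath 1.0 defines them. The function names are hypothetical, and the sketch ignores the distinction between positive and negative zero. Python's own round() is shown for contrast because it rounds ties to the nearest even integer instead.

```python
import math

def _is_special(x: float) -> bool:
    """NaN and the infinities are returned unchanged by all three functions."""
    return math.isnan(x) or math.isinf(x)

def xpath_floor(x: float) -> float:
    """Largest integer that is not greater than x."""
    return x if _is_special(x) else float(math.floor(x))

def xpath_ceiling(x: float) -> float:
    """Smallest integer that is not less than x."""
    return x if _is_special(x) else float(math.ceil(x))

def xpath_round(x: float) -> float:
    """Nearest integer; exact ties go toward positive infinity."""
    return x if _is_special(x) else float(math.floor(x + 0.5))

print(xpath_floor(-2.5))    # -3.0
print(xpath_ceiling(-2.5))  # -2.0
print(xpath_round(2.5))     # 3.0
print(xpath_round(-2.5))    # -2.0
print(round(2.5))           # 2 (Python rounds ties to even, unlike XPath)
```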
String Functions
- concat()
- Returns concatenation of string arguments
- contains()
- Returns true if first string argument contains the second string argument
- normalize-space()
- Strips leading and trailing space and replaces consecutive whitespace with a single space character
- starts-with()
- Returns true if the first argument string starts with the second argument string
- string()
- Returns string value of argument
- string-length()
- Returns length of string argument
- substring()
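- Returns the substring of the first argument starting at the position given by the second argument (the first character is position 1), with the length given by the optional third argument; if the third argument is omitted, the substring runs to the end of the string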
- substring-after()
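- Returns the part of the first argument string that follows the first occurrence of the second argument string, or the empty string if it does not occur
- substring-before()
- Returns the part of the first argument string that precedes the first occurrence of the second argument string, or the empty string if it does not occur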
- translate()
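- Returns the first string argument with each character that appears in the second argument replaced by the character at the corresponding position in the third argument; characters with no counterpart in the third argument are removed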
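The behaviour of substring(), substring-before(), substring-after() and translate() differs from the usual string slicing in most programming languages: positions start at 1 and the third argument of substring() is a length. The following Python sketch emulates those four functions for ordinary finite arguments. The function names are hypothetical and NaN or infinite arguments are not handled.

```python
import math
from typing import Optional

def xpath_round(x: float) -> int:
    """XPath-style rounding for finite numbers: ties go toward positive infinity."""
    return math.floor(x + 0.5)

def xpath_substring(s: str, start: float, length: Optional[float] = None) -> str:
    """Keep characters whose 1-based position p satisfies
    round(start) <= p < round(start) + round(length)."""
    first = xpath_round(start)
    end = len(s) + 1 if length is None else first + xpath_round(length)
    return "".join(ch for pos, ch in enumerate(s, start=1) if first <= pos < end)

def xpath_substring_before(s: str, sub: str) -> str:
    """Text before the first occurrence of sub, or "" if sub does not occur."""
    idx = s.find(sub)
    return "" if idx < 0 else s[:idx]

def xpath_substring_after(s: str, sub: str) -> str:
    """Text after the first occurrence of sub, or "" if sub does not occur."""
    idx = s.find(sub)
    return "" if idx < 0 else s[idx + len(sub):]

def xpath_translate(s: str, src: str, dst: str) -> str:
    """Map each character of src to the character at the same position in dst."""
    table = {}
    for i, ch in enumerate(src):
        # The first occurrence of a character in src wins; None deletes it.
        table.setdefault(ord(ch), dst[i] if i < len(dst) else None)
    return s.translate(table)

print(xpath_substring("12345", 2, 3))             # 234
print(xpath_substring("12345", 2))                # 2345
print(xpath_substring_before("1999/04/01", "/"))  # 1999
print(xpath_substring_after("1999/04/01", "/"))   # 04/01
print(xpath_translate("--aaa--", "abc-", "ABC"))  # AAA
```

In the last call the hyphen has no counterpart in the third argument, so it is removed from the result.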
