
Web Design and XML

XPath

expressions; ancestors, descendants and siblings; functions

XPath addressing

In one of the first classes we saw the model of granularity:

XPath is a so-called scheme that uses the sequential and hierarchical context of elements. It uses expressions (strings that contain meaningful symbols) to select specific elements from an XML file.

Chapter/Paragr

This expression will select the Paragr element that is a direct child of a Chapter element. The / character separates the steps.

There are two ways to write the expression: the abbreviated form is shorter, but the verbose form is clearer:

Chapter/Paragr
child::Chapter/child::Paragr

Suppose we want to select the title elements from both the introduction and the chapters. We can do so with two expressions, or in one line with a wildcard:

book/intro/title
book/chap/title
book/*/title
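The wildcard can be tried out with a few lines of Python. This is only a sketch: the miniature document below is invented for illustration, and Python's xml.etree.ElementTree supports just a subset of XPath. Because the path is evaluated with the book element as the context node, the leading book/ step is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document, invented for this sketch.
DOC = """
<book>
  <title>Book title</title>
  <intro><title>Introduction</title></intro>
  <chap><title>First chapter</title></chap>
  <chap><title>Second chapter</title></chap>
</book>
"""

book = ET.fromstring(DOC)  # the context node is the <book> element

# Two separate expressions, one per parent element name.
intro_titles = [t.text for t in book.findall("intro/title")]
chap_titles = [t.text for t in book.findall("chap/title")]

# One expression: * stands for any child element of <book>.
wildcard_titles = [t.text for t in book.findall("*/title")]

print(intro_titles + chap_titles)
print(wildcard_titles)
```

Both print statements show the same three titles. The title of the book itself is not selected, because it is a direct child of book and the expression asks for a title one level deeper.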

In general we have two kinds of expressions: absolute paths, which start at a fixed reference point (the root), and relative paths, which start at a variable point (the context node).

/ the root node, which contains the root element
* any element (written as @*, any attribute)
@ the attribute axis (attribute::), for example @id
node() any node; on the child axis this never matches the root or attributes
text() any text node
comment() any comment node
processing-instruction() any processing instruction
. the context node (self::node())
/* the document element
.. the parent of the context node (parent::node())
//element any element with that name descending from the root
.//element any element with that name descending from the context node

With these paths we can navigate through our document. They are used in XSL(T), which we deal with in the next class.
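Some of these abbreviations can be tried out in Python. The sketch below assumes a small invented document and uses xml.etree.ElementTree, which understands only part of XPath (for example ., .., * and .//, but not the verbose axis names).

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for this sketch.
DOC = """
<book>
  <title>Book title</title>
  <chapter id="1">
    <title>First chapter</title>
    <para>Some text</para>
  </chapter>
</book>
"""

book = ET.fromstring(DOC)  # context node: the <book> element

# .  : the context node itself
self_node = book.find(".")

# *  : every child element of the context node
child_names = [e.tag for e in book.findall("*")]

# .//title : every title below the context node, at any depth
all_titles = [t.text for t in book.findall(".//title")]

# .. : one step up; here, the parents of all para elements
para_parents = [e.tag for e in book.findall(".//para/..")]

print(self_node.tag)   # book
print(child_names)     # ['title', 'chapter']
print(all_titles)      # ['Book title', 'First chapter']
print(para_parents)    # ['chapter']
```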

Consider this small XML file:

<?xml version="1.0"?>
<quotelist> 
  <quote id="1">  
    <body xml:lang="nl-nl">    
    Windows XP Home Edition is dus de eerste windows-versie voor     
    thuisgebruik die is gebaseerd op de technologie van windows NT en 2000  
    </body> 
  </quote>  
  <quote id="2">  
    <body xml:lang="en-uk">    
    When you refer to a specific form object, such as document.simple.stuff.value,    
    it takes a lot of typing to access that last little element  
    </body> 
  </quote>  
  <quote id="3">  
    <body>    
    In general we have two kinds of expressions: absolute paths and relative paths      
    </body> 
  </quote>
</quotelist> 

The element quotelist is the root element.
With quotelist as the context node, we can select all the body elements with:
quote//body
child::quote/descendant-or-self::node()/child::body

To select only the quote with id="2":
id("2") is not restricted to quote elements: it selects whichever element carries that unique ID, and it only works if a DTD declares the attribute as type ID
quote[ @id='2' ]
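The attribute predicate can be checked with a short Python sketch. It assumes the quote list from above (with shortened texts) and uses xml.etree.ElementTree, which supports predicates of the form [@attribute='value'] but not the id() function. The paths are evaluated with quotelist as the context node.

```python
import xml.etree.ElementTree as ET

# The quote list from above, with the quote texts shortened.
DOC = """
<quotelist>
  <quote id="1">
    <body xml:lang="nl-nl">Windows XP Home Edition is dus de eerste windows-versie</body>
  </quote>
  <quote id="2">
    <body xml:lang="en-uk">When you refer to a specific form object</body>
  </quote>
  <quote id="3">
    <body>In general we have two kinds of expressions</body>
  </quote>
</quotelist>
"""

quotelist = ET.fromstring(DOC)  # context node: the <quotelist> element

# quote//body : all body elements below any quote child
bodies = quotelist.findall("quote//body")

# quote[@id='2'] : only the quote whose id attribute equals "2"
second = quotelist.findall("quote[@id='2']")

print(len(bodies))                       # 3
print([q.get("id") for q in second])     # ['2']
print(second[0].find("body").text)
```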

To select only the quote bodies that are in Dutch (lang("nl") also matches sublanguages such as nl-nl):
quote//body[ lang("nl") ]
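Python's xml.etree.ElementTree has no lang() function, so the sketch below imitates its behaviour by hand instead of evaluating the XPath expression. The helper names lang_matches and bodies_in_language are invented for this illustration; the matching rule (same language or a sublanguage, ignoring case, with xml:lang inherited by descendants) follows the XPath 1.0 description of lang().

```python
import xml.etree.ElementTree as ET

# ElementTree stores xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The quote list from above, with the quote texts shortened.
DOC = """
<quotelist>
  <quote id="1">
    <body xml:lang="nl-nl">Windows XP Home Edition is dus de eerste windows-versie</body>
  </quote>
  <quote id="2">
    <body xml:lang="en-uk">When you refer to a specific form object</body>
  </quote>
  <quote id="3">
    <body>In general we have two kinds of expressions</body>
  </quote>
</quotelist>
"""


def lang_matches(actual, wanted):
    """True if actual equals wanted or is a sublanguage of it (nl-nl for nl)."""
    actual, wanted = actual.lower(), wanted.lower()
    return actual == wanted or actual.startswith(wanted + "-")


def bodies_in_language(elem, wanted, inherited=""):
    """Collect body elements whose own or inherited xml:lang matches wanted."""
    current = elem.get(XML_LANG, inherited)
    found = []
    if elem.tag == "body" and lang_matches(current, wanted):
        found.append(elem)
    for child in elem:
        found.extend(bodies_in_language(child, wanted, current))
    return found


quotelist = ET.fromstring(DOC)
dutch = bodies_in_language(quotelist, "nl")
english = bodies_in_language(quotelist, "en")

print(len(dutch), len(english))  # 1 1
```

The third body has no xml:lang attribute and inherits none, so it matches neither language.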

When the xml source gets more complex we will really notice how useful XPath is.

<?xml version="1.0"?>
 <book>
  <title>Book title</title>
  <intro>
   <title>Introduction</title>
   <para>
  this is a first paragraph in the introduction of this book
   </para>
   <para>
  this is the second paragraph
   </para>
  </intro>
   
  <chapter id="1">
   <title>First chapter</title>
   <para>
  <title>sub-title for this paragraph</title>
  First paragraph
   </para>
  </chapter>
   
  <chapter id="2">
   <title>Second chapter</title>
   <para>
  First paragraph
   </para>
  </chapter>
 </book>

We might expect child::title to return all titles from the whole document. It does not: Bradley (p. 147) stresses that this expression selects only those children of the current element that have the name title.

Starting from the root, book/title gives us only the title of the book, but book//title gives us all titles.
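How much the context node matters can be seen in a short Python sketch. It assumes a condensed copy of the book document and uses xml.etree.ElementTree, which accepts only the abbreviated forms, so title stands in for child::title. The paths are evaluated first with book and then with the first chapter as the context node.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the book document above.
DOC = """
<book>
  <title>Book title</title>
  <intro>
    <title>Introduction</title>
    <para>this is a first paragraph in the introduction of this book</para>
  </intro>
  <chapter id="1">
    <title>First chapter</title>
    <para><title>sub-title for this paragraph</title>First paragraph</para>
  </chapter>
  <chapter id="2">
    <title>Second chapter</title>
    <para>First paragraph</para>
  </chapter>
</book>
"""

book = ET.fromstring(DOC)

# title (child::title) with <book> as context: direct children only.
book_children = [t.text for t in book.findall("title")]

# .//title with <book> as context: titles at any depth.
all_titles = [t.text for t in book.findall(".//title")]

# The same relative path gives another result from another context node.
first_chapter = book.find("chapter[@id='1']")
chapter_children = [t.text for t in first_chapter.findall("title")]

print(book_children)     # ['Book title']
print(len(all_titles))   # 5
print(chapter_children)  # ['First chapter']
```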

To find the title elements that share the same parent as the context element, we could say ../title, or in the verbose form parent::node()/child::title. What would this select if our current element is chapter/title?

following-sibling::para[1] selects the first para element that follows the current element on the same level.
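Python's xml.etree.ElementTree does not support the following-sibling axis, so the sketch below imitates following-sibling::para[1] by hand rather than evaluating the expression. The function name first_following_sibling is invented for this illustration, and the document is the introduction from the book above.

```python
import xml.etree.ElementTree as ET

# The introduction from the book document above.
DOC = """
<intro>
  <title>Introduction</title>
  <para>this is a first paragraph in the introduction of this book</para>
  <para>this is the second paragraph</para>
</intro>
"""


def first_following_sibling(root, context, tag):
    """Return the first sibling after context that is named tag, or None."""
    # ElementTree elements do not know their parent, so build a lookup table.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    parent = parent_of.get(context)
    if parent is None:  # the root element has no siblings
        return None
    siblings = list(parent)
    position = siblings.index(context)
    for candidate in siblings[position + 1:]:
        if candidate.tag == tag:
            return candidate
    return None


intro = ET.fromstring(DOC)
title = intro.find("title")
paras = intro.findall("para")

after_title = first_following_sibling(intro, title, "para")
after_first = first_following_sibling(intro, paras[0], "para")
after_last = first_following_sibling(intro, paras[1], "para")

print(after_title.text)
print(after_first.text)
print(after_last)  # None
```

From the title, the first following para is the first paragraph; from the first paragraph it is the second; the second paragraph has no following sibling, so nothing is selected.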


This page was created on March 5, 2003 by Elwin Koster and was last updated on July 27, 2003. All rights reserved.