XQuery/XQuery and Python

      Over at [1] Cameron Laird has an example of Python code to extract and list the a tags on an XHTML page:

          import elementtree.ElementTree
          
          for element in  elementtree.ElementTree.parse("draft2.xml").findall("//a"):
              if element.tag == "a":
                  attributes = element.attrib
                  if "href" in attributes:
                      print "'%s' is at URL '%s'." % (element.text,
                                                      attributes['href'])
                  if "name" in attributes:
                      print "'%s' anchors '%s'." % (element.text,
                                                      attributes['name'])
              
      

      The equivalent to this Python code in XQuery is:

       for $a in doc("http://en.wikipedia.org/wiki/XQuery")//*:a
       return   
         if ($a/@href)
         then concat("'", $a,"'  is at  URL '",$a/@href,"'
")
         else if ($a/@name)
         then concat("'", $a,"'  anchors '",$a/@name,"'
")
         else ()
              
      

      tags in XQuery on Wikipedia (view source)

      Here the namespace prefix is a wild-card since we don't know what the html namespace might be.

      More succinctly but less readably (and with quotes omitted from the output for clarity), this could be expressed as:

       string-join(
            doc("http://en.wikipedia.org/wiki/XQuery")//*:a/
              (if (@href)
              then concat(.,"  is at  URL ",@href)
              else if (@name)
              then concat(.," anchors ", @name)
              else ()
              )
               ,'
'
           )
      

      tags in XQuery on Wikipedia (view source)

      More usefully, we might supply the url of any XHTML page as a parameter and generate an HTML page of external links:

      declare option exist:serialize "method=xhtml media-type=text/html";
      
      let $url :=request:get-parameter("url",())
      return
        <html>
            <h1>External links in {$url}</h1>
             { 
              for $a in doc($url)//*:a[text()][starts-with(@href,'http://')]
              return 
                     <div><b>{string($a)}</b> is at  <a href="{$a/@href}"><i>{string($a/@href)}</i> </a></div>
             }
        </html>
      

      XQuery on Wikipedia

      Last modified on 15 April 2013, at 12:37