XQuery/Wiki weapons page
On his blog, Matt Turner uses MarkLogic to extract a list of medieval weapons from the wiki page, as the first step in enriching the texts of Shakespeare's plays.
Here's another attempt at this task, using only standard XQuery functions. Again, we are fortunate that wiki pages are well-formed XML.
declare namespace h = "http://www.w3.org/1999/xhtml";

let $url := "http://en.wikipedia.org/wiki/List_of_medieval_weapons"
let $wikipage := doc($url)
return
    string-join(
        $wikipage//h:div[@id="bodyContent"]//h:li[h:a/@title][empty(h:ul)]/h:a,
        ','
    )
The complex path is there to ensure that only the relevant li elements are selected, and that only the terminals in a hierarchy of terms are included: hence the check that the li has no ul child.
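
To see what each predicate contributes, the same path can be tried on a small inline document. This is only a sketch: the $sample markup below is a made-up stand-in for the wiki page, not its real structure, and the weapon entries are chosen purely for illustration.

```xquery
declare namespace h = "http://www.w3.org/1999/xhtml";

(: Hypothetical sample standing in for the fetched wiki page :)
let $sample :=
  <html xmlns="http://www.w3.org/1999/xhtml">
    <body>
      <div id="bodyContent">
        <ul>
          <!-- has a nested ul, so it is a category and is skipped -->
          <li><a href="/wiki/Sword" title="Sword">Sword</a>
            <ul>
              <li><a href="/wiki/Claymore" title="Claymore">Claymore</a></li>
              <li><a href="/wiki/Falchion" title="Falchion">Falchion</a></li>
            </ul>
          </li>
          <!-- terminal entry with a titled link, so it is kept -->
          <li><a href="/wiki/Mace" title="Mace">Mace</a></li>
          <!-- no titled link, so it is skipped -->
          <li>See also the armour list</li>
        </ul>
      </div>
    </body>
  </html>
return
  string-join(
    $sample//h:div[@id="bodyContent"]//h:li[h:a/@title][empty(h:ul)]/h:a,
    ","
  )
```

With this sample the result is Claymore,Falchion,Mace. The Sword item is dropped by [empty(h:ul)] because it has a nested list, and the last item is dropped by [h:a/@title] because it contains no titled link.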