XQuery/BBC Weather Forecast

BBC Weather forecasts

Some weather data is available from the BBC as RSS feeds. At present these cover the current conditions and the 3-day forecast. Because there is no standard set of tags for weather properties, the conditions are expressed as a string, and string parsing is needed to extract the individual values.

Other forecasts, such as the 24-hour and 5-day forecasts, are not available as RSS, so we must scrape the HTML page.

One approach to this task is this Yahoo Pipe, which converts the page to an RSS feed. However, the data would be more useful converted to XML elements.

Dates and times

In all these pages and feeds it is difficult to assign a date to a forecast or observation. Dates are often omitted or expressed only as a day of the week. This complicates the processing of both the RSS feeds and the HTML pages.
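As a minimal sketch of one way to handle this, a day-of-week abbreviation can be resolved to the first matching date on or after a reference date. The names $days, local:dow-index and local:resolve-date are illustrative assumptions, not part of any BBC feed or eXist module, and the sketch assumes three-letter English abbreviations.

```xquery
xquery version "1.0";

(: Illustrative sketch: the names below are assumptions, not an existing API. :)
declare variable $days := ('Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat');

(: Zero-based day of week: 0 = Sunday ... 6 = Saturday.
   1901-01-06 was a Sunday, so whole days since then mod 7 gives the index. :)
declare function local:dow-index($date as xs:date) as xs:integer {
    xs:integer(($date - xs:date('1901-01-06')) div xs:dayTimeDuration('P1D')) mod 7
};

(: First date on or after $ref that falls on the day named by $dow.
   Returns the empty sequence if $dow is not a recognised abbreviation. :)
declare function local:resolve-date($dow as xs:string, $ref as xs:date) as xs:date? {
    let $target := index-of($days, $dow) - 1
    return
        if (empty($target)) then ()
        else $ref + xs:dayTimeDuration('P1D')
                    * (($target - local:dow-index($ref) + 7) mod 7)
};

(: 2024-01-01 was a Monday, so 'Wed' resolves to 2024-01-03 :)
local:resolve-date('Wed', xs:date('2024-01-01'))
```

This only works when the named day is known to lie within the coming week, which holds for short-range forecasts but not for past observations.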

24-hour forecast

This script uses the eXist module httpclient to get the HTML, parses the HTML and generates an XML file. This XML could then be transformed via XSLT to a viewable page.


This script has two parameters:

  • region - required - a numeric code unique to the BBC (? code list)
  • area - optional - a sub-region, typically the beginning of the postcode

(: eXist extension modules used by this script :)
import module namespace request = "http://exist-db.org/xquery/request";
import module namespace httpclient = "http://exist-db.org/xquery/httpclient";

declare namespace h = "http://www.w3.org/1999/xhtml";

(: Three-letter abbreviation for the day of the week of $date.
   1901-01-06 was a Sunday; sequences are 1-based, hence the + 1. :)
declare function local:day-of-week($date) {
    ('Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat')
       [ xs:integer(($date - xs:date('1901-01-06'))
          div xs:dayTimeDuration('P1D')) mod 7 + 1 ]
};

let $area := request:get-parameter("area", ())
let $region := request:get-parameter("region", "2")
(: the ampersand must be escaped inside an XQuery string literal :)
let $url := concat("http://news.bbc.co.uk/weather/forecast/", $region, "?state=fo:B",
                   if (exists($area)) then concat("&amp;area=", $area) else ())
let $doc := httpclient:get(xs:anyURI($url), false(), ())
let $currentDate := current-date()
let $currentTime := current-time()
let $dow := local:day-of-week($currentDate)
return
  element forecasts {
    element region {$region},
    if (exists($area)) then element area {$area} else (),
    element source {"BBC"},
    for $row in $doc/httpclient:body//h:table/h:tbody/h:tr
    let $raw-time := normalize-space($row/h:td[1])
    (: the first cell holds the time, optionally followed by a day name in parentheses :)
    let $time := if (contains($raw-time, " ")) then substring-before($raw-time, " ") else $raw-time
    let $time := xs:time(concat($time, ":00"))
    let $pdow := if (contains($raw-time, "(")) then substring-before(substring-after($raw-time, "("), ")") else $dow
    (: a day name different from today's is taken to mean tomorrow :)
    let $date := if ($pdow ne $dow) then $currentDate + xs:dayTimeDuration("P1D") else $currentDate
    return
      element forecast {
         element date {$date},
         element time {$time},
         element dow {$pdow},
         element summary {string($row/h:td[2]//h:p[@class="sum"])},
         element imageurl {string($row/h:td[2]//h:div[@class="summary"]//h:img/@src)},
         element maxTemp {attribute units {"degc"}, $row/h:td[3]//h:span[@class="cent"]/text()},
         element maxTemp {attribute units {"degf"}, $row/h:td[3]//h:span[contains(@class,"fahr")]/text()},
         element windDirection {string($row/h:td[4]//h:span[contains(@class,"wind")]/@title)},
         element windSpeed {attribute units {"mph"}, substring-before($row/h:td[4]//h:span[contains(@class,"mph")], "mph")},
         element windSpeed {attribute units {"kph"}, substring-before($row/h:td[4]//h:span[contains(@class,"kph")], "km/h")},
         element humidity {attribute units {"%"}, normalize-space(substring-before($row/h:td[5]//h:span[contains(@class,"hum")], "%"))},
         element pressure {attribute units {"mb"}, normalize-space(substring-before($row/h:td[5]//h:span[@class="pres"], "mB"))},
         element visibility {normalize-space($row/h:td[5]//h:span[contains(@class,"vis")])}
      }
  }

24 hour forecast for Bristol