XQuery/Using Intermediate Documents
Processing XML often involves the creation of intermediate XML fragments for subsequent processing. Here is an example of two approaches, one using multiple passes on the same data, the other a constructed intermediate view of the data.
MusicXML
MusicXML is an XML application for recording music scores. There is a range of software which produces and consumes MusicXML.
There are two styles of MusicXML with two related schemas, one in which measures are within parts (partwise), the other in which parts are within measures (timewise).
An example of a MusicXML partwise score is Mozart's Piano Sonata in A Major, K. 331.
Here is a sample definition of a note:
<note>
  <pitch>
    <step>A</step>
    <octave>3</octave>
  </pitch>
  <duration>2</duration>
  <voice>3</voice>
  <type>eighth</type>
  <stem>down</stem>
  <staff>2</staff>
  <beam number="1">begin</beam>
  <notations>
    <slur type="stop" number="1"/>
  </notations>
</note>
Notes Range
The Recordare site has some sample code to demonstrate the use of XQuery to process MusicXML [1]. The first script finds the lowest and highest notes in the score. The script shown on the site does not conform to the current XQuery standard, but a few minor changes bring it up to date.
(: Convert a MusicXML pitch to its MIDI note number (middle C, C4, is 60). :)
declare function local:MidiNote($thispitch as element(pitch) ) as xs:integer
{
  let $step := $thispitch/step
  (: alter is optional: treat a missing alter as 0 semitones :)
  let $alter :=
    if (empty($thispitch/alter)) then 0
    else xs:integer($thispitch/alter)
  let $octave := xs:integer($thispitch/octave)
  (: semitone offset of the step above C :)
  let $pitchstep :=
    if ($step = "C") then 0
    else if ($step = "D") then 2
    else if ($step = "E") then 4
    else if ($step = "F") then 5
    else if ($step = "G") then 7
    else if ($step = "A") then 9
    else if ($step = "B") then 11
    else 0
  return 12 * ($octave + 1) + $pitchstep + $alter
} ;

let $doc := doc("/db/Wiki/Music/examples/MozartPianoSonata.xml")
let $part := $doc//part[./@id = "P1"]
(: first pass: find the extreme MIDI values :)
let $highnote := max(for $pitch in $part//pitch return local:MidiNote($pitch))
let $lownote := min(for $pitch in $part//pitch return local:MidiNote($pitch))
(: second pass: find the pitches that have those values :)
let $highpitch := $part//pitch[local:MidiNote(.) = $highnote]
let $lowpitch := $part//pitch[local:MidiNote(.) = $lownote]
let $highmeas := string($highpitch[1]/../../@number)
let $lowmeas := string($lowpitch[1]/../../@number)
return
  <result>
    <low-note>{$lowpitch[1]}
      <measure>{$lowmeas}</measure>
    </low-note>
    <high-note>{$highpitch[1]}
      <measure>{$highmeas}</measure>
    </high-note>
  </result>
With output:
<result>
  <low-note>
    <pitch>
      <step>D</step>
      <octave>2</octave>
    </pitch>
    <measure>3</measure>
  </low-note>
  <high-note>
    <pitch>
      <step>E</step>
      <octave>6</octave>
    </pitch>
    <measure>5</measure>
  </high-note>
</result>
Ancestor access
The path to the measure in which a note is located
let $highmeas := string($highpitch[1]/../../@number)
uses a fixed set of steps back up the hierarchy. This limits the script to one type of MusicXML schema, because the measure sits at a different position in the hierarchy in the two schemas. When the script was written, the ancestor axis was not supported, but it is now, so those lines are more generally expressible as follows (the [1] is retained because string() accepts at most one item, and several pitches may share the highest note):
let $highmeas := string($highpitch[1]/ancestor::measure/@number)
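The difference between the two paths can be seen on a pair of miniature scores, one in each layout. This is only a sketch: the scores are invented for illustration and are not taken from the Mozart file, and it assumes a processor that supports the ancestor axis.

```xquery
xquery version "1.0";

(: Hypothetical miniature scores, one per MusicXML layout :)
let $partwise :=
  <score-partwise>
    <part id="P1">
      <measure number="7">
        <note><pitch><step>A</step><octave>3</octave></pitch></note>
      </measure>
    </part>
  </score-partwise>
let $timewise :=
  <score-timewise>
    <measure number="7">
      <part id="P1">
        <note><pitch><step>A</step><octave>3</octave></pitch></note>
      </part>
    </measure>
  </score-timewise>
for $score in ($partwise, $timewise)
let $pitch := ($score//pitch)[1]
return
  (: compare the fixed two-step path with the ancestor axis :)
  <found layout="{name($score)}"
         fixed="{string($pitch/../../@number)}"
         ancestor="{string($pitch/ancestor::measure/@number)}"/>
```

For the partwise score both attributes should hold 7. For the timewise score the fixed path lands on the part element, which has no number attribute, so fixed is empty while ancestor still holds 7.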
Note-to-midi
The function to convert notes to midi numbers uses nested if-then-else expressions. XQuery 1.0 lacks a switch expression which might be used here (one was added in XQuery 3.0), but a clearer approach would be to use a lookup table, defined either locally in the script or stored in the database.
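For comparison, on a processor that supports XQuery 3.0 or later the step mapping could be written with a switch expression. The following is a sketch only; local:StepNo is a hypothetical helper name that does not appear in the original script.

```xquery
xquery version "3.0";

(: local:StepNo is a hypothetical helper, introduced for illustration :)
declare function local:StepNo($step as xs:string) as xs:integer
{
  (: semitone offset of each step above C; unknown steps map to 0,
     as in the original if-then-else chain :)
  switch ($step)
    case "C" return 0
    case "D" return 2
    case "E" return 4
    case "F" return 5
    case "G" return 7
    case "A" return 9
    case "B" return 11
    default return 0
};

for $s in ("C", "A", "B")
return local:StepNo($s)
```

This should return the sequence 0, 9, 11. The mapping is still hard-coded in the function body, which is one reason to prefer a table that can be stored and changed as data.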
Here a sequence of notes is created as a look-up table. This is bound to a global variable which is used in a revised note-to-midi function:
declare variable $NOTESTEP :=
  (
    <note name="C" stepNo="0"/>,
    <note name="D" stepNo="2"/>,
    <note name="E" stepNo="4"/>,
    <note name="F" stepNo="5"/>,
    <note name="G" stepNo="7"/>,
    <note name="A" stepNo="9"/>,
    <note name="B" stepNo="11"/>
  );

declare function local:MidiNote($thispitch as element(pitch) ) as xs:integer
{
  let $alter := xs:integer(($thispitch/alter,0)[1])
  let $octave := xs:integer($thispitch/octave)
  let $pitchstepNo := xs:integer($NOTESTEP[@name = $thispitch/step]/@stepNo)
  return 12 * ($octave + 1) + $pitchstepNo + $alter
} ;
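As a worked check of the arithmetic, the table-driven function can be run against two pitches constructed inline. The pitches are hypothetical test values, not notes from the Mozart score; the second one carries an alter element so that both branches of the ($thispitch/alter,0)[1] idiom are exercised.

```xquery
xquery version "1.0";

declare variable $NOTESTEP :=
  (
    <note name="C" stepNo="0"/>,
    <note name="D" stepNo="2"/>,
    <note name="E" stepNo="4"/>,
    <note name="F" stepNo="5"/>,
    <note name="G" stepNo="7"/>,
    <note name="A" stepNo="9"/>,
    <note name="B" stepNo="11"/>
  );

declare function local:MidiNote($thispitch as element(pitch)) as xs:integer
{
  (: take alter if present, otherwise fall back to 0 :)
  let $alter := xs:integer(($thispitch/alter, 0)[1])
  let $octave := xs:integer($thispitch/octave)
  let $pitchstepNo := xs:integer($NOTESTEP[@name = $thispitch/step]/@stepNo)
  return 12 * ($octave + 1) + $pitchstepNo + $alter
};

(: Hypothetical test pitches: A3 and C sharp 4 :)
let $a3 := <pitch><step>A</step><octave>3</octave></pitch>
let $csharp4 := <pitch><step>C</step><alter>1</alter><octave>4</octave></pitch>
return (local:MidiNote($a3), local:MidiNote($csharp4))
```

A3 gives 12 * (3 + 1) + 9 + 0 = 57, and C sharp 4 gives 12 * (4 + 1) + 0 + 1 = 61, so the query should return 57 and 61.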
Intermediate XML
The original script required repeated access to the original MusicXML source. An alternative approach would be to create an intermediate structure to hold the midi notes and use this in subsequent analysis. This structure is a computed view of the original notes augmented with derived data: the midi note and the measure.
let $midiNotes :=
  for $pitch in $part//pitch
  return
    <pitch>
      {$pitch/*}
      <midi>{local:MidiNote($pitch)}</midi>
      <measure>{string($pitch/../../@number)}</measure>
    </pitch>
and this view is then used to locate the high and low notes and their position in the score:
let $highnote := max($midiNotes/midi)
let $lownote := min($midiNotes/midi)
let $highpitch := $midiNotes[midi = $highnote]
let $lowpitch := $midiNotes[midi = $lownote]
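One detail of this step is worth checking: the constructed midi elements are untyped, so max() and min() cast their values to xs:double before comparing them. The comparison is therefore numeric, not lexical, but the result is a double unless the values are cast explicitly. The sketch below uses hypothetical midi values chosen so that a string comparison would give a different answer.

```xquery
xquery version "1.0";

(: Hypothetical view entries; as strings, "9" would sort above "100" :)
let $midiNotes :=
  (
    <pitch><midi>57</midi></pitch>,
    <pitch><midi>100</midi></pitch>,
    <pitch><midi>9</midi></pitch>
  )
(: untyped values are cast to xs:double by max() :)
let $asDouble := max($midiNotes/midi)
(: an explicit cast keeps the result as xs:integer :)
let $asInteger := max($midiNotes/midi/xs:integer(.))
return
  (
    $asDouble instance of xs:double,
    $asInteger instance of xs:integer,
    $asDouble = 100,
    string($midiNotes[midi = $asInteger]/midi)
  )
```

The three tests should all be true and the selected element should be the one holding 100. Either form works for the predicate midi = $highnote, since the untyped element value is converted to a number for the comparison.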
Revised script
declare variable $NOTESTEP :=
  (
    <note name="C" step="0"/>,
    <note name="D" step="2"/>,
    <note name="E" step="4"/>,
    <note name="F" step="5"/>,
    <note name="G" step="7"/>,
    <note name="A" step="9"/>,
    <note name="B" step="11"/>
  );

declare function local:MidiNote($thispitch as element(pitch) ) as xs:integer
{
  let $alter := xs:integer(($thispitch/alter,0)[1])
  let $octave := xs:integer($thispitch/octave)
  let $name := $thispitch/step
  let $pitchstep := xs:integer($NOTESTEP[@name = $name]/@step)
  return 12 * ($octave + 1) + $pitchstep + $alter
} ;

let $doc := doc("/db/Wiki/Music/examples/MozartPianoSonata.xml")
let $part := $doc//part[./@id = "P1"]
let $midiNotes :=
  for $pitch in $part//pitch
  return
    <pitch>
      {$pitch/*}
      <midi>{local:MidiNote($pitch)}</midi>
      <measure>{string($pitch/ancestor::measure/@number)}</measure>
    </pitch>
let $highnote := max($midiNotes/midi)
let $lownote := min($midiNotes/midi)
return
  <result>
    <low-note>
      {$midiNotes[midi = $lownote]}
    </low-note>
    <high-note>
      {$midiNotes[midi = $highnote]}
    </high-note>
  </result>
Discussion
Although arguably a cleaner, more direct design, the second script relies on constructing temporary XML nodes which are then queried with XPath expressions. Implementations handle such temporary nodes differently. In older versions of eXist, each is written to a temporary document in the database, which adds performance overhead and creates garbage-collection problems. In the 1.3 release, intermediate XML nodes remain in memory, resulting in a major performance improvement.
There is, however, another problem with this approach. The size of the intermediate node may exceed pre-set, but configurable, limits on the size of constructed nodes.