SPARQL/Sentences
Comma, Semicolon and Period
In the Basics chapter we’ve seen all children of Johann Sebastian Bach – more specifically: all items with the father Johann Sebastian Bach. But Bach had two wives, and so those items have two different mothers: what if we only want to see the children of Johann Sebastian Bach with his first wife, Maria Barbara Bach (Q57487)? Try writing that query, based on the one from that chapter.
Done that? Okay, then onto the solution! The simplest way to do this is to add a second triple with that restriction:
SELECT ?child ?childLabel
WHERE
{
  ?child wdt:P22 wd:Q1339.
  ?child wdt:P25 wd:Q57487.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
In English, this reads:
Child has father Johann Sebastian Bach. Child has mother Maria Barbara Bach.
That sounds a bit awkward, doesn’t it? In natural language, we’d abbreviate this:
Child has father Johann Sebastian Bach and mother Maria Barbara Bach.
In fact, it’s possible to express the same abbreviation in SPARQL as well: if you end a triple with a semicolon (;) instead of a period, you can add another predicate-object pair. This allows us to abbreviate the above query to:
SELECT ?child ?childLabel
WHERE
{
  ?child wdt:P22 wd:Q1339;
         wdt:P25 wd:Q57487.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
which has the same results, but less repetition in the query.
Now suppose that, out of those results, we’re interested only in those children who were also composers and pianists. The relevant properties and items are occupation (P106), composer (Q36834) and pianist (Q486748). Try updating the above query to add these restrictions!
Here’s my solution:
SELECT ?child ?childLabel
WHERE
{
  ?child wdt:P22 wd:Q1339;
         wdt:P25 wd:Q57487;
         wdt:P106 wd:Q36834;
         wdt:P106 wd:Q486748.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
This uses the ; abbreviation two more times to add the two required occupations. But as you might notice, there’s still some repetition. This is as if we said:
Child has occupation composer and occupation pianist.
which we would usually abbreviate as:
Child has occupation composer and pianist.
And SPARQL has some syntax for that as well: just like a ; allows you to append a predicate-object pair to a triple (reusing the subject), a , allows you to append another object to a triple (reusing both subject and predicate). With this, the query can be abbreviated to:
SELECT ?child ?childLabel
WHERE
{
  ?child wdt:P22 wd:Q1339;
         wdt:P25 wd:Q57487;
         wdt:P106 wd:Q36834,
                  wd:Q486748.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Note: indentation and other whitespace don’t actually matter – I’ve just indented the query to make it more readable. You can also write this as:
SELECT ?child ?childLabel
WHERE
{
  ?child wdt:P22 wd:Q1339;
         wdt:P25 wd:Q57487;
         wdt:P106 wd:Q36834, wd:Q486748.
  # both occupations in one line
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
or, rather less readable:
SELECT ?child ?childLabel
WHERE
{
?child wdt:P22 wd:Q1339;
wdt:P25 wd:Q57487;
wdt:P106 wd:Q36834,
wd:Q486748.
# no indentation; makes it hard to distinguish between ; and ,
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Luckily, the WDQS editor indents lines for you automatically, so you usually don’t have to worry about this.
Alright, let’s summarize here. We’ve seen that queries are structured like text. Each triple about a subject is terminated by a period. Multiple predicates about the same subject are separated by semicolons, and multiple objects for the same subject and predicate can be listed separated by commas.
SELECT ?s1 ?s2 ?s3
WHERE
{
  ?s1 p1 o1;
      p2 o2;
      p3 o31, o32, o33.
  ?s2 p4 o41, o42.
  ?s3 p5 o5;
      p6 o6.
}
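To make the summary concrete, here is a sketch of what the abbreviated Bach query looks like once every ; and , is expanded back into a full triple. It uses only the items and properties introduced above, and it assumes you run it in the WDQS editor, where the wd:, wdt:, wikibase: and bd: prefixes are predefined. It should match the same items as the abbreviated version, since the abbreviations only save typing.

```sparql
SELECT ?child ?childLabel
WHERE
{
  # each line is a complete triple: subject, predicate, object, period
  ?child wdt:P22 wd:Q1339.     # father: Johann Sebastian Bach
  ?child wdt:P25 wd:Q57487.    # mother: Maria Barbara Bach
  ?child wdt:P106 wd:Q36834.   # occupation: composer
  ?child wdt:P106 wd:Q486748.  # occupation: pianist
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

Reading the two versions side by side, each ; stands for "same subject again" and each , stands for "same subject and predicate again".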
Brackets ([ ])
Now I want to introduce one more abbreviation that SPARQL offers. So if you’ll humor me for one more hypothetical scenario…
Suppose we’re not actually interested in Bach’s children. (Who knows, perhaps that’s actually true for you!) But we are interested in his grandchildren. (Hypothetically.) There’s one complication here: a grandchild may be related to Bach via the mother or the father. That’s two different properties, which is inconvenient. Instead, let’s flip the relation around: Wikidata also has a “child” property, child (P40), which points from parent to child and is gender-independent. With this information, can you write a query that returns Bach’s grandchildren?
Here’s my solution:
SELECT ?grandChild ?grandChildLabel
WHERE
{
  wd:Q1339 wdt:P40 ?child.
  ?child wdt:P40 ?grandChild.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
In natural language, this reads:
Bach has a child ?child. ?child has a child ?grandChild.
Once more, I propose that we abbreviate this English sentence, and then I want to show you how SPARQL supports a similar abbreviation. Observe how we actually don’t care about the child: we don’t use the variable except to talk about the grandchild. We could therefore abbreviate the sentence to:
Bach has as child someone who has a child ?grandChild.
Instead of saying who Bach’s child is, we just say “someone”: we don’t care who it is. But we can refer back to them because we’ve said “someone who”: this starts a relative clause, and within that relative clause we can say things about “someone” (e.g., that he or she “has a child ?grandChild”). In a way, “someone” is a variable, but a special one that’s only valid within this relative clause, and one that we don’t explicitly refer to (we say “someone who is this and does that”, not “someone who is this and someone who does that” – that’s two different “someone”s).
In SPARQL, this can be written as:
SELECT ?grandChild ?grandChildLabel
WHERE
{
  wd:Q1339 wdt:P40 [ wdt:P40 ?grandChild ].
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
You can use a pair of brackets ([]) in place of a variable, which acts as an anonymous variable. Inside the brackets, you can specify predicate-object pairs, just like after a ; following a normal triple; the implicit subject is in this case the anonymous variable that the brackets represent. (Note: also just like after a ;, you can add more predicate-object pairs with more semicolons, or more objects for the same predicate with commas.)
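As a sketch of that last note, here is one way a semicolon could be used inside the brackets. The extra restriction is my own illustration, not part of the exercise: it only keeps grandchildren whose parent (Bach’s child) was a composer, reusing occupation (P106) and composer (Q36834) from earlier. As before, it assumes the WDQS editor with its predefined prefixes.

```sparql
SELECT ?grandChild ?grandChildLabel
WHERE
{
  # "Bach has as child someone who is a composer and has a child ?grandChild."
  wd:Q1339 wdt:P40 [ wdt:P106 wd:Q36834;
                     wdt:P40 ?grandChild ].
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

Both predicate-object pairs inside the brackets share the same anonymous subject, just as both would share ?child if we had written the variable out.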
And that’s it for triple patterns! There’s more to SPARQL, but we’re about to leave the parts of it that are strongly analogous to natural language.
Summary
Before we do, I’d like to summarize that relationship once more:
natural language | example | SPARQL | example |
---|---|---|---|
sentence | Juliet loves Romeo. | period | juliet loves romeo. |
conjunction (clause) | Romeo loves Juliet and kills himself. | semicolon | romeo loves juliet; kills romeo. |
conjunction (noun) | Romeo kills Tybalt and himself. | comma | romeo kills tybalt, romeo. |
relative clause | Juliet loves someone who kills Tybalt. | brackets | juliet loves [ kills tybalt ]. |