SPARQL/FILTER
FILTER(condition)
is a clause you can insert into your SPARQL query to, well, filter the results. Inside the parentheses, you can put any expression of boolean type, and only those results where the expression returns true
are used.
FILTER on values
editFor example, to get a list of all humans born in 2015, we first get all humans with their date of birth –
SELECT ?person ?personLabel ?dob
WHERE
{
?person wdt:P31 wd:Q5;
wdt:P569 ?dob.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
– and then filter that to only return the results where the year of the date of birth is 2015. There are two ways to do that: extract the year of the date with the YEAR
function, and test that it’s 2015 –
FILTER(YEAR(?dob) = 2015).
– or check that the date is between Jan. 1st (inclusive), 2015 and Jan. 1st, 2016 (exclusive):
FILTER("2015-01-01"^^xsd:dateTime <= ?dob && ?dob < "2016-01-01"^^xsd:dateTime).
I’d say that the first one is more straightforward, but it turns out the second one is much faster, so let’s use that:
SELECT ?person ?personLabel ?dob
WHERE
{
?person wdt:P31 wd:Q5;
wdt:P569 ?dob.
FILTER("2015-01-01"^^xsd:dateTime <= ?dob && ?dob < "2016-01-01"^^xsd:dateTime).
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
FILTER on values in Labels
editAnother possible use of FILTER
is related to labels. The label service is very useful if you just want to display the label of a variable. But if you want to do stuff with the label – for example: check if it starts with “Mr. ” – you’ll find that it doesn’t work:
SELECT ?human ?humanLabel
WHERE
{
?human wdt:P31 wd:Q15632617. # fictional human
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
FILTER(STRSTARTS(?humanLabel, "Mr. ")). # This does not work
}
This query finds all instances of Q15632617 and tests if their label starts with "Mr. "
(STRSTARTS
is short for “string starts [with]”; there’s also STRENDS
and CONTAINS
). The reason why this doesn’t work is that the label service adds its variables very late during query evaluation; at the point where we try to filter on ?humanLabel
, the label service hasn’t created that variable yet.
Fortunately, the label service isn’t the only way to get an item’s label. Labels are also stored as regular triples, using the predicate rdfs:label
. Of course, this means all labels, not just English ones; if we only want English labels, we’ll have to filter on the language of the label:
FILTER(LANG(?label) = "en").
The LANG
function returns the language of a monolingual string, and here we only select those labels that are in English. The full query is:
SELECT ?human ?humanLabel
WHERE
{
?human wdt:P31 wd:Q15632617; # fictional human
rdfs:label ?humanLabel.
FILTER(LANG(?humanLabel) = "en").
FILTER(STRSTARTS(?humanLabel, "Mr. ")).
}
We get the label with the ?human rdfs:label ?label
triple, restrict it to English labels, and then check if it starts with “Mr. ”.
FILTER NOT EXISTS
editLet's add images to these fictional humans
SELECT ?human ?humanLabel ?image
WHERE
{
?human wdt:P31 wd:Q15632617; # fictional human
rdfs:label ?humanLabel.
FILTER(LANG(?humanLabel) = "en").
FILTER(STRSTARTS(?humanLabel, "Mr. ")).
OPTIONAL{ ?human wdt:P18 ?image. }
}
The code OPTIONAL{ ?human wdt:P18 ?image. }
shows all the fictional humans that start with “Mr. ” and shows an image if available.
To select only fictional humans starting with “Mr. ” with an image the code ?human wdt:P18 ?image.
can be used.
To select those fictional humans without an image the code FILTER NOT EXISTS{ ?human wdt:P18 ?image. }
should be used
SELECT ?human ?humanLabel ?image
WHERE
{
?human wdt:P31 wd:Q15632617; # fictional human
rdfs:label ?humanLabel.
FILTER(LANG(?humanLabel) = "en").
FILTER(STRSTARTS(?humanLabel, "Mr. ")).
FILTER NOT EXISTS{ ?human wdt:P18 ?image. } # without images
}
MINUS
editA similar way of negation provided in SPARQL is MINUS
which evaluates both its arguments, then calculates solutions in the left-hand side that are not compatible with the solutions on the right-hand side.
SELECT ?human ?humanLabel ?image
WHERE
{
?human wdt:P31 wd:Q15632617; # fictional human
rdfs:label ?humanLabel.
FILTER(LANG(?humanLabel) = "en").
FILTER(STRSTARTS(?humanLabel, "Mr. ")).
MINUS{ ?human wdt:P18 ?image. } # without images
}
References
edit