Equality and Inequality in SPARQL

In previous blog posts I’ve touched a little on equality and inequality in SPARQL; in this post I’m going to look at some of the more confusing aspects of these (and SPARQL expression semantics in general) that can surprise even seasoned SPARQL developers like myself sometimes.

Previously, I introduced the fact that the equality operators in SPARQL (= and !=) represent value equality and inequality. This means that non-identical RDF terms can be considered equal/non-equal if they represent the same value. Consider the following example:

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ("1"^^xsd:integer = "01"^^xsd:integer AS ?equals) WHERE { }

The two RDF terms are not the same term, but they both represent the value 1 so the expression evaluates to give the value true:

----------
| equals |
==========
|  true  |
----------

Conversely, the following returns false:

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ("1"^^xsd:integer != "01"^^xsd:integer AS ?notEquals) WHERE { }

-------------
| notEquals |
=============
| false     |
-------------

Malformed Data

Where things start to get tricky is once you start considering data that is not well-formed. Consider the following:

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ("1"^^xsd:integer = "xyz"^^xsd:integer AS ?equals) WHERE { }

The first term is a valid integer but the second term is not, so what do you think the answer should be here?

I bet that most of you are thinking false, right?

Strangely enough you’d be wrong; this actually gives us an unbound value, which probably sets a few heads spinning.

----------
| equals |
==========
|        |
----------

The Operator Mapping

This behavior is due to something in the SPARQL specification which most everyday SPARQL users likely never worry about – the Operator Mapping [1]. What is happening is that the SPARQL engine is selecting different implementations of the operators to use based on the arguments provided to them.

In our first two examples, both arguments are valid numeric terms so the SPARQL implementation will use numeric equality/inequality; hence, we get true/false as appropriate. In the third example, however, because one argument is an invalid numeric term the SPARQL implementation cannot select numeric equality and instead selects RDF term equality. This returns true if, and only if, the terms are identical; when the terms are two non-identical literals, as here, it produces a type error. It is this type error which results in no value for our third query.

SPARQL specifies that an implementation chooses the most appropriate operator implementation available, and it allows implementations to extend the Operator Mapping with additional datatypes if they so desire. This is why you could legally get a different result from some SPARQL implementations, if the implementation understood how to turn "xyz"^^xsd:integer into a valid integer.
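To make the selection process a little more concrete, here is a minimal Python sketch of a toy operator mapping for = restricted to xsd:integer literals. The names Literal, SparqlTypeError and sparql_equals are invented for this post and are not taken from any SPARQL library, and the lexical check ignores details such as whitespace handling; a real engine covers many more datatypes and term kinds.

```python
import re
from dataclasses import dataclass

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


class SparqlTypeError(Exception):
    """Hypothetical stand-in for a SPARQL expression type error."""


@dataclass(frozen=True)
class Literal:
    """A typed literal: a lexical form plus a datatype IRI."""
    lexical: str
    datatype: str


def integer_value(term):
    """Return the int value of a well-formed xsd:integer literal, else None."""
    if term.datatype != XSD_INTEGER:
        return None
    if re.fullmatch(r"[+-]?[0-9]+", term.lexical) is None:
        return None
    return int(term.lexical)


def sparql_equals(left, right):
    """Toy version of '=' that picks an implementation from its arguments."""
    left_value, right_value = integer_value(left), integer_value(right)
    if left_value is not None and right_value is not None:
        # Both arguments are valid numerics, so numeric equality is selected.
        return left_value == right_value
    # Otherwise fall back to RDF term equality: identical terms are equal,
    # and two non-identical literals produce a type error.
    if left == right:
        return True
    raise SparqlTypeError(f"cannot compare {left!r} with {right!r}")


one = Literal("1", XSD_INTEGER)
padded_one = Literal("01", XSD_INTEGER)
bad = Literal("xyz", XSD_INTEGER)

print(sparql_equals(one, padded_one))  # True: different terms, same value
print(sparql_equals(bad, bad))         # True: identical terms
try:
    sparql_equals(one, bad)
except SparqlTypeError as error:
    print("type error:", error)
```

Note that the malformed literal still compares equal to itself in this sketch, because term equality only cares whether the two terms are identical.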

Type Errors?

Right now, you are probably wondering what exactly a type error is. Essentially, this is how SPARQL accounts for the fact that data is often messy. Often RDF data is generated or extracted from existing real world data sources that themselves contain errors and inconsistencies; this results in RDF data that often contains those same errors and inconsistencies.

Any expression in SPARQL expects its arguments to conform to certain datatypes; if they do not, then that expression will produce a type error. Type errors propagate up expression trees and may have different effects depending on the type of expression being used.

This is particularly relevant for operators like != because SPARQL typically defines them as the logical negation of their = equivalent, and the logical negation of an error is still an error. The best way to see this is to try the != version of our third query:

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ("1"^^xsd:integer != "xyz"^^xsd:integer AS ?equals) WHERE { }

This, like the previous query, still returns an unbound value for this reason:

----------
| equals |
==========
|        |
----------
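The way the error travels through the negation and ends up as an unbound value can be sketched in a few lines of Python. This is only a model under my own assumptions: SparqlTypeError, failing_equals, negate and project are hypothetical names, and a Python exception stands in for the SPARQL type error.

```python
class SparqlTypeError(Exception):
    """Hypothetical stand-in for a SPARQL expression type error."""


def failing_equals():
    """Stand-in for "1"^^xsd:integer = "xyz"^^xsd:integer."""
    raise SparqlTypeError("invalid xsd:integer lexical form: 'xyz'")


def negate(expr):
    """Model '!=' as the negation of '='; an error raised by expr passes through."""
    return not expr()


def project(expr):
    """Model SELECT (expr AS ?var): a type error leaves ?var unbound (None here)."""
    try:
        return expr()
    except SparqlTypeError:
        return None


print(project(failing_equals))                  # None: '=' raised a type error
print(project(lambda: negate(failing_equals)))  # None: negating an error is still an error
print(project(lambda: negate(lambda: True)))    # False: ordinary negation
```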

FILTER and Type Errors

Where Operator Mapping and type errors really start to confuse people is in how these interact with FILTER, because FILTER is required to just provide a true/false to a SPARQL engine as to whether it should include/exclude a possible solution. Therefore, FILTER is defined such that a type error is treated as equivalent to a false for the purpose of filtering.

This can lead to some apparent inconsistencies if you are expecting = and != to be complementary, so that exactly one of them holds for any pair of values, as they would be in most programming languages. For example, let’s turn one of our earlier queries into an ASK, like so:

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
ASK { FILTER("1"^^xsd:integer = "xyz"^^xsd:integer) }

This gives us false, which if you’ve followed the explanations up to now you should be expecting. The expression produces a type error, which gets treated as false by FILTER.

So when we use != instead, we will still get false as our response:

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
ASK { FILTER("1"^^xsd:integer != "xyz"^^xsd:integer) }

If you weren’t aware of the type error semantics of expression evaluation in SPARQL, then seeing these two queries producing these results is going to make you scratch your head in puzzlement.
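One way to picture the effect is to filter a handful of values twice, once with = and once with !=. The Python sketch below is a toy model rather than how any particular engine works: the values are plain xsd:integer lexical forms, and SparqlTypeError, equals, not_equals and sparql_filter are names I have made up for illustration.

```python
import re


class SparqlTypeError(Exception):
    """Hypothetical stand-in for a SPARQL expression type error."""


def int_or_none(lexical):
    """Parse an xsd:integer lexical form, returning None if it is malformed."""
    return int(lexical) if re.fullmatch(r"[+-]?[0-9]+", lexical) else None


def equals(a, b):
    """Toy '=' over two xsd:integer lexical forms."""
    x, y = int_or_none(a), int_or_none(b)
    if x is not None and y is not None:
        return x == y          # numeric equality
    if a == b:
        return True            # identical terms
    raise SparqlTypeError(f"cannot compare {a!r} with {b!r}")


def not_equals(a, b):
    """Toy '!=': the negation of '=', so its errors propagate unchanged."""
    return not equals(a, b)


def sparql_filter(values, condition):
    """Keep values whose condition is true; a type error counts as false."""
    kept = []
    for value in values:
        try:
            if condition(value):
                kept.append(value)
        except SparqlTypeError:
            pass  # FILTER treats the error like false and drops the value
    return kept


values = ["1", "01", "2", "xyz"]
print(sparql_filter(values, lambda v: equals(v, "1")))      # ['1', '01']
print(sparql_filter(values, lambda v: not_equals(v, "1")))  # ['2']
```

The malformed value "xyz" appears in neither result, so the two filters do not split the data into two complete halves the way you might expect.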

Conclusion

Expression semantics in SPARQL, particularly around things that people are used to from other languages like = and !=, can often be confusing to new users and even befuddle experienced SPARQL developers like myself. As long as you are aware of type errors, you should usually be able to understand what is occurring. Remember that if you are using FILTER, type errors are treated the same as false. If you are having trouble understanding what is going on, it can be useful to use a project expression or a BIND instead to see whether a value is being generated or not. If there is no value generated, then you know you got a type error.

Acknowledgements

Big thanks to Paul Gearon and Andy Seaborne for helping me get some of the thornier parts of this straight in my own head so that I could write this blog post.  The example results given in this post are the result of running the queries at http://sparql.org.

[1] http://www.w3.org/TR/sparql11-query/#OperatorMapping

Rob Vesse
