SPARQL
SPARQL, short for “SPARQL Protocol and RDF Query Language” and pronounced “sparkle,” is a query language that allows users to query triplestores.
SPARQL queries take the form of a string. They are directed at a SPARQL endpoint, a location on the internet that is capable of receiving and processing SPARQL queries.
It is useful to think of a SPARQL query as a set of sentences with blanks. The database will take this query and find every set of matching statements that correctly fills in those blanks. In other words, the query is looking for data that follows a pattern that you have described. What makes SPARQL powerful is the ability to create complex queries that reference many variables at a time.
SPARQL queries can be used to query named graphs, such as those created and maintained by LINCS.
To write SPARQL queries, you will need to know:
- How to construct queries
- What sorts of questions can be asked with a query
Check out the LINCS SPARQL Endpoint to run queries without leaving the LINCS site.
Construct a Query
A SPARQL query is like a recipe. There are four main ingredients:
- Prefix(es)
- Type of Query
- Query
- Modifier(s)
Prefixes
Prefixes are shorthand abbreviations for the full Internationalized Resource Identifiers (IRIs) that tell the SPARQL endpoint where to go to look for the data. Prefixes are placed at the top of your query so that you do not have to type out the full IRIs every time you want to refer to them.
In the following example, prefixes have been added for the LINCS identity namespace, the Resource Description Framework (RDF), the Resource Description Framework Schema (RDFS), and the Simple Knowledge Organization System (SKOS):
PREFIX identity: <http://id.lincsproject.ca/identity/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
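As a minimal sketch of how a declared prefix is then used, the query below refers to the RDFS label property by its short name. The variable names ?thing and ?label are illustrative choices, not names taken from LINCS data.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# With the prefix declared, rdfs:label is read as the full IRI
# <http://www.w3.org/2000/01/rdf-schema#label>.
SELECT ?thing ?label
WHERE {
  ?thing rdfs:label ?label .
}
LIMIT 10
```

A prefixed name is expanded by joining the prefix IRI and the part after the colon, so the prefix IRI has to end exactly where the local name begins.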
Type of Query
Following your prefixes, you need to declare the type of SPARQL query. There are four types of queries: ASK, SELECT, DESCRIBE, and CONSTRUCT. Each type of SPARQL query includes the same essential components, but each serves a different purpose and will give you a different type of results.
ASK Query
ASK queries return a yes or no answer.
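A minimal sketch of an ASK query, assuming only that the data uses the standard SKOS vocabulary; the variable names are illustrative.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# Does the data contain at least one thing with a SKOS preferred label?
ASK
WHERE {
  ?concept skos:prefLabel ?label .
}
```

The endpoint answers true if the pattern matches anything at all, and false otherwise.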
SELECT Query
SELECT queries return a table of results: one column for each variable you select, and one row for each match of your query pattern.
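A minimal sketch of a SELECT query, again assuming data that uses SKOS preferred labels; ?concept and ?label are illustrative variable names.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# List every thing that has a preferred label, together with that label.
SELECT ?concept ?label
WHERE {
  ?concept skos:prefLabel ?label .
}
```

The variables named after SELECT become the column headings of the results table.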
DESCRIBE Query
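DESCRIBE queries return an RDF graph describing a particular entity. The endpoint decides which triples to include, so the results can differ from one triplestore to another.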
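A minimal sketch of a DESCRIBE query. It uses the URI for the identity term “woman” that appears later on this page; whether a given endpoint holds data about that URI is an assumption.

```sparql
# Ask the endpoint for whatever it considers a description of this entity.
DESCRIBE <http://id.lincsproject.ca/identity/woman>
```

Because the query names the entity directly, no query pattern is needed here.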
CONSTRUCT Query
CONSTRUCT queries return new triples, built by filling in a template with the values that match your query pattern. The results form a new RDF graph rather than a table.
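A minimal sketch of a CONSTRUCT query, assuming data with SKOS preferred labels. It restates each preferred label as an RDFS label; the choice of these two properties is purely illustrative.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# Template: the triples to build for each match.
CONSTRUCT {
  ?concept rdfs:label ?label .
}
# Pattern: where the values for the template come from.
WHERE {
  ?concept skos:prefLabel ?label .
}
```

The constructed triples are returned to you; they are not written back into the triplestore.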
Coming soon! Example queries will be provided once the LINCS SPARQL Endpoint is live.
Query
Triples
After you have declared which type of query you are going to construct, you need to fill in the structure of the query. The query structure is composed of triples: a subject, predicate, and object. Each component of a triple is either a query variable or a Uniform Resource Identifier (URI); the object can also be a literal value, such as a quoted string or a number.
A query variable is a placeholder for the thing that you are searching for. Variables are indicated with a question mark followed by a name. The name you choose for a variable is arbitrary, but should be human-readable for ease of understanding if shared with others. It is important that you use a variable consistently within a query: every occurrence of the same variable name must match the same value within a single result.
?name
?item
A URI is a unique identifier that represents a thing that exists in the LINCS triplestore. It can be a property, entity, graph, class in the ontology (or ontologies), or even a vocabulary term (type). URIs are typically shortened using a prefix, or namespace. For example, the full URI for the identity property “woman,” <http://id.lincsproject.ca/identity/woman>, can be shortened to identity:woman using the identity namespace, provided the identity prefix is declared as <http://id.lincsproject.ca/identity/> so that the prefix plus “woman” reproduces the full URI.
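The sketch below combines variables and a URI in a single triple. It assumes the identity prefix ends in a slash so that identity:woman expands to the full URI given above, and it makes no assumption about which property connects things to that term, which is why the predicate is left as a variable.

```sparql
PREFIX identity: <http://id.lincsproject.ca/identity/>

# Subject and predicate are blanks to fill in; the object is a fixed URI.
SELECT ?item ?property
WHERE {
  ?item ?property identity:woman .
}
```

Each row of the results pairs an item with the property that links it to identity:woman.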
WHERE Statement
Almost every query has a WHERE statement. It follows the declaration of the query type and, in a SELECT query, the list of variables that will be used as column headings in the table of results. It comes before the query pattern and indicates that what follows is WHERE to look: the pattern that the data must match. The exception is a DESCRIBE query that names a specific URI, which needs no pattern.
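To show where the WHERE statement sits, here is a sketch of a complete query with each part marked in a comment. The variable names are illustrative.

```sparql
# 1. Prefixes
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# 2. Query type, followed by the variables to show as columns
SELECT ?item ?name

# 3. WHERE, followed by the query pattern in curly brackets
WHERE {
  ?item rdfs:label ?name .
}
```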
Syntax
Your query will only work if you use the proper syntax.
Following your WHERE statement, use a curly bracket to enclose your query pattern. Curly brackets must appear in pairs. Every bracket you open must close later in your query. You can add extra curly brackets to help you organize your query, but each opening bracket must have a corresponding closing bracket.
Triples in a query are separated by periods. The simplest habit is to end every triple with a period.
If consecutive triples share a subject, you can end the first triple with a semicolon instead of a period and omit the subject from the next triple. The semicolon indicates that the same subject carries over to the following line.
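The semicolon shorthand can be sketched as follows, assuming data that uses SKOS preferred labels and scope notes. The long form is shown in comments for comparison.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?concept ?label ?note
WHERE {
  # Long form, with the subject written out in every triple:
  #   ?concept skos:prefLabel ?label .
  #   ?concept skos:scopeNote ?note .

  # Short form: the semicolon carries ?concept over to the next line.
  ?concept skos:prefLabel ?label ;
           skos:scopeNote ?note .
}
```

Both forms describe the same pattern and return the same results.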
Coming soon! Example queries will be added to the LINCS SPARQL Endpoint as data is published.
Modifiers
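In a more complicated query, you can add modifiers to string together multiple criteria or to shape your results. OPTIONAL, UNION, and FILTER are written inside the curly brackets of the WHERE statement, while ORDER BY and LIMIT come after the closing bracket. For the full list of modifiers that apply to the finished results, see W3C’s Solution Sequence Modifiers. Common modifiers are described below.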
OPTIONAL Modifier
The OPTIONAL modifier allows you to indicate something extra that you would like to have included in your results, if it is present in the data. For example, you can ask for optional images or optional additional information. Using the OPTIONAL modifier means that you will get results even for things that do not contain the information that you have marked as optional. For example, a query with optional images will return all correct results with and without images, but will include the images where they are available.
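A minimal sketch of OPTIONAL, using a SKOS definition as the optional extra in place of the image example above. The properties are standard SKOS; that the data contains them is an assumption.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?concept ?label ?definition
WHERE {
  # Required: every result must have a preferred label.
  ?concept skos:prefLabel ?label .
  # Optional: include a definition where one exists.
  OPTIONAL { ?concept skos:definition ?definition . }
}
```

Concepts without a definition still appear in the results, with the ?definition column left empty.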
UNION Modifier
The UNION modifier allows you to combine the results of multiple query patterns. You can use it to pull together multiple queries, or to ask the same query in more than one way. Asking the same query in more than one way can potentially broaden your results.
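A minimal sketch of UNION that asks for a name in two ways, on the assumption that some data records names with rdfs:label and other data with skos:prefLabel.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?thing ?label
WHERE {
  { ?thing rdfs:label ?label . }
  UNION
  { ?thing skos:prefLabel ?label . }
}
```

A result is returned when either of the two bracketed patterns matches.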
FILTER Modifier
The FILTER modifier allows you to filter your results so that you only see a subset of what appears in the data. For example, you can use the YEAR function within a filter to retrieve results corresponding to a specific time period, or the LANG function within a filter if you are querying a dataset that includes multiple languages, and you would only like to see results in one language.
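A minimal sketch of FILTER with the LANG function, assuming the labels in the data carry language tags such as "en".

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?thing ?label
WHERE {
  ?thing rdfs:label ?label .
  # Keep only the labels tagged as English.
  FILTER (LANG(?label) = "en")
}
```

Labels with no language tag are excluded by this filter, because LANG returns an empty string for them.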
ORDER BY ?variable Modifier
The ORDER BY ?variable modifier allows you to sort the order in which your results appear, for example in alphabetical or chronological order.
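A minimal sketch of ORDER BY, sorting results alphabetically by label; the variable names are illustrative.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?thing ?label
WHERE {
  ?thing rdfs:label ?label .
}
# Ascending order is the default; write DESC(?label) to reverse it.
ORDER BY ?label
```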
LIMIT number Modifier
The LIMIT modifier, followed by a whole number, allows you to cap the number of results that come back. This modifier is useful if you want to check if your query works without spooling out hundreds or thousands of results, as the more results your query generates, the slower it will run.
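A minimal sketch of LIMIT used to test a query on a small sample; the figure of 10 is an arbitrary choice.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?thing ?label
WHERE {
  ?thing rdfs:label ?label .
}
# Return at most 10 rows.
LIMIT 10
```

The value after LIMIT is a plain whole number, not a variable.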
Determine Questions
To construct a SPARQL query, you first need to determine what you can ask:
- Make a list of the things you want to know
- Break down your question into as many smaller questions as possible
- Come up with some potential correct and incorrect answers so you will be able to check that your query is working
- Look through your data to see what information it has and make up a question that will lead you back to this information
Here is a simplified example of a graph showing the results generated by a SPARQL query. This example, from the University of Saskatchewan Art Collection, presents information about the painting “People Going into the Dancing Hall,” which was created by Allen Sapp in the twentieth century.
Ovals represent entities. Each entity is a URI and has a human-readable label. Rectangles represent literals—strings of characters that are human-readable rather than machine processable, such as names and vague dates.
While you may expect the graph to centre on the art object, the relationships of interest to our query (who made the object and when) are actually linked through an intermediary node (in blue), which represents an event, in this case a Production Event. This is because LINCS has adopted CIDOC-CRM as its upper-level ontology. CIDOC-CRM is an event-centric model, so many of the triples within the datasets hosted by LINCS include an event somewhere within them. Understanding the data model is essential to using SPARQL. If you are interested in learning more about the data model, here are some resources to get you started:
- Ontologies
- CIDOC-CRM
- LINCS Application Profile [Coming soon!]
- Project-Specific Application Profiles [Coming soon!]
- Using SPARQL with LINCS Data [Coming soon!]
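The event-centric pattern described above can be sketched as a query that goes from the object to its production event, and from the event to the maker and the date. This is an illustration only: it assumes the data uses the standard CIDOC-CRM properties P108i_was_produced_by, P14_carried_out_by, and P4_has_time-span, and that titles, names, and dates are recorded with rdfs:label. The actual LINCS modelling may differ, so check the application profile before relying on it.

```sparql
PREFIX crm:  <http://www.cidoc-crm.org/cidoc-crm/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?object ?makerName ?dateLabel
WHERE {
  # Find the object by its title (assumed to be stored as a label).
  ?object rdfs:label ?title ;
          crm:P108i_was_produced_by ?production .
  FILTER (STR(?title) = "People Going into the Dancing Hall")

  # The production event links to the maker and to the time-span.
  ?production crm:P14_carried_out_by ?maker ;
              crm:P4_has_time-span ?timeSpan .

  ?maker rdfs:label ?makerName .
  ?timeSpan rdfs:label ?dateLabel .
}
```

Note that there is no direct triple between the object and its maker: both questions, who and when, are answered by way of the ?production node.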
Tips for learning to build queries:
- Start small
- Borrow components from other queries, and tweak them bit by bit
- Make simple queries and then look for ways to make them more complex
- Backtrack and try again if your query breaks
Summary
- SPARQL is a query language that allows users to query triplestores.
- SPARQL queries are directed at a SPARQL endpoint, a location on the internet that is capable of receiving and processing SPARQL queries.
- SPARQL queries have four main ingredients: prefix(es), type of query, query, and modifier(s).
- There are four types of SPARQL queries: ASK, SELECT, DESCRIBE, and CONSTRUCT.
- To construct an effective SPARQL query, you first need to determine what you can ask.
Resources
To learn more about querying with SPARQL, see the following resources.
Introductory Information:
- Blaney (2017) “Introduction to the Principles of Linked Open Data”
- bobdc (2015) “SPARQL in 11 Minutes” [Video]
Beginner Information:
- Ontotext (2022) “What is SPARQL?”
- Stardog Union (2022) “Learn SPARQL”
- W3C (2008) “SPARQL By Example: The Cheat Sheet” [PowerPoint]
Intermediate Information:
- Feigenbaum (2009) “SPARQL By Example”
- Gruber (2018) “0 to 60 on SPARQL Queries in 50 Minutes” [PowerPoint]
- Lincoln (2015) “Using SPARQL to Access Linked Open Data”
Advanced Information:
- Apache Jena (2022) “SPARQL Tutorial”
- W3C (2008) “SPARQL Query Language for RDF”
- W3C (2013) “SPARQL 1.1 Overview”
- W3C (2014) “RDF 1.1 Concepts and Abstract Syntax”
Wikidata-Specific Information:
- Jones (2020) “Computational Knowledge: Wikidata, Wikidata Query Service, and Women who are Mayors!”
- Wikibooks (2017) “SPARQL/Basics”
- Wikidata (2020) “A Gentle Introduction to the Wikidata Query Service”
- Wikidata (2022) “SPARQL Tutorial”
- Wiki Education (2022) “Querying Wikidata: SPARQL”
- Wikimedia Commons (2020) “Wikidata Query Service in Brief”
- Wikimedia Foundation (2018) “Querying Wikidata with SPARQL for Absolute Beginners” [Video]