Entity Identifiers and URI Minting
Introduction
Linked Data (LD) relies on having a shared way to reference the same concept or entity, such as a person, place, document, category, or event. To achieve this, LD resources are uniquely identified by their Uniform Resource Identifier (URI). Each URI is unique to a single concept or entity and should be accessible over the web.
Many URIs look like web addresses or URLs, and in the context of LINCS, this is the primary form they take. However, while a URI may have the same format as a URL, it does not need to resolve to a web page or have a network location: all URLs are URIs, but not all URIs are URLs.
In practice, not all URLs are suitable as URIs for the purposes of Linked Open Data (LOD), since some are more persistent and authoritative than others. For instance, the URL of a person’s LinkedIn profile is less desirable than an ORCID identifier. Similarly, a link to a downloadable PDF of an article on a faculty member’s departmental web page is poorer practice than the Digital Object Identifier (DOI) for the same article. If a URI does resolve to a representation of the resource, such as a human-readable webpage, it is a “dereferenceable” URI.
Learn More about URIs
To learn more about URIs, see the resources below:
- LINCS Glossary
- Linked Open Data Basics
- Europeana URI Document
- CHIN GitHub Issue Ticket
- W3C Good URIs
- W3C Cool URIs
- BnF @ SWIB19
- Ruben Verborgh, “Web Fundamentals: The Semantic Web & Linked Data”
Types of URIs
When LINCS converts a dataset into LOD, entities are identified in the source data and are prepared for conversion into LOD resources. These resources require a canonical URI for identification and access. Wherever possible, LINCS will reuse existing URIs from external vocabularies and authorities by matching, or “reconciling”, entities in the source data with appropriate URIs. When an external identifier for a resource does not exist, a URI must be created, or “minted,” to represent the resource.
URI Scenarios
Converting data using LINCS involves one or more of the following URI scenarios.
Matching URI Exists
When a URI exists that matches the entity, LINCS does not need to produce a new URI because pre-existing URIs are easy to integrate into the LINCS knowledge graph. Management of that URI stays with the authority that created the URI.
To work well with LINCS tools, URIs should follow standard LOD practices, such as having a label, preferably in both French and English.
See the LINCS Entity Reconciliation Guide for guidance on choosing external URIs.
No Matching URI Exists; Data Holder Mints a URI
When no existing URI matches the entity, a data holder (typically an organization or large ongoing research project) may mint their own URI according to standard LOD practices.
Data holders may negotiate with a third-party organization that maintains a vocabulary or authority file to add the required term and assign it a URI, or they may elect to use a third-party service, such as Wikidata, to mint the URI. In all such cases, the URI will use an external namespace. The URI creator is responsible for the management of the URI.
See the LINCS Entity Reconciliation Guide for details.
No Matching URI Exists; LINCS Mints a URI
Data contributors may give LINCS the authority to create new URIs if there is no matching existing URI. These URIs are under a LINCS namespace, using a LINCS-generated identifier.
LINCS-Minted URIs
LINCS-minted URIs begin with http://id.lincsproject.ca/ followed by a randomly generated 11-character unique identifier based on the NanoID library.
For example, a URI that LINCS has minted and now hosts would look like this:
http://id.lincsproject.ca/4sUVOmgE6GB
Such URIs are typically used for named entities, such as persons or organizations, and for events associated with those entities.
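As a rough illustration of this URI pattern, the following sketch generates an 11-character random identifier and appends it to the LINCS namespace. It is not the LINCS implementation: it uses only the Python standard library rather than a NanoID package, it assumes NanoID's default URL-safe alphabet (LINCS may configure a different one), and the function names and the in-memory set used for collision checks are hypothetical.

```python
import secrets
import string

# Assumption: NanoID's default URL-safe alphabet (A-Z, a-z, 0-9, "_", "-").
# The alphabet LINCS actually uses is not stated in this document.
ALPHABET = string.ascii_letters + string.digits + "_-"
LINCS_NAMESPACE = "http://id.lincsproject.ca/"
ID_LENGTH = 11


def generate_id(length: int = ID_LENGTH) -> str:
    """Return a random identifier drawn from ALPHABET using a secure RNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def mint_uri(existing: set) -> str:
    """Mint a URI that is not already in `existing` (hypothetical registry).

    Collisions are extremely unlikely with 11 characters, but the loop
    retries if one occurs so that every minted URI stays unique.
    """
    while True:
        uri = LINCS_NAMESPACE + generate_id()
        if uri not in existing:
            existing.add(uri)
            return uri
```

Calling mint_uri(set()) returns a URI of the same shape as the example above, with a different random identifier each time.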
LINCS makes an exception to using randomly generated URIs when representing vocabulary terms from LINCS-hosted vocabularies. In such cases, LINCS works with data contributors to choose human-readable words for the vocabulary and the vocabulary term to include in a URI.
For example, the vocabulary term for a person’s name from the LINCS Biography Vocabulary has the following URI:
http://id.lincsproject.ca/biography/personalName
Note that this term still has a French label, “nom personnel.”
See the LINCS vocabulary documentation to learn about working with LINCS to develop a vocabulary, and the LINCS Vocabulary Browser to explore LINCS-hosted vocabularies.
How LINCS URIs are Minted
During the conversion process, whenever there is an entity that has not been reconciled, LINCS creates a temporary URI to represent that entity. These temporary URIs use the prefix http://temp.lincsproject.ca/ and stay in the data while LINCS finishes converting and refining it. Before the data is officially published, LINCS either replaces these temporary URIs with external URIs or mints official http://id.lincsproject.ca/ URIs.
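The replacement step described here can be sketched as a pass over the converted triples. This is an illustrative assumption about the workflow, not LINCS's actual conversion code: triples are modelled as plain tuples of strings, and the `reconciled` mapping and `mint_id` callable are hypothetical inputs.

```python
TEMP_PREFIX = "http://temp.lincsproject.ca/"
ID_PREFIX = "http://id.lincsproject.ca/"


def finalize_uris(triples, reconciled, mint_id):
    """Replace temporary URIs in (subject, predicate, object) triples.

    triples:    list of 3-tuples of strings
    reconciled: dict mapping a temporary URI to an external URI (hypothetical)
    mint_id:    zero-argument callable returning a new identifier (hypothetical)
    """
    # Copy so the caller's mapping is not modified.
    replacements = dict(reconciled)

    def resolve(term):
        if not term.startswith(TEMP_PREFIX):
            return term  # already an external URI, a LINCS URI, or a literal
        if term not in replacements:
            # No external match was found, so mint an official LINCS URI.
            replacements[term] = ID_PREFIX + mint_id()
        return replacements[term]

    return [tuple(resolve(term) for term in triple) for triple in triples]
```

Because each temporary URI is resolved once and then remembered, every occurrence of the same temporary URI ends up with the same final URI.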
Some tools in the LINCS ecosystem, such as ResearchSpace and LEAF-Writer, allow approved contributors to create data, including new URIs. See the tool-specific documentation to learn how you can mint new LINCS URIs directly through LINCS tools.
How LINCS URIs are Maintained
LINCS is committed to maintaining persistent representations of all LINCS-minted URIs. Should a LINCS URI become outdated or incorrect, LINCS will deprecate the URI rather than delete it. In contrast, an external URI that is referenced in LINCS data can be deleted from the LINCS knowledge graph if all references to that URI are removed, since responsibility for maintaining that URI lies elsewhere.
Use LINCS URIs
See Explore LOD to learn about the tools you can use to find LINCS entities and their URIs.
Refer to a LINCS URI
To refer to a LINCS entity in your data, use the URI that follows this format:
http://id.lincsproject.ca/{id}
If you are viewing information about a LINCS entity on the web, the URL that you see in the address bar of your web browser may not be the entity’s URI. Please note that the actual URI of the resource starts with http and not https, and should not have a slash at the end. It should also not include prefixes from the website you are on, such as: https://rs.lincsproject.ca/resource/?uri=
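These rules can be applied mechanically when copying a URL from the browser. The sketch below is one possible way to do so, using only the Python standard library. The function name is hypothetical, and the sketch assumes that the website prefix always passes the entity URI in a query parameter named uri, as in the ResearchSpace example.

```python
from urllib.parse import parse_qs, urlsplit

LINCS_HOST = "id.lincsproject.ca"


def canonical_lincs_uri(url: str) -> str:
    """Return the LINCS URI for a URL copied from a browser address bar."""
    parts = urlsplit(url)
    # A site prefix such as https://rs.lincsproject.ca/resource/?uri=...
    # carries the entity URI in its "uri" query parameter.
    query = parse_qs(parts.query)
    if "uri" in query:
        url = query["uri"][0]
        parts = urlsplit(url)
    if parts.netloc != LINCS_HOST:
        return url  # external URI: leave it as its own authority defines it
    # Force the http scheme and drop any leading or trailing slash.
    return "http://" + LINCS_HOST + "/" + parts.path.strip("/")
```

For instance, both https://id.lincsproject.ca/4sUVOmgE6GB/ and the ResearchSpace URL for the same entity are reduced to http://id.lincsproject.ca/4sUVOmgE6GB.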
Get Information from a LINCS URI
What happens when a human or machine makes an HTTP request to a LINCS URI (e.g., navigates to the URI in a web browser)? In technical terms, the URI is dereferenced or resolved by the client (e.g., browser), at which time the host server (e.g., ResearchSpace) returns a response. The response can take several forms. Most importantly, the response can either be a redirection to another domain (think call forwarding), or the host can serve a representation of the content directly (e.g., in human-readable HTML format, or in machine-readable TTL or RDF/XML).
Human-Readable Format
In practice, when you navigate to a LINCS entity’s URI in your web browser, you will be redirected to ResearchSpace, where you can view and interact with information about that entity from the LINCS triplestore.
You will see that you are redirected from:
http://id.lincsproject.ca/{id}
to:
https://rs.lincsproject.ca/resource/?uri=http://id.lincsproject.ca/{id}
Machine-Readable Formats
Alternatively, you can get the RDF representation of an entity with an HTTP request of the form:
curl -L http://id.lincsproject.ca/{id} -H "Accept: text/turtle"
For example, for a LINCS-minted entity:
curl -L http://id.lincsproject.ca/4sUVOmgE6GB -H "Accept: text/turtle"
Or for an entity in the LINCS knowledge graph that uses an external URI, the request could look like this:
curl "https://rs.lincsproject.ca/resource/?uri=http://vocab.getty.edu/aat/300011914" -H "Accept: text/turtle"
Supported formats and their respective MIME types are:
Format | MIME Types |
BinaryRDF | application/x-binary-rdf |
JSON-LD | application/ld+json |
N3 | text/n3, text/rdf+n3 |
N-Quads | text/x-nquads |
N-Triples | text/plain |
RDF/JSON | application/rdf+json |
RDF/XML | application/rdf+xml, application/xml |
TriG | application/x-trig |
TriX | application/trix |
Turtle | text/turtle, application/x-turtle |
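The same content negotiation can be done from a script. The following is a minimal sketch using Python's standard urllib.request module, which follows redirects by default (comparable to curl -L). The function names and the short format keys are hypothetical, and only three of the MIME types from the table above are included.

```python
import urllib.request

# A subset of the MIME types listed in the table above.
# The short keys are a convenience invented for this sketch.
ACCEPT_TYPES = {
    "turtle": "text/turtle",
    "json-ld": "application/ld+json",
    "rdf-xml": "application/rdf+xml",
}


def build_rdf_request(uri: str, fmt: str = "turtle") -> urllib.request.Request:
    """Build a GET request asking for the given RDF serialization."""
    return urllib.request.Request(uri, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch_rdf(uri: str, fmt: str = "turtle", timeout: float = 30.0) -> str:
    """Send the request and return the response body as text.

    Requires network access; the response depends on the LINCS service.
    """
    request = build_rdf_request(uri, fmt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, fetch_rdf("http://id.lincsproject.ca/4sUVOmgE6GB", "json-ld") would request the JSON-LD serialization of the entity used in the examples above.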