Internet-Draft | REST API Linked Data Keywords | November 2024 |
Polli | Expires 9 May 2025 | [Page] |
This document defines two keywords to provide semantic information in OpenAPI Specification and JSON Schema documents, and support contract-first semantic schema design.¶
This note is to be removed before publishing as an RFC.¶
Status information for this document may be found at https://datatracker.ietf.org/doc/draft-polli-restapi-ld-keywords/.¶
information can be found at https://github.com/ioggstream/draft-polli-restapi-ld-keywords.¶
Source for this draft and an issue tracker can be found at https://github.com/ioggstream/draft-polli-restapi-ld-keywords/issues.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 9 May 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
API providers usually specify semantic information in text or out-of-band documents; at best, this information is described in prose into specific sections of interface definition documents (see Section 1.2).¶
This is because API providers do not always value machine-readable semantics, or because they have no knowledge of semantic technologies - that are perceived as unnecessarily complex.¶
A full-semantic approach (e.g. writing RDF oriented APIs) has not become widespread because transferring and processing the semantics on every message significantly increases data transfer and computation requirements.¶
Moreover the semantic landscape do not provide easy ways of defining / constraining the syntax of an object: tools like [SHACL] and [OWL] restrictions are considered computationally intensive to process and complex to use from web and mobile developers.¶
This document provides a simple mechanism to attach semantic information to REST APIs that rely on different dialects of [JSONSCHEMA], thus supporting a contract-first schema design.¶
For example, the OpenAPI Specifications (see [OAS]) allow to describe REST APIs interactions and capabilities using a machine-readable format based on [JSON] or [YAML]. OAS 3.0 is based on JSON Schema draft-4 while OAS 3.1 relies on the latest JSON Schema draft.¶
This document has the following goals:¶
describe in a single specification document backed by [JSONSCHEMA] (e.g. an OpenAPI document) both the syntax and semantics of JSON objects. This information can be either be provided editing the document by hand or via automated tools;¶
easy for non-semantic experts and with reduced complexity;¶
support for OAS 3.0 / JSON Schema Draft4;¶
while it is not intended to:¶
integrate the syntax defined using [JSONSCHEMA];¶
infer semantic information where it is not provided;¶
convert [JSONSCHEMA] documents to RDF Schema (see [RDFS]) or XML Schema.¶
Thus, the following design choices have been made:¶
the semantic context of a JSON object will be described using [JSON-LD-11] and its keywords;¶
property names are limited to characters that can be used in variable
names (e.g. excluding :
and .
)
to avoid interoperability issues with code-generation tools;¶
privilege a deterministic behavior over automation and composability;¶
interoperable with the mechanisms described in Section 6.1 of [JSON-LD-11] for conveying semantic context in REST APIs.¶
[JSONSCHEMA] allows to define the structure of the exchanged data using specific keywords.
Properties' semantics can be expressed in prose via the description
keyword.¶
[JSON-LD-11] defines a way to interpret a JSON object as JSON-LD:
the example schema instance (a JSON document conformant to a given schema)
provided in the above "Person" schema can be integrated with
semantic information adding the @type
and @context
properties.¶
This document shows how
to integrate into a JSON Schema document
information that can be used
to add the @context
and @type
properties to the associated JSON Schema instances.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here. These words may also appear in this document in lower case as plain English words, absent their normative meanings.¶
The terms "content", "content negotiation", "resource", and "user agent" in this document are to be interpreted as in [HTTP].¶
The terms "fragment" and "fragment identifier" in this document are to be interpreted as in [URI].¶
The terms "node", "alias node", "anchor" and "named anchor" in this document are to be intepreded as in [YAML].¶
The terms "schema" and "schema instance" in this document are to be intepreded as in [JSONSCHEMA] draft-4 and higher.¶
The terms "JSON object", "JSON document", "member", "member name" in this document are to be intepreded as in [JSON]. The term "property" - when referred to a JSON document such as a schema instance - is a synonym of "member name", and the term "property value" is a synonym of "member value".¶
The terms "@context", "@type", "@id", "@value" and "@language" are to be interpreted as JSON-LD keywords in [JSON-LD-11], whereas the term "context" is to be interpreted as a JSON-LD Context defined in the same document.¶
Since JSON-LD is a serialization format for RDF, the document can use JSON-LD and RDF interchangeably when it refers to the semantic interpretation of a resource.¶
The JSON Schema keywords defined in Section 2 are collectively named "semantic keywords".¶
A schema (see [JSONSCHEMA]) MAY use the following JSON Schema keywords, collectively named "semantic keywords" to provide semantic information for all related schema instances.¶
This keyword conveys an RDF type (see [RDF]) for the JSON schema instances described by the associate schema. It is defined in Section 2.1.¶
This keyword conveys a JSON-LD context for the JSON schema instances described by the associate schema. It is defined in Section 2.2.¶
This specification MAY be used to:¶
populate the @type
property along the schema instance objects;¶
compose an "instance context" to populate the @context
property at the root of the schema instance.¶
The schema MUST be of type "object". This is because [JSON-LD-11] does not define a way to provide semantic information on JSON values that are not JSON objects.¶
The schema MUST NOT describe a JSON-LD
(e.g. of application/ld+json
media type)
or conflicts will arise, such as
which is the correct @context
or @type
(see Section 4.2).¶
Both JSON Schema keywords defined in this document might contain URI references. Those references MUST NOT be dereferenced automatically, since there is no guarantee that they point to actual locations. Moreover they could reference unsecured resources (e.g. using the "http://" URI scheme [HTTP]).¶
Appendix A provides various examples of integrating semantic information in schema instances.¶
The x-jsonld-type value provides information on the RDF type of the associate schema instances.¶
This value MUST be valid according to the JSON-LD @type
keyword
as described in Section 3.5 of JSON-LD-11;
it is thus related to the information provided via the x-jsonld-context keyword (see Section 2.2).¶
It SHOULD NOT reference an RDF Datatype because it is not intended to provide syntax information, but only semantic information.¶
The x-jsonld-context value provides the information required to interpret the associated schema instances as JSON-LD according to the specification in Section 6.1 of JSON-LD-11.¶
Its value MUST be a valid JSON-LD Context (see Section 9.15 of JSON-LD-11 ).¶
When context composition (see Section 3.4) is needed, the context SHOULD be provided in the form of a JSON object; in fact, if the x-jsonld-context is a URL string, that URL needs to be dereferenced and processed to generate the instance context.¶
This section describes an OPTIONAL workflow to interpret a schema instance as JSON-LD.¶
ensure that the initial schema instance does not contain
any @context
or @type
property.
For further information see Section 4.2;¶
add the @context
property with the value of x-jsonld-context.
This will be the initial "instance context": the only one that will be mangled;¶
add the @type
property with the value of x-jsonld-type;¶
iterate on each instance property like the following:¶
identify the sub-schema associated to the property (e.g. resolving $refs) and check the presence of semantic keywords;¶
for the x-jsonld-type, add the @type
property to the sub-instance;¶
for the x-jsonld-context, integrate its information in the instance context when they are not already present;¶
iterate this process in case of nested entries.¶
The specific algorithm for integrating the values of x-jsonld-context present in sub-schemas into the instance context (see Section 2) is an implementation detail.¶
See the interoperability considerations for the media types and specifications used, including [YAML-IANA], [JSON], [OAS], [JSONSCHEMA] and [JSON-LD-11].¶
Annotating a schema with semantic keywords
containing JSON-LD keywords
(e.g. @context
, @type
and @language
)
may hinder its ability to be interpreted as a JSON-LD document
(e.g. using the JSON-LD 1.1 context for the JSON Schema vocabulary);
this can be mitigated extending that context and specifying
that Linked Data keywords are JSON Literals.¶
{ "@context": { "x-jsonld-context: { "@type": "@json"}, "x-jsonld-type: { "@type": "@json"} } }¶
This is generally not a problem, since a generic [JSONSCHEMA] document cannot be reliably interpreted as JSON-LD using a single context: this is because the same JSON member keys can have different meanings depending on their JSON Schema position (see the notes in the Interpreting JSON Schema as JSON-LD 1.1 section of [JSON-SCHEMA-RDF]).¶
This specification is not designed to restrict the syntax of a JSON value nor to support a conversion between JSON Schema and XMLSchema (see Section 2.1).¶
Not all RDF resources can be expressed as JSON documents
annotated with @context
and @type
:
this specification is limited by the possibilities
of Section 6.1 of JSON-LD-11.
On the other hand, since this approach
delegates almost all the processing to of JSON-LD,
as long as JSON-LD evolves
it will cover further use cases.¶
This specification is not designed to pre-process
or mangle JSON-LD documents
(e.g. to add a missing @type
to a JSON-LD document),
but only to support schemas that do not describe JSON-LD documents.¶
Applications exchanging JSON-LD documents
need to explicitly populate @type
and @context
,
and use a proper media type
since Linked Data processing and interpretation
requires further checks.¶
If these applications describe messages using [JSONSCHEMA] or [OAS], they need to process them with a JSON-LD processor and declare all required properties in the schema - like in the example below.¶
Limited composability can be achieved applying the process described
in Section 2.3.
Automatic composability is not an explicit goal of this specification
because of its complexity. One of the issue is that
the meaning of a JSON-LD keyword is affected by
its position. For example, the @type
keyword:¶
in a node object, adds an rdf:type
arc to the RDF graph
(it also has a few other effects on processing, e.g. by enabling type-scoped contexts);¶
in a value object, specifies the datatype of the produced literal;¶
in the context, and more precisely in a term definition, specifies type coercion. It only applies when the value of the term is a string.¶
These issues can be tackled in future versions of this specifications.¶
Moreover, well-designed schemas do not usually have more than 3 or 4 nested levels. This means that, when needed, it is possible to assemble and optimize an instance context (see Section 2) at design time and use it to valorize x-jsonld-context (see Figure 8).¶
Once a context is assembled, the RDF data can be generated using the algorithms described in [JSONLD-11-API] for example through a library.¶
from pyld import jsonld ... jsonld_text = jsonld.expand(schema_instance, context)¶
See the interoperability considerations for the media types and specifications used, including [YAML-IANA], [JSON], [OAS], [JSONSCHEMA] and [JSON-LD-11].¶
Adding a semantic context to a JSON document alters its value and, in an implementation-dependent way, can lead to reordering of fields. This process can thus affect the processing of digitally signed content.¶
If an OAS document includes the keywords defined in Section 2 the provider explicitly states that the semantic of the schema instance:¶
is defined at contract level;¶
is the same for every message;¶
and is not conveyed nor specific for each message.¶
In this case, processing the semantic conveyed in a message might have security implications.¶
An application that relies on this specification
might want to define separate processing streams for JSON documents
and RDF graphs, even when RDF graphs are serialized as JSON-LD documents.
For example, it might want to raise an error
when an application/json
resource contains unexpected properties
impacting on the application logic
like @type
and @context
.¶
None¶
The following example shows a Person JSON Schema with semantic information provided by the x-jsonld-type and x-jsonld-context. Type information is provided as a URI reference.¶
The example object is assembled as a JSON-LD object as follows.¶
{ "@context": { "@vocab": "https://schema.org/", "custom_id": null }, "@type": "https://schema.org/Person", "familyName": "Doe", "givenName": "John", "country": "FRA", "custom_id": "12345" }¶
The above JSON-LD can be represented as text/turtle
as follows.¶
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> @prefix schema: <https://schema.org/> _:b0 rdf:type schema:Person ; schema:country "FRA" ; schema:familyName "Doe" ; schema:givenName "John" .¶
The following example shows a "Person" schema with semantic information provided by the x-jsonld-type and x-jsonld-context.¶
The resulting RDF graph is¶
The following schema contains a cyclic reference.
Type information is resolved using the @vocab
keyword
specified in the x-jsonld-context.¶
Person: description: Simple cyclic example. x-jsonld-type: Person x-jsonld-context: "email": "@id" "@vocab": "https://w3.org/ns/person#" children: "@container": "@set" type: object properties: email: { type: string } children: type: array items: $ref: '#/Person' example: email: "mailto:a@example" children: - email: "mailto:dough@example" - email: "mailto:son@example"¶
The example schema instance contained in the above schema results in the following JSON-LD document.¶
{ "email": "mailto:a@example", "children": [ { "email": "mailto:dough@example", "@type": "Person" }, { "email": "mailto:son@example", "@type": "Person" } ], "@type": "Person", "@context": { "email": "@id", "@vocab": "https://w3.org/ns/person#", "children": { "@container": "@set" } } }¶
Applying the workflow described in Section 2.3 just recursively copying the x-jsonld-context, the instance context could have been more complex.¶
In the following schema document, the "Citizen" schema references the "BirthPlace" schema.¶
The example schema instance contained in the above schema results in the following JSON-LD document. The instance context contains information from both "Citizen" and "BirthPlace" semantic keywords.¶
That can be serialized as text/turtle
as¶
Thanks to Giorgia Lodi, Matteo Fortini and Saverio Pulizzi for being the initial contributors of this work.¶
In addition to the people above, this document owes a lot to the extensive discussion inside and outside the workgroup. The following contributors have helped improve this specification by opening pull requests, reporting bugs, asking smart questions, drafting or reviewing text, and evaluating open issues:¶
Pierre-Antoine Champin, and Vladimir Alexiev.¶
This section is to be removed before publishing as an RFC.¶
There's currently no standard way to provide machine-readable semantic information in [OAS] / [JSONSCHEMA] to be used at contract time.¶
This document is focused on annotating schemas that are used at contract/design time, so that application can exchange compact JSON object without dereferencing nor interpreting external resources at runtime.¶
While you can use the provided semantic information to generate JSON-LD objects, it is not the primary goal of this specification: context information are not expected to be dereferenced at runtime (see security considerations in JSON-LD) and the semantics of exchanged messages is expected to be constrained inside the application.¶
externalDocs
?We already tried, but this was actually squatting a keyword designed for human readable documents.¶
x-
keywords?OpenAPI 3.0 considers invalid unregistered keywords that don't start with x-
,
and we want a solution that is valid for all OAS versions >= 3.0.¶
This approach allows API providers to attach metadata to their specification without modifying their actual services nor their implementation, since custom keywords are ignored by OpenAPI toolings like Gateways and code generators.¶
This is actually problematic. Look at this example that reuses
the TaxCode
schema and semantic in different properties.¶
Web and mobile developers consider JSON Schema is easier to use than SHACL. Moreover, OWL restrictions are about semantics, and are not designed to restrict the syntax.¶
JSON-LD is a complex specification.
Consider the following schemas, where Contract
references TaxCode
.¶
TaxCode: type: string $linkedData: "@id": "https://w3id.org/italia/onto/CPV/taxCode" "term": "taxCode" Contract: ... properties: employer_tax_code: # Beware! TaxCode.$linkedData.term == 'taxCode' $ref: "#/components/schemas/TaxCode" employee_tax_code: # Here we are reusing not only the schema, # but even the same term. $ref: "#/components/schemas/TaxCode"¶
The result will be that only one of the properties will be correctly annotated. For this reason, composability is limited to the object level.¶
x-jsonld-type
be an rdf:Property
? Would this allow to reuse the same schema in different objects without modifying the @context
?Under normal circumstances, i.e. when designing public or financial service APIs,
you don't want x-jsonld-type
to be an rdf:Property
.
The value of x-jsonld-type
usually maps to a owl:Class
, not an owl:DataTypeProperty
;
for example a sensible value for x-jsonld-type
would be rdfs:Literal
(that is, the rdfs:range
of CPV:taxCode
),
but this would be mostly a syntactic information, which instead is provided by JSON Schema.¶
This section is to be removed before publishing as an RFC.¶
TBD¶