Since my post on application profiles and JSON-LD I have been putting some more thought into the question of how to publish application profiles ("APs") in a formal, machine-readable way. As far as I know, there is no common practice for publishing machine-readable documentation of application profiles. Here, I will make a first attempt at publishing an AP documentation using OAI-ORE. (If you are only interested in this and want to skip the preliminaries, jump directly to the section entitled Describing an application profile with OAI-ORE.)
I already talked about this in the last post on APs but will nonetheless first present use cases for and benefits of machine-readable application profiles. Here are three use cases I can think of:
- Enable people to create data/describe resources in accordance with the application profile: A machine-readable AP might enable developers to automatically create the necessary input forms for describing resources based on the AP. At best, the AP would also enable automatic input validation.
- Enable people (and machines) to understand the data that underlies an application independent of a specific natural language: E.g., studying an application profile makes sense if you want to build an application based on data that is created according to the AP. A machine-readable AP can on the one hand be understood by people independent of the natural languages they understand and might on the other hand enable automatic creation of an application that works with the data.
- Machine-readable application profiles could play a role in generating vocabulary usage statistics. This could be interesting for services like Linked Open Vocabularies (LOV) that provide statistics on vocabulary/property usage.
Since my last post might suggest otherwise, I want to make one thing clear: A JSON-LD context document hardly meets all characteristics of an application profile as defined by the Dublin Core community. According to the Guidelines for Dublin Core Application Profiles, an AP should meet the following criteria:
A DCAP is a document (or set of documents) that specifies and describes the metadata used in a particular application. To accomplish this, a profile:
- describes what a community wants to accomplish with its application (Functional Requirements);
- characterizes the types of things described by the metadata and their relationships (Domain Model);
- enumerates the metadata terms to be used and the rules for their use (Description Set Profile and Usage Guidelines); and
- defines the machine syntax that will be used to encode the data (Syntax Guidelines and Data Formats).
As a JSON-LD context document isn't much more than a list of metadata terms employed in an application, it doesn't possess all the characteristics of an AP.
In the following, I am definitely not concerned with fulfilling all four criteria for APs through a machine-readable document but will focus on the third one: 'enumerate the metadata terms to be used and the rules for their use'.
If everyone (re)used a vocabulary in the same way, there would be no need to describe the usage for a specific service. To find out what the characteristics of an application profile might be, we first have to answer the question: What operations are involved in the re-use of RDF properties and classes? In this section, I will try to name and illustrate some of these.
First, here is a simple example RDF description which a service that serves descriptions of libraries might provide. [1]
We can see that the AP used for this library description draws six properties (foaf:name, rdfs:seeAlso, dcterms:identifier, gn:locatedIn, org:classification, foaf:isPrimaryTopicOf) and one class (foaf:Organization) from five different RDF vocabularies.
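To make the counting of terms and vocabularies tangible, here is a rough Python stand-in for such a description. It is only a sketch: the subject `ex:library` and all values are invented placeholders, and prefixed names are used as plain strings instead of full URIs.

```python
# Hypothetical library description as (subject, predicate, object) triples.
# Every value below is an invented placeholder, not data from a real service.
DESCRIPTION = [
    ("ex:library", "rdf:type", "foaf:Organization"),
    ("ex:library", "foaf:name", "Example City Library"),
    ("ex:library", "rdfs:seeAlso", "http://dbpedia.org/resource/Example_City_Library"),
    ("ex:library", "dcterms:identifier", "DE-000"),
    ("ex:library", "gn:locatedIn", "http://sws.geonames.org/0000000/"),
    ("ex:library", "org:classification", "ex:PublicLibrary"),
    ("ex:library", "foaf:isPrimaryTopicOf", "http://example.org/library/about"),
]


def prefix(term):
    """Return the vocabulary prefix of a prefixed name such as 'foaf:name'."""
    return term.split(":", 1)[0]


# rdf:type only carries the class, so it is not counted as a re-used property.
properties = sorted({p for _, p, _ in DESCRIPTION if p != "rdf:type"})
classes = sorted({o for _, p, o in DESCRIPTION if p == "rdf:type"})
vocabularies = sorted({prefix(term) for term in properties + classes})

print(len(properties), "properties,", len(classes), "class(es),",
      len(vocabularies), "vocabularies")
```

Listing the terms this way is essentially what the "enumerate the metadata terms" part of an AP boils down to; the rules for their use are a separate matter.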
Generally, when vocabulary terms are re-used, this may be accompanied by a more restrictive usage in the context of an application in comparison to the existing wider usage. The example description illustrates how a service might use properties and classes. foaf:name is the only property that is used according to the wider practice. The use of the other properties is adjusted for this specific service:
- In the context of this fictional service, the generic property rdfs:seeAlso is only used to state links to a related DBpedia resource (line 12).
- The generic dcterms:identifier property isn't used with different kinds of identifiers but solely indicates an organisation's ISIL. A corresponding datatype is also attached to the string value.
- The property gn:locatedIn is solely used to link an organisation to the city it resides in, using the city's GeoNames URI.
- org:classification is used in the context of the service to indicate the organisation type using a specific controlled vocabulary.
I named another example in the previous post on APs: the different interpretation of dcterms:alternative as "uniform title" by one group and as "other title information" by another. A machine-readable application profile should somehow reflect all these side-effects of reusing vocabularies.
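The restrictions listed above can be thought of as per-property rules that a description either satisfies or violates. The following Python sketch illustrates this idea under several assumptions: the ISIL pattern is a deliberately simplified approximation, the URI prefixes for DBpedia and GeoNames are the ones I would expect such a service to use, and the `http://example.org/libtype#` namespace for the controlled vocabulary is purely hypothetical.

```python
import re

# Simplified ISIL check (assumption): two-letter country prefix, a hyphen,
# then up to 11 characters. The real ISIL syntax allows more prefix forms.
ISIL_PATTERN = re.compile(r"^[A-Z]{2}-[A-Za-z0-9:/-]{1,11}$")

# One rule per restricted property. foaf:name has no entry because the
# service uses it according to the wider practice.
AP_RULES = {
    "rdfs:seeAlso": lambda v: v.startswith("http://dbpedia.org/resource/"),
    "dcterms:identifier": lambda v: ISIL_PATTERN.match(v) is not None,
    "gn:locatedIn": lambda v: v.startswith("http://sws.geonames.org/"),
    # Hypothetical namespace standing in for the service's controlled vocabulary.
    "org:classification": lambda v: v.startswith("http://example.org/libtype#"),
}


def violations(description):
    """Return the (property, value) pairs that break a rule of the profile."""
    return [(p, v) for p, v in description
            if p in AP_RULES and not AP_RULES[p](v)]


# Invented example data: the second description misuses two properties.
conforming = [
    ("foaf:name", "Example City Library"),
    ("dcterms:identifier", "DE-000"),
    ("gn:locatedIn", "http://sws.geonames.org/0000000/"),
]
diverging = [
    ("dcterms:identifier", "some local number"),
    ("rdfs:seeAlso", "http://example.org/homepage"),
]

print(violations(conforming))
print(violations(diverging))
```

Rules like these only capture restrictions that can be checked mechanically; a shift in meaning such as the two readings of dcterms:alternative would still need a human-readable comment.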
The basic idea of this post is: Application profiles are aggregations of terms from different RDF vocabularies. OAI-ORE is an RDF vocabulary for describing aggregations of things. Wouldn't it make sense, then, to describe APs using OAI-ORE?
If you want to know more about OAI-ORE, I suggest starting with the Wikipedia entry.
Here is how one could describe the underlying application profile for the fictional service described above:
- As the aggregation itself is an abstract resource, it is described in an RDF document called a "resource map". After the prefix declarations (lines 1-9) the resource map is described (lines 11-18) with information about license, creator, creation date etc. and the information which aggregation is described by the resource map.
- In lines 12-24 the aggregation - i.e. the application profile itself - is described: its name and the aggregated resources (in this case the properties and classes used in the application profile) are stated.
- The rest of the document (lines 26-60) is concerned with making clear where the usage of the re-used properties diverges from their original definition in their home vocabulary. In a human-readable way this information is provided in rdfs:comment but it also happens via the rdfs:domain and rdfs:range declarations. For example, lines 32-40 state the regular expression to which values of dcterms:identifier are constrained in the AP.
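The overall shape of such a resource map can be sketched in a few lines of Python that assemble the triples and print them as Turtle-like statements. This is only an illustration of the structure described above, not the listing itself: the identifiers `ex:ap-rem` and `ex:ap`, the title, the comment text, and the choice of dcterms:title for the profile's name are all assumptions. The terms ore:ResourceMap, ore:Aggregation, ore:describes and ore:aggregates come from the OAI-ORE vocabulary.

```python
def build_resource_map(rem, aggregation, title, terms, comments=None):
    """Return triples for a resource map that describes an AP as an aggregation."""
    triples = [
        (rem, "rdf:type", "ore:ResourceMap"),
        (rem, "ore:describes", aggregation),
        (aggregation, "rdf:type", "ore:Aggregation"),
        (aggregation, "dcterms:title", f'"{title}"'),
    ]
    # The application profile aggregates the properties and classes it re-uses.
    triples += [(aggregation, "ore:aggregates", term) for term in terms]
    # Human-readable notes on where usage diverges from the home vocabulary.
    for term, comment in (comments or {}).items():
        triples.append((term, "rdfs:comment", f'"{comment}"'))
    return triples


def to_turtle_lines(triples):
    """Render triples as simple one-statement-per-line Turtle (prefixes omitted)."""
    return [f"{s} {p} {o} ." for s, p, o in triples]


AP_TERMS = [
    "foaf:name", "rdfs:seeAlso", "dcterms:identifier", "gn:locatedIn",
    "org:classification", "foaf:isPrimaryTopicOf", "foaf:Organization",
]

# Hypothetical identifiers and texts, chosen only for this sketch.
ap_triples = build_resource_map(
    "ex:ap-rem", "ex:ap", "Example library AP", AP_TERMS,
    comments={"dcterms:identifier": "Only used for an organisation's ISIL."},
)

for line in to_turtle_lines(ap_triples):
    print(line)
```

A real resource map would additionally carry the license, creator and creation date of the map and the rdfs:domain and rdfs:range declarations mentioned above; they are left out here to keep the sketch short.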
OAI-ORE offers what one needs to represent application profiles - or at least the characteristics of APs we were looking for - in RDF. The question is whether such descriptions are also convenient to create and use, or whether one should look for a more lightweight alternative. I'd be happy to hear opinions and to learn about other approaches.
[1] The example was constructed for the purpose of this text but it reflects current or past practices in the lobid organisations index.