Providing machine-readable application profiles with OAI-ORE

Since my post on application profiles and JSON-LD I have been putting some more thought into the question of how to publish application profiles ("APs") in a formal, machine-readable way. As far as I know, there is no common practice for publishing machine-readable documentation of application profiles. Here, I make a first attempt at publishing AP documentation using OAI-ORE. (If you are only interested in this and want to skip the preliminaries, jump directly to the section entitled Describing an application profile with OAI-ORE.)

Benefits

I already talked about this in the last post on APs, but I will nonetheless first present use cases for and benefits of machine-readable application profiles. Here are three use cases I can think of:

  1. Enable people to create data/describe resources in accordance with the application profile: A machine-readable AP might enable developers to automatically create the necessary input forms for describing resources based on the AP. Ideally, the AP would also enable automatic input validation.
  2. Enable people (and machines) to understand the data that underlies an application independent of a specific natural language: E.g., studying an application profile makes sense if you want to build an application based on data that is created according to the AP. A machine-readable AP can, on the one hand, be understood by people regardless of the natural languages they speak and might, on the other hand, enable the automatic creation of an application that works with the data.
  3. Support vocabulary usage statistics: Machine-readable application profiles could play a role in generating vocabulary usage statistics. This could be interesting for services like Linked Open Vocabularies (LOV) that provide statistics on vocabulary/property usage.

An application profile is more than a list of metadata terms

As my last post might suggest otherwise, I want to make one thing clear: A JSON-LD context document hardly meets all characteristics of an application profile as defined by the Dublin Core community. According to the Guidelines for Dublin Core Application Profiles, an AP should meet the following criteria:

A DCAP is a document (or set of documents) that specifies and describes the metadata used in a particular application. To accomplish this, a profile:

  • describes what a community wants to accomplish with its application (Functional Requirements);
  • characterizes the types of things described by the metadata and their relationships (Domain Model);
  • enumerates the metadata terms to be used and the rules for their use (Description Set Profile and Usage Guidelines); and
  • defines the machine syntax that will be used to encode the data (Syntax Guidelines and Data Formats).

As a JSON-LD context document isn't much more than a list of metadata terms employed in an application, it doesn't possess all the characteristics of an AP.

In the following, I am definitely not concerned with fulfilling all four criteria for APs through a machine-readable document but will focus on the third one: 'enumerate the metadata terms to be used and the rules for their use'.

Side-effects of vocabulary re-use

If everyone (re)used a vocabulary in the same way, there would be no need to describe its usage for a specific service. To find out what the characteristics of an application profile might be, we first have to answer the question: What operations are involved in the re-use of RDF properties and classes? In this section, I will try to name and illustrate some of these.

First, here is a simple example RDF description which a service that serves descriptions of libraries might provide. 1

We can see that the AP used for this library description draws six properties (foaf:name, rdfs:seeAlso, dcterms:identifier, gn:locatedIn, org:classification, foaf:isPrimaryTopicOf) and one class (foaf:Organization) from five different RDF vocabularies.
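To make this counting concrete, the description can be modeled as a plain list of triples. The following is only a minimal Python sketch: the subject URI and all values are hypothetical placeholders, and only the prefixed property and class names are taken from the text above.

```python
# Hypothetical library description as (subject, predicate, object) triples.
# Only the prefixed term names come from the text; every value is made up.
LIBRARY = "http://example.org/organisation/1"

triples = [
    (LIBRARY, "rdf:type", "foaf:Organization"),
    (LIBRARY, "foaf:name", "Example City Library"),
    (LIBRARY, "rdfs:seeAlso", "http://dbpedia.org/resource/Example"),
    (LIBRARY, "dcterms:identifier", "DE-0000"),
    (LIBRARY, "gn:locatedIn", "http://sws.geonames.org/0000000/"),
    (LIBRARY, "org:classification", "http://example.org/type/public-library"),
    (LIBRARY, "foaf:isPrimaryTopicOf", "http://example.org/organisation/1/about"),
]


def profile_terms(triples):
    """Return (properties, classes) used in a description.

    rdf:type itself is not counted as a property of the profile; its
    objects are collected as the classes.
    """
    properties = {p for _, p, _ in triples if p != "rdf:type"}
    classes = {o for _, p, o in triples if p == "rdf:type"}
    return properties, classes


def vocabularies(terms):
    """Return the namespace prefixes behind a collection of prefixed names."""
    return {term.split(":", 1)[0] for term in terms}


properties, classes = profile_terms(triples)
print(sorted(properties))
print(sorted(classes))
print(sorted(vocabularies(properties | classes)))
```

Listing the terms this way is essentially what a JSON-LD context already does; it says nothing yet about how each term is used.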

Generally, when vocabulary terms are re-used, this may be accompanied by a more restrictive usage in the context of an application in comparison to the existing wider usage. The example description illustrates how a service might use properties and classes. foaf:name is the only property that is used according to the wider practice. The use of the other properties is adjusted for this specific service:

  • In the context of this fictional service, the generic property rdfs:seeAlso is only used to state links to a related DBpedia resource (line 12).
  • The generic dcterms:identifier property isn't used with different kinds of identifiers but solely indicates an organisation's ISIL. A corresponding datatype is also attached to the string value.
  • The property gn:locatedIn is solely used to link an organisation to the city it resides in, using the city's GeoNames URI.
  • org:classification is used in the context of the service to indicate the organisation type using a specific controlled vocabulary.

I named another example in the previous post on APs: the different interpretation of dcterms:alternative as "uniform title" by one group and as "other title information" by another. A machine-readable application profile should somehow reflect all these side-effects of reusing vocabularies.
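The narrowed usages described above can be thought of as per-property constraints on values. Here is a rough Python sketch of how an application might check them. The rule table, the `violations` helper and all sample values are hypothetical, and the ISIL pattern is a deliberately simplified assumption (two-letter country prefix, hyphen, short unit identifier) rather than the full standard.

```python
import re

# Hypothetical, simplified rules mirroring the narrowed usage in this service.
# The ISIL pattern is an assumption and does not cover every valid ISIL.
RULES = {
    "rdfs:seeAlso": re.compile(r"^http://dbpedia\.org/resource/\S+$"),
    "dcterms:identifier": re.compile(r"^[A-Z]{2}-[A-Za-z0-9:/-]{1,11}$"),
    "gn:locatedIn": re.compile(r"^http://sws\.geonames\.org/\d+/$"),
}


def violations(description):
    """Return (property, value) pairs that break the profile's narrowed usage.

    `description` maps prefixed property names to lists of string values.
    Properties without a rule (e.g. foaf:name) are accepted unchanged.
    """
    problems = []
    for prop, values in description.items():
        pattern = RULES.get(prop)
        if pattern is None:
            continue
        for value in values:
            if not pattern.match(value):
                problems.append((prop, value))
    return problems


ok = {
    "foaf:name": ["Example City Library"],
    "rdfs:seeAlso": ["http://dbpedia.org/resource/Example"],
    "dcterms:identifier": ["DE-605"],
    "gn:locatedIn": ["http://sws.geonames.org/2886242/"],
}
bad = {
    "rdfs:seeAlso": ["http://example.org/some-page"],
    "dcterms:identifier": ["urn:isbn:0000000000"],
}

print(violations(ok))   # → []
print(violations(bad))  # both values are reported
```

A difference in interpretation, as in the dcterms:alternative case, cannot be captured by a value pattern like this; it needs a human-readable note alongside the term.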

Describing an application profile with OAI-ORE

The basic idea of this post is: Application profiles are aggregations of terms from different RDF vocabularies. OAI-ORE is an RDF vocabulary for describing aggregations of things. Wouldn't it make sense, then, to describe APs using OAI-ORE?
If you want to know more about OAI-ORE, I suggest starting with the Wikipedia entry.

Here is how one could describe the underlying application profile for the fictional service described above:

  • As the aggregation itself is an abstract resource, it is described in an RDF document called a "resource map". After the prefix declarations (lines 1-9) the resource map is described (lines 11-18) with information about license, creator, creation date etc. and the information about which aggregation is described by the resource map.
  • In lines 12-24 the aggregation - i.e. the application profile itself - is described: its name and the aggregated resources (in this case the properties and classes used in the application profile) are stated.
  • The rest of the document (lines 26-60) is concerned with making clear where the usage of the re-used properties diverges from their original definition in their home vocabulary. In a human-readable way this information is provided in rdfs:comment but it also happens via the rdfs:domain and rdfs:range declarations. For example, lines 32-40 state the regular expression to which values of dcterms:identifier are constrained in the AP.
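The overall shape of such a document can be sketched in a few lines of Python. This is an illustration under assumptions, not the original resource map: it assumes that divergent usage is attached to ore:Proxy resources, and the example.org URIs, the proxy URI scheme and the helper names (`build_profile`, `to_ntriples`, `Literal`) are all hypothetical. The OAI-ORE terms used (ore:ResourceMap, ore:describes, ore:Aggregation, ore:aggregates, ore:Proxy, ore:proxyFor, ore:proxyIn) are part of the ORE vocabulary.

```python
ORE = "http://www.openarchives.org/ore/terms/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


class Literal(str):
    """Marks a string as an RDF literal rather than a URI."""


def build_profile(rem, aggregation, terms, usage_notes):
    """Build triples for a resource map, its aggregation and usage proxies.

    `terms` lists the URIs of the re-used properties and classes;
    `usage_notes` maps a term URI to a human-readable usage comment.
    """
    triples = [
        (rem, RDF + "type", ORE + "ResourceMap"),
        (rem, ORE + "describes", aggregation),
        (aggregation, RDF + "type", ORE + "Aggregation"),
    ]
    for term in terms:
        triples.append((aggregation, ORE + "aggregates", term))
    for i, (term, note) in enumerate(sorted(usage_notes.items()), start=1):
        proxy = f"{aggregation}/proxy/{i}"  # hypothetical URI scheme
        triples += [
            (proxy, RDF + "type", ORE + "Proxy"),
            (proxy, ORE + "proxyFor", term),
            (proxy, ORE + "proxyIn", aggregation),
            (proxy, RDFS + "comment", Literal(note)),
        ]
    return triples


def to_ntriples(triples):
    """Serialize triples as N-Triples lines (plain literals and URIs only)."""
    def fmt(node):
        if isinstance(node, Literal):
            escaped = node.replace("\\", "\\\\").replace('"', '\\"')
            return f'"{escaped}"'
        return f"<{node}>"
    return "\n".join(f"{fmt(s)} {fmt(p)} {fmt(o)} ." for s, p, o in triples)


REM = "http://example.org/profile/rem"          # hypothetical
AGG = "http://example.org/profile/aggregation"  # hypothetical
TERMS = [
    "http://xmlns.com/foaf/0.1/name",
    "http://purl.org/dc/terms/identifier",
    RDFS + "seeAlso",
]
NOTES = {"http://purl.org/dc/terms/identifier": "Only used for an organisation's ISIL."}

profile = build_profile(REM, AGG, TERMS, NOTES)
print(to_ntriples(profile))
```

Constraints such as rdfs:domain, rdfs:range or a regular expression for values would be further statements about the proxy, in the same way as the rdfs:comment here.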

Conclusion

OAI-ORE offers what one needs to represent application profiles - or at least the characteristics of APs we were looking for - in RDF. The question is whether it is also convenient to create and use, or whether one should look for a more lightweight alternative. I'd be happy to hear opinions and to learn about other approaches.

_______________

Footnotes
1 The example was constructed for the purpose of this text but it reflects current or past practices in the lobid organisations index.

Comments
  1. 14.11.2013

    Jan Schnasse says:

    Isn't it the goal of ontologies (e.g. OWL) to align your application-specific vocabulary with others? An Application Profile is made of "Functional Requirements", "Domain Model", "Description Set Profile and Usage Guidelines", and "Syntax Guidelines and Data Formats". Why not use OWL to define at least the Domain Model and the Formats? OK, I see that the proxy pattern fits your need to describe a specific context perfectly well. But with an aligned vocabulary you would achieve the same thing.

    1. 14.11.2013

      Adrian Pohl says:

      Creating new properties and classes for an application and aligning these to other vocabularies indeed makes sense in some circumstances. (Regarding the dcterms:identifier example we actually went that way and finally created a lv:isil property.) But I am addressing the case where I only reuse existing vocabularies without creating a new vocabulary. Defining a new vocabulary with OWL is simply no solution for this (re)use case.

      The question is whether using OWL on ore:Proxy resources makes any sense. It is definitely NOT simpler than creating a new aligned vocab, as you also have to deal with OAI-ORE...

  2. 14.11.2013

    Lars G. Svensson says:

    This is interesting work, Adrian. It would be intriguing to see an OAI-ORE description of the DINI-KIM profile for bibliographic metadata!

  3. 21.02.2014

    Felix Ostrowski says:

    What I am wondering is whether using OAI-ORE is actually necessary here. Why not simply create an application-specific ontology that defines subclasses and subproperties of the classes and properties in other vocabularies we want to use? Assuming it is valid for a closed world (the application that it is a profile for), such an ontology could also be used for validation.

    An example defining an application-specific view of Documents (having at least one creator of type Person and exactly one title) and Persons (having exactly one name, possibly being the author of a document and having a valid email address):