The Second (LRM focused) Technical Workshop in Grenoble
Xerox Research Centre Europe (XRCE) organized a second technical workshop, focused on the latest version of the Linked Resource Model (LRM) and on experimenting with its descriptive power on a few selected cases relevant to PERICLES partners working within Work Package 3 (Dependency and change management models for digital ecosystems). The LRM is a general-purpose modelling language being developed within PERICLES to enable the description of evolving digital ecosystems. A static version of the LRM, which enables modelling of entities and dependencies, has already been produced, and a dynamic model, incorporating ecosystem evolution, is currently under development. The workshop was hosted by XRCE in Grenoble and took place on 10, 11 and 12 December 2014. It was decided to concentrate on technical issues that are central to those in the PERICLES consortium who would actually be using the LRM, and to gather valuable feedback from them.
Amongst the participants were Johannes Biermann and Anna Grit Eggers from the University of Göttingen (UGOE), who are going to develop an LRM-derived/compatible ecosystem model. Anna also contributed to the creation of the PERICLES Extraction Tool (PET), which is expected, at some point, to generate LRM-compatible metadata on change events. We also invited Simon Waddington from King’s College London (KCL), who worked on the space sciences use cases, and Stratos Kontopoulos from CERTH, who used the static LRM to model the Tate use cases.
Our original agenda was left intentionally quite open, but was mainly organized around three steps:
(i) Share know-how as well as concrete tools and environments to develop and operate on ontologies (editors, verifiers, visualization and transformation tools);
(ii) Draw together the basic lines of an LRM based model to describe the example ecosystem developed in the deliverable “PERICLES Initial Report on Preservation Ecosystem Management” (section 6 - Exemplary application), and finally,
(iii) Introduce the work in progress along the dynamic LRM and accordingly, the proposed evolutions of the static LRM.
Day 1 (10 December 2014)
We presented the latest version of the LRM, split into a static part and a dynamic part, i.e. the ontological definitions adapted to model the dynamics of the ecosystem. With regard to the static part, some modifications have been introduced in order to distinguish more clearly abstract entities from concrete entities. The latter are defined as having a physical extension and are therefore characterized by a location attribute (specifying some spatial information; for a digital resource this information can be the URL required to retrieve and download the bit stream, i.e. its digital extension). Both can be used jointly to precisely describe a resource such as Bill Viola’s Nantes Triptych (see the case study presented on day 2) as both an abstract entity (the very idea of the artwork, as referred to by papers discussing the artist’s intention behind the created object) and as a concrete digital file (the video stream that one can load and play in order to manifest and perceive the artwork). Both entities can be related through a specific predicate named realizedAs, expressing that the video file is a concrete realization of the abstract art piece.
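The abstract/concrete distinction and the realizedAs predicate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the tiny triple store, the direction of the predicate (abstract entity as subject) and the example URL are all hypothetical and are not taken from the LRM ontology files.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AbstractEntity:
    """An entity with no physical extension, e.g. the idea of an artwork."""
    name: str


@dataclass(frozen=True)
class ConcreteEntity:
    """An entity with a physical or digital extension, hence a location."""
    name: str
    location: str  # e.g. the URL from which the bit stream can be retrieved


@dataclass
class Graph:
    """A tiny in-memory triple store: (subject, predicate, object)."""
    triples: set = field(default_factory=set)

    def realized_as(self, abstract: AbstractEntity, concrete: ConcreteEntity) -> None:
        # Assumed direction: the abstract entity is realized as the concrete one.
        self.triples.add((abstract, "realizedAs", concrete))

    def realizations(self, abstract: AbstractEntity) -> list:
        return [o for (s, p, o) in self.triples
                if s == abstract and p == "realizedAs"]


artwork = AbstractEntity("Nantes Triptych (the artwork as an idea)")
# Hypothetical location, used for illustration only.
video = ConcreteEntity("Nantes Triptych video file",
                       "http://example.org/nantes-triptych.mp4")

g = Graph()
g.realized_as(artwork, video)
```

Calling `g.realizations(artwork)` then returns the list of concrete files known to realize the abstract artwork.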
Most notably, the new version of the static LRM has been made independent from the PROV ontology (a recent W3C recommendation to express provenance information). One can still use PROV in any LRM derived sub-ontologies, but other provenance oriented models can also be involved at domain specific levels.
A special focus was placed on the versioning model we proposed as a way to capture and consistently track the evolution of particular resources. We first illustrated how the versioning model is connected to the LRM and its dependency mechanisms, and then explained how a versioning operation is decomposed step by step (considering the state of the instance graph before and after the versioning step).
The picture above shows the various possible states of a versioned resource, according to our semantic versioning model. The verified state (green) corresponds to a situation where the coherence of invariant and specification properties attached to the resource has been formally verified. This verification process can be purely automated (e.g. a software test procedure), left to the responsibility of a human operator (is the new video compliant with the detailed synopsis that specifies the resource?), or a combination of both (e.g. a formal proof involving a logic-based proof assistant and a knowledgeable operator having the right skill to build the proof).
After diving into our resource lifecycle model and discussing the rationale behind the proposal, we decided to end this first day with an easier topic: discussing and demonstrating the various tools that can help us develop ontologies, among which we distinguished:
- Notepad++: general purpose editor (and a particular syntax highlighting mode for Turtle developed on purpose by JYVD)
- RDF2RDF: a Java-based conversion tool for various RDF linearization formats (it also turned out to be useful for catching syntax errors at an early stage)
- Protégé with its many standard plugins (a must)
- XTurtle: an Eclipse plugin for Turtle, offering a very strong integration of the Turtle syntax and lexical semantics with editing functionalities
- The Apache Jena framework
Day 2 (11 December 2014)
We decided to start working on Simon’s mock example (a document rendering process, see first draft diagram below) and to build an LRM-compliant model of this particular case study.
This example offered us the opportunity to:
- Use the new static LRM
- Review the ecosystem related vocabulary and concepts (Process, user, policy, …)
- Discuss the relations between these notions and represent them either as properties or as dependencies. This naturally resulted in adding specializations of dependencies (which was what was expected by the designers of the LRM model).
An important question that kept emerging from the discussions was, “What is the relationship between a process and a policy?” We will need to investigate whether there is indeed a 1-1 relationship and whether it can be represented via a dependency.
- As a result, we proposed various concepts:
- TechnicalService: Modelled as an lrm:AbstractEntity. It relies on Infrastructure, which has SW and HW components (via the “isComposedOf” property), and relies on an Interface. SWComponents relate to InfrastructureComponents via the “runsOn” property.
- Process: Defined as subclass of lrm:Activity (should have a temporal extension that will be discussed later). Via a “describesProcess” property we relate a ConcreteEntity to a Process.
- Policy: Not sure how to model them. Temporarily, we chose to represent them as lrm:Digital-resources. We also introduced a dependency of a process on a policy, which leaves room for specifying the intention of the policy.
- User: Represented as lrm:Agent (since they are associated with performing activities).
- Digital Object: Modelled simply as a subclass of lrm:Digital-resource (D5.1.1 does not introduce any further refinements). We need to clarify if there are differences between digital objects and digital resources (is maybe one a specialization of the other? do we need to introduce the notion of aggregation?). For now, we decided to avoid further specializing the definitions.
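The provisional concepts listed above can be written down as plain triples to make the proposed hierarchy easier to inspect. The sketch below is illustrative only: the "ex:" prefix, the reading of "modelled as" and "represented as" as subclass relations, and all instance names are assumptions made for this example.

```python
SUBCLASS_OF = "rdfs:subClassOf"

# Provisional class hierarchy from the discussion (the subclass reading
# of "modelled as" / "represented as" is an assumption of this sketch).
schema = {
    ("ex:TechnicalService", SUBCLASS_OF, "lrm:AbstractEntity"),
    ("ex:Process", SUBCLASS_OF, "lrm:Activity"),
    ("ex:Policy", SUBCLASS_OF, "lrm:Digital-resource"),  # temporary choice
    ("ex:User", SUBCLASS_OF, "lrm:Agent"),
    ("ex:DigitalObject", SUBCLASS_OF, "lrm:Digital-resource"),
}

# Hypothetical instance data using the isComposedOf and runsOn properties.
instances = {
    ("ex:archiveInfrastructure", "ex:isComposedOf", "ex:renderingSoftware"),
    ("ex:archiveInfrastructure", "ex:isComposedOf", "ex:storageServer"),
    ("ex:renderingSoftware", "ex:runsOn", "ex:storageServer"),
}


def subclasses_of(parent, triples):
    """Return the direct subclasses of parent, sorted by name."""
    return sorted(s for (s, p, o) in triples
                  if p == SUBCLASS_OF and o == parent)


def objects(subject, predicate, triples):
    """Return all objects linked to subject through predicate, sorted."""
    return sorted(o for (s, p, o) in triples
                  if s == subject and p == predicate)
```

For instance, `subclasses_of("lrm:Digital-resource", schema)` lists both Policy and DigitalObject, which makes visible the open question of whether these two notions should really share the same parent class.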
Then we went on to analyse the ecosystem mock example proposed by UGOE (using LRM classes and dependencies to model an ecosystem, mainly around services and related infrastructure constraints).
- We defined a top-level class called “EcosystemEntity” having properties common to all child-classes.
- We defined properties for representing affinity, probabilities, sensitivity etc.
- We identified several open issues:
- “changeProbability” will depend on time (should it also be associated with entities?).
- “changeTimeframe” requires more thorough investigation.
- The datatype property “owner” associates a policy to a string value. It is mostly used as a decoration, but needs further discussions (do we need to involve a stakeholder here? if the stakeholder changes, does that imply something for changes taking place in the ecosystem?).
- Implicit dependencies? Do we need to include them at the modelling level? The first impression is “No”, so at this point we decided to omit them. They may be added in the future if we come up with an example that justifies their use.
- Weights on dependencies? Not sure how to calculate them and how to assign different weights to different points in time.
Eventually, we came back to the dynamic LRM and the versioning model through developing the example inspired by the Bill Viola Artwork case study proposed by Tate (The Nantes Triptych, see below).
Bill Viola’s Nantes Triptych
Bill Viola’s Nantes Triptych was made in 1992 and acquired in 1994. This work was originally shown using cathode ray tube projectors and laser disc, video player. It was synchronized using a simple computer control system. It is made up of three images and one soundtrack. […]
Later (2007) the artist changed the dimensions of the work to allow a smaller version to be installed. Later again (2007/8) the artist wanted to revisit the original SD master material and upgrade it to HD. This is still under discussion.
Before going to the use case,
- We first discussed the underlying temporal model, based on three fundamentally distinct notions: instants, durations and intervals.
- We introduced a novel base class suited to describing time-sensitive entities: the Temporal Entity/Resource class, formally defined as the intersection of the general Entity/Resource class (be it abstract or concrete) and a time interval. Concretely, this class aims at modelling all entities that have an extension in time, however precise the temporal information is (the system may know when the entity was created, but not when it was/will be destroyed; or the reverse; or it may know the whole time extension).
- We introduced the Event model, and the fundamental distinction we propose between endogenous and exogenous events. The first category is dedicated to internal operations required to keep the representation coherent with the ecosystem, whereas the second category is dedicated to modelling the changes that occurred in the ecosystem itself. This notion of endogenous vs exogenous changes is deeply related to the principle of reflexivity that we believe is fundamental to the PERICLES approach.
- We introduced Activities, and how they relate to events. More precisely, an activity is formally defined as the intersection of the Abstract Entity class with the Temporal Entity class.
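The notions above (a temporal entity whose time extension may be only partially known, and an activity as the intersection of abstract and temporal entities) can be sketched as follows. This is a minimal illustration, assuming Python classes stand in for ontology classes and multiple inheritance stands in for class intersection; none of the names come from the LRM specification.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class TimeInterval:
    """A time interval whose bounds may be unknown (None)."""
    start: Optional[datetime] = None
    end: Optional[datetime] = None

    def __post_init__(self):
        if self.start and self.end and self.start > self.end:
            raise ValueError("interval start must not be after its end")

    def contains(self, instant: datetime) -> Optional[bool]:
        """True or False when the known bounds decide it, None otherwise."""
        if self.start is not None and instant < self.start:
            return False
        if self.end is not None and instant > self.end:
            return False
        if self.start is None or self.end is None:
            return None  # not enough information to conclude
        return True


class Entity:
    def __init__(self, name: str):
        self.name = name


class AbstractEntity(Entity):
    pass


class TemporalEntity(Entity):
    def __init__(self, name: str, extent: TimeInterval):
        super().__init__(name)
        self.extent = extent


class Activity(AbstractEntity, TemporalEntity):
    """Assumed encoding of 'Abstract Entity intersected with Temporal Entity'."""


# Creation date known, destruction date unknown.
migration = Activity("format migration",
                     TimeInterval(start=datetime(2014, 12, 10)))
```

Here `migration.extent.contains(datetime(2014, 1, 1))` is `False`, while a query for any date after the start returns `None`, reflecting that the end of the activity is not known.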
Regarding the versioning model, we introduced more precisely the various kinds of versioning operations, based on the impact of the change on the usage/perception of the resource (note that in all cases except for updates, the change may concern the resource itself, its specification, or both: from the standpoint of versioning, the resource and its specification form a unified entity, since both are needed to assess the impact):
- Major versions are used to capture change having a backward incompatible impact on the resources that depend on the subject resource (NB the (in)compatibility is appreciated on the basis of the specification associated with the subject resource, in all cases);
- Minor versions are used to capture backward compatible modifications;
- Micro versions are dedicated to changes that have no visible effect, according to the specification (understood that other impact may exist, but are not considered as significant);
- Updates are used to record changes that are not yet qualified (they may be qualified later during the life cycle of the resource).
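The three qualified kinds of change can be mapped to version labels with a small function. This is a sketch under assumptions: the three-part "major.minor.micro" label format matches the labels used in this report (e.g. 1.1.3), but resetting the lower-order numbers to zero on a major change is borrowed from common semantic versioning practice and is not confirmed by the LRM specification. Unqualified updates are deliberately left out, since they do not yet determine a new label.

```python
from enum import Enum


class Change(Enum):
    MAJOR = "major"  # backward incompatible for dependent resources
    MINOR = "minor"  # backward compatible modification
    MICRO = "micro"  # no visible effect according to the specification


def next_version(label: str, change: Change) -> str:
    """Compute the next version label for a qualified change."""
    major, minor, micro = (int(part) for part in label.split("."))
    if change is Change.MAJOR:
        return f"{major + 1}.0.0"
    if change is Change.MINOR:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{micro + 1}"
```

With this convention, `next_version("1.1.3", Change.MINOR)` gives `"1.2.0"`.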
The figure above illustrates a transformation path for a resource r with an associated specification Ir; note that the satisfaction relation is verified at the origin of the transformation, as it is at the end (the fact that resource r satisfies the specification Ir is formally denoted by the logical relation ⊨). The transformation of the pair (r, Ir) is decomposed into three change tracking steps and one certification step: starting from a certified version with label 1.1.3, it ends in a certified version 1.2.0 (a so-called minor version evolution). Note that the first two updates relate to a change of the resource itself (symbolized by δ1 and δ2), whereas the last one relates to a change of its specification (δ3).
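The path described in this paragraph can also be written compactly. In the notation below the primes marking successive states of r and Ir are an assumption introduced for this sketch; the labels 1.1.3 and 1.2.0, the steps δ1 to δ3 and the satisfaction relation ⊨ are those of the text. The arrows use \xrightarrow, which requires the amsmath package.

```latex
\[
(r, I_r)_{1.1.3}
\xrightarrow{\;\delta_1\;} (r', I_r)
\xrightarrow{\;\delta_2\;} (r'', I_r)
\xrightarrow{\;\delta_3\;} (r'', I_r')
\xrightarrow{\;\text{certification}\;} (r'', I_r')_{1.2.0}
\]
\[
\text{with}\quad r \models I_r \quad\text{at the origin and}\quad r'' \models I_r' \quad\text{at the end.}
\]
```

Only the first and last pairs carry a version label, since the intermediate states result from updates that have not yet been certified.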
Going into lower abstraction levels, we also discussed our idea of integrating a rule-based mechanism in order to describe, in a rather natural way, the low-level endogenous operations needed to maintain graph consistency. This would be a mix of forward chaining and backward chaining rules, whose syntax would be inspired by the W3C proposal called SWRL (Semantic Web Rule Language). The non-monotonicity of the rewriting/inferencing execution model was debated.
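To give a feel for the forward chaining half of this idea, here is a minimal sketch in Python. It is not the proposed rule language: the rule, the "ex:" predicates and the status values are hypothetical, and the engine is purely monotonic (it only adds facts), so it does not address the non-monotonic behaviour debated during the workshop.

```python
def propagate_change(facts):
    """Hypothetical rule: whatever depends on a changed resource
    must itself be flagged for verification."""
    changed = {s for (s, p, o) in facts
               if p == "ex:status" and o == "changed"}
    return {(s, "ex:status", "toBeVerified")
            for (s, p, o) in facts
            if p == "ex:dependsOn" and o in changed}


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new fact is derived."""
    known = set(facts)
    while True:
        derived = set()
        for rule in rules:
            derived |= rule(known) - known
        if not derived:
            return known
        known |= derived


facts = {
    ("ex:report", "ex:dependsOn", "ex:video"),
    ("ex:video", "ex:status", "changed"),
}
closure = forward_chain(facts, [propagate_change])
```

After the run, `closure` contains the original facts plus the derived triple flagging `ex:report` for verification.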
Finally, we worked on a detailed example to illustrate Tate’s ‘change scenario’; the figure below represents a detailed description of the mock scenario (only the last stage: making the hypothesis that Bill Viola changed the graphical resolution from low to high), as reflected by an LRM compliant RDF graph instance:
As we reviewed the proposed solution, it turned out to be quite straightforward to model, and no apparent issues emerged.
Day 3 (12 December 2014)
We did a recap of the dynamic LRM and the versioning model, with a focus on SWRL-inspired rules embedded in the knowledge representation. Johannes mentioned the JBoss Drools platform and recommended that we analyse its model and tools.
We worked on Anna’s example, related to meta-information potentially produced by the PET. After listing some possible information items, we discussed and clarified the abstract vs concrete entity notion. We agreed on the interest and possibility of enriching PET outputs with LRM-compatible descriptions (dependencies and other meta-information).
We then worked more deeply on the time model, and on the use of Allen’s algebra to reason about time, especially when time information is partially specified, partially known, or known only with uncertainty or poor/undefined precision.
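As a reminder of what Allen’s algebra provides, the sketch below computes which of the thirteen basic relations holds between two fully known intervals, and shows one simple way to handle uncertainty: collecting the set of relations that remain possible over several candidate intervals. The function names and the candidate-based treatment of uncertainty are assumptions of this example, not part of the LRM time model.

```python
def allen_relation(a, b):
    """Return the basic Allen relation between intervals a and b.

    Each interval is a (start, end) pair with start < end.
    """
    (a1, a2), (b1, b2) = a, b
    if a1 >= a2 or b1 >= b2:
        raise ValueError("intervals must satisfy start < end")
    if a2 < b1:
        return "before"
    if a2 == b1:
        return "meets"
    if b2 < a1:
        return "after"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    return "overlaps" if a1 < b1 else "overlapped-by"


def possible_relations(a, b_candidates):
    """Relations that may hold when b is only known to be one of
    several candidate intervals."""
    return {allen_relation(a, b) for b in b_candidates}
```

For example, `allen_relation((1, 3), (3, 5))` is `"meets"`, and if the second interval is only known to be either (3, 5) or (4, 6), `possible_relations((1, 3), [(3, 5), (4, 6)])` yields both `"meets"` and `"before"`.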
- The LRM model has been deeply revised. There are now two “modules”: Static and Dynamic.
- LRM is now detached from PROV.
- We also discussed tooling (tools used and other candidate tools).
- We worked on mock examples provided by partners.
- We discovered inconsistencies and iteratively fixed them.
- We relied on the definitions of D5.1.1 (which may also need refinements).
- We realized there are specific notions in the models that need clarifications.
- We looked at the semantic versioning aspects and implemented the example demonstrated during the recent EU review meeting (“Nantes Triptych”). Here the video is changed from SD to HD resolution, which should be represented by semantic versioning.
- We tried to clarify the distinction between abstract and concrete entities.
- We had a short discussion about the upcoming rule language. It can be used to create new versions (semantic versioning) as well as for generic OWL model usage. Specification is a work in progress.
- We plan to work on LRM serialization of results (SEI extraction) from PET. This will lead to ontology enrichment, i.e. add new instances but also classes to the ontology models.
- The time model (based on Allen’s interval algebra) needs further elaboration and will be delivered as a milestone to be used by the rest of the partners.
10-12 December 2014
Johannes Biermann, Anna Grit Eggers (UGOE)
Jean-Pierre Chanod, Nikolaos Lagos, Jean-Yves Vion-Dury (Xerox Research Centre Europe)
Simon Waddington (King’s College London)
Stratos Kontopoulos (CERTH)