Pericles project
Promoting and Enhancing Reuse of Information throughout the Content Lifecycle taking account of Evolving Semantics

PERICLES workshop on “Automated capture of the environment in a sheer curation context”



PERICLES joined the 10th International Digital Curation Conference (IDCC) with a half-day workshop on “Automated capture of the environment in a sheer curation context” on February 11, 2015 at 30 Euston Square, London, UK.

The aim of the workshop was to disseminate current outputs to people responsible for any aspect of the long-term use of data. It was also an opportunity to engage with IDCC delegates on current project activities and to gather input that will help make upcoming developments as relevant as possible.

The workshop was well attended by a very engaged and knowledgeable audience, including repository providers, curators, developers and researchers from Europe, the USA, Canada and Japan.

The event focused on three aspects of the PERICLES project: information extraction, information encapsulation and information selection.

Fabio Corubolo (University of Liverpool) described the motivation for developing the PERICLES Extraction Tool (PET, https://github.com/pericles-project/pet), which extracts the significant environment information for a digital object in order to support its long-term access and reuse. He started by introducing the concept of Significant Environment Information (SEI), and described how it could be determined and used in long-term preservation to provide better context for data use and reuse. He went on to demonstrate the tool with an example: recording the process of resolving an operational problem with a piece of equipment. The remedial action left a digital trail from the different activities involved (such as searching a fault database, looking up the past history of the equipment and contacting experts). The tool would record these changes as they happened, allowing further analysis and inference of cross-object dependencies. He showed how this information could serve several purposes: as historical information for the equipment, as information relevant to assessing the quality of the data, and as a knowledge base for the operation of the equipment. The demonstration drew many questions from the audience, who wanted to understand how to configure and use the tool and in which environments it could be applied.

PET architecture
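To make the idea of inferring cross-object dependencies from a recorded trail more concrete, here is a minimal sketch in Python. It is not how the PET itself works, and it does not use the PET's data model: the Event class, the infer_dependencies function, the object names and the five-minute window are all hypothetical, chosen only to mirror the equipment example from the talk. The assumption is simply that objects touched close together in time during the same activity may depend on one another.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One recorded interaction with a digital object (hypothetical model)."""
    timestamp: float  # seconds since the start of the recording session
    obj: str          # identifier of the object that was touched
    action: str       # e.g. "search", "open", "send"


def infer_dependencies(events, window=300.0):
    """Count how often two distinct objects are used within `window` seconds.

    Returns a dict mapping an alphabetically sorted pair of object
    identifiers to the number of times they co-occurred.
    """
    ordered = sorted(events, key=lambda e: e.timestamp)
    weights = defaultdict(int)
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.timestamp - first.timestamp > window:
                break  # later events are even further away in time
            if first.obj != second.obj:
                pair = tuple(sorted((first.obj, second.obj)))
                weights[pair] += 1
    return dict(weights)


# Hypothetical trail left while resolving an equipment fault.
events = [
    Event(0.0, "fault_database", "search"),
    Event(60.0, "equipment_history.log", "open"),
    Event(120.0, "expert_email", "send"),
    Event(4000.0, "unrelated_report.pdf", "open"),
]

deps = infer_dependencies(events)
for pair, count in sorted(deps.items()):
    print(pair, count)
```

In this sketch the three objects used during the fault resolution end up linked to each other, while the report opened much later is not linked to anything. A real tool would need richer evidence than timing alone, which is where the environment model discussed at the workshop comes in.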

Next on the programme was an introduction to the framework for scenario-based information encapsulation. Anna Grit Eggers (Göttingen State and University Library) explained the concept behind the PERICLES Content Aggregation Tool (PeriCAT), work on which had recently started and is due to be completed later in 2015. She explained how the diversity of Information Encapsulation (IE) techniques (e.g. packaging and embedding) makes it hard to decide which technique to use for a particular scenario. PeriCAT is designed to recommend an information encapsulation technique by asking the user to enter a set of criteria, such as how important it is that the payload embedded in the carrier file remains visible. Before the workshop, Anna had presented a poster describing the tool at the main IDCC conference.

PeriCAT poster

The early work on PeriCAT drew several questions from the audience who were interested in determining what type of embedded or encapsulated information should be stored in a digital object and also in a reuse environment.
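As an illustration of what a criteria-based recommendation might look like, here is a minimal sketch in Python. It does not reflect PeriCAT's actual design or code: the criteria names, the suitability scores and the recommend function are assumptions made up for this example. Each technique is given a score per criterion, the user supplies a weight for each criterion they care about, and the technique with the highest weighted sum is suggested.

```python
# Hypothetical suitability scores (0.0 to 1.0) per technique and criterion.
# These numbers are illustrative assumptions, not measured values.
TECHNIQUES = {
    "packaging": {
        "payload_visibility": 0.9,  # payload is easy to see and access
        "carrier_integrity": 0.9,   # carrier file is left unchanged
        "inseparability": 0.2,      # payload and carrier are hard to split
    },
    "embedding": {
        "payload_visibility": 0.3,
        "carrier_integrity": 0.4,
        "inseparability": 0.9,
    },
}


def recommend(weights):
    """Return (best_technique, scores) for the given criterion weights.

    `weights` maps criterion names to the importance the user assigns
    to them; criteria the user does not mention count as zero.
    """
    scores = {
        name: sum(weights.get(criterion, 0.0) * value
                  for criterion, value in properties.items())
        for name, properties in TECHNIQUES.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# A user who mainly cares about a visible payload.
visible_choice, visible_scores = recommend(
    {"payload_visibility": 1.0, "inseparability": 0.2}
)

# A user who mainly wants payload and carrier to stay together.
bound_choice, bound_scores = recommend({"inseparability": 1.0})

print(visible_choice, bound_choice)
```

A weighted sum is only one possible way of combining criteria; it is used here because it is easy to follow, not because the project has adopted it.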

Next on the presentation panel was Simon Waddington (King’s College London) who provided an outline of the objectives of the recently started task on appraisal. The working definition of appraisal was “identifying digital objects of continuing business value”. The question of relevant appraisal criteria and their evaluation was discussed, motivated by examples from the PERICLES science and media case studies.

Simon introduced the main objective of the task: providing partial or full automation, or computer-assisted guidance, for the evaluation of key appraisal criteria. This includes identifying the criteria where automation offers clear value and modelling the relevant decision processes. The evident interest from the audience confirmed that these topics matter to the community and supported the approach PERICLES has taken, which is based on modelling the environment of digital objects.

Three parallel breakout sessions were held relating to the introductory presentations. A set of sample scenarios drawn from the PERICLES project was explored during the appraisal breakout. The diversity of the group, which included archivists, computer scientists and domain specialists in various arts and science topics, contributed to a very stimulating discussion. Participants were given an extensive list of appraisal criteria and asked to determine which criteria were most relevant to the given scenarios and which would benefit most from automation. Outcomes of the group work included the proposed appraisal criteria, as well as considerable useful input on relevant resources, project outputs and ideas for automation.

Appraisal session

The practical demonstration of the PET generated many useful questions on how the tool could be used. Participants were interested to find out whether the PET could identify which document was used to solve the problem and whether it could interact with the user to get feedback. They also asked whether the PET could detect web dependencies. Overall, attendees found the tool useful and its approach novel, and they recognised the importance of automated data collection. There was interest in the work on defining a model for the information collected by the PET and in using that model to construct dependency graphs. It was proposed that the tool could offer a means of asking for user input, to be integrated with the data it already collects. It was also suggested that the tool could be submitted to the Open Preservation Coalition for longer-term sustainability. The concept of SEI was considered of interest, and participants said they intended to follow the future development of the PERICLES environment and ecosystem models and analysis tools.

With regard to PeriCAT, which is due for publication later this year, the workshop offered interesting use cases that will be useful for the further development of the encapsulation tool in the coming months.

Overall, the half-day workshop was a big success and very inspiring for the PERICLES team, and we thank our engaged audience for providing such useful feedback.

IDCC 2015 workshop 11 February, 2015, 9am-12.45pm
