Exploring appraisal, quality assurance and risk assessment in the data continuum
PERICLES was invited by the IDCC to run a workshop on 25 February 2016 as part of the 11th International Digital Curation Conference (IDCC16, Amsterdam). We prepared a half-day session on appraisal, quality assurance and risk assessment in relation to the lives of complex digital objects.
The workshop was designed to introduce participants to some of the key concepts being explored in the PERICLES approach of model-driven preservation in a continually evolving environment. It had two aims: to identify parts of appraisal processes that lend themselves to automation and discuss development plans for appraisal tools; and to explore risk assessment and quality assurance based on the dependency modelling approach we are investigating.
Image: Illustration of PERICLES model-driven preservation approach
As the PERICLES project reaches its final year, the opportunity to engage with the data management communities represented another important step towards consolidating our core concepts and validating the assumptions behind the implementation of models and software components. With this objective in mind, we planned an interactive session to follow the introductory presentations, to facilitate knowledge exchange and gather input from the practitioner community.
The PERICLES workshop attracted a very engaged and inspiring audience comprising professionals and experts from the library, science and research data management domains.
The half-day session started in the morning with an introduction by Simon Waddington (King’s College London) to the key objectives of PERICLES, and in particular the model-driven approach, which enables dynamic representations of digital ecosystems to be created and manipulated to study the potential impact of change. He went on to explain the challenges related to the appraisal of complex digital objects, with a special focus on technical appraisal methods and the assessment of risks and their proximity (i.e. the timescale in which a risk is expected to occur).
This was followed by an introduction to the work done by PERICLES researcher Fabio Corubolo (University of Liverpool) in the areas of policy implementation for automated change management and quality assurance (QA).
Image: Fabio Corubolo introducing Quality Assurance and change management in the context of PERICLES
The group then split into two breakout teams to discuss the themes of appraisal and quality assurance. To trigger a useful dialogue, the groups were introduced to a scenario where scientific experiments are carried out in a Biophysics Institute that has a mandate to retain research together with the software required for its execution for a minimum of 10 years.
The example prompted the appraisal group to share real-life examples of the heterogeneity and complexity of data that characterise some of their institutional archives, and to describe the digital preservation challenges these raise. There is a demand for automated appraisal tools, particularly from university repositories that receive large amounts of heterogeneous data for archiving but have limited resources to perform manual appraisal.
Participants also reflected on the importance of tracking data usage and using statistics to understand the value of content for appraisal purposes. Along the same lines, the usefulness of tools capable of automating the creation of metadata around content (e.g. PET) was central to the discussion.
The group also discussed the implications for users of archives where much of the content is no longer reusable, particularly entities subject to rapid technological change such as software. It was suggested that risk information could be integrated with search and retrieval tools to give users an indication of the likely reusability of the digital objects.
Another interesting topic was community tracking to assess shifts in the usage of certain software in various disciplines, together with the need to automate the capture of changes in the whole preservation environment, for example with regard to metadata schemas.
Overall, the group found the session very useful and confirmed the growing demand for risk analysis and predictive proximity estimation to assess whether content is reusable.
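As a rough illustration of that suggestion, the sketch below attaches a coarse reusability hint to a search result, based on the nearest known risk. The `RiskEstimate` and `SearchHit` types, the thresholds and the labels are all hypothetical; they are assumptions made for this example and are not taken from any PERICLES tool.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical types, thresholds and labels, for illustration only.

@dataclass
class RiskEstimate:
    description: str
    proximity_years: float  # estimated time until the risk is expected to occur


@dataclass
class SearchHit:
    title: str
    risks: List[RiskEstimate]


def reusability_hint(hit: SearchHit, soon: float = 1.0, later: float = 5.0) -> str:
    """Map the nearest risk of a search hit to a coarse label for display."""
    if not hit.risks:
        return "no known risks"
    nearest = min(r.proximity_years for r in hit.risks)
    if nearest <= 0:
        return "likely not reusable"  # the risk is assumed to have occurred already
    if nearest <= soon:
        return "at risk soon"
    if nearest <= later:
        return "at risk later"
    return "likely reusable"
```

A search interface could then show the returned label next to each result, so that users see the risk information before they retrieve the object.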
The second working group (quality assurance) also reflected on the given scenario and identified similarities with their own institutions. They discussed how policies are currently being implemented through the system infrastructure, and then discussed the proposed model for policy and quality assurance, which is based on the same scientific scenario.
Image: Policy QA scenario ecosystem
The nature of policies differs from institution to institution. All agreed that policies should always reflect the vision of the institution and therefore contain principles that are more aspirational in nature. This often leads to MoSCoW-type distinctions in policy formulation: Must - Should - Could - Would. Aspirational policies are considered important drivers that should be modelled even when they cannot be completely implemented.
It was considered that a model such as the one described in PERICLES can be of great use not only for quality assurance but also for communicating user requirements across different roles in the organisation. The group discussion highlighted that formal policy languages and policy frameworks are rarely found in institutions because they often prove hard to implement, and that quality assurance is, more often than not, limited to bit-level preservation. Translating policies into automated rules has proved to be a serious issue in previous projects. The group also highlighted the possibility of using the model to derive user requirements from high-level, aspirational policy, in line with the policy-driven approach.
Some very useful suggestions were made on possible additions to the policy and QA model, such as: the importance of expressing the level of implementation of policies and QA, so that these are represented in the model even before they are implementable (aspirational policies); and the importance of expressing the frequency of QA execution.
Participants recognised the potential usefulness of the proposed partially automated quality assurance mechanisms. They also saw value in the ecosystem model as a means of aligning the infrastructure at a high level, with policy and ecosystem modelling acting as an intermediary that provides a common language. However, they noted that in their own institutions it is sometimes hard to justify the time spent on such models. This suggests that a lightweight, functional model such as the one we propose could make it easier to create models that are generic yet tailored to their use, without requiring a high investment in more complex, formal models.
Some participants in the group expressed interest in testing the proposed methodologies in their institutions, and there was general recognition of the potential usefulness of the methodology.
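The two suggested additions could be sketched as follows. This is only a minimal illustration under our own assumptions: the `Policy` class, the `ImplementationLevel` values and the `qa_due` helper are hypothetical names invented for this example and do not reflect the actual PERICLES policy and QA model.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class ImplementationLevel(Enum):
    """Hypothetical levels; aspirational policies are modelled but not checked."""
    ASPIRATIONAL = "aspirational"
    PARTIAL = "partial"
    IMPLEMENTED = "implemented"


@dataclass
class Policy:
    statement: str
    level: ImplementationLevel
    qa_interval_days: Optional[int] = None  # None means no QA check is scheduled
    last_checked: Optional[date] = None


def qa_due(policy: Policy, today: date) -> bool:
    """Return True if a QA check for this policy should run today."""
    if policy.level is ImplementationLevel.ASPIRATIONAL:
        return False  # kept in the model, but nothing to execute yet
    if policy.qa_interval_days is None:
        return False
    if policy.last_checked is None:
        return True  # never checked so far
    return today - policy.last_checked >= timedelta(days=policy.qa_interval_days)
```

In the workshop scenario, the mandate to retain research together with its software for at least 10 years could be one such `Policy`, with the QA frequency stating how often that retention is verified.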
Overall, the breakout sessions were very well received and gave us the opportunity to relate PERICLES research to real-life examples of digital preservation challenges and requirements for appraisal and quality assurance.
We thank our very engaged audience for joining our workshop and providing very useful feedback.