The Open University

Runtime Adaptation for Reliability

Some complex software systems, such as Eclipse, are made of several components that can be installed, run, stopped, and updated independently. This provides several advantages over distributing the software as a single component: updates to specific components can be distributed without having to re-package the entire software; several organisations can provide components and features independently from the core of the software; users can choose to install and use only the features they want or need; and dependencies between components allow for a wide variety of working configurations. However, the ability to deploy a custom configuration, combined with varying levels of quality depending on the vendors that release components, can sometimes lead to bugs or crashes in components. In systems with many components and complex dependencies, reverting a component to a previous version can be tedious if done manually. In this talk, we propose to automate the downgrade of faulty components, using a three-step approach:
  • identify the faulty component from an error stack-trace;
  • compute, given a suitable proximity function, the closest configuration that allows the faulty component to be downgraded, while keeping dependencies between components satisfied and minimising the changes to be made to the configuration;
  • update the configuration so the user can continue using the software, almost without interruption.
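The three steps above can be illustrated with a minimal sketch. It is not the implementation presented in the talk: the component catalogue, the package-to-component mapping, and all function names are hypothetical, the proximity function is assumed to be the number of components whose version changes, and the search is a brute-force enumeration that would only be practical for a handful of components.

```python
import re
from itertools import product

# Hypothetical catalogue: component -> {version: {dependency: allowed versions}}.
CATALOGUE = {
    "core": {1: {}, 2: {}},
    "editor": {1: {"core": {1}}, 2: {"core": {1, 2}}, 3: {"core": {2}}},
    "git": {
        1: {"core": {1, 2}, "editor": {1, 2}},
        2: {"core": {2}, "editor": {3}},
    },
}

# Hypothetical mapping from Java package prefixes to the owning component.
PACKAGE_OWNERS = {
    "org.example.core": "core",
    "org.example.editor": "editor",
    "org.example.git": "git",
}

# Matches Java stack frames such as "\tat pkg.Class.method(File.java:42)".
FRAME = re.compile(r"^\s*at\s+([\w.$]+)\(")


def identify_faulty(stack_trace):
    """Step 1: blame the component that owns the topmost recognised frame."""
    for line in stack_trace.splitlines():
        match = FRAME.match(line)
        if not match:
            continue
        for prefix, component in PACKAGE_OWNERS.items():
            if match.group(1).startswith(prefix + "."):
                return component
    return None


def is_consistent(config):
    """Check that every dependency of every installed version is satisfied."""
    for component, version in config.items():
        for dep, allowed in CATALOGUE[component][version].items():
            if config.get(dep) not in allowed:
                return False
    return True


def distance(a, b):
    """Assumed proximity function: number of components whose version differs."""
    return sum(1 for component in a if a[component] != b[component])


def closest_downgrade(current, faulty):
    """Step 2: nearest consistent configuration with `faulty` downgraded."""
    names = sorted(CATALOGUE)
    best = None
    for versions in product(*(sorted(CATALOGUE[n]) for n in names)):
        candidate = dict(zip(names, versions))
        if candidate[faulty] >= current[faulty]:
            continue  # the faulty component must move to an older version
        if not is_consistent(candidate):
            continue
        if best is None or distance(current, candidate) < distance(current, best):
            best = candidate
    return best


def plan_changes(current, target):
    """Step 3: list the (old, new) version changes to apply."""
    return {c: (current[c], target[c]) for c in current if current[c] != target[c]}


trace = (
    "java.lang.NullPointerException\n"
    "\tat org.example.git.Fetch.run(Fetch.java:42)\n"
    "\tat org.example.core.Jobs.execute(Jobs.java:10)\n"
)
current = {"core": 2, "editor": 3, "git": 2}
faulty = identify_faulty(trace)
target = closest_downgrade(current, faulty)
print(faulty, plan_changes(current, target))
```

In this toy catalogue, downgrading `git` also forces `editor` back one version, because the older `git` does not accept `editor` version 3. This is the kind of cascading change that the proximity function is meant to keep small.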
We also present a case study we are currently running, using Eclipse and OSGi, which involves hundreds of components, and describe a further case study we would like to pursue, using the Gentoo Linux distribution, which involves over 15,000 packages.