FORMAL VERIFICATION OF SAFETY-CRITICAL DISTRIBUTED SYSTEMS

  • Elena Pagani profile
    Elena Pagani
    11 April 2016 - updated 4 years ago
    Total votes: 3

Reference persons:

 

The big picture: Describe your vision for a game-changing future technology. Why is it new? What difference would it make for Europe's economy, society and citizens?

Many distributed systems relay on sophisticated multi-parties algorithms, whose correctness verification is a crucial, although challenging, task.  Examples of those systems range from replicated database maintenance in cloud platforms, to wireless sensor networks for industrial plants and environmental monitoring, from distributed simulation platforms for air traffic prediction and control, to infrastructures for reliable and timely delivery of stock quotes for decision support systems used by financial operators.

So far, the implementation of these systems does not involve mechanisms to assist both software designers and programmers during software development so as to pre-empt them from introducing logic bugs, while testing is performed by running sets of pre-defined case-studies considered significant, but far from being exhaustive.  Systems robustness simply relies on developers’ experience and informal testing that may lead to patches whose side-effects are not always clear.

By contrast, a framework able to take in input the problem specification and a description of the system behaviour, and producing in output either a guarantee of the software correctness, or a trace of incorrect behaviour helping developers to intervene appropriately through a virtuous implementation-verification feedback loop, is strongly needed but still missing.  The capability of formally and thoroughly verifying the robustness of such systems would have a strong impact in the contexts where they are used.  As an example:

  1. wireless sensor and actuator networks whose safety is validated before deployment in industrial plants can reduce workplace accidents (e.g., due to toxic vapour concentration beyond a tolerated threshold), by guaranteeing prompt and secure detection and the application of appropriate countermeasures.  Similar arguments apply to environmental monitoring for citizens safety;
  2. the proper and fair delivery of stock quotes to all involved operators helps preventing market speculations guaranteeing equity amongst stakeholders.  Similarly, formal verification of online banking platforms gives strong guarantees to customers about the proper application of their banking operations;
  3. the validation of distributed platforms for air traffic control can increase safety of air travels;
  4. the correctness of algorithms adopted in cloud platforms – today used by many enterprises – guarantees the appropriate management of maintained data, that is, data consistency and durability across (possibly concurrent) updates.

The work needed: What are the main breakthroughs that a proactive initiative on this would need to achieve? What range of disciplines and stakeholders should be involved?

Distributed systems where the number N of components is undefined and their behaviour is driven by their interactions with one another, are known as parameterized reactive systems.  The formal and complete verification of their safety is hindered by their intrinsic infinite-state nature.  So far, a framework for completely automated verification of infinite-state parameterized reactive systems, performed in reasonable time and with reasonable computing resources, has yet to be developed, and would be the final goal of such a proactive initiative.  In fact, some logical frameworks have been proposed in the literature that aim at achieving this goal albeit imposing several (and severe) limitations, such as:

  1. verification limited to systems with known and low N (due to computation constraints);
  2. verification limited to finite-state systems;
  3. use of abstractions that may fail in capturing all the system behaviours, or leads to instability, or introduces spurious behaviours to be eliminated in a post-processing phase to assess the real robustness of the analysed system;
  4. partially automated verification that requires heavy and careful interaction with the human operator, thus being slow as a side effect.

Involved disciplines are: (modal / temporal) logic, automated reasoning, distributed algorithms.  Stakeholders are all those enterprises developing either software for mission-critical and/or safety-critical distributed platforms, or operating systems software.

The opportunity: What makes you believe that, with suitable time and investment, this can be achieved? Are there developments in science or society that make it plausible? What will drive this to real innovation and impact?

As mentioned above, there already are some techniques and tools going in the proposed direction, although possibly suffering some of the mentioned limitations, or providing just a subset of the necessary functionalities.  They have been developed both in academy and industry (for instance, since many years, Intel Corporation is bringing on efforts on formal methods for system verification [1]).  In the following, we enumerate existing technologies and platforms that represent interesting starting points for this research:

  1. finite-state model checkers are very well consolidated technologies, such as for example NuSMV (http://nusmv.fbk.eu) and SPIN (http://spinroot.com/spin/whatispin.html);
  2. SMT solvers: they are continuously potentiated.  The most popular ones are z3 (Microsoft Research, http://research.microsoft.com/en-us/projects/z3m/) – which includes the μZ model checker; Yices (by SRI International, Computer Science Laboratory, http://yices.csl.sri.com/index.html); CVC4 (http://cvc4.cs.nyu.edu/web/);  MathSAT (http://mathsat.fbk.eu) amongst the others;
  3. significant results have been obtained with infinite-state model checking – using SMT solvers in back end – limitedly to the non-parametric case.  For instance, the IC3 algorithm [2,3] – implemented for instance by both μZ and nuXmv (Fondazione Bruno Kessler, https://nuxmv.fbk.eu) to name only a few – yields good results;
  4. infinite-state model checking for parametric systems is just sporadically developed, although many interesting works exist for particular classes of problems.  Yet, stable and widely applicable tools are still rare: we may cite e.g. MCMT (Università degli Studi di Milano, http://users.mat.unimi.it/users/ghilardi/mcmt/) and Cubicle (Laboratoire de Recherche en Informatique, Université Paris-Sud, http://cubicle.lri.fr).  The existing tools need to be potentiated and integrated with one another.  Techniques of either finite-state reduction ('cutoffs') or reductions to non-parametric problems co-exist together with methods that leverage aggressive decision procedures for rich theories such as arrays and lists.

Likely, the formal basis and methods adopted by those solutions have to be combined, and further methods have to be integrated with them, in order to obtain an efficient general-purpose framework for automated system verification.  Another considerable aspect that will make the results of this research of real impact is raising enterprises awareness about the compelling need of monitoring software correctness both during production and at the end of the development process through effective verification mechanisms.

 

[1]  John Harrison, “Formal Methods at Intel – An Overview”, Proc. 2nd NASA Formal Methods Symposium, 2010.

[2]  A.R.Bradley,“Sat-based model checking without unrolling”, Proc.VMCAI, Lecture Notes in Computer Sciences, R. Jhalaand and D.A. Schmidt Eds., vol.6538. Springer, 2011.

[3]  Kryštof Hoder and Nikolaj Bjørner, “Generalized property directed reachability”, Proc. 15th International Conference on Theory and Applications of Satisfiability Testing (SAT), Lecture Notes in Computer Sciences, A. Cimatti and R. Sebastiani Eds., vol.7317, pp.157-171. Springer, 2012.