Reading Group Archive (2018)
On November 16, Cameron Pappas presented his COMP796 project, an implementation of incremental parsing in the Rats! parser generator, based on the paper Incremental Packrat Parsing by Dubroy and Warth from the 2017 Software Language Engineering conference.
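The core idea behind packrat parsing can be sketched in a few lines. The following is a minimal illustration of the memoisation that makes packrat parsers linear-time, not Dubroy and Warth's incremental algorithm (which additionally keeps the memo table valid across edits); the grammar and all names are hypothetical.

```python
# A minimal packrat-style parser sketch. Memoising each (rule, position)
# pair means no input position is ever re-parsed by the same rule.
# Hypothetical grammar: Expr <- Num "+" Expr / Num ; Num <- [0-9]+
from functools import lru_cache

def parse(text):
    @lru_cache(maxsize=None)          # the packrat memo table, keyed by position
    def num(pos):
        end = pos
        while end < len(text) and text[end].isdigit():
            end += 1
        return (int(text[pos:end]), end) if end > pos else None

    @lru_cache(maxsize=None)
    def expr(pos):
        left = num(pos)
        if left and left[1] < len(text) and text[left[1]] == "+":
            rest = expr(left[1] + 1)  # PEG ordered choice: try "+" branch first
            if rest:
                return (left[0] + rest[0], rest[1])
        return left                   # fall back to a bare number

    result = expr(0)
    return result[0] if result and result[1] == len(text) else None

print(parse("1+2+39"))  # → 42
```

Incremental packrat parsing, as described in the paper, extends this by retaining the memo table between parses and invalidating only the entries overlapping an edit.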
The State of Fault Injection Vulnerability Detection
On September 28, Thomas Given-Wilson presented his work described below.
Fault injection is a well-known method to test the robustness and security vulnerabilities of software. Fault injections can be explored by simulations (cheap, but not validated) and hardware experiments (faithful, but very expensive). Recent simulation works have started to apply formal methods to the detection, analysis, and prevention of fault injection attacks to address verifiability. However, these approaches are ad hoc and extremely limited in architecture, fault model, and breadth of application. Further, there is very limited connection between simulation results and hardware experiments. Recent work has started to consider broad spectrum simulation approaches that can cover many fault models and relatively large programs. Similarly, the connection between these broad spectrum simulations and hardware experiments is being validated to bridge the gap between the two approaches. This presentation highlights the latest developments in applying formal methods to fault injection vulnerability detection, and validating software and hardware results with one another.
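To make the notion of a fault model concrete, here is a toy simulation-based fault-injection campaign under a single-bit-flip fault model. This is an illustrative example, not the tooling from the talk, and all names (`check_pin`, `campaign`) are hypothetical.

```python
# Toy fault-injection campaign: enumerate single-bit flips of a stored
# value and report which flips make a (wrong) guess pass the check.

def check_pin(stored, guess):
    return stored == guess

def bit_flip(value, bit):
    # Single-bit-flip fault model: XOR one bit of the stored value.
    return value ^ (1 << bit)

def campaign(stored, guess, width=16):
    """Return the bit positions whose flip makes the check accept the guess."""
    return [b for b in range(width)
            if check_pin(bit_flip(stored, b), guess)]

# A fault is a vulnerability here only if flipping one bit of the stored
# PIN turns the attacker's wrong guess into the accepted value.
print(campaign(0b1010, 0b1011))  # → [0]
```

Simulation tools of the kind the talk surveys explore such fault spaces systematically (across fault models, fault locations, and programs), and the validation question is whether the vulnerabilities found this way match those observable on real hardware.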
Thomas Given-Wilson is a post-doctoral researcher at INRIA TAMIS, Le Chesnay, France. He holds a PhD from the University of Technology, Sydney.
On September 14, Dominik Klumpp presented his work described below.
Control Flow Reconstruction for Binary Programs: A Trace Abstraction Refinement-Based Method
Binary code is a low-level machine code representation of programs that can be directly executed by a processor. Due to its low-level nature, it is well-suited for high-precision analyses of qualitative and quantitative properties (such as worst-case execution time). However, establishing such properties of binary code is notoriously hard, because the code is unstructured and can contain indirect jumps (gotos) whose target address is dynamically calculated at runtime. As a result, standard software analysis techniques that rely on the availability of the control flow graph (CFG) of a program cannot be directly applied to the analysis of binary code.
In this talk, we address the problem of constructing CFGs for binary programs. We present a novel technique based on trace abstraction refinement, and demonstrate that our technique constructs CFGs satisfying several formally-defined quality criteria.
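The indirect-jump problem can be illustrated with a toy interpreter (a hypothetical example, not from the talk): the successor of one instruction is computed from runtime data via a jump table, so inspecting the code alone cannot tell a static analysis which blocks that instruction may reach.

```python
# Toy "binary" with an indirect jump: instruction 1 reads its target from
# a jump table indexed by runtime data, so its CFG successors are not
# syntactically apparent.

def run(program, table, x):
    pc, acc = 0, x
    while pc < len(program):
        op, arg = program[pc]
        if op == "add":
            acc, pc = acc + arg, pc + 1
        elif op == "jmp_indirect":
            pc = table[acc % len(table)]  # target depends on runtime data
        elif op == "halt":
            break
    return acc

program = [
    ("add", 1),             # 0
    ("jmp_indirect", None), # 1: successors unknown without analysing `table`
    ("add", 10),            # 2: reached only when acc is even
    ("halt", None),         # 3
    ("add", 100),           # 4: reached only when acc is odd
    ("halt", None),         # 5
]
table = [2, 4]

print(run(program, table, 0))  # → 101
print(run(program, table, 1))  # → 12
```

Reconstructing a sound and precise CFG means over-approximating the possible targets of such jumps without simply assuming "anywhere", which is the gap the trace-abstraction-refinement technique addresses.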
Dominik Klumpp is a student in the joint MSc Software Engineering program of the University of Augsburg, the Technical University of Munich, and the Ludwig-Maximilians-University Munich. He is currently visiting Macquarie University for his master's thesis on control flow reconstruction for binary programs.
On September 7 we discussed the paper An Abstract Interpretation-Based Framework for Control Flow Reconstruction from Binaries by Kinder et al from VMCAI 2009.
On August 31 we discussed the paper Programming and reasoning with algebraic effects and dependent types by Edwin Brady from ICFP 2013.
On August 17 and 24 we discussed the paper Do be do be do by Lindley et al from POPL 2017 that defines and illustrates the Frank programming language.
Automated Software Verification
On July 27 we discussed the paper Invariant Synthesis for Combined Theories by Beyer et al.
On June 29 we discussed the paper From Start-ups to Scale-ups: Opportunities and Open Problems for Static and Dynamic Program Analysis by Harman and O'Hearn from SCAM 2018.
On June 15 we discussed a paper by Imanishi et al from PEPM 2018 on a guess-and-assume approach to loop fusion for program verification.
On June 1 we discussed a paper by Dietsch et al that compares two approaches to checking program traces.
On May 25 we discussed the paper Property Checking Array Programs Using Loop Shrinking by Kumar et al from TACAS 2018.
On May 18 we discussed the tutorial paper Automated Software Verification by Farzan et al.
On May 11 we had a shared coding session.
Disambiguating Parser Conflicts
On May 4, Luís Eduardo de Souza Amorim from TU Delft gave a talk on his work on disambiguating conflicts in generalised parsers.
Context-free grammars are widely used for language prototyping and implementation. They allow formalizing the syntax of domain-specific or general-purpose programming languages concisely and declaratively. However, the natural and concise way of writing a context-free grammar is often ambiguous. Therefore, grammar formalisms support extensions in the form of declarative disambiguation rules to specify operator precedence and associativity, solving ambiguities that are caused by the subset of the grammar that corresponds to expressions. Ambiguities with respect to operator precedence and associativity arise from combining the various operators of a language. While shallow conflicts can be resolved efficiently by one-level tree patterns, deep conflicts require more elaborate techniques, because they can occur arbitrarily nested in a tree. Current state-of-the-art approaches to solving deep priority conflicts come with a severe performance overhead. In this talk, I will present a novel low-overhead implementation technique for disambiguating deep associativity and priority conflicts in scanner-less generalized parsers with lightweight data-dependency. By parsing a corpus of popular open-source repositories written in Java and OCaml, we found that our approach yields speed-ups of up to 1.73x over a grammar rewriting technique when parsing programs with deep priority conflicts, with a modest overhead of 1% to 2% when parsing programs without deep conflicts.
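The kind of declarative disambiguation the abstract describes can be sketched with precedence climbing (a standard technique, not the talk's data-dependent method): a table declares each operator's precedence and associativity, and the parser resolves conflicts like `1 - 2 - 3` accordingly. The table entries below are illustrative.

```python
# Precedence climbing over a token list, driven by a declarative table of
# (precedence, associativity) per operator.

PREC = {"+": (1, "left"), "-": (1, "left"), "*": (2, "left"), "^": (3, "right")}
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
       "*": lambda a, b: a * b, "^": lambda a, b: a ** b}

def eval_expr(tokens):
    pos = [0]

    def atom():
        tok = tokens[pos[0]]
        pos[0] += 1
        return int(tok)

    def climb(min_prec):
        left = atom()
        while pos[0] < len(tokens) and tokens[pos[0]] in PREC:
            op = tokens[pos[0]]
            prec, assoc = PREC[op]
            if prec < min_prec:
                break
            pos[0] += 1
            # Left-associative operators require the right operand to be
            # parsed at a strictly higher precedence level.
            right = climb(prec + 1 if assoc == "left" else prec)
            left = OPS[op](left, right)
        return left

    return climb(1)

print(eval_expr(["1", "-", "2", "-", "3"]))  # → -4  (left-associative)
print(eval_expr(["2", "^", "3", "^", "2"]))  # → 512 (right-associative)
```

This resolves shallow conflicts directly at each node; the deep conflicts the talk targets are harder precisely because, in a generalised parser, the offending subtree can appear arbitrarily far below the operator whose rule it violates.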
On April 27 Tony Sloane gave a demo of the JupyterLab in-browser computational environment, including some discussion of support for JVM-based languages such as Scala, which are not traditional notebook languages, via the BeakerX extension.
Author Obfuscation using Differential Privacy
On April 6 Natasha Fernandes presented her MRes thesis work on A Novel Framework for Author Obfuscation using Generalised Differential Privacy.
The problem of obfuscating the authorship of a text document has received little attention in the literature to date. Current approaches are ad hoc and rely on assumptions about an adversary’s auxiliary knowledge which make it difficult to reason about the privacy properties of these methods. Differential privacy is a well-known and robust privacy approach, but its reliance on the notion of adjacency between datasets has prevented its application to text document privacy to date. A relatively new approach to privacy known as generalised differential privacy extends differential privacy to arbitrary datasets endowed with a metric and permits the private release of individual data points. In this paper we show how to apply generalised differential privacy to author obfuscation, utilising existing tools and methods from the stylometry and natural language processing literature.
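The defining property of generalised (metric) differential privacy can be shown in a minimal one-dimensional sketch; this is an illustrative example on the real line with the Euclidean metric, not the thesis's mechanism for text. The Laplace mechanism with density proportional to exp(-eps * |z - x|) guarantees that the density ratio for two secrets x1, x2 is bounded by exp(eps * d(x1, x2)).

```python
# Metric differential privacy sketch: the Laplace mechanism on the reals
# satisfies f(z|x1) / f(z|x2) <= exp(eps * |x1 - x2|), so secrets that are
# close under the metric are hard to distinguish from the released value.
import math
import random

def laplace_density(z, x, eps):
    return (eps / 2) * math.exp(-eps * abs(z - x))

def release(x, eps, rng=random):
    # Sample Laplace noise via inverse transform sampling.
    u = rng.random() - 0.5
    return x - math.copysign(math.log(1 - 2 * abs(u)), u) / eps

def privacy_ratio(z, x1, x2, eps):
    return laplace_density(z, x1, eps) / laplace_density(z, x2, eps)

# The bound exp(eps * d(x1, x2)) holds for every possible output z.
eps = 0.5
bound = math.exp(eps * abs(1.0 - 4.0))
print(all(privacy_ratio(z, 1.0, 4.0, eps) <= bound + 1e-9
          for z in [-10.0, 0.0, 2.5, 40.0]))  # → True
```

Because no notion of dataset adjacency is needed, the same construction generalises to any metric space, which is what lets the thesis apply it to representations of text documents.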
On April 13 we discussed the following two papers:
- Threshold concepts in computer science: do they exist and are they useful? by Boustedt et al
- Identifying threshold concepts: from dead end to a new direction by Shinners-Kennedy et al
On March 16 we discussed the following paper: Abstraction ability as an indicator of success for learning computing science? by Bennedssen and Caspersen.
On March 9 we discussed two short papers about the applications and use of WebAssembly:
- TaintAssembly: Taint-Based Information Flow Control Tracking for WebAssembly by Fu et al.
On March 2 we discussed the paper Mechanising and Verifying the WebAssembly Specification by Watt.
Safety-Critical Use of a Formally-Verified Compiler
On January 19 we discussed the paper CompCert: Practical Experience on Integrating and Qualifying a Formally Verified Optimizing Compiler by Kästner et al.