EuroSys
The European Professional Society on Computer Systems

European chapter of the
Special Interest Group on Operating Systems (SIGOPS)
of the Association for Computing Machinery (ACM)
Home Join or renew membership EuroSys for Students EuroSys for Faculty Job Offers Activities Systems Directory Systems Events and Blog Eastern Europe Initiative Member Area Member News Officers and Volunteers Useful links Press releases

OSDI 2008 - Day 2 - Session 2

Binary Translation Using Peephole Superoptimizers

Sorav Bansal and Alex Aiken, Stanford University

They introduces a new scheme for performing binary translation that produces code whose efficiency is comparable to existing binary translators with much less engineering effort. Instead of hand-coding the translation from one instruction set to another, the approach automatically learns translation rules using superoptimization techniques.
They have implemented a PowerPC-x86 binary translator and report results on small and large compute-intensive benchmarks. When compared to the native compiler, the translated code achieves median performance of 67% on large benchmarks and in some small stress tests actually outperforms the native compiler.
It is not clear whether their approach is thread-safe. Assuming the code on the source architecture is thread-safe, it is not clear if the produced code is thread-safe in the target architecture.
R2: An Application-Level Kernel for Record and Replay

Zhenyu Guo, Microsoft Research Asia; Xi Wang, Tsinghua University; Jian Tang and Xuezheng Liu, Microsoft Research Asia; Zhilei Xu, Tsinghua University; Ming Wu, Microsoft Research Asia; M. Frans Kaashoek, MIT CSAIL; Zheng Zhang, Microsoft Research Asia

Library-based record and replay tools aim to reproduce an application’s execution by recording the results of selected functions in a log, and during replay returning the results from the log rather than executing the functions. These tools must ensure that a replay run is identical to the record run.
These methods have limited usefulness, because function with side effects and reentrant function pose big problems in replaying programs if we don’t execute these functions - the replays may be infeasible/unrealistic.
To alleviate this problem, R2 allows developers to choose functions that can be recorded and replayed correctly. Developers annotate the chosen functions with simple keywords so that R2 can handle calls with side effects and multithreading. R2 generates code for record and replay from templates, allowing developers to avoid implementing stubs for hundreds of functions manually.
It is not clear how easy is to write such annotations. Moreover, if the annotations are wrong, spurious replays may be generated.
KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs

Cristian Cadar, Daniel Dunbar, and Dawson Engler, Stanford University

KLEE is a symbolic execution tool for automatically generating tests with high coverage and discovering bugs. It is a timid tentative to make model checking practical for real applications. In fact, it is not clear from the paper what bugs KLEE tries to discover, but I suppose it tries to discover the same bugs model checkers use to address (e.g., memory errors like null pointer dereference and buffer overflow, or assertions).

KLEE shares many ideas with modern model checking techniques based on symbolic execution. In my opinion, KLEE doesn’t bring any important contribution in the area of bug discovery. All the techniques they used for optimizing the contraint solving (expression rewriting, propagation of constants, counterexample caching, implied value concretization, extracting independent constraints, etc.) are already employed by modern contraint solvers (e.g., SAT solvers), and all modern model checkers have some powerful constraint solver as backend.

They don’t provide any intuition of why the state scheduling strategies that KLEE uses (random path selection and coverage-optimized search), the metrics they use in these strategies (e.g., minimum distance to an uncovered instruction and call stack of a state) and the way they switch between strategies (round-robin) are efficient in practice. I suppose the minimum distance to an uncovered instruction is useful for a fast convergence to a high coverage of the tests. I suppose they use the call stack information to find tests having as few nested function calls as possible, and possibly to also bound the recursive calls. We can only speculate, since they don’t provide any clear intuition behind these heuristics they use. However, I believe these heuristics are only useful for converging faster to a good coverage. I don’t know how well they will work for bug discovery. They may harm the bug discovery, instead of helping it. In my opinion, finding a bug efficiently doesn’t have so much to do with the coverage of the symbolic execution, but rather with the amount of relevant states explored. They have to employ other heuristics for optimizing the bug discovery.

The programs they used to test KLEE have about 1KLOC on average (452 applications totaling 430 KLOC). Applications of this size don’t constitute a challenge for modern model checkers — they can easily handle 10KLOC applications. They also don’t compare KLEE to existing test generators and model checkers, on the same test C programs.

An important missing feature is the ability to symbolically execute multi-threaded programs. The state explosion is even worse for multi-threaded programs, compared to sequential programs (i.e., factorial vs. exponential blowup). There are already model checkers that can handle multi-threaded programs.

The only real contribution of KLEE seems to be the environment modeling. KLEE is able to execute programs that are environmentally-intensive. The environmnent is represented by command line arguments, file data, network packets, etc. All variables which denote inputs (e.g., command line arguments) are symbolic variables — they can have any initial value. Files whose names are stored by symbolic variables become symbolic files; the other files are concrete files. KLEE handles in an interesting way the symbolic files, by implementing a symbolic file system that efficiently simulates all I/O operations performed on that file. The I/O operations are executed natively on concrete files.

The native execution of I/O operations will constitute a bottleneck for programs that do heavy I/O on concrete files in certain states and revisit those states very often. I would therefore renounce at the notion of concrete file — I would consider all files symbolic. Moreover, it is not clear how the I/O operations on concrete files are isolated while symbolically executing different paths. Mechanisms like copy-on-write (or some sort of versioning and rollback) should be used on top of the I/O operations — they cannot be just executed natively.

Comments are closed.