Foundations

Since its inception in the 1970s [Bur74] [Kin76] [DE82], Symbolic Execution has been extensively applied for bug finding (e.g., for automated whitebox fuzzing [GLM12] and test case generation [CDE08]) and program verification (e.g., as a main building block for deductive software verification [ABB+16]). Given its popularity, one may find it surprising that literature on the technique’s semantic foundations is relatively sparse. Surveys such as [BCCDElia+18] [YFB+19] provide a high-level idea of what SE is about before quickly moving on to technical details such as the implementation of symbolic memory and solutions to the path explosion problem.

The literature agrees that SE maintains (1) a path condition (or path constraint) accumulating conditions leading to the current execution path, and (2) a symbolic store mapping variables to (symbolic) values. Simply speaking, the store is updated by assignments, while the path condition is updated by case distinctions (e.g., an if statement). Together, symbolic store and path condition form a symbolic (execution) state. Each such state is additionally associated to a point in the program, the program counter, identifying the next instruction to execute.

The following questions, associated to the semantics of SE, are usually left open:

  1. What is the relation between symbolic and concrete states?

  2. What does it mean to symbolically execute? In particular,

  3. What properties must a symbolic execution satisfy to be useful for testing and/or verification?

We know of four takes on the semantics of SE [Kne91] [LRA17] [dBB19] [Steinhofel20]. All of them provide two complementary “correctness” properties, where usually one is related to program proving and the other one to testing, respectively. Only the system by Steinhöfel [Steinhofel20] provides a semantics of symbolic states; the formalism by Lucanu et al. [LRA17] does not explicitly incorporate symbolic states. With the exception of [Steinhofel20], which develops a semantics centered around symbolic states, all formalisms explain SE by relating concrete and symbolic transition systems using some notion of simulation.

The proposed correctness properties are not equivalent; yet, they generally share the same idea: One property holds for SE if at least all feasible paths are explored, while the other one hold if at most all feasible paths are explored. The exception is the earliest of the mentioned works by Kneuper [Kne91], who defines full SE as a system exploring exactly all feasible paths, and weak SE as a system exploring at least all feasible paths.

In the following, we introduce the system proposed by Steinhöfel in [Steinhofel20]. The advantages of this particular approach to give a semantics to symbolic execution are:

  • It is centered around smallest building block of SE, the symbolic state. The notion of concretizations of symbolic states, which is essential for their semantics, is also useful to define concolic execution later on.

  • The approach is not restricted to either testing or verification. In contrast, e.g., the correctness properties in [Kne91] focus on program verification, and de Boer and Bonsangue [dBB19] drew inspiration from verification when naming their properties.

  • Since the correctness properties in [Steinhofel20] are based on relations between states, as opposed to transition systems, they are more fine grained: The steps in the symbolic transition system are not tied to the ones in a concrete system. If desired, such a coupling can be integrated by imposing additional, orthogonal, properties.

  • The system supports general m-to-n transitions, which, for instance, allows to define powerful state merging rules.

  • Its formal definitions require a relatively small overhead in terms of required underlying formalisms; the work by Lucanu et al. [LRA17], e.g., requires understanding the Reachability Logic of the K system.

After introducing the abstract syntax and semantics of symbolic states in the next section, we define a baseline symbolic interpreter in the subsequent one, along with our correctness properties for symbolic transition relations.

References

ABB+16

Wolfgang Ahrendt, Bernhard Beckert, Richard Bubel, Reiner Hähnle, Peter H. Schmitt, and Mattias Ulbrich, editors. Deductive Software Verification – The KeY Book. Volume 10001 of LNCS. Springer, 2016. doi:10.1007/978-3-319-49812-6.

BCCDElia+18

Roberto Baldoni, Emilio Coppa, Daniele Cono D'Elia, Camil Demetrescu, and Irene Finocchi. A Survey of Symbolic Execution Techniques. ACM Comput. Surv., 51(3):50:1–50:39, 2018. doi:10.1145/3182657.

Bur74

Rodney Matineau Burstall. Program Proving as Hand Simulation with a Little Induction. In Information Processing, pages 308–312. Elsevier, 1974.

CDE08

Cristian Cadar, Daniel Dunbar, and Dawson Engler. KLEE: Unassisted and Automatic Generation of High-coverage Tests for Complex Systems Programs. In 8th USENIX Conference on Operating Systems Design and Implementation, 209–224. Berkeley, CA, USA, 2008. USENIX Association.

DE82

R.B. Dannenberg and G.W. Ernst. Formal Program Verification Using Symbolic Execution. IEEE Transactions on Software Engineering, SE-8(1):43–52, 1982.

dBB19(1,2)

Frank S. de Boer and Marcello M. Bonsangue. On the Nature of Symbolic Execution. In Proc. 3rd World Congress on Formal Methods (FM), 64–80. 2019. doi:10.1007/978-3-030-30942-8\_6.

GLM12

Patrice Godefroid, Michael Y. Levin, and David A. Molnar. SAGE: Whitebox Fuzzing for Security Testing. ACM Queue, 10(1):20, 2012. URL: https://doi.org/10.1145/2090147.2094081, doi:10.1145/2090147.2094081.

Kin76

James C. King. Symbolic Execution and Program Testing. Communications of the ACM, 19(7):385–394, 1976.

Kne91(1,2,3)

Ralf Kneuper. Symbolic Execution: A Semantic Approach. Sci. Comput. Program., 16(3):207–249, 1991. doi:10.1016/0167-6423(91)90008-L.

LRA17(1,2,3)

Dorel Lucanu, Vlad Rusu, and Andrei Arusoaie. A Generic Framework for Symbolic Execution: A Coinductive Approach. J. Symb. Comput., 80:125–163, 2017. doi:10.1016/j.jsc.2016.07.012.

Steinhofel20(1,2,3,4,5)

Dominic Steinhöfel. Abstract Execution: Automatically Proving Infinitely Many Programs. PhD thesis, TU Darmstadt, Dept. of Computer Science, Darmstadt, Germany, 2020. URL: http://tuprints.ulb.tu-darmstadt.de/8540/, doi:10.25534/tuprints-00008540.

YFB+19

Guowei Yang, Antonio Filieri, Mateus Borges, Donato Clun, and Junye Wen. Advances in Symbolic Execution. In Atif M. Memon, editor, Advances in Computers, volume 113 of Advances in Computers, pages 225 – 287. Elsevier, 2019. doi:https://doi.org/10.1016/bs.adcom.2018.10.002.