A Symbolic Interpreter and Correctness of Symbolic Transitions¶

Executing a program with the concrete minipy interpreter updates the passed environment, and potentially raises an exception. Symbolic execution, in contrast, takes into account all feasible paths through a program. We distinguish between online and offline SE (see, e.g., [BCCDElia+18]). Online SE executes multiple symbolic paths simultaneously: For example, executing an if statement leads to two paths which are both explored. Offline SE reasons about one single path at a time. Concolic execution, where SE is guided by a single concrete input, is an example of the latter class. In this section, we regard online SE. Symbolically executing a program in an online symbolic interpreter results in multiple output environments. Also, such an interpreter cannot simply raise an exception, as this would result in a premature termination of the analysis, skipping other feasible program paths. The consequences for our static, online symbolic interpreter are:

Instead of updating the initial symbolic environment, the interpreter produces potentially multiple new environments. Our interpreter even outputs all intermediate states, such that the output is a Symbolic Execution Tree (SET).
Thrown exceptions are stored in the symbolic environment instead of being raise. Thus, we need another field in our SymbolicEnvironment class.

We implement these changes subsequently. Afterward, we show how to implement a static, online symbolic interpreter for minipy, and introduce the correctness properties exhaustiveness and precision for symbolic execution. Finally, we show how these definitions can be used to automatically test the symbolic interpreter. Using this methodology, we discovered two bugs of precious versions of our implementation, one of which is related to a subtle difference in the division semantics in z3 and Python.

Symbolic Execution Trees¶

We first update the implementation of SymbolicEnvironment to include abrupt completion information:

import utils
from minipy import *
from constraint_solving import *
from semantics import *

class SymbolicEnvironment(SymbolicEnvironment):
    def __init__(self,
                 store: Optional[SymbolicStore] = None,
                 path_constraints: Optional[Set[z3.BoolRef]] = None,
                 abrupt_completion: Optional[Exception] = None,
                 functions: Optional[Dict[str, Tuple[Tuple[Variable], Type, Callable]]] = None):
        super().__init__(store, path_constraints, functions)
        self.abrupt_completion = abrupt_completion

    def copy(self):
        return SymbolicEnvironment(self.store, self.path_constraints, self.abrupt_completion, self.functions)

    def set(self, variable: Variable, value: SymbolicValueType) -> 'SymbolicEnvironment':
        return SymbolicEnvironment(self.store.set(variable, value), self.path_constraints,
                                   self.abrupt_completion, self.functions)

    def set_abrupt_completion(self, exc: Optional[Exception]) -> 'SymbolicEnvironment':
        return SymbolicEnvironment(self.store, self.path_constraints, exc, self.functions)

    def add_constraint(self, constraint: z3.BoolRef) -> 'SymbolicEnvironment':
        if z3.is_false(z3.simplify(constraint)):
            new_constraints = {z3.BoolVal(False)}
        elif z3.is_true(z3.simplify(constraint)):
            return self
        else:
            old_constraints = {other_constraint for other_constraint in self.path_constraints
                               if not implies(constraint, other_constraint)}
            new_constraints = old_constraints | {constraint}


        return SymbolicEnvironment(self.store, new_constraints, self.abrupt_completion, self.functions)


    def add_function(self, name: str, params: Tuple[Variable], type: Type, impl: Callable) -> 'SymbolicEnvironment':
        new_functions = copy.deepcopy(self.functions)
        new_functions[name] = (params, type, impl)
        return SymbolicEnvironment(self.store, self.path_constraints, self.abrupt_completion, new_functions)

    def __str__(self):
        return ("(" +
                ("{}" if not self.path_constraints else str(self.path_constraints)) + ", " +
                str(self.store) +
                ("" if self.abrupt_completion is None else repr(self.abrupt_completion)) +
                ")")

    def to_html(self):
        prev_html = z3.in_html_mode()
        z3.set_html_mode(True)

        result = f"({self.path_constraint_to_html()}, {repr(self.abrupt_completion)}, {self.store.to_html()})"

        z3.set_html_mode(prev_html)
        return result

    def path_constraint_to_html(self):
        prev_html = z3.in_html_mode()
        z3.set_html_mode(True)

        path_constraint_string = "{"
        if self.path_constraints:
            path_constraint_string += ", ".join(map(z3_html_escape, map(z3.obj_to_string, self.path_constraints)))
        path_constraint_string += "}"
        path_constraint_string = path_constraint_string.replace("\n", "<BR/>").replace(" ", "&nbsp;")

        z3.set_html_mode(prev_html)
        return path_constraint_string


    def __repr__(self):
        result = f"SymbolicEnvironment({repr(self.store)}, " \
                 f"{repr(self.path_constraints)}, " \
                 f"{repr(self.abrupt_completion)}, "
        result += repr({f_name: f"fun {f_name}{params} -> "
                                f"{type}" for f_name, (params, type, _) in self.functions.items()})
        return f"{result})"

    def __eq__(self, other):
        return isinstance(other, SymbolicEnvironment) and self.store == other.store and \
               self.path_constraints == other.path_constraints and \
               self.functions == other.functions and \
               (self.abrupt_completion == other.abrupt_completion is None or
                ExceptionWrapper(self.abrupt_completion) == ExceptionWrapper(other.abrupt_completion))

Next, we define a class for nodes in an SET. An SET node comprises a symbolic environment and a string representation of the next instruction to execute. We also define a to_html method to produce pretty output.

class SETNode:
    def __init__(self, environment: SymbolicEnvironment, pc: Optional[Union[ParseTree, str]] = None):
        self.environment = environment
        self.pc = pc

    @property
    def pc(self):
        return self.__pc

    @pc.setter
    def pc(self, pc: Optional[Union[ParseTree, str]]):
        if pc is None:
            self.__pc = ""
        else:
            assert isinstance(pc, str)
            self.__pc = pc.split("\n")[0].strip()

    def to_html(self):
        prev_html = z3.in_html_mode()
        z3.set_html_mode(True)

        result = f'<TABLE BORDER="0"><TR>'
        if "<BR/>" in self.environment.to_html():
            result += f'<TD BALIGN="LEFT" ALIGN="LEFT">{self.environment.to_html()}</TD>'
        else:
            result += f'<TD>{self.environment.to_html()}</TD>'
        result += "</TR>"

        if self.pc:
            result += '<TR><TD ALIGN="CENTER"><FONT FACE="Courier">' + z3_html_escape(self.pc) + "</FONT></TD></TR>"
        result += "</TABLE>"

        z3.set_html_mode(prev_html)
        return result

    def __repr__(self):
        return f"SETNode({repr(self.environment)}, {repr(self.pc)})"

    def __eq__(self, other):
        return type(other) is type(self) and self.environment == other.environment and self.pc == other.pc

Symbolic execution trees themselves are defined just like derivation trees, only with SETNodes instead of strings as nodes.

SET = Tuple[SETNode, List['SET']]

The following function produces graphical representations (using GraphViz) of SETs. Leaves are highlighted using the following color scheme: Red leaves represent raised exceptions, green leaves represent nodes with unsatisfiable path conditions, and blue leaves all other nodes containing “normal” computation results.

from graphviz import Digraph

def display_set(symbolic_execution_tree: SET) -> Digraph:
    def traverse_tree(dot: Digraph, tree: SET, node_id=0):
        node, children = tree
        
        attrs = {}
        if node.environment.unsatisfiable():
            attrs = {"fillcolor": "lightgreen", "style": "filled"}
        elif (node.environment.abrupt_completion is not None and
              not isinstance(node.environment.abrupt_completion, AbruptCompletionNoException)):
            attrs = {"fillcolor": "lightpink1", "style": "filled"}
        elif not children:
            attrs = {"fillcolor": "paleturquoise2", "style": "filled"}

        dot.node(repr(node_id), "<" + node.to_html() + ">", _attributes=attrs)

        if children:
            for child in children:
                nonlocal counter
                counter += 1
                child_id = counter
                dot.edge(repr(node_id), repr(child_id))
                traverse_tree(dot, child, child_id)

    counter = 0
    dot = Digraph(comment="Symbolic Execution Tree")
    dot.attr('node', shape='box')
    traverse_tree(dot, symbolic_execution_tree)
    return dot

Recall our small example from the previous section:

x = 2 * y
if x < 0:
    x = -x

The final SET resulting from symbolically executing this program will contain two leaves, one branch each for the case where the condition of the if statements hold or does not hold, respectively. Apart from the root, it also contains two inner nodes corresponding to the first assignment and the if statement. In the absence of an implemented symbolic interpreter, we manually construct this SET:

x = Variable("x", INT_TYPE)
y = Variable("y", INT_TYPE)

env_0 = SymbolicEnvironment(SymbolicStore({x: x.to_z3(), y: y.to_z3()}))
env_1 = env_0.set(x, z3.IntVal(2) * y.to_z3())
env_2 = env_1.add_constraint(x.to_z3() < z3.IntVal(0))
env_2_1 = env_2.set(x, z3.IntVal(-1) * x.to_z3())
env_3 = env_1.add_constraint(z3.Not(x.to_z3() < z3.IntVal(0)))

root = SETNode(env_0, "x = 2 * y")
if_node = SETNode(env_1, "if x < 0:")
inversion_node = SETNode(env_2, "x = -x")
leave_1 = SETNode(env_2_1)
leave_2 = SETNode(env_3)

tree = (root, [
    (if_node, [
        (inversion_node, [(leave_1, [])]),
        (leave_2, [])])])

display_set(tree)

Evaluating Expressions¶

Also the symbolic evaluation of expressions can have multiple results. Consider the expression y // x: Depending on whether x is zero or non-zero, evaluating this expression results in a value or a ZeroDivisionError. Thus, our result type for expression evaluation is a collection of pairs of either a value or an exception, together with a boolean constraint signaling under which condition the respective result is returned.

EvalExprResult = List[Tuple[Union[SymbolicValueType, Exception], z3.BoolRef]]

We define a helper function handle_binary_operation computing all conditioned results for a binary operation (e.g., division or conjunction) applied on two evaluation results.

def handle_binary_operation(left: EvalExprResult, right: EvalExprResult, op: str) -> EvalExprResult:
    result = []

    for left_result in left:
        left_val, left_constr = left_result

        if isinstance(left_val, Exception):
            result.append(left_result)
            continue

        if op == "and":
            result.append((z3.BoolVal(False), z3.simplify(z3.And(left_constr, z3.Not(left_val)))))
        if op == "or":
            result.append((z3.BoolVal(True), z3.simplify(z3.And(left_constr, left_val))))

        for right_result in right:
            right_val, right_constr = right_result
            constraint_conjunction = z3.And(left_constr, right_constr)
            constraint_conjunction = z3_and(z3.simplify(constraint_conjunction))

            if z3.is_false(constraint_conjunction):
                continue

            if isinstance(right_val, Exception):
                # Short-circuit evaluation:
                if op == "or":
                    constraint_conjunction = z3.And(constraint_conjunction, z3.Not(left_val))
                elif op == "and":
                    constraint_conjunction = z3.And(constraint_conjunction, left_val)

                constraint_conjunction = z3.simplify(constraint_conjunction)
                if not z3.is_false(constraint_conjunction):
                    result.append((right_val, constraint_conjunction))

                continue

            left_val: Union[z3.ArithRef, z3.BoolRef]
            right_val: Union[z3.ArithRef, z3.BoolRef]

            if op == "and":
                constraint_conjunction = z3.simplify(z3.And(constraint_conjunction, left_val))
                if not z3.is_false(constraint_conjunction):
                    result.append((right_val, constraint_conjunction))
                continue

            if op == "or":
                constraint_conjunction = z3.simplify(z3.And(constraint_conjunction, z3.Not(left_val)))
                if not z3.is_false(constraint_conjunction):
                    result.append((right_val, constraint_conjunction))
                continue

            operator_map = {
                "+": lambda a, b: z3.Concat(a, b) if isinstance(a, z3.SeqRef) else a + b,
                "-": operator.sub,
                "*": operator.mul,
                "==": operator.eq,
                "!=": lambda a, b: z3.Not(a == b),
                "<": operator.lt,
                "<=": operator.le,
                ">": operator.gt,
                ">=": operator.ge,
                # "and": lambda a, b: z3.And(a, b),
                # "or": lambda a, b: z3.Or(a, b)
            }

            if op in operator_map:
                result.append((operator_map[op](left_val, right_val), constraint_conjunction))
                continue

            right_zero = right_val == z3.IntVal(0)
            right_nonzero = z3.Not(right_zero)

            if op == "//":
                result.append((z3.ToInt(z3.ToReal(left_val) / z3.ToReal(right_val)),
                               z3.And(constraint_conjunction, right_nonzero)))
                result.append((ZeroDivisionError("division by zero"),
                               z3.And(constraint_conjunction, right_zero)))
                continue

            if op == "%":
                result.append((left_val % right_val, z3.And(constraint_conjunction, right_nonzero)))
                result.append((ZeroDivisionError("division by zero"),
                               z3.And(constraint_conjunction, right_zero)))
                continue

            assert False

    return result

x = Variable("x", INT_TYPE)
y = Variable("y", INT_TYPE)

result = handle_binary_operation(
    [(x.to_z3(), z3.BoolVal(True))],
    [(y.to_z3(), z3.BoolVal(True))],
    "//")

display_html(str(result))

[(ToInt(ToReal(x)/ToReal(y)), And(True, Not(0 == y))), (ZeroDivisionError('division by zero'), And(True, 0 == y))]

The term produced for the division, ToInt(ToReal(x)/ToReal(y)), might come unexpected: Why not simply x / y? The reason is the semantics of Python’s Integer division //, which is a floor division 1. Essentially, Python rounds towards 0 for positive, and away from 0 for negative numbers. That is, 1 // -2 evaluates to -1, while z3.IntVal(1) / z3.IntVal(-2) evaluates to 0. Since the z3.ToInt function rounds away from zero for negative numbers, the implementation above correctly models Python’s (and minipy’s) floor division semantics.

Based on this function, the implementation of expression evaluation can be handled uniformly in many cases. For example, consider the evaluation of sums and terms below:

class SymbolicInterpreter:
    # ...
    
    def evaluate_sum(self, expr: Sum, environment: SymbolicEnvironment) -> EvalExprResult:
        left = self.evaluate(expr.left, environment)
        right = self.evaluate(expr.right, environment)
        return handle_binary_operation(left, right, expr.op)

    def evaluate_term(self, expr: Term, environment: SymbolicEnvironment) -> EvalExprResult:
        left = self.evaluate(expr.left, environment)
        right = self.evaluate(expr.right, environment)
        return handle_binary_operation(left, right, expr.op)
    
    # ...

When evaluating tuple expressions, we sequentially evaluate all tuple elements and add an exception result if one of these evaluations raises an exception. We create a z3 sequence expression for the case where all tuple elements evaluate to a value. Consider the tuple (x // y, x // (y + 1)). Its symbolic evaluation has three results:

y is 0, then the tuple evaluates to a ZeroDivisionError.
y is -1, then the tuple evaluates to a ZeroDivisionError.
y does not equal either 0 or -1, then the tuple evaluates to a sequence expression containing the symbolic values corresponding to the division.

For considering all these options, we build the product of the individual evaluation results for the tuple elements. This yields a list of tuples, where each tuple has the length of the evaluated tuple and corresponds to one evaluation result. For example:

import itertools

result_1 = handle_binary_operation(
    [(x.to_z3(), z3.BoolVal(True))],
    [(y.to_z3(), z3.BoolVal(True))],
    "//")

result_2 = handle_binary_operation(
    [(x.to_z3(), z3.BoolVal(True))],
    [(y.to_z3() + z3.IntVal(1), z3.BoolVal(True))],
    "//")

results: List[EvalExprResult] = [result_1, result_2]
list(itertools.product(*results))

[((ToInt(ToReal(x)/ToReal(y)), And(True, Not(0 == y))),
  (ToInt(ToReal(x)/ToReal(y + 1)), And(True, Not(0 == y + 1)))),
 ((ToInt(ToReal(x)/ToReal(y)), And(True, Not(0 == y))),
  (ZeroDivisionError('division by zero'), And(True, 0 == y + 1))),
 ((ZeroDivisionError('division by zero'), And(True, 0 == y)),
  (ToInt(ToReal(x)/ToReal(y + 1)), And(True, Not(0 == y + 1)))),
 ((ZeroDivisionError('division by zero'), And(True, 0 == y)),
  (ZeroDivisionError('division by zero'), And(True, 0 == y + 1)))]

We loop over the elements of the tuples in the list, and stop at the first exception (if any), which will be the evaluation result. Along the way, we collect the path constraints of the seen evaluation results for the tuple elements. Note that the list above contains four, and not three elements. When considering those, the third and fourth evaluation alternatives collapse to a single one, since we stop after processing the evaluation result for the first tuple element, which is an exception (the fourth element is anyway infeasible, since y cannot attain the values 0 and -1 at the same time).

To avoid creating the same result for equivalent evaluations (e.g., if the same exception is raised under the same condition when evaluating different tuple elements), we first collect evaluation results in a set. We wrap exceptions inside a wrapper object computing sensible hashes as Python exceptions generally do not supply a suitable hashing implementation.

class SymbolicInterpreter:
    # ...
    
    def evaluate_tuple(self, expr: TupleNode, environment: SymbolicEnvironment) -> EvalExprResult:
        sub_results: List[EvalExprResult] = [self.evaluate(elem, environment) 
                                             for elem in expr.elems]
        result: OrderedSet[Tuple[Union[SymbolicValueType, Exception], z3.BoolRef]] = OrderedSet([])

        combinations: List[Tuple[Tuple[Union[SymbolicValueType, Exception], z3.BoolRef], ...]] = \
            list(itertools.product(*sub_results))

        combination: Tuple[Tuple[Union[SymbolicValueType, Exception], z3.BoolRef], ...]
        for combination in combinations:
            values: List[SymbolicValueType] = []
            path_constraint: z3.BoolRef = z3.BoolVal(True)
            sub_result: Tuple[Union[SymbolicValueType, Exception], z3.BoolRef]
            for sub_value, sub_constraint in combination:
                path_constraint = z3.simplify(z3.And(path_constraint, sub_constraint))
                if isinstance(sub_value, Exception):
                    result.add(wrap_exceptions((sub_value, path_constraint)))
                    break

                values.append(sub_value)

            if len(values) != len(combination):
                continue

            if len(values) == 1:
                value = values[0]
                result.add((value if isinstance(value.sort(), z3.SeqSortRef) 
                            else z3.Unit(value), path_constraint))
            elif len(values) > 1:
                result.add((z3.Concat(*[value if isinstance(value.sort(), z3.SeqSortRef) 
                                        else z3.Unit(value)
                                        for value in values]), path_constraint))

        return [unwrap_exceptions(r) for r in result]
    
    # ...

For evaluating tuple access (e.g., t[x]), we have to consider the possible exceptions that may arise: Apart form the exceptions raised by evaluating t and x (which can be non-atomic expressions), x might be out of range, or t and x might have the wrong sorts (i.e., not tuple or int). All this is considered in the following implementation:

class SymbolicInterpreter:
    # ...
    
    def handle_tuple_access(self, tuple_result: EvalExprResult, expr_result: EvalExprResult) -> EvalExprResult:
        result: EvalExprResult = []

        for atom_value, atom_constraint in tuple_result:
            if isinstance(atom_value, Exception):
                result.append((atom_value, atom_constraint))
                continue

            if not isinstance(atom_value.sort(), z3.SeqSortRef):
                result.append((TypeError(f"'{atom_value.sort()} is not subscriptable"),
                               atom_constraint))
                continue

            for expr_value, expr_constraint in expr_result:
                conjunction = z3.And(atom_constraint, expr_constraint)
                if isinstance(expr_value, Exception):
                    result.append((expr_value, conjunction))
                    continue

                if expr_value.sort() != z3.IntSort():
                    result.append((TypeError(f"list indices must be integers, not {expr_value.sort()}"),
                                   conjunction))
                    continue

                atom_value: z3.SeqRef

                # In Python, indices can be in the closed interval [-len ; len - 1]
                length_constraint_1 = z3.And(z3.IntVal(0) <= expr_value,
                                             expr_value < z3.Length(atom_value))  # [0 ; len - 1]
                length_constraint_2 = z3.And(z3.IntVal(0) > expr_value,
                                             expr_value >= z3.IntVal(-1) * z3.Length(atom_value))  # [-len ; -1]

                expr = z3.simplify(atom_value[expr_value])
                if expr.decl().kind() == z3.Z3_OP_ITE and \
                        implies(z3.And(conjunction, length_constraint_1), expr.children()[0]):
                    expr = expr.children()[1]

                result.append((expr, z3.And(conjunction, length_constraint_1)))

                expr = z3.simplify(atom_value[z3.Length(atom_value) + expr_value])
                if expr.decl().kind() == z3.Z3_OP_ITE and \
                        implies(z3.And(conjunction, length_constraint_2), expr.children()[0]):
                    expr = expr.children()[1]

                result.append((expr, z3.And(conjunction, length_constraint_2)))

                result.append((IndexError("tuple index out of range"),
                               z3.And(conjunction, z3.Not(length_constraint_1), z3.Not(length_constraint_2))))

        return result

    # ...

The execution of function definitions adds a Callable to the symbolic environment, which accepts an environment, a list of symbolic arguments, and returns an EvalExprResult. The evaluation of a function call has to consider all exceptions thrown when evaluating the arguments. Additionally, an exception is registered if an unexpected number of arguments is given. Note that we do not check argument types, which is generally not done in a dynamically typed language like Python. In minipy, however, we always have complete type information, which is why this problem could be ruled out by a static check taking place before symbolic execution.

class SymbolicInterpreter:
    # ...
    
    def evaluate_fun_call(self, expr: FunctionCall, environment: SymbolicEnvironment) -> EvalExprResult:
        result: EvalExprResult = []

        f_name = expr.name
        if f_name not in environment.functions:
            raise NameError(f"name '{f_name}' is not defined")

        args_result = [self.evaluate(arg, environment) for arg in expr.args]

        combinations: List[Tuple[Tuple[Union[SymbolicValueType, Exception], z3.BoolRef], ...]] = \
            list(itertools.product(*args_result))

        combination: Tuple[Tuple[Union[SymbolicValueType, Exception], z3.BoolRef], ...]
        for combination in combinations:
            values: List[SymbolicValueType] = []
            path_constraint: z3.BoolRef = z3.BoolVal(True)
            sub_result: Tuple[Union[SymbolicValueType, Exception], z3.BoolRef]
            for sub_value, sub_constraint in combination:
                path_constraint = z3.simplify(z3.And(path_constraint, sub_constraint))
                if isinstance(sub_value, Exception):
                    result.append(wrap_exceptions((sub_value, path_constraint)))
                    break

                values.append(sub_value)

            if len(values) != len(combination):
                continue

            if f_name in environment.functions:
                params, ret_type, fun = environment.functions[f_name]
                if len(values) != len(params):
                    raise TypeError(
                        f"{f_name}({', '.join(map(str, params))}) takes {len(params)} "
                        f"argument(s) but {len(values)} were given")

                result.extend(fun(environment, *values))
            else:
                assert False

        return result
    
    # ...

Executing Statements¶

Similarly to the evaluation of expressions, the execution of statements has to consider all possible outcomes, such that the execution result is a tree and not a mere update of the environment. We explain the implementation of the execution functions for the stereotypical assignment, if, try, and while statements; moreover, we have a look at the function definitions and calls.

Executing an assignment results in one leaf node per raised exception during the evaluation of the right-hand side, plus a normal evaluation result when the right-hand side evaluates to a value.

class SymbolicInterpreter:
    # ...
    
    def execute_assignment(self, stmt: Assignment, environment: SymbolicEnvironment) -> SET:
        eval_expr_result = [
            (value, constraint)
            for value, constraint in self.evaluate(stmt.expression, environment)
            if not self.unsatisfiable(environment.add_constraint(constraint))
        ]

        children: List[SET] = []
        result = (SETNode(environment, stmt.code), children)

        for value, constraint in eval_expr_result:
            if isinstance(value, Exception):
                new_env = environment.set_abrupt_completion(value)
            else:
                variable = Variable(stmt.lhs, get_type_for_z3_expr(value))
                new_env = environment.set(variable, value)

            new_env = new_env.add_constraint(constraint)

            children.append((SETNode(new_env, None), []))
        
        return result
    
    # ...

In the case of an if statement, we have one leaf for each raised exception when evaluating the guard, plus one leaf each for the execution of the then and else branch of the conditional. To avoid creating infeasible subtrees, we add an unsatisfiability check for each case. To that end, we add a method constraint_unsatisfiable (and a convenience method unsatisfiable for environments) to SymbolicInterpreter, which, as of now, only works as a proxy to is_unsat. Yet, it will greatly simplify adapting the symbolic to a concrete interpreter in Concolic Execution.

class SymbolicInterpreter:
    # ...
    
    def constraint_unsatisfiable(self, constraint: z3.BoolRef) -> bool:
        return is_unsat(constraint)

    def unsatisfiable(self, env: SymbolicEnvironment) -> bool:
        return self.constraint_unsatisfiable(z3_and(*env.path_constraints))
    
    def execute_if_stmt(self, stmt: IfStmt, environment: SymbolicEnvironment) -> SET:
        eval_expr_result = self.evaluate(stmt.guard, environment)

        result: SET = (SETNode(environment, stmt.code), [])

        for value, constraint in eval_expr_result:
            new_env = environment.add_constraint(constraint)

            if self.unsatisfiable(new_env):
                continue

            if isinstance(value, Exception):
                result[1].append((SETNode(new_env.set_abrupt_completion(value), ''), []))
                continue

            assert z3.is_bool(value)

            then_constraint: z3.BoolRef = z3.simplify(value)
            else_constraint = z3.simplify(z3.Not(value))

            then_env = new_env.add_constraint(then_constraint)
            else_env = new_env.add_constraint(else_constraint)

            then_env_unsat = self.unsatisfiable(then_env)
            else_env_unsat = self.unsatisfiable(else_env)
            
            assert not then_env_unsat or not else_env_unsat

            if not then_env_unsat:
                subtree = self.execute(stmt.then_block, then_env)
                result[1].append(subtree)

            if not else_env_unsat:
                if stmt.else_block is not None:
                    subtree = self.execute(stmt.else_block, else_env)
                    result[1].append(subtree)
                else:
                    result[1].append((SETNode(else_env, ''), []))

        return result

    
    # ...

The path condition for the then leg of the if statement is extended by the constraint corresponding to the evaluation of the guard along with the evaluation result of the guard expression. Including the constraint for the value of the guard is important if its evaluation can raise an exception; then, we may only execute the bodies of the if statement in the normal completion case.

Executing a try statement is relatively simple. The only specialty is that we have to look up exception types from except clauses to check whether a raised exception matches the caught type (alternatively, we could resort to simple class name comparisons).

class SymbolicInterpreter:
    # ...
    
    def execute_try_stmt(self, stmt: TryStmt, environment: Environment) -> None:
        try:
            self.execute(stmt.block, environment)
        except Exception as exc:
            if issubclass(type(exc), AbruptCompletionNoException):
                raise exc

            if stmt.exc_type is not None:
                caught_exc_type = getattr(sys.modules["builtins"], stmt.exc_type)
                if issubclass(type(exc), caught_exc_type):
                    self.execute(stmt.except_block, environment)
                    return
            else:
                self.execute(stmt.except_block, environment)
                return

            assert False
    
    # ...

The most “problematic” statement in symbolic execution of minipy is the while statement. Consider the implementation of the execute_while_stmt method in the concrete minipy interpreter:

class Interpreter:
    # ...
    
    def execute_while_stmt(self, stmt: WhileStmt, environment: Environment) -> None:
        eval_guard_res = self.evaluate(stmt.guard, environment)
        assert type(eval_guard_res) is bool

        while eval_guard_res:
            try:
                self.execute(stmt.body, environment)
            except Continue:
                eval_guard_res = self.evaluate(stmt.guard, environment)
                continue
            except Break:
                break
            eval_guard_res = self.evaluate(stmt.guard, environment)
        else:
            if stmt.else_block is not None:
                self.execute(stmt.else_block, environment)
                return

    # ...

The concrete interpreter evaluates the guard, executes the body if the guard evaluates to true, checks for the special Continue and Break exceptions, and continues doing this until the guard evaluates to false (or a Continue or Break was encountered). The simplest way of symbolically executing a loop is to mimic this behavior, which is called loop unrolling in SE terminology [ABB+16] [BCCDElia+18]. Implementing this is not too difficult:

class SymbolicInterpreter:
    # ...
    
    def execute_while_stmt(self, stmt: WhileStmt, environment: SymbolicEnvironment) -> SET:
        eval_guard_res = self.evaluate(stmt.guard, environment)
        result = (SETNode(environment, stmt.code), [])

        for value, constraint in eval_guard_res:
            new_env = environment.add_constraint(constraint)
            if self.unsatisfiable(new_env):
                continue
            
            if isinstance(value, Exception):
                exc_node = SETNode(new_env.set_abrupt_completion(value), '')
                result[1].append((exc_node, []))
                continue

            then_constraint: z3.BoolRef = z3.simplify(value)
            then_env = new_env.add_constraint(then_constraint)

            if not self.unsatisfiable(then_env):
                subtree = self.execute(stmt.body, then_env)
                result[1].append(subtree)
                for leaf in get_leaves(subtree):
                    node, _ = leaf
                    env = node.environment
                    exc = env.abrupt_completion
                    if exc is not None:
                        if isinstance(exc, Continue):
                            env = env.set_abrupt_completion(None)
                        elif isinstance(exc, Break):
                            env = env.set_abrupt_completion(None)
                            result = replace_in_tree(
                                result, leaf, (SETNode(env, ''), []))
                            continue
                        else:
                            continue

                    while_tree = self.execute(stmt, env)
                    result = replace_in_tree(result, leaf, while_tree)
                    continue

            else_constraint: z3.BoolRef = z3.simplify(z3.Not(value))
            else_env = new_env.add_constraint(else_constraint)

            if not self.unsatisfiable(else_env):
                if stmt.else_block is not None:
                    subtree = self.execute(stmt.else_block, else_env)
                    result[1].append((SETNode(else_env, 'else:'), [subtree]))
                else:
                    result[1].append((SETNode(else_env, ''), []))

        return result
    
    # ...

Apart from the standard differences between the concrete and symbolic interpreter (which correspond to the differences of the respective implementations of the execution of an if statement), the most striking difference is that the implementation is recursive, which is a little easier to implement. Furthermore, the execution of the body can, of course, result in a tree with multiple leaves. We replace each leaf from that tree with a leaf indicating what to execute next. For instance, in the case of an encountered Break, we replace the leaf node with a single node without children, and reset the abrupt completion flag (env = env.set_abrupt_completion(None)) since the loop “catches” abrupt completion due to break statements. For the normal completion case, we execute the loop once again, and replace the normal completion leaves of the body with the newly generated trees resulting from the loop execution.

The execution of function definition statements in the symbolic interpreter works analogously to the concrete minipy interpreter: We add a continuation to the execution environment which is used when evaluating later calls to the defined functions. The corresponding implementation in the concrete interpreter is

class Interpreter:
    
    # ...
    
    def execute_function_def(self, stmt: FunctionDef, environment: Environment) -> None:
        f_name = stmt.name
        ret_type = get_type(stmt.t)
        params = [self.evaluate_param(param) for param in stmt.params]

        def func(*args):
            new_env = environment.copy()
            for idx, parameter in enumerate(params):
                new_env[parameter] = args[idx]
            try:
                self.execute_block(stmt.block, new_env)
            except Return as ret:
                return ret.value

        environment[f_name] = (params, ret_type, func)
    
    # ...

The main difference in the symbolic variant is that it has to account for multiple leaves in the symbolic tree resulting from the execution of the function body:

class SymbolicInterpreter:
    # ...
    
    def execute_function_def(self, stmt: FunctionDef, environment: SymbolicEnvironment) -> SET:
        f_name = stmt.name
        ret_type = get_type(stmt.t)
        params = tuple([self.evaluate_param(param) for param in stmt.params])

        def func(environment: SymbolicEnvironment, *args: SymbolicValueType) -> EvalExprResult:
            for idx, parameter in enumerate(params):
                environment = environment.set(parameter, args[idx])

            block_result = self.execute(stmt.block, environment)
            result: EvalExprResult = []

            for node, _ in get_leaves(block_result):
                exc = node.environment.abrupt_completion
                if exc is None:
                    raise RuntimeError("Missing return statement")

                if isinstance(exc, SymbolicReturn):
                    result.append((exc.value, z3_and(*node.environment.path_constraints)))
                else:
                    result.append((exc, z3_and(*node.environment.path_constraints)))

            return result

        return SETNode(environment, stmt.code), [
            (SETNode(environment.add_function(f_name, params, ret_type, func), ''), [])
        ]
    
    # ...

Function calls are evaluated by first considering all evaluation alternatives for the passed parameters: If the evaluation of a parameter raises an exception, the exception is the result of the call. Only if all parameters evaluate normally, the function body is executed. Similarly to the evaluation of tuple expressions, we build the product of the evaluation results for the parameters to account for all evaluation alternatives.

class SymbolicInterpreter:
    # ...
    
    def evaluate_fun_call(self, expr: FunctionCall, environment: SymbolicEnvironment) -> EvalExprResult:
        result: EvalExprResult = []

        f_name = expr.name
        if f_name not in environment.functions:
            raise NameError(f"name '{f_name}' is not defined")

        args_result = [self.evaluate(arg, environment) for arg in expr.args]

        combinations: List[Tuple[Tuple[Union[SymbolicValueType, Exception], z3.BoolRef], ...]] = \
            list(itertools.product(*args_result))

        combination: Tuple[Tuple[Union[SymbolicValueType, Exception], z3.BoolRef], ...]
        for combination in combinations:
            values: List[SymbolicValueType] = []
            path_constraint: z3.BoolRef = z3.BoolVal(True)
            sub_result: Tuple[Union[SymbolicValueType, Exception], z3.BoolRef]
            for sub_value, sub_constraint in combination:
                path_constraint = z3.simplify(z3.And(path_constraint, sub_constraint))
                if isinstance(sub_value, Exception):
                    result.append(wrap_exceptions((sub_value, path_constraint)))
                    break

                values.append(sub_value)

            if len(values) != len(combination):
                continue

            if f_name in environment.functions:
                params, ret_type, fun = environment.functions[f_name]
                if len(values) != len(params):
                    raise TypeError(f"{f_name}({', '.join(map(str, params))}) takes {len(params)} "
                                    f"argument(s) but {len(values)} were given")

                result.extend(fun(environment, *values))
            else:
                assert False

        return result
    
    # ...

The complete code for the symbolic interpreter, along with some helper functions, follows bellow (press the toggle on the right).

import logging
from typing import Any
from orderedset import OrderedSet


SomeTree = Tuple[Any, List['SomeTree']]


def tree_contains(tree: SET, needle: SET) -> bool:
    if tree is needle:
        return True
    node, children = tree
    if not children:
        return False
    return any(tree_contains(child, needle) for child in children)


def replace_in_tree(tree: SET, to_replace: SET, new_tree: SET) -> SET:
    assert tree_contains(tree, to_replace)
    for path, subtree in path_iterator(tree):
        if subtree is to_replace:
            return replace_tree_path(tree, path, new_tree)
        
        
def replace_tree_path(in_tree: SomeTree, path: Path, replacement_tree: SomeTree) -> SomeTree:
    """In-place subtree replacement preserving identity of other subtrees."""
    if not path:
        return replacement_tree

    curr_parent = in_tree
    last_idx = path[0]
    curr_tree = curr_parent[1][last_idx]
    path = path[1:]

    while path:
        last_idx = path[0]
        path = path[1:]
        curr_parent = curr_tree
        curr_tree = curr_tree[1][last_idx]

    curr_parent[1][last_idx] = replacement_tree

    return in_tree

        
def path_iterator(tree: SomeTree, path: Path = ()) -> Generator[Tuple[Path, SomeTree], None, None]:
    yield path, tree
    if tree[1] is not None:
        for i, child in enumerate(tree[1]):
            yield from path_iterator(child, path + (i,))

        
def get_type_for_z3_expr(expr: z3.ExprRef) -> Type:
    if z3.is_int(expr):
        return INT_TYPE
    elif z3.is_seq(expr):
        return TUPLE_TYPE
    elif z3.is_bool(expr):
        return BOOL_TYPE

    assert False
    
    
def implies(premise: z3.BoolRef, conclusion: z3.BoolRef) -> bool:
    return is_unsat(z3.And(premise, z3.Not(conclusion)))
        
        
class SymbolicReturn(AbruptCompletionNoException):
    def __init__(self, value: SymbolicValueType):
        self.value = value


class SymbolicInterpreter:
    def __init__(self):
        self.logger = logging.getLogger(type(self).__name__)
        
        self.stmt_interpretations: Dict[typing.Type[ASTNode], Callable[[ASTNode, SymbolicEnvironment], SET]] = {
            Stmts: self.execute_stmts,
            Pass: self.execute_pass,
            BreakStmt: self.execute_break,
            ContinueStmt: self.execute_continue,
            ReturnStmt: self.execute_return_stmt,
            Assert: self.execute_assert_stmt,
            Assignment: self.execute_assignment,
            Block: self.execute_block,
            FunctionDef: self.execute_function_def,
            TryStmt: self.execute_try_stmt,
            WhileStmt: self.execute_while_stmt,
            IfStmt: self.execute_if_stmt,
        }

        self.expr_interpretations: Dict[typing.Type[Expression], Callable[[ASTNode, SymbolicEnvironment],
                                                                          EvalExprResult]] = {
            Disjunction: self.evaluate_disjunction,
            Conjunction: self.evaluate_conjunction,
            Inversion: self.evaluate_inversion,
            Comparison: self.evaluate_comparison,
            Sum: self.evaluate_sum,
            Term: self.evaluate_term,
            Factor: self.evaluate_factor,
            TupleNode: self.evaluate_tuple,
            TupleAccess: self.evaluate_tuple_access,
            FunctionCall: self.evaluate_fun_call,
            Param: self.evaluate_param,
            BooleanAtom: self.evaluate_boolean,
            IntAtom: self.evaluate_int,
            NameAtom: self.evaluate_name,
        }
    
    def constraint_unsatisfiable(self, constraint: z3.BoolRef) -> bool:
        return is_unsat(constraint)

    def unsatisfiable(self, env: SymbolicEnvironment) -> bool:
        return self.constraint_unsatisfiable(z3_and(*env.path_constraints))

    def execute(self, stmt: ASTNode, environment: SymbolicEnvironment) -> SET:
        self.logger.debug("Executing statement %s", stmt.code)

        t = type(stmt)
        if isinstance(stmt, WhileStmt):
            return self.execute_while_stmt(stmt, environment)

        if t in self.stmt_interpretations:
            return self.stmt_interpretations[t](stmt, environment)

        if isinstance(stmt, Expression):
            # Expression Statements
            return SETNode(environment, stmt.code), [(SETNode(environment.add_constraint(constraint), None), [])
                                                     for _, constraint in self.evaluate(stmt, environment)]

        assert False

    def execute_stmts(self, stmts: Stmts, environment: SymbolicEnvironment) -> SET:
        return self.execute_sequence(tuple(stmts.stmts), environment)

    def execute_sequence(self, stmts: Tuple[ASTNode, ...], environment: SymbolicEnvironment) -> SET:
        result = None
        leaves: List[SET] = []

        for stmt in stmts:
            if result is None:
                tree = self.execute(stmt, environment)
                result, leaves = tree, get_leaves(tree)
                continue

            for leaf in leaves:
                node, children = leaf
                if node.environment.abrupt_completion is not None or self.unsatisfiable(node.environment):
                    continue
                tree = self.execute(stmt, node.environment)

                assert tree_contains(result, leaf)
                result = replace_in_tree(result, leaf, tree)

            leaves = get_leaves(result)

        return result

    def execute_pass(self, stmt: Pass, environment: SymbolicEnvironment) -> SET:
        return SETNode(environment, stmt.code), [(SETNode(environment, None), [])]

    def execute_break(self, stmt: BreakStmt, environment: SymbolicEnvironment) -> SET:
        return SETNode(environment, stmt.code), [(SETNode(environment.set_abrupt_completion(Break()), None), [])]

    def execute_continue(self, stmt: ContinueStmt, environment: SymbolicEnvironment) -> SET:
        return SETNode(environment, stmt.code), [(SETNode(environment.set_abrupt_completion(Continue()), None), [])]

    def execute_return_stmt(self, stmt: ReturnStmt, environment: SymbolicEnvironment) -> SET:
        eval_expr_result = self.evaluate(stmt.expression, environment)

        children: List[SET] = []
        result = (SETNode(environment, stmt.code), children)

        for value, constraint in eval_expr_result:
            if isinstance(value, Exception):
                children.append((SETNode(environment
                                         .add_constraint(constraint)
                                         .set_abrupt_completion(value), ''), []))
            else:
                children.append((SETNode(environment
                                         .add_constraint(constraint)
                                         .set_abrupt_completion(SymbolicReturn(value)), ''), []))

        return result

    def execute_assert_stmt(self, stmt: Assert, environment: SymbolicEnvironment) -> SET:
        eval_expr_result = self.evaluate(stmt.expression, environment)

        children: List[SET] = []
        result = (SETNode(environment, stmt.code), children)

        for value, constraint in eval_expr_result:
            new_env = environment.add_constraint(constraint)
            if self.unsatisfiable(new_env):
                continue

            if isinstance(value, Exception):
                children.append((SETNode(new_env.set_abrupt_completion(value), ''), []))
                continue
                
            assert isinstance(value, z3.BoolRef)
            
            sat_env = new_env.add_constraint(value)
            viol_env = new_env.add_constraint(z3.Not(value))

            sat_env_unsat = self.unsatisfiable(sat_env)
            viol_env_unsat = self.unsatisfiable(viol_env)
            
            assert not sat_env_unsat or not viol_env_unsat

            if sat_env_unsat:
                children.append((SETNode(environment.set_abrupt_completion(AssertionError()), ''), []))
            elif viol_env_unsat:
                children.append((SETNode(environment, ''), []))
            else:
                children.append((SETNode(sat_env, ''), []))
                children.append((SETNode(viol_env.set_abrupt_completion(AssertionError()), ''), []))

        return result

    def execute_assignment(self, stmt: Assignment, environment: SymbolicEnvironment) -> SET:
        eval_expr_result = [
            (value, constraint)
            for value, constraint in self.evaluate(stmt.expression, environment)
            if not self.unsatisfiable(environment.add_constraint(constraint))
        ]

        children: List[SET] = []
        result = (SETNode(environment, stmt.code), children)

        for value, constraint in eval_expr_result:
            if isinstance(value, Exception):
                new_env = environment.set_abrupt_completion(value)
            else:
                variable = Variable(stmt.lhs, get_type_for_z3_expr(value))
                new_env = environment.set(variable, value)

            new_env = new_env.add_constraint(constraint)

            children.append((SETNode(new_env, None), []))
        
        return result

    def execute_function_def(self, stmt: FunctionDef, environment: SymbolicEnvironment) -> SET:
        f_name = stmt.name
        ret_type = get_type(stmt.t)
        params = tuple([self.evaluate_param(param) for param in stmt.params])

        def func(environment: SymbolicEnvironment, *args: SymbolicValueType) -> EvalExprResult:
            for idx, parameter in enumerate(params):
                environment = environment.set(parameter, args[idx])

            block_result = self.execute(stmt.block, environment)
            result: EvalExprResult = []

            for node, _ in get_leaves(block_result):
                exc = node.environment.abrupt_completion
                if exc is None:
                    raise RuntimeError("Missing return statement")

                if isinstance(exc, SymbolicReturn):
                    result.append((exc.value, z3_and(*node.environment.path_constraints)))
                else:
                    result.append((exc, z3_and(*node.environment.path_constraints)))

            return result

        return SETNode(environment, stmt.code), [
            (SETNode(environment.add_function(f_name, params, ret_type, func), ''), [])
        ]

    def evaluate_param(self, param: Param) -> Variable:
        return Variable(param.name, get_type(param.type))

    def execute_try_stmt(self, stmt: TryStmt, environment: SymbolicEnvironment) -> SET:
        tree = self.execute(stmt.block, environment)
        result = (SETNode(environment, stmt.code), [tree])

        for leaf in get_leaves(tree):
            node, _ = leaf
            exc = node.environment.abrupt_completion
            if exc is None or isinstance(exc, AbruptCompletionNoException):
                continue

            if stmt.exc_type is not None:
                caught_exc_type = getattr(sys.modules["builtins"], stmt.exc_type)
                if isinstance(exc, caught_exc_type):
                    catch_block_tree = self.execute(stmt.except_block,
                                                    node.environment.set_abrupt_completion(None))
                    result = replace_in_tree(result, leaf, catch_block_tree)
            else:
                catch_block_tree = self.execute(stmt.block, node.environment.set_abrupt_completion(None))
                result = replace_in_tree(result, leaf, catch_block_tree)

        return result

    def execute_if_stmt(self, stmt: IfStmt, environment: SymbolicEnvironment) -> SET:
        eval_expr_result = self.evaluate(stmt.guard, environment)

        result: SET = (SETNode(environment, stmt.code), [])

        for value, constraint in eval_expr_result:
            new_env = environment.add_constraint(constraint)
            if self.unsatisfiable(new_env):
                continue

            if isinstance(value, Exception):
                result[1].append((SETNode(new_env.set_abrupt_completion(value), ''), []))
                continue

            assert z3.is_bool(value)

            then_constraint: z3.BoolRef = z3.simplify(value)
            else_constraint = z3.simplify(z3.Not(value))

            then_env = new_env.add_constraint(then_constraint)
            else_env = new_env.add_constraint(else_constraint)

            then_env_unsat = self.unsatisfiable(then_env)
            else_env_unsat = self.unsatisfiable(else_env)

            if not then_env_unsat:
                subtree = self.execute(stmt.then_block, then_env)
                result[1].append(subtree)

            if not else_env_unsat:
                if stmt.else_block is not None:
                    subtree = self.execute(stmt.else_block, else_env)
                    result[1].append(subtree)
                else:
                    result[1].append((SETNode(else_env, ''), []))

        return result

    def execute_while_stmt(self, stmt: WhileStmt, environment: SymbolicEnvironment) -> SET:
        eval_guard_res = self.evaluate(stmt.guard, environment)
        result = (SETNode(environment, stmt.code), [])

        for value, constraint in eval_guard_res:
            new_env = environment.add_constraint(constraint)
            if self.unsatisfiable(new_env):
                continue
            
            if isinstance(value, Exception):
                exc_node = SETNode(new_env.set_abrupt_completion(value), '')
                result[1].append((exc_node, []))
                continue

            then_constraint: z3.BoolRef = z3.simplify(value)
            then_env = new_env.add_constraint(then_constraint)

            if not self.unsatisfiable(then_env):
                subtree = self.execute(stmt.body, then_env)
                result[1].append(subtree)
                for leaf in get_leaves(subtree):
                    node, _ = leaf
                    env = node.environment
                    exc = env.abrupt_completion
                    if exc is not None:
                        if isinstance(exc, Continue):
                            env = env.set_abrupt_completion(None)
                        elif isinstance(exc, Break):
                            env = env.set_abrupt_completion(None)
                            result = replace_in_tree(
                                result, leaf, (SETNode(env, ''), []))
                            continue
                        else:
                            continue

                    while_tree = self.execute(stmt, env)
                    result = replace_in_tree(result, leaf, while_tree)
                    continue

            else_constraint: z3.BoolRef = z3.simplify(z3.Not(value))
            else_env = new_env.add_constraint(else_constraint)

            if not self.unsatisfiable(else_env):
                if stmt.else_block is not None:
                    subtree = self.execute(stmt.else_block, else_env)
                    result[1].append((SETNode(else_env, 'else:'), [subtree]))
                else:
                    result[1].append((SETNode(else_env, ''), []))

        return result

    def execute_block(self, stmt: Block, environment: SymbolicEnvironment) -> SET:
        return self.execute_sequence(tuple(stmt.stmts), environment)

    def evaluate(self, expr: Expression, environment: SymbolicEnvironment) -> EvalExprResult:
        t = type(expr)
        if t in self.expr_interpretations:
            return [(value if isinstance(value, Exception) else z3.simplify(value), z3.simplify(constraint))
                    for value, constraint in self.expr_interpretations[t](expr, environment)
                    if not self.constraint_unsatisfiable(constraint)]

        assert False

    def evaluate_disjunction(self, expr: Disjunction, environment: SymbolicEnvironment) -> EvalExprResult:
        sub_results: List[EvalExprResult] = [self.evaluate(sub, environment) for sub in expr.conjunctions]

        result: EvalExprResult = sub_results[0]
        for sub_result in sub_results[1:]:
            result = handle_binary_operation(result, sub_result, "or")

        return result

    def evaluate_conjunction(self, expr: Conjunction, environment: SymbolicEnvironment) -> EvalExprResult:
        sub_results: List[EvalExprResult] = [self.evaluate(sub, environment) for sub in expr.inversions]

        result: EvalExprResult = sub_results[0]
        for sub_result in sub_results[1:]:
            result = handle_binary_operation(result, sub_result, "and")

        return result

    def evaluate_inversion(self, expr: Inversion, environment: SymbolicEnvironment) -> EvalExprResult:
        sub_result = self.evaluate(expr.inversion, environment)
        return [(exc, constr) for exc, constr in sub_result if isinstance(exc, Exception)] + \
               [(z3.Not(value), constr) for value, constr in sub_result if not isinstance(value, Exception)]

    def evaluate_comparison(self, expr: Comparison, environment: SymbolicEnvironment) -> EvalExprResult:
        left = self.evaluate(expr.left, environment)
        right = self.evaluate(expr.right, environment)
        return handle_binary_operation(left, right, expr.op)

    def evaluate_sum(self, expr: Sum, environment: SymbolicEnvironment) -> EvalExprResult:
        left = self.evaluate(expr.left, environment)
        right = self.evaluate(expr.right, environment)
        return handle_binary_operation(left, right, expr.op)

    def evaluate_term(self, expr: Term, environment: SymbolicEnvironment) -> EvalExprResult:
        left = self.evaluate(expr.left, environment)
        right = self.evaluate(expr.right, environment)
        return handle_binary_operation(left, right, expr.op)

    def evaluate_factor(self, expr: Factor, environment: SymbolicEnvironment) -> EvalExprResult:
        eval_res = self.evaluate(expr.factor, environment)

        if expr.symb == "-":
            return [(result if isinstance(result, Exception) else -result, constr)
                    for result, constr in eval_res]
        else:
            return eval_res

    def evaluate_fun_call(self, expr: FunctionCall, environment: SymbolicEnvironment) -> EvalExprResult:
        result: EvalExprResult = []

        f_name = expr.name
        if f_name not in environment.functions:
            raise NameError(f"name '{f_name}' is not defined")

        args_result = [self.evaluate(arg, environment) for arg in expr.args]

        combinations: List[Tuple[Tuple[Union[SymbolicValueType, Exception], z3.BoolRef], ...]] = \
            list(itertools.product(*args_result))

        combination: Tuple[Tuple[Union[SymbolicValueType, Exception], z3.BoolRef], ...]
        for combination in combinations:
            values: List[SymbolicValueType] = []
            path_constraint: z3.BoolRef = z3.BoolVal(True)
            sub_result: Tuple[Union[SymbolicValueType, Exception], z3.BoolRef]
            for sub_value, sub_constraint in combination:
                path_constraint = z3.simplify(z3.And(path_constraint, sub_constraint))
                if isinstance(sub_value, Exception):
                    result.append(wrap_exceptions((sub_value, path_constraint)))
                    break

                values.append(sub_value)

            if len(values) != len(combination):
                continue

            if f_name in environment.functions:
                params, ret_type, fun = environment.functions[f_name]
                if len(values) != len(params):
                    raise TypeError(f"{f_name}({', '.join(map(str, params))}) takes {len(params)} "
                                    f"argument(s) but {len(values)} were given")

                result.extend(fun(environment, *values))
            else:
                assert False

        return result

    def evaluate_tuple_access(self, expr: TupleAccess, environment: SymbolicEnvironment) -> EvalExprResult:
        target_res = self.evaluate(expr.atom, environment)
        expr_res = self.evaluate(expr.expression, environment)
        return self.handle_tuple_access(target_res, expr_res)

    def handle_tuple_access(self, tuple_result: EvalExprResult, expr_result: EvalExprResult) -> EvalExprResult:
        result: EvalExprResult = []

        for atom_value, atom_constraint in tuple_result:
            if isinstance(atom_value, Exception):
                result.append((atom_value, atom_constraint))
                continue

            if not isinstance(atom_value.sort(), z3.SeqSortRef):
                result.append((TypeError(f"'{atom_value.sort()} is not subscriptable"),
                               atom_constraint))
                continue

            for expr_value, expr_constraint in expr_result:
                conjunction = z3.And(atom_constraint, expr_constraint)
                if isinstance(expr_value, Exception):
                    result.append((expr_value, conjunction))
                    continue

                if expr_value.sort() != z3.IntSort():
                    result.append((TypeError(f"list indices must be integers, not {expr_value.sort()}"),
                                   conjunction))
                    continue

                atom_value: z3.SeqRef

                # In Python, indices can be in the closed interval [-len ; len - 1]
                length_constraint_1 = z3.And(z3.IntVal(0) <= expr_value,
                                             expr_value < z3.Length(atom_value))  # [0 ; len - 1]
                length_constraint_2 = z3.And(z3.IntVal(0) > expr_value,
                                             expr_value >= z3.IntVal(-1) * z3.Length(atom_value))  # [-len ; -1]

                expr = z3.simplify(atom_value[expr_value])
                if expr.decl().kind() == z3.Z3_OP_ITE and \
                        implies(z3.And(conjunction, length_constraint_1), expr.children()[0]):
                    expr = expr.children()[1]

                result.append((expr, z3.And(conjunction, length_constraint_1)))

                expr = z3.simplify(atom_value[z3.Length(atom_value) + expr_value])
                if expr.decl().kind() == z3.Z3_OP_ITE and \
                        implies(z3.And(conjunction, length_constraint_2), expr.children()[0]):
                    expr = expr.children()[1]

                result.append((expr, z3.And(conjunction, length_constraint_2)))

                result.append((IndexError("tuple index out of range"),
                               z3.And(conjunction, z3.Not(length_constraint_1), z3.Not(length_constraint_2))))

        return result

    def evaluate_name(self, expr: NameAtom, environment: SymbolicEnvironment) -> EvalExprResult:
        return [(environment[expr.name], z3.BoolVal(True))]

    def evaluate_boolean(self, expr: BooleanAtom, environment: SymbolicEnvironment) -> EvalExprResult:
        return [(z3.BoolVal(expr.value), z3.BoolVal(True))]

    def evaluate_int(self, expr: IntAtom, environment: SymbolicEnvironment) -> EvalExprResult:
        return [(z3.IntVal(expr.number), z3.BoolVal(True))]

    def evaluate_tuple(self, expr: TupleNode, environment: SymbolicEnvironment) -> EvalExprResult:
        sub_results: List[EvalExprResult] = [self.evaluate(elem, environment) for elem in expr.elems]

        if not sub_results:
            return [(z3.Empty(z3.SeqSort(z3.IntSort())), z3.BoolVal(True))]
        
        result: OrderedSet[Tuple[Union[SymbolicValueType, Exception], z3.BoolRef]] = OrderedSet([])

        combinations: List[Tuple[Tuple[Union[SymbolicValueType, Exception], z3.BoolRef], ...]] = \
            list(itertools.product(*sub_results))

        combination: Tuple[Tuple[Union[SymbolicValueType, Exception], z3.BoolRef], ...]
        for combination in combinations:
            values: List[SymbolicValueType] = []
            path_constraint: z3.BoolRef = z3.BoolVal(True)
            sub_result: Tuple[Union[SymbolicValueType, Exception], z3.BoolRef]
            for sub_value, sub_constraint in combination:
                path_constraint = z3.simplify(z3.And(path_constraint, sub_constraint))
                if isinstance(sub_value, Exception):
                    result.add(wrap_exceptions((sub_value, path_constraint)))
                    break

                values.append(sub_value)

            if len(values) != len(combination):
                continue

            if len(values) == 1:
                value = values[0]
                result.add((value if isinstance(value.sort(), z3.SeqSortRef) else z3.Unit(value), path_constraint))
            elif len(values) > 1:
                result.add((z3.Concat(*[value if isinstance(value.sort(), z3.SeqSortRef) else z3.Unit(value)
                                        for value in values]), path_constraint))

        return [unwrap_exceptions(r) for r in result]


class ExceptionWrapper(Exception):
    """Wrapper for exception objects with support for equality checks and hashing."""

    def __init__(self, exc: Exception):
        self.exc = exc

    def __hash__(self):
        return hash((type(self.exc), self.exc.args))

    def __eq__(self, other):
        return type(self.exc) is type(other.exc) and self.exc.args == other.exc.args


def wrap_exceptions(r: Tuple[Union[SymbolicValueType, Exception], z3.BoolRef]) -> \
        Tuple[Union[SymbolicValueType, Exception], z3.BoolRef]:
    value, constraint = r
    return ExceptionWrapper(value) if isinstance(value, Exception) else value, constraint


def unwrap_exceptions(r: Tuple[Union[SymbolicValueType, Exception], z3.BoolRef]) -> \
        Tuple[Union[SymbolicValueType, Exception], z3.BoolRef]:
    value, constraint = r
    return value.exc if isinstance(value, ExceptionWrapper) else value, constraint


def get_leaves(tree: SET) -> List[SET]:
    node, children = tree
    if not children:
        return [tree]
    else:
        return [leaf for child in children for leaf in get_leaves(child)]

Let’s give it a try:

program = """
x = 2 * y
if x < 0:
    x = -x
"""

display_program(program)

x = Variable("x", INT_TYPE)
y = Variable("y", INT_TYPE)

env = SymbolicEnvironment(SymbolicStore({x: x.to_z3(), y: y.to_z3()}))
interpreter = SymbolicInterpreter()
tree = interpreter.execute(parse(program), env)

display_set(tree)

x = 2 * y
if x < 0:
    x = -x

To try out something more complex, we revisit the linear search method from Parser and Interpreter for minipy (we slightly simplified the code, which originally should merely illustrate the extent of the minipy language).

program="""
def find(needle: int, haystack: tuple) -> int:
    i = 0
    while i < len(haystack):
        if haystack[i] == needle:
            break
        
        i = i + 1
    else:
        return -1
    
    return i

t = (1, 2, 3, 4, )
r = find(x, t)
"""

display_program(program)

def find(needle: int, haystack: tuple) -> int:
    i = 0
    while i < len(haystack):
        if haystack[i] == needle:
            break
        
        i = i + 1
    else:
        return -1
    
    return i

t = (1, 2, 3, 4, )
r = find(x, t)

Instead of executing this program with a concrete “needle” value, we use a symbolic input x. Let’s have a look at the result:

env = SymbolicEnvironment(SymbolicStore({x: x.to_z3()}))
interpreter = SymbolicInterpreter()

with ExpectError(print_traceback=False):
    tree = interpreter.execute(parse(program), env)
    display_set(tree)

NameError: name 'len' is not defined (expected)

Ooops… Apparently, the interpreter does not know the len function which is built into Python. This is not surprising: Even the concrete interpreter has to be told about existing built-in functions. For the symbolic interpreter, the situation is much more complex, since we cannot call a built-in Python function with symbolic arguments. There are two ways to approach this problem:

Implement the len function in minipy and add it to our program.
Summarize the len function with a symbolic expression.

Option (2) corresponds to the concept of function summaries or contracts [ABB+16] [BCCDElia+18]. Such a summary is basically a conjunction of pre- and postcondition pairs, such that if a precondition holds in the current state, the postcondition can be assumed after function execution. We choose this option to teach the symbolic interpreter about the len function, since it would be rather impractical to add full implementations of all used library functions to each analyzed program. Furthermore, summaries are a central concept of compositional symbolic execution [AGT08] (see also Compositional Symbolic Execution). It allows to modularly execute one piece of code at a time to derive or prove its summary, while invoking other functions by their summaries instead of executing them repeatedly.

We extend the symbolic interpreter such that we can pass it a collection of predicates that will be considered in function calls whenever a function implementation is not available. Technically, we pass the interpreter a dictionary with predicate names as keys, and triples of argument sorts, target sort, and a Callable producing a symbolic value from the given arguments as values. For the len function:

predicates = {"len": ((z3.SeqSort(z3.IntSort()),), z3.IntSort(), z3.Length)}

class SymbolicInterpreter(SymbolicInterpreter):
    def __init__(self,
                 predicates: Optional[Dict[str, Tuple[Tuple[z3.SortRef, ...], z3.SortRef, Callable]]] = None):
        super().__init__()
        self.predicates: Dict[str, Tuple[Tuple[z3.SortRef], z3.SortRef, Callable]] = \
            {} if predicates is None else predicates

    def evaluate_fun_call(self, expr: FunctionCall, environment: SymbolicEnvironment) -> EvalExprResult:
        result: EvalExprResult = []

        f_name = expr.name
        if f_name not in environment.functions and f_name not in self.predicates:
            raise NameError(f"name '{f_name}' is not defined")

        args_result = [self.evaluate(arg, environment) for arg in expr.args]

        combinations: List[Tuple[Tuple[Union[SymbolicValueType, Exception], z3.BoolRef], ...]] = \
            list(itertools.product(*args_result))

        combination: Tuple[Tuple[Union[SymbolicValueType, Exception], z3.BoolRef], ...]
        for combination in combinations:
            values: List[SymbolicValueType] = []
            path_constraint: z3.BoolRef = z3.BoolVal(True)
            sub_result: Tuple[Union[SymbolicValueType, Exception], z3.BoolRef]
            for sub_value, sub_constraint in combination:
                path_constraint = z3.simplify(z3.And(path_constraint, sub_constraint))
                if isinstance(sub_value, Exception):
                    result.append(wrap_exceptions((sub_value, path_constraint)))
                    break

                values.append(sub_value)

            if len(values) != len(combination):
                continue

            if f_name in environment.functions:
                params, ret_type, fun = environment.functions[f_name]
                if len(values) != len(params):
                    raise TypeError(f"{f_name}({', '.join(map(str, params))}) takes {len(params)} "
                                    f"argument(s) but {len(values)} were given")

                result.extend(fun(environment, *values))
            elif f_name in self.predicates:
                params_sorts, ret_sort, pred = self.predicates[f_name]
                if len(values) != len(params_sorts):
                    raise TypeError(f"{f_name}({', '.join(map(str, params_sorts))}) takes {len(params_sorts)} "
                                    f"argument(s) but {len(values)} were given")
                call_result: SymbolicValueType = pred(*values)
                assert call_result.sort() == ret_sort
                result.append((call_result, path_constraint))
            else:
                assert False

        return result

env = SymbolicEnvironment(SymbolicStore({x: x.to_z3()}))
interpreter = SymbolicInterpreter(predicates=predicates)

tree = interpreter.execute(parse(program), env)
display_set(tree)

Since the picture is a little oversized, here come the leaves of the tree:

def display_leaves(tree: SET):
    leaves = get_leaves(tree)
    for leaf, _ in leaves:
        display_html(leaf.environment.to_html())

display_leaves(tree)

({1 = x}, None, {x: x, t: Concat(Unit(1), Concat(Unit(2), Concat(Unit(3), Unit(4)))), r: 0})

({2 = x}, None, {x: x, t: Concat(Unit(1), Concat(Unit(2), Concat(Unit(3), Unit(4)))), r: 1})

({3 = x}, None, {x: x, t: Concat(Unit(1), Concat(Unit(2), Concat(Unit(3), Unit(4)))), r: 2})

({4 = x}, None, {x: x, t: Concat(Unit(1), Concat(Unit(2), Concat(Unit(3), Unit(4)))), r: 3})

({¬(3 = x) ∧ ¬(4 = x) ∧ ¬(1 = x) ∧ ¬(2 = x)}, None, {x: x, t: Concat(Unit(1), Concat(Unit(2), Concat(Unit(3), Unit(4)))), r: -1})

The flat shape of the tree reveals one of the characteristics of our symbolic interpreter: Since function calls are evaluated as expressions, we do not see the SET corresponding to the execution of the function body. In Transparent Symbolic Execution of Function Calls, we describe how to transparently integrate SETs corresponding to called functions. Apart from that, the result is as expected: If x occurs in t (i.e., its value is between 1 and 4), the result value of variable r is 1 - x, and -1 otherwise.

But what happens if we also leave the value of t symbolic? In that case, the interpreter would continue unrolling the loop until the stack overflows (we do not demonstrate this here since a timeout is hard to implement in Python). In presence of a symbolic tuple t, we obtain an unbounded loop. A straightforward to deal with such unbounded loops solution is to only unroll loops until a maximum depth. We extend the symbolic interpreter with a depth bound for loops.

class SymbolicInterpreter(SymbolicInterpreter):
    def __init__(self,
                 loop_unrolling_threshold: Optional[int] = None,
                 predicates: Optional[Dict[str, Tuple[Tuple[z3.SortRef, ...], z3.SortRef, Callable]]] = None):
        super().__init__(predicates)
        self.loop_unrolling_threshold = loop_unrolling_threshold

    def execute(self, stmt: ASTNode, environment: SymbolicEnvironment,
                loop_unrolling_threshold: Optional[int] = None) -> SET:
        self.logger.debug("Executing statement %s", stmt.code)

        loop_unrolling_threshold = (
            loop_unrolling_threshold if loop_unrolling_threshold is not None
            else self.loop_unrolling_threshold)

        t = type(stmt)
        if isinstance(stmt, WhileStmt):
            return self.execute_while_stmt(stmt, environment, loop_unrolling_threshold)

        if t in self.stmt_interpretations:
            return self.stmt_interpretations[t](stmt, environment)

        if isinstance(stmt, Expression):
            # Expression Statements
            return SETNode(environment, stmt.code), [(SETNode(environment.add_constraint(constraint), None), [])
                                                     for _, constraint in self.evaluate(stmt, environment)]

        assert False

    def execute_while_stmt(self, stmt: WhileStmt, environment: SymbolicEnvironment,
                           loop_unrolling_threshold: Optional[int] = None) -> SET:
        loop_unrolling_threshold = -1 if loop_unrolling_threshold is None else loop_unrolling_threshold

        eval_guard_res = self.evaluate(stmt.guard, environment)
        result = (SETNode(environment, stmt.code), [])

        for value, constraint in eval_guard_res:
            new_env = environment.add_constraint(constraint)
            if self.unsatisfiable(new_env):
                continue

            if isinstance(value, Exception):
                exc_node = SETNode(new_env.set_abrupt_completion(value), '')
                result[1].append((exc_node, []))
                continue

            then_constraint: z3.BoolRef = z3.simplify(value)
            then_env = new_env.add_constraint(then_constraint)

            if not self.unsatisfiable(then_env) and loop_unrolling_threshold != 0:
                subtree = self.execute(stmt.body, then_env)
                result[1].append(subtree)
                for leaf in get_leaves(subtree):
                    node, _ = leaf
                    env = node.environment
                    exc = env.abrupt_completion
                    if exc is not None:
                        if isinstance(exc, Continue):
                            env = env.set_abrupt_completion(None)
                        elif isinstance(exc, Break):
                            env = env.set_abrupt_completion(None)
                            result = replace_in_tree(
                                result, leaf, (SETNode(env, ''), []))
                            continue
                        else:
                            continue

                    while_tree = self.execute(stmt, env, loop_unrolling_threshold - 1)
                    result = replace_in_tree(result, leaf, while_tree)
                    continue

            else_constraint: z3.BoolRef = z3.simplify(z3.Not(value))
            else_env = new_env.add_constraint(else_constraint)

            if not self.unsatisfiable(else_env):
                if stmt.else_block is not None:
                    subtree = self.execute(stmt.else_block, else_env)
                    result[1].append((SETNode(else_env, 'else:'), [subtree]))
                else:
                    result[1].append((SETNode(else_env, ''), []))

        return result

This time, we directly execute the body of the find function to obtain an SET for the actual loop:

program="""i = 0
while i < len(haystack):
    if haystack[i] == needle:
        break
    i = i + 1
else:
    return -1

return i
"""

display_program(program)

i = 0
while i < len(haystack):
    if haystack[i] == needle:
        break
    i = i + 1
else:
    return -1

return i

needle = Variable("needle", INT_TYPE)
haystack = Variable("haystack", TUPLE_TYPE)

env = SymbolicEnvironment(SymbolicStore({needle: needle.to_z3(), haystack: haystack.to_z3()}))
interpreter = SymbolicInterpreter(predicates=predicates, loop_unrolling_threshold=2)
tree = interpreter.execute(parse(program), env)

display_set(tree)

display_leaves(tree)

({¬(Length(haystack) ≤ 0), If(Length(haystack) ≤ 0,
seq.nth_u(haystack, 0),
seq.nth_i(haystack, 0)) =
needle}, SymbolicReturn(0), {needle: needle, haystack: haystack, i: 0})

({¬(Length(haystack) ≤ 1), If(Length(haystack) ≤ 1,
   seq.nth_u(haystack, 1),
   seq.nth_i(haystack, 1)) =
needle, ¬(If(Length(haystack) ≤ 0,
     seq.nth_u(haystack, 0),
     seq.nth_i(haystack, 0)) =
  needle)}, SymbolicReturn(1), {needle: needle, haystack: haystack, i: 1})

({¬(Length(haystack) ≤ 1), ¬(If(Length(haystack) ≤ 1,
     seq.nth_u(haystack, 1),
     seq.nth_i(haystack, 1)) =
  needle), Length(haystack) ≤ 2, ¬(If(Length(haystack) ≤ 0,
     seq.nth_u(haystack, 0),
     seq.nth_i(haystack, 0)) =
  needle)}, SymbolicReturn(-1), {needle: needle, haystack: haystack, i: 2})

({Length(haystack) ≤ 1, ¬(Length(haystack) ≤ 0), ¬(If(Length(haystack) ≤ 0,
     seq.nth_u(haystack, 0),
     seq.nth_i(haystack, 0)) =
  needle)}, SymbolicReturn(-1), {needle: needle, haystack: haystack, i: 1})

({Length(haystack) ≤ 0}, SymbolicReturn(-1), {needle: needle, haystack: haystack, i: 0})

For each tuple index within our set bound of 2, we obtain two leaves: One where the searched-for element has been found (the abrupt completion flag is set to SymbolicReturn(index), where index is the first position where needle has been found), and one where it has not been found (SymbolicReturn(-1)). Additionally, we have one leaf for the case where the tuple is empty (Length(haystack) ≤ 0).

In Loops in Static Symbolic Execution, we introduce a technique based on loop invariants (summaries for loops), which allows to symbolically execute loops (in an online, static symbolic interpreter) without depth bounds.

But what does it actually mean to stop SE after a given bound has exceeded? What is the effect of choosing coarse summaries for functions or loops? And how can we check whether the implementation of the symbolic interpreter discussed in this section is correct? These questions are related to the semantics of symbolic transition systems, which we discuss next.

References¶

ABB+16(1,2): Wolfgang Ahrendt, Bernhard Beckert, Richard Bubel, Reiner Hähnle, Peter H. Schmitt, and Mattias Ulbrich, editors. Deductive Software Verification – The KeY Book. Volume 10001 of LNCS. Springer, 2016. doi:10.1007/978-3-319-49812-6.
AGT08: Saswat Anand, Patrice Godefroid, and Nikolai Tillmann. Demand-Driven Compositional Symbolic Execution. In Proc. of the 14th Intl. Conf. on Tools and Algorithms for the Construction and Analysis of Systems, 367–381. Springer, 2008.
BCCDElia+18(1,2,3): Roberto Baldoni, Emilio Coppa, Daniele Cono D'Elia, Camil Demetrescu, and Irene Finocchi. A Survey of Symbolic Execution Techniques. ACM Comput. Surv., 51(3):50:1–50:39, 2018. doi:10.1145/3182657.

1: See https://docs.python.org/3/reference/expressions.html#index-67

Symbolic Execution: Foundations, Techniques, Applications and Future Perspectives

A Symbolic Interpreter and Correctness of Symbolic Transitions¶

Symbolic Execution Trees¶

Evaluating Expressions¶

Executing Statements¶

References¶