Definite Assignment Analysis - The Analysis

The Analysis

The following is based on Fruja's formalization of the C# intraprocedural (single method) definite assignment analysis, which is responsible for ensuring that all local variables are assigned before they are used. It simultaneously does definite assignment analysis and constant propagation of boolean values. We define five static functions:

Name Domain Description
before All statements and expressions Variables definitely assigned before the evaluation of the given statement or expression.
after All statements and expressions Variables definitely assigned after the evaluation of the given statement or expression, assuming it completes normally.
vars All statements and expressions All variables available in the scope of the given statement or expression.
true All boolean expressions Variables definitely assigned after the evaluation of the given expression, assuming the expression evaluates to true.
false All boolean expressions Variables definitely assigned after the evaluation of the given expression, assuming the expression evaluates to false.

We supply data-flow equations that define the values of these functions on various expressions and statements, in terms of the values of the functions on their syntactic subexpressions. Assume for the moment that there are no goto, break, continue, return, or exception handling statements. Following are a few examples of these equations:

  • Any expression or statement e that does not affect the set of variables definitely assigned: after(e) = before(e)
  • Let e be the assignment expression loc = v. Then before(v) = before(e), and after(e) = after(v) U {loc}.
  • Let e be the expression true. Then true(e) = before(e) and false(e) = vars(e). In other words, if e evaluates to false, all variables are (vacuously) definitely assigned, because e does not evaluate to false.
  • Since method arguments are evaluated left to right, before(argi + 1) = after(argi). After a method completes, out parameters are definitely assigned.
  • Let s be the conditional statement if (e) s1 else s2. Then before(e) = before(s), before(s1) = true(e), before(s2) = false(e), and after(s) = after(s1) intersect after(s2).
  • Let s be the while loop statement while (e) s1. Then before(e) = before(s), before(s1) = true(e), and after(s) = false(e).
  • And so on.

At the beginning of the method, no local variables are definitely assigned. The verifier repeatedly iterates over the abstract syntax tree and uses the data-flow equations to migrate information between the sets until a fixed point can be reached. Then, the verifier examines the before set of every expression that uses a local variable to ensure that it contains that variable.

The algorithm is complicated by the introduction of control-flow jumps like goto, break, continue, return, and exception handling. Any statement that can be the target of one of these jumps must intersect its before set with the set of definitely assigned variables at the jump source. When these are introduced, the resulting data flow may have multiple fixed points, as in this example:

  1. int i = 1;
  2. L:
  3. goto L;

Since the label L can be reached from two locations, the control-flow equation for goto dictates that before(2) = after(1) intersect before(3). But before(3) = before(2), so before(2) = after(1) intersect before(2). This has two fixed-points for before(2), {i} and the empty set. However, it can be shown that because of the monotonic form of the data-flow equations, there is a unique maximal fixed point (fixed point of largest size) that provides the most possible information about the definitely assigned variables. Such a maximal (or maximum) fixed point may be computed by standard techniques; see data-flow analysis.

An additional issue is that a control-flow jump may render certain control flows infeasible; for example, in this code fragment the variable i is definitely assigned before it is used:

  1. int i;
  2. if (j < 0) return; else i = j;
  3. print(i);

The data-flow equation for if says that after(2) = after(return) intersect after(i = j). To make this work out correctly, we define after(e) = vars(e) for all control-flow jumps; this is vacuously valid in the same sense that the equation false(true) = vars(e) is valid, because it is not possible for control to reach a point immediately after a control-flow jump.

Read more about this topic:  Definite Assignment Analysis

Famous quotes containing the word analysis:

    ... the big courageous acts of life are those one never hears of and only suspects from having been through like experience. It takes real courage to do battle in the unspectacular task. We always listen for the applause of our co-workers. He is courageous who plods on, unlettered and unknown.... In the last analysis it is this courage, developing between man and his limitations, that brings success.
    Alice Foote MacDougall (1867–1945)

    Cubism had been an analysis of the object and an attempt to put it before us in its totality; both as analysis and as synthesis, it was a criticism of appearance. Surrealism transmuted the object, and suddenly a canvas became an apparition: a new figuration, a real transfiguration.
    Octavio Paz (b. 1914)