Tangible Computing
8. Reasoning About Programs




8.1 Properties, Assertions, and Invariants

Logical Property: a logical expression given in terms of the values of the program variables. A property is true, false, or undefined depending on where and when in the program you evaluate the property.

Loop invariant or Program assertion: a property that is true at a specific point and time in the execution of the program.

A point in the program is always before or after a statement. Time is counted in terms of number of statements that have been executed so far.

For the purpose of the next few examples, assume that all variables are integers of unbounded size.

Here is an example of a piece of straight line code that manipulates three variables a, b, and c. We don't know what the values of a and b are before the code begins, all we know is that they satisfy Inv 1.

code/Reasoning/straightline.c

        // Inv1: a > 0 && b < 0
     
        c = a * b;
     
        // Inv2: a > 0 && b < 0 && c == a * b && c < 0
        // or equivalently
        // Inv2: Inv1 && c == a * b && c < 0
     
        a = -a
     
        // Inv3: a < 0 && b < 0 && c == (-a)*b && c < 0
     


Notice that in straight line code, every point in the program is reached exactly once - assuming no run-time errors!

In each case we establish the next invariant by determining how the statements between invariants alter the values of the variables.

The invariant before the code executes is typically called the Pre-condition, and the invariant that holds after the code is finished is called the Post-condition. Programming is all about creating code that assumes that the Pre-Condition holds and then computes in order to make the Post-Condition hold.

The same idea works for conditional statements like if-then-else. The big difference is that points in the program may not be reached, that is they can be reached 0 or 1 times.

code/Reasoning/conditional.c

        // Pre-Condition: a is defined.
     
        b = a;
     
        // Inv1: b == a
     
        if ( b % 2 ) {
            // Inv2.1: Inv1 && b is odd 
     
            b = b + 1;
     
            // Inv2.2: b == a + 1 && b is even
            }
        else {
            // Inv3.1: b is even
            }
     
        // Inv4:  b is even && ( b == a || b == a + 1)
     
        // Post-Condition:  b is the smallest even integer >= a
     


What happens when we have a loop? Now a point in the program can be reached 0, 1, or more times. This makes the invariant difficult to state, and sometimes we need to introduce some dummy variables to do things like count the number of times a point has been reached.

Let's do a version of the long multiplication that one does by hand. First, here is the code, with the desired inputs and outputs given as Pre and Post conditions.

code/Reasoning/loop1.c

        // Pre-Condition: a, b are defined
     
        r = 0;
        p = a;
        q = b;
     
        while ( p ) {
            if ( p & 1 ) {
                r = r + q;
                }
            p = p >> 1;
            q = q + q;
            }
     
        // Post-Condition: r = a * b


Now let's do the loop analysis. It is really an induction in disguise, where the base case is just before the loop begins, and the induction step is going around the loop once.

code/Reasoning/loop2.c

        // Pre-Condition: a, b are defined
     
        r = 0;
        p = a;
        q = b;
     
        // Base: r == a * b - p * q
     
        while ( p ) {
            // Inv1: r == a * b - p * q
     
            // Inv1 (assumed true at top of the loop, due to 
            //    Base for the first time, 
            //    Induction at end of loop for remaining times
     
     
            if ( p & 1 ) {
                r = r + q;
     
                // Inv2: p is odd && r = a * b - p * q + q;
                }
     
            // Inv3: (p is even && r == a * b - p * q) ||
            //       (p is odd  && r == a * b - p * q + q)
     
    // Point *1:
            p = p >> 1;
    // Point *2:
     
            // Inv4: r == a * b - (2 * p * q)
     
    // Point *3:
            q = q + q;
    // Point *4:
     
            // Inv5: r == a * b - (2 * p * q/2)
            // Induction: r == a * b - p * q
            }
     
        // if we get here, it is because p == 0
        // so, r == a * b - 0 * q == a * b
     
        // Post-Condition: r = a * b
     


How did we establish Inv3?
We looked at the case where the condition in the if-then-else was true and where it was false. I.e. even or odd p

How did we establish Inv4?
We needed to consider the two cases inside Inv3 at points *1 and *2.
Case 1: p is even, i.e. p & 1 == 0
At *1
   r == a * b - p * q
At *2 p has been shifted right 1, i.e. divided by 2, with no remainder so
   r == a * b - (2 * p) * q

Case 2: p is odd, i.e. p & 1 == 0
At *1
   r == a * b - p * q + q
At *2 p has been shifted right 1, i.e. divided by 2, but with remainder 1 so
   r == a * b - (2 * p + 1) * q + q
     == a * b - (2 * p) * q - q + q
     == a * b - (2 * p) * q

So both cases give the same result:
   r == a * b - (2 * p * q)


How did we establish Inv5?
At point *3 we have r == a * b - (2 * p * q) but at point *3 q has been multiplied by 2, so substituting in we have
r == a * b - (2 * p * (q/2)) == a * b - p * q


Finally, how did we know the loop actually stopped?
This time we have a Loop variant, which is something that changes each time through the loop in a way that makes progress towards a target goal. In our case the variant is that a is shifted right one bit each time through, so at some point must become 0.

Exercise 1: Modify the above program and invariants so that for given m>0, the program computes (a*b) % m.

Exercise 2: Suppose that you only have 32 bit unsigned arithmetic. Modify the program and invariats so that the results is (q*b) % 2^32. What conditions need to hold for the lengths of a and b?
8. Reasoning About Programs
Tangible Computing / Version 3.20 2013-03-25