Tangible Computing
20. Binary trees


Trees are one of the fundamental data structures of computing, equal in importance to lists, arrays, and hash tables. In its most general form, a tree is a collection of nodes and edges, in which each edge connects two distinct nodes, and there is a unique path between every pair of nodes.

20.1 Binary Trees



A rooted binary tree is a special case of a general tree in which the edges have directions on them, and at most two edges leave each node. An excellent concise discussion is binary trees.

There are many ways to draw trees, this one shows details about the data structure that implements an ordered binary tree:


20.2 Binary Tree Implementations

Because of their diversity, tree data structures are not usually built into a programming language as a primitive. Instead they are implemented via modules.

20.3 Algorithms over Binary Trees

Binary trees are examples of a recursively defined ADT. What this means is that binary trees are defined in terms of how they are constructed using simple primative operations. Anything that can be built with these primitive operations, so long as you follow the rules, is a legal binary tree.

For example, the rules for the basic binary tree are as follows.

Let tree be an arbitrary binary tree.

tree = EmptyTree () makes tree the empty binary tree.

tree = Leaf(value) makes tree be the binary tree consisting of the single node containing the data item value.

tree = Join(value, left, right) makes tree be the binary tree consisting of the root node containing the data item value, and having left subtree left, and right subtree right, provided that the trees left and right do not share any nodes. If they do, the shared nodes must be constant and immutable.

By applying these operations beginning with leaves and empty trees, one can construct a binary tree. For example:

t = Join("*", Join("+", Leaf(1), Leaf(2)), Join("+", Leaf(3), Leaf(4)))

constructs the following binary tree representing an arithmetic expression:


When an ADT is defined recursively, it can be traversed recursively. That is, the function that defines how the ADT is to be walked is expressed in terms of how it was constructed. So for binary trees we can walk the tree in various ways depending on how we move from a node to its children.

In the case of the arithmetic expression stored in the binary tree above, we can build an Eval function that computes the value of the arithmetic expression in the tree. We present a number of interations of this.

First we have a very simple expression evaluator that mixes results with
error messages. code/ExpTree/exptree_0.py

    """
    expression tree evaluator
     
    Evaluate an expression tree expr in the context of the symbols 
    defined in the symtab dictionary.
     
    We define an expression with a grammar
     
    <EXPR> - an expression
    <OP> - an operator '+', '*'
    <NUMBER> - any valid python number
     
    <EXPR> := <CONST_EXPR> | <FORMULA_EXPR>
    <CONST_EXPR> := <NUMBER> 
     
    <FORMULA_EXPR> := '(' <EXPR> <OP> <EXPR> ')'
     
    These expressions natually map onto trees
     
    Representing expressions as trees
     
    An expr tree can be a list of the form
        [ value ]
    representing a <CONST_EXPR>
     
    or a list of the form
        [ op, left, right ]
    representing a <FORMULA_EXPR>
    where 
        op is '+' or '*'
        left, right are expr
     
    """
     
    def exp_eval(expr):
        """  
     
        expr_eval(expr) <-- computes the value of expression tree expr
     
        >>> exp_eval( [ ] )
        'Error: does not compute'
     
        >>> exp_eval( [ 42 ] )
        42
     
        >>> exp_eval( [ [ 42 ] ] )
     
        >>> exp_eval( [ '+',  [2], [3] ] )
        5
     
        >>> exp_eval( [ '*', [ '+', [1], [2]], ['+', [3], [4]] ] )
        21
     
        >>> exp_eval( [ "hello" ] )
        'hello'
     
        >>> exp_eval( [ '+',  [ "hello" ], [ "world"] ] )
        'helloworld'
        
        >>> exp_eval( [ '*',  [ "h" ], [ 3 ] ] )
        'hhh'
        
        >>> exp_eval( [ '+',  [2], ['-', [1], [2]]  ] )
        5
     
        """
     
        if len(expr) == 1:
            return expr[0]
     
        if len(expr) == 3:
            op = expr[0]
            lhs = expr[1]
            rhs = expr[2]
            if op == '+':
                return exp_eval(lhs) + exp_eval(rhs)
            if op == '*':
                return exp_eval(lhs) * exp_eval(rhs)
     
            return "Error: I cannot do " + op
     
        return "Error: does not compute"
     
    if __name__ == '__main__':
        import doctest
        doctest.testmod(verbose = 1)


Then we reorganize the code so that there is a single point of return from the exp_eval function.
code/ExpTree/exptree_1.py

    """
    expression tree evaluator
     
    Evaluate an expression tree expr in the context of the symbols 
    defined in the symtab dictionary.
     
    We define an expression with a grammar
     
    <EXPR> - an expression
    <OP> - an operator '+', '*'
    <NUMBER> - any valid python number
     
    <EXPR> := <CONST_EXPR> | <FORMULA_EXPR>
    <CONST_EXPR> := <NUMBER> 
     
    <FORMULA_EXPR> := '(' <EXPR> <OP> <EXPR> ')'
     
    These expressions natually map onto trees
     
    Representing expressions as trees
     
    An expr tree can be a list of the form
        [ value ]
    representing a <CONST_EXPR>
     
    or a list of the form
        [ op, left, right ]
    representing a <FORMULA_EXPR>
    where 
        op is '+' or '*'
        left, right are expr
     
    """
     
    def exp_eval(expr):
        """  
     
        expr_eval(expr) <-- computes the value of expression tree expr
     
        >>> exp_eval( [ ] )
        'Error: does not compute'
     
        >>> exp_eval( [ 42 ] )
        42
     
        >>> exp_eval( [ '+',  [2], [3] ] )
        5
     
        >>> exp_eval( [ '*', [ '+', [1], [2]], ['+', [3], [4]] ] )
        21
     
        It works on things that are not numbers but understand +, *
     
        >>> exp_eval( [ "hello" ] )
        'hello'
     
        >>> exp_eval( [ '+',  [ "hello" ], [ "world"] ] )
        'helloworld'
        
        >>> exp_eval( [ '*',  [ "h" ], [ 3 ] ] )
        'hhh'
     
        Even on lists!
     
        >>> exp_eval( [ [ 42 ] ] )
        [42]
        
        >>> exp_eval( [ '+',  [ [1] ], [ [2, 3] ] ] )
        [1, 2, 3]
        
        >>> exp_eval( [ '*',  [ [1] ], [ 3 ] ] )
        [1, 1, 1]
     
        Being Pythonic (not Pythonesque?)
        If lists can be used to structure an expression, the tuples should
        also work.  The singleton tuple (1,) is syntactically ugly
     
        >>> exp_eval( ( '*', ( '+', (1,), (2,)), ('+', (3,), (4,)) ) )
        21
        
        But we need some more operations
        >>> exp_eval( [ '+',  [2], ['-', [1], [2]]  ] )
        5
     
        """
     
        val = None
     
        if len(expr) == 1:
            val = expr[0]
     
        elif len(expr) == 3:
            op = expr[0]
            lhs = expr[1]
            rhs = expr[2]
            if op == '+':
                val = exp_eval(lhs) + exp_eval(rhs)
            elif op == '*':
                val = exp_eval(lhs) * exp_eval(rhs)
            else:
                val = "Error: I cannot do " + op
     
        else:
            val = "Error: does not compute"
     
        return val
     
    if __name__ == '__main__':
        import doctest
        doctest.testmod(verbose = 1)



Then we replace error messages as "normal results" by the raising of exceptions. Also pay careful attention to the different ways that an expression tree can be represented, using lists, tuples, or even dictionaries!
code/ExpTree/exptree_2.py

    """
    expression tree evaluator
     
    Evaluate an expression tree expr in the context of the symbols 
    defined in the symtab dictionary.
     
    We define an expression with a grammar
     
    <EXPR> - an expression
    <OP> - an operator '+', '*'
    <NUMBER> - any valid python number
     
    <EXPR> := <CONST_EXPR> | <FORMULA_EXPR>
    <CONST_EXPR> := <NUMBER> 
     
    <FORMULA_EXPR> := '(' <EXPR> <OP> <EXPR> ')'
     
    These expressions natually map onto trees
     
    Representing expressions as trees
     
    An expr tree can be a list of the form
        [ value ]
    representing a <CONST_EXPR>
     
    or a list of the form
        [ op, left, right ]
    representing a <FORMULA_EXPR>
    where 
        op is '+' or '*'
        left, right are expr
     
    """
     
    def exp_eval(expr):
        """  
     
        expr_eval(expr) <-- computes the value of expression tree expr
     
        >>> exp_eval( [ ] )
        Traceback (most recent call last):
          ...
        TypeError: exp_eval: invoked with invalid expression formula tree
     
        >>> exp_eval( [ 42 ] )
        42
     
        >>> exp_eval( [ '+',  [2], [3] ] )
        5
     
        >>> exp_eval( [ '*', [ '+', [1], [2]], ['+', [3], [4]] ] )
        21
     
        It works on things that are not numbers but understand +, *
     
        >>> exp_eval( [ "hello" ] )
        'hello'
     
        >>> exp_eval( [ '+',  [ "hello" ], [ "world"] ] )
        'helloworld'
        
        >>> exp_eval( [ '*',  [ "h" ], [ 3 ] ] )
        'hhh'
     
        Even on lists!
     
        >>> exp_eval( [ [ 42 ] ] )
        [42]
        
        >>> exp_eval( [ '+',  [ [1] ], [ [2, 3] ] ] )
        [1, 2, 3]
        
        >>> exp_eval( [ '*',  [ [1] ], [ 3 ] ] )
        [1, 1, 1]
     
        Being Pythonic:
        If lists can be used to structure an expression, the tuples should
        also work.  The singleton tuple (1,) is syntactically ugly
     
        >>> exp_eval( ( '*', ( '+', (1,), (2,)), ('+', (3,), (4,)) ) )
        21
        
        Or mixed tuples and lists:
     
        >>> exp_eval( ( '*', [ '+', (1,), (2,)], ('+', [3,], (4,)) ) )
        21
     
        Or even hashes if we put them together to satify [ ] index syntax:
        >>> exp_eval( { 0: '*', 1: [ '+', (1,), { 0: 2}], 2: ('+', [3,], (4,)) } )
        21
     
     
        But we need some more operations
        >>> exp_eval( [ '+',  [2], ['-', [1], [2]]  ] )
        Traceback (most recent call last):
          ...
        SyntaxError: exp_eval: unknown operation '-'
     
        """
     
        val = None
     
        if len(expr) == 1:
            val = expr[0]
     
        elif len(expr) == 3:
            op = expr[0]
            lhs = expr[1]
            rhs = expr[2]
            if op == '+':
                val = exp_eval(lhs) + exp_eval(rhs)
            elif op == '*':
                val = exp_eval(lhs) * exp_eval(rhs)
            else:
                raise SyntaxError("exp_eval: unknown operation '" + op + "'")
     
        else:
            raise TypeError(
                "exp_eval: invoked with invalid expression formula tree")
     
        return val
     
    if __name__ == '__main__':
        import doctest
        doctest.testmod(verbose = 1)


This is an example of using the try statement to catch any raised exceptions.
code/ExpTree/try_0.py

    import sys
    from exptree_2 import *
     
    e1 = [ '*', [ '+', [1], [2] ], [ '+', [3], [4] ] ]
     
     
    print("Attempting to evaluate " + str(e1))
    try:
        v = exp_eval(e1)
        print(v)
        x = 1 / 0
     
    except: 
        print("Got an exception:", sys.exc_info()[1]) 
        raise
     


In this example we introduce two new functions, exp_str_inorder and exp_str_postorder. They walk the tree and produce not a final value, but instead produce a string that is the expression in normal infix form or in postfix form. This process of producing a string that looks like the oriinal formula encoded by the expression tree is called unparsing.
code/ExpTree/exptree_3.py

    """
    expression tree evaluator
     
    Evaluate an expression tree expr in the context of the symbols 
    defined in the symtab dictionary.
     
    We define an expression with a grammar
     
    <EXPR> - an expression
    <OP> - an operator '+', '*'
    <NUMBER> - any valid python number
     
    <EXPR> := <CONST_EXPR> | <FORMULA_EXPR>
    <CONST_EXPR> := <NUMBER> 
     
    <FORMULA_EXPR> := '(' <EXPR> <OP> <EXPR> ')'
     
    These expressions natually map onto trees
     
    Representing expressions as trees
     
    An expr tree can be a list of the form
        [ value ]
    representing a <CONST_EXPR>
     
    or a list of the form
        [ op, left, right ]
    representing a <FORMULA_EXPR>
    where 
        op is '+' or '*'
        left, right are expr
     
    """
     
    def exp_eval(expr):
        """  
     
        expr_eval(expr) <-- computes the value of expression tree expr
     
        >>> exp_eval( [ ] )
        Traceback (most recent call last):
          ...
        TypeError: exp_eval: invoked with invalid expression formula tree
     
        >>> exp_eval( [ 42 ] )
        42
     
        >>> exp_eval( [ '+',  [2], [3] ] )
        5
     
        >>> exp_eval( [ '*', [ '+', [1], [2]], ['+', [3], [4]] ] )
        21
     
        It works on things that are not numbers but understand +, *
     
        >>> exp_eval( [ "hello" ] )
        'hello'
     
        >>> exp_eval( [ '+',  [ "hello" ], [ "world"] ] )
        'helloworld'
        
        >>> exp_eval( [ '*',  [ "h" ], [ 3 ] ] )
        'hhh'
     
        Even on lists!
     
        >>> exp_eval( [ [ 42 ] ] )
        [42]
        
        >>> exp_eval( [ '+',  [ [1] ], [ [2, 3] ] ] )
        [1, 2, 3]
        
        >>> exp_eval( [ '*',  [ [1] ], [ 3 ] ] )
        [1, 1, 1]
     
        Being Pythonic (not Pythonesque?)
        If lists can be used to structure an expression, the tuples should
        also work.  The singleton tuple (1,) is syntactically ugly
     
        >>> exp_eval( ( '*', ( '+', (1,), (2,)), ('+', (3,), (4,)) ) )
        21
        
        But we need some more operations
        >>> exp_eval( [ '+',  [2], ['-', [1], [2]]  ] )
        Traceback (most recent call last):
          ...
        SyntaxError: exp_eval: unknown operation '-'
     
        """
     
        val = None
     
        if len(expr) == 1:
            val = expr[0]
     
        elif len(expr) == 3:
            op = expr[0]
            lhs = expr[1]
            rhs = expr[2]
            if op == '+':
                val = exp_eval(lhs) + exp_eval(rhs)
            elif op == '*':
                val = exp_eval(lhs) * exp_eval(rhs)
            else:
                raise SyntaxError("exp_eval: unknown operation '" + op + "'")
     
        else:
            raise TypeError(
                "exp_eval: invoked with invalid expression formula tree")
     
        return val
     
    def exp_str_inorder(expr):
        """
        exp_str_inorder(expr) <-- display expression tree in in-order traversal
     
        >>> exp_str_inorder(['*', [ '+', [1], [2]], [ '*', [2], [ '+', [40], [42]]]])
        '((1+2)*(2*(40+42)))'
     
        """
     
        val = ""
     
        if len(expr) == 1:
            val = str(expr[0])
     
        elif len(expr) == 3:
            op = expr[0]
            lhs = expr[1]
            rhs = expr[2]
            val = "(" + exp_str_inorder(lhs) + op + exp_str_inorder(rhs) + ")"
            
        else:
            raise TypeError(
                "exp_str_inorder: invoked with invalid expression formula tree")
     
        return val
     
    def exp_str_postorder(expr):
        """
        exp_str_postorder(expr) <-- display expression tree in post-order traversal
     
        >>> exp_str_postorder(['*', [ '+', [1], [2]], [ '*', [2], [ '+', [40], [42]]]])
        '1 2 + 2 40 42 + * *'
     
        """
     
        val = ""
     
        if len(expr) == 1:
            val = str(expr[0])
     
        elif len(expr) == 3:
            op = expr[0]
            lhs = expr[1]
            rhs = expr[2]
            val = exp_str_postorder(lhs) + " " + exp_str_postorder(rhs) + " " + op
            
        else:
            raise TypeError(
                "exp_str_postorder: invoked with invalid expression formula tree")
     
        return val
     
    if __name__ == '__main__':
        import doctest
        doctest.testmod(verbose = 1)


This is an example of using the try statement to catch any raised exceptions, and also limiting the scope of the try so as to not catch any additional exceptions that we are not interested in handling.
code/ExpTree/try_1.py

    import sys
    from exptree_3 import *
     
    e1 = [ '*', [ '+', [1], [2] ], [ '+', [3], [4] ] ]
     
     
    print("Attempting to evaluate " + str(e1))
    try:
        v = exp_eval(e1)
     
    except: 
        print("Got an exception:", sys.exc_info()[1]) 
        # raise
     
    else:
        print( exp_str_inorder(e1) + " is " + str(v) )
        print( exp_str_postorder(e1) + " is " + str(v) )
     

20. Binary trees
Tangible Computing / Version 3.20 2013-03-25