Tangible Computing
17. Python


In this course we use imperative programming languages. An imperative language (also called operational or stateful) is one in which the program mentions state directly, allows side-effects, and the meaning of a program is closely related to how it runs. C is the classic example of an imperative language.

Our next excursion into programming languages will be with Python. The main reason for using Python is that although it is also an imperative language, it occupies the other end of the spectrum from C: Python is interpreted, Object-Oriented, and loosely typed.

By being familiar with C and Python, you will find it much easier to understand the other imperative languages such as C++, Java, Perl, COBOL (don't laugh) and so on. The following list gives some of the characteristics of Python that we will be using. They are good in some situations, and not desirable in others.

17.1 The Python Language and Way

We have covered all sorts of language aspects of python. A great overview is the python tutorial.

An excellent site for learning Python is the Waterloo Circles site

This is the recommended style guide for Python: Python Enhancement Proposal 8

And here is another article of many on the web about doctest http://www.doughellmann.com/PyMOTW/doctest/



17.2 What does assignment mean?

When we began programming in C one of the first operations we encountered was assignment. We described it operationally in terms of what was going on in the hardware. When we write x = 42; and say "assign 42 to x" we mean the machine takes the binary representation of 42 and puts those bits into the memory location named x.

Later on we introduced pointers, where pointers were memory locations that contained the address of the location we really wanted to manipulate. Thus if we had a pointer p to x, p = &x, then we could manipulate x by dereferencing p, as in *p = *p + 1 to add one to x. We say that p references or points to x. Often when we talk about pointer operations we will say "assign p to point to x". The good thing about explicitly having pointers is that it is clear when we are manipulating the actual value of the pointer p, and when we are manipulating the thing pointed to by p (as with *p or p[2]). It becomes clear very quickly that when p and q point to the same thing, then any manipulation of *p is also a manipulation of *q. That is, p and q are aliases for the same thing.

In the Python world, all variables are references to some object. So when you see p = x + 42 you do not read it as "the result of adding 42 to the value of x is assigned to p". Instead, it is better to think of it as "assign p to reference the thing that results from adding 42 to the value of x". So a statement like p = q for pointers p and q means to make p and q reference the same thing.

Here is the typical Python illustration of this:
>>> a = [1, 2, 3, 4]
>>> b = a
>>> b
[1, 2, 3, 4]
>>> b[1] = 42
>>> b
[1, 42, 3, 4]
>>> a
[1, 42, 3, 4]
The expression b[1] = 42 and a[1] = 42 in this example do the same thing. They ask the list, referenced by both a and b to update its element at posiiton 1 to reference the constant 42. Note the wording here. Elements of lists are also references. Lists are mutable, in the sense that the list can be modified by adding or removing elements, and the actual element can be modified to reference a new value.

In python, if you really want to check that two variables reference the same object you test if p is q. If you want to see if they are equal, then use p == q.
>>> p = [1, 2, 3]
>>> q = [1, 2, 3]
>>> p is q
False
>>> p == q
True
>>> r = p
>>> r is p
True
>>> r == p
True


SIDE NOTE: conceptually there is only one instance of any given constant, like 1, (1, 2), or 3**100. But for implementation efficiency, the interpreter will keep around copies of constants, otherwise when doing something like a = 3**23 it would need to search through all the current constants to locate 3**23. For example
>>> x = 2 ** 23
>>> y = 2 ** 23
>>> x is y
False
>>> x == y
True


17.3 Our First Python Examples

We developed a simple routine to convert back and forth into binary representations of non-negative integers (the naturals). The goal was to introduce you to test-driven development using doctest.

code/Python/bin.py

     
    def tobin(x):
        """
        Convert integer to binary representation
        Precondition: x >= 0
     
        Examples:
     
        >>> tobin(6) 
        [1, 1, 0]
     
        >>> tobin(0)
        [0]
     
        """
     
        if x <= 0: return [0]
        b = []
        while x:
            b.append(x % 2)
            x = x // 2
     
        b.reverse()
        return b
     
    def toint(b):
        """
        Convert binary representation to integer
        Precondition: entries in list b are 0 and 1
     
        >>> toint([1, 1, 0]) 
        6
     
        >>> toint([0, 1, 1, 0])
        6
     
        >>> toint([0])
        0
     
        >>> toint([])
        0
     
        """
     
        result = 0
     
        for digit in b:
            result = result*2 + digit
        return result
     
     
    # to run tests do:
    # python3 -m doctest -v bin.py 
    if __name__ == "__main__":
        import doctest
        doctest.testmod()


We then looked at how we could add more strict checks on the input to our functions - primarily as a way of introducing the exceptions and how to raise them.

code/Python/bin-exception.py

     
    def tobin(x):
        """
        Convert integer to binary representation
        Precondition: x >= 0
     
        Examples:
     
        >>> tobin(6) 
        [1, 1, 0]
     
        >>> tobin(0)
        [0]
     
        """
     
        if x <= 0: return [0]
        b = []
        while x:
            b.append(x % 2)
            x = x // 2
     
        b.reverse()
        return b
     
    def toint(b):
        """
        Convert binary representation to integer
        Precondition: entries in list b are 0 and 1
     
        >>> toint([1, 1, 0]) 
        6
     
        >>> toint([0, 1, 1, 0])
        6
     
        >>> toint([0])
        0
     
        >>> toint([])
        0
     
        >>> toint([19])
        Traceback (most recent call last):
        ...
        Exception: Bad digit 19
     
        19
     
        # unexpected good behaviour on a string!
        >>> toint("100")
        4
     
        Lists do not need to contain the same type of data in each position
        >>> toint([1, "0", 1, "1"])
        11
     
        """
     
        result = 0
     
        for digit in b:
            d = int(digit)
            if d < 0 or d > 1:
                raise Exception("Bad digit " + str(digit))
     
            result = result*2 + int(digit)
        return result
     
     
    # to run tests do:
    # python3 bin.py -v
    if __name__ == "__main__":
        import doctest
        doctest.testmod()


We finished by doing an execise where you implemented two new functions to handle arbitrary base representations of non-negative integers. One function took an integer X, and a base B, and returned the base B representation of X as a list of digits. Each digit was in the range 0 ... (B-1), and they were ordered in from most to least signifiant. The other function took a list of digits and a base B and returned the corresponding integer. Note that it was simply evaluating a polynomial in B with coefficients from the list.

17.4 Object Oriented Notions

As we begin object-oriented programming, it is important to establish some common terminology.

17.5 A simple example object

code/Python/simple.py

    """
    Module for a simple class
    """
     
    class Simple:
        """
        Simple Illustrative Class
     
        Each instance has two integer fields, x and y, which are 
        initialized to 0. 
     
        There is a simple increment method for simultaneoulsy changing
        the values of x and y
     
        >>> s = Simple()
        >>> s.x == 0 and s.y == 0
        True
     
        >>> s.inc(1, 0)
        >>> s.x == 1 and s.y == 0
        True
     
        >>> s.inc(0, 2)
        >>> s.x == 1 and s.y == 2
        True
     
        >>> s.inc(2, 2)
        >>> s.hyp()
        5.0
     
        """
     
        # initializer part of the constructor for instances of the Simple class
        def __init__(self):
            self.x = 0
            self.y = 0
     
        def inc(self, dx, dy):
            """
            Increment x by dx, and y by dy
     
            """
            self.x += dx
            self.y += dy
     
        def hyp(self):
            """
            Compute the length of the hypotenuse of the triangle with sides
            of length x and y.
            """
            return (self.x*self.x + self.y*self.y) ** 0.5
     
    # to run the tests: python3 simple.py -v
    if __name__ == "__main__":
        import doctest
        doctest.testmod()
        


17.6 Converting the hash table into a dictionary

In class we developed a simple version of a dictionary that associates a key with some value. Here are the results.

Note: Python has dictionaries built in, so you don't need to implement them!

The first version just uses a simple list of (key, item) tuples. Then we modify that version so that there are a number of buckets, each of which contains a shorter list to search. A hash fuction tells us in what bucket to look for a matching key.

code/Python/dict-1.py

    class Dict:
        """
        Simple Dictionary class that stores (key, value) pairs
        key and value can be any object
     
        d = Dict(size_hint) returns a dictionary object that will have roughly
        size_hint entries in it.
     
        d.insert(key, item) inserts item into the dictionary with the given key.
        If there is an entry with the same key, the effect is undefined.
     
        d.lookup(key) returns an item from the dictionary with the corresponding
        key, or None if there is no item with that key
     
        >>> d = Dict(5)
        >>> key = 2
        >>> item = (2, "fred")
     
        >>> d.lookup(0) is None
        True
     
        >>> d.insert(key, item)
        >>> d.lookup(key) == item
        True
        """
     
        def __init__(self, size_hint):
            self.size = size_hint
            # list of (key, value) tuples
            self.kvpairs = [ ]
     
        def insert(self, key, item):
            """
            
            """
            self.kvpairs.append( (key, item) )
     
        def lookup(self, key):
            # each member of kvpairs is (key, val), so you can assign it
            # to a tuple and unpack it.
            for (k, v) in self.kvpairs:
                if k == key:
                    return v
            return None
        
    if __name__ == "__main__":
        import doctest
        doctest.testmod()


code/Python/dict-2.py

    class Dict:
        """
        Simple Dictionary class that stores (key, value) pairs
        key and value can be any object
     
        d = Dict(size_hint) returns a dictionary object that will have roughly
        size_hint entries in it.
     
        d.insert(key, item) inserts item into the dictionary with the given key.
        If there is an entry with the same key, the effect is undefined.
     
        d.lookup(key) returns an item from the dictionary with the corresponding
        key, or None if there is no item with that key
     
        >>> d = Dict(5)
        >>> key = 2
        >>> item = (2, "fred")
     
        >>> d.lookup(0) is None
        True
     
        >>> d.insert(key, item)
        >>> d.lookup(key) == item
        True
        """
     
        def __init__(self, size_hint):
            # in this implementation there are multiple buckets
            # each with a list of (key, value) tuples
            # the key is mapped to the corresponding bucket by a hash
            # function
            self.size = size_hint
     
            # manufacture a list of self.size empty lists
            self.buckets = [ [] ] * self.size
     
        def _hash_fn(self, key):
            """
            A bad hash function that takes a key and hashes it into
            an integer mod self.size.
     
            >>> d = Dict(256)
            >>> d._hash_fn('A') == ord('A')
            True
     
            >>> d._hash_fn('A') != d._hash_fn('B')
            True
     
            >>> d._hash_fn('AB') == (ord('A') + ord('B')) % 256
            True
     
            >>> d._hash_fn('AB') != d._hash_fn('BA')
            False
            """
     
            r = 0
            for c in str(key):
                r = (r + ord(c)) % self.size
            return r
     
        def insert(self, key, item):
            hash = self._hash_fn(key)
            self.buckets[hash].append( (key, item) )
     
        def lookup(self, key):
            hash = self._hash_fn(key)
            for (k, v) in self.buckets[hash]:
                if k == key:
                    return v
            return None
        
    if __name__ == "__main__":
        import doctest
        doctest.testmod()


17.7 Lists and References

When you assign a variable that references a list to another variable, as in the examples above, you do not make a copy of the original list.

If you really need a copy s of a list p , you use the idiom s = p[:] which does a shallow copy of the values in p into a new list s. That is, it creates a new list s that contains the same values as the original list p. This way if you modify an element of p it does not effect the corresponding one in s.
>>> p = [1, 2, 3]
>>> s = p[:]
>>> p is s
False
>>> p == s
True
>>> p
[1, 2, 3]
>>> s
[1, 2, 3]
>>> p[0] = 42
>>> p
[42, 2, 3]
>>> s
[1, 2, 3]


But, a reference is just a value, so a list can have an element that is a reference to a list. In this case we usually say that a list can have an element that is a list. This can have surprising effects when you make shallow copies of lists.
>>> p = [ [1], [2], [3, 4] ]
>>> p
[[1], [2], [3, 4]]
>>> q = p[:]
>>> q
[[1], [2], [3, 4]]
>>> p is q
False
>>> p.append([11, 22, 33])
>>> p
[[1], [2], [3, 4], [11, 22, 33]]
>>> q
[[1], [2], [3, 4]]
>>> p[1].append(42)
>>> p
[[1], [2, 42], [3, 4]]
>>> q
[[1], [2, 42], [3, 4]]
>>> p[0] = 'fred'
>>> p
['fred', [2, 42], [3, 4]]
>>> q
[[1], [2, 42], [3, 4]]
If you really want a complete copy with no shared mutable values you need to perform a deep copy. Depending on your application, you might want to do something inbetween a shallow and a deep copy. The Python copy module supports various kinds of copying for data structures.

If we think just about lists, the intuition for deep copying is that to deep copy a list, you first recursively deep copy each of the elements of the list and then put them together into a new list. Deep copying is non-trivial, because for example a list can contain a reference to itself. A deep copy had to handle this without recursing forever.
>>> l = [1, 2]
>>> l.append(l)
>>> l
[1, 2, [...]]
Here is are three, increasingly more capable, versions of deepcopy. You might want to revist these examples later after you have done more Python!

code/Python/deepcopy.py

    def deepcopy(a):
        """
        Deep copies lists.
        """
        try:
            b = []
            for i in a:
                b.append(deepcopy(i))
            return b
        except TypeError:
            return a
     
    def deepcopy2(a, memo=None):
        """
        Deep copies lists.  Handles recursive lists.
        """
        if not memo:
            memo = set()
        
        if id(a) in memo:
            return a
        else:
            memo.add(id(a))
     
        try:
            b = []
            for i in a:
                b.append(deepcopy2(i, memo))
            return b
        except TypeError:
            return a
     
    def deepcopy3(a, memo=None):
        """
        Deep copies non-dictionary containers (tuples, lists, strings, etc.).
        Handles recursive data structures too.
        """
        if not memo:
            memo = set()
     
        # Check if we've already copied this exact object (based on its id)
        if id(a) in memo:
            return a
        else:
            memo.add(id(a))
     
        # Check if the thing is immutable.  If so, we can just return it.
        try:
            hash(a)
            return a
        except:
            pass
     
        # Otherwise we'll copy it as a list.
        try:
            b = []
            for i in a:
                b.append(deepcopy3(i, memo))
            return a.__class__(b)
        except TypeError:
            return a


17.8 Lists that contain themselves

Finally, since a list is just a reference, a list can contain itself!
>>> l = [1, 2, 3]
>>> l
[1, 2, 3]
>>> l.append(l)
>>> l
[1, 2, 3, [...]]
>>> l[3]
[1, 2, 3, [...]]
>>> l[3][3]
[1, 2, 3, [...]]
And the list can be in itself multiple times. Here is an infinite binary tree of 1's:
>>> t = []
>>> t.append(t)
>>> t.append(1)
>>> t.append(t)
>>> t
[[...], 1, [...]]
>>> t[0][2]
[[...], 1, [...]]

17. Python
Tangible Computing / Version 3.20 2013-03-25