Tangible Computing - 17. Python

Tangible Computing

17. Python

In this course we use imperative programming languages. An imperative language (also called operational or stateful) is one in which the program mentions state directly, allows side-effects, and the meaning of a program is closely related to how it runs. C is the classic example of an imperative language.

Our next excursion into programming languages will be with Python. The main reason for using Python is that although it is also an imperative language, it occupies the other end of the spectrum from C: Python is interpreted, Object-Oriented, and loosely typed.

By being familiar with C and Python, you will find it much easier to understand the other imperative languages such as C++, Java, Perl, COBOL (don't laugh) and so on. The following list gives some of the characteristics of Python that we will be using. They are good in some situations, and not desirable in others.

Python is an interpreted language, not compiled. This means that much of what is involved in executing a program occurs just before or during run time. This means that, for example, a program can create and execute new methods while running.
Python is consistently Object-Oriented. Every thing you manipulate is an object, and classe are themselves objects.
Python is a dynamically typed language. That is, the same variables can be bound to different kinds of objects during the execution of a program. If a variable is used in a way that is not compatible with the object it is bound to, then a runtime exception is raised. This is in contrast to statically typed which often has compile-time checking to make sure that you cannot, for example, assign a string value to a integer typed variable.

17.1 The Python Language and Way

We have covered all sorts of language aspects of python. A great overview is the python tutorial.

An excellent site for learning Python is the Waterloo Circles site

This is the recommended style guide for Python: Python Enhancement Proposal 8

And here is another article of many on the web about doctest http://www.doughellmann.com/PyMOTW/doctest/

We talked about the types: int, float, str, list, tuple, dict, and class and objects.
We talked about the flow control commands: if-elif-else, while, for, and def.
We talked about modules and the commands: import, and from X import Y.
We talked about doctest.testmod for testing modules, by checking their docstrings for tests.

17.2 What does assignment mean?

When we began programming in C one of the first operations we encountered was assignment. We described it operationally in terms of what was going on in the hardware. When we write x = 42; and say "assign 42 to x" we mean the machine takes the binary representation of 42 and puts those bits into the memory location named x.

Later on we introduced pointers, where pointers were memory locations that contained the address of the location we really wanted to manipulate. Thus if we had a pointer p to x, p = &x, then we could manipulate x by dereferencing p, as in *p = *p + 1 to add one to x. We say that p references or points to x. Often when we talk about pointer operations we will say "assign p to point to x". The good thing about explicitly having pointers is that it is clear when we are manipulating the actual value of the pointer p, and when we are manipulating the thing pointed to by p (as with *p or p[2]). It becomes clear very quickly that when p and q point to the same thing, then any manipulation of *p is also a manipulation of *q. That is, p and q are aliases for the same thing.

In the Python world, all variables are references to some object. So when you see p = x + 42 you do not read it as "the result of adding 42 to the value of x is assigned to p". Instead, it is better to think of it as "assign p to reference the thing that results from adding 42 to the value of x". So a statement like p = q for pointers p and q means to make p and q reference the same thing.

Here is the typical Python illustration of this:

>>> a = [1, 2, 3, 4] >>> b = a >>> b [1, 2, 3, 4] >>> b[1] = 42 >>> b [1, 42, 3, 4] >>> a [1, 42, 3, 4]

The expression b[1] = 42 and a[1] = 42 in this example do the same thing. They ask the list, referenced by both a and b to update its element at posiiton 1 to reference the constant 42. Note the wording here. Elements of lists are also references. Lists are mutable, in the sense that the list can be modified by adding or removing elements, and the actual element can be modified to reference a new value.

In python, if you really want to check that two variables reference the same object you test if p is q. If you want to see if they are equal, then use p == q.

>>> p = [1, 2, 3] >>> q = [1, 2, 3] >>> p is q False >>> p == q True >>> r = p >>> r is p True >>> r == p True

SIDE NOTE: conceptually there is only one instance of any given constant, like 1, (1, 2), or 3**100. But for implementation efficiency, the interpreter will keep around copies of constants, otherwise when doing something like a = 3**23 it would need to search through all the current constants to locate 3**23. For example

>>> x = 2 ** 23 >>> y = 2 ** 23 >>> x is y False >>> x == y True

17.3 Our First Python Examples

We developed a simple routine to convert back and forth into binary representations of non-negative integers (the naturals). The goal was to introduce you to test-driven development using doctest.

code/Python/bin.py

def tobin(x):

"""

Convert integer to binary representation

Precondition: x >= 0

Examples:

>>> tobin(6)

[1, 1, 0]

>>> tobin(0)

[0]

"""

if x <= 0: return [0]

b = []

while x:

b.append(x % 2)

x = x // 2

b.reverse()

return b

def toint(b):

"""

Convert binary representation to integer

Precondition: entries in list b are 0 and 1

>>> toint([1, 1, 0])

6

>>> toint([0, 1, 1, 0])

6

>>> toint([0])

0

>>> toint([])

0

"""

result = 0

for digit in b:

result = result*2 + digit

return result

# to run tests do:

# python3 -m doctest -v bin.py

if __name__ == "__main__":

import doctest

doctest.testmod()

We then looked at how we could add more strict checks on the input to our functions - primarily as a way of introducing the exceptions and how to raise them.

code/Python/bin-exception.py

def tobin(x):

"""

Convert integer to binary representation

Precondition: x >= 0

Examples:

>>> tobin(6)

[1, 1, 0]

>>> tobin(0)

[0]

"""

if x <= 0: return [0]

b = []

while x:

b.append(x % 2)

x = x // 2

b.reverse()

return b

def toint(b):

"""

Convert binary representation to integer

Precondition: entries in list b are 0 and 1

>>> toint([1, 1, 0])

6

>>> toint([0, 1, 1, 0])

6

>>> toint([0])

0

>>> toint([])

0

>>> toint([19])

Traceback (most recent call last):

...

Exception: Bad digit 19

19

# unexpected good behaviour on a string!

>>> toint("100")

4

Lists do not need to contain the same type of data in each position

>>> toint([1, "0", 1, "1"])

11

"""

result = 0

for digit in b:

d = int(digit)

if d < 0 or d > 1:

raise Exception("Bad digit " + str(digit))

result = result*2 + int(digit)

return result

# to run tests do:

# python3 bin.py -v

if __name__ == "__main__":

import doctest

doctest.testmod()

We finished by doing an execise where you implemented two new functions to handle arbitrary base representations of non-negative integers. One function took an integer X, and a base B, and returned the base B representation of X as a list of digits. Each digit was in the range 0 ... (B-1), and they were ordered in from most to least signifiant. The other function took a list of digits and a base B and returned the corresponding integer. Note that it was simply evaluating a polynomial in B with coefficients from the list.

17.4 Object Oriented Notions

As we begin object-oriented programming, it is important to establish some common terminology.

Objects

An object is a thing with:
- identity - it can be distinguished from other objects. Typically, somewhere, every object has a unique id.
- state - it can have a set of values associated with it.
- behaviour - it reacts to messages by changing state and by sending messages to other objects.
The state of an object is in the form of attributes or properties. The current value associated with each attribute defines the state of an object. Typically the possible attributes of an object are defined by the class it belongs to.
Messages and Methods

A message is a communication with another object or class. Messages are sent, received, and responded to. The sending of a message is often called an event.

A method is a particular mechanism for implementing messages. This mechanism looks very much like a function call. Methods are invoked on objects and classes. The methods that handle a message are defined in the class an object belongs to. Which particular method is invoked to handle a particular message is determined by the message dispatch mechanism.
Classes

Objects are organized into classes. A class is an abstraction of the characteristics of the objects in the class. That is, the class is the general concept and the objects are particular instances.

Each class is responsible for managing the objects in it by doing the following three activities:
- construction - creating a new object from the class. Objects are instances of a class.
- querying - locating objects with certain properties among all the live instances of a class.
- destruction - destroying an object that is no longer required
An object has a life-cycle consisting of construction, use, and destruction.

Classes, like objects, have identity and state, and can respond to messages. In particular, the three management activities above are performed by sending messages to the class, not to an object. In some languages such as Smalltalk and Python, classes are also objects.
Inheritance Classes also exhibit inheritance. Inheritance in the OO term for the ability to introduce a new class by extending an existing class. Actually, inheritance is a poor term for this mechanism. We are really simply defining the IS-A relationship between classes. ClassSub ISA ClassSuper means every object of ClassSub is also an instance of ClassSuper. Thus all behaviours of ClassSuper are ascribed to ClassSub objects, even though it is not explicitly declared to be a member of ClassSuper . Historically, we say that instances of ClassSub inherit the properties and behaviour of ClassSuper.

We will explore inheritance later.

17.5 A simple example object

The constructor is named with the same name as the class. You need to write the initializer function __init__
Every object method has a first argument, self, that is the object on which the method is being called. So, whenever you have something like
p.dothis(x)
the method is called with
dothis(p, x)
Docstrings at the beginning of the class describe what the class does, and provides tests for doctest.

code/Python/simple.py

"""

Module for a simple class

"""

class Simple:

"""

Simple Illustrative Class

Each instance has two integer fields, x and y, which are

initialized to 0.

There is a simple increment method for simultaneoulsy changing

the values of x and y

>>> s = Simple()

>>> s.x == 0 and s.y == 0

True

>>> s.inc(1, 0)

>>> s.x == 1 and s.y == 0

True

>>> s.inc(0, 2)

>>> s.x == 1 and s.y == 2

True

>>> s.inc(2, 2)

>>> s.hyp()

5.0

"""

# initializer part of the constructor for instances of the Simple class

def __init__(self):

self.x = 0

self.y = 0

def inc(self, dx, dy):

"""

Increment x by dx, and y by dy

"""

self.x += dx

self.y += dy

def hyp(self):

"""

Compute the length of the hypotenuse of the triangle with sides

of length x and y.

"""

return (self.x*self.x + self.y*self.y) ** 0.5

# to run the tests: python3 simple.py -v

if __name__ == "__main__":

import doctest

doctest.testmod()

17.6 Converting the hash table into a dictionary

In class we developed a simple version of a dictionary that associates a key with some value. Here are the results.

Note: Python has dictionaries built in, so you don't need to implement them!

The first version just uses a simple list of (key, item) tuples. Then we modify that version so that there are a number of buckets, each of which contains a shorter list to search. A hash fuction tells us in what bucket to look for a matching key.

code/Python/dict-1.py

class Dict:

"""

Simple Dictionary class that stores (key, value) pairs

key and value can be any object

d = Dict(size_hint) returns a dictionary object that will have roughly

size_hint entries in it.

d.insert(key, item) inserts item into the dictionary with the given key.

If there is an entry with the same key, the effect is undefined.

d.lookup(key) returns an item from the dictionary with the corresponding

key, or None if there is no item with that key

>>> d = Dict(5)

>>> key = 2

>>> item = (2, "fred")

>>> d.lookup(0) is None

True

>>> d.insert(key, item)

>>> d.lookup(key) == item

True

"""

def __init__(self, size_hint):

self.size = size_hint

# list of (key, value) tuples

self.kvpairs = [ ]

def insert(self, key, item):

"""

self.kvpairs.append( (key, item) )

def lookup(self, key):

# each member of kvpairs is (key, val), so you can assign it

# to a tuple and unpack it.

for (k, v) in self.kvpairs:

if k == key:

return v

return None

if __name__ == "__main__":

import doctest

doctest.testmod()

code/Python/dict-2.py

class Dict:

"""

Simple Dictionary class that stores (key, value) pairs

key and value can be any object

d = Dict(size_hint) returns a dictionary object that will have roughly

size_hint entries in it.

d.insert(key, item) inserts item into the dictionary with the given key.

If there is an entry with the same key, the effect is undefined.

d.lookup(key) returns an item from the dictionary with the corresponding

key, or None if there is no item with that key

>>> d = Dict(5)

>>> key = 2

>>> item = (2, "fred")

>>> d.lookup(0) is None

True

>>> d.insert(key, item)

>>> d.lookup(key) == item

True

"""

def __init__(self, size_hint):

# in this implementation there are multiple buckets

# each with a list of (key, value) tuples

# the key is mapped to the corresponding bucket by a hash

# function

self.size = size_hint

# manufacture a list of self.size empty lists

self.buckets = [ [] ] * self.size

def _hash_fn(self, key):

"""

A bad hash function that takes a key and hashes it into

an integer mod self.size.

>>> d = Dict(256)

>>> d._hash_fn('A') == ord('A')

True

>>> d._hash_fn('A') != d._hash_fn('B')

True

>>> d._hash_fn('AB') == (ord('A') + ord('B')) % 256

True

>>> d._hash_fn('AB') != d._hash_fn('BA')

False

"""

r = 0

for c in str(key):

r = (r + ord(c)) % self.size

return r

def insert(self, key, item):

hash = self._hash_fn(key)

self.buckets[hash].append( (key, item) )

def lookup(self, key):

hash = self._hash_fn(key)

for (k, v) in self.buckets[hash]:

if k == key:

return v

return None

if __name__ == "__main__":

import doctest

doctest.testmod()

17.7 Lists and References

When you assign a variable that references a list to another variable, as in the examples above, you do not make a copy of the original list.

If you really need a copy s of a list p , you use the idiom s = p[:] which does a shallow copy of the values in p into a new list s. That is, it creates a new list s that contains the same values as the original list p. This way if you modify an element of p it does not effect the corresponding one in s.

>>> p = [1, 2, 3] >>> s = p[:] >>> p is s False >>> p == s True >>> p [1, 2, 3] >>> s [1, 2, 3] >>> p[0] = 42 >>> p [42, 2, 3] >>> s [1, 2, 3]

But, a reference is just a value, so a list can have an element that is a reference to a list. In this case we usually say that a list can have an element that is a list. This can have surprising effects when you make shallow copies of lists.

>>> p = [ [1], [2], [3, 4] ] >>> p [[1], [2], [3, 4]] >>> q = p[:] >>> q [[1], [2], [3, 4]] >>> p is q False >>> p.append([11, 22, 33]) >>> p [[1], [2], [3, 4], [11, 22, 33]] >>> q [[1], [2], [3, 4]] >>> p[1].append(42) >>> p [[1], [2, 42], [3, 4]] >>> q [[1], [2, 42], [3, 4]] >>> p[0] = 'fred' >>> p ['fred', [2, 42], [3, 4]] >>> q [[1], [2, 42], [3, 4]]

If you really want a complete copy with no shared mutable values you need to perform a deep copy. Depending on your application, you might want to do something inbetween a shallow and a deep copy. The Python copy module supports various kinds of copying for data structures.

If we think just about lists, the intuition for deep copying is that to deep copy a list, you first recursively deep copy each of the elements of the list and then put them together into a new list. Deep copying is non-trivial, because for example a list can contain a reference to itself. A deep copy had to handle this without recursing forever.

>>> l = [1, 2] >>> l.append(l) >>> l [1, 2, [...]]

Here is are three, increasingly more capable, versions of deepcopy. You might want to revist these examples later after you have done more Python!

code/Python/deepcopy.py

def deepcopy(a):

"""

Deep copies lists.

"""

try:

b = []

for i in a:

b.append(deepcopy(i))

return b

except TypeError:

return a

def deepcopy2(a, memo=None):

"""

Deep copies lists. Handles recursive lists.

"""

if not memo:

memo = set()

if id(a) in memo:

return a

else:

memo.add(id(a))

try:

b = []

for i in a:

b.append(deepcopy2(i, memo))

return b

except TypeError:

return a

def deepcopy3(a, memo=None):

"""

Deep copies non-dictionary containers (tuples, lists, strings, etc.).

Handles recursive data structures too.

"""

if not memo:

memo = set()

# Check if we've already copied this exact object (based on its id)

if id(a) in memo:

return a

else:

memo.add(id(a))

# Check if the thing is immutable. If so, we can just return it.

try:

hash(a)

return a

except:

pass

# Otherwise we'll copy it as a list.

try:

b = []

for i in a:

b.append(deepcopy3(i, memo))

return a.__class__(b)

except TypeError:

return a

17.8 Lists that contain themselves

Finally, since a list is just a reference, a list can contain itself!

>>> l = [1, 2, 3] >>> l [1, 2, 3] >>> l.append(l) >>> l [1, 2, 3, [...]] >>> l[3] [1, 2, 3, [...]] >>> l[3][3] [1, 2, 3, [...]]

And the list can be in itself multiple times. Here is an infinite binary tree of 1's:

>>> t = [] >>> t.append(t) >>> t.append(1) >>> t.append(t) >>> t [[...], 1, [...]] >>> t[0][2] [[...], 1, [...]]

17. Python
Tangible Computing / Version 3.20 2013-03-25