Python Training at VMWare Bangalore - Day 2

Aug 16-18, 2017
Anand Chitipothu & Vikrant Patil

These notes are available online at http://notes.pipal.in/2017/vmware-advpy

© Pipal Academy LLP

Decorators (Contd...)

Let's look at the fib example again.

In [1]:
!python fib.py 5
8
In [2]:
!DEBUG=true python fib.py 5
|-- fib (5,)
| |-- fib (4,)
| | |-- fib (3,)
| | | |-- fib (2,)
| | | | |-- fib (1,)
| | | | | |-- return 1
| | | | |-- fib (0,)
| | | | | |-- return 1
| | | | |-- return 2
| | | |-- fib (1,)
| | | | |-- return 1
| | | |-- return 3
| | |-- fib (2,)
| | | |-- fib (1,)
| | | | |-- return 1
| | | |-- fib (0,)
| | | | |-- return 1
| | | |-- return 2
| | |-- return 5
| |-- fib (3,)
| | |-- fib (2,)
| | | |-- fib (1,)
| | | | |-- return 1
| | | |-- fib (0,)
| | | | |-- return 1
| | | |-- return 2
| | |-- fib (1,)
| | | |-- return 1
| | |-- return 3
| |-- return 8
8
In [3]:
!cat fib.py
import sys
from trace2 import trace

@trace
def fib(n):
    if n == 0 or n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)
    
def main():
    n = int(sys.argv[1])
    print(fib(n))
    
if __name__ == "__main__":
    main()

Let us try to improve it by caching results, so that each value is computed only once.

In [10]:
%%file memoize.py

def memoize(f):
    cache = {}
    def g(*args):
        if args not in cache:
            cache[args] = f(*args)
        return cache[args]
    return g
Overwriting memoize.py
In [11]:
%%file sq2.py

from memoize import memoize

@memoize
def square(x):
    print("square", x)
    return x*x

print(square(4))
print(square(4))
Overwriting sq2.py
In [12]:
!python sq2.py
square 4
16
16
In [13]:
%%file fib2.py
import sys
from trace2 import trace
from memoize import memoize

@memoize
@trace
def fib(n):
    if n == 0 or n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)
    
# fib = trace(fib)
# fib = memoize(fib)
    
def main():
    n = int(sys.argv[1])
    print(fib(n))
    
if __name__ == "__main__":
    main()
Writing fib2.py
In [15]:
!DEBUG=true python fib2.py 5
|-- fib (5,)
| |-- fib (4,)
| | |-- fib (3,)
| | | |-- fib (2,)
| | | | |-- fib (1,)
| | | | | |-- return 1
| | | | |-- fib (0,)
| | | | | |-- return 1
| | | | |-- return 2
| | | |-- return 3
| | |-- return 5
| |-- return 8
8
In [19]:
!time python fib.py 30
1346269

real	0m28.423s
user	0m28.274s
sys	0m0.070s
In [20]:
!time python fib2.py 30
1346269

real	0m0.052s
user	0m0.039s
sys	0m0.010s
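
The standard library has a ready-made cache of the same kind, functools.lru_cache. A minimal sketch, assuming an unbounded cache (maxsize=None) is acceptable; the trace decorator from trace2 is left out so the snippet runs on its own:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # maxsize=None: the cache never evicts entries
def fib(n):
    if n == 0 or n == 1:
        return 1
    return fib(n-1) + fib(n-2)

print(fib(30))
print(fib.cache_info())  # hit/miss counters kept by lru_cache
```

cache_info() shows how many calls were answered from the cache, which can stand in for the trace output when checking that memoization works.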

Decorators - Summary

In [21]:
# any typical decorator would look like this
def decorator(f):
    print("defining function", f.__name__)
    def g(*args):
        print("before calling function", f.__name__)
        result = f(*args)
        print("after calling function", f.__name__)
        return result
    return g
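
One side effect of this pattern is that the returned g hides the name and docstring of f. A small sketch using functools.wraps from the standard library to copy them over; the names logged and greet are made up for illustration:

```python
import functools

def logged(f):
    @functools.wraps(f)   # copy __name__, __doc__ etc. from f to g
    def g(*args, **kwargs):
        print("calling", f.__name__)
        return f(*args, **kwargs)
    return g

@logged
def greet(name):
    """Returns a greeting."""
    return "hello " + name

print(greet.__name__)   # the original name, not "g"
print(greet.__doc__)
```

This matters for any code that looks at f.__name__ or f.__doc__ after decoration.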

Problem: Write a module cmdline.py to build command-line applications easily. Here is an example of how it can be used:

# hello.py
from cmdline import command, main

@command
def hello():
    """prints hello world message.
    """
    print("hello world!")

@command
def goodbye():
    """prints good bye message.
    """
    print("good bye!")

if __name__ == "__main__":
    main()

The program should produce the following output when run.

$ python hello.py hello
hello world!
$ python hello.py goodbye
good bye!

Bonus Problem: Implement support for help in the cmdline.py module.

$ python hello.py help
Available commands:

hello - prints hello world message
goodbye - prints good bye message
help - prints this help message

Bonus Problem: Can you make these commands take arguments?

@command
def upper(name):
    return name.upper()

And when called:

$ python hello.py upper python
PYTHON
$ python hello.py upper ten
TEN
In [62]:
%%file cmdline.py
"""Simple command-line framework.
"""
import sys

commands = {}

def command(f):
    commands[f.__name__] = f
    #print("defining command", f.__name__)
    #print(commands)
    return f

def main():
    cmdname = sys.argv[1]
    args = sys.argv[2:]
    print("executing command", cmdname, args)
    func = commands[cmdname]
    func(*args)
Overwriting cmdline.py
In [63]:
%%file hello.py
from cmdline import command, main

#@command
def hello():
    """prints hello world message.
    """
    print("hello world!")

hello = command(hello)    
    
@command
def goodbye():
    """prints good bye message.
    """
    print("good bye!")
    
@command
def whoareyou():
    print("Python")

@command
def upper(name):
    print(name.upper())

@command
def add(x, y):
    print(int(x) + int(y))
    
    
print("---------")
    
if __name__ == "__main__":
    main()
Overwriting hello.py
In [64]:
!python hello.py hello
---------
executing command hello []
hello world!
In [65]:
!python hello.py upper python
---------
executing command upper ['python']
PYTHON
In [66]:
!python hello.py add 3 4
---------
executing command add ['3', '4']
7
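
One possible approach to the help bonus problem is to register help as just another command and use the first line of each docstring as its summary. This is a sketch, not the solution presented in the class; main here accepts an argv list, an assumption made so that it can be called without a real command line:

```python
import sys

commands = {}

def command(f):
    commands[f.__name__] = f
    return f

@command
def help():   # shadows the built-in help, but only inside this module
    """prints this help message."""
    print("Available commands:")
    print()
    for name, func in commands.items():
        # use the first line of the docstring as the summary
        doc = (func.__doc__ or "").strip().splitlines()
        summary = doc[0].rstrip(".") if doc else ""
        print(name, "-", summary)

def main(argv=None):
    argv = sys.argv[1:] if argv is None else argv
    cmdname, args = argv[0], argv[1:]
    commands[cmdname](*args)

@command
def hello():
    """prints hello world message.
    """
    print("hello world!")

main(["help"])
```
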
In [73]:
from trace2 import trace

def square(x):
    return x*x

@trace
def square2(x):
    return x*x

Decorators taking arguments

We had some code like this:

@with_retries
def wget(url):
    ...

The number of retries and the delay between retries are hardcoded in the with_retries implementation.

Wouldn't it be nice to specify that when using the decorator?

@with_retries(retries=3, delay=0.1)
def wget(url):
    ...

This is equivalent to:

decor = with_retries(retries=3, delay=0.1)
@decor
def wget(url):
    ...

And that is equivalent to:

decor = with_retries(retries=3, delay=0.1)
def wget(url):
    ...
wget = decor(wget)
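
The equivalence can be checked with a tiny decorator factory. In this sketch, repeat is a hypothetical example and not part of the course code:

```python
def repeat(n):
    """Returns a decorator that calls the function n times and collects the results."""
    def decor(f):
        def g(*args):
            return [f(*args) for i in range(n)]
        return g
    return decor

@repeat(n=3)
def square(x):
    return x*x

def cube(x):
    return x*x*x
cube = repeat(n=2)(cube)   # the same thing without the @ syntax

print(square(4))
print(cube(2))
```

There are three layers: repeat takes the arguments, decor takes the function, and g takes the call arguments.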

A couple of practical use cases:

@app.route("/login")
def login():
    ...

@login_required(role="admin")
def admin_page():
    ...
In [74]:
import time

def with_retries(retries=5, delay=0):
    def decor(f):
        def g(*args):
            for i in range(retries):
                try:
                    return f(*args)
                except Exception as e:
                    print(f.__name__, args, "failed:", e)
                time.sleep(delay)
            print("Giving up...")
        return g        
    return decor

Another, slightly simpler way to write the same thing is:

In [76]:
import time

def with_retries(f=None, retries=5, delay=0):
    if f is None:
#         def decor(f):
#             return with_retries(f=f, retries=retries, delay=delay)
#         return decor
        return lambda f: with_retries(f=f, retries=retries, delay=delay)
    
    def g(*args):
        for i in range(retries):
            try:
                return f(*args)
            except Exception as e:
                print(f.__name__, args, "failed:", e)
            time.sleep(delay)
        print("Giving up...")
    return g
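
With this version the decorator can be used both with and without arguments. A usage sketch; the definition is repeated so that the snippet runs on its own, and flaky and always_fails are made-up functions:

```python
import time

def with_retries(f=None, retries=5, delay=0):
    # same definition as above, repeated so this snippet is self-contained
    if f is None:
        return lambda f: with_retries(f=f, retries=retries, delay=delay)

    def g(*args):
        for i in range(retries):
            try:
                return f(*args)
            except Exception as e:
                print(f.__name__, args, "failed:", e)
            time.sleep(delay)
        print("Giving up...")
    return g

attempts = []

@with_retries(retries=3)
def flaky(x):
    """Fails on the first two calls, then succeeds."""
    attempts.append(x)
    if len(attempts) < 3:
        raise ValueError("not yet")
    return x * 2

@with_retries            # bare form: the function itself is passed as f
def always_fails():
    raise IOError("boom")

print(flaky(21))
print(always_fails())    # g falls off the end and returns None after giving up
```

Note that g returns None once all retries are used up; re-raising the last exception would be an alternative design.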

Iterators & Generators

How does iteration work in Python?

In [77]:
for x in [1, 2, 3, 4]:
    print(x)
1
2
3
4
In [78]:
for c in "hello":
    print(c)
h
e
l
l
o
In [79]:
for k in {"x": 1, "y": 2, "z": 3}:
    print(k)
z
x
y

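In the Python version used for these notes, a dictionary is an unordered collection, so the keys can appear in any order. From Python 3.7 onwards, dictionaries preserve insertion order.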

In [80]:
%%file 5.txt
one
two
three
four
five
Writing 5.txt
In [81]:
for line in open("5.txt"):
    print(repr(line))
'one\n'
'two\n'
'three\n'
'four\n'
'five'

The Iteration Protocol

In [82]:
x = iter(["a", "b", "c", "d"])
In [83]:
x
Out[83]:
<list_iterator at 0x10b66f4e0>
In [84]:
next(x)
Out[84]:
'a'
In [85]:
next(x)
Out[85]:
'b'
In [86]:
next(x)
Out[86]:
'c'
In [87]:
next(x)
Out[87]:
'd'
In [88]:
next(x)
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-88-5e4e57af3a97> in <module>()
----> 1 next(x)

StopIteration: 
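
A for loop uses exactly these two built-ins. Roughly, a loop over a list behaves like the following sketch (the real loop is implemented inside the interpreter, not in Python code):

```python
items = ["a", "b", "c", "d"]
seen = []

it = iter(items)            # ask the iterable for an iterator
while True:
    try:
        x = next(it)        # fetch the next element
    except StopIteration:   # the iterator is exhausted
        break
    seen.append(x)          # the loop body goes here
    print(x)
```
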
In [89]:
# the longest word in the English dictionary
max(open("/usr/share/dict/words"), key=len)
Out[89]:
'formaldehydesulphoxylate\n'
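
Any object can take part in this protocol by providing __iter__ and __next__ methods. A sketch of a hand-written iterator; the class name Upto is chosen only for illustration:

```python
class Upto:
    """Iterator over 0, 1, ..., n-1."""
    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        # an iterator returns itself from __iter__
        return self

    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        value = self.i
        self.i += 1
        return value

print(list(Upto(3)))
print(max(Upto(5)))
```

Writing the class by hand means keeping track of the state (self.i) between calls explicitly.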

What is an easy way to create an iterator?

Generators

In [90]:
def squares(numbers):
    for n in numbers:
        yield n*n
In [91]:
for x in squares([1, 2, 3, 4]):
    print(x)
1
4
9
16
In [92]:
def squares(numbers):
    print("BEGIN square", numbers)
    for n in numbers:
        print("computing square of", n)
        yield n*n
    print("END square")
In [93]:
sq = squares([1, 2, 3, 4])
In [94]:
sq
Out[94]:
<generator object squares at 0x10b63a5c8>
In [95]:
next(sq)
BEGIN square [1, 2, 3, 4]
computing square of 1
Out[95]:
1
In [96]:
next(sq)
computing square of 2
Out[96]:
4
In [97]:
next(sq)
computing square of 3
Out[97]:
9
In [98]:
next(sq)
computing square of 4
Out[98]:
16
In [99]:
next(sq)
END square
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-99-3f39f684fc70> in <module>()
----> 1 next(sq)

StopIteration: 
In [100]:
for x in squares([1, 2, 3, 4]):
    print(x)
BEGIN square [1, 2, 3, 4]
computing square of 1
1
computing square of 2
4
computing square of 3
9
computing square of 4
16
END square

Q: How will Python know if a function is a generator or not?

If the function body contains a yield statement anywhere, Python compiles it as a generator function.

In [102]:
def f():
    return 1
def g():
    yield 1
In [105]:
f.__code__.co_flags
Out[105]:
67
In [106]:
g.__code__.co_flags
Out[106]:
99
In [108]:
import dis
dis.COMPILER_FLAG_NAMES
Out[108]:
{1: 'OPTIMIZED',
 2: 'NEWLOCALS',
 4: 'VARARGS',
 8: 'VARKEYWORDS',
 16: 'NESTED',
 32: 'GENERATOR',
 64: 'NOFREE',
 128: 'COROUTINE',
 256: 'ITERABLE_COROUTINE'}
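
The two flag values above differ by exactly the GENERATOR bit (99 - 67 = 32). A short sketch that tests that bit directly, and the inspect helper that does the same check; the absolute flag values may differ between Python versions, so only the bit is tested:

```python
import inspect

def f():
    return 1

def g():
    yield 1

GENERATOR = inspect.CO_GENERATOR   # the same bit as 32 in the table above

print(bool(f.__code__.co_flags & GENERATOR))
print(bool(g.__code__.co_flags & GENERATOR))
print(inspect.isgeneratorfunction(f), inspect.isgeneratorfunction(g))
```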

Q: Is it possible to have return inside a generator function?

In [109]:
def f():
    for i in range(10000):
        if i == 13:
            return
        yield i*i

An empty return statement is allowed in a generator function in both Python 2 and Python 3.

Python 3.3+ also supports return with a value. The value is attached to the StopIteration exception, and it is used by yield from and by coroutines.
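
A returned value does not show up in a for loop; it travels on the StopIteration exception and can be picked up with yield from. A small sketch (the function names are illustrative):

```python
def first_squares():
    yield 1
    yield 4
    return "done"          # becomes StopIteration.value

def wrapper():
    # yield from re-yields the values and evaluates to the return value
    result = yield from first_squares()
    yield result

gen = first_squares()
values = [next(gen), next(gen)]
try:
    next(gen)
except StopIteration as e:
    returned = e.value

print(values, returned)
print(list(wrapper()))
```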

Problem: Write a generator countdown that takes a number n as argument and generates all numbers from n down to 0.

>>> for i in countdown(3):
...     print(i)
3
2
1
0

Use a while loop to implement this.

In [112]:
def countdown(n):
    while n >= 0:
        yield n
        n -= 1      
In [113]:
for i in countdown(3):
    print(i)
3
2
1
0

Generator Expressions

In [114]:
[x*x for x in range(10)] # list comprehension
Out[114]:
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
In [115]:
(x*x for x in range(10)) # generator expression
Out[115]:
<generator object <genexpr> at 0x10b6d20f8>
In [116]:
sum((x*x for x in range(1000000)))
Out[116]:
333332833333500000

When the generator expression is the only argument to a function, the parentheses can be omitted.

In [117]:
sum(x*x for x in range(1000000))
Out[117]:
333332833333500000
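
The practical difference from a list comprehension is that a generator expression produces values on demand instead of building the whole list first. A rough sketch of that difference, using sys.getsizeof as an approximate measure (it reports only the size of the container object itself):

```python
import sys

n = 100000
as_list = [x*x for x in range(n)]     # all values are built up front
as_gen = (x*x for x in range(n))      # values are produced on demand

print(sys.getsizeof(as_list) > sys.getsizeof(as_gen))
print(sum(as_list) == sum(as_gen))
print(list(as_gen))   # a generator can be consumed only once
```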

Example: Building data pipelines

In [118]:
import os

def find(root):
    """Finds all files in the given directory tree.
    """
    for path, dirnames, filenames in os.walk(root):
        for f in filenames:
            yield os.path.join(path, f)
In [119]:
import itertools

def take(n, seq):
    # islice stops cleanly if seq has fewer than n items
    return list(itertools.islice(seq, n))
In [120]:
def integers():
    i = 1
    while True:
        yield i
        i += 1

def squares(numbers):
    return (n*n for n in numbers)
In [121]:
take(10, squares(integers()))
Out[121]:
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
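
The standard library's itertools module has building blocks of the same kind: itertools.count plays the role of integers and itertools.islice the role of take. A sketch of the same pipeline using them:

```python
import itertools

numbers = itertools.count(1)                    # 1, 2, 3, ... without end
squares = (n*n for n in numbers)
first10 = list(itertools.islice(squares, 10))   # consume only the first 10
print(first10)
```
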
In [124]:
def grep(pattern, seq):
    return (x for x in seq if pattern in x)
In [125]:
files = find(".")
pyfiles = grep(".py", files)
print(take(10, pyfiles))
['./a.py', './b.py', './cmdline.py', './fib.py', './fib2.py', './hello.py', './memoize.py', './mymodule.py', './mymodule.pyc', './mymodule2.py']
In [126]:
def count(seq):
    i = 0
    for x in seq:
        i = i+1
    return i
In [127]:
count(range(10))
Out[127]:
10
In [128]:
def count(seq):
    return sum(1 for x in seq)
In [129]:
count(range(10))
Out[129]:
10
In [130]:
files = find(".")
pyfiles = grep(".py", files)
print(count(pyfiles))
30
In [133]:
def readlines(filenames):
    """Returns an iterator over lines in all the files specified.
    """
    for f in filenames:
        with open(f) as fileobj:   # close each file once its lines are consumed
            for line in fileobj:
                yield line

How many lines of Python code have we written in this course?

In [135]:
files = find(".")
pyfiles = grep(".py", files)
lines = readlines(pyfiles)
print(count(lines))
247

How many Python functions have we written in this course?

In [138]:
files = find(".")
pyfiles = grep(".py", files)
lines = readlines(pyfiles)
functions = grep("def ", lines)
print(count(functions))
29

Problem: Write a function get_paragraphs to split given text into paragraphs.

The function should take a sequence of lines as argument and return a sequence of paragraphs.

For sample input, see http://anandology.com/tmp/pg1342.txt

Once the function is there, we should be able to find:

  • The number of paragraphs
  • The longest paragraph
In [158]:
def get_paragraphs(lines):
    para = []
    for line in lines:
        if line.strip() != "":
            para.append(line)
        elif para:
            yield "".join(para)
            para = []
    if para:
        yield "".join(para)
In [159]:
lines = ["A1\n", "A2\n", "\n", "B1\n", "\n", "C1\n", "C2\n"]
In [160]:
get_paragraphs(lines)
Out[160]:
<generator object get_paragraphs at 0x10b6fcd00>
In [161]:
list(get_paragraphs(lines))
Out[161]:
['A1\nA2\n', 'B1\n', 'C1\nC2\n']
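
An alternative sketch of the same function using itertools.groupby, followed by the two questions from the problem statement. The sample lines here are made up; for the real input, the lines of the text file would be passed instead:

```python
import itertools

def get_paragraphs(lines):
    """Groups consecutive lines by whether they are blank and joins the non-blank runs."""
    for is_blank, group in itertools.groupby(lines, key=lambda line: line.strip() == ""):
        if not is_blank:
            yield "".join(group)

lines = ["A1\n", "A2\n", "A3\n", "\n", "B1\n", "\n", "\n", "C1\n", "C2\n"]
paragraphs = list(get_paragraphs(lines))

print(len(paragraphs))                  # number of paragraphs
print(repr(max(paragraphs, key=len)))   # longest paragraph
```

Collecting the paragraphs into a list is needed here because they are used twice; for a single pass, the generator can be consumed directly.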

For more info on generators, look at:

http://speakerdeck.com/anandology/generators-inside-out

Working with web & APIs
