Session 6

Published

October 16, 2023

Topics Covered

Writing Custom Python Modules
Testing Python Programs

Writing Custom Modules

%%file mymodule.py

print("BEGIN mymodule")

x = 2

def add(a, b):
    print("add", a, b)
    return a+b

print(add(3, 4))

print("END mymodule")

Overwriting mymodule.py

!python mymodule.py

BEGIN mymodule
add 3 4
7
END mymodule

Let’s see what happens when we import it as a module.

import mymodule

BEGIN mymodule
add 3 4
7
END mymodule

mymodule.x

mymodule.add(10, 20)

If we import a module again, the code won’t be executed.

import mymodule

mymodule.x
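The caching behaviour can be checked directly. The following is a minimal sketch, assuming a throwaway module named demo_once (a hypothetical name, not part of this session) written to a temporary directory. It shows that the second import runs no code and that the cached module object lives in sys.modules.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a throwaway module (hypothetical name: demo_once) into a temp directory.
tmpdir = tempfile.mkdtemp()
Path(tmpdir, "demo_once.py").write_text('print("executing demo_once")\nx = 2\n')
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()   # make sure the import system sees the new file

import demo_once   # prints "executing demo_once"
import demo_once   # prints nothing: the module is already cached

# The cache is the sys.modules dictionary, keyed by module name.
same_object = sys.modules["demo_once"] is demo_once
print(same_object)
```

Because the second import only looks the name up in sys.modules, it is cheap, but it also means that later edits to the file go unnoticed.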

Reimporting a module

If you make any changes after a module has been imported, the new changes won’t be seen.

The work-around is to reload the module.

import importlib

importlib.reload(mymodule)

BEGIN mymodule
add 3 4
7
END mymodule

<module 'mymodule' from '/home/jupyter-anand/book/mymodule.py'>

mymodule.add(10, 20)

add 10 20
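The whole edit-and-reload cycle can be reproduced in one cell. This is a sketch under stated assumptions: demo_reload is a hypothetical module name, the file lives in a temporary directory, and bytecode caching is switched off so that two quick writes cannot be confused with each other.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True   # avoid stale .pyc files in this quick demo

tmpdir = tempfile.mkdtemp()
module_path = Path(tmpdir, "demo_reload.py")
module_path.write_text("def greet():\n    return 'hello'\n")
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

import demo_reload
before = demo_reload.greet()

# Edit the source file on disk after the module has been imported.
module_path.write_text("def greet():\n    return 'hello, world'\n")
unchanged = demo_reload.greet()   # the running program still has the old code

importlib.reload(demo_reload)     # re-executes the file in the same module object
after = demo_reload.greet()

print(before, "|", unchanged, "|", after)
```

Note that reload updates the module object in place. Names copied out earlier with `from demo_reload import greet` would still refer to the old function.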

The `__name__` magic variable

Python has a special variable __name__, which is automatically set by the Python runtime.

%%file mymodule2.py
x = 2

def add(a, b):
    return a+b

print(__name__)
print(add(3, 4))

Overwriting mymodule2.py

!python mymodule2.py

__main__
7

The name __name__ is pronounced “dunder name”, short for “double underscore name”.

When a program is executed as a script, the special variable __name__ is set to "__main__".

Let’s try to import it and see.

import mymodule2

mymodule2
7

So, when a Python file is imported as a module, the special variable __name__ is set to the module name.

How do we avoid printing the value of add(3, 4) when it is imported as a module?

%%file mymodule3.py
x = 2

def add(a, b):
    return a+b

# do this only if this file is executed as a script
if __name__ == "__main__":
    print(add(3, 4))

Writing mymodule3.py

!python mymodule3.py

import mymodule3

We can also try importing a module using:

!python -c "import mymodule3"
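Both situations can also be compared from a single Python process. The sketch below uses the standard-library runpy module to execute one file twice with different values of __name__. The file name demo_name.py and the variables name_seen and ran_main are made up for illustration.

```python
import runpy
import tempfile
from pathlib import Path

# A tiny file that records the value of __name__ it was executed with.
source = (
    "name_seen = __name__\n"
    "ran_main = False\n"
    'if __name__ == "__main__":\n'
    "    ran_main = True\n"
)
path = Path(tempfile.mkdtemp(), "demo_name.py")
path.write_text(source)

# run_path executes the file and returns its global variables as a dict.
as_script = runpy.run_path(str(path), run_name="__main__")
as_module = runpy.run_path(str(path), run_name="demo_name")

print(as_script["name_seen"], as_script["ran_main"])
print(as_module["name_seen"], as_module["ran_main"])
```

Only the run with run_name="__main__" enters the guarded block, which mirrors the difference between `python mymodule3.py` and `import mymodule3`.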

Example: Square module

Let’s write a square program that takes a number as a command-line argument and prints its square. It should also be possible to use it as a module.

%%file sq.py
import sys

def square(n):
    return n*n

def main():
    n = int(sys.argv[1])
    print(square(n))
    
if __name__ == "__main__":
    main()

Overwriting sq.py

!python sq.py 5

!python -c "import sq; print(sq.square(4))"

import sq

sq.square(4)
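The main function above fails with an IndexError when no argument is given. One possible variation, shown here only as a sketch, is to let main accept the argument list as a parameter and return an exit status. The usage message and the return codes are assumptions, not part of the original sq.py.

```python
import sys

def square(n):
    return n * n

def main(argv=None):
    """Run the command-line interface and return an exit status."""
    # Fall back to the real command line when no list is passed in.
    argv = sys.argv[1:] if argv is None else argv
    if len(argv) != 1:
        print("usage: python sq.py NUMBER", file=sys.stderr)
        return 2
    print(square(int(argv[0])))
    return 0

# Calling main with an explicit list makes it easy to try out from a notebook.
status = main(["5"])
```

In a script, the last lines would typically be the usual `if __name__ == "__main__":` guard calling `sys.exit(main())`.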

Docstrings

import os

help(os.listdir)

Help on built-in function listdir in module posix:

listdir(path=None)
    Return a list containing the names of the files in the directory.
    
    path can be specified as either str, bytes, or a path-like object.  If path is bytes,
      the filenames returned will also be bytes; in all other circumstances
      the filenames returned will be str.
    If path is None, uses the path='.'.
    On some platforms, path may also be specified as an open file descriptor;\
      the file descriptor must refer to a directory.
      If this functionality is unavailable, using it raises NotImplementedError.
    
    The list is in arbitrary order.  It does not include the special
    entries '.' and '..' even if they are present in the directory.

import sq

help(sq.square)

Help on function square in module sq:

square(n)

Adding docstrings to a function

def add(x, y):
    "Adds two numbers"
    return x+y

add(4, 5)

help(add)

Help on function add in module __main__:

add(x, y)
    Adds two numbers

Typically, docstrings are written as multi-line strings using triple quotes.

def add(x, y):
    """
    Adds two numbers
    """
    return x+y

Often, it is useful to include an example.

def add(x, y):
    """
    Adds two numbers.

        >>> add(3, 4)
        7
    """
    return x+y

help(add)

Help on function add in module __main__:

add(x, y)
    Adds two numbers.
    
        >>> add(3, 4)
        7
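An example written in the `>>>` style is more than documentation: the standard-library doctest module can execute it and compare the result with the text that follows. The sketch below is one way to run the examples of a single function; the session itself does not use doctest, so treat it as an aside.

```python
import doctest

def add(x, y):
    """
    Adds two numbers.

        >>> add(3, 4)
        7
    """
    return x + y

# Collect the examples found in add's docstring and run them.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(add, name="add", globs={"add": add}):
    runner.run(test)

results = runner.summarize(verbose=False)
print(results.attempted, results.failed)
```

If the docstring claimed a wrong result, the runner would report a failure, so the example in the documentation cannot silently go out of date.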

Using Type Hints

def add(x: int, y: int) -> int:
    """
    Adds two numbers.

        >>> add(3, 4)
        7
    """
    return x+y

Type annotations are just annotations, and Python doesn’t enforce them.

# this is perfectly fine because Python doesn't check type hints at runtime
add("hello", "world")

'helloworld'

help(add)

Help on function add in module __main__:

add(x: int, y: int) -> int
    Adds two numbers.
    
        >>> add(3, 4)
        7
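The annotations are not lost, though. Python stores them on the function object, where tools such as help and external type checkers can read them. A minimal sketch:

```python
def add(x: int, y: int) -> int:
    return x + y

# The hints are kept in a plain dictionary on the function object.
hints = add.__annotations__
print(hints)

# Nothing consults that dictionary when the function is called.
result = add("hello", "world")
print(result)
```

Checking that calls actually match the hints is left to separate tools that read the source code, not to the interpreter.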

Adding docstrings to a module

Just like we add docstrings to a function, we can add them to a module as well.

%%file sq.py
"""
The square module.

The square module provides functions to compute square of a number.

This can be used as a script as well.

USAGE:

    $ python sq.py 5
    25
"""
import sys

def square(n: int) -> int:
    """Computes square of a number.

        >>> square(4)
        16
    """
    return n*n

def main():
    n = int(sys.argv[1])
    print(square(n))
    
if __name__ == "__main__":
    main()

Overwriting sq.py

Let me reload it as we have already imported it before.

import sq
import importlib
importlib.reload(sq)

<module 'sq' from '/home/jupyter-anand/book/sq.py'>

sq.square(5)

!python sq.py 5

help(sq)

Help on module sq:

NAME
    sq - The square module.

DESCRIPTION
    The square module provides functions to compute square of a number.
    
    This can be used as a script as well.
    
    USAGE:
    
        $ python sq.py 5
        25

FUNCTIONS
    main()
    
    square(n: int) -> int
        Computes square of a number.
        
        >>> square(4)
        16

FILE
    /home/jupyter-anand/book/sq.py

sq.square?

Signature: sq.square(n: int) -> int
Docstring:
Computes square of a number.
>>> square(4)
16
File:      ~/book/sq.py
Type:      function

Problem: cube module

%load_problem cube-module

Problem: Cube Module

Write a module cube with a function cube.

The function cube should return the cube of the given number.

>>> import cube
>>> cube.cube(3)
27

Make the file runnable as a script that takes a number as a command-line argument and prints its cube.

$ python cube.py 3
27

Please make sure docstrings are added to both the module and the function.

>>> import cube
>>> help(cube)
Help on module cube:

NAME
    cube - The cube ...

DESCRIPTION
    The cube module ...

FUNCTIONS
    cube(n)
        Computes cube of a number.
        ...

FILE
    /home/.../cube.py

You can verify your solution using:

%verify_problem cube-module

%%file cube.py
"""
The cube module.
"""
import sys

Overwriting cube.py

import cube
importlib.reload(cube)
help(cube)

Help on module cube:

NAME
    cube - The cube module.

FILE
    /home/jupyter-anand/book/cube.py

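A docstring must be the first statement in the function body. In the function below, the string comes after a print call, so Python does not treat it as a docstring.

def add(x, y):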
    print("begin")
    """
    Adds two numbers.
    """
    return x+y

help(add)

Help on function add in module __main__:

add(x, y)
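The help output above has no description because a string literal counts as a docstring only when it is the first statement in the function body. The sketch below compares the two placements through the __doc__ attribute; the function names add_late and add_first are made up for the comparison.

```python
def add_late(x, y):
    print("begin")
    """Adds two numbers."""   # an ordinary expression statement, not a docstring
    return x + y

def add_first(x, y):
    """Adds two numbers."""
    print("begin")
    return x + y

late_doc = add_late.__doc__
first_doc = add_first.__doc__
print(late_doc)
print(first_doc)
```

help reads the same __doc__ attribute, so it has nothing to show for add_late.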

Testing Python Programs

def square(n):
    return n*n

square(4)

Seems to be working fine!

Unit testing with `pytest`

%%file sq.py
def square(n):
    return n*n

def test_square():
    assert square(4) == 16

Overwriting sq.py

!pytest sq.py

============================= test session starts ==============================
platform linux -- Python 3.11.4, pytest-7.2.1, pluggy-1.0.0+repack
rootdir: /home/jupyter-anand/book
collected 1 item                                                               

sq.py .                                                                  [100%]

============================== 1 passed in 0.01s ===============================

!pytest sq.py -v

============================= test session starts ==============================
platform linux -- Python 3.11.4, pytest-7.2.1, pluggy-1.0.0+repack -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /home/jupyter-anand/book
collected 1 item                                                               

sq.py::test_square PASSED                                                [100%]

============================== 1 passed in 0.00s ===============================
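To get a feel for what happens in the runs above, here is a toy runner that calls every function whose name starts with test_ and records whether an assertion failed. It is only an illustration of the idea, not how pytest is implemented; run_tests and test_square_negative are hypothetical names.

```python
def square(n):
    return n * n

def test_square():
    assert square(4) == 16

def test_square_negative():
    assert square(-3) == 9

def run_tests(namespace):
    """Toy runner: call every function named test_* and record the outcome."""
    results = {}
    for name, obj in sorted(namespace.items()):
        if name.startswith("test_") and callable(obj):
            try:
                obj()
                results[name] = "PASSED"
            except AssertionError:
                results[name] = "FAILED"
    return results

# vars() is the current namespace, which holds the functions defined above.
results = run_tests(dict(vars()))
for name, outcome in results.items():
    print(name, outcome)
```

pytest does much more than this, such as finding test files and explaining failed assertions in detail, but the naming convention is the part that matters when writing tests.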

It is usually good practice to separate application and test code.

%%file sq.py

def square(n):
    return n*n

def sum_of_squares(x, y):
    return square(x) + square(y)

Overwriting sq.py

%%file test_sq.py
import sq

def test_square():
    assert sq.square(4) == 16

def test_sum_of_squares():
    assert sq.sum_of_squares(3, 4) == 25
    assert sq.sum_of_squares(0, 0) == 0

Overwriting test_sq.py

!pytest -v test_sq.py

============================= test session starts ==============================
platform linux -- Python 3.11.4, pytest-7.2.1, pluggy-1.0.0+repack -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /home/jupyter-anand/book
collected 2 items                                                              

test_sq.py::test_square PASSED                                           [ 50%]
test_sq.py::test_sum_of_squares PASSED                                   [100%]

============================== 2 passed in 0.01s ===============================

Example: Host Parser

Let’s write a program to parse the /etc/hosts file format.

!cat /etc/hosts

# /etc/hosts
127.0.0.1   localhost

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
ff02::1     ip6-allnodes
ff02::2     ip6-allrouters

%%file hosts.txt
1.2.3.4 myhost www.myhost

# this is comment
127.0.0.1 localhost

1.2.3.4 foo bar
1.2.3.5 web1 web2

Writing hosts.txt

Can we write a program to parse this?

The result could be a dictionary mapping from host name to IP address.

%%file hosts.py
"""
Module to parse /etc/hosts file format.
"""
import sys

def parse(filename):
    """Parses a file in /etc/hosts file format and returns a dictionary
    mapping from host name to IP address.
    """
    result = {}
    # the with block closes the file even if parsing fails
    with open(filename) as f:
        for line in f:
            result.update(parse_line(line))
    return result
    
def parse_line(line):
    """Parses one line of the hosts file.

    Returns a dictionary mapping from host to ip.
    """
    if is_empty(line) or is_comment(line):
        return {}

    # parts = line.strip().split()
    # ip = parts[0]
    # hosts = parts[1:]
    ip, *hosts = line.strip().split()
    
    return {h: ip for h in hosts}

def is_empty(line):
    return line.strip() == ""

def is_comment(line):
    # ignore leading whitespace so that indented comments are recognised too
    return line.strip().startswith("#")

def main():
    filename = sys.argv[1]
    hosts = parse(filename)
    print(hosts)

if __name__ == "__main__":
    main()

Overwriting hosts.py

!python hosts.py hosts.txt

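{'myhost': '1.2.3.4', 'www.myhost': '1.2.3.4', 'localhost': '127.0.0.1', 'foo': '1.2.3.4', 'bar': '1.2.3.4', 'web1': '1.2.3.5', 'web2': '1.2.3.5'}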
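The parser above only skips lines that are entirely comments. A possible extension, which is an assumption and not something this session covers, is to also drop a comment that follows the host names on the same line. A sketch of such a parse_line variant:

```python
def parse_line(line):
    """Variant of parse_line that also drops a trailing "# ..." comment."""
    # Keep only the text before the first "#"; a whole-line comment becomes "".
    content = line.split("#", 1)[0].strip()
    if not content:
        return {}
    ip, *hosts = content.split()
    return {h: ip for h in hosts}

print(parse_line("1.2.3.4 web1 web2  # primary web servers\n"))
print(parse_line("   # indented comment\n"))
```

With this approach the separate is_empty and is_comment helpers are no longer needed, because both cases reduce to an empty string.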

%%file test_hosts.py
from hosts import parse, parse_line

def test_parse_empty_file(tmp_path):
    path = tmp_path / "hosts.txt"
    path.write_text("")
    assert parse(path) == {}

def test_parse_single_line(tmp_path):
    path = tmp_path / "hosts.txt"
    path.write_text("1.2.3.4 foo bar")
    assert parse(path) == {"foo": "1.2.3.4", "bar": "1.2.3.4"}

def test_parse_multiline(tmp_path):
    path = tmp_path / "hosts.txt"
    path.write_text("1.2.3.4 foo bar\n2.3.4.5 web")
    assert parse(path) == {"foo": "1.2.3.4", "bar": "1.2.3.4", "web": "2.3.4.5"}

HOSTS1 = """
1.2.3.4 foo bar

# comment

2.3.4.5 web
"""

def test_parse_with_comments(tmp_path):
    path = tmp_path / "hosts.txt"
    path.write_text(HOSTS1)
    assert parse(path) == {"foo": "1.2.3.4", "bar": "1.2.3.4", "web": "2.3.4.5"}

def test_parse_line_empty():
    assert parse_line("") == {}
    assert parse_line("\n") == {}
    assert parse_line("   \n") == {}

def test_parse_line_comments():
    assert parse_line("# comment\n") == {}

def test_parse_line():
    assert parse_line("1.2.3.4 web\n") == {"web": "1.2.3.4"}
    assert parse_line("1.2.3.4 web1 web2\n") == {"web1": "1.2.3.4", "web2": "1.2.3.4"}

Overwriting test_hosts.py

!pytest test_hosts.py

============================= test session starts ==============================
platform linux -- Python 3.11.4, pytest-7.2.1, pluggy-1.0.0+repack
rootdir: /home/jupyter-anand/book
collected 7 items                                                              

test_hosts.py .......                                                    [100%]

============================== 7 passed in 0.03s ===============================
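The tests above take a tmp_path argument. This is a built-in pytest fixture that gives each test its own temporary directory as a pathlib.Path. Outside pytest, the same kind of check can be sketched with the standard tempfile module. The parse function below is a compact stand-in for hosts.parse, repeated only so that the example runs on its own.

```python
import tempfile
from pathlib import Path

def parse(filename):
    """Compact stand-in for hosts.parse, so this sketch is self-contained."""
    result = {}
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            ip, *hosts = line.split()
            result.update({h: ip for h in hosts})
    return result

# TemporaryDirectory plays the role of pytest's tmp_path fixture here.
with tempfile.TemporaryDirectory() as tmpdir:
    path = Path(tmpdir) / "hosts.txt"
    path.write_text("1.2.3.4 foo bar\n# comment\n2.3.4.5 web\n")
    parsed = parse(path)

print(parsed)
```

Writing the input file inside the test keeps each test independent of whatever files happen to exist in the working directory.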