Session 9

Published

October 27, 2023

Topics Covered
  • Working with File System
  • Classes and Objects

Classes and Objects

name = "Mathematics"
name.lower()
'mathematics'
name.upper()
'MATHEMATICS'
name.lower().replace("mat", "rat").title()
'Ratheratics'
type(name)
str
f = open("files/five.txt")
f.read().split()
['one', 'two', 'three', 'four', 'five']
open("files/five.txt").readlines()
['one\n', 'two\n', 'three\n', 'four\n', 'five\n']
x = [1, 2, 3]
x
[1, 2, 3]
x.append(4)
x
[1, 2, 3, 4]
x.append(5)
x
[1, 2, 3, 4, 5]

The benefits of using methods over simple functions:

  • ability to chain method calls
  • maintain some state and update it
name.lower().replace("mat", "rat").title()
'Ratheratics'

If these were functions, we would have to do it this way:

title(replace(lower(name), "mat", "rat"))
name.lower().replace("mat", "rat").title()
'Ratheratics'
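
Both benefits can be sketched in a few lines. The module-level functions lower, replace and title below are hypothetical stand-ins written only for comparison (Python does not provide them), and the Counter class is made up to show a method that updates state.

```python
# Hypothetical function versions of three str methods, for comparison only.
def lower(s):
    return s.lower()

def replace(s, old, new):
    return s.replace(old, new)

def title(s):
    return s.title()

name = "Mathematics"

# With functions, the calls nest and have to be read inside-out.
nested = title(replace(lower(name), "mat", "rat"))

# With methods, the calls chain and read left to right.
chained = name.lower().replace("mat", "rat").title()

print(nested == chained)


class Counter:
    """A made-up class whose method keeps and updates some state."""

    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self  # returning self allows the calls to be chained

counter = Counter()
counter.increment().increment()
print(counter.count)
```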

Example: The pathlib module

Let’s solve the problem of printing the sizes of all files in a directory.

We’ll first try using the os module.

We can find the size of a file using os.path.getsize() or os.stat().

!ls files/images
bangalore-1.jpg  bangalore-2.jpg  bangalore-3.jpg  bangalore-4.jpg  README.md
import os
os.path.getsize("files/images/bangalore-1.jpg")
2084269
os.stat("files/images/bangalore-1.jpg")
os.stat_result(st_mode=33188, st_ino=2606145, st_dev=2048, st_nlink=1, st_uid=1003, st_gid=1005, st_size=2084269, st_atime=1698292567, st_mtime=1677207454, st_ctime=1698291066)
os.stat("files/images/bangalore-1.jpg").st_size
2084269
!ls -l files/images/bangalore-1.jpg
-rw-r--r-- 1 jupyter-anand jupyter-anand 2084269 Feb 24  2023 files/images/bangalore-1.jpg

Now let’s print the size of every file in a directory.

path = "files/images"
os.listdir(path)
['README.md',
 'bangalore-1.jpg',
 'bangalore-4.jpg',
 'bangalore-3.jpg',
 'bangalore-2.jpg']
for f in os.listdir(path):
    p = os.path.join(path, f)
    size = os.path.getsize(p)
    print(size, f)
409 README.md
2084269 bangalore-1.jpg
1898997 bangalore-4.jpg
785897 bangalore-3.jpg
1276973 bangalore-2.jpg

os.listdir returns only the filenames. It is our responsibility to join each name with the directory to get a full path.
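
The loop can be wrapped in a function and tried on a throwaway directory. This is a minimal sketch: list_sizes is a hypothetical helper name, and skipping anything that is not a regular file is an added assumption, not part of the loop shown earlier.

```python
import os
import tempfile

def list_sizes(directory):
    """Return (size, filename) pairs for the regular files in directory."""
    result = []
    for name in sorted(os.listdir(directory)):
        # os.listdir gives bare names, so join each one with the directory.
        full_path = os.path.join(directory, name)
        if os.path.isfile(full_path):
            result.append((os.path.getsize(full_path), name))
    return result

# Try it on a temporary directory holding two small files.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "a.txt"), "w") as f:
        f.write("hello")
    with open(os.path.join(tmp, "b.txt"), "w") as f:
        f.write("hi")
    sizes = list_sizes(tmp)
    print(sizes)
```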

Introduction to pathlib module

The pathlib module provides the same functionality with a class-based API.

from pathlib import Path
root = Path("files/images")
root.is_dir()
True
root.absolute()
PosixPath('/home/jupyter-anand/book/files/images')
root.parent
PosixPath('files')
p = root.joinpath("bangalore-1.jpg")
p.is_file()
True
p.stat().st_size
2084269
p = root / "bangalore-1.jpg"
p
PosixPath('files/images/bangalore-1.jpg')
p.name
'bangalore-1.jpg'
p.suffix
'.jpg'
p.stem
'bangalore-1'
f = root / "README.md"
f.read_text()
'# Images\n\nImages are from Unsplash.\n\nbangalore-1.jpg\nPhoto by Meriç Dağlı on Unsplash \nhttps://unsplash.com/photos/nNTVWEKP3Yc\n\nbangalore-2.jpg\nPhoto by Nishanth Avva on Unsplash\nhttps://unsplash.com/photos/kCxQtSjhAJQ\n\nbangalore-3.jpg\nPhoto by Chandan Chaurasia on Unsplash\nhttps://unsplash.com/photos/fuLPFeAd17E\n\nbangalore-4.jpg\nPhoto by Niiimmmmiiiii on Unsplash\nhttps://unsplash.com/photos/IXJzQtt6vwg'
for p in root.iterdir():
    print(p)
files/images/README.md
files/images/bangalore-1.jpg
files/images/bangalore-4.jpg
files/images/bangalore-3.jpg
files/images/bangalore-2.jpg
for p in root.glob("*.jpg"):
    print(p)
files/images/bangalore-1.jpg
files/images/bangalore-4.jpg
files/images/bangalore-3.jpg
files/images/bangalore-2.jpg
path = Path("test-dir")
path.exists()
False
path.mkdir()
path.mkdir(exist_ok=True)
path.owner()
'jupyter-anand'
path.group()
'jupyter-anand'
!ls -ld test-dir
drwxr-xr-x 2 jupyter-anand jupyter-anand 4096 Oct 26 04:26 test-dir
path.stat()
os.stat_result(st_mode=16877, st_ino=2637973, st_dev=2048, st_nlink=2, st_uid=1003, st_gid=1005, st_size=4096, st_atime=1698294413, st_mtime=1698294413, st_ctime=1698294413)

Let’s try solving the file size problem with pathlib.

path = Path("files/images")

for p in path.iterdir():
    size = p.stat().st_size
    print(size, p.name)
409 README.md
2084269 bangalore-1.jpg
1898997 bangalore-4.jpg
785897 bangalore-3.jpg
1276973 bangalore-2.jpg
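
The same idea can be packaged as a function. This is a sketch under assumptions: file_sizes and its pattern parameter are hypothetical names, and the is_file() check is added so that subdirectories are skipped.

```python
import tempfile
from pathlib import Path

def file_sizes(directory, pattern="*"):
    """Return a dict mapping file name to size in bytes."""
    root = Path(directory)
    # Each item from glob() is already a full Path, so no joining is needed.
    return {p.name: p.stat().st_size for p in root.glob(pattern) if p.is_file()}

# Try it on a temporary directory holding two small files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.jpg").write_bytes(b"12345")
    (root / "notes.md").write_text("hi")
    all_sizes = file_sizes(root)
    jpg_sizes = file_sizes(root, "*.jpg")
    print(all_sizes)
    print(jpg_sizes)
```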

Problem: Creating thumbnails

from PIL import Image
img = Image.open("files/images/bangalore-1.jpg")
img.size
(1908, 3391)
img.thumbnail((400, 400))
img.size
(225, 400)
img

img.save("a.jpg")
!ls -l a.jpg
-rw-r--r-- 1 jupyter-anand jupyter-anand 28963 Oct 26 04:35 a.jpg
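
The thumbnail came out as (225, 400) rather than (400, 400) because thumbnail() keeps the aspect ratio and only fits the image inside the given box. The arithmetic can be sketched in plain Python. thumbnail_size is a hypothetical helper, and this is a simplified approximation rather than PIL's exact implementation, so rounding may differ in edge cases.

```python
def thumbnail_size(size, box):
    """Largest size that fits inside box while keeping the aspect ratio.

    A simplified sketch of the arithmetic, not PIL's own code.
    """
    width, height = size
    max_width, max_height = box
    # Use the tighter of the two ratios, and never scale an image up.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))

print(thumbnail_size((1908, 3391), (400, 400)))
```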
%load_problem thumbnails
Problem: Create Thumbnails

Write a program thumbnails.py to create thumbnails of all images in a directory.

The program should take the path to a directory as an argument, create a thumbnail of size 400x400 for each image, and write it into the output directory with the same filename. The output directory should be thumbnails/ by default, but it should be possible to specify a different output directory using the flag -d or --output-directory.

For simplicity, assume that all the images have the extension .jpg.

$ ls files/images
bangalore-1.jpg
bangalore-2.jpg
bangalore-3.jpg
bangalore-4.jpg
README.md

$ python thumbnails.py files/images
created thumbnails/bangalore-1.jpg
created thumbnails/bangalore-2.jpg
created thumbnails/bangalore-3.jpg
created thumbnails/bangalore-4.jpg


$ python thumbnails.py files/images -d small
created small/bangalore-1.jpg
created small/bangalore-2.jpg
created small/bangalore-3.jpg
created small/bangalore-4.jpg

Hint

Use the Python Imaging Library (PIL) to resize the images.

You can verify your solution using:

%verify_problem thumbnails

%%file thumbnails.py
# your code here
import argparse
from pathlib import Path

# Named parser rather than p, so the loop variable below does not shadow it.
parser = argparse.ArgumentParser()
parser.add_argument("-d", "--output-directory", help="Output directory for thumbnails", default="thumbnails", type=Path)
parser.add_argument("imgdir", help="Directory with images", type=Path)

args = parser.parse_args()

for p in args.imgdir.iterdir():
    print(p)
Overwriting thumbnails.py
!python thumbnails.py files/images
files/images/README.md
files/images/bangalore-1.jpg
files/images/bangalore-4.jpg
files/images/bangalore-3.jpg
files/images/bangalore-2.jpg
!python thumbnails.py -d small files/images
files/images/README.md
files/images/bangalore-1.jpg
files/images/bangalore-4.jpg
files/images/bangalore-3.jpg
files/images/bangalore-2.jpg
path = Path("small")
p = Path("files/images/bangalore-1.jpg")
path / p.name
PosixPath('small/bangalore-1.jpg')
path.joinpath(p.name)
PosixPath('small/bangalore-1.jpg')
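
Putting the pieces together, one possible shape for thumbnails.py is sketched below. It is not the reference solution: build_parser, output_path, make_thumbnail and main are hypothetical names, and PIL is imported inside make_thumbnail only so that the argument handling can be tried without Pillow installed.

```python
import argparse
from pathlib import Path

def build_parser():
    parser = argparse.ArgumentParser(
        description="Create thumbnails of all .jpg images in a directory.")
    parser.add_argument("imgdir", type=Path, help="Directory with images")
    parser.add_argument("-d", "--output-directory", type=Path,
                        default=Path("thumbnails"),
                        help="Output directory for thumbnails")
    return parser

def output_path(outdir, src):
    """Path of the thumbnail for src: same filename, inside outdir."""
    return outdir / src.name

def make_thumbnail(src, dest, size=(400, 400)):
    # Imported here so the rest of the module works without Pillow.
    from PIL import Image
    img = Image.open(src)
    img.thumbnail(size)  # shrinks in place, keeping the aspect ratio
    img.save(dest)

def main(argv=None):
    args = build_parser().parse_args(argv)
    args.output_directory.mkdir(parents=True, exist_ok=True)
    for src in sorted(args.imgdir.glob("*.jpg")):
        dest = output_path(args.output_directory, src)
        make_thumbnail(src, dest)
        print("created", dest)

# Checking the argument handling alone, without touching any images.
args = build_parser().parse_args(["files/images", "-d", "small"])
print(output_path(args.output_directory, Path("files/images/bangalore-1.jpg")))
```

To use it as a script, call main() at the bottom of thumbnails.py.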

Introduction to Classes

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def getx(self):
        return self.x

    def __repr__(self):
        return f"<Point({self.x}, {self.y})>"

    def __str__(self):
        return f"({self.x}, {self.y})"
p = Point(10, 20)
p
<Point(10, 20)>
p.getx()
10
p.x
10
print(p)
(10, 20)
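
The difference between __repr__ and __str__ shows up in where Python uses each one. A small sketch using the same Point class:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"<Point({self.x}, {self.y})>"

    def __str__(self):
        return f"({self.x}, {self.y})"

p = Point(10, 20)
print(repr(p))       # __repr__: what the notebook shows for a bare p
print(str(p))        # __str__: what print(p) shows
print([p, p])        # containers use __repr__ for their items
print(f"{p} {p!r}")  # f-strings use __str__ unless !r is given
```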

More methods

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def getx(self):
        return self.x

    def add(self, p):
        """Adds a point to this point and returns a new Point.
        """
        x = self.x + p.x
        y = self.y + p.y
        return Point(x, y)

    def __repr__(self):
        return f"<Point({self.x}, {self.y})>"

    def __str__(self):
        return f"({self.x}, {self.y})"
p1 = Point(3, 4)
p2 = Point(30, 40)
p1.add(p2)
<Point(33, 44)>
p1.add(p2).add(p2)
<Point(63, 84)>
p1.add?
Signature: p1.add(p)
Docstring:
Adds a point to this point and returns a new Point.
        
File:      /tmp/ipykernel_1709321/425117471.py
Type:      method
%load_problem point-double
Problem: Double a Point

Add a method double to the Point class that doubles both x and y coordinates of the point.

You can start with this code for Point class.

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def getx(self):
        return self.x

    def __repr__(self):
        return f"Point({self.x}, {self.y})"

    def __str__(self):
        return f"({self.x}, {self.y})"

The expected behavior:

>>> p = Point(2, 3)
>>> p2 = p.double()
>>> p2
Point(4, 6)
>>> p.double().double()
Point(8, 12)

You can verify your solution using:

%verify_problem point-double

# your code here
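
One possible solution, as a sketch. It assumes double should return a new Point and leave the original unchanged, which is what the expected output of p.double().double() implies.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def getx(self):
        return self.x

    def double(self):
        """Returns a new Point with both coordinates doubled."""
        return Point(self.x * 2, self.y * 2)

    def __repr__(self):
        return f"Point({self.x}, {self.y})"

    def __str__(self):
        return f"({self.x}, {self.y})"

p = Point(2, 3)
p2 = p.double()
print(repr(p2))
print(repr(p.double().double()))
```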


Example: GitHub Gists

%%file gist_v1.py
"""
Python library to work with GitHub gists.

This module provides functions to work with GitHub gists.

for id in list_gists("PipalBot"):
    g = get_gist(id)
    print(g['description'])
"""

def list_gists(user):
    """Returns ids of all the gists of a given github user.
    """

def get_gist(id):
    """Returns the gist of given id.
    """
Overwriting gist_v1.py
%%file gist_v2.py
"""
Python library to work with GitHub gists.

This module provides classes to work with GitHub gists.

user = GithubUser("PipalBot")
for g in user.list_gists():
    print(g.get_description())
    for f in g.get_files():
        print(f.get_name())

"""

class GithubUser:
    def __init__(self, username):
        self.username = username

    def list_gists(self):
        """Return all the gists of this user.
        """

class Gist:
    def get_description(self):
        """Returns the description of the gist.
        """

    def get_files(self):
        """Returns the files of the gist.
        """

class GistFile:
    """A file in a gist.
    """
    def get_name(self):
        pass

    def get_content(self):
        pass
Overwriting gist_v2.py
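
The method bodies above are left empty. One way they could be filled in is sketched below, with each object wrapping a dictionary of gist data. The sketch makes several assumptions: the constructors taking a data dictionary are not part of the original skeleton, the URL is taken to be GitHub's public REST endpoint for a user's gists, and the sample dictionary is made-up data in the shape that API is assumed to return.

```python
import json
from urllib.request import urlopen

class GistFile:
    """A file in a gist, wrapping one entry of the gist's "files" mapping."""
    def __init__(self, data):
        self.data = data

    def get_name(self):
        return self.data["filename"]

    def get_content(self):
        # Assumption: "content" may be absent, so fall back to None.
        return self.data.get("content")

class Gist:
    def __init__(self, data):
        self.data = data

    def get_description(self):
        """Returns the description of the gist."""
        return self.data["description"]

    def get_files(self):
        """Returns the files of the gist."""
        return [GistFile(f) for f in self.data["files"].values()]

class GithubUser:
    def __init__(self, username):
        self.username = username

    def list_gists(self):
        """Return all the gists of this user (needs network access)."""
        # Assumption: public endpoint, unauthenticated, first page only.
        url = f"https://api.github.com/users/{self.username}/gists"
        with urlopen(url) as response:
            return [Gist(d) for d in json.load(response)]

# Made-up sample data, so the classes can be tried without the network.
sample = {
    "description": "hello world",
    "files": {
        "hello.py": {"filename": "hello.py", "content": "print('hello')"},
    },
}
gist = Gist(sample)
print(gist.get_description())
for f in gist.get_files():
    print(f.get_name())
```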