Session 9
- Working with File System
- Classes and Objects
Classes and Objects
name = "Mathematics"
name.lower()
'mathematics'
name.upper()
'MATHEMATICS'
name.lower().replace("mat", "rat").title()
'Ratheratics'
type(name)
str
f = open("files/five.txt")
f.read().split()
['one', 'two', 'three', 'four', 'five']
open("files/five.txt").readlines()
['one\n', 'two\n', 'three\n', 'four\n', 'five\n']
x = [1, 2, 3]
x
[1, 2, 3]
x.append(4)
x
[1, 2, 3, 4]
x.append(5)
x
[1, 2, 3, 4, 5]
Methods have two benefits over plain functions:
- method calls can be chained
- an object can maintain some state and update it
name.lower().replace("mat", "rat").title()
'Ratheratics'
If these were functions, we would have to nest the calls and read them from the inside out:
title(replace(lower(name), "mat", "rat"))
With methods, the calls read from left to right:
name.lower().replace("mat", "rat").title()
'Ratheratics'
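To make the comparison concrete, here is a minimal sketch. The functions lower, replace and title are hypothetical wrappers written only for this illustration; Python does not provide them as built-ins.

```python
# Hypothetical function versions of three str methods (not part of Python).
def lower(s):
    return s.lower()

def replace(s, old, new):
    return s.replace(old, new)

def title(s):
    return s.title()

name = "Mathematics"

# Nested function calls are read from the inside out.
nested = title(replace(lower(name), "mat", "rat"))

# Chained method calls are read from left to right.
chained = name.lower().replace("mat", "rat").title()

print(nested)   # Ratheratics
print(chained)  # Ratheratics
```

Both forms give the same result. Chaining works because each method returns a new string, which is the object the next method is called on.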
Example: The pathlib module
Let’s solve the problem of printing the size of every file in a directory.
We’ll first try using the os module.
We can find the size of a file using os.path.getsize() or os.stat().
!ls files/images
bangalore-1.jpg bangalore-2.jpg bangalore-3.jpg bangalore-4.jpg README.md
import os
os.path.getsize("files/images/bangalore-1.jpg")
2084269
os.stat("files/images/bangalore-1.jpg")
os.stat_result(st_mode=33188, st_ino=2606145, st_dev=2048, st_nlink=1, st_uid=1003, st_gid=1005, st_size=2084269, st_atime=1698292567, st_mtime=1677207454, st_ctime=1698291066)
os.stat("files/images/bangalore-1.jpg").st_size
2084269
!ls -l files/images/bangalore-1.jpg
-rw-r--r-- 1 jupyter-anand jupyter-anand 2084269 Feb 24 2023 files/images/bangalore-1.jpg
Now let’s print the size of every file in a directory.
path = "files/images"
os.listdir(path)
['README.md',
 'bangalore-1.jpg',
 'bangalore-4.jpg',
 'bangalore-3.jpg',
 'bangalore-2.jpg']
for f in os.listdir(path):
    p = os.path.join(path, f)
    size = os.path.getsize(p)
    print(size, f)
409 README.md
2084269 bangalore-1.jpg
1898997 bangalore-4.jpg
785897 bangalore-3.jpg
1276973 bangalore-2.jpg
os.listdir() returns only the filenames. It is our responsibility to join each one with the directory to get a full path.
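The sketch below shows why the join step is needed. It uses a throwaway temporary directory and a made-up file hello.txt, so that it runs anywhere.

```python
import os
import tempfile

# Build a throwaway directory with one file so the example is self-contained.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "hello.txt"), "w") as f:
        f.write("hello")

    # os.listdir gives bare names, not paths.
    names = os.listdir(root)

    # Join each name with the directory before asking for its size.
    full_paths = [os.path.join(root, n) for n in names]
    sizes = {n: os.path.getsize(p) for n, p in zip(names, full_paths)}

print(names)   # ['hello.txt']
print(sizes)   # {'hello.txt': 5}
```

Calling os.path.getsize("hello.txt") with the bare name would look in the current working directory instead, and would fail unless a file of that name happened to exist there.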
Introduction to the pathlib module
The pathlib module provides the same functionality with a class-based API.
from pathlib import Path
root = Path("files/images")
root.is_dir()
True
root.absolute()
PosixPath('/home/jupyter-anand/book/files/images')
root.parent
PosixPath('files')
p = root.joinpath("bangalore-1.jpg")
p.is_file()
True
p.stat().st_size
2084269
p = root / "bangalore-1.jpg"
p
PosixPath('files/images/bangalore-1.jpg')
p.name
'bangalore-1.jpg'
p.suffix
'.jpg'
p.stem
'bangalore-1'
f = root / "README.md"
f.read_text()
'# Images\n\nImages are from Unsplash.\n\nbangalore-1.jpg\nPhoto by Meriç Dağlı on Unsplash \nhttps://unsplash.com/photos/nNTVWEKP3Yc\n\nbangalore-2.jpg\nPhoto by Nishanth Avva on Unsplash\nhttps://unsplash.com/photos/kCxQtSjhAJQ\n\nbangalore-3.jpg\nPhoto by Chandan Chaurasia on Unsplash\nhttps://unsplash.com/photos/fuLPFeAd17E\n\nbangalore-4.jpg\nPhoto by Niiimmmmiiiii on Unsplash\nhttps://unsplash.com/photos/IXJzQtt6vwg'
for p in root.iterdir():
    print(p)
files/images/README.md
files/images/bangalore-1.jpg
files/images/bangalore-4.jpg
files/images/bangalore-3.jpg
files/images/bangalore-2.jpg
for p in root.glob("*.jpg"):
    print(p)
files/images/bangalore-1.jpg
files/images/bangalore-4.jpg
files/images/bangalore-3.jpg
files/images/bangalore-2.jpg
path = Path("test-dir")
path.exists()
False
path.mkdir()
path.mkdir(exist_ok=True)
path.owner()
'jupyter-anand'
path.group()
'jupyter-anand'
!ls -ld test-dir
drwxr-xr-x 2 jupyter-anand jupyter-anand 4096 Oct 26 04:26 test-dir
path.stat()
os.stat_result(st_mode=16877, st_ino=2637973, st_dev=2048, st_nlink=2, st_uid=1003, st_gid=1005, st_size=4096, st_atime=1698294413, st_mtime=1698294413, st_ctime=1698294413)
Let’s try solving the file size problem with pathlib.
path = Path("files/images")
for p in path.iterdir():
    size = p.stat().st_size
    print(size, p.name)
409 README.md
2084269 bangalore-1.jpg
1898997 bangalore-4.jpg
785897 bangalore-3.jpg
1276973 bangalore-2.jpg
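The same idea can be wrapped in a small reusable function. This is only a sketch: the name file_sizes, the sorting by name, and the choice to skip subdirectories are assumptions made for this illustration, and the files are created in a temporary directory.

```python
import tempfile
from pathlib import Path

def file_sizes(directory):
    """Return (size, name) pairs for the regular files in directory, sorted by name."""
    root = Path(directory)
    return sorted(
        ((p.stat().st_size, p.name) for p in root.iterdir() if p.is_file()),
        key=lambda pair: pair[1],
    )

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.txt").write_text("abc")
    (root / "b.txt").write_text("hello")
    (root / "sub").mkdir()  # directories are skipped by the is_file() check
    result = file_sizes(root)

for size, name in result:
    print(size, name)
# 3 a.txt
# 5 b.txt
```

Because iterdir() yields Path objects that already include the directory, there is no separate join step as there was with os.listdir.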
Problem: Creating thumbnails
from PIL import Image
img = Image.open("files/images/bangalore-1.jpg")
img.size
(1908, 3391)
img.thumbnail((400, 400))
img.size
(225, 400)
img
img.save("a.jpg")
!ls -l a.jpg
-rw-r--r-- 1 jupyter-anand jupyter-anand 28963 Oct 26 04:35 a.jpg
%load_problem thumbnails
Write a program thumbnails.py to create thumbnails of all images in a directory.
The program should take the path to a directory as an argument, create a thumbnail of size 400x400 for each image, and write it into the output directory with the same filename. The output directory should be thumbnails/ by default, but it should be possible to specify a different one using the flag -d or --output-directory.
For simplicity, assume that all the images have the extension .jpg.
$ ls files/images
bangalore-1.jpg
bangalore-2.jpg
bangalore-3.jpg
bangalore-4.jpg
README.md
$ python thumbnails.py files/images
created thumbnails/bangalore-1.jpg
created thumbnails/bangalore-2.jpg
created thumbnails/bangalore-3.jpg
created thumbnails/bangalore-4.jpg
$ python thumbnails.py files/images -d small
created small/bangalore-1.jpg
created small/bangalore-2.jpg
created small/bangalore-3.jpg
created small/bangalore-4.jpg
Hint
Use the Python Imaging Library (PIL) to resize the images.
You can verify your solution using:
%verify_problem thumbnails
%%file thumbnails.py
# your code here
import argparse
from pathlib import Path
p = argparse.ArgumentParser()
p.add_argument("-d", "--output-directory", help="Output directory for thumbnails", default="thumbnails", type=Path)
p.add_argument("imgdir", help="Directory with images", type=Path)
args = p.parse_args()
for p in args.imgdir.iterdir():
    print(p)
Overwriting thumbnails.py
!python thumbnails.py files/images
files/images/README.md
files/images/bangalore-1.jpg
files/images/bangalore-4.jpg
files/images/bangalore-3.jpg
files/images/bangalore-2.jpg
!python thumbnails.py -d small files/images
files/images/README.md
files/images/bangalore-1.jpg
files/images/bangalore-4.jpg
files/images/bangalore-3.jpg
files/images/bangalore-2.jpg
path = Path("small")
p = Path("files/images/bangalore-1.jpg")
path / p.name
PosixPath('small/bangalore-1.jpg')
path.joinpath(p.name)
PosixPath('small/bangalore-1.jpg')
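The two pieces explored so far, parsing the arguments and computing the output path, could be combined roughly as follows. This is a partial sketch, not a full solution: it leaves out the image resizing, and the helper names parse_args and output_path are hypothetical. Passing a list to parse_args lets us try it without a command line.

```python
import argparse
from pathlib import Path

def parse_args(argv=None):
    """Parse the command-line arguments described in the problem statement."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-d", "--output-directory", type=Path,
                        default=Path("thumbnails"),
                        help="Output directory for thumbnails")
    parser.add_argument("imgdir", type=Path, help="Directory with images")
    return parser.parse_args(argv)

def output_path(image_path, output_directory):
    """Hypothetical helper: keep the filename, swap the directory."""
    return output_directory / image_path.name

args = parse_args(["files/images", "-d", "small"])
target = output_path(Path("files/images/bangalore-1.jpg"), args.output_directory)
print(target)  # small/bangalore-1.jpg on POSIX systems
```

argparse derives the attribute name output_directory from the long flag --output-directory, replacing the hyphen with an underscore.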
Introduction to Classes
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def getx(self):
        return self.x
    def __repr__(self):
        return f"<Point({self.x}, {self.y})>"
    def __str__(self):
        return f"({self.x}, {self.y})"
p = Point(10, 20)
p
<Point(10, 20)>
p.getx()
10
p.x
10
print(p)
(10, 20)
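The class above defines both __repr__ and __str__. The following sketch, which repeats a trimmed copy of the class so that it runs on its own, shows which one Python picks in a few common situations.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"<Point({self.x}, {self.y})>"

    def __str__(self):
        return f"({self.x}, {self.y})"

p = Point(10, 20)

print(repr(p))   # <Point(10, 20)>   what the notebook shows for a bare `p`
print(str(p))    # (10, 20)          what print(p) shows
print(f"{p}")    # (10, 20)          f-strings use __str__ by default
print([p, p])    # [<Point(10, 20)>, <Point(10, 20)>]  containers use __repr__
```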
More methods
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def getx(self):
        return self.x
    def add(self, p):
        """Adds a point to this point and returns a new Point.
        """
        x = self.x + p.x
        y = self.y + p.y
        return Point(x, y)
    def __repr__(self):
        return f"<Point({self.x}, {self.y})>"
    def __str__(self):
        return f"({self.x}, {self.y})"
p1 = Point(3, 4)
p2 = Point(30, 40)
p1.add(p2)
<Point(33, 44)>
p1.add(p2).add(p2)
<Point(63, 84)>
p1.add?
Signature: p1.add(p)
Docstring: Adds a point to this point and returns a new Point.
File:      /tmp/ipykernel_1709321/425117471.py
Type:      method
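One consequence of add returning a new Point is worth spelling out: the original points are left unchanged, which is what makes chaining predictable. A minimal sketch, using a trimmed copy of the class:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def add(self, p):
        """Adds a point to this point and returns a new Point."""
        return Point(self.x + p.x, self.y + p.y)

    def __repr__(self):
        return f"<Point({self.x}, {self.y})>"

p1 = Point(3, 4)
p2 = Point(30, 40)

total = p1.add(p2).add(p2)

print(total)  # <Point(63, 84)>
print(p1)     # <Point(3, 4)>    p1 is unchanged
print(p2)     # <Point(30, 40)>  p2 is unchanged
```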
%load_problem point-double
Add a method double to the Point class that returns a new Point with both the x and y coordinates doubled.
You can start with this code for the Point class.
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def getx(self):
        return self.x
    def __repr__(self):
        return f"Point({self.x}, {self.y})"
    def __str__(self):
        return f"({self.x}, {self.y})"
The expected behavior:
>>> p = Point(2, 3)
>>> p2 = p.double()
>>> p2
Point(4, 6)
>>> p.double().double()
Point(8, 12)
You can verify your solution using:
%verify_problem point-double
# your code here
Example: GitHub Gists
%%file gist_v1.py
"""
Python library to work with GitHub gists.
This module provides functions to work with GitHub gists.
    for id in list_gists("PipalBot"):
        g = get_gist(id)
        print(g['description'])
"""
def list_gists(user):
    """Returns ids of all the gists of a given GitHub user.
    """
def get_gist(id):
    """Returns the gist of given id.
    """
Overwriting gist_v1.py
%%file gist_v2.py
"""
Python library to work with GitHub gists.
This module provides classes to work with GitHub gists.
    user = GithubUser("PipalBot")
    for g in user.list_gists():
        print(g.get_description())
        for f in g.get_files():
            print(f.get_name())
"""
class GithubUser:
    def __init__(self, username):
        self.username = username
    def list_gists(self):
        """Return all the gists of this user.
        """
class Gist:
    def get_description(self):
        """Returns the description of the gist.
        """
    def get_files(self):
        """Returns the files of the gist.
        """
class GistFile:
    """A file in a gist.
    """
    def get_name(self):
        pass
    def get_content(self):
        pass
Overwriting gist_v2.py
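The stubs above leave the method bodies empty. As one possible way to fill in Gist and GistFile, here is a sketch backed by a plain dictionary. The constructors, the dictionary layout and the sample data are all assumptions made for this illustration; nothing is fetched from GitHub, and GithubUser.list_gists is left out because it would need network access.

```python
class GistFile:
    """A file in a gist."""

    def __init__(self, name, content):
        self._name = name
        self._content = content

    def get_name(self):
        return self._name

    def get_content(self):
        return self._content


class Gist:
    def __init__(self, data):
        # Assumed layout: {"description": str, "files": {name: {"content": str}}}
        self._data = data

    def get_description(self):
        """Returns the description of the gist."""
        return self._data["description"]

    def get_files(self):
        """Returns the files of the gist as GistFile objects."""
        return [GistFile(name, info["content"])
                for name, info in self._data["files"].items()]


# Made-up sample data, used only to exercise the classes.
sample = {
    "description": "hello gist",
    "files": {"hello.py": {"content": "print('hello')"}},
}

g = Gist(sample)
print(g.get_description())   # hello gist
for f in g.get_files():
    print(f.get_name())      # hello.py
```

The calling code looks the same as in the module docstring of gist_v2.py: callers use methods such as get_description() and get_name() and never touch the underlying dictionary.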