March 27-31, 2017
Anand Chitipothu
These notes are available online at https://notes.pipal.in/2017/symantec
© Pipal Academy LLP
%%file mymodule.py
print("BEGIN mymodule")
x = 2
def add(a, b):
    return a+b
print(add(3, 4))
print("END mymodule")
!python mymodule.py
%%file a.py
import mymodule
print(mymodule.x)
print(mymodule.add(10, 20))
!python a.py
__name__ magic variable
%%file mymodule2.py
x = 2
def add(a, b):
    return a+b
print(add(3, 4))
print(__name__)
!python mymodule2.py
# ask python to import mymodule2
!python -c "import mymodule2"
When the file is executed as a script, the value of __name__ is set to "__main__". When the file is imported as a module, __name__ is set to the module name.
%%file mymodule3.py
x = 2
def add(a, b):
    return a+b
if __name__ == "__main__":
    # run the following code only when this file is executed as a script.
    # ignore this when this file is imported as a module.
    print(add(3, 4))
!python mymodule3.py
!python -c "import mymodule3; print(mymodule3.add(10, 20))"
%%file sq.py
"""The square module.

The long description of the module after one empty line.
"""
import sys
def square(n):
    """Computes square of a number.

    >>> square(4)
    16
    """
    return n*n
def main():
    n = int(sys.argv[1])
    print(square(n))
if __name__ == "__main__":
    main()
!python sq.py 3
help("sq")
import sq
sq.square(123)
def square(x):
    return x*x
print(square(3))
It is hard to know if the output is correct or not.
def square(x):
    return x*x
if square(3) == 9:
    print("OK")
Python has an assert statement to make sure something is True.
def square(x):
    return x*x
def test_square():
    assert square(3) == 9
    assert square(-3) == 90
test_square()
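When an assertion fails, Python raises an AssertionError, as the second check in test_square above does (the square of -3 is 9, not 90). An assert can also carry a message that is shown on failure. A minimal sketch; the check helper and the message text are only for illustration:

```python
def square(x):
    return x*x

def check(value, expected):
    # the text after the comma becomes the AssertionError message
    assert square(value) == expected, "square({}) is not {}".format(value, expected)

check(3, 9)  # passes silently

try:
    check(-3, 90)
except AssertionError as e:
    failure = str(e)
    print("FAILED:", failure)
```

Note that assert statements are skipped when Python is run with the -O option, so they suit tests and sanity checks rather than input validation.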
Python has some tools to make testing easier.
%%file sq2.py
def square(x):
    return x*x
def sum_of_squares(x, y):
    return square(x) + square(y)
def test_square():
    assert square(3) == 9
    assert square(-3) == 9
def test_sum_of_squares():
    assert sum_of_squares(0, 0) == 0
    assert sum_of_squares(3, 4) == 25
!py.test sq2.py
!py.test -v sq2.py
# run only the tests that have the keyword "sum" in their name
!py.test -v -k sum sq2.py
The py.test utility can be installed using:
pip install pytest
You may have to use sudo for it.
names = ["alice", "dave", "bob", "charlie"]
names.sort() # sorts in-place
names
names = ["alice", "dave", "bob", "charlie"]
sorted(names)
The sorted function returns a new sorted list and does not modify the original list.
sorted_names = sorted(names)
print(sorted_names)
How to sort these names by length?
# this is not what we want
sorted([len(name) for name in names])
sorted(names, key=len)
Let us say, we have records of students containing name and marks.
records = [
    ("A", 80),
    ("B", 37),
    ("C", 98),
    ("D", 72)
]
How to sort these records by marks?
def get_marks(record):
    print("get_marks", record)
    return record[1]
sorted(records, key=get_marks)
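The key function is called once for each record, and the records are ordered by the values it returns. A small sketch that combines key with the reverse argument of sorted to list the highest marks first (the variable names by_marks and top_first are only for illustration):

```python
records = [("A", 80), ("B", 37), ("C", 98), ("D", 72)]

def get_marks(record):
    return record[1]

# lowest marks first
by_marks = sorted(records, key=get_marks)
# highest marks first
top_first = sorted(records, key=get_marks, reverse=True)

print(by_marks)
print(top_first)
```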
We can use this even to sort files by size etc.
import os
files = sorted(os.listdir("."), key=os.path.getsize)
for f in files:
    print(f)
ls -Sr | tail
Problem: Write a function isorted to sort given names ignoring the case.
>>> isorted(["A", "b", "d", "C"])
['A', 'b', 'C', 'd']
sorted(["A", "b", "d", "C"])
def isorted(names):
    return sorted(names, key=ignorecase)
def ignorecase(name):
    print("ignorecase", name)
    # FIXME
    return name.upper()
isorted(["A", "b", "d", "C"])
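A more compact variant, shown as one possible solution rather than the only one: since the key only has to map each name to a case-insensitive form, the built-in str.lower method can be passed directly as the key.

```python
def isorted(names):
    # compare the lowercased form of each name;
    # the original strings are what end up in the result
    return sorted(names, key=str.lower)

result = isorted(["A", "b", "d", "C"])
print(result)
```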
x = "hello"
for c in x:
    print(c)
x[0]
x[1]
x[:4]
max("helloworld")
line = "one\n"
line.strip() # remove all whitespace from both ends
" hello \n".strip()
" hello \n".strip("\n") # strip only the new line character
Q: How to replace only the first space?
"1 2 3 4".replace(" ", "-")
"1 2 3 4".replace(" ", "-", 1)
name = "Python"
message = "Hello {}".format(name)
print(message)
"chapter {}: {}".format(1, "Getting Started")
Sometimes we may want to use the same value multiple times in the pattern.
t = "chapter {}: {}\nContents of {} will come here."
print(t.format(1, "Getting Started", "Getting Started"))
t = "chapter {0}: {1}\nContents of {1} will come here."
print(t.format(1, "Getting Started"))
t = "chapter {number}: {title}\nContents of {title} will come here."
print(t.format(number=1, title="Getting Started"))
Let us look at another example.
def make_link(url):
    return '<a href="{url}">{url}</a>'.format(url=url)
make_link("https://www.google.com/")
Another example:
email = """
Dear {name},
As you've requested, we've reset your password. Please use the following link
to reset your password.
http://mywebsite.com/reset-password?code={code}
Thanks,
Our Team
"""
def send_email(to, message):
    # TODO
    print(message)
message = email.format(name="Alice", code='123456789')
send_email("hello@example.com", message)
%%file three.txt
one
two
three
f = open("three.txt")
f.read()
print(open("three.txt").read())
Remember that reading from a file can't be done again and again (unless you rewind the file position) using the same file object.
f = open("three.txt")
f.read()
f.read()
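Rewinding is done with the seek method of the file object. A minimal sketch; it writes its own copy of three.txt into a temporary directory (an assumption made only to keep the example self-contained):

```python
import os
import tempfile

# create a small file to read
path = os.path.join(tempfile.mkdtemp(), "three.txt")
with open(path, "w") as f:
    f.write("one\ntwo\nthree\n")

f = open(path)
first = f.read()    # reads everything
second = f.read()   # nothing left, returns an empty string
f.seek(0)           # move the file position back to the beginning
third = f.read()    # reads everything again
f.close()

print(repr(first))
print(repr(second))
print(repr(third))
```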
It is also possible to pass a size to read in order to read only a small chunk.
f = open("three.txt")
f.read(5)
f.read()
f.read()
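Reading with a size is typically done in a loop that stops when read returns an empty string, which signals the end of the file. A minimal sketch, assuming a chunk size of 5 characters and a file created just for this example:

```python
import os
import tempfile

# create a small file to read
path = os.path.join(tempfile.mkdtemp(), "three.txt")
with open(path, "w") as f:
    f.write("one\ntwo\nthree\n")

chunks = []
with open(path) as f:
    while True:
        chunk = f.read(5)   # read at most 5 characters
        if chunk == "":     # an empty string means end of file
            break
        chunks.append(chunk)

print(chunks)
```

This pattern is useful for files that are too big to read into memory in one go.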
The other common way to read a file is readlines.
open("three.txt").readlines()
lines = open("three.txt").readlines()
for line in lines:
    print(line)
Why is the extra line coming between the lines?
That is because each line has a newline character at the end and print adds another one.
We can solve it in two ways.
# remove the new line before printing.
for line in lines:
    print(line.strip("\n"))
# tell print to not add a new line
for line in lines:
    print(line, end="")
for n in [1, 2, 3, 4]:
    print(n, end="--")
Problem: Write a program cat.py that takes a filename as command-line argument and prints all the contents of the file.
$ python cat.py three.txt
one
two
three
%%file cat.py
import sys
filename = sys.argv[1]
contents = open(filename).read()
print(contents)
!python cat.py three.txt
Let us implement the unix word count program in Python. The program should print the line count, word count and char count for given filename.
%%file numbers.txt
1 one
2 two
3 three
4 four
5 five
%%file wc.py
"""Program to find line count, word count and char count of given file.
USAGE: python wc.py filename
"""
import sys
def linecount(f):
    return len(open(f).readlines())
def wordcount(f):
    return len(open(f).read().split())
def charcount(f):
    return len(open(f).read())
def main():
    f = sys.argv[1]
    print(linecount(f), wordcount(f), charcount(f), f)
if __name__ == "__main__":
    main()
!python wc.py numbers.txt
A file can be opened in write mode by specifying "w" as second argument.
f = open("a.txt", "w")
f.write("one\n")
f.write("two\n")
f.close()
open("a.txt").read()
When a file is opened in write mode, it gets overwritten if it already exists.
The contents written to the file are buffered and are guaranteed to be written out only when the file is closed, so it is very important to close the file after writing.
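If the data has to be written out while the file is still open, the flush method of the file object can be called. A minimal sketch, using a file in a temporary directory (an assumption made to keep the example self-contained):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "a.txt")

f = open(path, "w")
f.write("one\n")
f.flush()                        # write out the buffered data without closing
after_flush = open(path).read()  # the first line is already in the file

f.write("two\n")
f.close()                        # closing writes out whatever is still buffered
after_close = open(path).read()

print(repr(after_flush))
print(repr(after_close))
```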
To append to an existing file, open it in append ("a") mode.
f = open("a.txt", "a")
f.write("three\n")
f.close()
open("a.txt").read()
with Statement
The with statement is handy when writing to files as it takes care of closing the file automatically.
with open("b.txt", "w") as f:
    f.write("one\n")
    f.write("two\n")
# f gets closed automatically here
open("b.txt").read()
Q: How to insert a line in the middle of a file?
The simple answer is that it is not possible.
You need to create a new file, copy the first part, then the line to be inserted, and then the last part. Once that is done, move the new file over the old file.
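The steps above can be sketched as follows. The insert_line function and the ".tmp" suffix for the new file are hypothetical names chosen for this example, and the sketch assumes the given index is smaller than the number of lines in the file.

```python
import os
import tempfile

def insert_line(filename, index, new_line):
    """Insert new_line before the line at position index (0-based).

    Assumes index is smaller than the number of lines in the file.
    """
    tmp = filename + ".tmp"  # hypothetical name for the new file
    with open(filename) as src, open(tmp, "w") as dest:
        for i, line in enumerate(src):
            if i == index:
                dest.write(new_line)  # the line to be inserted
            dest.write(line)          # copy the existing line
    os.replace(tmp, filename)         # move the new file over the old one

# try it on a small file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "three.txt")
with open(path, "w") as f:
    f.write("one\ntwo\nthree\n")

insert_line(path, 1, "one and a half\n")
result = open(path).read()
print(result)
```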
Problem: Write a program copyfile.py to copy the contents of one file to another. The program should accept the paths of the source file and the destination file and copy the source file into the destination.
$ python copyfile.py numbers.txt numbers2.txt
WARNING: don't call this file copy.py as it interferes with a standard library module with the same name.
%%file copyfile.py
"""Program to copy files.
USAGE: python copyfile.py src.txt dest.txt
"""
import sys
def copyfile(src, dest):
    contents = open(src).read()
    with open(dest, "w") as f:
        f.write(contents)
def copyfile2(src, dest):
    with open(src) as f1, open(dest, "w") as f2:
        f2.write(f1.read())
def main():
    src = sys.argv[1]
    dest = sys.argv[2]
    copyfile(src, dest)
if __name__ == "__main__":
    main()
!python copyfile.py three.txt 3.txt
!cat 3.txt
To open a file in binary mode, use the mode "rb", "wb" or "ab" for read, write and append respectively.
When a file is opened in binary mode, read and readlines return bytes instead of strings.
open("a.txt", "r").read()
open("a.txt", "rb").read()
f = open("binary.bin", "wb")
f.write(b"\x12\x98")
f.close()
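Reading the same data back in "rb" mode gives a bytes object, and indexing it yields integers rather than one-character strings. A minimal sketch that writes binary.bin into a temporary directory to stay self-contained:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "binary.bin")
with open(path, "wb") as f:
    f.write(b"\x12\x98")

data = open(path, "rb").read()
print(type(data), len(data))
print(data[0], data[1])   # each item of a bytes object is an integer
values = list(data)
print(values)
```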
When you are working with text files, it might be useful to specify the encoding.
f = open("tamil.txt", "w", encoding="utf-8")
f.write("\u0b85\u0b86")
f.close()
open("tamil.txt", encoding="utf-8").read()
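The encoding decides how characters are turned into bytes on disk. A small sketch comparing the number of characters with the number of bytes for the same two Tamil letters, using a file in a temporary directory:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "tamil.txt")
text = "\u0b85\u0b86"

with open(path, "w", encoding="utf-8") as f:
    f.write(text)

chars = open(path, encoding="utf-8").read()  # decoded back to a string
raw = open(path, "rb").read()                # the bytes stored in the file

print(len(chars))  # number of characters
print(len(raw))    # number of bytes, each letter takes 3 bytes in utf-8
```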
import sys
# prints to stdout
print("hello")
# same as
print("hello", file=sys.stdout)
# write to stderr
print("ERROR: unable to connect to database", file=sys.stderr)
%%file a.csv
A1,B1,C1
A2,B2,C2
A3,B3,C3
A4,B4,C4
The CSV format has a lot of special cases, and there are standard library modules to parse it. But we'll parse it by hand just to get a feel for how easy such things are in Python.
open("a.csv").readlines()
# remove the new line char
[line.strip("\n") for line in open("a.csv").readlines()]
[line.strip("\n").split(",") for line in open("a.csv").readlines()]
To iterate over the lines of a file, we can just loop over the file instead of reading lines.
[line.strip("\n").split(",") for line in open("a.csv")]
def read_csv(filename):
    return [line.strip("\n").split(",") for line in open(filename)]
dataset = read_csv("a.csv")
dataset
# how to get the first row?
dataset[0]
# how to get first column?
[row[0] for row in dataset]
def get_column(dataset, column_index):
    return [row[column_index] for row in dataset]
get_column(dataset, 0)
get_column(dataset, 1)
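The hand-written parser above splits on every comma, so it would break on special cases such as a quoted field that itself contains a comma. For comparison, here is a sketch of the same read_csv written with the csv module from the standard library. The sample data with the quoted field is made up for this example.

```python
import csv
import os
import tempfile

# a small CSV file with one quoted field containing a comma
path = os.path.join(tempfile.mkdtemp(), "a.csv")
with open(path, "w", newline="") as f:
    f.write('A1,B1,C1\nA2,"B2, with comma",C2\n')

def read_csv(filename):
    # the csv module recommends opening files with newline=""
    with open(filename, newline="") as f:
        return list(csv.reader(f))

dataset = read_csv(path)
print(dataset)
print([row[1] for row in dataset])   # the second column
```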
Python has a tool called pip to install third-party libraries.
Python maintains a catalogue of all third-party libraries at https://pypi.python.org/
Any of these packages can be installed using pip.
!pip install requests