Python Training at VMWare - Day 3¶

Jan 17-19, 2018 Vikrant Patil

These notes are available online at http://notes.pipal.in/2018/vmware-jan-python

def column(data, c):
    rowcount = len(data)
    return [data[i][c] for i in range(rowcount)]

def reverse(items):
    return list(reversed(items))

def rotate90clockewise(data):
    colcount = len(data[0])
    return [reverse(column(data, i)) for i in range(colcount)]

def transpose(data):
    colcount = len(data[0])
    return [column(data, i) for i in range(colcount)]

def rotate90anticlockwise(data):
    return reverse(transpose(data))

data = [["A1","B1","C1"],
        ["A2","B2","C2"],
        ["A3","B3","C3"]]

transpose(data)

[['A1', 'A2', 'A3'], ['B1', 'B2', 'B3'], ['C1', 'C2', 'C3']]

rotate90clockewise(data)

[['A3', 'A2', 'A1'], ['B3', 'B2', 'B1'], ['C3', 'C2', 'C1']]

rotate90anticlockwise(data)

[['C1', 'C2', 'C3'], ['B1', 'B2', 'B3'], ['A1', 'A2', 'A3']]
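
The column-by-column transpose above can also be sketched with the built-in zip. This is only an alternative illustration, not the approach used in the session; the names transpose_zip and rotate_clockwise_zip are made up for this example.

```python
# Alternative sketch: transpose and rotate a list of rows using zip.
# transpose_zip and rotate_clockwise_zip are hypothetical names.

def transpose_zip(data):
    # zip(*data) groups the i-th items of every row, i.e. the columns
    return [list(col) for col in zip(*data)]

def rotate_clockwise_zip(data):
    # reversing the rows first and then transposing rotates 90 degrees clockwise
    return transpose_zip(data[::-1])

data = [["A1", "B1", "C1"],
        ["A2", "B2", "C2"],
        ["A3", "B3", "C3"]]

print(transpose_zip(data))
print(rotate_clockwise_zip(data))
```

Both calls should give the same lists as transpose and rotate90clockewise do for this data.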

Working with files¶

import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

%%file data.txt
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Writing data.txt

filehandle = open("data.txt")

filehandle.read() #this will read complete file as a string

"The Zen of Python, by Tim Peters\n\nBeautiful is better than ugly.\nExplicit is better than implicit.\nSimple is better than complex.\nComplex is better than complicated.\nFlat is better than nested.\nSparse is better than dense.\nReadability counts.\nSpecial cases aren't special enough to break the rules.\nAlthough practicality beats purity.\nErrors should never pass silently.\nUnless explicitly silenced.\nIn the face of ambiguity, refuse the temptation to guess.\nThere should be one-- and preferably only one --obvious way to do it.\nAlthough that way may not be obvious at first unless you're Dutch.\nNow is better than never.\nAlthough never is often better than *right* now.\nIf the implementation is hard to explain, it's a bad idea.\nIf the implementation is easy to explain, it may be a good idea.\nNamespaces are one honking great idea -- let's do more of those!"

filehandle.read()

''

filehandle.close()

filehandle = open("data.txt")

lines = filehandle.readlines()

for line in lines:
    print(line, end="")

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

for index,line in enumerate(lines):
    print(index+1, line, end="")

1 The Zen of Python, by Tim Peters
2 
3 Beautiful is better than ugly.
4 Explicit is better than implicit.
5 Simple is better than complex.
6 Complex is better than complicated.
7 Flat is better than nested.
8 Sparse is better than dense.
9 Readability counts.
10 Special cases aren't special enough to break the rules.
11 Although practicality beats purity.
12 Errors should never pass silently.
13 Unless explicitly silenced.
14 In the face of ambiguity, refuse the temptation to guess.
15 There should be one-- and preferably only one --obvious way to do it.
16 Although that way may not be obvious at first unless you're Dutch.
17 Now is better than never.
18 Although never is often better than *right* now.
19 If the implementation is hard to explain, it's a bad idea.
20 If the implementation is easy to explain, it may be a good idea.
21 Namespaces are one honking great idea -- let's do more of those!

filehandle = open("data.txt")

filehandle.readline()

'The Zen of Python, by Tim Peters\n'

filehandle.readline()

'\n'

filehandle.readline()

'Beautiful is better than ugly.\n'

count = 1
line = filehandle.readline()
while line!="":
    print(count, line, end="")
    line = filehandle.readline()
    count += 1

1 Simple is better than complex.
2 Complex is better than complicated.
3 Flat is better than nested.
4 Sparse is better than dense.
5 Readability counts.
6 Special cases aren't special enough to break the rules.
7 Although practicality beats purity.
8 Errors should never pass silently.
9 Unless explicitly silenced.
10 In the face of ambiguity, refuse the temptation to guess.
11 There should be one-- and preferably only one --obvious way to do it.
12 Although that way may not be obvious at first unless you're Dutch.
13 Now is better than never.
14 Although never is often better than *right* now.
15 If the implementation is hard to explain, it's a bad idea.
16 If the implementation is easy to explain, it may be a good idea.
17 Namespaces are one honking great idea -- let's do more of those!
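
The file handles above are opened and closed by hand. A common alternative is the with statement, which closes the file automatically. The sketch below assumes a small throwaway file (sample.txt in a temporary directory, a made-up name) so that it runs on its own.

```python
import os
import tempfile

# Create a small sample file so the example is self-contained.
# The file name sample.txt is an assumption for this sketch.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("first\nsecond\nthird\n")

# Iterating over the file object yields one line at a time,
# and the with block closes the file when it ends.
numbered = []
with open(path) as f:
    for number, line in enumerate(f, start=1):
        numbered.append((number, line.rstrip("\n")))
        print(number, line, end="")

print(f.closed)
```

enumerate with start=1 avoids the index+1 arithmetic used in the earlier loop.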

problems

Write a Python script cat.py that roughly implements the Unix command cat: it prints all the input files to standard output.
Write a Python script head.py that implements the Unix command head. It should take the number of lines as a command-line argument.
Write a Python script wc.py that implements the Unix command wc.

%%file cat.py
"""
cat module implements unix command cat, approximately
"""
import sys

def cat(file):
    """
    print the file to standard output
    """
    with open(file) as f:  # the with block closes the file when done
        for line in f:
            print(line, end="")
        
def catfiles(files):
    """
    print multiple files to standard output
    """
    for file in files:
        cat(file)
        
if __name__ == "__main__":
    catfiles(sys.argv[1:])

Writing cat.py

!python cat.py /home/vikrant/programming/explorations/python/argv.py data.txt

import sys

if __name__ == "__main__":
    print(sys.argv)
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

%%file head.py
"""
head module implements unix command head approximately
"""
import sys

def head(file, n):
    filehandle = open(file)
    line = filehandle.readline()
    count = 1
    while line and count<=n:
        print(line, end="")
        count += 1
        line = filehandle.readline()
        
if __name__ == "__main__":
    head(sys.argv[1], int(sys.argv[2]))

Overwriting head.py

!python head.py data.txt 5

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.

%%file wc.py
"""
wc module implements unix command wc approximately
"""
import sys

def word_count(file):
    words = open(file).read().split()
    return len(words)

def char_count(file):
    return len(open(file).read())

def line_count(file):
    return len(open(file).readlines())


if __name__ == "__main__":
    file = sys.argv[1]
    print(line_count(file), word_count(file), char_count(file), file)

Writing wc.py

!python wc.py data.txt

21 144 856 data.txt
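
wc.py above reads the file three times, once per count. A possible single-pass variant is sketched below; the function name count_all and the sample file are assumptions made for this example.

```python
import os
import tempfile

def count_all(file):
    """Return (lines, words, characters) for a text file in one pass."""
    lines = words = chars = 0
    with open(file) as f:
        for line in f:
            lines += 1
            words += len(line.split())
            chars += len(line)
    return lines, words, chars

# Hypothetical sample file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("one two\nthree\n")

print(count_all(path))
```

Like char_count above, this counts characters rather than bytes, so the last number can differ from the Unix wc for non-ASCII text.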

What about finding the file with the largest line count in the current directory?

Or the file with the largest number of words?

import os

files = [file for file in os.listdir(os.getcwd()) if os.path.isfile(file)]

files

['wc.py',
 'push',
 'day1.ipynb',
 'head.py',
 'add.p',
 'myscript.py',
 'links.txt',
 'day2.ipynb',
 'echo.py',
 'day2.html',
 'Untitled.html',
 'links.txt~',
 'hello.py',
 'mymodule.py',
 'hello1.py',
 'moddule1.py',
 'day3.ipynb',
 'functions.py',
 'data.txt',
 'module1.py',
 'module2.py',
 'day3.html',
 'Makefile',
 'add.py',
 'cat.py',
 'day1.html',
 'ls.py']

import wc
count = 0
f = files[0]
for file in files:
    lcount = wc.line_count(file)
    if lcount > count:
        count = lcount
        f = file
print(f, count)

day1.html 22182

max(files, key=wc.line_count)

'day1.html'

max(files, key=wc.word_count)

'day1.html'
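
max with a key function returns the item for which the key gives the largest value, which is what the explicit loop above does by hand. A small sketch with made-up data:

```python
# max(items, key=f) returns the item whose f(item) is largest.
words = ["pear", "fig", "banana", "kiwi"]

longest = max(words, key=len)      # compare by length, not alphabetically
shortest = min(words, key=len)     # min accepts a key function as well

print(longest, shortest)
```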

Writing files¶

file = open("numbers.txt", "w")

file.write("one\n")
file.write("two\n")
file.write("three\n")

6

file.close()

!python cat.py numbers.txt

one
two
three

file = open("numbers.txt", "r+") # read and write at the same time; the file must already exist
file.write("1\n")
file.write("2\n")
file.flush()

file.readline()

'two\n'

file.readline()

'three\n'

file.readline()

''

file.close()

!python cat.py numbers.txt

1
2
two
three

file  = open("numbers.txt","a")
file.write("end\n")

4

file.close()

!python cat.py numbers.txt

1
2
two
three
end
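
The modes used so far can be compared side by side. The sketch below is a self-contained recap on a throwaway file (modes.txt is a made-up name), using with so that every handle is closed.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")  # hypothetical file name

with open(path, "w") as f:      # "w" creates the file or truncates it
    f.write("one\n")

with open(path, "a") as f:      # "a" always writes at the end
    f.write("two\n")

with open(path, "r+") as f:     # "r+" reads and writes, starting at offset 0
    f.write("1")                # overwrites the first character in place

with open(path) as f:           # the default mode is "r" (read text)
    content = f.read()

print(repr(content))
```

Note that "r+" overwrites existing characters rather than inserting new ones, which is why the earlier example lost the line "one".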

file = open("binary.bin", "wb")
file.write(b"\x65\x69")
file.close()

file = open("binary.bin", "rb")
file.read()

b'ei'

b = b"binary"

type(b)

bytes

b.decode()

'binary'

"string".encode()

b'string'

file = open("regional.txt", "w", encoding="utf-8")
file.write("हाआ")
file.close()

file = open("regional.txt", encoding="utf-8")
file.read()

'हाआ'
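
The encoding argument decides how characters are turned into bytes on disk. The same conversion can be tried directly on a string, as in this short sketch:

```python
text = "हाआ"

encoded = text.encode("utf-8")       # str -> bytes
decoded = encoded.decode("utf-8")    # bytes -> str

# In UTF-8 a non-ASCII character takes more than one byte,
# so the byte count is larger than the character count.
print(len(text), len(encoded))
print(decoded == text)
```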

Examples¶

Write csv files
Parse csv files

tables = [[str(i*j) for i in range(1,6)] for j in range(1,11) ]

tables

[['1', '2', '3', '4', '5'],
 ['2', '4', '6', '8', '10'],
 ['3', '6', '9', '12', '15'],
 ['4', '8', '12', '16', '20'],
 ['5', '10', '15', '20', '25'],
 ['6', '12', '18', '24', '30'],
 ['7', '14', '21', '28', '35'],
 ['8', '16', '24', '32', '40'],
 ['9', '18', '27', '36', '45'],
 ['10', '20', '30', '40', '50']]

def csvwriter(data, filename):
    file = open(filename, "w")
    for row in data:
        file.write(",".join(row))
        file.write("\n")
    file.close()
    

def columndatawriter(data, delimiter, filename):
    with open(filename, "w") as f:
        for row in data:
            f.write(delimiter.join(row))
            f.write("\n")

csvwriter = lambda data, file: columndatawriter(data, ",", file)

def csvwriter(data, file):
    return columndatawriter(data, ",", file)

tsvwriter = lambda data, file: columndatawriter(data, "\t", file)

csvwriter(tables, "tables.csv")

!python cat.py tables.csv

1,2,3,4,5
2,4,6,8,10
3,6,9,12,15
4,8,12,16,20
5,10,15,20,25
6,12,18,24,30
7,14,21,28,35
8,16,24,32,40
9,18,27,36,45
10,20,30,40,50

csvwriter(tables, "tables.csv")

!python cat.py tables.csv

1,2,3,4,5
2,4,6,8,10
3,6,9,12,15
4,8,12,16,20
5,10,15,20,25
6,12,18,24,30
7,14,21,28,35
8,16,24,32,40
9,18,27,36,45
10,20,30,40,50

tsvwriter(tables, "tables.tsv")

!python cat.py tables.tsv

1	2	3	4	5
2	4	6	8	10
3	6	9	12	15
4	8	12	16	20
5	10	15	20	25
6	12	18	24	30
7	14	21	28	35
8	16	24	32	40
9	18	27	36	45
10	20	30	40	50
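
The hand-written writers above are fine for simple values. When a field may itself contain the delimiter or quotes, the standard library csv module takes care of the quoting. A minimal sketch, assuming a throwaway file name and made-up rows:

```python
import csv
import os
import tempfile

rows = [["name", "quote"],
        ["Tim", "Simple, not complex"]]   # second field contains a comma

path = os.path.join(tempfile.mkdtemp(), "quotes.csv")  # hypothetical file

# newline="" is what the csv documentation recommends when opening files
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(rows)

with open(path, newline="") as f:
    parsed = list(csv.reader(f))

print(parsed)
```

csv.writer also accepts a delimiter argument, so a tab-separated variant would pass delimiter="\t".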

Writing to standard error and standard output¶

import sys

sys.stderr.write("Error: Something went wrong")

Error: Something went wrong

sys.stdout.write("This is just for information..")

This is just for information..
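
print can write to either stream through its file argument, which also adds the trailing newline that write leaves out. A small sketch; the helper name report is made up for this example.

```python
import sys

def report(message, error=False):
    """Print to standard error for errors, standard output otherwise."""
    stream = sys.stderr if error else sys.stdout
    print(message, file=stream)   # print adds the trailing newline
    return stream

report("This is just for information..")
report("Error: Something went wrong", error=True)
```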

Working with dictionaries¶

author = {"name":"Lewis Carrol",
         "books":["Alice in wonderland", "Looking through the glass"],
         "language":"english"}

author

{'books': ['Alice in wonderland', 'Looking through the glass'],
 'language': 'english',
 'name': 'Lewis Carrol'}

author["name"]

'Lewis Carrol'

author["books"]

['Alice in wonderland', 'Looking through the glass']

author["country"] = "UK"

author

{'books': ['Alice in wonderland', 'Looking through the glass'],
 'country': 'UK',
 'language': 'english',
 'name': 'Lewis Carrol'}

del author['country']

author

{'books': ['Alice in wonderland', 'Looking through the glass'],
 'language': 'english',
 'name': 'Lewis Carrol'}

d = {True:1, False:0}

d

{False: 0, True: 1}

[[d[i==j] for i in range(5)] for j in range(5)]

[[1, 0, 0, 0, 0],
 [0, 1, 0, 0, 0],
 [0, 0, 1, 0, 0],
 [0, 0, 0, 1, 0],
 [0, 0, 0, 0, 1]]

author['country']

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-126-e6bf6cefc7cc> in <module>()
----> 1 author['country']

KeyError: 'country'

author.get("country", "UK")

'UK'

author

{'books': ['Alice in wonderland', 'Looking through the glass'],
 'language': 'english',
 'name': 'Lewis Carrol'}

del author['books']

author

{'language': 'english', 'name': 'Lewis Carrol'}

author.get("books", [])

[]

"name" in author

True

"books" in author

False

"language" in author

True

"english" in author

False

author

{'language': 'english', 'name': 'Lewis Carrol'}

"english" in author.values()

True

author.values()

dict_values(['Lewis Carrol', 'english'])

author.keys()

dict_keys(['name', 'language'])

author.items()

dict_items([('name', 'Lewis Carrol'), ('language', 'english')])

My GRUB configuration file looks like this:

%%file grub.cnf
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
#   info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_HIDDEN_TIMEOUT=0
GRUB_HIDDEN_TIMEOUT_QUIET=true
GRUB_TIMEOUT=10
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash modlbs=off wifi=off"
GRUB_CMDLINE_LINUX=""

# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"

# Uncomment to disable graphical terminal (grub-pc only)
#GRUB_TERMINAL=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
#GRUB_GFXMODE=640x480

# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true

# Uncomment to disable generation of recovery mode menu entries
#GRUB_DISABLE_RECOVERY="true"

# Uncomment to get a beep at grub start
#GRUB_INIT_TUNE="480 440 1"

Writing grub.cnf

def parsegrubconf(conffile):
    grubconf = {}
    with open(conffile) as conf:
        for line in conf:
            if line.startswith("#") or line.strip()=="":
                continue
            else:
                tokens = line.strip().split("=")
                grubconf[tokens[0]] = "=".join(tokens[1:])
    return grubconf

conf = parsegrubconf("grub.cnf")

conf['GRUB_CMDLINE_LINUX']

'""'

conf['GRUB_TIMEOUT']

'10'

conf

{'GRUB_CMDLINE_LINUX': '""',
 'GRUB_CMDLINE_LINUX_DEFAULT': '"quiet splash modlbs=off wifi=off"',
 'GRUB_DEFAULT': '0',
 'GRUB_DISTRIBUTOR': '`lsb_release -i -s 2> /dev/null || echo Debian`',
 'GRUB_HIDDEN_TIMEOUT': '0',
 'GRUB_HIDDEN_TIMEOUT_QUIET': 'true',
 'GRUB_TIMEOUT': '10'}

s = "hello world few more words"

help(s.split)

Help on built-in function split:

split(...) method of builtins.str instance
    S.split(sep=None, maxsplit=-1) -> list of strings
    
    Return a list of the words in S, using sep as the
    delimiter string.  If maxsplit is given, at most maxsplit
    splits are done. If sep is not specified or is None, any
    whitespace string is a separator and empty strings are
    removed from the result.

s.split(maxsplit=1)

['hello', 'world few more words']
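
With maxsplit, the split-then-join step in parsegrubconf can be written as a single split on the first "=". The sketch below works on a list of lines instead of a file so that it runs standalone; parse_lines is a hypothetical name, and it assumes every non-comment line contains an "=".

```python
def parse_lines(lines):
    """Parse KEY=VALUE lines, skipping comments and blank lines."""
    conf = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # split only on the first "=", so values may contain "=" themselves
        key, value = line.split("=", maxsplit=1)
        conf[key] = value
    return conf

sample = ['# a comment',
          '',
          'GRUB_TIMEOUT=10',
          'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash modlbs=off wifi=off"']
print(parse_lines(sample))
```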

Word frequency example¶

%%file words.txt
one
one two
one two three
one two three four
one two three four five
one two three five six
one two five six seven
one five six 
five six
five

Writing words.txt

%%file wordfreq.py
import sys

def getwords(file):
    return open(file).read().split()

def wordfreq(words):
    freq = {}
    for word in words:
        if word in freq:
            freq[word] += 1
        else:
            freq[word] = 1
    return freq

def wordfreq1(words):
    freq = {}
    for word in words:
        freq[word] = freq.get(word, 0) + 1
    return freq

def wordfreq2(words):
    uniq = set(words)
    freq = {}
    for w in uniq:
        freq[w] = words.count(w)
    return freq

if __name__ == "__main__":
    words = getwords(sys.argv[1])
    print(wordfreq(words))

Overwriting wordfreq.py

!python wordfreq.py words.txt

{'one': 8, 'two': 6, 'three': 4, 'four': 2, 'five': 6, 'six': 4, 'seven': 1}

set([1,1,1,2,3,2,3,4]) # a set is a collection of unique objects

{1, 2, 3, 4}

import wordfreq

words = wordfreq.getwords("words.txt")

wordfreq.wordfreq(words)

{'five': 6, 'four': 2, 'one': 8, 'seven': 1, 'six': 4, 'three': 4, 'two': 6}

wordfreq.wordfreq1(words)

{'five': 6, 'four': 2, 'one': 8, 'seven': 1, 'six': 4, 'three': 4, 'two': 6}

wordfreq.wordfreq2(words)

{'five': 6, 'four': 2, 'one': 8, 'seven': 1, 'six': 4, 'three': 4, 'two': 6}
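
The standard library already ships a dictionary subclass for this kind of counting, collections.Counter. It is shown here as an alternative to the three hand-written versions, on a made-up word list:

```python
from collections import Counter

words = "one one two one two three".split()

freq = Counter(words)            # counts each distinct word
print(freq["one"], freq["two"], freq["three"])
print(freq.most_common(1))       # most frequent word with its count
```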

Iterating over dictionaries¶

freq = wordfreq.wordfreq(words)

for key in freq:
    print(key, freq[key])

one 8
two 6
three 4
four 2
five 6
six 4
seven 1

for key in freq.keys():
    print(key, freq[key])

one 8
two 6
three 4
four 2
five 6
six 4
seven 1

for value in freq.values():
    print(value)

8
6
4
2
6
4
1

for key, value in freq.items():
    print(key, value)

one 8
two 6
three 4
four 2
five 6
six 4
seven 1

for key, value in sorted(freq.items()):
    print(key, value)

five 6
four 2
one 8
seven 1
six 4
three 4
two 6

def getvalue(pair):
    return pair[1]

for key, value in sorted(freq.items(), key=getvalue):
    print(key, value)

seven 1
four 2
three 4
six 4
two 6
five 6
one 8

for key, value in sorted(freq.items(), key=getvalue, reverse=True):
    print(key, value)

one 8
two 6
five 6
three 4
six 4
four 2
seven 1

for key, value in sorted(freq.items(), key=getvalue, reverse=True):
    print(key.rjust(5), value)

  one 8
  two 6
 five 6
three 4
  six 4
 four 2
seven 1

for key, value in sorted(freq.items(), key=getvalue, reverse=True):
    print(key.rjust(5), value, "*"*value)

  one 8 ********
  two 6 ******
 five 6 ******
three 4 ****
  six 4 ****
 four 2 **
seven 1 *
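
The helper getvalue only picks the second element of each pair. The same key can be written inline with a lambda or with operator.itemgetter, as in this sketch on a shortened, made-up frequency table:

```python
from operator import itemgetter

freq = {"one": 8, "two": 6, "three": 4, "seven": 1}

by_value = sorted(freq.items(), key=lambda pair: pair[1], reverse=True)
same = sorted(freq.items(), key=itemgetter(1), reverse=True)

for key, value in by_value:
    print(key.rjust(5), value)
```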

names = ["Anand","Naufal", "Vikrant", "David", "Alice", "Isac"]
countires = ["India", "India", "India", "USA", "UK", "USA"]

dict(zip(names, countires))

{'Alice': 'UK',
 'Anand': 'India',
 'David': 'USA',
 'Isac': 'USA',
 'Naufal': 'India',
 'Vikrant': 'India'}

d = {[]:"empty",
    [1,1,1]:"ones"}

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-197-8d7dfeb2ed78> in <module>()
      1 d = {[]:"empty",
----> 2     [1,1,1]:"ones"}

TypeError: unhashable type: 'list'

d = {(1,):1,
    (1,2):2}

d

{(1,): 1, (1, 2): 2}
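
Dictionary keys must be hashable, which is why lists fail and tuples work. A short sketch with made-up data showing a typical use of tuple keys and the conversion from a list:

```python
# Tuples are hashable, so they can serve as dictionary keys.
grid = {}
grid[(0, 0)] = "origin"
grid[(1, 2)] = "somewhere"

print(grid[(1, 2)])

# A list cannot be a key, but converting it to a tuple works.
row = [1, 1, 1]
labels = {tuple(row): "ones"}
print(labels[(1, 1, 1)])

try:
    {row: "ones"}
except TypeError as e:
    print("TypeError:", e)
```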

Why classes?¶

Let's try to model a bank account.

%%file bank0.py

balance = 0

def get_balance():
    return balance

def deposite(amount):
    global balance
    balance += amount
    
def withdraw(amount):
    global balance
    balance -= amount
    
if __name__ == "__main__":
    deposite(100)
    print(get_balance())
    withdraw(20)
    print(get_balance())

Writing bank0.py

%%file m.py

a = 2
b = 3
def func():
    return a+b

Writing m.py

import m

m.a

2

m.b

3

m.func()

5

%%file bank1.py

def create_account():
    return {"balance":0}

def get_balance(account):
    return account['balance']

def deposite(account, amount):
    account['balance'] += amount
    
def withdraw(account, amount):
    account['balance'] -= amount
    

if __name__ == "__main__":
    a1 = create_account()
    a2 = create_account()
    deposite(a1, 100)
    deposite(a2, 200)
    print("a1 ", get_balance(a1))
    print("a2 ", get_balance(a2))
    withdraw(a1, 10)
    withdraw(a2, 20)
    print("a1 ", get_balance(a1))
    print("a2 ", get_balance(a2))

Overwriting bank1.py

!python bank1.py

a1  100
a2  200
a1  90
a2  180

%%file bank2.py

class BankAccount:
    
    def __init__(self):
        self.balance = 0
        
    def get_balance(self):
        return self.balance
    
    def deposit(self, amount):
        self.balance += amount
        
    def withdraw(self, amount):
        self.balance -= amount

if __name__ == "__main__":
    a1 = BankAccount()
    a2 = BankAccount()
    a1.deposit(100)
    a2.deposit(200)
    print("a1 ", a1.get_balance())
    print("a2 ", a2.get_balance())
    a1.withdraw(20)
    a2.withdraw(30)
    print("a1 ", a1.get_balance())
    print("a2 ", a2.get_balance())

Overwriting bank2.py

!python bank2.py

a1  100
a2  200
a1  80
a2  170

class Foo:
    pass

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Point:
    def __init__(obj, x, y): # this will also work, but by convention we use self
        obj.x = x
        obj.y = y

p = Point(1,2)

type(p)

__main__.Point

def f():
    pass

f

<function __main__.f>

class Foo:
    pass

Foo

__main__.Foo

p.x

1

p.y

2

p.z = 3

p.z

3

p.x = -1

p.x

-1

p.__dict__

{'x': -1, 'y': 2, 'z': 3}

problem

Write a class Timer that can be used to time tasks. It should work as shown below. Hint: time.time()

t = Timer()
t.start()
# do some work
t.stop()
print(t.get_time_taken())

import time

time.time()

1516353620.048532

import time
class Timer:
    def __init__(self):
        self._start = 0
        self._end = 0
        
    def start(self):
        self._start = time.time()
        
    def stop(self):
        self._end = time.time()
        
    def get_time_taken(self):
        return self._end - self._start
    
    def reset(self):
        self._start = 0
        self._end = 0

t = Timer()
t.start()

time.sleep(2)

t.stop()

print(t.get_time_taken())

16.38936138153076

t.start()

t.get_time_taken()

-88.4480152130127

t.reset()
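
The negative value above appears because get_time_taken was called after a new start but before stop, so _end still held the old stop time. One possible variant, sketched here as an assumption rather than the solution from the session, uses time.perf_counter and reports the time elapsed so far while the timer is still running. The class name SafeTimer is made up.

```python
import time

class SafeTimer:
    def __init__(self):
        self._start = None
        self._end = None

    def start(self):
        self._start = time.perf_counter()
        self._end = None              # forget any earlier stop

    def stop(self):
        self._end = time.perf_counter()

    def get_time_taken(self):
        if self._start is None:
            return 0.0                # never started
        # while still running, measure up to the current moment
        end = self._end if self._end is not None else time.perf_counter()
        return end - self._start

t = SafeTimer()
t.start()
time.sleep(0.1)
t.stop()
print(t.get_time_taken())
```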

Exceptions¶

names = ["Anand","Naufal", "Vikrant", "David", "Alice", "Isac"]
countires = ["India", "India", "India", "USA", "UK", "USA"]

dict([(1,"one"), (2,"two")])

{1: 'one', 2: 'two'}

teams = dict(zip(names, countires))

teams

{'Alice': 'UK',
 'Anand': 'India',
 'David': 'USA',
 'Isac': 'USA',
 'Naufal': 'India',
 'Vikrant': 'India'}

teams['Asimov']

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-250-d1fbe5f006c0> in <module>()
----> 1 teams['Asimov']

KeyError: 'Asimov'

doom

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-251-301636d58674> in <module>()
----> 1 doom

NameError: name 'doom' is not defined

2 + "3"

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-252-2068cae7beb7> in <module>()
----> 1 2 + "3"

TypeError: unsupported operand type(s) for +: 'int' and 'str'

open("sdkjhfkdsjhfkjds")

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-253-27e91b799f33> in <module>()
----> 1 open("sdkjhfkdsjhfkjds")

FileNotFoundError: [Errno 2] No such file or directory: 'sdkjhfkdsjhfkjds'

def parseint(strvalue):
    return int(strvalue)

parseint("42")

42

parseint("Nan")

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-256-e5485772f304> in <module>()
----> 1 parseint("Nan")

<ipython-input-254-9845c4807f9e> in parseint(strvalue)
      1 def parseint(strvalue):
----> 2     return int(strvalue)

ValueError: invalid literal for int() with base 10: 'Nan'

def parseint(strvalue):
    try:
        return int(strvalue)
    except ValueError as e:
        print("Handled :", e)
        return 0

parseint("Nan")

Handled : invalid literal for int() with base 10: 'Nan'

0

def csvparser(file):
    return [[int(item) for item in line.strip().split(",")] for line in open(file)]

def csvparser_missing(file):
    return [[parseint(item) for item in line.strip().split(",")] for line in open(file)]

%%file nums.csv
1,2,3
4,,5
6,7,Nan

Writing nums.csv

csvparser("nums.csv")

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-265-7136d0e78922> in <module>()
----> 1 csvparser("nums.csv")

<ipython-input-262-c5d85f9b2362> in csvparser(file)
      1 def csvparser(file):
----> 2     return [[int(item) for item in line.strip().split(",")] for line in open(file)]

<ipython-input-262-c5d85f9b2362> in <listcomp>(.0)
      1 def csvparser(file):
----> 2     return [[int(item) for item in line.strip().split(",")] for line in open(file)]

<ipython-input-262-c5d85f9b2362> in <listcomp>(.0)
      1 def csvparser(file):
----> 2     return [[int(item) for item in line.strip().split(",")] for line in open(file)]

ValueError: invalid literal for int() with base 10: ''

csvparser_missing("nums.csv")

Handled : invalid literal for int() with base 10: ''
Handled : invalid literal for int() with base 10: 'Nan'

[[1, 2, 3], [4, 0, 5], [6, 7, 0]]
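
Replacing bad cells with 0 makes them indistinguishable from real zeros. A possible refinement, sketched under the assumption that the caller chooses the fallback value, adds a default parameter; parse_row is a hypothetical helper name.

```python
def parseint(strvalue, default=0):
    """Convert strvalue to int, returning default when it is not a number."""
    try:
        return int(strvalue)
    except ValueError:
        return default

def parse_row(line):
    # None marks a missing or invalid cell, which keeps it distinct from a real 0
    return [parseint(item, default=None) for item in line.strip().split(",")]

for line in ["1,2,3", "4,,5", "6,7,Nan"]:
    print(parse_row(line))
```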

Building commandline applications¶

%%file head.py
"""
head module implements unix command head approximately
"""
import argparse

def head(file, n):
    filehandle = open(file)
    line = filehandle.readline()
    count = 1
    while line and count<=n:
        print(line, end="")
        count += 1
        line = filehandle.readline()
        
        
def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("file", type=str,
                       help="File in which to look for")
    parser.add_argument("-n", "--lines", 
                        type=int,
                        help="Number of lines to be seen as head")
    return parser.parse_args()
    
if __name__ == "__main__":
    args = parse_args()
    if args.lines:
        head(args.file, args.lines)
    else:
        head(args.file, 5)

Overwriting head.py

!python head.py data.txt

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.

%%file args.py
"""
args module demonstrates argparse with several positional arguments and optional flags
"""
import argparse


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("files",type=str,
                        nargs="+",
                       help="Files in which to look for")
    parser.add_argument("-n", "--lines", 
                        type=int,
                        help="Number of lines to be seen as head")
    parser.add_argument("-f", 
                       help="optional flag",
                       action="store_true")
    return parser.parse_args()
    
if __name__ == "__main__":
    args = parse_args()
    print(args)

Overwriting args.py

!python args.py -n 5 data.txt jkjsdhdf jksddj

Namespace(f=False, files=['data.txt', 'jkjsdhdf', 'jksddj'], lines=5)
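
parse_args also accepts an explicit list of arguments, which makes it easy to try a parser without going through the shell. The sketch below additionally uses default=5 so that no separate check for a missing -n is needed; build_parser is a made-up name.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("files", nargs="+", help="Files to process")
    # default=5 removes the need for an "if args.lines" check later
    parser.add_argument("-n", "--lines", type=int, default=5,
                        help="Number of lines to show")
    return parser

# parse_args accepts an explicit list, which is handy for trying things out
args = build_parser().parse_args(["-n", "3", "data.txt", "other.txt"])
print(args.lines, args.files)

defaults = build_parser().parse_args(["data.txt"])
print(defaults.lines)
```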

problem

Improve the script cat.py to take positional arguments and an additional option "-n" which, if given, makes cat.py print line numbers at the start of each line.

Regular expressions¶

import re

pattern = re.compile(r"^def\s+[a-zA-Z]+\(.*\):")  # raw string keeps the backslashes intact

m = pattern.match("def hello():")

m

<_sre.SRE_Match object; span=(0, 12), match='def hello():'>

pattern = re.compile(r"^def\s+([a-zA-Z]+)\(.*\):")  # the parentheses capture the function name

m = pattern.match("def hello():")

m

<_sre.SRE_Match object; span=(0, 12), match='def hello():'>

m.groups()

('hello',)

def functions(pyfiles):
    funcp = re.compile(r"^def\s+([a-zA-Z]+)\(.*\):")
    funcs = []
    for file in pyfiles:
        for line in open(file):
            m = funcp.match(line)
            if m:
                funcs.append(m.groups()[0])
    return funcs

import os

pyfiles = [file for file in os.listdir(os.getcwd()) if file.endswith(".py")]
functions(pyfiles)

['head',
 'getwords',
 'wordfreq',
 'deposite',
 'withdraw',
 'add',
 'square',
 'mult',
 'func',
 'addstrint',
 'cat',
 'catfiles',
 'listfiles',
 'deposite',
 'withdraw']

text = "some junk lkdljsfl 1hell3 kjfklfgj-090943unlkjrkltgj90485"
text2 = "kjsdhhfkjds kjhdsfkhsdkjh 987jhkfdskjh 1ghfh9 lkdfjlkds90m.?;ljfsdh"
p = re.compile("(.*)([0-9]{1,1}[a-z]{4,4}[0-9]{1,1})(.*)")
m1 = p.match(text)
m2 = p.match(text2)

m1

<_sre.SRE_Match object; span=(0, 57), match='some junk lkdljsfl 1hell3 kjfklfgj-090943unlkjrkl>

m2

<_sre.SRE_Match object; span=(0, 67), match='kjsdhhfkjds kjhdsfkhsdkjh 987jhkfdskjh 1ghfh9 lkd>

m1.groups()

('some junk lkdljsfl ', '1hell3', ' kjfklfgj-090943unlkjrkltgj90485')

m2.groups()

('kjsdhhfkjds kjhdsfkhsdkjh 987jhkfdskjh ', '1ghfh9', ' lkdfjlkds90m.?;ljfsdh')
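
match only matches at the start of the string, which is why the pattern above needs the leading and trailing (.*). With search the same token can be found directly. A short sketch using the first sample text:

```python
import re

text = "some junk lkdljsfl 1hell3 kjfklfgj-090943unlkjrkltgj90485"

# one digit, four lowercase letters, one digit
p = re.compile(r"[0-9][a-z]{4}[0-9]")

m = p.search(text)          # scans the string for the first occurrence
print(m.group(), m.span())

print(p.findall(text))      # all non-overlapping occurrences
```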

Downloading stuff from internet¶

import requests

response = requests.get("http://httpbin.org/html")
print(response.text[:400])

<html>
<head>
<meta http-equiv='refresh' content='1; url=http://httpbin.org/html&arubalp=e92a1063-4b38-4bd0-bab1-476d4c3d8e'>
</head>
</html>

response.headers

{'Date': 'Fri, 19 Jan 2018 11:21:56 GMT', 'Server': 'Apache', 'X-Frame-Options': 'SAMEORIGIN', 'X-UA-Compatible': 'IE=edge;IE=11;IE=10;IE=9', 'Expires': '0', 'Content-Length': '142', 'Connection': 'close', 'Content-Type': 'text/html'}

response = requests.get("http://httpbin.org/get", params={"parameters":"dummy", "language":"python"})

print(response.text)

<html>
<head>
<meta http-equiv='refresh' content='1; url=http://httpbin.org/get?parameters=dummy&amp;language=python&arubalp=e92a1063-4b38-4bd0-bab1-476d4c3d8e'>
</head>
</html>

response = requests.post("http://httpbin.org/post", data={"name":"alice","email":"asa@kj.com"})

print(response.text[:100])

<!doctype html>
<html lang="en">
<head>
	<meta http-equiv="Content-Type" content="text/html; charset

Finding popular repos of VMware on GitHub¶

import requests
url = "https://api.github.com/orgs/vmware/repos"
repos = requests.get(url).json()

type(repos)

list

for repo in repos:
    print(repo['full_name'], repo['forks'])

vmware/pyvco 4
vmware/rvc 46
vmware/rbvmomi 153
vmware/vprobe-toolkit 9
vmware/CloudFS 16
vmware/vcd-nclient 2
vmware/lmock 5
vmware/FireBreath 2
vmware/weasel 1
vmware/vmware-vcenter 86
vmware/vmware-vshield 6
vmware/vcloud-rest 37
vmware/GemstoneWebTools 0
vmware/vmware-vcsa 17
vmware/vmware-vmware_lib 24
vmware/saml20serviceprovider 1
vmware/pg_rewind 18
vmware/vco-powershel-plugin 2
vmware/jenkins-reviewbot 12
vmware/dbeekeeper 0
vmware/thinapp_factory 16
vmware/vmware-cassandra 4
vmware/vmware-java 0
vmware/data-driven-framework 3
vmware/pyvmomi 463
vmware/pyvmomi-community-samples 393
vmware/open-vm-tools 154
vmware/pyvmomi-tools 17
vmware/upgrade-framework 11
vmware/webcommander 32

for repo in sorted(repos, key=lambda rep:rep["forks"], reverse=True)[:5]:
    print(repo['full_name'], repo['forks'])

vmware/pyvmomi 463
vmware/pyvmomi-community-samples 393
vmware/open-vm-tools 154
vmware/rbvmomi 153
vmware/vmware-vcenter 86
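
repos is an ordinary list of dictionaries, so the same sorting works on any decoded JSON. The sketch below uses json.loads on a small hand-written sample (names and fork counts copied from the output above) instead of a live request, so it runs without network access.

```python
import json

sample = """[
  {"full_name": "vmware/rvc", "forks": 46},
  {"full_name": "vmware/pyvmomi", "forks": 463},
  {"full_name": "vmware/rbvmomi", "forks": 153}
]"""

repos = json.loads(sample)   # JSON array -> list, JSON object -> dict
top = sorted(repos, key=lambda rep: rep["forks"], reverse=True)[:2]
for repo in top:
    print(repo["full_name"], repo["forks"])
```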

problem

Find the distance between two cities using the Google Distance Matrix API.

url="https://maps.googleapis.com/maps/api/distancematrix/json"
Parameters required:
origins
destinations
units (metric)

def distance(source, dest):
    url = "https://maps.googleapis.com/maps/api/distancematrix/json"
    r = requests.get(url, params={"origins":source,
                                  "destinations":dest,
                                  "units":"metric"})
    # to inspect the whole response, as printed for data below, use: return r.json()
    return r.json()['rows'][0]['elements'][0]['distance']['text']

data = distance("bangalore", "mumbai")

data

{'destination_addresses': ['Mumbai, Maharashtra, India'],
 'origin_addresses': ['Bengaluru, Karnataka, India'],
 'rows': [{'elements': [{'distance': {'text': '980 km', 'value': 980127},
     'duration': {'text': '14 hours 49 mins', 'value': 53317},
     'status': 'OK'}]}],
 'status': 'OK'}

data['rows'][0]['elements'][0]['distance']['text']

'980 km'

distance("chennai", "bangalore")

'347 km'

Feedback ¶

References and practice material¶

Search online for BangPypers, the Bangalore Python user group.

Suggested mini projects.

testing
backup
tr
directorytree
