Python Training at VMWare Pune - Day 3

Oct 16-18, 2017 Vikrant Patil

These notes are available online at http://notes.pipal.in/2017/vmware-oct-python

© Pipal Academy LLP


Working with files

In [1]:
%%file three.txt
one
two
three
Writing three.txt
In [2]:
filehandle = open("three.txt")
In [3]:
filehandle.read()
Out[3]:
'one\ntwo\nthree'
In [4]:
filehandle.close()
In [5]:
!python -c "import this" > data.txt
In [6]:
filehandle = open("data.txt")
In [7]:
filehandle.readline()
Out[7]:
'The Zen of Python, by Tim Peters\n'
In [8]:
filehandle.readline()
Out[8]:
'\n'
In [9]:
filehandle.readline()
Out[9]:
'Beautiful is better than ugly.\n'
In [10]:
filehandle.readlines()
Out[10]:
['Explicit is better than implicit.\n',
 'Simple is better than complex.\n',
 'Complex is better than complicated.\n',
 'Flat is better than nested.\n',
 'Sparse is better than dense.\n',
 'Readability counts.\n',
 "Special cases aren't special enough to break the rules.\n",
 'Although practicality beats purity.\n',
 'Errors should never pass silently.\n',
 'Unless explicitly silenced.\n',
 'In the face of ambiguity, refuse the temptation to guess.\n',
 'There should be one-- and preferably only one --obvious way to do it.\n',
 "Although that way may not be obvious at first unless you're Dutch.\n",
 'Now is better than never.\n',
 'Although never is often better than *right* now.\n',
 "If the implementation is hard to explain, it's a bad idea.\n",
 'If the implementation is easy to explain, it may be a good idea.\n',
 "Namespaces are one honking great idea -- let's do more of those!\n"]
In [11]:
filehandle.readline()
Out[11]:
''
In [12]:
filehandle.close()
In [14]:
for line in open("data.txt").readlines(): 
    print(len(line.strip().split()))# number of words per line
7
0
5
5
5
5
5
5
2
9
4
5
3
10
13
12
5
8
11
13
12

problem:

  • Write a program cat.py equivalent to the unix command cat. It prints the contents of a file to standard output.

    python cat.py three.txt
    one
    two
    three
  • Write a program head.py equivalent to the unix command head. It should take the number of lines as its first command-line argument and the filename as its second, and print the first n lines of the file on screen.

python head.py 5 data.txt
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
  • Write a program wc.py which implements the unix command wc. It prints the number of lines, the number of words and the number of characters in a file.
python wc.py data.txt
21 144 857
In [15]:
%%file cat.py
"""
module cat is rough implementation of unix command cat
"""
import sys

def cat(file):
    """
    prints file to standard output
    """
    f = open(file)
    print(f.read())
    f.close()
    
if __name__ == "__main__":
    cat(sys.argv[1])
Writing cat.py
In [17]:
!python cat.py three.txt
one
two
three
In [22]:
%%file head.py
"""
module head implements unix head command
"""
import sys

def head(filename, n):
    f = open(filename)
    for i in range(n):
        print(f.readline(), end="")
    
    f.close()
    
if __name__ == "__main__":
    linecount = int(sys.argv[1])
    filename = sys.argv[2]
    head(filename, linecount)
Overwriting head.py
In [23]:
!python head.py 5 data.txt
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
In [24]:
%%file wc.py
"""
module wc implements unix command wc
"""
import sys

def line_count(f):
    return len(open(f).readlines())

def test_line_count():
    assert line_count("three.txt") == 3
    
def word_count(f):
    return len(open(f).read().split())

def test_word_count():
    assert word_count("three.txt") == 3
    
def char_count(f):
    return len(open(f).read())

def test_char_count():
    assert char_count("three.txt") == 13
    
    
if __name__ == "__main__":
    f = sys.argv[1]
    print(line_count(f), word_count(f), char_count(f))
    
Writing wc.py
In [25]:
!python wc.py data.txt
21 144 857

problem:

  • Find the file with the maximum number of lines in the current directory
  • How about finding the file with the maximum number of words?
In [26]:
!py.test -v wc.py
============================= test session starts ==============================
platform linux -- Python 3.6.1, pytest-3.0.7, py-1.4.33, pluggy-0.4.0 -- /home/vikrant/usr/local/anaconda3/bin/python
cachedir: .cache
rootdir: /home/vikrant/trainings/2017/vmware-oct-python, inifile:
collected 3 items 

wc.py::test_line_count PASSED
wc.py::test_word_count PASSED
wc.py::test_char_count PASSED

=========================== 3 passed in 0.01 seconds ===========================
In [27]:
import os
In [28]:
files = [f for f in os.listdir(os.getcwd()) if os.path.isfile(f)]
In [29]:
max([1,2,5,6,8])
Out[29]:
8
In [30]:
max(["one", "two", "three", "four"])
Out[30]:
'two'
In [31]:
max(["one", "two", "three", "four"], key=len)
Out[31]:
'three'
In [35]:
from wc import line_count, word_count, char_count
In [36]:
max(files, key=line_count)
Out[36]:
'day1.html'
In [37]:
max(files, key=word_count)
Out[37]:
'day1.html'
In [38]:
max(files, key=char_count)
Out[38]:
'day1.html'

Writing files

To write to a file, open it in write mode ("w"). Append mode ("a") adds to the end of an existing file instead of overwriting it.

In [41]:
f = open("primes.txt", "w")
f.write("two\n")
f.write("three\n")
f.write("five\n")
f.write("seven")
f.close()
In [43]:
!python cat.py primes.txt
two
three
five
seven
In [44]:
f = open("primes.txt", "a")
f.write("eleven\n")
f.write("thirteen")
f.close()
In [45]:
!python cat.py primes.txt
two
three
five
seveneleven
thirteen

Similarly, we can read or write files in different modes:

  1. rb => read in binary mode
  2. wb => write in binary mode
  3. ab => append in binary mode
In [47]:
open("three.txt", "r").read() # read in text mode
Out[47]:
'one\ntwo\nthree'
In [48]:
open("primes.txt", "rb").read() # read in binary mode
Out[48]:
b'two\nthree\nfive\nseveneleven\nthirteen'
In [49]:
f = open("binarydata.bin", "wb")
f.write(b'x025x082')
f.close()
In [51]:
open("binarydata.bin", "rb").read()
Out[51]:
b'x025x082'
In [52]:
f = open("binarydata.bin", "ab")
f.write(b"binary")
f.close()
In [53]:
open("binarydata.bin", "rb").read()
Out[53]:
b'x025x082binary'

with statement

In [54]:
with open("primes.txt", "a") as f:
    f.write("seventeen\n")
In [55]:
!python cat.py primes.txt
two
three
five
seveneleven
thirteenseventeen
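
The cells above call f.close() by hand; the with statement does that automatically when the block ends. A minimal sketch of that behaviour, assuming a scratch file in the system temp directory (the name with_demo.txt is made up for this illustration):

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.gettempdir(), "with_demo.txt")

with open(path, "w") as f:
    f.write("seventeen\n")
    inside = f.closed   # False: the file is still open inside the block

after = f.closed        # True: leaving the block closed the file

print(inside, after)
os.remove(path)         # clean up the scratch file
```

The file is closed even if the body of the with block raises an exception, which is the main reason to prefer it over a manual close().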

In [56]:
with open("regional.txt", "w", encoding="utf-8") as f:
    f.write("\u0c05\u0c06")
In [57]:
open("regional.txt", encoding="utf-8").read()
Out[57]:
'అఆ'

problem:

  1. Write a function write_csv to write multiplication tables into a csv file. write_csv should take a filename and a two dimensional list as arguments and write the data from the list to the given file in csv format
    1,2,3,4.....11
    2,4,6,8.....22
    .
    .
    .
  2. Write a csvparser which can parse a csv file and return the data as a two dimensional list.
In [58]:
def tables(n):
    return [[i*j for i in range(1, n+1)] for j in range(1, 11)]
In [59]:
tables(10)
Out[59]:
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
 [2, 4, 6, 8, 10, 12, 14, 16, 18, 20],
 [3, 6, 9, 12, 15, 18, 21, 24, 27, 30],
 [4, 8, 12, 16, 20, 24, 28, 32, 36, 40],
 [5, 10, 15, 20, 25, 30, 35, 40, 45, 50],
 [6, 12, 18, 24, 30, 36, 42, 48, 54, 60],
 [7, 14, 21, 28, 35, 42, 49, 56, 63, 70],
 [8, 16, 24, 32, 40, 48, 56, 64, 72, 80],
 [9, 18, 27, 36, 45, 54, 63, 72, 81, 90],
 [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]]
In [60]:
w = ["one", "two", "three", "four"]
In [61]:
",".join(w)
Out[61]:
'one,two,three,four'
In [62]:
t = tables(11)
In [63]:
t[0]
Out[63]:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
In [65]:
",".join([str(i) for i in t[0]])
Out[65]:
'1,2,3,4,5,6,7,8,9,10,11'
In [67]:
def write_csv(filename, data):
    with open(filename, "w") as f:
        for row in data:
            line = ",".join([str(i) for i in row])
            f.write(line + "\n")
In [68]:
write_csv("tables.csv", tables(11))
In [69]:
!python cat.py tables.csv
1,2,3,4,5,6,7,8,9,10,11
2,4,6,8,10,12,14,16,18,20,22
3,6,9,12,15,18,21,24,27,30,33
4,8,12,16,20,24,28,32,36,40,44
5,10,15,20,25,30,35,40,45,50,55
6,12,18,24,30,36,42,48,54,60,66
7,14,21,28,35,42,49,56,63,70,77
8,16,24,32,40,48,56,64,72,80,88
9,18,27,36,45,54,63,72,81,90,99
10,20,30,40,50,60,70,80,90,100,110

In [71]:
[line.strip().split(",") for line in open("tables.csv")]
Out[71]:
[['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11'],
 ['2', '4', '6', '8', '10', '12', '14', '16', '18', '20', '22'],
 ['3', '6', '9', '12', '15', '18', '21', '24', '27', '30', '33'],
 ['4', '8', '12', '16', '20', '24', '28', '32', '36', '40', '44'],
 ['5', '10', '15', '20', '25', '30', '35', '40', '45', '50', '55'],
 ['6', '12', '18', '24', '30', '36', '42', '48', '54', '60', '66'],
 ['7', '14', '21', '28', '35', '42', '49', '56', '63', '70', '77'],
 ['8', '16', '24', '32', '40', '48', '56', '64', '72', '80', '88'],
 ['9', '18', '27', '36', '45', '54', '63', '72', '81', '90', '99'],
 ['10', '20', '30', '40', '50', '60', '70', '80', '90', '100', '110']]
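
The list comprehension above returns every cell as a string. One possible csvparser for the second problem is sketched below; the name parse_csv is hypothetical, and it assumes every cell is a plain integer with no quoted fields or embedded commas. write_csv is repeated so the block runs on its own.

```python
import os
import tempfile

def write_csv(filename, data):
    """Write a two dimensional list to filename, one comma separated row per line."""
    with open(filename, "w") as f:
        for row in data:
            f.write(",".join(str(i) for i in row) + "\n")

def parse_csv(filename):
    """Read filename back as a two dimensional list of ints (hypothetical helper)."""
    with open(filename) as f:
        # Skip blank lines; assume every remaining cell is a plain integer.
        return [[int(cell) for cell in line.strip().split(",")]
                for line in f if line.strip()]

# Round trip through a scratch file to check that nothing is lost.
path = os.path.join(tempfile.gettempdir(), "tables_demo.csv")
data = [[i * j for i in range(1, 4)] for j in range(1, 3)]
write_csv(path, data)
parsed = parse_csv(path)
os.remove(path)
print(parsed)  # [[1, 2, 3], [2, 4, 6]]
```

For real-world csv files with quoting or embedded commas, the standard library csv module is the safer choice.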

Writing to stderr, stdout

In [72]:
import sys
sys.stdout.write("Hello python")
Hello python
In [73]:
sys.stderr.write("Error: some ..exception..!")
Error: some ..exception..!

Dictionaries

In [74]:
author = {"name":"lewis carrol",
          "books":["Alice in wonderland", "Looking through the glass"],
          "language":"English"
         }
In [75]:
author['name']
Out[75]:
'lewis carrol'
In [76]:
author['books']
Out[76]:
['Alice in wonderland', 'Looking through the glass']
In [77]:
print(author)
{'name': 'lewis carrol', 'books': ['Alice in wonderland', 'Looking through the glass'], 'language': 'English'}
In [78]:
del author['books']
In [79]:
print(author)
{'name': 'lewis carrol', 'language': 'English'}
In [80]:
"name" in author
Out[80]:
True
In [81]:
"lewis" in author
Out[81]:
False
In [82]:
author['language']
Out[82]:
'English'
In [83]:
author.get("language")
Out[83]:
'English'
In [84]:
"books" in author
Out[84]:
False
In [85]:
author['books']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-85-7b693b1298ba> in <module>()
----> 1 author['books']

KeyError: 'books'
In [86]:
author.get("books", [])
Out[86]:
[]

Iterating over dictionaries

In [87]:
d = {"one":1, "two":2, "three":3}

Iterating over keys

In [88]:
for key in d:
    print(key)
one
two
three
In [89]:
[k for k in d]
Out[89]:
['one', 'two', 'three']
In [96]:
for key in d.keys():
    print(key, d[key])
one 1
two 2
three 3
In [97]:
for k in d:
    print(k)
one
two
three

Iterating over values directly

In [91]:
for value in d.values():
    print(value)
1
2
3
In [92]:
def f(name):
    print(name)
In [93]:
f(name="python")
python

Iterating with keys and values

In [95]:
for k, v in d.items():
    print(k, v)
one 1
two 2
three 3
In [98]:
print(d.items())
dict_items([('one', 1), ('two', 2), ('three', 3)])
In [99]:
list(d.items())
Out[99]:
[('one', 1), ('two', 2), ('three', 3)]
In [100]:
numbers = [("one",1),("two", 2),("three", 3)]
In [101]:
dict(numbers)
Out[101]:
{'one': 1, 'three': 3, 'two': 2}
In [102]:
names = ['a','b','c']
values = [1,2,3]
In [103]:
dict(zip(names, values))
Out[103]:
{'a': 1, 'b': 2, 'c': 3}
In [104]:
items = ("Pen", "Pencil", "Colorbox")
prices = (25, 10, 50)
In [105]:
cart = dict(zip(items, prices))
In [108]:
cart
Out[108]:
{'Colorbox': 50, 'Pen': 25, 'Pencil': 10}
In [110]:
for item, price in cart.items():
    print(item.rjust(8), price)
print("-"*12)
print("Total".rjust(8), sum(cart.values()))
     Pen 25
  Pencil 10
Colorbox 50
------------
   Total 85

problem:

  • Write a function unzip which operates over a dictionary and returns keys and values as separate lists.
In [111]:
def unzip(d):
    keys = d.keys()
    values = [d[k] for k in keys]
    return list(keys), values
In [112]:
unzip(d)
Out[112]:
(['one', 'two', 'three'], [1, 2, 3])
In [113]:
unzip(cart)
Out[113]:
(['Pen', 'Pencil', 'Colorbox'], [25, 10, 50])

Example

Write a program to count word frequency in a file

In [114]:
%%file words.txt
five
five four
five four three
five four three two
five four three two one
six seven eight nine
six seven eight
six seven
six
Writing words.txt
In [116]:
"hello hello python".count("hello")
Out[116]:
2
In [117]:
s = "hello hello python"
In [118]:
s.count("hello")
Out[118]:
2
In [119]:
words = s.split()
In [120]:
words
Out[120]:
['hello', 'hello', 'python']
In [121]:
for w in words:
    print(w, s.count(w))
hello 2
hello 2
python 1
In [122]:
%%file wordfeq.py
"""
computes word frequency of every word in a file
"""
import sys

def read_words(filename):
    return open(filename).read().split()
    
def wordfreq(words):
    freq = {}
    
    for word in words:
        if word in freq:
            freq[word] += 1
        else:
            freq[word] = 1
    return freq
    
    
    
if __name__ == "__main__":
    filename = sys.argv[1]
    words = read_words(filename)
    freq = wordfreq(words)
    print(freq)
Writing wordfeq.py
In [123]:
!python wordfeq.py words.txt
{'five': 5, 'four': 4, 'three': 3, 'two': 2, 'one': 1, 'six': 4, 'seven': 3, 'eight': 2, 'nine': 1}
In [124]:
%%file wordfreq.py
"""
computes word frequency of every word in a file
"""
import sys

def read_words(filename):
    return open(filename).read().split()
    
def wordfreq(words):
    freq = {}
    
    for word in words:
        freq[word] = freq.get(word, 0) + 1
    return freq
    
    
    
if __name__ == "__main__":
    filename = sys.argv[1]
    words = read_words(filename)
    freq = wordfreq(words)
    print(freq)
Writing wordfreq.py
In [125]:
!python wordfreq.py words.txt
{'five': 5, 'four': 4, 'three': 3, 'two': 2, 'one': 1, 'six': 4, 'seven': 3, 'eight': 2, 'nine': 1}
In [126]:
def wordfreq1(words):
    freq = {}
    uniquewords = set(words)
    for w in uniquewords:
        freq[w] = words.count(w)
    return freq
In [127]:
import wordfreq
In [128]:
words = wordfreq.read_words("words.txt")
In [129]:
freq = wordfreq1(words)
In [130]:
print(freq)
{'nine': 1, 'seven': 3, 'two': 2, 'six': 4, 'three': 3, 'five': 5, 'four': 4, 'eight': 2, 'one': 1}
In [131]:
for w, f in freq.items():
    print(w.rjust(5), f)
 nine 1
seven 3
  two 2
  six 4
three 3
 five 5
 four 4
eight 2
  one 1
In [132]:
for w, f in sorted(freq.items()):
    print(w.rjust(5), f)
eight 2
 five 5
 four 4
 nine 1
  one 1
seven 3
  six 4
three 3
  two 2
In [133]:
for w, f in sorted(freq.items(), key=lambda x: x[1]):
    print(w.rjust(5), f)
 nine 1
  one 1
  two 2
eight 2
seven 3
three 3
  six 4
 four 4
 five 5
In [134]:
for w, f in sorted(freq.items(), key=lambda x: x[1], reverse=True):
    print(w.rjust(5), f)
 five 5
  six 4
 four 4
seven 3
three 3
  two 2
eight 2
 nine 1
  one 1
In [135]:
for w, f in sorted(freq.items(), key=lambda x: x[1], reverse=True):
    print(w.rjust(5), f, "*"*f)
 five 5 *****
  six 4 ****
 four 4 ****
seven 3 ***
three 3 ***
  two 2 **
eight 2 **
 nine 1 *
  one 1 *
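
As an aside, the standard library class collections.Counter does the same counting as wordfreq. A small sketch, using a shortened inline text instead of words.txt so that it runs without any file:

```python
from collections import Counter

# Shortened sample text, standing in for the contents of words.txt.
text = """five
five four
five four three
"""

words = text.split()
freq = Counter(words)           # maps each word to its number of occurrences

print(freq["five"], freq["four"], freq["three"])  # 3 2 1
print(freq.most_common(1))                        # [('five', 3)]
```

Counter is a dict subclass, so the sorting and printing loops shown above work on it unchanged.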

Grouping all keys with given values

In [136]:
team = {"david":"USA", "anand":"india","linus":"USA", "naufal":"india", "alice":"uk"}
In [137]:
[name for name in team.keys() if team[name]=="india"]
Out[137]:
['anand', 'naufal']
In [138]:
[name for name in team.keys() if team[name]=="USA"]
Out[138]:
['david', 'linus']
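
The two comprehensions above pick out one country at a time. A sketch that groups all names in a single pass is given below; the function name group_by_value is an assumption made for this illustration, and it reuses the dict.get idiom from wordfreq.

```python
team = {"david": "USA", "anand": "india", "linus": "USA",
        "naufal": "india", "alice": "uk"}

def group_by_value(d):
    """Return a dict mapping each value of d to the list of keys that have it."""
    groups = {}
    for key, value in d.items():
        # Start from an empty list the first time a value is seen.
        groups[value] = groups.get(value, []) + [key]
    return groups

groups = group_by_value(team)
print(groups["india"])  # ['anand', 'naufal']
print(groups["USA"])    # ['david', 'linus']
```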

Pitfalls

In [139]:
x = [1,2,3,4]
y = x
y.append(5)
print(x)
[1, 2, 3, 4, 5]
In [140]:
x = [1,2,3]
y = [3,4,5]
x, y = y, x
In [141]:
def f(x):
    x = x +1
In [142]:
v = 5
f(v)
print(v)
5
In [143]:
def appendone(l):
    l.append(1)
In [144]:
x = [1,2,3]
appendone(x)
print(x)
[1, 2, 3, 1]
In [145]:
x = [1,2,3,4]
y = x
y = [1,2,3]
print(x)
[1, 2, 3, 4]
In [146]:
x = 1
y = x
y = 2
print(x)
1
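
The common thread in these pitfalls is that assignment binds a second name to the same object; it does not copy. A short sketch contrasting an alias with a copy (the variable names alias and snapshot are chosen only for this illustration):

```python
x = [1, 2, 3, 4]
alias = x            # a second name for the same list object
snapshot = list(x)   # a new list holding the same elements

alias.append(5)      # changes the object that x also refers to
snapshot.append(6)   # changes only the copy

print(x)                          # [1, 2, 3, 4, 5]
print(alias is x, snapshot is x)  # True False
```

Rebinding a name, as in y = [1,2,3] above, never affects the object the name used to refer to; only in-place operations such as append are visible through every alias.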

Classes

In [147]:
class Complex:
    def __init__(self, real, imaginary):
        self._real = real
        self._imag = imaginary
        
    def get_real(self):
        return self._real
    
    def get_imaginary(self):
        return self._imag
    
In [148]:
p = Complex(3, 4)
In [149]:
p
Out[149]:
<__main__.Complex at 0x7feb6c027898>
In [150]:
type(p)
Out[150]:
__main__.Complex
In [151]:
isinstance(p, Complex)
Out[151]:
True
In [152]:
isinstance([], Complex)
Out[152]:
False
In [153]:
p.get_real()
Out[153]:
3
In [154]:
p.get_imaginary()
Out[154]:
4
In [155]:
class EmptyClass:
    pass
In [156]:
e = EmptyClass()
In [157]:
type(e)
Out[157]:
__main__.EmptyClass
In [158]:
isinstance(e, EmptyClass)
Out[158]:
True
In [159]:
class Complex:
    def __init__(self, real, imaginary):
        self._real = real
        self._imag = imaginary
        
    def get_real(self):
        return self._real
    
    def get_imaginary(self):
        return self._imag
    
    def display(self):
        print(self._real,"+", str(self._imag) + "j")
        
    def add(self, c):
        r = self._real + c.get_real()
        i = self._imag + c.get_imaginary()
        return Complex(r, i)
In [160]:
p = Complex(4,5)
In [161]:
p.display()
4 + 5j
In [162]:
p2 = Complex(3,7)
In [163]:
p3 = p.add(p2)
In [164]:
p3.display()
7 + 12j

why classes?

In [165]:
%%file bank0.py

balance = 0

def deposit(amount):
    global balance
    balance += amount
    
def withdraw(amount):
    global balance
    balance -= amount
    
def get_balance():
    return balance

if __name__ == "__main__":
    deposit(100)
    withdraw(40)
    print(get_balance())
Writing bank0.py
In [166]:
!python bank0.py
60
In [169]:
%%file bank1.py


def make_account():
    return {"balance":0}

def deposit(account, amount):
    account['balance'] += amount
    
def withdraw(account, amount):
    account['balance'] -= amount
    
def get_balance(account):
    return account['balance']

if __name__ == "__main__":
    a1 = make_account()
    a2 = make_account()
    deposit(a1, 500)
    deposit(a2, 300)
    withdraw(a1, 100)
    print("a1", get_balance(a1))
    print("a2", get_balance(a2))
Overwriting bank1.py
In [170]:
!python bank1.py
a1 400
a2 300
In [171]:
print(p)
<__main__.Complex object at 0x7feb6c05bf60>
In [172]:
p.__dict__
Out[172]:
{'_imag': 5, '_real': 4}
In [173]:
dir(p)
Out[173]:
['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_imag',
 '_real',
 'add',
 'display',
 'get_imaginary',
 'get_real']
In [176]:
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
In [177]:
p = Point(3,4)
In [178]:
p.x
Out[178]:
3
In [179]:
p.y
Out[179]:
4
In [180]:
p.z
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-180-6dce4e43e15c> in <module>()
----> 1 p.z

AttributeError: 'Point' object has no attribute 'z'
In [181]:
p.z = 5
In [182]:
p.z
Out[182]:
5
In [185]:
class ColoredPoint(Point):
    color = (0,0,0)
    
    def get_color(self):
        return self.color
In [186]:
cp = ColoredPoint(10, 5)
print(cp.x)
print(cp.y)
print(cp.get_color())
10
5
(0, 0, 0)

problem:

  • Write a class Timer to measure the time taken by a task. The class should have start and stop methods, and it should be able to report the time elapsed between start and stop. Hint: time.time()
t = Timer()
t.start()
do_some_stuff()
t.stop()
print("Time taken by task:", t.get_time_taken())
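
One possible solution is sketched below, not the only way to do it. The attribute names _start and _end are assumptions, and time.sleep stands in for do_some_stuff().

```python
import time

class Timer:
    """Measures wall-clock time between a call to start() and a call to stop()."""

    def __init__(self):
        self._start = None
        self._end = None

    def start(self):
        self._start = time.time()

    def stop(self):
        self._end = time.time()

    def get_time_taken(self):
        # Only meaningful after both start() and stop() have been called.
        return self._end - self._start

t = Timer()
t.start()
time.sleep(0.1)   # placeholder for the real task
t.stop()
print("Time taken by task:", t.get_time_taken())
```

For timing very short tasks, time.perf_counter() is usually a better clock than time.time(), since it is monotonic and has higher resolution.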

Exceptions

In [187]:
dfdfdsf
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-187-5739b04c62d0> in <module>()
----> 1 dfdfdsf

NameError: name 'dfdfdsf' is not defined
In [188]:
int("hello")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-188-045de671ab8a> in <module>()
----> 1 int("hello")

ValueError: invalid literal for int() with base 10: 'hello'
In [189]:
"2" * "3"
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-189-ff03144c1b70> in <module>()
----> 1 "2" * "3"

TypeError: can't multiply sequence by non-int of type 'str'
In [195]:
b = "hello"
#b = "2"
c = "3"

try:
    a = int(b)
    #a = b * c
except TypeError as e:
    a = 1
    print("Handled TypeError", e)
except ValueError as e:
    b = 0
    print("Handled ValueError", e)
Handled ValueError invalid literal for int() with base 10: 'hello'
In [196]:
b
Out[196]:
0
In [197]:
d = {"a":1, "b":2}
In [198]:
d['c']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-198-3e4d85f12902> in <module>()
----> 1 d['c']

KeyError: 'c'
In [199]:
import sys
def get_value(data, key, default):
    try:
        return data[key]
    except KeyError as e:
        print("Value not found, returning default", e, file=sys.stderr)
        return default
In [200]:
get_value(d, "c", 5)
Value not found, returning default 'c'
Out[200]:
5
%%file missing.txt
1
2
3
4
5
N/A
6
7
8
Nan
9
10
In [202]:
def read_with_missing(filename):
    with open(filename) as f:
        return [int(s.strip()) for s in f.readlines()]
In [203]:
read_with_missing("missing.txt")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-203-c6e4af947f3e> in <module>()
----> 1 read_with_missing("missing.txt")

<ipython-input-202-ad1bb595afbf> in read_with_missing(filename)
      1 def read_with_missing(filename):
      2     with open(filename) as f:
----> 3         return [int(s.strip()) for s in f.readlines()]

<ipython-input-202-ad1bb595afbf> in <listcomp>(.0)
      1 def read_with_missing(filename):
      2     with open(filename) as f:
----> 3         return [int(s.strip()) for s in f.readlines()]

ValueError: invalid literal for int() with base 10: 'N/A'
In [206]:
import sys
def parseint(strnum):
    try:
        return int(strnum)
    except ValueError as e:
        print("Invalid integer", strnum, e, file=sys.stderr)
        return 0
    
def read_with_missing(filename):
    with open(filename) as f:
        return [parseint(s.strip()) for s in f.readlines()]
In [207]:
read_with_missing("missing.txt")
Invalid integer N/A invalid literal for int() with base 10: 'N/A'
Invalid integer Nan invalid literal for int() with base 10: 'Nan'
Out[207]:
[1, 2, 3, 4, 5, 0, 6, 7, 8, 0, 9, 10]

Downloading stuff from internet

In [208]:
from urllib.request import urlopen
In [209]:
response = urlopen("http://httpbin.org/html")
In [210]:
response
Out[210]:
<http.client.HTTPResponse at 0x7feb57560eb8>
In [211]:
contents = response.read()
In [212]:
contents[:100]
Out[212]:
b'<!DOCTYPE html>\n<html>\n  <head>\n  </head>\n  <body>\n      <h1>Herman Melville - Moby-Dick</h1>\n\n     '
In [213]:
html = contents.decode("utf-8")
In [216]:
print(html[:400])
<!DOCTYPE html>
<html>
  <head>
  </head>
  <body>
      <h1>Herman Melville - Moby-Dick</h1>

      <div>
        <p>
          Availing himself of the mild, summer-cool weather that now reigned in these latitudes, and in preparation for the peculiarly active pursuits shortly to be anticipated, Perth, the begrimed, blistered old blacksmith, had not removed his portable forge to the hold again, af

The third-party library requests is very handy for downloading content from the web.

pip3 install requests
In [217]:
import requests
In [218]:
response = requests.get("http://httpbin.org/html")
print(response.text[:400])
<!DOCTYPE html>
<html>
  <head>
  </head>
  <body>
      <h1>Herman Melville - Moby-Dick</h1>

      <div>
        <p>
          Availing himself of the mild, summer-cool weather that now reigned in these latitudes, and in preparation for the peculiarly active pursuits shortly to be anticipated, Perth, the begrimed, blistered old blacksmith, had not removed his portable forge to the hold again, af
In [219]:
response.headers
Out[219]:
{'Connection': 'keep-alive', 'Server': 'meinheld/0.6.1', 'Date': 'Wed, 18 Oct 2017 10:06:43 GMT', 'Content-Type': 'text/html; charset=utf-8', 'Content-Length': '3741', 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Credentials': 'true', 'X-Powered-By': 'Flask', 'X-Processed-Time': '0.00299310684204', 'Via': '1.1 vegur'}
In [220]:
response.status_code
Out[220]:
200
In [222]:
response = requests.get("http://httpbin.org/get", params = {"param1":"hello", "param2":"python"})
In [224]:
print(response.text)
{
  "args": {
    "param1": "hello", 
    "param2": "python"
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Connection": "close", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.14.2"
  }, 
  "origin": "42.107.71.32", 
  "url": "http://httpbin.org/get?param1=hello&param2=python"
}

In [225]:
response = requests.post("http://httpbin.org/post", data={"name":"python", "email":"abc@remail.com"})
In [226]:
print(response.text)
{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "email": "abc@remail.com", 
    "name": "python"
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Connection": "close", 
    "Content-Length": "34", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.14.2"
  }, 
  "json": null, 
  "origin": "42.107.71.32", 
  "url": "http://httpbin.org/post"
}

In [227]:
import requests
url = "https://api.github.com/orgs/vmware/repos"
repos = requests.get(url).json()
In [228]:
type(repos)
Out[228]:
list
In [229]:
for rep in repos:
    print(rep['full_name'], rep['forks'])
vmware/pyvco 4
vmware/rvc 46
vmware/rbvmomi 152
vmware/vprobe-toolkit 9
vmware/CloudFS 16
vmware/vcd-nclient 2
vmware/lmock 5
vmware/FireBreath 2
vmware/weasel 1
vmware/vmware-vcenter 83
vmware/vmware-vshield 6
vmware/vcloud-rest 37
vmware/GemstoneWebTools 0
vmware/vmware-vcsa 17
vmware/vmware-vmware_lib 23
vmware/saml20serviceprovider 1
vmware/pg_rewind 18
vmware/vco-powershel-plugin 2
vmware/jenkins-reviewbot 12
vmware/dbeekeeper 0
vmware/thinapp_factory 16
vmware/vmware-cassandra 4
vmware/vmware-java 0
vmware/data-driven-framework 3
vmware/pyvmomi 427
vmware/pyvmomi-community-samples 362
vmware/open-vm-tools 136
vmware/pyvmomi-tools 18
vmware/upgrade-framework 11
vmware/webcommander 29
In [230]:
repos = sorted(repos, key=lambda r:r['forks'], reverse=True)[:5]
In [232]:
for r in repos:
    print(r['full_name'], r['forks'])
vmware/pyvmomi 427
vmware/pyvmomi-community-samples 362
vmware/rbvmomi 152
vmware/open-vm-tools 136
vmware/vmware-vcenter 83

problem: Find the distance between two cities using the Google API

In [238]:
import requests
def distance(origin, destination):
    url = "https://maps.googleapis.com/maps/api/distancematrix/json"
    response = requests.get(url, params={"units":"metric",
                                         "origins":origin,
                                         "destinations":destination
                                        })
    return response.json()['rows'][0]['elements'][0]['distance']['text']
In [239]:
distance("pune", "bangalore")
Out[239]:
'841 km'

commandline applications

Consider the following unix commands

In [1]:
!ls
add.py		day1.ipynb     functions.py  mymodule.py   three.txt
arguments.py	day2.html      head.py	     primes.tx	   Untitled.html
bank0.py	day2.ipynb     main1.py      primes.txt    wc.py
bank1.py	day3.html      main2.py      push	   wordfeq.py
binarydata.bin	day3.ipynb     main.py	     __pycache__   wordfreq.py
cat.py		echo.py        Makefile      regional.txt  words.txt
data.txt	feedback.txt   missing.txt   square.py
day1.html	functions1.py  mymodule1.py  tables.csv
In [7]:
!ls /home/vikrant/
AnacondaProjects  Documents  Pictures	  Templates  Videos
bin		  Downloads  programming  trainings
Desktop		  Music      Public	  usr
In [9]:
!cp data.txt /tmp/

The values passed after the command name (/home/vikrant/ for ls, data.txt and /tmp/ for cp) are called positional arguments. They can be optional or compulsory.

Let's make our own small command, fib.py, which should work as given below

Usage: python fib.py n
computes nth fibonacci number
In [10]:
%%file fib.py
import argparse

def fib(n):
    prev = 1
    current = 1
    
    for i in range(2,n):
        current, prev = prev+current, current
    return current

def parse_args():
    p = argparse.ArgumentParser()
    p.add_argument("n", help="n for computing nth fiboinacci number",
                  type=int)
    return p.parse_args()


def main():
    args = parse_args()
    print(args)
    print(fib(args.n))

if __name__ == "__main__":
    main()
Writing fib.py
In [12]:
!python fib.py 
usage: fib.py [-h] n
fib.py: error: the following arguments are required: n
In [13]:
!python fib.py -h
usage: fib.py [-h] n

positional arguments:
  n           n for computing nth fiboinacci number

optional arguments:
  -h, --help  show this help message and exit
In [14]:
!python fib.py 10
Namespace(n=10)
55

Let's extend our command to print the sequence of fibonacci numbers up to the nth one. We add one optional argument, -s, to our command; if it is given, the command should print the whole sequence and not just a single number.

In [15]:
%%file fib.py
import argparse

def fib(n):
    prev = 1
    current = 1
    
    for i in range(2,n):
        current, prev = prev+current, current
    return current

def printfiblist(n):
    prev, current = 1, 1
    print(prev, current, end=" ")
    for i in range(2, n):
        current, prev = prev+ current, current
        print(current, end=" ")

def parse_args():
    p = argparse.ArgumentParser()
    p.add_argument("n", 
                   help="n for computing nth fiboinacci number",
                   type=int)
    p.add_argument("-s", "--sequence",
                  help="Print sequence of fibonaci",
                  action="store_true")
    return p.parse_args()


def main():
    args = parse_args()
    print(args)
    if args.sequence:
        printfiblist(args.n)
    else:
        print(fib(args.n))

if __name__ == "__main__":
    main()
Overwriting fib.py
In [16]:
!python fib.py -h
usage: fib.py [-h] [-s] n

positional arguments:
  n               n for computing nth fiboinacci number

optional arguments:
  -h, --help      show this help message and exit
  -s, --sequence  Print sequence of fibonaci
In [19]:
!python fib.py --sequence 10
Namespace(n=10, sequence=True)
1 1 2 3 5 8 13 21 34 55 
In [20]:
!python fib.py -s 5
Namespace(n=5, sequence=True)
1 1 2 3 5 
In [21]:
!python fib.py 5
Namespace(n=5, sequence=False)
5
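
parse_args also accepts an explicit list of strings, which makes it possible to try out a parser without going through the shell. A sketch of the same two arguments as in fib.py; the helper name build_parser is an assumption made for this illustration:

```python
import argparse

def build_parser():
    """Build a parser with one positional and one optional argument."""
    p = argparse.ArgumentParser()
    p.add_argument("n", type=int,
                   help="n for computing nth fibonacci number")
    p.add_argument("-s", "--sequence", action="store_true",
                   help="print the sequence instead of a single number")
    return p

parser = build_parser()

# Pass the argument list directly instead of reading sys.argv.
with_flag = parser.parse_args(["-s", "5"])
without_flag = parser.parse_args(["5"])

print(with_flag.n, with_flag.sequence)        # 5 True
print(without_flag.n, without_flag.sequence)  # 5 False
```

Because of type=int the value of n arrives as an integer, and action="store_true" makes sequence default to False when -s is absent.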

References

Search for:

  1. Python documentation: the official tutorial is good enough to get started
  2. Structure and Interpretation of Computer Programs
  3. bangpypers community in bangalore