Advanced Python Training at Arcesium - Day 2¶

Sep 25-27, 2019 Vikrant Patil

These notes are available online at http://notes.pipal.in/2020/arcesium_advanced_feb/day2.html

We will be using Python 3.7 from Anaconda for this training. You can download it from

https://www.anaconda.com/download/

Understanding iterations¶

for n in [2, 3, 4, 5, 6, 7]:
    print(n)

for c in "This is a string to test for loop":
    print(c, end=",")

T,h,i,s, ,i,s, ,a, ,s,t,r,i,n,g, ,t,o, ,t,e,s,t, ,f,o,r, ,l,o,o,p,

for item in {"a":True, "b":False}:
    print(item)

a
b

The iteration protocol¶

items = [1, 2, 3, 4, 5]

itr_items =  iter(items)

itr_items

<list_iterator at 0x7f5783d17a20>

next(itr_items)

1

next(itr_items)

2

next(itr_items)

3

next(itr_items)

4

next(itr_items)

5

next(itr_items)

---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-14-8f3d308df4eb> in <module>
----> 1 next(itr_items)

StopIteration:
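
The cells above are what a for loop does behind the scenes. A minimal sketch of that equivalence follows; the helper name manual_for is hypothetical and only used for illustration.

```python
def manual_for(iterable, action):
    """Apply action to each item, spelling out what a for loop does."""
    iterator = iter(iterable)        # step 1: ask the iterable for an iterator
    while True:
        try:
            item = next(iterator)    # step 2: pull the next item
        except StopIteration:        # step 3: StopIteration signals the end
            break
        action(item)


collected = []
manual_for([1, 2, 3, 4, 5], collected.append)
print(collected)

keys = []
manual_for({"a": True, "b": False}, keys.append)  # iterating a dict gives its keys
print(keys)
```

The for statement hides the try/except, which is why the StopIteration traceback only shows up when next is called by hand.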

generators¶

def squares(numbers):
    for n in numbers:
        yield n*n

squares

<function __main__.squares(numbers)>

s =squares([2,3,4])

s

<generator object squares at 0x7f5783c88750>

for i in s:
    print(i)

4
9
16

def squares(numbers):
    print("Begin squares")
    for n in numbers:
        print("Computing square of ", n)
        yield n*n
        print("Back to squares")
    print("Finished squares")

sqrs = squares([4,5,6])

sqrs

<generator object squares at 0x7f5783c887c8>

next(sqrs)

Begin squares
Computing square of  4

16

next(sqrs)

Back to squares
Computing square of  5

25

next(sqrs)

Back to squares
Computing square of  6

36

next(sqrs)

Back to squares
Finished squares

---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-26-6a5cc5500491> in <module>
----> 1 next(sqrs)

StopIteration:

problems

Write a generator countdown which works exactly opposite to range!
Can we write a term generator for yesterday's piseries? How can we use this generator to sum the series for pi?
Is it possible to write an infinite sequence generator? Write an infinite fib series generator.
How will we work with infinite sequences? Can you write a function called take which takes only n items from a given sequence?
```
>>> ones = infiniteones()
>>> take(ones, 5)
[1, 1, 1, 1, 1]
```
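
One possible sketch for the last problem, assuming itertools.islice is acceptable for the exercise. The body of infiniteones is an assumption; only its name appears in the example above.

```python
from itertools import islice


def infiniteones():
    """Yield 1 forever (assumed implementation of the generator named above)."""
    while True:
        yield 1


def take(seq, n):
    """Return at most the first n items of any iterable as a list."""
    return list(islice(seq, n))


ones = infiniteones()
print(take(ones, 5))
```

islice stops quietly if the sequence runs out before n items, and it accepts plain lists as well as generators.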

def hold():
    print("Enter ....")
    yield 1
    print("After 1")
    print("Going 2")
    yield 2
    print("After 2")
    print("Going 3")
    yield 3
    print("After 3")
    print("Stopping....")

h = hold()

next(h)

Enter ....

1

next(h)

After 1
Going 2

2

next(h)

After 2
Going 3

3

next(h)

After 3
Stopping....

---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-32-31146b9ab14d> in <module>
----> 1 next(h)

StopIteration:

def foo(x):
    if x:
        return 1
    else:
        yield 0

f = foo(True)

f

<generator object foo at 0x7f5783c885e8>

next(f)

---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-41-aff1dd02a623> in <module>
----> 1 next(f)

StopIteration: 1

f = foo(True)

x = next(f)

---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-43-dc64844b6993> in <module>
----> 1 x = next(f)

StopIteration: 1

print(x)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-44-fc17d851ef81> in <module>
----> 1 print(x)

NameError: name 'x' is not defined
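
The traceback above shows that a return value inside a generator travels on the StopIteration exception, so a plain assignment never receives it. A small sketch of how the value could be picked up, reusing foo from above (the name result is just for illustration):

```python
def foo(x):
    if x:
        return 1      # inside a generator this raises StopIteration(1)
    else:
        yield 0


f = foo(True)
try:
    next(f)
except StopIteration as stop:
    result = stop.value    # the returned value is carried by the exception

print(result)
```

A for loop swallows the StopIteration, so the returned value is silently dropped there.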

def loop():
    n = 0
    while True:
        yield n
        
        if n == 3:
            return 
        n += 1

l = loop()

next(l)

0

next(l)

1

next(l)

2

next(l)

3

next(l)

---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-51-cdc8a39da60d> in <module>
----> 1 next(l)

StopIteration:

help(max)

Help on built-in function max in module builtins:

max(...)
    max(iterable, *[, default=obj, key=func]) -> value
    max(arg1, arg2, *args, *[, key=func]) -> value
    
    With a single iterable argument, return its biggest item. The
    default keyword-only argument specifies an object to return if
    the provided iterable is empty.
    With two or more arguments, return the largest argument.

def countdown(n):
    while n > 0:
        yield n 
        n -= 1

for i in countdown(4):
    print(i, end=",")

4,3,2,1,

def piseries():
    n = 1
    while True:
        yield 8/((4*n-3)*(4*n-1))
        n += 1

def take(seq, n):
    return [next(seq) for _ in range(n)]

take(piseries(), 5)

[2.6666666666666665,
 0.22857142857142856,
 0.08080808080808081,
 0.041025641025641026,
 0.02476780185758514]

sum(take(piseries(), 1000))

3.141092653621038

def fibseries():
    cur, next_ = 1, 1
    while True:
        yield cur
        cur , next_ = next_, cur+next_

take(fibseries(), 10)

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

cntd3 = countdown(3)

for i in cntd3:
    print(i, end=",")

3,2,1,

for i in cntd3:
    print(i, end=",")

c3 = countdown(3)

copyc3 = c3

for i in c3:
    print(i, end=",")

3,2,1,

for i in copyc3:
    print(i, end=",")
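
The second loops above print nothing because a generator can be consumed only once, and assigning it to another name does not copy it. A short sketch of this, with countdown repeated so the block runs on its own:

```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1


c3 = countdown(3)
alias = c3                    # same generator object, not a copy
first_pass = list(c3)         # consumes the generator
second_pass = list(alias)     # nothing left: alias shares the same state

fresh = list(countdown(3))    # calling the function again gives a new generator
print(alias is c3, first_pass, second_pass, fresh)
```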

Building data pipeline using generators¶

import os

def take(seq, n):
    return [next(seq) for _ in range(n)]

def find(root):
    for path, dirnames, filenames in os.walk(root):
        for f in filenames:
            yield os.path.join(path, f)
            
def grep(pattern, seq):
    return (x for x in seq if pattern in x) # this is called a generator expression

files = find("/home/vikrant/trainings")
pyfiles = grep(".py", files)
print(take(pyfiles, 5))

['/home/vikrant/trainings/2018/vmware-advanced-apr/bank1.py', '/home/vikrant/trainings/2018/vmware-advanced-apr/bank0.py', '/home/vikrant/trainings/2018/vmware-advanced-apr/commands.py', '/home/vikrant/trainings/2018/vmware-advanced-apr/sockets.py~', '/home/vikrant/trainings/2018/vmware-advanced-apr/memoize.py']

def readlines(filenames):
    for file in filenames:
        with open(file) as f:
            yield from f

def count(seq):
    return sum(1 for item in seq) # this is also a generator expression

files = find("/home/vikrant/trainings/")
csvfiles = grep(".csv", files)
lines = readlines(csvfiles)
count(lines)

6813

import re
def grep(pattern, seq):
    p = re.compile(pattern)
    return (x for x in seq if p.match(x)) # this is called a generator expression

files = find("/home/vikrant/trainings/")
pyfiles = grep(r"[\w\/]+\.py", files)
take(pyfiles, 5)

['/home/vikrant/trainings/nakul/bank1.py',
 '/home/vikrant/trainings/nakul/bank0.py',
 '/home/vikrant/trainings/nakul/bank2.py',
 '/home/vikrant/trainings/nakul/functions4.py',
 '/home/vikrant/trainings/nakul/functions.py']

import re

pattern = re.compile(r"\w+.py")
pattern.match("/vikrant/trainings/hello.py")
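
The cell above shows no output because match returned None: match only tries the pattern at the start of the string. A small sketch contrasting match with search on the same path; the variable names are only for illustration.

```python
import re

path = "/vikrant/trainings/hello.py"

# match() succeeds only if the pattern matches at the start of the string.
# The path starts with "/", which \w does not match, so this is None.
anchored = re.compile(r"\w+.py").match(path)

# search() scans the string for the first position where the pattern matches.
scanned = re.compile(r"\w+\.py").search(path)

# A pattern that also accepts "/" can match from the very first character.
from_start = re.compile(r"[\w/]+\.py").match(path)

print(anchored, scanned.group(), from_start.group())
```

This is why the grep calls in these notes use a pattern like [\w\/]+\.py for full paths.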

files = find("/home/vikrant/trainings/")
pyfiles = grep(r"[\w\/]+\.py", files)
lines = readlines(pyfiles)
funcs = grep(r"def .*", lines)
count(funcs)

199

files = find("/home/vikrant/trainings/")
pyfiles = grep(r"[\w\/]+\.py", files)
lines = readlines(pyfiles)
funcs = grep(r"def .*", lines)

next(funcs)

'def make_account():\n'

count(funcs)

198

problem

https://ia802902.us.archive.org/4/items/prideandprejudic01342gut/pandp12.txt

Write a function get_paragraphs to split the text in the above text file into paragraphs. An empty line marks the end of a paragraph.
- How many paragraphs are there?
- Which is the longest paragraph?

import requests

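def wget(url, filename):
    # download url and save the response body as a text file
    resp = requests.get(url)
    resp.raise_for_status()  # stop on HTTP errors instead of saving an error page
    with open(filename, "w") as f:
        f.write(resp.text)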

novelurl = "https://ia802902.us.archive.org/4/items/prideandprejudic01342gut/pandp12.txt"
wget(novelurl, "pandp.txt")

!tail pandp.txt

!head pandp.txt

def get_paragraphs(lines):
    para = []
    for line in lines:
        if line.strip() !="":
            para.append(line.strip())
        elif para:
            yield "\n".join(para)
            para = []
    if para:
        yield "\n".join(para)
        
def get_paragraphs_(lines):
    para = ""
    for line in lines:
        line = line.strip()
        if line == "":
            if para:
                yield para
                para = ""
        else:
            # join lines with a newline, without a leading newline on the first line
            para = para + "\n" + line if para else line
    if para:
        yield para

lines = readlines(["pandp.txt"])
count(get_paragraphs(lines))

2202

!wc pandp.txt

 14583 123882 717331 pandp.txt

def test_get_paragraphs(func):
    def append_newl(items):
        return [item+"\n" for item in items]
    lines = [""]
    assert count(func(append_newl(lines))) == 0
    lines = ["A","B","","","C"]
    assert count(func(append_newl(lines))) == 2
    lines = ["A","","B","","","C","D"]
    assert count(func(append_newl(lines))) == 3
    assert max(func(append_newl(lines)), key=len)=="C\nD"
    
#test_get_paragraphs(get_paragraphs_)
test_get_paragraphs(get_paragraphs)

numpy¶

!python -m pip install numpy

import numpy as np

a = np.array([1,2,3,4,5])

a

array([1, 2, 3, 4, 5])

a.shape

(5,)

a.ndim

1

a100 = np.arange(100).reshape(10,10)

a100

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])

a100.shape

(10, 10)

a100.ndim

2

a100.dtype

dtype('int64')

a100[0]

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

a100[-1]

array([90, 91, 92, 93, 94, 95, 96, 97, 98, 99])

a100[:,0]

array([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

a100[1,:]

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])

a100

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])

various ways to create arrays¶

np.zeros(100).reshape(20,5)

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

z = _

z.dtype

dtype('float64')

help(np.zeros)

Help on built-in function zeros in module numpy:

zeros(...)
    zeros(shape, dtype=float, order='C')
    
    Return a new array of given shape and type, filled with zeros.
    
    Parameters
    ----------
    shape : int or tuple of ints
        Shape of the new array, e.g., ``(2, 3)`` or ``2``.
    dtype : data-type, optional
        The desired data-type for the array, e.g., `numpy.int8`.  Default is
        `numpy.float64`.
    order : {'C', 'F'}, optional, default: 'C'
        Whether to store multi-dimensional data in row-major
        (C-style) or column-major (Fortran-style) order in
        memory.
    
    Returns
    -------
    out : ndarray
        Array of zeros with the given shape, dtype, and order.
    
    See Also
    --------
    zeros_like : Return an array of zeros with shape and type of input.
    empty : Return a new uninitialized array.
    ones : Return a new array setting values to one.
    full : Return a new array of given shape filled with value.
    
    Examples
    --------
    >>> np.zeros(5)
    array([ 0.,  0.,  0.,  0.,  0.])
    
    >>> np.zeros((5,), dtype=int)
    array([0, 0, 0, 0, 0])
    
    >>> np.zeros((2, 1))
    array([[ 0.],
           [ 0.]])
    
    >>> s = (2,2)
    >>> np.zeros(s)
    array([[ 0.,  0.],
           [ 0.,  0.]])
    
    >>> np.zeros((2,), dtype=[('x', 'i4'), ('y', 'i4')]) # custom dtype
    array([(0, 0), (0, 0)],
          dtype=[('x', '<i4'), ('y', '<i4')])

np.zeros(10, dtype=np.int16)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int16)

np.zeros_like(a100)

array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

np.ones_like(a100)

array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])

np.asarray([1, 2, 3, 4, 5])

array([1, 2, 3, 4, 5])

np.empty(100).reshape(25,4)

array([[            nan, 0.00000000e+000, 4.94065646e-324,
        0.00000000e+000],
       [4.44659081e-323, 6.91758644e-310, 4.65489148e-310,
                    nan],
       [0.00000000e+000,             nan, 6.91758594e-310,
        4.94065646e-324],
       [3.55727265e-322, 3.45845952e-323, 6.91758644e-310,
        4.65489148e-310],
       [0.00000000e+000, 4.94065646e-324, 4.94065646e-324,
        6.91758594e-310],
       [4.94065646e-324, 7.11454530e-322, 1.48219694e-323,
        6.91758644e-310],
       [4.65489148e-310, 0.00000000e+000, 4.94065646e-324,
        4.94065646e-324],
       [0.00000000e+000, 4.94065646e-324, 1.06718180e-321,
        4.44659081e-323],
       [6.91758644e-310, 4.65489148e-310, 0.00000000e+000,
        1.97626258e-323],
       [4.94065646e-324, 4.94065646e-323, 6.91749408e-310,
        1.42290906e-321],
       [4.44659081e-323, 6.91758644e-310, 4.65489148e-310,
        3.95252517e-323],
       [1.97626258e-323, 4.94065646e-324, 0.00000000e+000,
        4.94065646e-324],
       [1.77863633e-321, 4.44659081e-323, 6.91758644e-310,
        4.65489148e-310],
       [0.00000000e+000, 2.96439388e-323, 1.48219694e-323,
        0.00000000e+000],
       [4.94065646e-324, 2.13436359e-321, 5.43472210e-323,
        6.91758644e-310],
       [4.65489148e-310, 9.38724727e-323, 2.96439388e-323,
        1.48219694e-323],
       [0.00000000e+000, 4.94065646e-324, 2.49009086e-321,
        4.94065646e-323],
       [6.91758644e-310, 4.65489148e-310, 0.00000000e+000,
        0.00000000e+000],
       [0.00000000e+000, 0.00000000e+000, 4.94065646e-324,
        0.00000000e+000],
       [0.00000000e+000, 0.00000000e+000, 0.00000000e+000,
        0.00000000e+000],
       [0.00000000e+000, 0.00000000e+000, 0.00000000e+000,
        6.91758655e-310],
       [0.00000000e+000, 2.13436359e-321, 9.88131292e-324,
        6.91758655e-310],
       [4.65489169e-310,             nan, 1.28457068e-322,
                    nan],
       [6.91758655e-310, 4.65480997e-310, 3.59679790e-321,
        4.44659081e-323],
       [6.91758655e-310, 4.65489169e-310, 4.94065646e-324,
                    nan]])

np.empty_like(range(10))

array([140013499548336, 140013499126056, 140013499126224, 140013499126280,
       140013499127960, 140013499127064, 140013499127344, 140013499128072,
       140013499128240, 140013499127400])

Access patterns¶

a100 = np.arange(100).reshape(10,10)

a100[:5, :5]

array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34],
       [40, 41, 42, 43, 44]])

a100[5:,5:]

array([[55, 56, 57, 58, 59],
       [65, 66, 67, 68, 69],
       [75, 76, 77, 78, 79],
       [85, 86, 87, 88, 89],
       [95, 96, 97, 98, 99]])

a100[:5,5:]

array([[ 5,  6,  7,  8,  9],
       [15, 16, 17, 18, 19],
       [25, 26, 27, 28, 29],
       [35, 36, 37, 38, 39],
       [45, 46, 47, 48, 49]])

subview = a100[:5, :5]

subview

array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34],
       [40, 41, 42, 43, 44]])

subview.shape

(5, 5)

subview[0,0]= -1

subview

array([[-1,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34],
       [40, 41, 42, 43, 44]])

type(subview)

numpy.ndarray

a100

array([[-1,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])

type(a100)

numpy.ndarray

copy_subview = subview.copy()

copy_subview

array([[-1,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34],
       [40, 41, 42, 43, 44]])

copy_subview[0,0] = 0

copy_subview

array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34],
       [40, 41, 42, 43, 44]])

subview

array([[-1,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34],
       [40, 41, 42, 43, 44]])

a100

array([[-1,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])
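
The cells above show that a slice is a view onto the original array while .copy() is independent. A minimal sketch that checks this directly with np.shares_memory; the names grid, view and independent are only for illustration.

```python
import numpy as np

grid = np.arange(100).reshape(10, 10)

view = grid[:5, :5]                  # basic slicing returns a view
independent = grid[:5, :5].copy()    # .copy() allocates its own storage

view[0, 0] = -1                      # writes through to grid
independent[0, 0] = 42               # does not touch grid

print(np.shares_memory(grid, view))
print(np.shares_memory(grid, independent))
print(grid[0, 0])
```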

operations¶

a = np.array(range(10))

a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

a > 4

array([False, False, False, False, False,  True,  True,  True,  True,
        True])

a[a>3]

array([4, 5, 6, 7, 8, 9])

a + 4

array([ 4,  5,  6,  7,  8,  9, 10, 11, 12, 13])

a - 5

array([-5, -4, -3, -2, -1,  0,  1,  2,  3,  4])

a * 2

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

a **2

array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])

a + a

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

a*a

array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])

np.exp(a)

array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01,
       5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03,
       2.98095799e+03, 8.10308393e+03])

a100

array([[-1,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])

a100.max()

99

a100.min()

-1

a100.std()

28.883384496973342

a100.sum()

4949

a100.cumsum()

array([  -1,    0,    2,    5,    9,   14,   20,   27,   35,   44,   54,
         65,   77,   90,  104,  119,  135,  152,  170,  189,  209,  230,
        252,  275,  299,  324,  350,  377,  405,  434,  464,  495,  527,
        560,  594,  629,  665,  702,  740,  779,  819,  860,  902,  945,
        989, 1034, 1080, 1127, 1175, 1224, 1274, 1325, 1377, 1430, 1484,
       1539, 1595, 1652, 1710, 1769, 1829, 1890, 1952, 2015, 2079, 2144,
       2210, 2277, 2345, 2414, 2484, 2555, 2627, 2700, 2774, 2849, 2925,
       3002, 3080, 3159, 3239, 3320, 3402, 3485, 3569, 3654, 3740, 3827,
       3915, 4004, 4094, 4185, 4277, 4370, 4464, 4559, 4655, 4752, 4850,
       4949])

# note: in newer SciPy releases this function is available as scipy.datasets.face
from scipy.misc import face

image = face(gray=True)

image

array([[114, 130, 145, ..., 119, 129, 137],
       [ 83, 104, 123, ..., 118, 134, 146],
       [ 68,  88, 109, ..., 119, 134, 145],
       ...,
       [ 98, 103, 116, ..., 144, 143, 143],
       [ 94, 104, 120, ..., 143, 142, 142],
       [ 94, 106, 119, ..., 142, 141, 140]], dtype=uint8)

from matplotlib import pyplot  as plt

def imshow(img):
    plt.imshow(img, cmap=plt.cm.gray)
    plt.show()

%matplotlib inline

imshow(a100)

imshow(image)

image

array([[114, 130, 145, ..., 119, 129, 137],
       [ 83, 104, 123, ..., 118, 134, 146],
       [ 68,  88, 109, ..., 119, 134, 145],
       ...,
       [ 98, 103, 116, ..., 144, 143, 143],
       [ 94, 104, 120, ..., 143, 142, 142],
       [ 94, 106, 119, ..., 142, 141, 140]], dtype=uint8)

negate = 255 - image

imshow(negate)

thumbnail = image[::3,::3]

imshow(thumbnail)

imshow(image[::5,::5])

imshow(image[::20,::20])

plain = np.zeros_like(thumbnail)

imshow(plain)

plain[::10,:] = 255
plain[:, ::10] = 255

imshow(plain)

plain[:12,:12]

array([[255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255],
       [255,   0,   0,   0,   0,   0,   0,   0,   0,   0, 255,   0],
       [255,   0,   0,   0,   0,   0,   0,   0,   0,   0, 255,   0],
       [255,   0,   0,   0,   0,   0,   0,   0,   0,   0, 255,   0],
       [255,   0,   0,   0,   0,   0,   0,   0,   0,   0, 255,   0],
       [255,   0,   0,   0,   0,   0,   0,   0,   0,   0, 255,   0],
       [255,   0,   0,   0,   0,   0,   0,   0,   0,   0, 255,   0],
       [255,   0,   0,   0,   0,   0,   0,   0,   0,   0, 255,   0],
       [255,   0,   0,   0,   0,   0,   0,   0,   0,   0, 255,   0],
       [255,   0,   0,   0,   0,   0,   0,   0,   0,   0, 255,   0],
       [255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255],
       [255,   0,   0,   0,   0,   0,   0,   0,   0,   0, 255,   0]],
      dtype=uint8)

small = np.zeros(100).reshape(10,10)

small[::3,:] = 255
small[:,::3] = 255
imshow(small)

imshow(thumbnail + plain)

imshow(0.75*thumbnail + 0.25*plain)

imshow((0.5*thumbnail + 0.5*plain)*2)

imshow(a100+200)

imshow(np.maximum(thumbnail, plain))
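
One thing to keep in mind with thumbnail + plain: both arrays are uint8, so any sum above 255 wraps around instead of saturating. A small sketch on made-up pixel values, assuming that clipping to 255 is the desired behaviour:

```python
import numpy as np

pixels = np.array([100, 200, 250], dtype=np.uint8)   # made-up sample values
overlay = np.array([100, 100, 10], dtype=np.uint8)

wrapped = pixels + overlay    # uint8 arithmetic wraps modulo 256

# Widen to a larger integer type, add, then clip back into 0..255.
wide_sum = pixels.astype(np.int16) + overlay.astype(np.int16)
saturated = np.clip(wide_sum, 0, 255).astype(np.uint8)

print(wrapped, saturated)
```

The weighted sums in the cells above avoid the wraparound because multiplying by a float turns the arrays into float64.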

problem

Swap the top-left and bottom-right corners of this image.

def swapcorners(img):
    imglike = img.copy()
    h, w = img.shape
    q1 = img[:h//2, :w//2].copy()
    q2 = img[h//2:,w//2:].copy()
    
    imglike[:h//2, :w//2] = q2
    imglike[h//2:, w//2:] = q1
    
    return imglike

imshow(swapcorners(thumbnail))

thumb = image[::10, ::10]

hthumb = np.hstack([thumb, thumb, thumb])
vthumb = np.vstack([hthumb, hthumb, hthumb])
imshow(vthumb)

imshow(np.flip(thumb))

Matplotlib¶

url : https://notes.pipal.in/2020/arcesium_advanced_feb/HYDERABAD-weather.csv

url = "https://notes.pipal.in/2020/arcesium_advanced_feb/HYDERABAD-weather.csv"
wget(url, "HYDERABAD-weather.csv")

!tail -n 5 HYDERABAD-weather.csv

594,HYDERABAD,December,1996,28.3,14.9,0.0
595,HYDERABAD,December,1997,28.7,19.2,40.6
596,HYDERABAD,December,1998,28.7,12.8,0.0
597,HYDERABAD,December,1999,29.0,14.2,0.0
598,HYDERABAD,December,2000,29.6,13.3,1.0

import csv

with open("HYDERABAD-weather.csv") as f:
    data = list(csv.reader(f))

type(data)

list

data[0]

['', 'city', 'month', 'year', 'maxtemp', 'mintemp', 'rainfall']

numeric_data = data[1:]

data[0]

['', 'city', 'month', 'year', 'maxtemp', 'mintemp', 'rainfall']

numeric_data[:3]

[['0', 'HYDERABAD', 'January', '1951', '29.0', '14.8', '0.0'],
 ['1', 'HYDERABAD', 'January', '1952', '29.1', '13.6', '0.0'],
 ['2', 'HYDERABAD', 'January', '1953', '28.6', '14.6', '3.5']]

def floatcolumn(matrix, colnum):
    return [float(row[colnum]) for row in matrix]

maxtemp = floatcolumn(numeric_data, 4)

mintemp = floatcolumn(numeric_data, 5)

rainfall = floatcolumn(numeric_data, 6)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-241-ed8819080945> in <module>
----> 1 rainfall = floatcolumn(numeric_data, 6)

<ipython-input-238-435c5f6a0ecd> in floatcolumn(matrix, colnum)
      1 def floatcolumn(matrix, colnum):
----> 2     return [float(row[colnum]) for row in matrix]

<ipython-input-238-435c5f6a0ecd> in <listcomp>(.0)
      1 def floatcolumn(matrix, colnum):
----> 2     return [float(row[colnum]) for row in matrix]

ValueError: could not convert string to float:

def parsefloat(sf):
    try:
        return float(sf)
    except ValueError as v:
        print(v)
        return 0

def floatcolumn(matrix, colnum):
    return [parsefloat(row[colnum]) for row in matrix]

rainfall = floatcolumn(numeric_data, 6)

could not convert string to float:
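
Replacing a blank cell with 0 works for plotting, but it counts the missing reading as zero rainfall in any average. A possible alternative sketch, assuming NaN is an acceptable marker for missing values; parsefloat_nan and the sample list are hypothetical.

```python
import numpy as np


def parsefloat_nan(text):
    """Hypothetical variant of parsefloat: unparsable cells become NaN, not 0."""
    try:
        return float(text)
    except ValueError:
        return float("nan")


raw = ["0.0", "3.5", "", "40.6"]    # made-up sample with one blank cell
values = np.array([parsefloat_nan(s) for s in raw])

mean_ignoring_missing = np.nanmean(values)            # averages the three real readings
mean_with_zero_fill = np.nan_to_num(values).mean()    # blank cell counted as 0

print(mean_ignoring_missing, mean_with_zero_fill)
```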

plt.scatter(rainfall, maxtemp)

<matplotlib.collections.PathCollection at 0x7f575da2bf98>

plt.scatter(rainfall, mintemp)

<matplotlib.collections.PathCollection at 0x7f575d8e6978>

year = [int(row[3]) for row in numeric_data]

numeric_data[:6]

[['0', 'HYDERABAD', 'January', '1951', '29.0', '14.8', '0.0'],
 ['1', 'HYDERABAD', 'January', '1952', '29.1', '13.6', '0.0'],
 ['2', 'HYDERABAD', 'January', '1953', '28.6', '14.6', '3.5'],
 ['3', 'HYDERABAD', 'January', '1954', '28.2', '13.9', '0.0'],
 ['4', 'HYDERABAD', 'January', '1955', '28.0', '14.7', '0.0'],
 ['5', 'HYDERABAD', 'January', '1956', '28.1', '14.2', '0.0']]

numeric_data[-6:]

[['593', 'HYDERABAD', 'December', '1995', '28.9', '15.9', '0.0'],
 ['594', 'HYDERABAD', 'December', '1996', '28.3', '14.9', '0.0'],
 ['595', 'HYDERABAD', 'December', '1997', '28.7', '19.2', '40.6'],
 ['596', 'HYDERABAD', 'December', '1998', '28.7', '12.8', '0.0'],
 ['597', 'HYDERABAD', 'December', '1999', '29.0', '14.2', '0.0'],
 ['598', 'HYDERABAD', 'December', '2000', '29.6', '13.3', '1.0']]

plt.plot(year, rainfall)

[<matplotlib.lines.Line2D at 0x7f575d8b8b70>]

sorted_data = sorted(numeric_data, key= lambda r:r[3])

year = [int(row[3]) for row in sorted_data]

rainfall = floatcolumn(sorted_data, 6)

could not convert string to float:

plt.plot(year, rainfall)

[<matplotlib.lines.Line2D at 0x7f575d7f68d0>]

a100.mean()

49.49

months = np.array([row[2] for row in numeric_data])

months

array(['January', 'January', 'January', 'January', 'January', 'January',
       'January', 'January', 'January', 'January', 'January', 'January',
       'January', 'January', 'January', 'January', 'January', 'January',
       'January', 'January', 'January', 'January', 'January', 'January',
       'January', 'January', 'January', 'January', 'January', 'January',
       'January', 'January', 'January', 'January', 'January', 'January',
       'January', 'January', 'January', 'January', 'January', 'January',
       'January', 'January', 'January', 'January', 'January', 'January',
       'January', 'January', 'February', 'February', 'February',
       'February', 'February', 'February', 'February', 'February',
       'February', 'February', 'February', 'February', 'February',
       'February', 'February', 'February', 'February', 'February',
       'February', 'February', 'February', 'February', 'February',
       'February', 'February', 'February', 'February', 'February',
       'February', 'February', 'February', 'February', 'February',
       'February', 'February', 'February', 'February', 'February',
       'February', 'February', 'February', 'February', 'February',
       'February', 'February', 'February', 'February', 'February',
       'February', 'February', 'March', 'March', 'March', 'March',
       'March', 'March', 'March', 'March', 'March', 'March', 'March',
       'March', 'March', 'March', 'March', 'March', 'March', 'March',
       'March', 'March', 'March', 'March', 'March', 'March', 'March',
       'March', 'March', 'March', 'March', 'March', 'March', 'March',
       'March', 'March', 'March', 'March', 'March', 'March', 'March',
       'March', 'March', 'March', 'March', 'March', 'March', 'March',
       'March', 'March', 'March', 'March', 'April', 'April', 'April',
       'April', 'April', 'April', 'April', 'April', 'April', 'April',
       'April', 'April', 'April', 'April', 'April', 'April', 'April',
       'April', 'April', 'April', 'April', 'April', 'April', 'April',
       'April', 'April', 'April', 'April', 'April', 'April', 'April',
       'April', 'April', 'April', 'April', 'April', 'April', 'April',
       'April', 'April', 'April', 'April', 'April', 'April', 'April',
       'April', 'April', 'April', 'April', 'May', 'May', 'May', 'May',
       'May', 'May', 'May', 'May', 'May', 'May', 'May', 'May', 'May',
       'May', 'May', 'May', 'May', 'May', 'May', 'May', 'May', 'May',
       'May', 'May', 'May', 'May', 'May', 'May', 'May', 'May', 'May',
       'May', 'May', 'May', 'May', 'May', 'May', 'May', 'May', 'May',
       'May', 'May', 'May', 'May', 'May', 'May', 'May', 'May', 'May',
       'May', 'June', 'June', 'June', 'June', 'June', 'June', 'June',
       'June', 'June', 'June', 'June', 'June', 'June', 'June', 'June',
       'June', 'June', 'June', 'June', 'June', 'June', 'June', 'June',
       'June', 'June', 'June', 'June', 'June', 'June', 'June', 'June',
       'June', 'June', 'June', 'June', 'June', 'June', 'June', 'June',
       'June', 'June', 'June', 'June', 'June', 'June', 'June', 'June',
       'June', 'June', 'June', 'July', 'July', 'July', 'July', 'July',
       'July', 'July', 'July', 'July', 'July', 'July', 'July', 'July',
       'July', 'July', 'July', 'July', 'July', 'July', 'July', 'July',
       'July', 'July', 'July', 'July', 'July', 'July', 'July', 'July',
       'July', 'July', 'July', 'July', 'July', 'July', 'July', 'July',
       'July', 'July', 'July', 'July', 'July', 'July', 'July', 'July',
       'July', 'July', 'July', 'July', 'July', 'August', 'August',
       'August', 'August', 'August', 'August', 'August', 'August',
       'August', 'August', 'August', 'August', 'August', 'August',
       'August', 'August', 'August', 'August', 'August', 'August',
       'August', 'August', 'August', 'August', 'August', 'August',
       'August', 'August', 'August', 'August', 'August', 'August',
       'August', 'August', 'August', 'August', 'August', 'August',
       'August', 'August', 'August', 'August', 'August', 'August',
       'August', 'August', 'August', 'August', 'August', 'August',
       'September', 'September', 'September', 'September', 'September',
       'September', 'September', 'September', 'September', 'September',
       'September', 'September', 'September', 'September', 'September',
       'September', 'September', 'September', 'September', 'September',
       'September', 'September', 'September', 'September', 'September',
       'September', 'September', 'September', 'September', 'September',
       'September', 'September', 'September', 'September', 'September',
       'September', 'September', 'September', 'September', 'September',
       'September', 'September', 'September', 'September', 'September',
       'September', 'September', 'September', 'September', 'September',
       'October', 'October', 'October', 'October', 'October', 'October',
       'October', 'October', 'October', 'October', 'October', 'October',
       'October', 'October', 'October', 'October', 'October', 'October',
       'October', 'October', 'October', 'October', 'October', 'October',
       'October', 'October', 'October', 'October', 'October', 'October',
       'October', 'October', 'October', 'October', 'October', 'October',
       'October', 'October', 'October', 'October', 'October', 'October',
       'October', 'October', 'October', 'October', 'October', 'October',
       'October', 'October', 'November', 'November', 'November',
       'November', 'November', 'November', 'November', 'November',
       'November', 'November', 'November', 'November', 'November',
       'November', 'November', 'November', 'November', 'November',
       'November', 'November', 'November', 'November', 'November',
       'November', 'November', 'November', 'November', 'November',
       'November', 'November', 'November', 'November', 'November',
       'November', 'November', 'November', 'November', 'November',
       'November', 'November', 'November', 'November', 'November',
       'November', 'November', 'November', 'November', 'November',
       'November', 'November', 'December', 'December', 'December',
       'December', 'December', 'December', 'December', 'December',
       'December', 'December', 'December', 'December', 'December',
       'December', 'December', 'December', 'December', 'December',
       'December', 'December', 'December', 'December', 'December',
       'December', 'December', 'December', 'December', 'December',
       'December', 'December', 'December', 'December', 'December',
       'December', 'December', 'December', 'December', 'December',
       'December', 'December', 'December', 'December', 'December',
       'December', 'December', 'December', 'December', 'December',
       'December', 'December'], dtype='<U9')

rainfall = np.array(floatcolumn(numeric_data, 6))

could not convert string to float:

rainfall[:5]

array([0. , 0. , 3.5, 0. , 0. ])

a = np.array(range(5))

b = np.array(['a','b','a','b','c'])

b=="a"

array([ True, False,  True, False, False])

a[b=="a"]

array([0, 2])

rainfall[months=="March"].mean()

15.264000000000001

months=="March"

False

def get_mean_rainfall(rainfall, months, month):
    return rainfall[months==month].mean()

import datetime

set(months)

{'April',
 'August',
 'December',
 'February',
 'January',
 'July',
 'June',
 'March',
 'May',
 'November',
 'October',
 'September'}

uniqmonths = list(set(months))
rainfall_ = [get_mean_rainfall(rainfall, months, month) for month in uniqmonths]

plt.bar(uniqmonths, rainfall_)

<BarContainer object of 12 artists>
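Because set(months) has no defined order, the bars above can come out in an arbitrary month order. A minimal sketch of one way to sort the names into calendar order with the standard-library calendar module follows. The helper name calendar_order and the sample list are illustrative and not part of the notebook, and the sketch assumes the default English locale for the month names.

```python
import calendar

# calendar.month_name[1:] is ['January', ..., 'December'] in the default locale.
MONTH_INDEX = {name: i for i, name in enumerate(calendar.month_name[1:], start=1)}

def calendar_order(month_names):
    """Return the distinct month names sorted January..December (hypothetical helper)."""
    return sorted(set(month_names), key=MONTH_INDEX.__getitem__)

sample = ["March", "January", "March", "December", "June"]
ordered = calendar_order(sample)
print(ordered)
```

With the notebook's variables, this would amount to uniqmonths = calendar_order(months) before computing rainfall_.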

import altair as alt
import pandas as pd

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-287-364103eee631> in <module>
----> 1 import altair as alt
      2 import pandas as pd

ModuleNotFoundError: No module named 'altair'

!python -m pip install altair

Collecting altair
  Downloading https://files.pythonhosted.org/packages/a8/07/d8acf03571db619ff117df5730dd5c0b1ad0822aa02ad1084d73e2659442/altair-4.0.1-py3-none-any.whl (708kB)
     |████████████████████████████████| 716kB 579kB/s eta 0:00:01
Requirement already satisfied: toolz in /home/vikrant/anaconda3/lib/python3.7/site-packages (from altair) (0.10.0)
Requirement already satisfied: pandas in /home/vikrant/anaconda3/lib/python3.7/site-packages (from altair) (0.24.2)
Requirement already satisfied: jinja2 in /home/vikrant/anaconda3/lib/python3.7/site-packages (from altair) (2.10.1)
Requirement already satisfied: entrypoints in /home/vikrant/anaconda3/lib/python3.7/site-packages (from altair) (0.3)
Requirement already satisfied: jsonschema in /home/vikrant/anaconda3/lib/python3.7/site-packages (from altair) (3.0.1)
Requirement already satisfied: numpy in /home/vikrant/anaconda3/lib/python3.7/site-packages (from altair) (1.16.4)
Requirement already satisfied: python-dateutil>=2.5.0 in /home/vikrant/anaconda3/lib/python3.7/site-packages (from pandas->altair) (2.8.0)
Requirement already satisfied: pytz>=2011k in /home/vikrant/anaconda3/lib/python3.7/site-packages (from pandas->altair) (2019.1)
Requirement already satisfied: MarkupSafe>=0.23 in /home/vikrant/anaconda3/lib/python3.7/site-packages (from jinja2->altair) (1.1.1)
Requirement already satisfied: attrs>=17.4.0 in /home/vikrant/anaconda3/lib/python3.7/site-packages (from jsonschema->altair) (19.1.0)
Requirement already satisfied: pyrsistent>=0.14.0 in /home/vikrant/anaconda3/lib/python3.7/site-packages (from jsonschema->altair) (0.14.11)
Requirement already satisfied: setuptools in /home/vikrant/anaconda3/lib/python3.7/site-packages (from jsonschema->altair) (41.0.1)
Requirement already satisfied: six>=1.11.0 in /home/vikrant/anaconda3/lib/python3.7/site-packages (from jsonschema->altair) (1.12.0)
Installing collected packages: altair
Successfully installed altair-4.0.1

import altair as alt
import pandas as pd

%%file sales.txt
area,sales,profit
North,5,2
East,25,8
West,15,6
South,20,5
Central,10,3

Writing sales.txt

sales = pd.read_csv("sales.txt")

sales

alt.Chart(sales).mark_point()

alt.Chart(sales).mark_point().encode(y="area")

alt.Chart(sales).mark_point().encode(
    x="sales", 
    y="area")

alt.Chart(sales).mark_bar().encode(
    x="sales", 
    y="area")

alt.Chart(sales).mark_line().encode(
    x="sales", 
    y="area")

base = alt.Chart(sales).mark_bar().encode(
        x="sales", 
        y="area")

base.mark_circle()

base.encode(color="area")

base.encode(color="area", size="profit")

base.encode(color="area", size="profit").mark_circle()

base.to_json()

'{\n  "$schema": "https://vega.github.io/schema/vega-lite/v4.0.2.json",\n  "config": {\n    "view": {\n      "continuousHeight": 300,\n      "continuousWidth": 400\n    }\n  },\n  "data": {\n    "name": "data-e9a1bf97bac3c6f8642dc2ef7d8e4b49"\n  },\n  "datasets": {\n    "data-e9a1bf97bac3c6f8642dc2ef7d8e4b49": [\n      {\n        "area": "North",\n        "profit": 2,\n        "sales": 5\n      },\n      {\n        "area": "East",\n        "profit": 8,\n        "sales": 25\n      },\n      {\n        "area": "West",\n        "profit": 6,\n        "sales": 15\n      },\n      {\n        "area": "South",\n        "profit": 5,\n        "sales": 20\n      },\n      {\n        "area": "Central",\n        "profit": 3,\n        "sales": 10\n      }\n    ]\n  },\n  "encoding": {\n    "x": {\n      "field": "sales",\n      "type": "quantitative"\n    },\n    "y": {\n      "field": "area",\n      "type": "nominal"\n    }\n  },\n  "mark": "bar"\n}'

JSON¶

import json

d = {"a": 28.565,
     "b": 30,
     "c": [1, 2, 3]}

d

{'a': 28.565, 'b': 30, 'c': [1, 2, 3]}

json.dumps(d)

'{"a": 28.565, "b": 30, "c": [1, 2, 3]}'

jsondata = '{"a": 28.565, "b": 30, "c": [1, 2, 3]}'

json.loads(jsondata)

{'a': 28.565, 'b': 30, 'c': [1, 2, 3]}
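A small sketch of the same round trip, adding two details that often matter in practice: the indent and sort_keys options for readable, stable output, and the fact that JSON has no tuple type, so a tuple comes back as a list. The variable names here are illustrative.

```python
import json

record = {"b": 30, "a": 28.565, "c": (1, 2, 3)}  # note: "c" is a tuple

text = json.dumps(record, sort_keys=True)              # compact, keys sorted
pretty = json.dumps(record, indent=2, sort_keys=True)  # same data, human readable

restored = json.loads(text)
print(text)
print(pretty)
print(type(restored["c"]).__name__)  # the tuple was serialized as a JSON array
```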

url = "https://www.alphavantage.co/query?function=TIME_SERIES_INTRADAY&symbol=MSFT&interval=5min&outputsize=full&apikey=demo"

resp = requests.get(url)
data = resp.json()

type(data)

dict

data.keys()

dict_keys(['Meta Data', 'Time Series (5min)'])

data['Meta Data']

{'1. Information': 'Intraday (5min) open, high, low, close prices and volume',
 '2. Symbol': 'MSFT',
 '3. Last Refreshed': '2020-02-14 16:00:00',
 '4. Interval': '5min',
 '5. Output Size': 'Full size',
 '6. Time Zone': 'US/Eastern'}

pd.DataFrame(data['Time Series (5min)'])

pd.DataFrame(data['Time Series (5min)']).transpose()
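The values in this response appear to be strings rather than numbers (the trailing zeros in the displayed prices suggest so), so they usually need converting before any arithmetic. A minimal sketch without pandas, under that assumption: the two sample entries are copied from the table output in these notes, and the helper name to_rows is hypothetical.

```python
# Two entries shaped like data['Time Series (5min)']. The numbers are copied from
# the table output in these notes; storing them as strings is an assumption.
series = {
    "2020-02-14 16:00:00": {"1. open": "185.1200", "2. high": "185.4200",
                            "3. low": "185.0500", "4. close": "185.3400",
                            "5. volume": "1362354"},
    "2020-02-14 15:55:00": {"1. open": "184.9000", "2. high": "185.1950",
                            "3. low": "184.7950", "4. close": "185.1300",
                            "5. volume": "856529"},
}

def to_rows(series):
    """Return (timestamp, open, high, low, close, volume) tuples, oldest first."""
    rows = []
    for stamp in sorted(series):  # these timestamps sort chronologically as text
        bar = series[stamp]
        rows.append((stamp,
                     float(bar["1. open"]), float(bar["2. high"]),
                     float(bar["3. low"]), float(bar["4. close"]),
                     int(bar["5. volume"])))
    return rows

rows = to_rows(series)
print(rows[0])
```

With pandas, calling astype(float) on the transposed DataFrame would be a comparable conversion.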

XML¶

url = "http://www.thehindu.com"

resp = requests.get(url, params={"service":"rss"})

xmltext = resp.text

print(xmltext[:1300])

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
    <channel>
        <title>The Hindu - Home</title>
        <link>https://www.thehindu.com/</link>
        <description>Default RSS Feed</description>
        <language>en-us</language>
        <copyright>Copyright 2020 The Hindu</copyright>
        <item>
            <title><![CDATA[26 teams appointed to conduct eye tests for 2.6 lakh people: Collector]]></title>
            <author><![CDATA[Staff Reporter]]></author>
            <category><![CDATA[Andhra Pradesh]]></category>
            <link>https://www.thehindu.com/news/national/andhra-pradesh/26-teams-appointed-to-conduct-eye-tests-for-26-lakh-people-collector/article30852050.ece</link>
            <description><![CDATA[
                ‘Initially, the teams will complete medical tests in nine selected mandals’
            ]]></description>
            <pubDate><![CDATA[Tue, 18 Feb 2020 18:06:58 +0530]]></pubDate>
        </item>
        <item>
            <title><![CDATA[Envoy Sun Weidong says China will win battle against coronavirus]]></title>
            <author><![CDATA[PTI]]></author>
            <category><![CDATA[International]]></category>
            <link>https://www.thehindu.com/news/international/envoy-sun-weidong-says-china-will

from xml.etree import ElementTree as et

root = et.fromstring(xmltext)

items = root.findall(".//item")

type(items)

list

len(items)

100

items[0]

<Element 'item' at 0x7f574fd92958>

print(et.tostring(items[0]).decode())

<item>
            <title>26 teams appointed to conduct eye tests for 2.6 lakh people: Collector</title>
            <author>Staff Reporter</author>
            <category>Andhra Pradesh</category>
            <link>https://www.thehindu.com/news/national/andhra-pradesh/26-teams-appointed-to-conduct-eye-tests-for-26-lakh-people-collector/article30852050.ece</link>
            <description>
                &#8216;Initially, the teams will complete medical tests in nine selected mandals&#8217;
            </description>
            <pubDate>Tue, 18 Feb 2020 18:06:58 +0530</pubDate>
        </item>

for item in items[:10]:
    print(item.findtext("title"))
    print(item.findtext("link"))
    print("*"*30)

26 teams appointed to conduct eye tests for 2.6 lakh people: Collector
https://www.thehindu.com/news/national/andhra-pradesh/26-teams-appointed-to-conduct-eye-tests-for-26-lakh-people-collector/article30852050.ece
******************************
Envoy Sun Weidong says China will win battle against coronavirus
https://www.thehindu.com/news/international/envoy-sun-weidong-says-china-will-win-battle-against-coronavirus/article30852046.ece
******************************
Health officers inspect roadside eateries 
https://www.thehindu.com/news/national/karnataka/health-officers-inspect-roadside-eateries/article30852028.ece
******************************
Setting up Water Front Management Authority for Yamuna not possible, DDA tells NGT 
https://www.thehindu.com/news/national/setting-up-water-front-management-authority-for-yamuna-not-possible-dda-tells-ngt/article30852025.ece
******************************
Adventures await
https://www.thehindu.com/life-and-style/motoring/adventures-await/article30852018.ece
******************************
In the Ennore-Pulicat wetlands, livelihoods depend heavily on the area’s biodiversity 
https://www.thehindu.com/news/cities/chennai/in-the-ennore-pulicat-wetlands-livelihoods-depend-heavily-on-the-areas-biodiversity/article30851988.ece
******************************
RTC will operate 120 special buses to Siva temples for Maha Sivaratri: official
https://www.thehindu.com/news/national/andhra-pradesh/rtc-will-operate-120-special-buses-to-siva-temples-for-maha-sivaratri-official/article30851938.ece
******************************
Cosmic Ray, Caracas, Star Superior, Amalfi Sunrise, Code Of Honour and Adjudicate excel 
https://www.thehindu.com/sport/races/cosmic-ray-caracas-star-superior-amalfi-sunrise-code-of-honour-and-adjudicate-excel/article30851879.ece
******************************
Water supply snapped to tax evaders’ buildings, houses
https://www.thehindu.com/news/cities/Madurai/water-supply-snapped-to-tax-evaders-buildings-houses/article30851865.ece
******************************
TDB decries delay in payment of wages to workers 
https://www.thehindu.com/news/national/kerala/tdb-decries-delay-in-payment-of-wages-to-workers/article30851852.ece
******************************
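The same ElementTree calls can be tried without network access. The sketch below inlines a trimmed copy of the feed shown above and illustrates two points: CDATA sections come back as ordinary text, and findtext accepts a default for a missing child element. The sample string and the entries list are illustrative.

```python
from xml.etree import ElementTree as et

# A trimmed copy of the feed shown above, inlined so no network access is needed.
sample = """<rss version="2.0">
    <channel>
        <title>The Hindu - Home</title>
        <item>
            <title><![CDATA[Adventures await]]></title>
            <link>https://www.thehindu.com/life-and-style/motoring/adventures-await/article30852018.ece</link>
        </item>
    </channel>
</rss>"""

root = et.fromstring(sample)

entries = []
for item in root.findall(".//item"):
    entries.append({
        "title": item.findtext("title"),   # CDATA content is returned as plain text
        "link": item.findtext("link"),
        # findtext returns the default when the child element is absent
        "author": item.findtext("author", default="unknown"),
    })
print(entries)
```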

from xml.dom.minidom import parseString

root = parseString(xmltext)
items = root.getElementsByTagName("item")

for item in items[:10]:
    title = item.getElementsByTagName("title")[0]
    link = item.getElementsByTagName("link")[0]
    print(title.firstChild.data)
    print(link.firstChild.data)
    print("*"*30)

26 teams appointed to conduct eye tests for 2.6 lakh people: Collector
https://www.thehindu.com/news/national/andhra-pradesh/26-teams-appointed-to-conduct-eye-tests-for-26-lakh-people-collector/article30852050.ece
******************************
Envoy Sun Weidong says China will win battle against coronavirus
https://www.thehindu.com/news/international/envoy-sun-weidong-says-china-will-win-battle-against-coronavirus/article30852046.ece
******************************
Health officers inspect roadside eateries 
https://www.thehindu.com/news/national/karnataka/health-officers-inspect-roadside-eateries/article30852028.ece
******************************
Setting up Water Front Management Authority for Yamuna not possible, DDA tells NGT 
https://www.thehindu.com/news/national/setting-up-water-front-management-authority-for-yamuna-not-possible-dda-tells-ngt/article30852025.ece
******************************
Adventures await
https://www.thehindu.com/life-and-style/motoring/adventures-await/article30852018.ece
******************************
In the Ennore-Pulicat wetlands, livelihoods depend heavily on the area’s biodiversity 
https://www.thehindu.com/news/cities/chennai/in-the-ennore-pulicat-wetlands-livelihoods-depend-heavily-on-the-areas-biodiversity/article30851988.ece
******************************
RTC will operate 120 special buses to Siva temples for Maha Sivaratri: official
https://www.thehindu.com/news/national/andhra-pradesh/rtc-will-operate-120-special-buses-to-siva-temples-for-maha-sivaratri-official/article30851938.ece
******************************
Cosmic Ray, Caracas, Star Superior, Amalfi Sunrise, Code Of Honour and Adjudicate excel 
https://www.thehindu.com/sport/races/cosmic-ray-caracas-star-superior-amalfi-sunrise-code-of-honour-and-adjudicate-excel/article30851879.ece
******************************
Water supply snapped to tax evaders’ buildings, houses
https://www.thehindu.com/news/cities/Madurai/water-supply-snapped-to-tax-evaders-buildings-houses/article30851865.ece
******************************
TDB decries delay in payment of wages to workers 
https://www.thehindu.com/news/national/kerala/tdb-decries-delay-in-payment-of-wages-to-workers/article30851852.ece
******************************

Database¶

import sqlite3

conn = sqlite3.connect("data.db")

cur = conn.cursor()

cur.execute("create table person (name varchar(100), email varchar(100))")

<sqlite3.Cursor at 0x7f57505f3a40>

cur.execute("insert into person (name, email) values('alice', 'alice@wonder.land')")

<sqlite3.Cursor at 0x7f57505f3a40>

cur = cur.execute("select * from person")

cur.fetchall()

[('alice', 'alice@wonder.land')]

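def find(conn, email):
    # Caution: formatting user input into the SQL text is open to SQL injection;
    # find_ below uses a ? placeholder instead.
    q = "select * from person where email='{}'".format(email)
    print(q)
    cur = conn.cursor()
    return cur.execute(q).fetchall()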

find(conn, "alice@wonder.land")

select * from person where email='alice@wonder.land'

[('alice', 'alice@wonder.land')]

def find_(conn, email):
    # The ? placeholder passes email as a bound value, never as part of the SQL text.
    q = "select * from person where email=?"
    cur = conn.cursor()
    return cur.execute(q, (email,)).fetchall()

find_(conn, "alice@wonder.land")

[('alice', 'alice@wonder.land')]

conn.commit()

conn.close()

conn = sqlite3.connect("data.db")

find(conn, "*")

select * from person where email='*'

[]

find_(conn, "alice@wonder.land")

[('alice', 'alice@wonder.land')]
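The call find(conn, "*") above returns nothing because "*" is simply compared as a literal string. The formatted query is still open to injection, though, which the following sketch demonstrates on a throwaway in-memory database. The names find_unsafe and find_safe are hypothetical stand-ins for find and find_, and the attack string is just one illustrative input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database, separate from data.db
conn.execute("create table person (name varchar(100), email varchar(100))")
conn.executemany("insert into person values (?, ?)",
                 [("alice", "alice@wonder.land"), ("alex", "alex@zoo.in")])

def find_unsafe(conn, email):
    # String formatting pastes the input into the SQL text itself.
    q = "select * from person where email='{}'".format(email)
    return conn.execute(q).fetchall()

def find_safe(conn, email):
    # The ? placeholder sends the input as a value, never as SQL.
    return conn.execute("select * from person where email=?", (email,)).fetchall()

attack = "' or '1'='1"
leaked = find_unsafe(conn, attack)   # the where clause becomes: email='' or '1'='1'
blocked = find_safe(conn, attack)    # no person has this literal email
print(len(leaked), len(blocked))
conn.close()
```

The formatted version returns every row in the table for this input, while the parameterized version returns none.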

records = [
    ("alex", "alex@zoo.in"),
    ("Elsa", "elsa@frozen.mov"),
    ("ELisa", "elisa@hacker.hack")
]

cur = conn.cursor()
cur.executemany("insert into person values(?,?)", records)

<sqlite3.Cursor at 0x7f57505f3570>

cur.execute("select * from person").fetchall()

[('alice', 'alice@wonder.land'),
 ('alex', 'alex@zoo.in'),
 ('Elsa', 'elsa@frozen.mov'),
 ('ELisa', 'elisa@hacker.hack')]
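Inserts like the ones above are only saved once conn.commit() is called. As a sketch of an alternative, a sqlite3 connection can be used as a context manager, which commits if the block succeeds and rolls back if it raises. The example uses a separate in-memory database, and the "ghost" record is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table person (name varchar(100), email varchar(100))")
conn.commit()

records = [("alex", "alex@zoo.in"), ("Elsa", "elsa@frozen.mov")]

# Committed automatically when the block exits without an exception.
with conn:
    conn.executemany("insert into person values (?, ?)", records)

# Rolled back automatically because the block raises.
try:
    with conn:
        conn.execute("insert into person values (?, ?)",
                     ("ghost", "ghost@nowhere.example"))
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

names = [row[0] for row in conn.execute("select name from person order by rowid")]
print(names)
conn.close()  # the with block manages the transaction only; closing is still explicit
```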

To manage database tables as classes, one can use an ORM (Object Relational Mapping) through a library such as SQLAlchemy. More details can be seen at the library homepage.

Output of pd.DataFrame(data['Time Series (5min)']) from the JSON section above:
	2020-02-14 16:00:00	2020-02-14 15:55:00	2020-02-14 15:50:00	2020-02-14 15:45:00	2020-02-14 15:40:00	2020-02-14 15:35:00	2020-02-14 15:30:00	2020-02-14 15:25:00	2020-02-14 15:20:00	2020-02-14 15:15:00	...	2020-01-27 10:20:00	2020-01-27 10:15:00	2020-01-27 10:10:00	2020-01-27 10:05:00	2020-01-27 10:00:00	2020-01-27 09:55:00	2020-01-27 09:50:00	2020-01-27 09:45:00	2020-01-27 09:40:00	2020-01-27 09:35:00
1. open	185.1200	184.9000	184.7100	184.7200	184.6000	184.5100	184.6500	184.4150	184.4900	184.6303	...	162.0900	162.3843	161.7700	162.0400	162.1000	161.8400	162.2300	161.9100	161.4525	160.3600
2. high	185.4200	185.1950	184.7950	184.7500	184.7300	184.6500	184.6500	184.6700	184.5500	184.7000	...	162.3300	162.6074	162.3825	162.1100	162.1100	162.1900	162.2592	162.3200	162.1185	161.6600
3. low	185.0500	184.7950	184.7050	184.6750	184.5900	184.4496	184.4900	184.3800	184.3300	184.4200	...	161.9501	162.0969	161.7450	161.7350	161.6300	161.8400	161.7900	161.9100	161.4200	160.2100
4. close	185.3400	185.1300	184.7950	184.7050	184.7300	184.5900	184.5100	184.6598	184.4300	184.4900	...	162.0000	162.1143	162.3700	161.7655	162.0400	162.1600	161.8100	162.2700	161.9250	161.4910
5. volume	1362354	856529	347054	184177	223800	166058	135481	161852	179103	256761	...	406067	342023	591544	409539	529315	408370	565614	524571	776577	3311154

Output of pd.DataFrame(data['Time Series (5min)']).transpose() from the JSON section above:
	1. open	2. high	3. low	4. close	5. volume
2020-02-14 16:00:00	185.1200	185.4200	185.0500	185.3400	1362354
2020-02-14 15:55:00	184.9000	185.1950	184.7950	185.1300	856529
2020-02-14 15:50:00	184.7100	184.7950	184.7050	184.7950	347054
2020-02-14 15:45:00	184.7200	184.7500	184.6750	184.7050	184177
2020-02-14 15:40:00	184.6000	184.7300	184.5900	184.7300	223800
2020-02-14 15:35:00	184.5100	184.6500	184.4496	184.5900	166058
2020-02-14 15:30:00	184.6500	184.6500	184.4900	184.5100	135481
2020-02-14 15:25:00	184.4150	184.6700	184.3800	184.6598	161852
2020-02-14 15:20:00	184.4900	184.5500	184.3300	184.4300	179103
2020-02-14 15:15:00	184.6303	184.7000	184.4200	184.4900	256761
2020-02-14 15:10:00	184.6300	184.6400	184.5900	184.6400	135195
2020-02-14 15:05:00	184.7550	184.8050	184.5800	184.6200	167408
2020-02-14 15:00:00	184.7400	184.7850	184.7050	184.7500	152411
2020-02-14 14:55:00	184.6900	184.7500	184.6900	184.7451	106061
2020-02-14 14:50:00	184.7700	184.7900	184.6250	184.7050	164644
2020-02-14 14:45:00	184.7564	184.7750	184.6400	184.7750	125630
2020-02-14 14:40:00	184.7800	184.8100	184.7500	184.7500	196640
2020-02-14 14:35:00	184.5200	184.7900	184.5200	184.7900	203463
2020-02-14 14:30:00	184.3900	184.5851	184.3800	184.5200	136376
2020-02-14 14:25:00	184.1400	184.3954	184.1350	184.3818	134295
2020-02-14 14:20:00	184.3550	184.3550	184.0700	184.1400	194770
2020-02-14 14:15:00	184.5800	184.6100	184.3501	184.3569	148356
2020-02-14 14:10:00	184.4600	184.5900	184.4400	184.5800	157224
2020-02-14 14:05:00	184.2380	184.5300	184.1950	184.4546	168193
2020-02-14 14:00:00	184.3500	184.3800	184.1900	184.2300	124547
2020-02-14 13:55:00	184.3400	184.4800	184.2700	184.3500	123391
2020-02-14 13:50:00	184.5750	184.5779	184.2900	184.3350	179815
2020-02-14 13:45:00	184.5350	184.6000	184.4800	184.5800	114433
2020-02-14 13:40:00	184.4400	184.5600	184.4200	184.5350	175102
2020-02-14 13:35:00	184.2101	184.4700	184.1900	184.4500	124787
...	...	...	...	...	...
2020-01-27 12:00:00	162.9200	163.0500	162.8300	162.8800	212821
2020-01-27 11:55:00	162.8550	162.9750	162.7800	162.9200	229113
2020-01-27 11:50:00	163.0571	163.0571	162.8500	162.8550	244961
2020-01-27 11:45:00	162.7541	163.0750	162.7000	163.0500	418908
2020-01-27 11:40:00	162.5850	162.9700	162.5800	162.7550	312796
2020-01-27 11:35:00	162.6259	162.9012	162.5509	162.5959	221288
2020-01-27 11:30:00	162.3700	162.7500	162.2800	162.6150	261139
2020-01-27 11:25:00	162.2500	162.6000	162.2000	162.3900	251578
2020-01-27 11:20:00	162.3000	162.4200	162.2400	162.2500	213573
2020-01-27 11:15:00	162.3200	162.4852	162.2500	162.3000	177814
2020-01-27 11:10:00	162.4900	162.5150	162.2400	162.3600	237192
2020-01-27 11:05:00	162.2694	162.5250	162.1600	162.4900	213333
2020-01-27 11:00:00	161.9100	162.2800	161.8800	162.2700	282084
2020-01-27 10:55:00	162.1750	162.2050	161.9100	161.9200	247940
2020-01-27 10:50:00	162.1400	162.2450	162.1000	162.1750	257260
2020-01-27 10:45:00	162.3300	162.3300	162.0995	162.1400	244488
2020-01-27 10:40:00	162.1500	162.4300	162.1200	162.3200	203881
2020-01-27 10:35:00	162.6300	162.6600	162.1100	162.1594	316466
2020-01-27 10:30:00	162.4565	162.8659	162.4255	162.6415	472640
2020-01-27 10:25:00	161.9950	162.4600	161.9950	162.4400	358133
2020-01-27 10:20:00	162.0900	162.3300	161.9501	162.0000	406067
2020-01-27 10:15:00	162.3843	162.6074	162.0969	162.1143	342023
2020-01-27 10:10:00	161.7700	162.3825	161.7450	162.3700	591544
2020-01-27 10:05:00	162.0400	162.1100	161.7350	161.7655	409539
2020-01-27 10:00:00	162.1000	162.1100	161.6300	162.0400	529315
2020-01-27 09:55:00	161.8400	162.1900	161.8400	162.1600	408370
2020-01-27 09:50:00	162.2300	162.2592	161.7900	161.8100	565614
2020-01-27 09:45:00	161.9100	162.3200	161.9100	162.2700	524571
2020-01-27 09:40:00	161.4525	162.1185	161.4200	161.9250	776577
2020-01-27 09:35:00	160.3600	161.6600	160.2100	161.4910	3311154

Output of the sales cell (the DataFrame read from sales.txt):
	area	sales	profit
0	North	5	2
1	East	25	8
2	West	15	6
3	South	20	5
4	Central	10	3