Python Training at Arcesium - Day 3

Nov 15-17, 2017 Vikrant Patil

These notes are available online at http://notes.pipal.in/2017/arcesium-oct-advpython/day3.html

© Pipal Academy LLP

Classes

Syntactically, classes in Python look like this:

class NameOfClass:
    <statement-1>
    <statement-2>
    .
    .
In [1]:
import math

class Circle:
    
    def __init__(self,radius):
        self.radius = radius
        
    def area(self):
        return math.pi * self.radius**2
    
class Square:
    def __init__(self, s):
        self.side = s
        
    def area(self):
        return self.side**2
    
class Triangle:
    def __init__(self, base, height):
        self.base = base
        self.height = height
        
        
In [4]:
shapes = [Circle(1), Square(1), Circle(2), Square(2)]
areas = [s.area() for s in shapes]
areas
Out[4]:
[3.141592653589793, 1, 12.566370614359172, 4]
In [5]:
shapes.append(Triangle(1,1))
In [6]:
areas = [s.area() for s in shapes]
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-6-24e685d5e942> in <module>()
----> 1 areas = [s.area() for s in shapes]

<ipython-input-6-24e685d5e942> in <listcomp>(.0)
----> 1 areas = [s.area() for s in shapes]

AttributeError: 'Triangle' object has no attribute 'area'
In [7]:
filhandle = open("data.csv")
In [8]:
def grep(filehandle, pattern):
    return [line for line in filehandle.readlines() if pattern in line]
        
    
In [9]:
grep(filhandle, "p")
Out[9]:
[]
grep (Triangle, "p")
In [14]:
class FakeReadlines:
    def readlines(self):
        return []
In [15]:
f = FakeReadlines()
In [16]:
grep(f, "p")
Out[16]:
[]
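
grep only calls readlines() on its argument, so any object that provides that method will do. As a small sketch (the sample text is made up for illustration), the standard library's io.StringIO can stand in for a real file:

```python
import io

def grep(filehandle, pattern):
    # Works with any object that offers readlines(), not just real files.
    return [line for line in filehandle.readlines() if pattern in line]

# io.StringIO is an in-memory text stream with the same readlines() method.
fake_file = io.StringIO("apple\nbanana\ngrape\n")
matches = grep(fake_file, "p")
print(matches)
```

This is the same idea as FakeReadlines: the function depends on the behaviour of the object, not on its type.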

Why classes

Q: Is it possible to have nested modules?

In [17]:
!mkdir nested
!touch nested/__init__.py
!mkdir nested/inner
!touch nested/inner/__init__.py
!touch nested/inner/hello.py
In [18]:
import nested
In [19]:
type(nested)
Out[19]:
module
In [20]:
from nested.inner import hello
In [23]:
!echo 'print("Hello world!")' > nested/inner/__main__.py
In [24]:
!python nested/inner/
Hello world!
  • If you have a number of related functions that you just want to bundle together, a module will do a good job.
  • The purpose of a class is to bundle a data structure that represents some logical entity with the operations that work on it. Classes are also convenient namespaces.
  • Classes are good at modeling data, while functions are good at processing data.
  • Extensibility is the biggest advantage of classes/OOP.
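
As a rough illustration of the last three points, the sketch below bundles data with the operations on it and then extends the behaviour in a subclass without touching the original code. The Account and SavingsAccount names are hypothetical and are not part of the training material.

```python
class Account:
    """Bundles the data (owner, balance) with the operations on it."""

    def __init__(self, owner, balance=0):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount

    def describe(self):
        return "{0}: {1}".format(self.owner, self.balance)


class SavingsAccount(Account):
    """Extends Account; code written against Account keeps working."""

    def __init__(self, owner, balance=0, rate=0.05):
        super().__init__(owner, balance)
        self.rate = rate

    def add_interest(self):
        # Reuses the inherited deposit() instead of duplicating it.
        self.deposit(self.balance * self.rate)


acct = SavingsAccount("alice", 100, rate=0.1)
acct.add_interest()
print(acct.describe())
```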

Class object

In [25]:
class ClassA:
    value = 42
    def f(self):
        return "Hello from classA"
    
In [26]:
ClassA
Out[26]:
__main__.ClassA
In [27]:
ClassA.value
Out[27]:
42
In [28]:
ClassA.f
Out[28]:
<function __main__.ClassA.f>

Instance object

In [29]:
x = ClassA()
In [30]:
x
Out[30]:
<__main__.ClassA at 0x7f0d1802be48>
In [31]:
x.value
Out[31]:
42
In [32]:
x.f
Out[32]:
<bound method ClassA.f of <__main__.ClassA object at 0x7f0d1802be48>>
In [33]:
x.y = 10
In [34]:
x.y
Out[34]:
10
In [35]:
def f2():
    pass
In [36]:
x.method2 = f2
In [37]:
x.method2
Out[37]:
<function __main__.f2>
In [38]:
x.f
Out[38]:
<bound method ClassA.f of <__main__.ClassA object at 0x7f0d1802be48>>
In [39]:
method = x.f
In [40]:
method
Out[40]:
<bound method ClassA.f of <__main__.ClassA object at 0x7f0d1802be48>>
In [41]:
method()
Out[41]:
'Hello from classA'
In [42]:
ClassA.f()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-42-46724427a1a8> in <module>()
----> 1 ClassA.f()

TypeError: f() missing 1 required positional argument: 'self'
In [44]:
ClassA.f(x)
Out[44]:
'Hello from classA'
In [45]:
x.value
Out[45]:
42
In [46]:
y = ClassA()
In [47]:
y.value
Out[47]:
42
In [48]:
ClassA.value = 43
In [49]:
x.value
Out[49]:
43
In [50]:
y.value
Out[50]:
43
In [53]:
class A:
    z = 0
    def __init__(self, x, y):
        self.x = x
        self.y = y
   
In [55]:
a1 = A(2, 4)
In [56]:
a2 = A(4,5)
In [57]:
a1.z
Out[57]:
0
In [58]:
a1.x
Out[58]:
2
In [59]:
a1.y
Out[59]:
4
In [60]:
a2.z
Out[60]:
0
In [61]:
A.z = -1
In [62]:
a1.z
Out[62]:
-1
In [63]:
A.x = 3
In [64]:
a1.x
Out[64]:
2
In [92]:
def outside_increment(self): #this is function
    self.value +=1
    
class A:
    def __init__(self):
        self.value = 0
        
    def increment(self): #this is method
        self.value +=1 
In [66]:
a = A()
In [67]:
a.increment()
In [68]:
a.value
Out[68]:
1
In [69]:
outside_increment(a)
In [90]:
a.outside_increment = outside_increment
In [91]:
a.outside_increment()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-91-8beae19e0f57> in <module>()
----> 1 a.outside_increment()

TypeError: outside_increment() missing 1 required positional argument: 'self'
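
The call fails because a plain function stored on an instance is not bound: Python only passes self automatically for functions found on the class. A minimal sketch of two ways around this, using a hypothetical Counter class in place of A:

```python
import types

class Counter:
    def __init__(self):
        self.value = 0

def outside_increment(self):
    self.value += 1

# Option 1: bind the function to one particular instance explicitly.
c = Counter()
c.outside_increment = types.MethodType(outside_increment, c)
c.outside_increment()

# Option 2: attach the function to the class; every instance then
# sees it as an ordinary method.
Counter.increment = outside_increment
d = Counter()
d.increment()

print(c.value, d.value)
```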
In [93]:
dir(a)
Out[93]:
['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'increment',
 'outside_increment',
 'value']
In [70]:
a.value
Out[70]:
2
In [71]:
class B:
    name = ""
    email = ""
In [72]:
b1 = B()
In [73]:
b2 = B()
In [74]:
b1.name = "alice"
b2.email = "hello@alice.in"
In [75]:
b2.name
Out[75]:
''
In [76]:
B.name = "hello"
In [77]:
b1.name
Out[77]:
'alice'
In [82]:
class C:
    name = ""
    email = ""
    z = 3
    def __init__(self, name, email):
        self.name = name
        self.email = email
In [83]:
c1 = C("alice", "hello@alice.in")
c2 = C("xzy", "hello@xyz.com")
In [84]:
C.name = "abc"
In [85]:
c1.name
Out[85]:
'alice'
In [86]:
c1.z
Out[86]:
3
In [87]:
C.z = 5
In [88]:
c1.z
Out[88]:
5
In [89]:
c2.z
Out[89]:
5
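
The behaviour above follows from the lookup order: an attribute is searched in the instance's own __dict__ first, and in the class only if it is missing there. A small sketch with a hypothetical Config class:

```python
class Config:
    level = 1          # class attribute, shared by all instances

c1 = Config()
c2 = Config()

c1.level = 10          # creates an instance attribute that shadows the class one
Config.level = 2       # changes the shared class attribute

print(vars(c1))        # instance dictionary of c1
print(vars(c2))        # instance dictionary of c2
print(c1.level, c2.level)

del c1.level           # remove the shadowing instance attribute
print(c1.level)        # lookup falls back to the class attribute
```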

Customizing classes

In [98]:
class Pair:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __repr__(self):
        return 'Pair({0.x!r},{0.y!r})'.format(self)
    
    def __str__(self):
        return '({0.x!s},{0.y!s})'.format(self)
In [99]:
p = Pair(2,3)
In [100]:
p
Out[100]:
Pair(2,3)
In [101]:
print(p)
(2,3)
In [102]:
"{0} {1}".format("hello", "python")
Out[102]:
'hello python'
In [104]:
p1 =Pair(Pair(1,2), Pair(2,3))
In [105]:
p1
Out[105]:
Pair(Pair(1,2),Pair(2,3))
In [106]:
print(p1)
((1,2),(2,3))

Arithmetic operators

In [107]:
class Pair:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __repr__(self):
        return 'Pair({0.x!r},{0.y!r})'.format(self)
    
    def __str__(self):
        return '({0.x!s},{0.y!s})'.format(self)
    
    def __add__(self, p):
        return Pair(self.x+p.x, self.y + p.y)

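    def __sub__(self, p):
        return Pair(self.x - p.x, self.y - p.y)
    
    def __rmul__(self, c):
        # c * pair: tried when the left operand (e.g. an int) cannot handle a Pair
        return self * c
    
    def __mul__(self, c):
        # pair * c: scale both components by c
        return Pair(self.x * c, self.y * c)

    def __eq__(self, p):
        if not isinstance(p, Pair):
            return NotImplemented
        return (self.x, self.y) == (p.x, p.y)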
    
In [108]:
p = Pair(1,2)
p2 = Pair(3,4)
In [109]:
p + p2
Out[109]:
Pair(4,6)
In [110]:
2*"wew"
Out[110]:
'wewwew'
In [111]:
"**"*5
Out[111]:
'**********'
In [112]:
class Pair:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __repr__(self):
        return 'Pair({0.x!r},{0.y!r})'.format(self)
    
    def __getitem__(self, name):
        return self.__dict__[name]
    
    def __setitem__(self, name , value):
        if name in ["x", "y"]:
            self.__dict__[name] = value
        else:
            raise Exception("Unsupported name!")
In [113]:
p = Pair(1,2)
In [114]:
p['x']
Out[114]:
1
In [115]:
p['x'] = 3
In [116]:
p
Out[116]:
Pair(3,2)
In [117]:
p['z'] = 4
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-117-15175c98fecd> in <module>()
----> 1 p['z'] = 4

<ipython-input-112-72d01f5e6ca3> in __setitem__(self, name, value)
     14             self.__dict__[name] = value
     15         else:
---> 16             raise Exception("Unsupported name!")

Exception: Unsupported name!
In [118]:
a1 + a2
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-118-20a6a4a44f0e> in <module>()
----> 1 a1 + a2

TypeError: unsupported operand type(s) for +: 'A' and 'A'

Example: Text Formatting

In [119]:
%%file five.txt
one
two
three
four
five
Writing five.txt
In [120]:
class Formatter:
    def format_text(self, text):
        """Formats given text.
        This implementation returns the same text,
        but subclasses can override this method to provide
        a different way of formatting.
        """
        return text
    
    def format_file(self, filename):
        text = open(filename).read()
        return self.format_text(text)
In [121]:
class UpperCaseFormatter(Formatter):
    def format_text(self, text):
        return text.upper()
In [122]:
f = UpperCaseFormatter()
In [123]:
print(f.format_text("Hello"))
HELLO
In [124]:
print(f.format_file("five.txt"))
ONE
TWO
THREE
FOUR
FIVE
In [126]:
class LineFormatter(Formatter):
    def format_line(self, line):
        return line
    
    def format_text(self, text):
        lines = text.split("\n")
        lines = [self.format_line(l) for l in lines]
        return "\n".join(lines)
In [127]:
class PrefixFormatter(LineFormatter):
    def __init__(self, prefix):
        self.prefix = prefix
        
    def format_line(self, line):
        return self.prefix + line
In [128]:
f = PrefixFormatter(prefix="[INFO]")
In [129]:
print(f.format_text("Hello\nWorld"))
[INFO]Hello
[INFO]World
In [130]:
print(f.format_file("five.txt"))
[INFO]one
[INFO]two
[INFO]three
[INFO]four
[INFO]five

Q: Is it possible to make a class immutable?

In [131]:
class UpperCase:
    
    def __getattr__(self, name):
        return name.upper()
In [132]:
u = UpperCase()
In [133]:
u.hello
Out[133]:
'HELLO'
In [137]:
class Sealed:
    def __setattr__(self, name, value):
        raise Exception("No chance!")
In [138]:
s = Sealed()
In [139]:
s.x = 0
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-139-d443b8ea315c> in <module>()
----> 1 s.x = 0

<ipython-input-137-c14756ee149e> in __setattr__(self, name, value)
      1 class Sealed:
      2     def __setattr__(self, name, value):
----> 3         raise Exception("No chance!")

Exception: No chance!
In [149]:
class Date:
    __slots__ = ['year','month','day']
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day
        
    def method(self, x):
        self.x = x
In [147]:
d = Date(2017, 11, 17)
In [148]:
d.method(60)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-148-7d471b41e2b2> in <module>()
----> 1 d.method(60)

<ipython-input-146-03ead950b0e6> in method(self, x)
      7 
      8     def method(self, x):
----> 9         self.x = x

AttributeError: 'Date' object has no attribute 'x'
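
The Sealed class above rejects every assignment, so it would also reject assignments made inside an __init__. One possible way to get an object that is set up once and is read-only afterwards is to bypass the overridden __setattr__ during construction. This is only a sketch: FrozenPoint is a hypothetical name, and a determined caller can still use object.__setattr__ directly.

```python
class FrozenPoint:
    def __init__(self, x, y):
        # Bypass our own __setattr__ while the object is being built.
        object.__setattr__(self, "x", x)
        object.__setattr__(self, "y", y)

    def __setattr__(self, name, value):
        raise AttributeError("FrozenPoint is read-only")

    def __delattr__(self, name):
        raise AttributeError("FrozenPoint is read-only")


p = FrozenPoint(1, 2)
try:
    p.x = 10
except AttributeError as e:
    print("Rejected:", e)
print(p.x, p.y)
```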

problem: Write a class Timer to measure the time taken by a task. The class should have start and stop methods and should report the time elapsed between the calls to start() and stop(). Use time.time().

t = Timer()
t.start()
do_some_stuff()
t.stop()
print("Time taken:", t.get_time_taken())
In [153]:
import time
class Timer:
    def __init__(self):
        self._start = 0
        self._end = 0
        
    def start(self):
        self._start = time.time()
        
    def stop(self):
        self._end = time.time()
        
    def get_time_taken(self):
        return self._end - self._start
    
t = Timer()
t.start()
time.sleep(5)
t.stop()
print("Time taken:", t.get_time_taken())
Time taken: 5.0053815841674805

Context management protocol

In [154]:
with open("five.txt") as f:
    print(f.read())
one
two
three
four
five
In [155]:
with open("five.txt", "w") as f:
    f.write("six\n")
f.write("seven\n")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-155-2ed1e393bd3e> in <module>()
      1 with open("five.txt", "w") as f:
      2     f.write("six\n")
----> 3 f.write("seven\n")

ValueError: I/O operation on closed file.
In [161]:
from socket import socket, AF_INET, SOCK_STREAM 
class LazyConnection:
    def __init__(self, address, family=AF_INET, type_=SOCK_STREAM):
        self.address = address
        self.family = family
        self.type_ = type_
        self.sock = None
        
    def __enter__(self):
        print("__enter__ of LazyConnection")
        if self.sock is not None:
            raise RuntimeError("Already connected")
        self.sock = socket(self.family, self.type_)
        self.sock.connect(self.address)
        return self.sock
    
    def __exit__(self, exception_type, exception_value, traceback):
        print("__exit__ of LazyConnection")
        self.sock.close()
        self.sock = None
In [162]:
from functools import partial

def fprint(s):
    print("*"*5, s)
    
conn = LazyConnection(("www.python.org",80))

fprint("Before with")
with conn as s:
    #conn.__enter__() executes
    fprint("Inside with start")
    s.send(b'GET /index.html HTTP/1.0\r\n')
    s.send(b'Host: www.python.org\r\n')
    s.send(b'\r\n')
    resp = b''.join(iter(partial(s.recv, 8192), b''))
    # the callable is called until it returns b''
    fprint("Inside with ends")
    
print(resp.decode("utf-8"))
***** Before with
__enter__ of LazyConnection
***** Inside with start
***** Inside with ends
__exit__ of LazyConnection
HTTP/1.1 301 Moved Permanently
Server: Varnish
Retry-After: 0
Location: https://www.python.org/index.html
Content-Length: 0
Accept-Ranges: bytes
Date: Fri, 17 Nov 2017 07:33:03 GMT
Via: 1.1 varnish
Connection: close
X-Served-By: cache-lhr6345-LHR
X-Cache: HIT
X-Cache-Hits: 0
X-Timer: S1510903983.015465,VS0,VE0
Strict-Transport-Security: max-age=63072000; includeSubDomains


problem: Write a class ContextTimer that extends the Timer class and implements the context management protocol. The timer should start when the with block is entered; when the block exits, the timer should stop and print the time taken.

with ContextTimer() as ct:
    for i in range(10000):
        for j in range(1000):
            s = i*j*1.0

Time taken in with block: 2.121..
In [163]:
class ContextTimer(Timer):
    def __enter__(self):
        self.start()
        return self  # so that "as ct" binds the timer instead of None
        
    def __exit__(self, ect, ec, tb):
        self.stop()
        print("Time taken in with block:", self.get_time_taken())
In [164]:
with ContextTimer() as ct:
    time.sleep(3)
    
Time taken in with block: 3.0031464099884033
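
The same behaviour can be sketched without a class, using contextlib.contextmanager from the standard library. The name timed and the result dictionary are hypothetical; the code before yield plays the role of __enter__ and the code after it the role of __exit__.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label="Time taken in with block:"):
    result = {}
    start = time.time()
    try:
        yield result              # the value bound by "as"
    finally:
        # Runs when the with block exits, even if it raised an exception.
        result["elapsed"] = time.time() - start
        print(label, result["elapsed"])

with timed() as t:
    time.sleep(0.1)
```

After the block, t["elapsed"] holds the measured time, so the caller can use it as well as see it printed.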

Properties

In [167]:
class Person(object):
    def __init__(self, firstname, lastname):
        self.firstname = firstname
        self.lastname = lastname
        
    @property
    def fullname(self):
        print("Calling function fullname")
        return " ".join([self.firstname, self.lastname])
In [168]:
p = Person("Alice", "Whoever")
In [169]:
p.fullname
Calling function fullname
Out[169]:
'Alice Whoever'
In [170]:
p.firstname
Out[170]:
'Alice'
In [171]:
p.lastname
Out[171]:
'Whoever'
In [172]:
p.fullname
Calling function fullname
Out[172]:
'Alice Whoever'
In [173]:
class Person(object):
    def __init__(self, name, email):
        self._name = name
        self._email = email
       
    #getter
    @property
    def name(self):
        ## you might do some additional processing or checks
        return self._name
    
    #setter
    @name.setter
    def name(self, value):
        if not isinstance(value, str):
            raise TypeError("Expected a string for name")
        self._name = value
        
    #deleter
    @name.deleter
    def name(self):
        raise AttributeError("Can't delete name attribute")
In [174]:
p = Person("Alice", "alice@example.com")
In [175]:
p.name
Out[175]:
'Alice'
In [176]:
p.name = "alice"
In [177]:
p.name
Out[177]:
'alice'
In [178]:
p.name = 42
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-178-7e1a4a1cfe0b> in <module>()
----> 1 p.name = 42

<ipython-input-173-d668b36ba6ab> in name(self, value)
     14     def name(self, value):
     15         if not isinstance(value, str):
---> 16             raise TypeError("Expected a string for name")
     17         self._name = value
     18 

TypeError: Expected a string for name
In [179]:
del p.name
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-179-82a48009a62c> in <module>()
----> 1 del p.name

<ipython-input-173-d668b36ba6ab> in name(self)
     20     @name.deleter
     21     def name(self):
---> 22         raise AttributeError("Can't delete name attribute")

AttributeError: Can't delete name attribute
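
A setter is also useful when one attribute is derived from another, so that both views of the data stay consistent. A short sketch with a hypothetical Circle class, where only radius is stored and diameter is computed on the fly:

```python
class Circle:
    def __init__(self, radius):
        self.radius = radius

    @property
    def diameter(self):
        # Computed from radius on every access; nothing extra is stored.
        return 2 * self.radius

    @diameter.setter
    def diameter(self, value):
        # Writing the derived attribute updates the stored one.
        self.radius = value / 2


c = Circle(1)
print(c.diameter)
c.diameter = 10
print(c.radius)
```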

Descriptors

In [181]:
class Zero:
    def __get__(self, obj, cls):
        if obj is None:
            return self
        print("Zero.__get__")
        return 0
    
    def __repr__(self):
        return "<Descriptor Zero>"
In [182]:
class Foo:
    x = Zero()
In [183]:
f = Foo()
In [184]:
f.x
Zero.__get__
Out[184]:
0
In [185]:
three = 3
In [186]:
type(three)
Out[186]:
int
In [188]:
f.x + 3
Zero.__get__
Out[188]:
3
In [191]:
class Integer:
    
    def __init__(self, name):
        self.name = name
        
    def __get__(self, instance, cls):
        print("__get__ from", self)
        if instance is None:
            return self
        else:
            return instance.__dict__[self.name]
    
    def __set__(self, instance, value):
        print("__set__ from", self)
        if not isinstance(value, int):
            raise TypeError("Expected an int")
        instance.__dict__[self.name] = value
        
    def __delete__(self, instance):
        print("__delete__ from,", self)
        del instance.__dict__[self.name]
        
    def __str__(self):
        return "Integer<{0.name!s}>".format(self)
In [192]:
class Point:
    x = Integer('x')
    y = Integer('y')
    
    def __init__(self, x, y):
        self.x = x
        self.y = y
In [193]:
p = Point(2,3)
__set__ from Integer<x>
__set__ from Integer<y>
In [195]:
Point.__dict__
Out[195]:
mappingproxy({'__dict__': <attribute '__dict__' of 'Point' objects>,
              '__doc__': None,
              '__init__': <function __main__.Point.__init__>,
              '__module__': '__main__',
              '__weakref__': <attribute '__weakref__' of 'Point' objects>,
              'x': <__main__.Integer at 0x7f0d09f96400>,
              'y': <__main__.Integer at 0x7f0d09f964e0>})
In [196]:
p.x
__get__ from Integer<x>
Out[196]:
2
In [223]:
Point.x
__get__ from Integer<x>
Out[223]:
<__main__.Integer at 0x7f0d09f96400>
In [197]:
p.y = 2
__set__ from Integer<y>
In [198]:
p.y = "3"
__set__ from Integer<y>
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-198-ed6b0524c329> in <module>()
----> 1 p.y = "3"

<ipython-input-191-49f0927b9521> in __set__(self, instance, value)
     14         print("__set__ from", self)
     15         if not isinstance(value, int):
---> 16             raise TypeError("Expected an int")
     17         instance.__dict__[self.name] = value
     18 

TypeError: Expected an int
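
In the Integer descriptor above, the attribute name is written twice (x = Integer('x')). From Python 3.6 onwards a descriptor can learn its name through __set_name__, which Python calls when the owning class is created. A sketch under that assumption, with hypothetical class names TypedInteger and Point2D:

```python
class TypedInteger:
    def __set_name__(self, owner, name):
        # Called automatically with the name the descriptor is assigned to.
        self.name = name

    def __get__(self, instance, cls):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if not isinstance(value, int):
            raise TypeError("Expected an int")
        instance.__dict__[self.name] = value


class Point2D:
    x = TypedInteger()
    y = TypedInteger()

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point2D(2, 3)
print(p.x, p.y)
```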

problem: Implement a my_property decorator that works like the built-in property

In [199]:
class my_property:
    
    def __init__(self, func):
        self.func = func
        
    def __get__(self, instance, cls):
        if instance is None:
            return self
        print("my_property __get__")
        return self.func(instance)
In [203]:
class Person:
    def __init__(self, name, email):
        self._name = name
        self._email = email
        
    @my_property
    def name(self):
        return self._name
In [204]:
p = Person("name", "email@xyz.com")
In [205]:
p.name
my_property __get__
Out[205]:
'name'
In [ ]:
my_property(func)
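
The version above only supports reading. As a sketch of how it might be extended towards the built-in property, the variant below adds a setter. The fget/fset names and the setter method mirror property's interface, but this is an illustration, not the real implementation.

```python
class my_property:
    def __init__(self, fget, fset=None):
        self.fget = fget
        self.fset = fset

    def __get__(self, instance, cls):
        if instance is None:
            return self
        return self.fget(instance)

    def __set__(self, instance, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(instance, value)

    def setter(self, fset):
        # Return a new descriptor that keeps the getter and adds a setter.
        return my_property(self.fget, fset)


class Person:
    def __init__(self, name):
        self._name = name

    @my_property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        self._name = value


p = Person("alice")
p.name = "bob"
print(p.name)
```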

staticmethod and classmethod

In [220]:
class A:
    
    def __init__(self, x):
        self.x = x
        
    def __repr__(self):
        return "{0}({1})".format(self.__class__.__name__, repr(self.x))
    
    @staticmethod
    def parse1(value):
        return A(int(value))
    
    @classmethod
    def parse2(cls, value):
        return cls(int(value))
    
class B(A):
    pass
In [213]:
a = A.parse1("123")
In [214]:
a
Out[214]:
A(123)
In [215]:
b = B.parse1("123")
In [216]:
b
Out[216]:
A(123)
In [217]:
type(b)
Out[217]:
__main__.A
In [221]:
b1 = B.parse2("123")
In [222]:
b1
Out[222]:
B(123)
In [224]:
class Integer:
    
    def __init__(self, name):
        self.name = name
        
    def __get__(self, instance, cls):
        print(instance, cls)
        print("__get__ from", self)
        if instance is None:
            return self
        else:
            return instance.__dict__[self.name]
    
    def __set__(self, instance, value):
        print("__set__ from", self)
        if not isinstance(value, int):
            raise TypeError("Expected an int")
        instance.__dict__[self.name] = value
        
    def __delete__(self, instance):
        print("__delete__ from,", self)
        del instance.__dict__[self.name]
        
    def __str__(self):
        return "Integer<{0.name!s}>".format(self)
In [225]:
class Point:
    x = Integer('x')
    y = Integer('y')
    
    def __init__(self, x, y):
        self.x = x
        self.y = y
In [226]:
Point.x
None <class '__main__.Point'>
__get__ from Integer<x>
Out[226]:
<__main__.Integer at 0x7f0d09f4a8d0>
In [227]:
p = Point(2,3)
__set__ from Integer<x>
__set__ from Integer<y>
In [228]:
p.x
<__main__.Point object at 0x7f0d09f51c18> <class '__main__.Point'>
__get__ from Integer<x>
Out[228]:
2

Class decorator

In [230]:
def debug(func):
    prefix = "*"*5
    msg = prefix + func.__qualname__
    
    def wrapper(*args, **kwargs):
        print(msg)
        return func(*args, **kwargs)
    
    return wrapper

def debugmethods(cls):
    for name, val in vars(cls).items():
        if callable(val):
            setattr(cls, name, debug(val))
    return cls

@debugmethods
class Foo:
    x = "dummy"
    
    def method1(self):
        pass
    
    def method2(self):
        pass
    
    def method3(self):
        pass
In [231]:
f = Foo()
In [232]:
f.x
Out[232]:
'dummy'
In [233]:
f.method1()
*****Foo.method1
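
One side effect of the wrapper above is that decorated methods lose their original __name__ and docstring. A variation using functools.wraps from the standard library keeps that metadata; taking a list() snapshot of the class dictionary is an extra precaution while the class is being modified. Greeter is a hypothetical class.

```python
import functools

def debug(func):
    msg = "*" * 5 + func.__qualname__

    @functools.wraps(func)       # copy __name__, __doc__, etc. onto wrapper
    def wrapper(*args, **kwargs):
        print(msg)
        return func(*args, **kwargs)
    return wrapper

def debugmethods(cls):
    # Snapshot the items before calling setattr on the class.
    for name, val in list(vars(cls).items()):
        if callable(val):
            setattr(cls, name, debug(val))
    return cls

@debugmethods
class Greeter:
    def hello(self):
        """Return a greeting."""
        return "hello"

g = Greeter()
print(g.hello())
print(Greeter.hello.__name__, Greeter.hello.__doc__)
```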

Multithreading

In [234]:
%%file t1.py
import threading
def task():
    print("Hello ", threading.current_thread().name)
    
def main():
    t1 = threading.Thread(target=task)
    t1.start()
    t1.join()
    
if __name__ == "__main__":
    main()
Writing t1.py
In [235]:
!python t1.py
Hello  Thread-1
In [236]:
%%file t2.py
import threading
def task():
    print("Hello ", threading.current_thread().name)
    
def main():
    threads = [threading.Thread(target=task) for i in range(10)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
if __name__ == "__main__":
    main()
Writing t2.py
In [237]:
!python t2.py
Hello  Thread-1
Hello  Thread-2
Hello  Thread-3
Hello  Thread-4
Hello  Thread-5
Hello  Thread-6
Hello  Thread-7
Hello  Thread-8
Hello  Thread-9
Hello  Thread-10
In [248]:
%%file counter.py
import threading 

class Counter:
    def __init__(self):
        self.count = 0
        
    def tick(self):
        self.count +=1 
        
def task(counter, n):
    for i in range(n):
        counter.tick()
        
def main():
    counter = Counter()
    n = 100000
    nthreads = 10
    threads = [threading.Thread(target=task, args=(counter, n)) for i in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(counter.count)
if __name__ == "__main__":
    main()
Overwriting counter.py
In [252]:
!time python counter.py
549039
0.53user 0.00system 0:00.54elapsed 99%CPU (0avgtext+0avgdata 9508maxresident)k
0inputs+0outputs (0major+1273minor)pagefaults 0swaps
In [250]:
!python counter.py
515036
In [251]:
%%file counter1.py
import threading 

class Counter:
    def __init__(self):
        self.count = 0
        self.lock = threading.Lock()
        
    def tick(self):
        with self.lock:
            self.count +=1 
        
def task(counter, n):
    for i in range(n):
        counter.tick()
        
def main():
    counter = Counter()
    n = 100000
    nthreads = 10
    threads = [threading.Thread(target=task, args=(counter, n)) for i in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(counter.count)
if __name__ == "__main__":
    main()
Writing counter1.py
In [254]:
!time -p python counter.py
638765
real 0.53
user 0.50
sys 0.01
In [255]:
!time -p python counter1.py
1000000
real 4.14
user 3.65
sys 3.05
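
The unlocked counter loses updates because self.count += 1 is a read-modify-write sequence that another thread can interrupt, while the locked version pays for acquiring the lock a million times. One possible compromise, sketched below with a hypothetical add method, is to count locally in each thread and take the lock only once per thread:

```python
import threading

class Counter:
    def __init__(self):
        self.count = 0
        self.lock = threading.Lock()

    def add(self, n):
        with self.lock:
            self.count += n

def task(counter, n):
    local = 0
    for i in range(n):
        local += 1          # thread-local work, no lock needed
    counter.add(local)      # a single locked update per thread

def main(n=100000, nthreads=10):
    counter = Counter()
    threads = [threading.Thread(target=task, args=(counter, n))
               for i in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.count

total = main()
print(total)
```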

problem: Download URLs in parallel using threads.

In [256]:
!seq 10 | xargs printf "http://httpbin.org/get?x=%d\n"
http://httpbin.org/get?x=1
http://httpbin.org/get?x=2
http://httpbin.org/get?x=3
http://httpbin.org/get?x=4
http://httpbin.org/get?x=5
http://httpbin.org/get?x=6
http://httpbin.org/get?x=7
http://httpbin.org/get?x=8
http://httpbin.org/get?x=9
http://httpbin.org/get?x=10
In [257]:
!seq 10 | xargs printf "http://httpbin.org/get?x=%d\n" > urls.txt
In [270]:
%%file sget.py
import sys
from urllib.request import urlopen

def get_urls(filename):
    return (line.strip() for line in open(filename))

def wget(url):
    return urlopen(url).read()

def main():
    filename = sys.argv[1]
    for url in get_urls(filename):
        wget(url)
        
if __name__ == "__main__":
    main()
Overwriting sget.py
In [271]:
!time -p python sget.py urls.txt
real 6.40
user 0.12
sys 0.02
In [272]:
%%file pget.py
import sys
from sget import get_urls, wget
from multiprocessing.pool import ThreadPool

def main():
    filename = sys.argv[1]
    concurrency = int(sys.argv[2])
    urls = get_urls(filename)
    pool = ThreadPool(concurrency)
    pool.map(wget, urls)
    
if __name__ == "__main__":
    main()
Overwriting pget.py
In [267]:
!time -p python pget.py urls.txt 1
real 18.12
user 0.20
sys 0.01
In [268]:
!time -p python pget.py urls.txt 2
real 13.72
user 0.22
sys 0.01
In [273]:
!time -p python pget.py urls.txt 4
real 2.34
user 0.17
sys 0.01
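
The standard library's concurrent.futures module offers a similar thread pool. The sketch below replaces the network call with time.sleep so that it runs offline; fake_download and the generated URL list are stand-ins invented for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_download(url):
    # Stand-in for urlopen(url).read(): waits like an I/O call would.
    time.sleep(0.1)
    return len(url)

urls = ["http://httpbin.org/get?x=%d" % i for i in range(1, 11)]

start = time.time()
with ThreadPoolExecutor(max_workers=5) as pool:
    # map() returns results in the same order as the inputs.
    sizes = list(pool.map(fake_download, urls))
elapsed = time.time() - start

print(sizes)
print("elapsed: %.2f" % elapsed)
```

Because the threads spend their time waiting rather than computing, the elapsed time should be well below the sequential total of about one second.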

Let's try the same task with multiprocessing

In [296]:
%%file pcpu.py
import sys
from multiprocessing.pool import Pool

def totaltask():
    task("")
    task2("")

def task(dummy):
    print("Started task")
    sum = 0 
    for i in range(1000):
        for j in range(10000):
            sum += 1.0*i*j
    print("Finished task")
    return sum

def task2(x):
    print("Started task2")
    sum = 0 
    for i in range(1000):
        for j in range(10000):
            sum += 1.0*i*j
    return sum

def main():
    conc = int(sys.argv[1])
    pool = Pool(conc)
    pool.apply(totaltask, args=() )
   
if __name__ == "__main__":
    main()
Overwriting pcpu.py
In [297]:
!time -p python pcpu.py 2
Started task
Finished task
Started task2
real 3.19
user 3.13
sys 0.01
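
In pcpu.py, pool.apply hands totaltask to a single worker, so task and task2 still run one after the other. To spread CPU-bound work over several processes, each piece has to be submitted separately, for example with pool.map. A sketch with smaller loop bounds and a hypothetical cpu_task function, meant to be run as a script:

```python
from multiprocessing import Pool

def cpu_task(n):
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += 1.0 * i * j
    return total

def main(nprocs=2):
    with Pool(nprocs) as pool:
        # Each input is handed to one of the worker processes.
        return pool.map(cpu_task, [300, 300, 300, 300])

if __name__ == "__main__":
    print(main())
```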

More about yield

In [298]:
def coroutine(func):
    def start(*args,**kwargs):
        cr = func(*args,**kwargs)
        next(cr)
        return cr
    return start

@coroutine
def grep(pattern):
    print("Looking for %s" % pattern)
    while True:
        line = (yield)
        if pattern in line:
            print(line)
In [299]:
g = grep("python")
Looking for python
In [300]:
g.send("Hello python")
Hello python
In [301]:
g.send("There is no pattern")
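
Because a coroutine keeps receiving values through send(), it can also pass them on to another coroutine, which gives a simple push-based pipeline. A sketch, in which the target parameter and the collect coroutine are hypothetical additions:

```python
def coroutine(func):
    def start(*args, **kwargs):
        cr = func(*args, **kwargs)
        next(cr)                 # advance to the first yield
        return cr
    return start

@coroutine
def grep(pattern, target):
    while True:
        line = (yield)
        if pattern in line:
            target.send(line)    # forward matching lines downstream

@coroutine
def collect(results):
    while True:
        results.append((yield))

matches = []
g = grep("python", collect(matches))
for line in ["Hello python", "There is no pattern", "python rocks"]:
    g.send(line)
g.close()                        # raises GeneratorExit inside grep
print(matches)
```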

C-extensions

In [9]:
import ctypes
from ctypes.util import find_library
In [10]:
path = find_library("c")
In [11]:
print(path)
libc.so.6
In [12]:
libc = ctypes.cdll.LoadLibrary(path)
In [13]:
libc.printf(b"Hello c world!")  # C strings must be passed as bytes in Python 3
Out[13]:
14
In [14]:
libc.abs(-2)
Out[14]:
2
In [16]:
libc.fabs
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-16-612a0d5b449d> in <module>()
----> 1 libc.fabs

/home/vikrant/usr/local/anaconda3/lib/python3.6/ctypes/__init__.py in __getattr__(self, name)
    359         if name.startswith('__') and name.endswith('__'):
    360             raise AttributeError(name)
--> 361         func = self.__getitem__(name)
    362         setattr(self, name, func)
    363         return func

/home/vikrant/usr/local/anaconda3/lib/python3.6/ctypes/__init__.py in __getitem__(self, name_or_ordinal)
    364 
    365     def __getitem__(self, name_or_ordinal):
--> 366         func = self._FuncPtr((name_or_ordinal, self))
    367         if not isinstance(name_or_ordinal, int):
    368             func.__name__ = name_or_ordinal

AttributeError: /lib/x86_64-linux-gnu/libc.so.6: undefined symbol: fabs
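
fabs is not found because on Linux it lives in the math library (libm) rather than in libc. A sketch, assuming a Linux system where find_library("m") locates libm; the argument and return types have to be declared because ctypes assumes int by default:

```python
import ctypes
from ctypes.util import find_library

# Assumption: a Linux system where the math library can be located.
path = find_library("m")
libm = ctypes.CDLL(path)

# Declare the C signature: double fabs(double)
libm.fabs.argtypes = [ctypes.c_double]
libm.fabs.restype = ctypes.c_double

result = libm.fabs(-2.5)
print(result)
```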

Using Cython

In [311]:
!mkdir cy
In [312]:
%%file cy/sum.pyx

def sum(values):
    result = 0
    for v in values:
        result += v
    return result
Writing cy/sum.pyx
In [313]:
%%file cy/setup.py
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules = cythonize("sum.pyx")
)
Writing cy/setup.py
In [1]:
!cd cy && python setup.py build_ext --inplace
running build_ext
building 'sum' extension
gcc -pthread -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/vikrant/usr/local/anaconda3/include/python3.6m -c sum.c -o build/temp.linux-x86_64-3.6/sum.o
creating build/lib.linux-x86_64-3.6
gcc -pthread -shared -L/home/vikrant/usr/local/anaconda3/lib -Wl,-rpath=/home/vikrant/usr/local/anaconda3/lib,--no-as-needed build/temp.linux-x86_64-3.6/sum.o -L/home/vikrant/usr/local/anaconda3/lib -lpython3.6m -o build/lib.linux-x86_64-3.6/sum.cpython-36m-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.6/sum.cpython-36m-x86_64-linux-gnu.so -> 
In [4]:
%%file cy/testcy.py
import sum
print(sum.sum(range(1000)))
Overwriting cy/testcy.py
In [5]:
!cd cy && python testcy.py
499500

Numba

In [320]:
from numba import jit
from numpy import arange

# jit decorator tells Numba to compile this function.
# The argument types will be inferred by Numba when function is called.
@jit
def sum2d(arr):
    M, N = arr.shape
    result = 0.0
    for i in range(M):
        for j in range(N):
            result += arr[i,j]
    return result


def sum2d_(arr):
    M, N = arr.shape
    result = 0.0
    for i in range(M):
        for j in range(N):
            result += arr[i,j]
    return result

@jit()
def sum2d_i(arr):
    M, N = arr.shape
    result = 0.0
    for i in range(M):
        for j in range(N):
            result += arr[i,j]
    return result
In [322]:
a = arange(10000).reshape(100,100)
%timeit sum2d(a)
The slowest run took 4661.45 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 20.8 µs per loop
In [323]:
%timeit sum2d_(a)
100 loops, best of 3: 3.55 ms per loop
