Python Training at Symantec - Chennai -- Day 5

March 27-31, 2017
Anand Chitipothu

These notes are available online at https://notes.pipal.in/2017/symantec

© Pipal Academy LLP

Topics for today

  • Class Inheritance
  • Exception Handling
  • Working with APIs
  • Regular Expressions
  • Working with Databases
  • Writing Command-line applications

Class Inheritance

Let us look at an example of class inheritance.

In [4]:
class Formatter:
    def format_text(self, text):
        return text
    
    def format_file(self, filename):
        # use a with block so the file is closed after reading
        with open(filename) as f:
            text = f.read()
        return self.format_text(text)
    
class UpperCaseFormatter(Formatter):
    def format_text(self, text):
        return text.upper()
In [5]:
f = UpperCaseFormatter()
f.format_text("Hello World!")
Out[5]:
'HELLO WORLD!'
In [6]:
print(f.format_file("words.txt"))
FIVE
FIVE FOUR
FIVE FOUR THREE
FIVE FOUR THREE TWO
FIVE FOUR THREE TWO ONE
In [22]:
class LineFormatter(Formatter):
    def format_text(self, text):
        lines = text.splitlines()
        lines = [self.format_line(line) for line in lines]
        return "\n".join(lines)
    
    def format_line(self, line):
        """Method to format a line.
        
        Subclasses can override this method to
        specify how to format individual lines.
        """
        return line
In [23]:
class PrefixFormatter(LineFormatter):
    def __init__(self, prefix):
        self.prefix = prefix
        
    def format_line(self, line):
        return self.prefix + line
In [24]:
f = PrefixFormatter("[INFO] ")
In [25]:
print(f.format_text("Hello"))
[INFO] Hello
In [26]:
print(f.format_text("a\nb\nc\n"))
[INFO] a
[INFO] b
[INFO] c
In [28]:
print(f.format_file("words.txt"))
[INFO] five
[INFO] five four
[INFO] five four three
[INFO] five four three two
[INFO] five four three two one
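
A subclass can also build on its parent's behaviour instead of replacing it. The sketch below repeats trimmed copies of the classes above so that it runs on its own, and adds a hypothetical UpperPrefixFormatter (not part of the original session) that upper-cases each line and then lets PrefixFormatter add the prefix through super().

```python
class Formatter:
    def format_text(self, text):
        return text

class LineFormatter(Formatter):
    def format_text(self, text):
        lines = [self.format_line(line) for line in text.splitlines()]
        return "\n".join(lines)

    def format_line(self, line):
        return line

class PrefixFormatter(LineFormatter):
    def __init__(self, prefix):
        self.prefix = prefix

    def format_line(self, line):
        return self.prefix + line

class UpperPrefixFormatter(PrefixFormatter):
    """Hypothetical subclass, added only for illustration."""

    def format_line(self, line):
        # upper-case the line first, then reuse the parent's prefix logic
        return super().format_line(line.upper())

f = UpperPrefixFormatter("[WARN] ")
print(f.format_text("disk almost full\ncpu hot"))
# [WARN] DISK ALMOST FULL
# [WARN] CPU HOT
```

Only format_line is overridden here. format_text and __init__ are inherited unchanged from the classes higher up the chain.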

Exception Handling

Please see the exceptions notebook.

Working with Web

In [30]:
from urllib.request import urlopen
In [31]:
url = "http://anandology.com/tmp/hello.txt"
In [32]:
contents = urlopen(url).read()
print(contents)
b'Hello, World!\n'

Notice that urlopen(url).read() always returns bytes. Decode the bytes to get a string when text is needed.

In [33]:
print(contents.decode('utf-8'))
Hello, World!
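
Not every server sends UTF-8, so hard-coding 'utf-8' can produce wrong characters. A minimal sketch, assuming the server declares its encoding in the Content-Type header: response.headers is an email-style message object, and its get_content_charset() method returns the declared charset or None. The helper name decode_body and the utf-8 fallback are assumptions made for this example. A hand-built Message stands in for response.headers so that the code runs without a network connection.

```python
from email.message import Message

def decode_body(body, headers, default="utf-8"):
    """Decode bytes using the charset from Content-Type, if any."""
    charset = headers.get_content_charset() or default
    return body.decode(charset)

# stand-in for response.headers from urlopen
headers = Message()
headers["Content-Type"] = "text/html; charset=ISO-8859-1"

print(decode_body(b"caf\xe9", headers))  # café

# no charset declared: falls back to the default
plain = Message()
plain["Content-Type"] = "text/plain"
print(decode_body(b"Hello, World!\n", plain))
```

With a real response the call would look like decode_body(response.read(), response.headers).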

In [34]:
response = urlopen(url)
In [35]:
response
Out[35]:
<http.client.HTTPResponse at 0x10ab5de48>
In [36]:
print(response.headers)
Server: nginx/0.7.67
Date: Fri, 31 Mar 2017 10:45:52 GMT
Content-Type: text/plain
Content-Length: 14
Last-Modified: Sat, 22 Aug 2015 05:41:05 GMT
Connection: close
Accept-Ranges: bytes


There is a third-party library called requests that is very popular.

In [37]:
import requests
In [38]:
r = requests.get("http://anandology.com/tmp/hello.txt")
In [39]:
r
Out[39]:
<Response [200]>
In [40]:
r.text
Out[40]:
'Hello, World!\n'
In [41]:
response = urlopen("http://google.com/")
print(response.headers)
Date: Fri, 31 Mar 2017 10:47:03 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See https://www.google.com/support/accounts/answer/151657?hl=en for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Set-Cookie: NID=100=D5twSyiyGOXo_5xsblLNxRevvu2I8kY-j-OkRg8XIyOfi3gT55OO4Rv_THWo3d74ERQzLeAoyP8jqPF7TIj4Ujir2b9d5bkttH9Y9HfjTrHklHHnAPK_-Pk6Rgjy1BjG; expires=Sat, 30-Sep-2017 10:47:03 GMT; path=/; domain=.google.co.in; HttpOnly
Accept-Ranges: none
Vary: Accept-Encoding
Connection: close


In [42]:
r = requests.get("http://google.com/")
In [43]:
r.status_code
Out[43]:
200
In [45]:
print(r.text[:500])
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en-IN"><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop="image"><title>Google</title><script>(function(){window.google={kEI:'zzPeWIiwBIn6vAS6pLXoCg',kEXPI:'1351892,1352552,1352864,1352993,1353040,1353106,3700317,3700347,3700405,4029815,4031109,4032677,4036527,4038012,4039268,4041899,4043492,4045841,4048347,40657

JSON

In [46]:
import json
In [47]:
person = {"name": "Alice", "email": "alice@example.com"}
In [48]:
print(person)
{'email': 'alice@example.com', 'name': 'Alice'}
In [49]:
jsontext = json.dumps(person)
In [50]:
jsontext
Out[50]:
'{"email": "alice@example.com", "name": "Alice"}'
In [51]:
person = {
    "name": "Alice", 
    "email": "alice@example.com",
    "active": True,
    "n": 87,
    "tags": ["a", "b"],
    "comment": "comment with a single 'quote'"
}
In [52]:
print(person)
{'tags': ['a', 'b'], 'n': 87, 'email': 'alice@example.com', 'comment': "comment with a single 'quote'", 'name': 'Alice', 'active': True}
In [53]:
jsontext = json.dumps(person)
In [54]:
jsontext
Out[54]:
'{"tags": ["a", "b"], "n": 87, "email": "alice@example.com", "comment": "comment with a single \'quote\'", "name": "Alice", "active": true}'
In [55]:
print(jsontext)
{"tags": ["a", "b"], "n": 87, "email": "alice@example.com", "comment": "comment with a single 'quote'", "name": "Alice", "active": true}
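
The cells above only go from a Python dictionary to JSON text. The reverse direction uses json.loads. A short sketch of the round trip, reusing the person dictionary from above:

```python
import json

person = {
    "name": "Alice",
    "email": "alice@example.com",
    "active": True,
    "n": 87,
    "tags": ["a", "b"],
    "comment": "comment with a single 'quote'"
}

jsontext = json.dumps(person)

# json.loads parses JSON text back into Python objects
restored = json.loads(jsontext)
print(restored == person)   # True
print(restored["active"])   # True (JSON true becomes Python True)
print(restored["tags"])     # ['a', 'b']

# indent and sort_keys give a stable, readable layout
print(json.dumps(person, indent=2, sort_keys=True))
```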

JSON APIs

Let us try a simple example.

There is a service called http://httpbin.org. It just echoes back our request as JSON.

In [56]:
url = "http://httpbin.org/get"
In [57]:
print(requests.get(url).text)
{
  "args": {}, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Connection": "close", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.11.1"
  }, 
  "origin": "183.83.118.58", 
  "url": "http://httpbin.org/get"
}

In [58]:
requests.get(url).text
Out[58]:
'{\n  "args": {}, \n  "headers": {\n    "Accept": "*/*", \n    "Accept-Encoding": "gzip, deflate", \n    "Connection": "close", \n    "Host": "httpbin.org", \n    "User-Agent": "python-requests/2.11.1"\n  }, \n  "origin": "123.201.210.187", \n  "url": "http://httpbin.org/get"\n}\n'

The requests library provides a handy way to work with JSON.

In [60]:
d = requests.get(url).json()
In [61]:
d
Out[61]:
{'args': {},
 'headers': {'Accept': '*/*',
  'Accept-Encoding': 'gzip, deflate',
  'Connection': 'close',
  'Host': 'httpbin.org',
  'User-Agent': 'python-requests/2.11.1'},
 'origin': '183.83.118.58',
 'url': 'http://httpbin.org/get'}
In [62]:
d['origin']
Out[62]:
'183.83.118.58'
In [63]:
def get_my_ip():
    url = "http://httpbin.org/get"    
    d = requests.get(url).json()
    return d['origin']
In [65]:
get_my_ip()
Out[65]:
'183.83.118.58'

GitHub has a very beautiful API.

https://developer.github.com/

In [79]:
url = "https://api.github.com/search/repositories"
params = {
    "q": "language:python",
    "sort": "stars",
    "order": "desc"
}

d = requests.get(url, params=params).json()
In [67]:
type(d)
Out[67]:
dict
In [68]:
d.keys()
Out[68]:
dict_keys(['total_count', 'incomplete_results', 'items'])
In [69]:
d['total_count']
Out[69]:
1595071
In [72]:
d['items'][0].keys()
Out[72]:
dict_keys(['forks_url', 'contributors_url', 'updated_at', 'stargazers_url', 'collaborators_url', 'trees_url', 'full_name', 'branches_url', 'labels_url', 'forks_count', 'issue_events_url', 'statuses_url', 'archive_url', 'assignees_url', 'notifications_url', 'tags_url', 'git_url', 'default_branch', 'milestones_url', 'fork', 'git_refs_url', 'teams_url', 'issues_url', 'pushed_at', 'stargazers_count', 'comments_url', 'has_pages', 'url', 'git_commits_url', 'name', 'merges_url', 'events_url', 'blobs_url', 'has_issues', 'watchers_count', 'hooks_url', 'ssh_url', 'commits_url', 'size', 'git_tags_url', 'open_issues_count', 'id', 'created_at', 'has_downloads', 'mirror_url', 'homepage', 'issue_comment_url', 'language', 'forks', 'private', 'compare_url', 'downloads_url', 'has_wiki', 'watchers', 'clone_url', 'contents_url', 'has_projects', 'owner', 'deployments_url', 'svn_url', 'languages_url', 'subscription_url', 'score', 'releases_url', 'subscribers_url', 'keys_url', 'description', 'pulls_url', 'html_url', 'open_issues'])
In [74]:
for repo in d['items'][:10]:
    print(repo['full_name'])
vinta/awesome-python
jakubroztocil/httpie
pallets/flask
rg3/youtube-dl
nvbn/thefuck
django/django
kennethreitz/requests
ansible/ansible
josephmisiti/awesome-machine-learning
minimaxir/big-list-of-naughty-strings
In [77]:
%%file popular-repos.py
import requests

url = "https://api.github.com/search/repositories"
params = {
    "q": "language:python",
    "sort": "stars",
    "order": "desc"
}

d = requests.get(url, params=params).json()
for repo in d['items'][:10]:
    print(repo['full_name'])
Overwriting popular-repos.py
In [78]:
!python popular-repos.py
vinta/awesome-python
jakubroztocil/httpie
pallets/flask
rg3/youtube-dl
nvbn/thefuck
django/django
kennethreitz/requests
ansible/ansible
josephmisiti/awesome-machine-learning
minimaxir/big-list-of-naughty-strings

Sending POST requests

In [80]:
import requests
url = "http://httpbin.org/post"
data = {"x": 1, "y": 2}
response = requests.post(url, data=data)
In [81]:
response.json()
Out[81]:
{'args': {},
 'data': '',
 'files': {},
 'form': {'x': '1', 'y': '2'},
 'headers': {'Accept': '*/*',
  'Accept-Encoding': 'gzip, deflate',
  'Connection': 'close',
  'Content-Length': '7',
  'Content-Type': 'application/x-www-form-urlencoded',
  'Host': 'httpbin.org',
  'User-Agent': 'python-requests/2.11.1'},
 'json': None,
 'origin': '183.83.118.58',
 'url': 'http://httpbin.org/post'}
In [82]:
response = requests.post(url, json=data)
In [83]:
response.json()
Out[83]:
{'args': {},
 'data': '{"y": 2, "x": 1}',
 'files': {},
 'form': {},
 'headers': {'Accept': '*/*',
  'Accept-Encoding': 'gzip, deflate',
  'Connection': 'close',
  'Content-Length': '16',
  'Content-Type': 'application/json',
  'Host': 'httpbin.org',
  'User-Agent': 'python-requests/2.11.1'},
 'json': {'x': 1, 'y': 2},
 'origin': '123.201.210.187',
 'url': 'http://httpbin.org/post'}
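
The two responses differ because data= sends the dictionary as a URL-encoded form, while json= sends it as JSON text. The sketch below builds both bodies with the standard library only. It approximates what requests puts on the wire rather than showing its actual implementation, but the lengths match the Content-Length values echoed above (7 and 16).

```python
import json
from urllib.parse import urlencode

data = {"x": 1, "y": 2}

# body for Content-Type: application/x-www-form-urlencoded
form_body = urlencode(data)
print(form_body, len(form_body))   # x=1&y=2 7

# body for Content-Type: application/json
json_body = json.dumps(data)
print(json_body, len(json_body))   # {"x": 1, "y": 2} 16
```

Note that the form version turns the numbers into strings ('1' and '2' in the echoed form field), while the JSON version keeps them as numbers.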

Example: Slack Integration

Set up an incoming webhook for your Slack channel. That gives you a webhook URL.

In [84]:
# given by slack - this posts to #hack on pipalacademy's slack
url ="https://hooks.slack.com/services/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"
In [85]:
payload={
    "text": "This is a message sent from Python.\n" +
            "And this is another line of text."
}
requests.post(url, data={"payload": json.dumps(payload)})
Out[85]:
<Response [200]>
In [86]:
def post_to_slack(message):
    payload={
        "text": message
    }
    requests.post(url, data={"payload": json.dumps(payload)})
    
In [88]:
post_to_slack("Hello everyone")
In [89]:
for i in range(10):
    post_to_slack("Hello " + str(i))
In [90]:
def post_to_slack(message, username=None, icon=None):
    payload={
        "text": message
    }
    if username:
        payload["username"] = username
    if icon:
        payload["icon_emoji"] = icon
    requests.post(url, data={"payload": json.dumps(payload)})
    
In [91]:
post_to_slack("banana banana banana", username="monkey-bot", icon=":monkey_face:")

Real-world problem from Sibi

Generate an Ansible dynamic inventory.

It looks something like this:

{
    "_meta": {
        "hostvars": {
            "host1": {},
            "host2": {}                
        }
    },
    "group1": [
        "host1",
        "host2"
    ]
}
In [92]:
%%file hosts.txt
mail-1.example.com
mail-2.example.com
web-1.example.com
web-2.example.com
Overwriting hosts.txt
In [99]:
%%file generate-dyn-inventory.py
"""Takes a plain text hosts file and generates
dynamic inventory for ansible.
"""
import sys
import json

def read_hosts(filename):
    # skip blank lines and close the file when done
    with open(filename) as f:
        return [line.strip() for line in f if line.strip()]

def get_host_vars(hostname):
    return {}

def generate_inventory(hosts):
    d = {
        "_meta": {
            "hostvars": {h: get_host_vars(h) for h in hosts}
        },
        "testgroup": hosts
    }
    return d

def main():
    hostsfile = sys.argv[1]
    hosts = read_hosts(hostsfile)
    inventory = generate_inventory(hosts)
    print(json.dumps(inventory, indent=4))
    
if __name__ == "__main__":
    main()
Overwriting generate-dyn-inventory.py
In [100]:
!python generate-dyn-inventory.py hosts.txt
{
    "_meta": {
        "hostvars": {
            "web-1.example.com": {},
            "mail-1.example.com": {},
            "web-2.example.com": {},
            "mail-2.example.com": {}
        }
    },
    "testgroup": [
        "mail-1.example.com",
        "mail-2.example.com",
        "web-1.example.com",
        "web-2.example.com"
    ]
}
In [103]:
%%file generate-dyn-inventory.py
"""Takes a plain text hosts file and generates
dynamic inventory for ansible.

version 2
"""
import sys
import json

def read_hosts(filename):
    # skip blank lines and close the file when done
    with open(filename) as f:
        return [line.strip() for line in f if line.strip()]

def get_host_vars(hostname):
    return {"g": find_group(hostname)}

def find_group(hostname):
    basename = hostname.split(".")[0]
    group = basename.split("-")[0]
    return group

def generate_inventory(hosts):
        
    d = {
        "_meta": {
            "hostvars": {h: get_host_vars(h) for h in hosts}
        },
    }

    # for each host, identify the group and add to that
    for h in hosts:
        group = find_group(h)
        d.setdefault(group, []).append(h)
    
    return d

def main():
    hostsfile = sys.argv[1]
    hosts = read_hosts(hostsfile)
    inventory = generate_inventory(hosts)
    print(json.dumps(inventory, indent=4))
    
if __name__ == "__main__":
    main()
Overwriting generate-dyn-inventory.py
In [104]:
!python generate-dyn-inventory.py hosts.txt
{
    "_meta": {
        "hostvars": {
            "web-2.example.com": {
                "g": "web"
            },
            "mail-2.example.com": {
                "g": "mail"
            },
            "mail-1.example.com": {
                "g": "mail"
            },
            "web-1.example.com": {
                "g": "web"
            }
        }
    },
    "web": [
        "web-1.example.com",
        "web-2.example.com"
    ],
    "mail": [
        "mail-1.example.com",
        "mail-2.example.com"
    ]
}
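
When Ansible runs a dynamic inventory script itself, it normally calls the script with --list (to get the whole inventory) or --host HOSTNAME (to get the variables of one host), not with a file name. The sketch below shows one possible way to support both flags. The function name run and the idea of passing the host list in as a parameter are assumptions made for this example, not part of the script above.

```python
import argparse
import json

def find_group(hostname):
    return hostname.split(".")[0].split("-")[0]

def generate_inventory(hosts):
    d = {"_meta": {"hostvars": {h: {} for h in hosts}}}
    for h in hosts:
        d.setdefault(find_group(h), []).append(h)
    return d

def run(argv, hosts):
    """Return the JSON text for the given command-line arguments."""
    p = argparse.ArgumentParser()
    g = p.add_mutually_exclusive_group(required=True)
    g.add_argument("--list", action="store_true",
                   help="print the whole inventory")
    g.add_argument("--host", help="print the variables of one host")
    args = p.parse_args(argv)

    inventory = generate_inventory(hosts)
    if args.list:
        return json.dumps(inventory, indent=4)
    # unknown hosts get an empty dictionary of variables
    hostvars = inventory["_meta"]["hostvars"]
    return json.dumps(hostvars.get(args.host, {}))

hosts = ["mail-1.example.com", "web-1.example.com", "web-2.example.com"]
print(run(["--list"], hosts))
print(run(["--host", "web-1.example.com"], hosts))
```

In a real script the host list would come from hosts.txt or another source, and main() would pass sys.argv[1:] to run.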

Regular Expressions

Please look at the regular-expressions notebook.

Writing Command-line Applications

Professional command-line applications usually take various flags and display nice help.

For example, let us look at the grep command in unix.

In [105]:
!grep --help
usage: grep [-abcDEFGHhIiJLlmnOoqRSsUVvwxZ] [-A num] [-B num] [-C[num]]
	[-e pattern] [-f file] [--binary-files=value] [--color=when]
	[--context[=num]] [--directories=action] [--label] [--line-buffered]
	[--null] [pattern] [file ...]

Let us understand different kinds of arguments that are commonly used:

 -i, --ignore-case
         Perform case insensitive matching.  
         By default, grep is case sensitive.

 -c, --count
         Only a count of selected lines is written to 
         standard output. 

 -C[num], --context=num
         Print num lines of leading and trailing context 
         surrounding each match.

 -f file, --file=file
         Read one or more newline separated patterns from file. 

pattern 
        Pattern to look for

[file ...]     
        zero or more filenames to search
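
These kinds of arguments map fairly directly onto Python's argparse module. The sketch below is a simplified, hypothetical "minigrep" parser that only parses arguments and does no searching. It deviates from the real grep in two ways: -C always takes a number here, and the pattern is always required, even when -f is given.

```python
import argparse

def build_parser():
    p = argparse.ArgumentParser(prog="minigrep")
    # boolean flags: present or absent, no value
    p.add_argument("-i", "--ignore-case", action="store_true",
                   help="perform case insensitive matching")
    p.add_argument("-c", "--count", action="store_true",
                   help="only print a count of selected lines")
    # options that take a value
    p.add_argument("-C", "--context", type=int, default=0, metavar="num",
                   help="lines of context around each match")
    p.add_argument("-f", "--file", dest="pattern_file", metavar="file",
                   help="read patterns from file")
    # positional arguments: one pattern, then zero or more files
    p.add_argument("pattern", help="pattern to look for")
    p.add_argument("files", nargs="*", metavar="file",
                   help="zero or more filenames to search")
    return p

args = build_parser().parse_args(
    ["-i", "-C", "2", "error", "a.log", "b.log"])
print(args.ignore_case, args.context)   # True 2
print(args.pattern, args.files)         # error ['a.log', 'b.log']
```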

The argparse module

In [120]:
%%file echo.py
"""Simple echo program that takes command-line flags.
"""
import argparse

def parse_args():
    p = argparse.ArgumentParser()
    p.add_argument("message", help="message to display")
    p.add_argument("-r", "--repeats", 
                   type=int,
                   default=1,
                   help="number of times to repeat the message")
    return p.parse_args()

def main():
    args = parse_args()
    print(args)
    for i in range(args.repeats):
        print(args.message)
    
if __name__ == "__main__":
    main()
Overwriting echo.py
In [117]:
!python echo.py --help
usage: echo.py [-h] [-r REPEATS] message

positional arguments:
  message               message to display

optional arguments:
  -h, --help            show this help message and exit
  -r REPEATS, --repeats REPEATS
                        number of times to repeat the message
In [121]:
!python echo.py -r 4 hello
Namespace(message='hello', repeats=4)
hello
hello
hello
hello
In [122]:
!python echo.py -r x hello
usage: echo.py [-h] [-r REPEATS] message
echo.py: error: argument -r/--repeats: invalid int value: 'x'
In [132]:
%%file echo2.py
"""Simple echo program that takes command-line flags.

version 2 with boolean flags and multiple arguments.
"""
import argparse

def parse_args():
    p = argparse.ArgumentParser()
    p.add_argument("message", nargs="+", help="message to display")
    p.add_argument("-r", "--repeats", 
                   type=int,
                   default=1,
                   help="number of times to repeat the message")
    p.add_argument("-u", "--upper-case", 
                   default=False,
                   action="store_true",
                   help="convert the message to upper case")
    return p.parse_args()

def main():
    args = parse_args()
    print(args)
    message = " ".join(args.message)
    if args.upper_case:
        message = message.upper()
    for i in range(args.repeats):
        print(message)
    
if __name__ == "__main__":
    main()
Overwriting echo2.py
In [130]:
!python echo2.py --help
usage: echo2.py [-h] [-r REPEATS] [-u] message [message ...]

positional arguments:
  message               message to display

optional arguments:
  -h, --help            show this help message and exit
  -r REPEATS, --repeats REPEATS
                        number of times to repeat the message
  -u, --upper-case      convert the message to upper case
In [133]:
!python echo2.py -u -r 3 hello
Namespace(message=['hello'], repeats=3, upper_case=True)
HELLO
HELLO
HELLO
In [134]:
!python echo2.py -r 3 hello world
Namespace(message=['hello', 'world'], repeats=3, upper_case=False)
hello world
hello world
hello world

Custom Validation

In [154]:
%%file resolve-ip.py
import argparse
import re

def ipaddress(value):
#     if value.count(".") != 3:
#         raise ValueError("Invalid IP address: " + repr(value))
    # raw string, so that \d and \. reach the regex engine unchanged
    m = re.match(r"^(\d+)\.(\d+)\.(\d+)\.(\d+)$", value)
    if not m:
        raise ValueError("Invalid IP address: " + repr(value))
    return tuple([int(x) for x in m.groups()])

def parse_args():
    p = argparse.ArgumentParser()
    p.add_argument("ip", nargs="+", 
                   type=ipaddress,
                   help="IP address to resolve")
    return p.parse_args()

def main():
    args = parse_args()
    print(args)
    
if __name__ == "__main__":
    main()
Overwriting resolve-ip.py
In [155]:
!python resolve-ip.py --help
usage: resolve-ip.py [-h] ip [ip ...]

positional arguments:
  ip          IP address to resolve

optional arguments:
  -h, --help  show this help message and exit
In [156]:
!python resolve-ip.py hello
usage: resolve-ip.py [-h] ip [ip ...]
resolve-ip.py: error: argument ip: invalid ipaddress value: 'hello'
In [157]:
!python resolve-ip.py 1.2.3.4
Namespace(ip=[(1, 2, 3, 4)])
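
The validator above accepts any digits, so 1.2.3.999 would pass, and the error message is the generic one that argparse builds from the function name. A possible refinement, sketched under the assumption that only IPv4 addresses are wanted: check that each part is at most 255, and raise argparse.ArgumentTypeError, whose message argparse shows to the user as is. The name ipv4_address is chosen for this example.

```python
import argparse
import re

def ipv4_address(value):
    m = re.match(r"^(\d+)\.(\d+)\.(\d+)\.(\d+)$", value)
    if not m:
        raise argparse.ArgumentTypeError(
            "not a dotted-quad IP address: %r" % value)
    octets = tuple(int(x) for x in m.groups())
    if any(o > 255 for o in octets):
        raise argparse.ArgumentTypeError(
            "each part must be between 0 and 255: %r" % value)
    return octets

p = argparse.ArgumentParser()
p.add_argument("ip", nargs="+", type=ipv4_address,
               help="IP address to resolve")

args = p.parse_args(["1.2.3.4", "10.0.0.255"])
print(args.ip)   # [(1, 2, 3, 4), (10, 0, 0, 255)]
```

The standard library also has an ipaddress module that can do this validation. The regular expression is kept here to stay close to the original example.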