Module 2 - Day 4

Log in to the Lab using your credentials. A notebook named 2-4.ipynb has already been created for you. Open it and use it for today’s training.

Shut down all previous notebooks.

Command line options

Using the typer module (a third-party module)

To install any third-party library, use the pip module

pip install typer
python -m pip install typer
python3 -m pip install typer (use this if you have both python3 and python2 installed)
!python3 -m pip install typer
Defaulting to user installation because normal site-packages is not writeable
Collecting typer
  Downloading typer-0.12.3-py3-none-any.whl.metadata (15 kB)
Requirement already satisfied: click>=8.0.0 in /opt/tljh/user/lib/python3.10/site-packages (from typer) (8.1.7)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /opt/tljh/user/lib/python3.10/site-packages (from typer) (4.12.2)
Collecting shellingham>=1.3.0 (from typer)
  Downloading shellingham-1.5.4-py2.py3-none-any.whl.metadata (3.5 kB)
Collecting rich>=10.11.0 (from typer)
  Downloading rich-13.7.1-py3-none-any.whl.metadata (18 kB)
Collecting markdown-it-py>=2.2.0 (from rich>=10.11.0->typer)
  Downloading markdown_it_py-3.0.0-py3-none-any.whl.metadata (6.9 kB)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /opt/tljh/user/lib/python3.10/site-packages (from rich>=10.11.0->typer) (2.18.0)
Collecting mdurl~=0.1 (from markdown-it-py>=2.2.0->rich>=10.11.0->typer)
  Downloading mdurl-0.1.2-py3-none-any.whl.metadata (1.6 kB)
Downloading typer-0.12.3-py3-none-any.whl (47 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 47.2/47.2 kB 3.5 MB/s eta 0:00:00
Downloading rich-13.7.1-py3-none-any.whl (240 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 240.7/240.7 kB 5.0 MB/s eta 0:00:00a 0:00:01
Downloading shellingham-1.5.4-py2.py3-none-any.whl (9.8 kB)
Downloading markdown_it_py-3.0.0-py3-none-any.whl (87 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 87.5/87.5 kB 18.6 MB/s eta 0:00:00
Downloading mdurl-0.1.2-py3-none-any.whl (10.0 kB)
Installing collected packages: shellingham, mdurl, markdown-it-py, rich, typer
  WARNING: The script markdown-it is installed in '/home/jupyter-pipal/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script typer is installed in '/home/jupyter-pipal/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed markdown-it-py-3.0.0 mdurl-0.1.2 rich-13.7.1 shellingham-1.5.4 typer-0.12.3

[notice] A new release of pip is available: 24.0 -> 24.1.2
[notice] To update, run: pip install --upgrade pip
import typer
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[2], line 1
----> 1 import typer

ModuleNotFoundError: No module named 'typer'
import typer
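The notes do not say why the first `import typer` failed right after a successful install. One plausible cause (an assumption) is that the running kernel had not yet picked up the user-level install location, so the import worked only after a restart. A small sketch for checking what the current interpreter can actually see; `is_installed` is a hypothetical helper name, not part of pip or typer:

```python
import importlib.util
import sys

def is_installed(module_name):
    """Return True if the current interpreter can locate module_name."""
    return importlib.util.find_spec(module_name) is not None

# Path of the interpreter running this code. Running
# "<that path> -m pip install <package>" installs for this same interpreter.
print(sys.executable)

print(is_installed("json"))                # a standard library module
print(is_installed("no_such_module_xyz"))  # a name that should not exist
```

If a package was installed but `is_installed` still reports it missing, restarting the kernel is a reasonable first thing to try.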
%%file head.py
import sys

def head(filename:str, n:int):
    with open(filename) as f:
        for i in range(n):
            print(f.readline(), end="")

n = int(sys.argv[1])
filename = sys.argv[2]
head(filename, n)
Overwriting head.py
!python head.py 3 zen.txt
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
!python head.py --help
Traceback (most recent call last):
  File "/opt/arcesium-python-2024-june/head.py", line 8, in <module>
    n = int(sys.argv[1])
ValueError: invalid literal for int() with base 10: '--help'
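The traceback above shows what happens when `sys.argv` is used without any validation. For comparison, here is a rough sketch of the checks you would have to write by hand to get a usable error message. The function name `parse_args` and the usage text are made up for illustration:

```python
USAGE = "Usage: head.py N FILENAME"

def parse_args(argv):
    """Validate the arguments that follow the script name.

    Returns (n, filename), or raises SystemExit with a usage message.
    """
    if len(argv) != 2 or "--help" in argv:
        raise SystemExit(USAGE)
    try:
        n = int(argv[0])
    except ValueError:
        raise SystemExit(f"{USAGE}\nN must be an integer, got {argv[0]!r}") from None
    return n, argv[1]

# In a script this would be called as parse_args(sys.argv[1:])
print(parse_args(["3", "zen.txt"]))
```

This is the kind of repetitive code that a command line library writes for you.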
%%file head1.py
import typer

def head(filename:str, n:int):
    with open(filename) as f:
        for i in range(n):
            print(f.readline(), end="")

if __name__ == "__main__":
    # this if block makes sure that this code is executed only when the file is run as a Python program
    typer.run(head)
Overwriting head1.py
!python head1.py --help
                                                                                
 Usage: head1.py [OPTIONS] FILENAME N                                           
                                                                                
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│ *    filename      TEXT     [default: None] [required]                       │
│ *    n             INTEGER  [default: None] [required]                       │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --help          Show this message and exit.                                  │
╰──────────────────────────────────────────────────────────────────────────────╯
!python head1.py zen.txt 5
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
!head --help
Usage: head [OPTION]... [FILE]...
Print the first 10 lines of each FILE to standard output.
With more than one FILE, precede each with a header giving the file name.

With no FILE, or when FILE is -, read standard input.

Mandatory arguments to long options are mandatory for short options too.
  -c, --bytes=[-]NUM       print the first NUM bytes of each file;
                             with the leading '-', print all but the last
                             NUM bytes of each file
  -n, --lines=[-]NUM       print the first NUM lines instead of the first 10;
                             with the leading '-', print all but the last
                             NUM lines of each file
  -q, --quiet, --silent    never print headers giving file names
  -v, --verbose            always print headers giving file names
  -z, --zero-terminated    line delimiter is NUL, not newline
      --help        display this help and exit
      --version     output version information and exit

NUM may have a multiplier suffix:
b 512, kB 1000, K 1024, MB 1000*1000, M 1024*1024,
GB 1000*1000*1000, G 1024*1024*1024, and so on for T, P, E, Z, Y, R, Q.
Binary prefixes can be used, too: KiB=K, MiB=M, and so on.

GNU coreutils online help: <https://www.gnu.org/software/coreutils/>
Full documentation <https://www.gnu.org/software/coreutils/head>
or available locally via: info '(coreutils) head invocation'
%%file head2.py
import typer
from typing_extensions import Annotated

def head(filename:str, n:Annotated[int, typer.Option(help="Number of lines to be printed")]=10):
    with open(filename) as f:
        for i in range(n):
            print(f.readline(), end="")

if __name__ == "__main__":
    # this if block makes sure that this code is executed only when the file is run as a Python program
    typer.run(head)
Overwriting head2.py
!python head2.py --help
                                                                                
 Usage: head2.py [OPTIONS] FILENAME                                             
                                                                                
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│ *    filename      TEXT  [default: None] [required]                          │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --n           INTEGER  Number of lines to be printed [default: 10]           │
│ --help                 Show this message and exit.                           │
╰──────────────────────────────────────────────────────────────────────────────╯
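In the help output above, the parameter without a default became a required argument and the parameter with a default became an option. The sketch below uses the standard `inspect` module to show that this information can be read from a function signature. It is only an illustration of the idea, not typer's actual implementation, and `describe_parameters` is a hypothetical helper:

```python
import inspect

def head(filename: str, n: int = 10):
    """Stand-in with the same kind of signature as the head functions above."""

def describe_parameters(func):
    """Return (name, type name, kind) for each parameter of func."""
    rows = []
    for name, param in inspect.signature(func).parameters.items():
        if param.default is inspect.Parameter.empty:
            kind = "argument (required)"
        else:
            kind = f"option (default {param.default!r})"
        rows.append((name, param.annotation.__name__, kind))
    return rows

for row in describe_parameters(head):
    print(row)
# ('filename', 'str', 'argument (required)')
# ('n', 'int', 'option (default 10)')
```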
!python head2.py zen.txt
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
!python head2.py --n 4 zen.txt
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
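The `head` functions above call `f.readline()` n times, which keeps looping (printing empty strings) even after the file has ended. As an alternative sketch, `itertools.islice` stops as soon as either n lines have been read or the input runs out. `first_lines` is a hypothetical name, and `io.StringIO` stands in for an open file:

```python
import io
from itertools import islice

def first_lines(lines, n=10):
    """Return at most the first n lines from any iterable of lines."""
    return list(islice(lines, n))

sample = io.StringIO("a\nb\nc\n")
print(first_lines(sample, 2))              # ['a\n', 'b\n']
print(first_lines(io.StringIO("a\n"), 5))  # ['a\n']
```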

problem

  • Implement the Unix command grep, which looks for a given keyword in a file and prints the lines that contain it. Add an option -v to your program that reverses the behaviour: it then prints all the lines that do not contain the keyword.
!grep better zen.txt
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Now is better than never.
Although never is often better than *right* now.
!grep -v better zen.txt
The Zen of Python, by Tim Peters

Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

typer library homepage


problem

  • Implement the Unix command grep, which looks for a given keyword in a file and prints the lines that contain it. Add an option -v to your program that reverses the behaviour: it then prints all the lines that do not contain the keyword.
%%file grep.py
import typer
from typing_extensions import Annotated

def grep(keyword:str, filename:str):
    with open(filename) as f:
        for line in f:
            if keyword in line:
                print(line, end="")


if __name__ == "__main__":
    typer.run(grep)
    
Overwriting grep.py
!python grep.py --help
                                                                                
 Usage: grep.py [OPTIONS] KEYWORD FILENAME                                      
                                                                                
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│ *    keyword       TEXT  [default: None] [required]                          │
│ *    filename      TEXT  [default: None] [required]                          │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --help          Show this message and exit.                                  │
╰──────────────────────────────────────────────────────────────────────────────╯
!python grep.py better zen.txt
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Now is better than never.
Although never is often better than *right* now.
%%file grep.py
import typer
from typing_extensions import Annotated

def grep(keyword:str, filename:str, v:bool=False):
    with open(filename) as f:
        for line in f:
            if keyword in line:
                print(line, end="")


if __name__ == "__main__":
    typer.run(grep)
    
Overwriting grep.py
!python grep.py --help
                                                                                
 Usage: grep.py [OPTIONS] KEYWORD FILENAME                                      
                                                                                
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│ *    keyword       TEXT  [default: None] [required]                          │
│ *    filename      TEXT  [default: None] [required]                          │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --v       --no-v      [default: no-v]                                        │
│ --help                Show this message and exit.                            │
╰──────────────────────────────────────────────────────────────────────────────╯
%%file grep.py
import typer
from typing_extensions import Annotated

def grep(keyword:str, filename:str, v:Annotated[bool, typer.Option(help="print line which do not have this keyword")]=False):
    with open(filename) as f:
        for line in f:
            if keyword in line and not v:
                print(line, end="")
            elif keyword not in line and v:
                print(line, end="")


if __name__ == "__main__":
    typer.run(grep)
    
Overwriting grep.py
!python grep.py --help
                                                                                
 Usage: grep.py [OPTIONS] KEYWORD FILENAME                                      
                                                                                
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│ *    keyword       TEXT  [default: None] [required]                          │
│ *    filename      TEXT  [default: None] [required]                          │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --v       --no-v      print line which do not have this keyword              │
│                       [default: no-v]                                        │
│ --help                Show this message and exit.                            │
╰──────────────────────────────────────────────────────────────────────────────╯
!python grep.py --v better zen.txt
The Zen of Python, by Tim Peters

Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
!python grep.py better zen.txt
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Now is better than never.
Although never is often better than *right* now.
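The `if`/`elif` pair in the final grep.py prints a line when "keyword found" and the `v` flag disagree. That can be written as a single comparison. A minimal sketch on a list of strings, with the hypothetical names `grep_lines` and `invert`, kept separate from typer so the logic is easy to test:

```python
def grep_lines(lines, keyword, invert=False):
    """Return lines containing keyword, or the other lines when invert is True."""
    # (keyword in line) is True/False; keep the line when it differs from invert
    return [line for line in lines if (keyword in line) != invert]

sample = [
    "Beautiful is better than ugly.",
    "Readability counts.",
    "Now is better than never.",
]
print(grep_lines(sample, "better"))
print(grep_lines(sample, "better", invert=True))  # ['Readability counts.']
```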

Dictionaries

person = {"name":"vikrant", "place":"Dapoli"}
person['name']
'vikrant'
person['address']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[56], line 1
----> 1 person['address']

KeyError: 'address'
for key in person:
    print(key)
name
place
for key in person:
    print(key, person[key])
name vikrant
place Dapoli
for k,v in person.items():
    print(k, v)
name vikrant
place Dapoli
for v in person.values():
    print(v)
vikrant
Dapoli
stock = {"symbol": "IBM", "high":123, "low":120, "gain":5}
stock
{'symbol': 'IBM', 'high': 123, 'low': 120, 'gain': 5}
list(stock)  # iterating over the stock dictionary yields its keys
['symbol', 'high', 'low', 'gain']
list(stock.items())  # stock.items() iterates over all (key, value) pairs
[('symbol', 'IBM'), ('high', 123), ('low', 120), ('gain', 5)]
for k in stock:
    print(k)
symbol
high
low
gain
for k,v in stock.items():
    print(k,v)
symbol IBM
high 123
low 120
gain 5
for k,v in stock:
    print(k)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[69], line 1
----> 1 for k,v in stock:
      2     print(k)

ValueError: too many values to unpack (expected 2)
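The error above comes from unpacking each key, not each key-value pair: iterating over a dictionary yields keys, and `'symbol'` is a six-character string that cannot be split into two names. A small illustration with made-up two-character keys, where the unpacking happens to succeed and makes the behaviour visible:

```python
pairs = []
for first, second in {"hi": 1, "ok": 2}:
    # each key is a two-character string, so it unpacks into its characters
    pairs.append((first, second))

print(pairs)  # [('h', 'i'), ('o', 'k')]
```

The values 1 and 2 never appear, which is why `.items()` is needed to get keys and values together.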
stock
{'symbol': 'IBM', 'high': 123, 'low': 120, 'gain': 5}
for v in stock.values():
    print(v)
IBM
123
120
5
stock['close']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[72], line 1
----> 1 stock['close']

KeyError: 'close'
stock.get("close", 0)
0
stock
{'symbol': 'IBM', 'high': 123, 'low': 120, 'gain': 5}
stock.get("high", 0)
123
person['nationality']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[77], line 1
----> 1 person['nationality']

KeyError: 'nationality'
person.get('nationality', 'Indian')
'Indian'
person
{'name': 'vikrant', 'place': 'Dapoli'}
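As the last cell shows, `get` does not add the missing key to the dictionary. A short sketch of a few related ways to deal with a missing key, using the same `person` dictionary:

```python
person = {"name": "vikrant", "place": "Dapoli"}

# get() without a second argument returns None for a missing key
print(person.get("nationality"))  # None

# the in operator checks whether a key is present
print("name" in person, "nationality" in person)  # True False

# catching KeyError is another way to supply a fallback
try:
    nationality = person["nationality"]
except KeyError:
    nationality = "Indian"
print(nationality)  # Indian

# none of the lookups above changed the dictionary
print(person)
```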

Example

Find out frequency of each word from given file

%%file words.txt
one
one two
one two three
one two three four
one two three four five
one two three four five six
one two three seven six
one two eight seven six
one nine eight seven six
ten nine eight seven six
ten nine eight seven
ten nine eight
ten nine
ten
Overwriting words.txt
d = {}
d['one'] = 1
d
{'one': 1}
d['one'] = 2
d
{'one': 2}
def get_words(filename):
    with open(filename) as f:
        return f.read().split()
words = get_words("words.txt")
unique_words= []
[unique_words.append(word) for word in words if word not in unique_words]
[None, None, None, None, None, None, None, None, None, None]
unique_words
['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten']
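The list comprehension above is used only for its side effect, which is why it returns a list of `None` values. One alternative sketch: dictionary keys are unique and keep insertion order, so `dict.fromkeys` gives the unique words in first-seen order. The short `words` list here is made up for the example:

```python
words = "one one two one two three".split()

# keys are unique and stay in the order they were first inserted
unique_words = list(dict.fromkeys(words))
print(unique_words)  # ['one', 'two', 'three']
```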
print({word}:{counts})
  Cell In[94], line 1
    print({word}:{counts})
                ^
SyntaxError: invalid syntax
print(f"{word}:{counts}")
def word_count(words):
    count={}
    for word in words:
        if word in count:  # check the dictionary, not the function word_count
            count[word]+=1
        else:
            count[word]=1
    return count


def word_count_(words):
    count={}
    for word in words:
        count[word] = count.get(word, 0) + 1
    return count


def word_count__(words):
    unique = set(words)  # one pass over all the words
    count = {}
    for w in unique:
        count[w] = words.count(w)  # each count() call is another full pass over words
    return count


def pretty_print(word_count):    
    for word, counts in word_count.items():
        print(f"{word}:{counts}")
word_count_(words)
{'one': 9,
 'two': 7,
 'three': 5,
 'four': 3,
 'five': 2,
 'six': 5,
 'seven': 5,
 'eight': 5,
 'nine': 5,
 'ten': 5}
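The standard library also has a ready-made class for this kind of counting, `collections.Counter`. A brief sketch on a small made-up word list, shown as an alternative to the hand-written functions above rather than a replacement for understanding them:

```python
from collections import Counter

words = "one one two one two three".split()
freq = Counter(words)

print(freq["one"])          # 3
print(freq["missing"])      # 0, a missing key counts as zero
print(freq.most_common(2))  # [('one', 3), ('two', 2)]
```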
freq = word_count_(words)
for word, count in freq.items():
    print(f"{word} {count}")
one 9
two 7
three 5
four 3
five 2
six 5
seven 5
eight 5
nine 5
ten 5
for word, count in freq.items():
    print(f"{word.ljust(5)} {count}")
one   9
two   7
three 5
four  3
five  2
six   5
seven 5
eight 5
nine  5
ten   5
for word, count in freq.items():
    print(f"{word.rjust(5)} {count}")
  one 9
  two 7
three 5
 four 3
 five 2
  six 5
seven 5
eight 5
 nine 5
  ten 5
"test".rjust(10)
'      test'
"test".ljust(10)
'test      '
"test".center(10)
'   test   '
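The same padding can be requested inside an f-string with a format specification: `>` right-aligns, `<` left-aligns and `^` centers, followed by the width. A short sketch comparing it with the string methods above:

```python
word = "test"

print(repr(f"{word:>10}"))  # '      test'  same as word.rjust(10)
print(repr(f"{word:<10}"))  # 'test      '  same as word.ljust(10)
print(repr(f"{word:^10}"))  # '   test   '  same as word.center(10) here

# alignment and other text can be mixed in one f-string
line = f"{'one':>5}: {9}"
print(line)  # '  one: 9' without the quotes
```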
for word, count in freq.items():
    print(f"{word.rjust(5)}: {count}")
  one: 9
  two: 7
three: 5
 four: 3
 five: 2
  six: 5
seven: 5
eight: 5
 nine: 5
  ten: 5
def get_count(pair):
    return pair[1]

for word, count in sorted(freq.items(), key=get_count):
    print(f"{word.rjust(5)}: {count}")
 five: 2
 four: 3
three: 5
  six: 5
seven: 5
eight: 5
 nine: 5
  ten: 5
  two: 7
  one: 9
for word, count in sorted(freq.items(), key=get_count, reverse=True):
    print(f"{word.rjust(5)}: {count}")
  one: 9
  two: 7
three: 5
  six: 5
seven: 5
eight: 5
 nine: 5
  ten: 5
 four: 3
 five: 2
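In the sorted output above, words with the same count stay in their original order. This is because Python's sort is stable, including with `reverse=True`. The sketch below shows this on a small made-up dictionary, and also uses `operator.itemgetter(1)` as a ready-made equivalent of the `get_count` helper:

```python
from operator import itemgetter

freq = {"one": 3, "two": 2, "three": 2, "four": 1}

# itemgetter(1) returns a function that picks element 1 of each pair
by_count = sorted(freq.items(), key=itemgetter(1), reverse=True)
print(by_count)  # [('one', 3), ('two', 2), ('three', 2), ('four', 1)]

# a lambda does the same job
same = sorted(freq.items(), key=lambda pair: pair[1], reverse=True)
print(same == by_count)  # True
```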
for word, count in sorted(freq.items(), key=get_count, reverse=True):
    print(f"{word.rjust(5)}: {count}", "*"*count)
  one: 9 *********
  two: 7 *******
three: 5 *****
  six: 5 *****
seven: 5 *****
eight: 5 *****
 nine: 5 *****
  ten: 5 *****
 four: 3 ***
 five: 2 **

Dictionary comprehension

names = ['APPLE', 'IBM', 'AT&T', 'AGILENT']
values = [700.5, 300.1, 355.7, 600.3]
d = {}
for name, value in zip(names, values):
    d[name] =value
d
{'APPLE': 700.5, 'IBM': 300.1, 'AT&T': 355.7, 'AGILENT': 600.3}
{name:value for name, value in zip(names, values)}
{'APPLE': 700.5, 'IBM': 300.1, 'AT&T': 355.7, 'AGILENT': 600.3}
def word_count__(words):
    unique = set(words)  # one pass over all the words
    return {w:words.count(w) for w in unique}
word_count__(words)
{'two': 7,
 'one': 9,
 'ten': 5,
 'eight': 5,
 'four': 3,
 'seven': 5,
 'six': 5,
 'three': 5,
 'five': 2,
 'nine': 5}
dict(zip(names, values))
{'APPLE': 700.5, 'IBM': 300.1, 'AT&T': 355.7, 'AGILENT': 600.3}
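A dictionary comprehension can also filter items with an `if` clause, or swap keys and values, which `dict(zip(...))` alone cannot do. A small sketch with the same `names` and `values`; the threshold 500 is chosen arbitrarily for the example:

```python
names = ['APPLE', 'IBM', 'AT&T', 'AGILENT']
values = [700.5, 300.1, 355.7, 600.3]

# keep only the entries whose value is above 500
expensive = {name: value for name, value in zip(names, values) if value > 500}
print(expensive)  # {'APPLE': 700.5, 'AGILENT': 600.3}

# swap keys and values
by_value = {value: name for name, value in zip(names, values)}
print(by_value[300.1])  # IBM
```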
indexdata = [('IBM', 'Monday', 111.71436961893693),
          ('IBM', 'Tuesday', 141.21220022208635),
          ('IBM', 'Wednesday', 112.40571010053796),
          ('IBM', 'Thursday', 137.54133351926248),
          ('IBM', 'Friday', 140.25154281801224),
          ('MICROSOFT', 'Monday', 235.0403622499107),
          ('MICROSOFT', 'Tuesday', 225.0206535036475),
          ('MICROSOFT', 'Wednesday', 216.10342426936444),
          ('MICROSOFT', 'Thursday', 200.38038844494193),
          ('MICROSOFT', 'Friday', 235.80850482793264),
          ('APPLE', 'Monday', 321.49182055844256),
          ('APPLE', 'Tuesday', 340.63612771662815),
          ('APPLE', 'Wednesday', 303.9065277507285),
          ('APPLE', 'Thursday', 338.1350605764038),
          ('APPLE', 'Friday', 318.3912296144338)]
def mean(nums):
    return sum(nums)/len(nums)


def get_prices(symbol, data):
    return [price for name, day, price in data if name==symbol]    

def get_weekly_average(symbol, data):
    p = get_prices(symbol, data)
    return mean(p)

def get_symbols(data):
    return {item[0] for item in data}
get_weekly_average("IBM", indexdata)
128.62503125576717
{symbol:get_weekly_average(symbol, indexdata) for symbol in get_symbols(indexdata)}
{'IBM': 128.62503125576717,
 'APPLE': 324.51215324332736,
 'MICROSOFT': 222.47066665915946}
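The comprehension above calls `get_prices` once per symbol, so the data is scanned once for every symbol. An alternative sketch groups the prices in a single pass with `collections.defaultdict` and then averages each group. The function name `weekly_averages` and the three-row `sample` are made up for illustration:

```python
from collections import defaultdict

def weekly_averages(data):
    """Return {symbol: mean price} from (symbol, day, price) tuples."""
    prices = defaultdict(list)  # missing symbols start with an empty list
    for symbol, day, price in data:
        prices[symbol].append(price)
    return {symbol: sum(p) / len(p) for symbol, p in prices.items()}

sample = [
    ("IBM", "Monday", 100.0),
    ("IBM", "Tuesday", 110.0),
    ("APPLE", "Monday", 300.0),
]
print(weekly_averages(sample))  # {'IBM': 105.0, 'APPLE': 300.0}
```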