Python Virtual Training For Arcesium - Module II - Day 2

Jun Jul 18-22, 2022 Vikrant Patil

All notes are available online at https://notes.pipal.in/2022/arcesium_finop_batch1/

Please accept the invitation you received by email and log in at

https://engage.pipal.in/

From there, launch JupyterLab and create a notebook named module2-day2.

© Pipal Academy LLP

Problem Set 2.1

In [11]:
def reversed_words(statement):
    words = statement.split()
    reversed_list = []
    for word in reversed(words):
        reversed_list.append(word)
        
    return " ".join(reversed_list)     #do not put print here
In [10]:
## this is not production code!
# write as per specification in the problem statement!
statement = "joy of programming"
words = statement.split()
reversed_list = []
for word in reversed(words):
    reversed_list.append(word)   
" ".join(reversed_list)
Out[10]:
'programming of joy'
In [8]:
reversed_words("Some statement to reverse")
Out[8]:
'reverse to statement Some'
In [9]:
def reversed_words(statement):
    words = statement.split()
    return " ".join([word for word in reversed(words)])
    
In [12]:
reversed_words("text statement goes here")
Out[12]:
'here goes statement text'
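A possible variation, not from the session: the same reversal can be written with a slice, since `words[::-1]` gives a reversed copy of the list. This is only a sketch of an alternative; the behaviour should match the versions above.

```python
def reversed_words(statement):
    # split() with no arguments splits on any run of whitespace
    words = statement.split()
    # words[::-1] is a reversed copy of the list
    return " ".join(words[::-1])


print(reversed_words("joy of programming"))
```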
In [13]:
def merge_lists(collection1 , collection2):
    merged = []
    for item1, item2 in zip(collection1, collection2):
        merged.append(item1)
        merged.append(item2)
        
    return merged
In [14]:
ones = [1, 1, 1, 1]
In [15]:
ones.append(0)
In [16]:
ones.append(0, 2)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [16], in <cell line: 1>()
----> 1 ones.append(0, 2)

TypeError: list.append() takes exactly one argument (2 given)
In [17]:
ones.extend([1, 0])
In [18]:
def merge_lists(collection1 , collection2):
    merged = []
    for item1, item2 in zip(collection1, collection2):
        merged.extend([item1, item2])
        
    return merged
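The notes never call `merge_lists`, so here is a small usage sketch (the sample lists are made up for illustration). It also shows one thing worth knowing: `zip` stops at the shorter input, so leftover items from the longer list are dropped.

```python
def merge_lists(collection1, collection2):
    merged = []
    for item1, item2 in zip(collection1, collection2):
        merged.extend([item1, item2])
    return merged


same_length = merge_lists([1, 2, 3], ["a", "b", "c"])
# zip stops as soon as the shorter list runs out, so 2 and 3 are dropped here
unequal = merge_lists([1, 2, 3], ["a"])

print(same_length)
print(unequal)
```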
In [20]:
ones.extend(["1","0"])
In [21]:
ones
Out[21]:
[1, 1, 1, 1, 0, 1, 0, '1', '0']
In [22]:
twos = [2, 2, 2]
threes = [3, 3]
In [23]:
twos.extend(threes)
In [24]:
twos
Out[24]:
[2, 2, 2, 3, 3]
In [25]:
threes
Out[25]:
[3, 3]
In [26]:
fours = [4, 4]
fives = [5, 5, 5, 5]
In [27]:
fours.append(fives) # append takes exactly one item and adds it at the end, so the whole list becomes one element
In [28]:
fours
Out[28]:
[4, 4, [5, 5, 5, 5]]
In [29]:
fours = [4, 4]
fives = [5, 5, 5, 5]
fours.extend(fives)
In [30]:
fours
Out[30]:
[4, 4, 5, 5, 5, 5]
In [31]:
import random
In [32]:
random.choice([4, 5, 6, 2, 6])
Out[32]:
6
In [33]:
random.choice([4, 5, 6, 2, 6])
Out[33]:
5
In [34]:
random.choice("thisisrandomtext")
Out[34]:
's'
In [35]:
length = random.choice(range(8,16))
In [36]:
length
Out[36]:
11
In [38]:
import string
string.ascii_letters
string.digits
Out[38]:
'0123456789'
In [40]:
chars = string.ascii_letters + string.digits + "!@#$%^&*()"
"".join([random.choice(chars) for i in range(length)])
Out[40]:
'dKzud($PbC9'
In [41]:
import string

def generate_password():
    length = random.choice(range(8, 16))
    chars = string.ascii_letters + string.digits + "!@#$%^&*()"
    return "".join([random.choice(chars) for i in range(length)])
    
In [57]:
generate_password()
Out[57]:
'ig7YDbd3O@UG&'
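A side note beyond the session: the `random` module is fine for exercises, but the standard library's `secrets` module is the one intended for passwords and tokens. Below is a sketch of the same generator built on `secrets`; the name `generate_password_secure` is made up for this example.

```python
import secrets
import string


def generate_password_secure():
    # hypothetical name; same character set as generate_password above
    chars = string.ascii_letters + string.digits + "!@#$%^&*()"
    # randbelow(8) gives 0..7, so the length is 8..15, matching range(8, 16)
    length = 8 + secrets.randbelow(8)
    return "".join(secrets.choice(chars) for _ in range(length))


print(generate_password_secure())
```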

Working With Files

In [59]:
%%file zen.txt
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
Writing zen.txt
In [61]:
!cat zen.txt
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
In [62]:
filename = "zen.txt" # this is a relative path, so the file is looked up in the current working directory
In [63]:
with open(filename) as f:
    contents = f.read()
    print(contents)
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

In [65]:
with open(filename) as f:
    for line in f:
        print(line) # line already ends with "\n", and print adds its own newline as well
The Zen of Python, by Tim Peters



Beautiful is better than ugly.

Explicit is better than implicit.

Simple is better than complex.

Complex is better than complicated.

Flat is better than nested.

Sparse is better than dense.

Readability counts.

Special cases aren't special enough to break the rules.

Although practicality beats purity.

Errors should never pass silently.

Unless explicitly silenced.

In the face of ambiguity, refuse the temptation to guess.

There should be one-- and preferably only one --obvious way to do it.

Although that way may not be obvious at first unless you're Dutch.

Now is better than never.

Although never is often better than *right* now.

If the implementation is hard to explain, it's a bad idea.

If the implementation is easy to explain, it may be a good idea.

Namespaces are one honking great idea -- let's do more of those!

In [66]:
with open(filename) as f:
    for line in f:
        print(line, end="")
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
In [67]:
with open(filename) as f:
    for linenum, line in enumerate(f, start=1):
        print(linenum, line, end="")
1 The Zen of Python, by Tim Peters
2 
3 Beautiful is better than ugly.
4 Explicit is better than implicit.
5 Simple is better than complex.
6 Complex is better than complicated.
7 Flat is better than nested.
8 Sparse is better than dense.
9 Readability counts.
10 Special cases aren't special enough to break the rules.
11 Although practicality beats purity.
12 Errors should never pass silently.
13 Unless explicitly silenced.
14 In the face of ambiguity, refuse the temptation to guess.
15 There should be one-- and preferably only one --obvious way to do it.
16 Although that way may not be obvious at first unless you're Dutch.
17 Now is better than never.
18 Although never is often better than *right* now.
19 If the implementation is hard to explain, it's a bad idea.
20 If the implementation is easy to explain, it may be a good idea.
21 Namespaces are one honking great idea -- let's do more of those!
In [69]:
with open(filename) as f:
    for line in reversed(f): # fails: f is an iterator, and reversed() needs a sequence
        print(line, end="")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [69], in <cell line: 1>()
      1 with open(filename) as f:
----> 2     for line in reversed(f): # fails: f is an iterator, and reversed() needs a sequence
      3         print(line, end="")

TypeError: '_io.TextIOWrapper' object is not reversible
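One way around this error, sketched here as an assumption rather than something shown in the session: read the lines into a list first, because a list is reversible. The helper name `reversed_lines` and the temporary demo file are made up for the example. Note that this loads the whole file into memory.

```python
import os
import tempfile


def reversed_lines(filename):
    """Return the lines of a file, last line first."""
    with open(filename) as f:
        lines = f.readlines()      # a list, which reversed() accepts
    return list(reversed(lines))


# demo on a small temporary file so the sketch runs anywhere
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("one\ntwo\nthree\n")

for line in reversed_lines(tmp.name):
    print(line, end="")

os.remove(tmp.name)
```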
In [70]:
f = open(filename) # returns a file handle, which is also an iterator
In [71]:
f.readline() # reads only one line, so the file position moves to the start of the next line
Out[71]:
'The Zen of Python, by Tim Peters\n'
In [72]:
f.readline()
Out[72]:
'\n'
In [73]:
f.readline()
Out[73]:
'Beautiful is better than ugly.\n'
In [74]:
for line in f: # this will start where the pointer was
    print(line, end="")
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
In [76]:
f.readline() # once the end of the file is reached, this returns an empty string
Out[76]:
''
In [78]:
f.read() # even this will return empty string
Out[78]:
''
In [79]:
f.close()
In [81]:
f.read() # because the file is closed, this raises an error
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [81], in <cell line: 1>()
----> 1 f.read()

ValueError: I/O operation on closed file.
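The cells above show that a file handle remembers its position and returns `''` once the end is reached. A short sketch of how to go back to the start with `seek(0)` while the file is still open, and how to check `f.closed` before reading. The temporary file is only there to keep the example self-contained.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\n")

f = open(tmp.name)
first_pass = f.read()     # reads everything; the position is now at the end
at_end = f.read()         # '' because nothing is left to read
f.seek(0)                 # move the position back to the start
second_pass = f.read()    # the full contents again
f.close()
was_closed = f.closed     # True once close() has run

os.remove(tmp.name)
print(repr(at_end), first_pass == second_pass, was_closed)
```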
In [82]:
%%file words.txt
some
words
with some lines
and some data
to work
on file reading
Writing words.txt
In [83]:
x = [3, 4, 5, 6]
In [84]:
y = [3, 4, 5, 6]
In [86]:
x == y # the data is same and matching
Out[86]:
True
In [88]:
x is y # "is" checks whether x and y are the very same object, not whether their values are equal
Out[88]:
False
In [89]:
x is "the"
<>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
<>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
/tmp/ipykernel_17359/2034459987.py:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
  x is "the"
Out[89]:
False
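To make the difference between `==` and `is` concrete, here is a small sketch. The name `z` is introduced only for this example: it is a second name for the same list as `x`, so a change made through `z` is visible through `x`.

```python
x = [3, 4, 5, 6]
y = [3, 4, 5, 6]   # equal contents, but a separate list object
z = x              # a second name for the same object as x

same_value = (x == y)
same_object_xy = (x is y)
same_object_xz = (x is z)

z.append(7)        # changing z changes x too, since they are one object

print(same_value, same_object_xy, same_object_xz)
print(x, y)
```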
In [90]:
def count_word(filename, searchword="the"):
    with open(filename) as filehandle:
        count = 0
        for line in filehandle:
            words = line.split()
            for word in words:
                if word == searchword:
                    count += 1
                    
    return count
In [91]:
count_word("zen.txt")
Out[91]:
5
In [92]:
!cat zen.txt
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
In [93]:
def count_word(filename, searchword="the"):
    with open(filename) as filehandle:
        count = 0
        for line in filehandle:
            words = line.split()
            for word in words:
                if word.lower() == searchword.lower():
                    count += 1
                    
    return count
In [94]:
count_word("zen.txt")
Out[94]:
6
In [95]:
poem = """Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
"""
In [96]:
poem
Out[96]:
"Although never is often better than *right* now.\nIf the implementation is hard to explain, it's a bad idea.\nIf the implementation is easy to explain, it may be a good idea.\nNamespaces are one honking great idea -- let's do more of those!\n"
In [98]:
poem.split() # with no arguments, split() breaks on spaces, tabs and newlines, and treats a run of whitespace as one separator
Out[98]:
['Although',
 'never',
 'is',
 'often',
 'better',
 'than',
 '*right*',
 'now.',
 'If',
 'the',
 'implementation',
 'is',
 'hard',
 'to',
 'explain,',
 "it's",
 'a',
 'bad',
 'idea.',
 'If',
 'the',
 'implementation',
 'is',
 'easy',
 'to',
 'explain,',
 'it',
 'may',
 'be',
 'a',
 'good',
 'idea.',
 'Namespaces',
 'are',
 'one',
 'honking',
 'great',
 'idea',
 '--',
 "let's",
 'do',
 'more',
 'of',
 'those!']
In [99]:
def count_word(filename, searchword="the"):
    with open(filename) as filehandle:
        words = filehandle.read().lower().split()
        return words.count(searchword.lower())
In [100]:
count_word("zen.txt", "THE")
Out[100]:
6
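One limitation of the versions above: `split()` leaves punctuation attached, so a word such as `'idea.'` would not match `'idea'`. A possible refinement, sketched under the assumption that stripping leading and trailing punctuation is good enough; `count_word_in_text` is a made-up helper name.

```python
import string


def count_word_in_text(text, searchword="the"):
    # hypothetical helper: strip leading/trailing punctuation from each word
    words = [w.strip(string.punctuation) for w in text.lower().split()]
    return words.count(searchword.lower())


def count_word(filename, searchword="the"):
    with open(filename) as filehandle:
        return count_word_in_text(filehandle.read(), searchword)


sample = "If the idea is bad, drop the idea. A good idea!"
print(count_word_in_text(sample, "idea"))
```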
In [102]:
!cat words.txt
some
words
with some lines
and some data
to work
on file reading

problem

  • Write a Python script cat.py that mimics the Unix command cat. Essentially, cat.py should print the contents of a file to the screen.
python cat.py words.txt
some
words
with some lines
and some data
to work
on file reading
  • Write a Python script head.py that mimics the Unix command head. It should show the first n lines of the file passed as an argument.
python head.py 5 zen.txt
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
In [103]:
%%file cat.py
import sys

def printfile(filename):
    with open(filename) as f:
        print(f.read())
        

printfile(sys.argv[1])
Writing cat.py
In [104]:
!python cat.py words.txt
some
words
with some lines
and some data
to work
on file reading

In [105]:
!python cat.py zen.txt
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

In [106]:
!python cat.py cat.py
import sys

def printfile(filename):
    with open(filename) as f:
        print(f.read())
        

printfile(sys.argv[1])

In [107]:
import cat
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Input In [107], in <cell line: 1>()
----> 1 import cat

File ~/trainings/2022/arcesium_finop_batch1/cat.py:8, in <module>
      4     with open(filename) as f:
      5         print(f.read())
----> 8 printfile(sys.argv[1])

File ~/trainings/2022/arcesium_finop_batch1/cat.py:4, in printfile(filename)
      3 def printfile(filename):
----> 4     with open(filename) as f:
      5         print(f.read())

FileNotFoundError: [Errno 2] No such file or directory: '-f'
In [115]:
%%file cat.py
import sys

def printfile(filename):
    with open(filename) as f:
        print(f.read())
        

if __name__ == "__main__":  # this block will be executed only when this file is used as a script
    printfile(sys.argv[1])
Overwriting cat.py
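The outputs of cat.py in these notes end with one extra blank line. That comes from `print`: the text returned by `f.read()` already ends with `"\n"`, and `print` adds another. A sketch of a variant that avoids it with `end=""`; the temporary file is only for demonstration.

```python
import os
import tempfile


def printfile(filename):
    with open(filename) as f:
        # the text already ends with "\n"; end="" stops print adding another
        print(f.read(), end="")


# demo on a throwaway file so the sketch runs anywhere
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("some\nwords\n")

printfile(tmp.name)
os.remove(tmp.name)
```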
In [109]:
import cat
In [110]:
!python cat.py words.txt
some
words
with some lines
and some data
to work
on file reading

In [111]:
import cat
In [112]:
%%file testmodule.py

print(__name__)
Writing testmodule.py
In [113]:
!python testmodule.py
__main__
In [114]:
import testmodule
testmodule
  • Write a Python script head.py that mimics the Unix command head. It should show the first n lines of the file passed as an argument.
python head.py 5 zen.txt
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
In [118]:
!head -n 6 zen.txt
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
In [123]:
help(f.read)
Help on built-in function read:

read(size=-1, /) method of _io.TextIOWrapper instance
    Read at most n characters from stream.
    
    Read from underlying buffer until we have n characters or we hit EOF.
    If n is negative or omitted, read until EOF.

In [124]:
help(f.readlines)
Help on built-in function readlines:

readlines(hint=-1, /) method of _io.TextIOWrapper instance
    Return a list of lines from the stream.
    
    hint can be specified to control the number of lines read: no more
    lines will be read if the total size (in bytes/characters) of all
    lines so far exceeds hint.

In [127]:
%%file head.py
import sys

def first_n_lines(filename, n):
    with open(filename) as f:
        for i in range(n):
            print(f.readline(), end="")
            
if __name__ == "__main__":
    first_n_lines(sys.argv[2], int(sys.argv[1]))
Overwriting head.py
In [128]:
!python head.py 4 zen.txt
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
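An alternative sketch for the same task, using `itertools.islice` from the standard library instead of calling `readline()` n times. `islice` stops after n lines, or earlier when the file is shorter. Returning the lines as a list (rather than printing inside the function) is a choice made here for illustration.

```python
import os
import tempfile
from itertools import islice


def first_n_lines(filename, n):
    with open(filename) as f:
        # islice stops after n lines, or earlier if the file is shorter
        return list(islice(f, n))


# demo on a small temporary file
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("one\ntwo\nthree\n")

for line in first_n_lines(tmp.name, 2):
    print(line, end="")

os.remove(tmp.name)
```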

problems

  • Write a Python script wc.py that prints the line count, word count and character count of a file
    python wc.py zen.txt
    21 144 857 zen.txt
In [129]:
!wc zen.txt
 21 144 857 zen.txt
In [132]:
%%file wc.py
import sys

def linecount(filename):
    with open(filename) as f:
        return len(f.readlines()) # readlines reads all lines in a file as a list of lines
    
def wordcount(filename):
    with open(filename) as f:
        return len(f.read().split())
    
def charcount(filename):
    with open(filename) as f:
        return len(f.read())
    
    
if __name__ == "__main__":
    file = sys.argv[1]
    print(linecount(file), wordcount(file), charcount(file), file)
Overwriting wc.py
In [131]:
!python wc.py zen.txt
21 144 857 zen.txt
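The wc.py above opens the file three times. A possible single-pass version is sketched below; the function name `wc` and its tuple return value are choices made for this example. Keep in mind that it counts characters, while the Unix `wc` counts bytes by default, so the numbers agree only for plain ASCII text like zen.txt.

```python
import os
import tempfile


def wc(filename):
    lines = words = chars = 0
    with open(filename) as f:
        for line in f:
            lines += 1
            words += len(line.split())
            chars += len(line)      # includes the trailing "\n"
    return lines, words, chars


# demo on a small temporary file
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("some\nwords\nwith some lines\n")

print(*wc(tmp.name), tmp.name)
os.remove(tmp.name)
```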
In [133]:
!pip install typer
Requirement already satisfied: typer in /home/vikrant/usr/local/jupyter-py3.10/lib/python3.10/site-packages (0.6.1)
Requirement already satisfied: click<9.0.0,>=7.1.1 in /home/vikrant/usr/local/jupyter-py3.10/lib/python3.10/site-packages (from typer) (8.1.2)
In [134]:
%%file headnew.py
import typer


def head(filename, n):
    with open(filename) as f:
        for i in range(n):
            print(f.readline(), end="")
            
            
if __name__ == "__main__":
    typer.run(head)
Writing headnew.py
In [137]:
!python headnew.py --help
Usage: headnew.py [OPTIONS] FILENAME N

Arguments:
  FILENAME  [required]
  N         [required]

Options:
  --install-completion [bash|zsh|fish|powershell|pwsh]
                                  Install completion for the specified shell.
  --show-completion [bash|zsh|fish|powershell|pwsh]
                                  Show completion for the specified shell, to
                                  copy it or customize the installation.
  --help                          Show this message and exit.
In [138]:
!python headnew.py zen.txt 5
Traceback (most recent call last):

  File "/home/vikrant/trainings/2022/arcesium_finop_batch1/headnew.py", line 11, in <module>
    typer.run(head)

  File "/home/vikrant/trainings/2022/arcesium_finop_batch1/headnew.py", line 6, in head
    for i in range(n):

TypeError: 'str' object cannot be interpreted as an integer

In [139]:
%%file headnew.py
import typer


def head(filename:str, n:int): # type annotations
    with open(filename) as f:
        for i in range(n):
            print(f.readline(), end="")
            
            
if __name__ == "__main__":
    typer.run(head)
Overwriting headnew.py
In [141]:
!python headnew.py --help
Usage: headnew.py [OPTIONS] FILENAME N

Arguments:
  FILENAME  [required]
  N         [required]

Options:
  --install-completion [bash|zsh|fish|powershell|pwsh]
                                  Install completion for the specified shell.
  --show-completion [bash|zsh|fish|powershell|pwsh]
                                  Show completion for the specified shell, to
                                  copy it or customize the installation.
  --help                          Show this message and exit.
In [142]:
!python headnew.py zen.txt 5
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
In [143]:
%%file headnew.py
import typer


def head(filename:str, n:int=typer.Argument(5, help="Number of lines to be displayed")): # type annotations
    with open(filename) as f:
        for i in range(n):
            print(f.readline(), end="")
            
            
if __name__ == "__main__":
    typer.run(head)
Overwriting headnew.py
In [144]:
!python headnew.py --help
Usage: headnew.py [OPTIONS] FILENAME [N]

Arguments:
  FILENAME  [required]
  [N]       Number of lines to be displayed  [default: 5]

Options:
  --install-completion [bash|zsh|fish|powershell|pwsh]
                                  Install completion for the specified shell.
  --show-completion [bash|zsh|fish|powershell|pwsh]
                                  Show completion for the specified shell, to
                                  copy it or customize the installation.
  --help                          Show this message and exit.
In [145]:
!python headnew.py zen.txt
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
In [146]:
!python headnew.py zen.txt 3
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
In [147]:
%%file headnew.py
import typer


def head(filename:str, n:int=typer.Option(5, "--numlines", "-n", help="Number of lines to be displayed")): # type annotations
    with open(filename) as f:
        for i in range(n):
            print(f.readline(), end="")
            
            
if __name__ == "__main__":
    typer.run(head)
Overwriting headnew.py
In [148]:
!python headnew.py --help
Usage: headnew.py [OPTIONS] FILENAME

Arguments:
  FILENAME  [required]

Options:
  -n, --numlines INTEGER          Number of lines to be displayed  [default:
                                  5]
  --install-completion [bash|zsh|fish|powershell|pwsh]
                                  Install completion for the specified shell.
  --show-completion [bash|zsh|fish|powershell|pwsh]
                                  Show completion for the specified shell, to
                                  copy it or customize the installation.
  --help                          Show this message and exit.
In [149]:
!python headnew.py -n 3 zen.txt
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
In [151]:
!python headnew.py --numlines 4 zen.txt
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
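For comparison only, and not part of the session: roughly the same command-line interface can be described with the standard library's `argparse`, which needs no extra installation. The helper name `build_parser` is made up; the option names mirror the typer version above.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Show the first lines of a file")
    parser.add_argument("filename")
    # type=int converts the text from the command line, like the n:int annotation
    parser.add_argument("-n", "--numlines", type=int, default=5,
                        help="Number of lines to be displayed")
    return parser


def head(filename, n):
    with open(filename) as f:
        for i in range(n):
            print(f.readline(), end="")


# parse an explicit argument list instead of sys.argv, so this runs in a notebook
args = build_parser().parse_args(["-n", "3", "zen.txt"])
print(args.filename, args.numlines)
```

In a script, one would call `build_parser().parse_args()` with no arguments and then `head(args.filename, args.numlines)`.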
In [152]:
!grep --help
Usage: grep [OPTION]... PATTERNS [FILE]...
Search for PATTERNS in each FILE.
Example: grep -i 'hello world' menu.h main.c
PATTERNS can contain multiple patterns separated by newlines.

Pattern selection and interpretation:
  -E, --extended-regexp     PATTERNS are extended regular expressions
  -F, --fixed-strings       PATTERNS are strings
  -G, --basic-regexp        PATTERNS are basic regular expressions
  -P, --perl-regexp         PATTERNS are Perl regular expressions
  -e, --regexp=PATTERNS     use PATTERNS for matching
  -f, --file=FILE           take PATTERNS from FILE
  -i, --ignore-case         ignore case distinctions in patterns and data
      --no-ignore-case      do not ignore case distinctions (default)
  -w, --word-regexp         match only whole words
  -x, --line-regexp         match only whole lines
  -z, --null-data           a data line ends in 0 byte, not newline

Miscellaneous:
  -s, --no-messages         suppress error messages
  -v, --invert-match        select non-matching lines
  -V, --version             display version information and exit
      --help                display this help text and exit

Output control:
  -m, --max-count=NUM       stop after NUM selected lines
  -b, --byte-offset         print the byte offset with output lines
  -n, --line-number         print line number with output lines
      --line-buffered       flush output on every line
  -H, --with-filename       print file name with output lines
  -h, --no-filename         suppress the file name prefix on output
      --label=LABEL         use LABEL as the standard input file name prefix
  -o, --only-matching       show only nonempty parts of lines that match
  -q, --quiet, --silent     suppress all normal output
      --binary-files=TYPE   assume that binary files are TYPE;
                            TYPE is 'binary', 'text', or 'without-match'
  -a, --text                equivalent to --binary-files=text
  -I                        equivalent to --binary-files=without-match
  -d, --directories=ACTION  how to handle directories;
                            ACTION is 'read', 'recurse', or 'skip'
  -D, --devices=ACTION      how to handle devices, FIFOs and sockets;
                            ACTION is 'read' or 'skip'
  -r, --recursive           like --directories=recurse
  -R, --dereference-recursive  likewise, but follow all symlinks
      --include=GLOB        search only files that match GLOB (a file pattern)
      --exclude=GLOB        skip files that match GLOB
      --exclude-from=FILE   skip files that match any file pattern from FILE
      --exclude-dir=GLOB    skip directories that match GLOB
  -L, --files-without-match  print only names of FILEs with no selected lines
  -l, --files-with-matches  print only names of FILEs with selected lines
  -c, --count               print only a count of selected lines per FILE
  -T, --initial-tab         make tabs line up (if needed)
  -Z, --null                print 0 byte after FILE name

Context control:
  -B, --before-context=NUM  print NUM lines of leading context
  -A, --after-context=NUM   print NUM lines of trailing context
  -C, --context=NUM         print NUM lines of output context
  -NUM                      same as --context=NUM
      --color[=WHEN],
      --colour[=WHEN]       use markers to highlight the matching strings;
                            WHEN is 'always', 'never', or 'auto'
  -U, --binary              do not strip CR characters at EOL (MSDOS/Windows)

When FILE is '-', read standard input.  With no FILE, read '.' if
recursive, '-' otherwise.  With fewer than two FILEs, assume -h.
Exit status is 0 if any line (or file if -L) is selected, 1 otherwise;
if any error occurs and -q is not given, the exit status is 2.

Report bugs to: bug-grep@gnu.org
GNU grep home page: <http://www.gnu.org/software/grep/>
General help using GNU software: <https://www.gnu.org/gethelp/>

Writing file

In [156]:
with open("mydata.txt", "w") as f: # the second argument "w" means open the file in write mode
    f.write("Some data on first line")
    f.write("  BTW this is also in first line")
    f.write("\n")
    f.write("second line")
In [155]:
!python cat.py mydata.txt
Some data on first line  BTW this is also in first line
second line
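A short sketch of how write mode `"w"` differs from append mode `"a"`: `"w"` empties an existing file before writing, while `"a"` keeps the contents and adds to the end. The file name and the temporary directory are assumptions made to keep the example self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "mydata.txt")

    with open(path, "w") as f:      # creates the file (or empties an existing one)
        f.write("first line\n")
    with open(path, "a") as f:      # keeps what is there and adds to the end
        f.write("second line\n")
    with open(path) as f:
        after_append = f.read()

    with open(path, "w") as f:      # opening with "w" again discards the old contents
        f.write("fresh start\n")
    with open(path) as f:
        after_rewrite = f.read()

print(after_append, end="")
print(after_rewrite, end="")
```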
In [157]:
help(open)
Help on built-in function open in module io:

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
    Open file and return a stream.  Raise OSError upon failure.
    
    file is either a text or byte string giving the name (and the path
    if the file isn't in the current working directory) of the file to
    be opened or an integer file descriptor of the file to be
    wrapped. (If a file descriptor is given, it is closed when the
    returned I/O object is closed, unless closefd is set to False.)
    
    mode is an optional string that specifies the mode in which the file
    is opened. It defaults to 'r' which means open for reading in text
    mode.  Other common values are 'w' for writing (truncating the file if
    it already exists), 'x' for creating and writing to a new file, and
    'a' for appending (which on some Unix systems, means that all writes
    append to the end of the file regardless of the current seek position).
    In text mode, if encoding is not specified the encoding used is platform
    dependent: locale.getpreferredencoding(False) is called to get the
    current locale encoding. (For reading and writing raw bytes use binary
    mode and leave encoding unspecified.) The available modes are:
    
    ========= ===============================================================
    Character Meaning
    --------- ---------------------------------------------------------------
    'r'       open for reading (default)
    'w'       open for writing, truncating the file first
    'x'       create a new file and open it for writing
    'a'       open for writing, appending to the end of the file if it exists
    'b'       binary mode
    't'       text mode (default)
    '+'       open a disk file for updating (reading and writing)
    'U'       universal newline mode (deprecated)
    ========= ===============================================================
    
    The default mode is 'rt' (open for reading text). For binary random
    access, the mode 'w+b' opens and truncates the file to 0 bytes, while
    'r+b' opens the file without truncation. The 'x' mode implies 'w' and
    raises an `FileExistsError` if the file already exists.
    
    Python distinguishes between files opened in binary and text modes,
    even when the underlying operating system doesn't. Files opened in
    binary mode (appending 'b' to the mode argument) return contents as
    bytes objects without any decoding. In text mode (the default, or when
    't' is appended to the mode argument), the contents of the file are
    returned as strings, the bytes having been first decoded using a
    platform-dependent encoding or using the specified encoding if given.
    
    'U' mode is deprecated and will raise an exception in future versions
    of Python.  It has no effect in Python 3.  Use newline to control
    universal newlines mode.
    
    buffering is an optional integer used to set the buffering policy.
    Pass 0 to switch buffering off (only allowed in binary mode), 1 to select
    line buffering (only usable in text mode), and an integer > 1 to indicate
    the size of a fixed-size chunk buffer.  When no buffering argument is
    given, the default buffering policy works as follows:
    
    * Binary files are buffered in fixed-size chunks; the size of the buffer
      is chosen using a heuristic trying to determine the underlying device's
      "block size" and falling back on `io.DEFAULT_BUFFER_SIZE`.
      On many systems, the buffer will typically be 4096 or 8192 bytes long.
    
    * "Interactive" text files (files for which isatty() returns True)
      use line buffering.  Other text files use the policy described above
      for binary files.
    
    encoding is the name of the encoding used to decode or encode the
    file. This should only be used in text mode. The default encoding is
    platform dependent, but any encoding supported by Python can be
    passed.  See the codecs module for the list of supported encodings.
    
    errors is an optional string that specifies how encoding errors are to
    be handled---this argument should not be used in binary mode. Pass
    'strict' to raise a ValueError exception if there is an encoding error
    (the default of None has the same effect), or pass 'ignore' to ignore
    errors. (Note that ignoring encoding errors can lead to data loss.)
    See the documentation for codecs.register or run 'help(codecs.Codec)'
    for a list of the permitted encoding error strings.
    
    newline controls how universal newlines works (it only applies to text
    mode). It can be None, '', '\n', '\r', and '\r\n'.  It works as
    follows:
    
    * On input, if newline is None, universal newlines mode is
      enabled. Lines in the input can end in '\n', '\r', or '\r\n', and
      these are translated into '\n' before being returned to the
      caller. If it is '', universal newline mode is enabled, but line
      endings are returned to the caller untranslated. If it has any of
      the other legal values, input lines are only terminated by the given
      string, and the line ending is returned to the caller untranslated.
    
    * On output, if newline is None, any '\n' characters written are
      translated to the system default line separator, os.linesep. If
      newline is '' or '\n', no translation takes place. If newline is any
      of the other legal values, any '\n' characters written are translated
      to the given string.
    
    If closefd is False, the underlying file descriptor will be kept open
    when the file is closed. This does not work when a file name is given
    and must be True in that case.
    
    A custom opener can be used by passing a callable as *opener*. The
    underlying file descriptor for the file object is then obtained by
    calling *opener* with (*file*, *flags*). *opener* must return an open
    file descriptor (passing os.open as *opener* results in functionality
    similar to passing None).
    
    open() returns a file object whose type depends on the mode, and
    through which the standard file operations such as reading and writing
    are performed. When open() is used to open a file in a text mode ('w',
    'r', 'wt', 'rt', etc.), it returns a TextIOWrapper. When used to open
    a file in a binary mode, the returned class varies: in read binary
    mode, it returns a BufferedReader; in write binary and append binary
    modes, it returns a BufferedWriter, and in read/write mode, it returns
    a BufferedRandom.
    
    It is also possible to use a string or bytearray as a file for both
    reading and writing. For strings StringIO can be used like a file
    opened in a text mode, and for bytes a BytesIO can be used like a file
    opened in a binary mode.
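
A small sketch of a few points from the help text above: 'x' mode refuses to overwrite an existing file, text mode returns str, and binary mode returns bytes. The file name modes_demo.txt and the temporary directory are assumptions made only for this illustration, so that no real file is touched.

```python
import os
import tempfile

# Work inside a throwaway directory (an assumption for this demo only).
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "modes_demo.txt")  # hypothetical file name

    # 'x' creates a new file and opens it for writing.
    with open(path, "x", encoding="utf-8") as f:
        f.write("hello\n")

    # A second 'x' on the same path raises FileExistsError.
    try:
        open(path, "x")
        second_x_failed = False
    except FileExistsError:
        second_x_failed = True

    # Text mode: contents are decoded and returned as str.
    with open(path, "r", encoding="utf-8") as f:
        text_type = type(f).__name__
        text = f.read()

    # Binary mode: contents are returned as bytes, without decoding.
    with open(path, "rb") as f:
        binary_type = type(f).__name__
        raw = f.read()

print(second_x_failed)
print(text_type, type(text).__name__)
print(binary_type, type(raw).__name__)
```

Passing encoding explicitly avoids depending on the platform default mentioned in the help text.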

In [158]:
with open("mydata.txt", "a") as f:  # the second argument "a" opens the file in append mode
    f.write("Some data on first line")
    f.write("  BTW this is also in first line")
    f.write("\n")
    f.write("second line")
In [159]:
!python cat.py mydata.txt
Some data on first line  BTW this is also in first line
second lineSome data on first line  BTW this is also in first line
second line
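The output above shows "second lineSome data ..." joined on one line. This is what append mode does when the same cell is run twice and the last write has no trailing "\n". A minimal sketch comparing "a" and "w" follows; the helper write_twice and the file name mydata_demo.txt are hypothetical names used only for this illustration.

```python
import os
import tempfile

def write_twice(mode):
    """Run the same two writes twice with the given mode; return the file contents."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "mydata_demo.txt")  # hypothetical file name
        for _ in range(2):
            with open(path, mode) as f:
                f.write("first line\n")
                f.write("second line")  # no trailing newline, as in the cell above
        with open(path) as f:
            return f.read()

appended = write_twice("a")     # second run is added after the first
overwritten = write_twice("w")  # second run truncates the file first

print(repr(appended))
print(repr(overwritten))
```

Ending the last write with "\n" keeps each appended run on its own lines.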
  • Write a function csvparser to load tabular data from a CSV file.
In [160]:
[1, 2, 3, 4]
Out[160]:
[1, 2, 3, 4]
In [161]:
[[1, 2, 3],
 [2, 3, 4],
 [3, 4, 5]]
Out[161]:
[[1, 2, 3], [2, 3, 4], [3, 4, 5]]
In [162]:
%%file data.csv
A,B,C
E,F,G
X,Y,Z
Writing data.csv
In [165]:
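def csvparser(filename):
    """Load a comma-separated file as a list of rows, each row a list of strings."""
    with open(filename) as f:
        data = []
        for line in f:
            # strip() drops surrounding whitespace, including the trailing newline
            row = line.strip().split(",")
            data.append(row)
        return data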
In [167]:
data = csvparser("data.csv")
In [169]:
data[0] # zeroth row
Out[169]:
['A', 'B', 'C']
In [170]:
data[1] # first row
Out[170]:
['E', 'F', 'G']
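Splitting on "," works for simple files like data.csv, but it breaks a field that itself contains a comma inside quotes. As a sketch of the alternative, the standard library csv module handles quoting. The sample text below is made up for this illustration and is not part of the exercise.

```python
import csv
import io

# Made-up sample: the second row has a quoted field containing a comma.
sample_text = 'item,note\n"apples, red",fresh\n'

# Same approach as csvparser above: strip each line and split on commas.
naive = [line.strip().split(",") for line in sample_text.splitlines()]

# csv.reader accepts any iterable of lines, here an in-memory text file.
parsed = list(csv.reader(io.StringIO(sample_text)))

print(naive[1])
print(parsed[1])
```

The naive split cuts the quoted field into two pieces, while csv.reader keeps it as one field.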
In [173]:
tables = [[i*j for i in range(1, 11)] for j in range(1, 6)]
In [174]:
tables
Out[174]:
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
 [2, 4, 6, 8, 10, 12, 14, 16, 18, 20],
 [3, 6, 9, 12, 15, 18, 21, 24, 27, 30],
 [4, 8, 12, 16, 20, 24, 28, 32, 36, 40],
 [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]]
In [176]:
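def writecsv(tabulardata, filename):
    """Write a list of rows to filename, one comma-separated line per row."""
    with open(filename, "w") as f:
        for row in tabulardata:
            # join() needs strings, so convert every item first
            strrow = [str(item) for item in row]
            f.write(",".join(strrow))
            f.write("\n")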
In [178]:
writecsv(tables, "tables.csv")
In [179]:
!python cat.py tables.csv
1,2,3,4,5,6,7,8,9,10
2,4,6,8,10,12,14,16,18,20
3,6,9,12,15,18,21,24,27,30
4,8,12,16,20,24,28,32,36,40
5,10,15,20,25,30,35,40,45,50
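
One thing to keep in mind when combining writecsv and csvparser: the numbers come back as strings, not integers. A short round-trip sketch follows, repeating both functions so it runs on its own; writing into a temporary directory is an assumption made so that the real tables.csv is not touched.

```python
import os
import tempfile

def writecsv(tabulardata, filename):
    with open(filename, "w") as f:
        for row in tabulardata:
            f.write(",".join(str(item) for item in row))
            f.write("\n")

def csvparser(filename):
    with open(filename) as f:
        return [line.strip().split(",") for line in f]

tables = [[i*j for i in range(1, 11)] for j in range(1, 6)]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "tables.csv")
    writecsv(tables, path)
    loaded = csvparser(path)

# Everything read from a text file is a string; convert back explicitly.
as_ints = [[int(item) for item in row] for row in loaded]

print(loaded[0][:3])
print(as_ints == tables)
```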
