indexdata = [('IBM', 'Monday', 111.71436961893693),
('IBM', 'Tuesday', 141.21220022208635),
('IBM', 'Wednesday', 112.40571010053796),
('IBM', 'Thursday', 137.54133351926248),
('IBM', 'Friday', 140.25154281801224),
('MICROSOFT', 'Monday', 235.0403622499107),
('MICROSOFT', 'Tuesday', 225.0206535036475),
('MICROSOFT', 'Wednesday', 216.10342426936444),
('MICROSOFT', 'Thursday', 200.38038844494193),
('MICROSOFT', 'Friday', 235.80850482793264),
('APPLE', 'Monday', 321.49182055844256),
('APPLE', 'Tuesday', 340.63612771662815),
('APPLE', 'Wednesday', 303.9065277507285),
('APPLE', 'Thursday', 338.1350605764038),
('APPLE', 'Friday', 318.3912296144338)]
Module 2 - Day 2
Log in to the Lab using your credentials. A notebook named 2-2.ipynb has already been created for you. Open it and use it for today’s training.
Shut down all previous notebooks.
problems
- Write a list comprehension that finds the data for a given day (Monday)
- Write a function to find the weekly maximum for a given symbol
[item for item in indexdata if item[1]=="Monday"]
[('IBM', 'Monday', 111.71436961893693),
('MICROSOFT', 'Monday', 235.0403622499107),
('APPLE', 'Monday', 321.49182055844256)]
[(name, day, price) for name, day, price in indexdata if day=="Monday"]
[('IBM', 'Monday', 111.71436961893693),
('MICROSOFT', 'Monday', 235.0403622499107),
('APPLE', 'Monday', 321.49182055844256)]
[price for name, day, price in indexdata if name=="IBM"]
[111.71436961893693,
141.21220022208635,
112.40571010053796,
137.54133351926248,
140.25154281801224]
max([price for name, day, price in indexdata if name=="IBM"])
141.21220022208635
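The second problem asks for a function, while the cells above only build up to a single expression. A minimal sketch of how that expression might be wrapped follows. The name `weekly_max` and the shortened `sample` list (rows copied from `indexdata`) are assumptions made for illustration.

```python
# A shortened copy of indexdata, so the example runs on its own.
sample = [
    ("IBM", "Monday", 111.71436961893693),
    ("IBM", "Tuesday", 141.21220022208635),
    ("IBM", "Wednesday", 112.40571010053796),
    ("MICROSOFT", "Monday", 235.0403622499107),
    ("MICROSOFT", "Tuesday", 225.0206535036475),
]


def weekly_max(data, symbol):
    """Return the highest price recorded for symbol, or None if it is absent."""
    prices = [price for name, day, price in data if name == symbol]
    # default=None avoids a ValueError when no row matches the symbol
    return max(prices, default=None)


print(weekly_max(sample, "IBM"))
```

Returning `None` for an unknown symbol is one possible design choice; raising an error would be equally reasonable.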
Reading files from Python
%%file zen.txt
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
Writing zen.txt
with open("zen.txt") as filehandle:
    filedata = filehandle.read() # this will read complete file
    print(filedata)
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
for i in range(5):
    x = i*i
    y = x + 2
    print(y)
2
3
6
11
18
with open("zen.txt") as filehandle:
    firstline = filehandle.readline() # this will read only one line
    secondline = filehandle.readline() # this will read next line
    print(firstline)
    print(secondline)
The Zen of Python, by Tim Peters
filehandle.readline() # file is closed after with block!
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[11], line 1
----> 1 filehandle.readline()

ValueError: I/O operation on closed file.
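The error above comes from the `with` block closing the file on exit. As a small sketch, the `closed` attribute of a file object can be used to observe this. The temporary file and the variable names here are assumptions for illustration, chosen so the example does not depend on zen.txt.

```python
import os
import tempfile

# Create a small throwaway file so the example runs on its own.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as out:
    out.write("first line\nsecond line\n")

with open(path) as handle:
    closed_inside = handle.closed   # False: the file is open inside the block
    first = handle.readline()

closed_outside = handle.closed      # True: leaving the with block closed the file

try:
    handle.readline()               # reading after the block fails
except ValueError as exc:
    error_message = str(exc)

print(closed_inside, closed_outside, error_message)
```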
with open("zen.txt") as filehandle:
    print(filehandle.read())
    print("Next print", filehandle.read()) # the second read() returns an empty string
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
Next print
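The second `read()` returns an empty string because the first one left the file position at the end of the file. One way to read the contents again without reopening is `seek(0)`, sketched below with a temporary file (the file name and variable names are assumptions for illustration).

```python
import os
import tempfile

# Throwaway file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as out:
    out.write("first line\nsecond line\n")

with open(path) as handle:
    first_read = handle.read()    # whole file; position is now at the end
    second_read = handle.read()   # nothing left to read, so this is ""
    handle.seek(0)                # move the position back to the start
    third_read = handle.read()    # whole file again

print(repr(second_read))
print(third_read == first_read)
```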
with open("zen.txt") as f:
    for line in f:
        print(line)
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
with open("zen.txt") as f:
    for line in f:
        print(line, end="") # the line will have its own \n char at end
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
problem
- Write a function print_with_linenums which takes a text file name/path as an argument and prints the contents of the file with a line number at the start of each line
def print_with_linenums(filename):
    with open(filename) as handle:
        for linenum, line in enumerate(handle, start=1):
            print(linenum, line, end="")
print_with_linenums("zen.txt")
1 The Zen of Python, by Tim Peters
2
3 Beautiful is better than ugly.
4 Explicit is better than implicit.
5 Simple is better than complex.
6 Complex is better than complicated.
7 Flat is better than nested.
8 Sparse is better than dense.
9 Readability counts.
10 Special cases aren't special enough to break the rules.
11 Although practicality beats purity.
12 Errors should never pass silently.
13 Unless explicitly silenced.
14 In the face of ambiguity, refuse the temptation to guess.
15 There should be one-- and preferably only one --obvious way to do it.
16 Although that way may not be obvious at first unless you're Dutch.
17 Now is better than never.
18 Although never is often better than *right* now.
19 If the implementation is hard to explain, it's a bad idea.
20 If the implementation is easy to explain, it may be a good idea.
21 Namespaces are one honking great idea -- let's do more of those!
!pwd # pwd is a Unix command that prints the current working directory
/opt/arcesium-python-2024-june
print_with_linenums("/opt/arcesium-python-2024-june/zen.txt") # this is an absolute path
1 The Zen of Python, by Tim Peters
2
3 Beautiful is better than ugly.
4 Explicit is better than implicit.
5 Simple is better than complex.
6 Complex is better than complicated.
7 Flat is better than nested.
8 Sparse is better than dense.
9 Readability counts.
10 Special cases aren't special enough to break the rules.
11 Although practicality beats purity.
12 Errors should never pass silently.
13 Unless explicitly silenced.
14 In the face of ambiguity, refuse the temptation to guess.
15 There should be one-- and preferably only one --obvious way to do it.
16 Although that way may not be obvious at first unless you're Dutch.
17 Now is better than never.
18 Although never is often better than *right* now.
19 If the implementation is hard to explain, it's a bad idea.
20 If the implementation is easy to explain, it may be a good idea.
21 Namespaces are one honking great idea -- let's do more of those!
print_with_linenums("zen.txt")
1 The Zen of Python, by Tim Peters
2
3 Beautiful is better than ugly.
4 Explicit is better than implicit.
5 Simple is better than complex.
6 Complex is better than complicated.
7 Flat is better than nested.
8 Sparse is better than dense.
9 Readability counts.
10 Special cases aren't special enough to break the rules.
11 Although practicality beats purity.
12 Errors should never pass silently.
13 Unless explicitly silenced.
14 In the face of ambiguity, refuse the temptation to guess.
15 There should be one-- and preferably only one --obvious way to do it.
16 Although that way may not be obvious at first unless you're Dutch.
17 Now is better than never.
18 Although never is often better than *right* now.
19 If the implementation is hard to explain, it's a bad idea.
20 If the implementation is easy to explain, it may be a good idea.
21 Namespaces are one honking great idea -- let's do more of those!
print_with_linenums("testfolder/hello.txt") # relative path
1 hello world!
print_with_linenums("/opt/arcesium-python-2024-june/testfolder/hello.txt") # absolute path
1 hello world!
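A relative path is resolved against the current working directory, which is why both calls above open the same file. The following sketch shows how the standard `os.path` functions relate the two forms. It only builds path strings and does not require the file to exist.

```python
import os

relative = os.path.join("testfolder", "hello.txt")

# abspath joins the relative path onto the current working directory
absolute = os.path.abspath(relative)

print(os.getcwd())
print(os.path.isabs(relative))   # False
print(os.path.isabs(absolute))   # True
```

Because the result depends on the working directory, the same relative path can point to different files when a script is started from a different folder.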
Parsing data from file
%%file salary.txt
100000
121323
200000
340000
150000
Writing salary.txt
with open("salary.txt") as f:
    data = []
    for line in f:
        data.append(line)
data
['100000\n', '121323\n', '200000\n', '340000\n', '150000\n']
with open("salary.txt") as f:
    data = []
    for line in f:
        data.append(line.strip()) # strip() removes leading and trailing whitespace, including \n
data # data is text!
['100000', '121323', '200000', '340000', '150000']
sum(data)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[30], line 1
----> 1 sum(data)

TypeError: unsupported operand type(s) for +: 'int' and 'str'
def read_int_list(filename):
    with open(filename) as f:
        data = []
        for line in f:
            n = line.strip()
            n = int(n)
            data.append(n)
    return data
read_int_list("salary.txt")
[100000, 121323, 200000, 340000, 150000]
def read_int_list(filename):
    with open(filename) as f:
        return [int(line.strip()) for line in f]
read_int_list("salary.txt")
[100000, 121323, 200000, 340000, 150000]
salaries = read_int_list("salary.txt")
max(salaries)
340000
sum(salaries)
911323
problem
- Parse integers from a row given in a file. Write a function parse_row_as_ints to do this.
%%file salary.csv
11111,22222,33333,40000,50000
Writing salary.csv
bonus problem
- Parse CSV tabular data as a list of lists of integers (a 2D list). Write a function parseints_from_csv to do this.
%%file tabular.csv
1,2,3,4,5
21,22,23,24,25,
31,32,33,34,35
Overwriting tabular.csv
[[1,2,3,4,5],
[21,22,23,24,25],
[31,32,33,34,35]]
"hello this is a statement".split(" ")
['hello', 'this', 'is', 'a', 'statement']
"121,232,23232".split(",")
['121', '232', '23232']
[int(token) for token in "121,232,23232".split(",")]
[121, 232, 23232]
f = open("salary.csv")
data = f.read()
data
'11111,22222,33333,40000,50000\n'
data.strip()
'11111,22222,33333,40000,50000'
data.strip().split(",")
['11111', '22222', '33333', '40000', '50000']
[ int(i) for i in data.strip().split(",")]
[11111, 22222, 33333, 40000, 50000]
f.close() # because we did not open the file using a with statement
def parse_row_as_ints(filename):
    with open(filename) as f:
        textlist = f.read().strip().split(",")
    return [int(t) for t in textlist]
parse_row_as_ints("salary.csv")
[11111, 22222, 33333, 40000, 50000]
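def square(nums):
    data = []
    for i in nums:
        data.append(i*i)
        return data # return is inside the loop, so the function exits after the first item
square(range(5))
[0]
def square(nums):
    data = []
    for i in nums:
        data.append(i*i)
    return data # return runs only after the loop has finished
square(range(5))
[0, 1, 4, 9, 16]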
%%file tabular.csv
1,2,3,4,5
21,22,23,24,25,
31,32,33,34,35
Overwriting tabular.csv
def process_line(line):
    textlist = line.strip().split(",")
    return [int(t) for t in textlist]

def parseints_from_csv(filename):
    with open(filename) as f:
        rows = []
        for line in f:
            rows.append(process_line(line))
    return rows
parseints_from_csv("tabular.csv")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[63], line 1
----> 1 parseints_from_csv("tabular.csv")

Cell In[62], line 9, in parseints_from_csv(filename)
      7         rows = []
      8         for line in f:
----> 9             rows.append(process_line(line))
     10     return rows

Cell In[62], line 3, in process_line(line)
      1 def process_line(line):
      2     textlist = line.strip().split(",")
----> 3     return [int(t) for t in textlist]

Cell In[62], line 3, in <listcomp>(.0)
      1 def process_line(line):
      2     textlist = line.strip().split(",")
----> 3     return [int(t) for t in textlist]

ValueError: invalid literal for int() with base 10: ''
"21,22,23,24,25,".strip().split(",")
['21', '22', '23', '24', '25', '']
%%file tabular1.csv
1,2,3,4,5
21,22,23,24,25
31,32,33,34,35
Writing tabular1.csv
parseints_from_csv("tabular1.csv")
[[1, 2, 3, 4, 5], [21, 22, 23, 24, 25], [31, 32, 33, 34, 35]]
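The trailing comma in tabular.csv produced an empty string after the split, and that empty string is what `int()` rejected. If fixing the file is not an option, one possible alternative is to skip empty tokens while parsing. This is a minimal sketch; `process_line_lenient` is a hypothetical name and not part of the training material.

```python
def process_line_lenient(line):
    """Parse a comma-separated line into ints, ignoring empty fields."""
    tokens = line.strip().split(",")
    # t.strip() is an empty (falsy) string for blank fields, so they are skipped
    return [int(t) for t in tokens if t.strip()]


print(process_line_lenient("21,22,23,24,25,\n"))
```

Silently dropping empty fields can hide values that are really missing, so this is only reasonable when a stray trailing comma is the expected cause.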
def process_line(line):
    textlist = line.strip().split(",")
    return [int(t) for t in textlist]

def parseints_from_csv(filename):
    with open(filename) as f:
        return [process_line(line) for line in f]
parseints_from_csv("tabular1.csv")
[[1, 2, 3, 4, 5], [21, 22, 23, 24, 25], [31, 32, 33, 34, 35]]
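Python's standard library also ships a `csv` module that takes care of the splitting. The sketch below is an alternative to the hand-written `process_line`, not the approach used in this training. The function name `parseints_with_csv_module` and the temporary file are assumptions for illustration.

```python
import csv
import os
import tempfile


def parseints_with_csv_module(filename):
    """Read a CSV file of integers into a list of lists."""
    # newline="" is what the csv documentation recommends when opening files
    with open(filename, newline="") as f:
        return [[int(cell) for cell in row] for row in csv.reader(f)]


# Write the same content as tabular1.csv to a throwaway location.
path = os.path.join(tempfile.mkdtemp(), "tabular1.csv")
with open(path, "w") as f:
    f.write("1,2,3,4,5\n21,22,23,24,25\n31,32,33,34,35\n")

print(parseints_with_csv_module(path))
```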
Writing text files using Python
with open("out.txt", "w") as fhandle: # write mode
    fhandle.write("Hello there!")
    fhandle.write("is this second line?")
with open("out.txt", "w") as fhandle: # writing again will overwrite the file!
    fhandle.write("Hello there!")
    fhandle.write("\n") # unless we write \n, it won't be there in the file!
    fhandle.write("is this second line?")
nums = [1, 2, 3, 4, 5]
def write_list_to_file(listdata, filename):
    with open(filename, "w") as f:
        for item in listdata:
            f.write(str(item))
            f.write("\n")
write_list_to_file(nums, "nums.txt")
!cat nums.txt
1
2
3
4
5
%%file cat.py
import sys

def print_file(filename):
    with open(filename) as f:
        for line in f:
            print(line, end="")

filename = sys.argv[1]
print_file(filename)
Writing cat.py
!python cat.py nums.txt
1
2
3
4
5
with open("nums.txt", "a") as f: # this will append to existing file
    f.write("6")
!python cat.py nums.txt
1
2
3
4
5
6
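Note that `f.write("6")` adds no newline, so the file now ends without one and a later append would continue on the same line. Below is a small sketch of appending whole lines instead, using a temporary file and a hypothetical helper named `append_line`.

```python
import os
import tempfile


def append_line(filename, text):
    """Append text to filename as a line of its own."""
    with open(filename, "a") as f:   # "a" keeps the existing content
        f.write(f"{text}\n")


# Start from a throwaway file that already holds two lines.
path = os.path.join(tempfile.mkdtemp(), "nums.txt")
with open(path, "w") as f:
    f.write("1\n2\n")

append_line(path, 3)
append_line(path, 4)

with open(path) as f:
    lines = f.read().splitlines()
print(lines)
```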
problem
- Data is given as a list; write it into a file with each item on its own row. Write a function write_column for this.
>>> write_column(listdata, filename)
nums = [1, 2, 3, 43, 4, 6]
String formatting
x = 35
f"The value of x is {x}" # format string
'The value of x is 35'
"The value of x is " + str(x)
'The value of x is 35'
f"The value of x is {x}"
'The value of x is 35'
def process_item(item):
    return f"{item}\n" # same result as str(item) + "\n"

def write_column(data, filename):
    with open(filename, "w") as f:
        for item in data:
            f.write(process_item(item))

write_column(nums, "n.txt")
!python cat.py n.txt
1
2
3
43
4
6
data = [[1, 2, 3, 4],
        [21, 22, 23, 24],
        [31, 32, 33, 34],
        [41, 42, 43, 44]]
def process_row(row):
    textrow = [f"{item}" for item in row]
    return ",".join(textrow)

def write_csv(data, filename):
    with open(filename, "w") as f:
        for row in data:
            f.write(process_row(row))
            f.write("\n")
words = ["one", "two", "three"]
",".join(words)
'one,two,three'
write_csv(data, "csvdata.csv")
!python cat.py csvdata.csv
1,2,3,4
21,22,23,24
31,32,33,34
41,42,43,44
%%file stocks.csv
symbol,high,low,gain
IBM,123,122,3
AGG,232,232,0
CAC,231,215,-3
Writing stocks.csv
def process_remaining_csvdata(fhandle):
    return [line.strip().split(",") for line in fhandle]

with open("stocks.csv") as f:
    headers = f.readline().strip().split(",")
    data = process_remaining_csvdata(f)
headers
['symbol', 'high', 'low', 'gain']
data
[['IBM', '123', '122', '3'],
['AGG', '232', '232', '0'],
['CAC', '231', '215', '-3']]
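The rows above are still lists of strings, separate from their headers. One possible next step is to pair each row with the headers using `zip` and convert the numeric columns. This is a sketch only: the helper name `to_record` and the choice of which columns to convert are assumptions, and the data is copied from the output above so the example runs on its own.

```python
headers = ["symbol", "high", "low", "gain"]
rows = [
    ["IBM", "123", "122", "3"],
    ["AGG", "232", "232", "0"],
    ["CAC", "231", "215", "-3"],
]


def to_record(headers, row, int_columns=("high", "low", "gain")):
    """Build a dict from one row, converting the listed columns to int."""
    record = dict(zip(headers, row))
    for key in int_columns:
        record[key] = int(record[key])
    return record


records = [to_record(headers, row) for row in rows]
print(records[0])
print(max(record["high"] for record in records))
```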
String formatting - more
x, y = 10, 20
f"value x = {x} and value of y = {y}"
'value x = 10 and value of y = 20'
"value of x = {0} and value of y = {1}".format(30, 50)
'value of x = 30 and value of y = 50'
tables = [ [n*i for i in range(1, 11) ] for n in range(1, 6)]
tables
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[2, 4, 6, 8, 10, 12, 14, 16, 18, 20],
[3, 6, 9, 12, 15, 18, 21, 24, 27, 30],
[4, 8, 12, 16, 20, 24, 28, 32, 36, 40],
[5, 10, 15, 20, 25, 30, 35, 40, 45, 50]]
for t in tables[0]:
    print(t)
1
2
3
4
5
6
7
8
9
10
for t in tables[0]:
    print(f"{t:2d}")
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
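The `2d` format spec pads each integer to a width of two characters, which right-aligns the column. The same idea can be applied to a whole row so that all five tables line up. This is a minimal sketch; the helper name `format_row` and the default width of 3 are assumptions.

```python
tables = [[n * i for i in range(1, 11)] for n in range(1, 6)]


def format_row(row, width=3):
    """Right-align every integer in row to the same width."""
    # the width itself can be filled in with a nested replacement field
    return " ".join(f"{value:{width}d}" for value in row)


lines = [format_row(row) for row in tables]
for line in lines:
    print(line)
```

A width of 3 is enough here because the largest value in `tables` is 50. Larger numbers would need a larger width to stay aligned.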