Python Virtual Training For Arcesium - Module I - Nurturing Session¶

Jan 16-20, 2023 Vikrant Patil

All notes are available online at https://notes.pipal.in/2023/arcesium_finop_jan/

© Pipal Academy LLP

Some basics¶

In [1]:
%%file file.txt

I can create a file 
using jupyter
Writing file.txt
In [2]:
!python hello.py vikrant arg1 arg1
Hello

lists Vs str¶

In [4]:
a = [1, 2, 3, 5, 6]  # a is a list
In [5]:
a
Out[5]:
[1, 2, 3, 5, 6]
In [6]:
a[0] # you will get 1, not '['
Out[6]:
1
In [7]:
b = '[' + "1" + "," + "2" + "," + "3" + "]"
In [9]:
b # b is not a list; it is a str
Out[9]:
'[1,2,3]'
  • to work with a, you will use list methods
  • to work with b, you will use str methods
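The difference between the two can be sketched as follows. This is only an illustration: the variable names first_item, first_char and parts are made up for this example and do not come from the session.

```python
a = [1, 2, 3]      # a real list
b = "[1,2,3]"      # text that merely looks like a list

a.append(4)        # list method: changes the list in place
first_item = a[0]  # 1, the first element of the list

first_char = b[0]  # '[', the first character of the text
parts = b.strip("[]").split(",")  # str methods: ['1', '2', '3'], still text

print(a, first_item)
print(first_char, parts)
```

Note that even after splitting, the pieces of b are str values, not numbers.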
In [10]:
%%file box.py
import sys

word = sys.argv[1] # make sure that you pick the appropriate argument out of all the arguments
topline =  "+-" + "-"*len(word) + "-+"
bottomline = topline
print(topline)
print("| " + word + " |")
print(bottomline)
Writing box.py
In [11]:
!python box.py python
+--------+
| python |
+--------+
In [14]:
%%file boxthesentence.py
import sys

words = sys.argv[1:] # make sure that you pick the appropriate arguments out of all the arguments
sentence = " ".join(words)
topline =  "+-" + "-"*len(sentence) + "-+"
bottomline = topline
print(topline)
print("| " + sentence + " |")
print(bottomline)
Overwriting boxthesentence.py
In [15]:
!python boxthesentence.py This is more than just a word
+-------------------------------+
| This is more than just a word |
+-------------------------------+
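The same box logic can also be written as a function that returns the text instead of printing it, which makes it easier to reuse and to test. This is a sketch only; the function name box is an assumption and is not part of the scripts above.

```python
def box(text):
    """Return text surrounded by a box, as one multi-line string."""
    border = "+-" + "-" * len(text) + "-+"
    middle = "| " + text + " |"
    return "\n".join([border, middle, border])

print(box("python"))
print(box("This is more than just a word"))
```

A script could then call print(box(" ".join(sys.argv[1:]))) to get the behaviour of boxthesentence.py.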

A few more things to remember about sys.argv¶

  • sys.argv is a list of str
  • sys.argv[0] is always the name of the script being run
  • sys.argv[1] gives the first argument
  • sys.argv[1:] gives a list of all arguments except the script name
  • any argument taken from sys.argv is always text (str), even if it looks like a number
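Because every argument arrives as text, numeric arguments have to be converted before doing arithmetic. The sketch below uses a hand-made list in place of the real sys.argv so that it can run inside a notebook; the script name add_args.py and the function total are hypothetical.

```python
def total(argv):
    """Add up the numeric arguments of an argv-style list, skipping argv[0]."""
    result = 0
    for arg in argv[1:]:
        result = result + int(arg)  # each argument is a str until converted
    return result

# Simulates the command line: python add_args.py 10 20 30
fake_argv = ["add_args.py", "10", "20", "30"]

print(total(fake_argv))             # 60
print(fake_argv[1] + fake_argv[2])  # 1020, because + on str joins the text
```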

Code blocks from submission¶

In [ ]:
def group(a,b):
    x = list(range(int((len(a))/b),len(a),b+1))
    for num in x:
        a.insert(int(num),"/") # generally you don't modify the arguments!
    string = str(a) # avoid converting a list to str and parsing it back
    y = string.replace("[","").replace("]","")
    return y.split("/")
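Why modifying an argument is discouraged can be seen in a small sketch. Both function names below are invented for illustration; they are not from the submission.

```python
def add_marker_in_place(items):
    items.insert(0, "/")  # mutates the list that the caller passed in
    return items

def add_marker_copy(items):
    return ["/"] + items  # builds a new list; the caller's list is untouched

data = [1, 2, 3]
add_marker_in_place(data)
print(data)     # [ '/', 1, 2, 3 ] in effect: the caller's list has changed

data2 = [1, 2, 3]
result = add_marker_copy(data2)
print(data2)    # still [1, 2, 3]
print(result)   # ['/', 1, 2, 3]
```

The caller of the first function gets a changed list without asking for it, which is a common source of surprises.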
In [16]:
nums = [1, 2, 3, 4, 5, 6]
In [17]:
str(nums) # this is almost never required! 
Out[17]:
'[1, 2, 3, 4, 5, 6]'
In [18]:
def group(a,b):
    for num in a:
        b=[]
        f=[]
        h=[]
        j=[]
        tableofthree=str(b)
        tableofthree1=str(f)
        tableofthree2=str(h)
        tableofthree3=str(j)
        if len(tableofthree)<=b:
            tableofthree.append(num)
        elif len(tableofthree1)<=b:
            tableofthree1.append(num)
        elif len(tableofthree2)<=b:
            tableofthree2.append(num)
        else len(tableofthree3)<=b:
            tableofthree3.append(num)
    return join(",",tableofthree,tableofthree1,tableofthree2,tableofthree3)
  Cell In[18], line 17
    else len(tableofthree3)<=b:
         ^
SyntaxError: expected ':'
In [19]:
x = "text"
In [20]:
x.append("x")
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[20], line 1
----> 1 x.append("x")

AttributeError: 'str' object has no attribute 'append'
In [21]:
# e.g. [1, 2, 3, 4, 5, 6, 7, 8, 9] with groupsize 3 -> [[1, 2, 3], [4, 5, 6], [7, 8, 9]]


def group(items, groupsize):
    groups = []
    for start in range(0, len(items), groupsize):
        groups.append(items[start:start+groupsize])
    return groups
    
In [22]:
group([1, 2, 3, 4, 5, 6, 7, 8, 9], 4)
Out[22]:
[[1, 2, 3, 4], [5, 6, 7, 8], [9]]
In [23]:
nums
Out[23]:
[1, 2, 3, 4, 5, 6]
In [24]:
nums[0:3] # takes items from index 0 up to, but not including, index 3
Out[24]:
[1, 2, 3]
In [25]:
list(range(0, 20, 2))
Out[25]:
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
In [26]:
list(range(0, 20, 3))
Out[26]:
[0, 3, 6, 9, 12, 15, 18]
In [27]:
nums[3:6]
Out[27]:
[4, 5, 6]
In [28]:
def group(items, groupsize):
    groups = []
    for start in range(0, len(items), groupsize):
        groups.append(items[start:start+groupsize])
    return groups
    
In [29]:
items = [1, 2, 3, 4, 5, 6, 7, 8, 9]
In [30]:
list(range(0, len(items), 4))
Out[30]:
[0, 4, 8]
In [31]:
items[0:0+4]
Out[31]:
[1, 2, 3, 4]
In [32]:
items[4:4+4]
Out[32]:
[5, 6, 7, 8]
In [33]:
items[8:12]
Out[33]:
[9]
In [34]:
items[5:100] # this will not fail; a slice simply stops at the end of the list
Out[34]:
[6, 7, 8, 9]
In [35]:
items[100]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[35], line 1
----> 1 items[100]

IndexError: list index out of range
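The contrast between slicing and indexing past the end of a list can be put side by side. This is a small sketch; the names tail, empty and message are made up for the example.

```python
items = [1, 2, 3, 4, 5, 6, 7, 8, 9]

tail = items[8:12]    # slice runs past the end: gives what is there, [9]
empty = items[50:60]  # slice entirely past the end: gives an empty list

try:
    items[100]        # a single index past the end is an error
except IndexError as err:
    message = str(err)

print(tail, empty, message)
```

This forgiving behaviour of slices is what lets group return a shorter last group without any special handling.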

Method Vs Function¶

In [36]:
"-".join(["one", "two", "three"])
Out[36]:
'one-two-three'
In [37]:
join("-", ["one", "two", "three"]) # this is invalid
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[37], line 1
----> 1 join("-", ["one", "two", "three"]) # this is invalid

NameError: name 'join' is not defined
In [38]:
import os
In [43]:
os.path.join("/", "home", "vikrant") # this is a function inside os.path module
Out[43]:
'/home/vikrant'
In [44]:
"".join(["a", "b"]) # is a method inside string ""
Out[44]:
'ab'
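One way to see that join belongs to str is to reach it through the str type itself. The sketch below is illustrative only; the variable names are assumptions.

```python
import os

words = ["one", "two", "three"]

via_method = "-".join(words)      # method called on the string "-"
via_type = str.join("-", words)   # same method, reached through the str type

# os.path.join is an ordinary function that lives inside the os.path module
path = os.path.join("home", "vikrant")

print(via_method, via_type, path)
```

A method is looked up on an object (or its type), while a function is looked up by name in a module or in the current namespace, which is why a bare join(...) raises NameError.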

return statement¶

In [46]:
def group(items, groupsize):
    groups = []
    for start in range(0, len(items), groupsize):
        groups.append(items[start:start+groupsize])
    print(groups) # this will fail in unittests: the caller receives None!
    
In [47]:
def group(items, groupsize):
    groups = []
    for start in range(0, len(items), groupsize):
        groups.append(items[start:start+groupsize])
    return groups # this will pass in unittests: the caller receives the value
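The difference shows up as soon as the result is stored in a variable. In this sketch the two versions are given different names (group_print and group_return, both hypothetical) so that they can be compared directly.

```python
def group_print(items, groupsize):
    groups = []
    for start in range(0, len(items), groupsize):
        groups.append(items[start:start+groupsize])
    print(groups)   # shows the value on screen, but hands nothing back

def group_return(items, groupsize):
    groups = []
    for start in range(0, len(items), groupsize):
        groups.append(items[start:start+groupsize])
    return groups   # hands the value back to the caller

printed = group_print([1, 2, 3], 2)    # prints [[1, 2], [3]]
returned = group_return([1, 2, 3], 2)

print(printed)    # None
print(returned)   # [[1, 2], [3]]
```

A unit test compares the value a function hands back with the expected value, so the printing version always compares None.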
    
In [48]:
import os    
In [49]:
os.path.sep
Out[49]:
'/'

Daily average prices of some symbols are given for each of the five days of a week. The data is a list of records, one per symbol per day. Every record consists of (symbol, day, stock price). Write a function weekly_average to compute the weekly average for a given symbol using this data. It should work as shown below.

>>> prices = [('IBM', 'Monday', 111.71436961893693),
              ('IBM', 'Tuesday', 141.21220022208635),
              ('IBM', 'Wednesday', 112.40571010053796),
              ('IBM', 'Thursday', 137.54133351926248),
              ('IBM', 'Friday', 140.25154281801224),
              ('MICROSOFT', 'Monday', 235.0403622499107),
              ('MICROSOFT', 'Tuesday', 225.0206535036475),
              ('MICROSOFT', 'Wednesday', 216.10342426936444),
              ('MICROSOFT', 'Thursday', 200.38038844494193),
              ('MICROSOFT', 'Friday', 235.80850482793264),
              ('APPLE', 'Monday', 321.49182055844256),
              ('APPLE', 'Tuesday', 340.63612771662815),
              ('APPLE', 'Wednesday', 303.9065277507285),
              ('APPLE', 'Thursday', 338.1350605764038),
              ('APPLE', 'Friday', 318.3912296144338)]
>>> weekly_average(prices, "APPLE")
324.51215324332736
In [50]:
prices = [('IBM', 'Monday', 111.71436961893693),
              ('IBM', 'Tuesday', 141.21220022208635),
              ('IBM', 'Wednesday', 112.40571010053796),
              ('IBM', 'Thursday', 137.54133351926248),
              ('IBM', 'Friday', 140.25154281801224),
              ('MICROSOFT', 'Monday', 235.0403622499107),
              ('MICROSOFT', 'Tuesday', 225.0206535036475),
              ('MICROSOFT', 'Wednesday', 216.10342426936444),
              ('MICROSOFT', 'Thursday', 200.38038844494193),
              ('MICROSOFT', 'Friday', 235.80850482793264),
              ('APPLE', 'Monday', 321.49182055844256),
              ('APPLE', 'Tuesday', 340.63612771662815),
              ('APPLE', 'Wednesday', 303.9065277507285),
              ('APPLE', 'Thursday', 338.1350605764038),
              ('APPLE', 'Friday', 318.3912296144338)]
In [51]:
type(prices)
Out[51]:
list
In [52]:
for item in prices:
    print(item)
('IBM', 'Monday', 111.71436961893693)
('IBM', 'Tuesday', 141.21220022208635)
('IBM', 'Wednesday', 112.40571010053796)
('IBM', 'Thursday', 137.54133351926248)
('IBM', 'Friday', 140.25154281801224)
('MICROSOFT', 'Monday', 235.0403622499107)
('MICROSOFT', 'Tuesday', 225.0206535036475)
('MICROSOFT', 'Wednesday', 216.10342426936444)
('MICROSOFT', 'Thursday', 200.38038844494193)
('MICROSOFT', 'Friday', 235.80850482793264)
('APPLE', 'Monday', 321.49182055844256)
('APPLE', 'Tuesday', 340.63612771662815)
('APPLE', 'Wednesday', 303.9065277507285)
('APPLE', 'Thursday', 338.1350605764038)
('APPLE', 'Friday', 318.3912296144338)
In [53]:
for item in prices:
    print(item[0])
IBM
IBM
IBM
IBM
IBM
MICROSOFT
MICROSOFT
MICROSOFT
MICROSOFT
MICROSOFT
APPLE
APPLE
APPLE
APPLE
APPLE
In [54]:
for item in prices:
    print(item[1])
Monday
Tuesday
Wednesday
Thursday
Friday
Monday
Tuesday
Wednesday
Thursday
Friday
Monday
Tuesday
Wednesday
Thursday
Friday
In [55]:
for item in prices:
    print(item[2])
111.71436961893693
141.21220022208635
112.40571010053796
137.54133351926248
140.25154281801224
235.0403622499107
225.0206535036475
216.10342426936444
200.38038844494193
235.80850482793264
321.49182055844256
340.63612771662815
303.9065277507285
338.1350605764038
318.3912296144338
In [56]:
for ticker,day,value in prices:
    print(ticker, day, value)
IBM Monday 111.71436961893693
IBM Tuesday 141.21220022208635
IBM Wednesday 112.40571010053796
IBM Thursday 137.54133351926248
IBM Friday 140.25154281801224
MICROSOFT Monday 235.0403622499107
MICROSOFT Tuesday 225.0206535036475
MICROSOFT Wednesday 216.10342426936444
MICROSOFT Thursday 200.38038844494193
MICROSOFT Friday 235.80850482793264
APPLE Monday 321.49182055844256
APPLE Tuesday 340.63612771662815
APPLE Wednesday 303.9065277507285
APPLE Thursday 338.1350605764038
APPLE Friday 318.3912296144338
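Putting the unpacking loop to work, one possible solution to the weekly_average problem is sketched below. It is not necessarily the solution discussed in the session, and it assumes the symbol appears at least once in the data (otherwise the division fails). The list sample is a shortened copy of prices used only for this example.

```python
def weekly_average(prices, symbol):
    """Average of the price field over all records of the given symbol."""
    values = []
    for ticker, day, value in prices:   # unpack each (symbol, day, price) record
        if ticker == symbol:
            values.append(value)
    return sum(values) / len(values)    # assumes at least one matching record

sample = [('IBM', 'Monday', 111.71436961893693),
          ('APPLE', 'Monday', 321.49182055844256),
          ('APPLE', 'Tuesday', 340.63612771662815),
          ('APPLE', 'Wednesday', 303.9065277507285),
          ('APPLE', 'Thursday', 338.1350605764038),
          ('APPLE', 'Friday', 318.3912296144338)]

print(weekly_average(sample, "APPLE"))
```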
In [57]:
def findfiles(folderpath, prefix, extension):
    import os
    for file in os.listdir(folderpath):
        new_list=[]  # bug: the list is reset on every iteration, so earlier matches are lost
        if file.startswith(prefix) and file.endswith(extension):
            new_list.append(file)
               
    return new_list
In [58]:
findfiles(".", "", ".py")
Out[58]:
['box.py']
In [59]:
!ls
args.py		   Makefile	       module1-day4.org		other_files
basicstats.py	   Makefile~	       module1-day5.html	push
box.py		   module1-day1.html   module1-day5.ipynb	testdir
boxthesentence.py  module1-day1.ipynb  module1-day5.org		testdir1
file.txt	   module1-day2.html   module1-day5.org~	test.html
greeting.py	   module1-day2.ipynb  module1-nurturing.html	test.ipynb
hello1.py	   module1-day3.html   module1-nurturing.ipynb	test.txt
hello.py	   module1-day3.ipynb  mysum.py			Untitled.html
index.html	   module1-day4.html   nurturing_session1.org	users.csv~
index.ipynb	   module1-day4.ipynb  nurturing_session1.org~
In [60]:
def findfiles(folderpath, prefix, extension):
    import os
    new_list=[]
    for file in os.listdir(folderpath):
        if file.startswith(prefix) and file.endswith(extension):
            new_list.append(file)
               
    return new_list
In [61]:
findfiles(".", "", ".py")
Out[61]:
['hello1.py',
 'hello.py',
 'boxthesentence.py',
 'basicstats.py',
 'greeting.py',
 'mysum.py',
 'args.py',
 'box.py']

key argument for max/sorted¶

In [62]:
prices
Out[62]:
[('IBM', 'Monday', 111.71436961893693),
 ('IBM', 'Tuesday', 141.21220022208635),
 ('IBM', 'Wednesday', 112.40571010053796),
 ('IBM', 'Thursday', 137.54133351926248),
 ('IBM', 'Friday', 140.25154281801224),
 ('MICROSOFT', 'Monday', 235.0403622499107),
 ('MICROSOFT', 'Tuesday', 225.0206535036475),
 ('MICROSOFT', 'Wednesday', 216.10342426936444),
 ('MICROSOFT', 'Thursday', 200.38038844494193),
 ('MICROSOFT', 'Friday', 235.80850482793264),
 ('APPLE', 'Monday', 321.49182055844256),
 ('APPLE', 'Tuesday', 340.63612771662815),
 ('APPLE', 'Wednesday', 303.9065277507285),
 ('APPLE', 'Thursday', 338.1350605764038),
 ('APPLE', 'Friday', 318.3912296144338)]
In [63]:
def get_value(record):
    return record[2]

max(prices, key=get_value)
Out[63]:
('APPLE', 'Tuesday', 340.63612771662815)
In [64]:
sorted(prices, key=get_value)
Out[64]:
[('IBM', 'Monday', 111.71436961893693),
 ('IBM', 'Wednesday', 112.40571010053796),
 ('IBM', 'Thursday', 137.54133351926248),
 ('IBM', 'Friday', 140.25154281801224),
 ('IBM', 'Tuesday', 141.21220022208635),
 ('MICROSOFT', 'Thursday', 200.38038844494193),
 ('MICROSOFT', 'Wednesday', 216.10342426936444),
 ('MICROSOFT', 'Tuesday', 225.0206535036475),
 ('MICROSOFT', 'Monday', 235.0403622499107),
 ('MICROSOFT', 'Friday', 235.80850482793264),
 ('APPLE', 'Wednesday', 303.9065277507285),
 ('APPLE', 'Friday', 318.3912296144338),
 ('APPLE', 'Monday', 321.49182055844256),
 ('APPLE', 'Thursday', 338.1350605764038),
 ('APPLE', 'Tuesday', 340.63612771662815)]
In [65]:
words = "one two three four five six seven eight nine ten".split()
In [66]:
words
Out[66]:
['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten']
In [67]:
max(words, key=len)
Out[67]:
'three'
In [68]:
sorted(words)
Out[68]:
['eight', 'five', 'four', 'nine', 'one', 'seven', 'six', 'ten', 'three', 'two']
In [69]:
sorted(words, key=len)
Out[69]:
['one', 'two', 'six', 'ten', 'four', 'five', 'nine', 'three', 'seven', 'eight']
In [70]:
add
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[70], line 1
----> 1 add

NameError: name 'add' is not defined
In [71]:
def add(a, b):
    return a + b
In [73]:
add # this is a variable which happens to refer to a function
Out[73]:
<function __main__.add(a, b)>
In [74]:
add
Out[74]:
<function __main__.add(a, b)>
In [75]:
print(add)
<function add at 0x7f8f467dc280>
In [76]:
def sumof(x, y, func):
    return func(x) + func(y)
In [77]:
def square(x):
    return x*x
In [78]:
sumof(4, 5, square)
Out[78]:
41
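Since a function is just a value, the same sumof works with any one-argument function, including built-in ones. The function cube below is an extra example that is not part of the session.

```python
def sumof(x, y, func):
    return func(x) + func(y)

def cube(x):
    return x * x * x

print(sumof(2, 3, cube))        # 8 + 27 = 35
print(sumof(-4, 5, abs))        # 4 + 5 = 9
print(sumof("ab", "cde", len))  # 2 + 3 = 5
```

In each call the function is passed by name, without parentheses; sumof is the one that calls it.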
In [82]:
max(words, key=len) # finding the max needs a loop; the built-in max runs that loop for us
Out[82]:
'three'
In [80]:
sorted(prices, key=get_value) # passing the function itself (no parentheses); sorted calls it for each record
Out[80]:
[('IBM', 'Monday', 111.71436961893693),
 ('IBM', 'Wednesday', 112.40571010053796),
 ('IBM', 'Thursday', 137.54133351926248),
 ('IBM', 'Friday', 140.25154281801224),
 ('IBM', 'Tuesday', 141.21220022208635),
 ('MICROSOFT', 'Thursday', 200.38038844494193),
 ('MICROSOFT', 'Wednesday', 216.10342426936444),
 ('MICROSOFT', 'Tuesday', 225.0206535036475),
 ('MICROSOFT', 'Monday', 235.0403622499107),
 ('MICROSOFT', 'Friday', 235.80850482793264),
 ('APPLE', 'Wednesday', 303.9065277507285),
 ('APPLE', 'Friday', 318.3912296144338),
 ('APPLE', 'Monday', 321.49182055844256),
 ('APPLE', 'Thursday', 338.1350605764038),
 ('APPLE', 'Tuesday', 340.63612771662815)]
In [81]:
get_value(('IBM', 'Monday', 111.71436961893693))
Out[81]:
111.71436961893693
In [83]:
records = [
  ("TATA", 200.0, 5.5),
  ("INFY", 2000.0, -5),
  ("RELIANCE", 1505.5, 50.0),
  ("HCL", 1200, 70.5)
]
In [84]:
def get_value(r):
    return r[1]

def get_gain(r):
    return r[2]

sorted(records, key=get_value)
Out[84]:
[('TATA', 200.0, 5.5),
 ('HCL', 1200, 70.5),
 ('RELIANCE', 1505.5, 50.0),
 ('INFY', 2000.0, -5)]
In [85]:
sorted(records, key=get_gain)
Out[85]:
[('INFY', 2000.0, -5),
 ('TATA', 200.0, 5.5),
 ('RELIANCE', 1505.5, 50.0),
 ('HCL', 1200, 70.5)]
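For simple field lookups like get_value and get_gain, the standard library offers operator.itemgetter, and a lambda can be used for a one-off key. The sketch below shows both as alternatives, not as the approach taken in the session; the variable names are made up.

```python
from operator import itemgetter

records = [
  ("TATA", 200.0, 5.5),
  ("INFY", 2000.0, -5),
  ("RELIANCE", 1505.5, 50.0),
  ("HCL", 1200, 70.5)
]

# itemgetter(2) builds a function that returns record[2], like get_gain
by_gain_desc = sorted(records, key=itemgetter(2), reverse=True)

best_gain = max(records, key=lambda r: r[2])  # lambda: a small unnamed function
cheapest = min(records, key=itemgetter(1))    # min accepts key as well

print(by_gain_desc)
print(best_gain, cheapest)
```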

A clean room has to maintain a constant temperature, with some margin allowed depending on the experiment. A list holds the deviations of the room temperature recorded during one experiment. We have to find the maximum deviation. Write a function key that can be passed as the key argument to max to find the largest temperature fluctuation, regardless of sign, in a list of values.

>>> fluctuations = [1,1,-1,2,-5,3,4,1,2,1]
>>> max(fluctuations, key=key)
-5
In [86]:
fluctuations = [1,1,-1,2,-5,3,4,1,2,1]
In [87]:
max(fluctuations)
Out[87]:
4
In [88]:
sorted(fluctuations)
Out[88]:
[-5, -1, 1, 1, 1, 1, 2, 2, 3, 4]
In [89]:
abs(-5)
Out[89]:
5
In [91]:
max(fluctuations, key=abs) # the key function need not pick a field by index; any one-argument function works
Out[91]:
-5
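The exercise asks for a function named key; a minimal version simply wraps abs. The sketch also shows the same key used with min and sorted, which is an addition beyond the exercise.

```python
def key(value):
    """Size of a fluctuation, ignoring its sign."""
    return abs(value)

fluctuations = [1, 1, -1, 2, -5, 3, 4, 1, 2, 1]

largest = max(fluctuations, key=key)     # -5: the original value is returned
smallest = min(fluctuations, key=key)    # 1: the first value of smallest size
by_size = sorted(fluctuations, key=key)  # ordered by size, signs kept

print(largest, smallest)
print(by_size)
```

The key function is used only for comparison; the values that come back are the original items, which is why the answer is -5 and not 5.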
In [92]:
turnover = ["1B", "1.2B", "1000K", "3M"]
In [104]:
def convert(entry):
    if entry.endswith("B"):
        return float(entry[:-1])*1e9
    elif entry.endswith("M"):
        return float(entry[:-1])*1e6
    elif entry.endswith("K"):
        return float(entry[:-1])*1000
    else:
        raise Exception("Invalid suffix") 
In [100]:
max(turnover) # it is text, so we get the maximum by alphabetical order
Out[100]:
'3M'
In [101]:
max(turnover, key=convert)
Out[101]:
'1.2B'
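The chain of elif branches in convert can also be expressed with a dictionary of multipliers. This is an alternative sketch, not the version written in the session: the name MULTIPLIERS is an assumption, it raises ValueError rather than Exception, and it assumes every entry is a non-empty string.

```python
# Assumed lookup table: suffix -> multiplier
MULTIPLIERS = {"K": 1e3, "M": 1e6, "B": 1e9}

def convert(entry):
    """Convert text such as '1.2B' into a number."""
    suffix = entry[-1]               # assumes entry is not empty
    if suffix not in MULTIPLIERS:
        raise ValueError("Invalid suffix: " + suffix)
    return float(entry[:-1]) * MULTIPLIERS[suffix]

turnover = ["1B", "1.2B", "1000K", "3M"]

print(max(turnover, key=convert))     # '1.2B'
print(sorted(turnover, key=convert))  # ['1000K', '3M', '1B', '1.2B']
```

Adding a new suffix then means adding one entry to the dictionary instead of another elif branch.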
In [102]:
In [103]:
convert("34Z")
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
Cell In[103], line 1
----> 1 convert("34Z")

Cell In[99], line 9, in convert(entry)
      7     return float(entry[:-1])*1000
      8 else:
----> 9     raise Exception("Invalid suffix")

Exception: Invalid suffix