Python Virtual Training For Arcesium - Module III - Nurturing Session

Sep 9, 2022 Vikrant Patil

All notes are available online at https://notes.pipal.in/2022/arcesium_finop_batch1/

Please accept the invitation that you have received in your email and log in to

https://engage.pipal.in/

© Pipal Academy LLP

problem 1

Write a function combined_daily_time_series that combines stock data from www.alphavantage.co for a given list of symbols and returns a dataframe. The dataframe should have the numeric columns open, high, low, close and volume, plus a symbol column. Here are some hints: have a look at pd.concat, DataFrame.rename and pd.to_numeric.

Also write a function get_total_volume which takes the dataframe generated by the above function and returns a series that has the symbols as row labels and the total volume for each symbol as values.

>>> df = combined_daily_time_series(["AAPL","IBM"])
>>> df
                        open      high       low     close  volume  symbol
2022-08-25 20:00:00  170.290  170.3500  170.2000  170.2000   11160    AAPL
2022-08-25 20:15:00  170.290  170.3500  170.2000  170.2000   11160    AAPL
2022-08-25 20:30:00  170.290  170.3500  170.2000  170.2000   11160    AAPL
.
.
2022-08-25 20:00:00  170.290  170.3500  170.2000  170.2000   11160     IBM
2022-08-25 20:15:00  170.290  170.3500  170.2000  170.2000   11160     IBM

200 rows × 6 columns

>>> get_total_volume(df)
symbol
AAPL    77772389
IBM      7689538
Name: volume, dtype: int64
In [2]:
import pandas as pd

def download_for_single_ticker(ticker):
    # same code as what we did during the session
    pass


def clean_data(data, ticker):
    ## convert column types to numerical type
    ## and add a column for the ticker
    pass

def combined_daily_time_series(tickers):
    dfs = []
    for ticker in tickers:
        # what we downloaded during the session was very raw data,
        # so clean it before combining
        data = download_for_single_ticker(ticker)
        dfs.append(clean_data(data, ticker))

    return pd.concat(dfs)

def get_total_volume(combined_data):
    pass
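
The cleaning step hinted at above could look roughly like the sketch below. It is only an illustration: the raw column names ("1. open", "2. high", ...), the extra `ticker` parameter and the sample values are assumptions, so adjust them to whatever the download step actually returns.

```python
import pandas as pd

NUMERIC_COLUMNS = ["open", "high", "low", "close", "volume"]

def clean_data(data, ticker):
    """Return a new frame with short column names, numeric dtypes and a symbol column.

    Assumes raw column names look like "1. open", "2. high", ... (hypothetical here).
    """
    # "1. open" -> "open": keep the part after the first ". " if there is one
    renamed = data.rename(columns=lambda name: name.split(". ", 1)[-1])
    for column in NUMERIC_COLUMNS:
        renamed[column] = pd.to_numeric(renamed[column])
    renamed["symbol"] = ticker
    return renamed

# Made-up raw data: every value arrives as a string
raw = pd.DataFrame(
    {"1. open": ["170.29", "170.35"],
     "2. high": ["170.35", "170.40"],
     "3. low": ["170.20", "170.10"],
     "4. close": ["170.20", "170.30"],
     "5. volume": ["11160", "20500"]},
    index=["2022-08-25 20:00:00", "2022-08-25 20:15:00"],
)
cleaned = clean_data(raw, "AAPL")
print(cleaned.dtypes)
```

Because `rename` returns a new dataframe, the raw data is left untouched and the cleaning can be rerun safely.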
In [7]:
import pandas as pd
df = pd.DataFrame({"a":range(10),
             "b":list("aaabbbcccc")})
df
Out[7]:
a b
0 0 a
1 1 a
2 2 a
3 3 b
4 4 b
5 5 b
6 6 c
7 7 c
8 8 c
9 9 c
In [10]:
df.groupby("b").sum()['a']
Out[10]:
b
a     3
b    12
c    30
Name: a, dtype: int64
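
As a small variation on the cell above, the column can be selected before aggregating, so that only the column of interest is summed. A sketch on the same toy data:

```python
import pandas as pd

df = pd.DataFrame({"a": range(10), "b": list("aaabbbcccc")})

# Select column "a" first, then aggregate only that column per group.
totals = df.groupby("b")["a"].sum()
print(totals)
```

The result is the same series as before; the difference is that any other columns in the dataframe are never aggregated at all.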
In [11]:
import pandas as pd
df = pd.DataFrame({"a":[str(i) for i in range(10)],
             "b":list("aaabbbcccc")})
df
Out[11]:
a b
0 0 a
1 1 a
2 2 a
3 3 b
4 4 b
5 5 b
6 6 c
7 7 c
8 8 c
9 9 c
In [12]:
df.a.describe()
Out[12]:
count     10
unique    10
top        0
freq       1
Name: a, dtype: object
In [13]:
pd.to_numeric(df.a)
Out[13]:
0    0
1    1
2    2
3    3
4    4
5    5
6    6
7    7
8    8
9    9
Name: a, dtype: int64
In [14]:
df['a'] = pd.to_numeric(df['a'])
In [15]:
df.a.describe()
Out[15]:
count    10.00000
mean      4.50000
std       3.02765
min       0.00000
25%       2.25000
50%       4.50000
75%       6.75000
max       9.00000
Name: a, dtype: float64
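
When several columns need converting at once, one possible approach is to apply pd.to_numeric column by column. The sketch below uses made-up data; `errors="coerce"` is an optional choice that turns unparseable values into NaN instead of raising an error.

```python
import pandas as pd

df = pd.DataFrame({
    "open": ["1.5", "2.5"],
    "volume": ["100", "n/a"],   # "n/a" cannot be parsed as a number
    "symbol": ["IBM", "IBM"],
})

numeric_columns = ["open", "volume"]
# apply calls pd.to_numeric once per selected column;
# errors="coerce" replaces values that cannot be parsed with NaN.
df[numeric_columns] = df[numeric_columns].apply(pd.to_numeric, errors="coerce")
print(df.dtypes)
```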
In [18]:
combined_data = pd.DataFrame({"ticker" : ["IBM"]*5 + ["AAPL"]*5,
                              "volume" : range(1200, 1210),
                             "open":range(10),
                              "high":range(10),
                              "low":range(10),
                             "close":range(10),
                             })
combined_data
Out[18]:
ticker volume open high low close
0 IBM 1200 0 0 0 0
1 IBM 1201 1 1 1 1
2 IBM 1202 2 2 2 2
3 IBM 1203 3 3 3 3
4 IBM 1204 4 4 4 4
5 AAPL 1205 5 5 5 5
6 AAPL 1206 6 6 6 6
7 AAPL 1207 7 7 7 7
8 AAPL 1208 8 8 8 8
9 AAPL 1209 9 9 9 9
In [20]:
def get_total_volume(combined_data):
    return combined_data.groupby('ticker').sum()['volume']
In [21]:
get_total_volume(combined_data)
Out[21]:
ticker
AAPL    6035
IBM     6010
Name: volume, dtype: int64

problem 2

Write a function latest_jobs which will search for jobs at a given location on the Arcesium job portal. Arcesium jobs are listed at "https://careers.arcesium.com/go/All-Jobs/4687610/". Scrape this page and return a dataframe which contains the jobs at the given location among the latest 25 jobs. The dataframe should have Title and Location columns.

>>> latest_jobs("London")
                                               Title         Location
0  Solutions Architect-Private Markets  Solutions  London, ENG, GB
1  Solutions Architect  Solutions Architect  Lond  London, ENG, GB

How do we go about downloading data?

  1. try pandas if you are looking for a table
  2. if the link points to a file (csv, excel, movie, pdf) then use requests.get and save the contents of the response into a file
  3. check if there is an API and use the requests module
  4. get the html page using requests and parse it using re or BeautifulSoup
  5. automated browser launching techniques using selenium
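
To give a feel for option 4, here is a rough sketch of parsing table cells out of an HTML page. It deliberately uses the standard library's html.parser rather than BeautifulSoup so that it runs without extra installs, and the HTML snippet is made up; in practice the text would come from something like requests.get(url).text.

```python
from html.parser import HTMLParser

class TableCellParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

# Made-up page content for illustration
html = """
<table>
  <tr><th>Title</th><th>Location</th></tr>
  <tr><td>Engineer</td><td>London, ENG, GB</td></tr>
  <tr><td>Analyst</td><td>Hyderabad, TG, IN</td></tr>
</table>
"""
parser = TableCellParser()
parser.feed(html)
print(parser.rows)
```

For a plain table like this, option 1 (pandas) is usually less work; hand parsing is mainly useful when the data is not laid out as a table.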
In [32]:
url = "https://careers.arcesium.com/go/All-Jobs/4687610"
dfs = pd.read_html(url)
jobs = dfs[0]
jobs
Out[32]:
Title Location
0 Sales Executive - Private Markets Sales Execu... New York, NY, US
1 Associate Associate Hyderabad, TG, IN +2 mo... Hyderabad, TG, IN +2 more…
2 Principal Reliability Engineer Principal Reli... Hyderabad, TG, IN +2 more…
3 Senior Software Engineer - Data Platform Seni... New York, NY, US
4 Software Test Engineer Software Test Engineer... Hyderabad, TG, IN +2 more…
5 Forward Deployed - Implementation Forward Dep... Hyderabad, TG, IN
6 Engineering Manager Engineering Manager Hyde... Hyderabad, TG, IN
7 Manager, Investor Services Manager, Investor ... Hyderabad, TG, IN
8 Principal Engineer Principal Engineer Hydera... Hyderabad, TG, IN +2 more…
9 Engineering Lead Engineering Lead Gurugram, ... Gurugram, HR, IN +2 more…
10 Senior Principal Engineer Senior Principal En... Hyderabad, TG, IN +2 more…
11 Technical Lead Technical Lead Hyderabad, TG,... Hyderabad, TG, IN +2 more…
12 Senior Reliability Engineer Senior Reliabilit... Hyderabad, TG, IN +2 more…
13 Talent Sourcer Talent Sourcer Gurugram, HR, ... Gurugram, HR, IN +2 more…
14 Principal Engineer - Infrastructure Principal... Gurugram, HR, IN +2 more…
15 Executive Assistant Executive Assistant New ... New York, NY, US
16 Infrastructure Engineer Intern Infrastructure... New York, NY, US
17 Senior Manager - TAO Senior Manager - TAO Hy... Hyderabad, TG, IN
18 Designer Designer Bengaluru, KA, IN +2 more... Bengaluru, KA, IN +2 more…
19 HC Generalist HC Generalist New York, NY, US... New York, NY, US
20 Senior Software Engineer Senior Software Engi... Bengaluru, KA, IN +2 more…
21 Organization Development Intern Organization ... New York, NY, US
22 Senior Analyst, TAO Senior Analyst, TAO Hyde... Hyderabad, TG, IN
23 Senior Analyst, Middle Office Senior Analyst,... Hyderabad, TG, IN
24 Software Engineer Intern Software Engineer In... New York, NY, US
In [24]:
combined_data
Out[24]:
ticker volume open high low close
0 IBM 1200 0 0 0 0
1 IBM 1201 1 1 1 1
2 IBM 1202 2 2 2 2
3 IBM 1203 3 3 3 3
4 IBM 1204 4 4 4 4
5 AAPL 1205 5 5 5 5
6 AAPL 1206 6 6 6 6
7 AAPL 1207 7 7 7 7
8 AAPL 1208 8 8 8 8
9 AAPL 1209 9 9 9 9
In [25]:
combined_data[combined_data.ticker=="IBM"]
Out[25]:
ticker volume open high low close
0 IBM 1200 0 0 0 0
1 IBM 1201 1 1 1 1
2 IBM 1202 2 2 2 2
3 IBM 1203 3 3 3 3
4 IBM 1204 4 4 4 4
In [26]:
combined_data[combined_data.ticker.str.contains("IB")]
Out[26]:
ticker volume open high low close
0 IBM 1200 0 0 0 0
1 IBM 1201 1 1 1 1
2 IBM 1202 2 2 2 2
3 IBM 1203 3 3 3 3
4 IBM 1204 4 4 4 4
In [27]:
combined_data[combined_data.ticker.str.contains("A")]
Out[27]:
ticker volume open high low close
5 AAPL 1205 5 5 5 5
6 AAPL 1206 6 6 6 6
7 AAPL 1207 7 7 7 7
8 AAPL 1208 8 8 8 8
9 AAPL 1209 9 9 9 9
In [29]:
def latest_jobs(location):
    ## get data from the link using pandas.read_html
    ## filter the dataframe down to the location we are interested in
    pass
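
Putting the two hints together, one possible shape for the solution is sketched below. The helper name `filter_by_location`, the `url` parameter and the sample rows are illustrative assumptions; the sketch also assumes the first table on the page is the job listing, as it was during the session.

```python
import pandas as pd

JOBS_URL = "https://careers.arcesium.com/go/All-Jobs/4687610/"

def filter_by_location(jobs, location):
    """Keep the rows whose Location contains the given text (case-insensitive)."""
    # regex=False treats the location as plain text; na=False skips missing values.
    mask = jobs["Location"].str.contains(location, case=False, regex=False, na=False)
    return jobs.loc[mask, ["Title", "Location"]].reset_index(drop=True)

def latest_jobs(location, url=JOBS_URL):
    # Assumes the first table found on the page holds the job listing.
    jobs = pd.read_html(url)[0]
    return filter_by_location(jobs, location)

# Offline check of the filtering step on a small hand-made dataframe
sample = pd.DataFrame({
    "Title": ["Solutions Architect", "Executive Assistant", "Technical Lead"],
    "Location": ["London, ENG, GB", "New York, NY, US", None],
})
print(filter_by_location(sample, "london"))
```

Keeping the filter separate from the download makes it possible to test the filtering logic without hitting the website.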
In [33]:
jobs.to_csv("arcesium_jobs.csv")
In [34]:
!cat arcesium_jobs.csv
,Title,Location
0,"Sales Executive - Private Markets  Sales Executive - Private Markets  New York, NY, US  Sep 9, 2022","New York, NY, US"
1,"Associate  Associate  Hyderabad, TG, IN  +2 more…  Sep 9, 2022","Hyderabad, TG, IN  +2 more…"
2,"Principal Reliability Engineer  Principal Reliability Engineer  Hyderabad, TG, IN  +2 more…  Sep 8, 2022","Hyderabad, TG, IN  +2 more…"
3,"Senior Software Engineer - Data Platform  Senior Software Engineer - Data Platform  New York, NY, US  Sep 8, 2022","New York, NY, US"
4,"Software Test Engineer  Software Test Engineer  Hyderabad, TG, IN  +2 more…  Sep 7, 2022","Hyderabad, TG, IN  +2 more…"
5,"Forward Deployed - Implementation  Forward Deployed - Implementation  Hyderabad, TG, IN  Sep 7, 2022","Hyderabad, TG, IN"
6,"Engineering Manager  Engineering Manager  Hyderabad, TG, IN  Sep 7, 2022","Hyderabad, TG, IN"
7,"Manager, Investor Services  Manager, Investor Services  Hyderabad, TG, IN  Sep 7, 2022","Hyderabad, TG, IN"
8,"Principal Engineer  Principal Engineer  Hyderabad, TG, IN  +2 more…  Sep 7, 2022","Hyderabad, TG, IN  +2 more…"
9,"Engineering Lead  Engineering Lead  Gurugram, HR, IN  +2 more…  Sep 7, 2022","Gurugram, HR, IN  +2 more…"
10,"Senior Principal Engineer  Senior Principal Engineer  Hyderabad, TG, IN  +2 more…  Sep 5, 2022","Hyderabad, TG, IN  +2 more…"
11,"Technical Lead  Technical Lead  Hyderabad, TG, IN  +2 more…  Sep 5, 2022","Hyderabad, TG, IN  +2 more…"
12,"Senior Reliability Engineer  Senior Reliability Engineer  Hyderabad, TG, IN  +2 more…  Sep 4, 2022","Hyderabad, TG, IN  +2 more…"
13,"Talent Sourcer  Talent Sourcer  Gurugram, HR, IN  +2 more…  Sep 4, 2022","Gurugram, HR, IN  +2 more…"
14,"Principal Engineer - Infrastructure  Principal Engineer - Infrastructure  Gurugram, HR, IN  +2 more…  Sep 3, 2022","Gurugram, HR, IN  +2 more…"
15,"Executive Assistant  Executive Assistant  New York, NY, US  Sep 3, 2022","New York, NY, US"
16,"Infrastructure Engineer Intern  Infrastructure Engineer Intern  New York, NY, US  Sep 2, 2022","New York, NY, US"
17,"Senior Manager - TAO  Senior Manager - TAO  Hyderabad, TG, IN  Sep 2, 2022","Hyderabad, TG, IN"
18,"Designer  Designer  Bengaluru, KA, IN  +2 more…  Sep 2, 2022","Bengaluru, KA, IN  +2 more…"
19,"HC Generalist  HC Generalist  New York, NY, US  Sep 2, 2022","New York, NY, US"
20,"Senior Software Engineer  Senior Software Engineer  Bengaluru, KA, IN  +2 more…  Sep 1, 2022","Bengaluru, KA, IN  +2 more…"
21,"Organization Development Intern  Organization Development Intern  New York, NY, US  Sep 1, 2022","New York, NY, US"
22,"Senior Analyst, TAO  Senior Analyst, TAO  Hyderabad, TG, IN  Aug 31, 2022","Hyderabad, TG, IN"
23,"Senior Analyst, Middle Office  Senior Analyst, Middle Office  Hyderabad, TG, IN  Aug 31, 2022","Hyderabad, TG, IN"
24,"Software Engineer Intern  Software Engineer Intern  New York, NY, US  Aug 30, 2022","New York, NY, US"
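
The CSV dump shows that each Title cell repeats the title and also carries the location and date, with the pieces apparently separated by two spaces. Assuming that separator holds for every row, one way to keep only the title is sketched here on two rows copied from the output above:

```python
import pandas as pd

jobs = pd.DataFrame({
    "Title": [
        "Engineering Manager  Engineering Manager  Hyderabad, TG, IN  Sep 7, 2022",
        "Designer  Designer  Bengaluru, KA, IN  +2 more…  Sep 2, 2022",
    ],
    "Location": ["Hyderabad, TG, IN", "Bengaluru, KA, IN  +2 more…"],
})

# Split on runs of two or more whitespace characters and keep the first piece.
jobs["Title"] = jobs["Title"].str.split(r"\s{2,}").str[0]
print(jobs["Title"].tolist())
```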