Python Virtual Training For Arcesium - Module III - Day 4

Dec 17-23, 2020 Vikrant Patil

These notes are available online at http://notes.pipal.in/2020/arcesium_finop_batch3/module3-day4.html

© Pipal Academy LLP

We will use JupyterHub at http://lab.pipal.in for this training. Create a notebook named module3-day4.ipynb for today's session. Before you start, shut down all kernels except the one for today's notebook.

Using selenium to download data from the internet

docs for python-selenium

https://selenium-python.readthedocs.io/installation.html#drivers

Download the driver for the browser you want to launch from Python.

For Firefox the driver is geckodriver:

https://github.com/mozilla/geckodriver/releases

Create a virtual environment for selenium:

python -m venv firefox_selenium

Activate it on Windows:

firefox_selenium\Scripts\activate.bat

On Linux/Mac:

source firefox_selenium/bin/activate

The geckodriver download for Windows is a zip file containing geckodriver.exe. Unzip it and copy geckodriver.exe into firefox_selenium\Scripts\

Linux/Mac users: copy the extracted executable into firefox_selenium/bin/

pip install selenium
In [4]:
%%file search_arcesium.py

from selenium import webdriver

driver = webdriver.Firefox()
driver.get("https://www.arcesium.com/")

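# find_element(by, value) works in selenium 3 and 4; newer selenium 4 releases dropped the find_element_by_* helpers
careers = driver.find_element("class name", "careers-anchor")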
careers.click()
driver.close()
Overwriting search_arcesium.py
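
The script above leaves the browser open if the element is not found, because driver.close() is never reached. A minimal sketch of the same click wrapped in a function with try/finally. The name click_by_class is made up for these notes, and "class name" is the locator string that selenium's By.CLASS_NAME stands for. quit() ends the whole browser session, where close() only closes the current window.

```python
def click_by_class(driver, class_name):
    """Click the first element with the given class, then end the browser session."""
    try:
        # "class name" is the locator strategy string behind By.CLASS_NAME
        element = driver.find_element("class name", class_name)
        element.click()
    finally:
        driver.quit()  # runs even when the lookup or the click raises


# Usage with a real browser (needs selenium and geckodriver installed):
# from selenium import webdriver
# driver = webdriver.Firefox()
# driver.get("https://www.arcesium.com/")
# click_by_class(driver, "careers-anchor")
```

The function only needs an object with find_element and quit, so it works with any selenium driver, not just Firefox.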

documentation for selenium using python

https://selenium-python.readthedocs.io/

Reading pdf files

To read PDF files we need a package called PyPDF2.

From Jupyter:

!pip install PyPDF2

From the command line:

pip install PyPDF2

In [5]:
!pip install PyPDF2
Collecting PyPDF2
  Using cached PyPDF2-1.26.0.tar.gz (77 kB)
Building wheels for collected packages: PyPDF2
  Building wheel for PyPDF2 (setup.py) ... done
  Created wheel for PyPDF2: filename=PyPDF2-1.26.0-py3-none-any.whl size=61084 sha256=84670daae0093b25e472b99f599e73fda3e64185542411e7a44b3141db7ab31e
  Stored in directory: /home/vikrant/.cache/pip/wheels/b1/1a/8f/a4c34be976825a2f7948d0fa40907598d69834f8ab5889de11
Successfully built PyPDF2
Installing collected packages: PyPDF2
Successfully installed PyPDF2-1.26.0
WARNING: You are using pip version 20.2.3; however, version 20.3.3 is available.
You should consider upgrading via the '/home/vikrant/anaconda3/bin/python -m pip install --upgrade pip' command.
In [6]:
!cat download.py
import sys
import requests

def download(url, filename):
    resp = requests.get(url)
    with open(filename, "wb") as f:
        f.write(resp.content)
        
if __name__ == "__main__":
    url = sys.argv[1]
    filename = sys.argv[2]
    download(url, filename)
    
In [7]:
!python download.py https://posoco.in/download/16-07-20_nldc_psp/?wpdmdl=30215 demanddata.pdf

We will read this PDF file, https://posoco.in/download/16-07-20_nldc_psp/?wpdmdl=30215, and try to extract table A from page 2 (page index 1, because PyPDF2 counts pages from 0).

In [9]:
import PyPDF2
In [15]:
with open("demanddata.pdf", "rb") as f:
    pdfreader = PyPDF2.PdfFileReader(f)
    n = pdfreader.getNumPages()
    page = pdfreader.getPage(1)
    print(page.extractText()[:100])
NR
WR
SR
ER
NER
TOTAL
59882
41115
34238
21526
2730
159491
1114
0
0
0
6
1120
1398
998
807
447
48
3698
In [20]:
def print_pdf_text(filename):
    with open(filename, "rb") as f: # use the argument, not a hard-coded name
        pdfreader = PyPDF2.PdfFileReader(f)
        n = pdfreader.getNumPages()
        for p in range(n):
            page = pdfreader.getPage(p)
            print(page.extractText()[:500])
            print("="*10)
In [21]:
print_pdf_text("demanddata.pdf")
 
National Load Despatch Centre
 
 
POWER SYSTEM OPERATION CORPORATION LIMITED
 

 

 
 
(
Government of India Enterprise
/


)
 
B
-
9, QUTU
B INSTITUTIONAL AREA, KATWARIA SARAI,
 
NEW DELHI 
-
110016
 



,

,


,




 
_____________________________________________________________________________________________________________________________
__________
 
Ref:
 
POSOCO/NLDC/SO/Daily PSP
 
Report
 
 
    
 
    
 
        

:
 
16
th
 
Jul
 
20
20
 
 
To,
 
 
 
1.
 

 

,











,


,





==========
NR
WR
SR
ER
NER
TOTAL
59882
41115
34238
21526
2730
159491
1114
0
0
0
6
1120
1398
998
807
447
48
3698
355
33
77
149
29
643
11
49
128
-
-
187
39.60
16.60
41.59
4.60
0.03
102
12.6
0.0
0.0
0.0
0.0
12.6
65470
43593
38117
21535
2827
160654
22:20
10:29
10:00
21:20
19:41
21:26
Region
FVI
< 49.7
49.7 - 49.8
49.8 - 49.9
< 49.9
49.9 - 50.05
> 50.05
All India
0.057
0.16
1.81
13.19
15.16
76.52
8.32
Max.Demand
Shortage during
Energy Met
Drawal
OD(+)/UD(-)
Max OD
Energy
Region
States
Met during the 
day(MW)
ma
==========
16-Jul-2020
Sl 
No
Voltage Level
Line Details
Circuit
Max Import (MW)
Max Export (MW)
Import (MU)
Export (MU)
NET (MU)
1
HVDC
ALIPURDUAR-AGRA
D/C
0
1001
0.0
24.4
-24.4
2
HVDC
PUSAULI  B/B
-
0
399
0.0
9.6
-9.6
3
765 kV
GAYA-VARANASI
D/C
0
655
0.0
12.9
-12.9
4
765 kV
SASARAM-FATEHPUR
S/C
108
119
0.0
0.9
-0.9
5
765 kV
GAYA-BALIA
S/C
0
478
0.0
4.7
-4.7
6
400 kV
PUSAULI-VARANASI
S/C
0
283
0.0
5.9
-5.9
7
400 kV
PUSAULI -ALLAHABAD
S/C
0
180
0.0
3.5
-3.5
8
400 kV
MUZAFFARPUR-GORAKHPUR
D/C
0
834
0.0
15.5
==========
In [25]:
def get_page(pdffile, pageno):
    with open(pdffile, "rb") as f: # use the argument, not a hard-coded name
        pdfreader = PyPDF2.PdfFileReader(f)
        page = pdfreader.getPage(pageno)
        return page.extractText()
In [26]:
print(get_page("demanddata.pdf", 1))
NR
WR
SR
ER
NER
TOTAL
59882
41115
34238
21526
2730
159491
1114
0
0
0
6
1120
1398
998
807
447
48
3698
355
33
77
149
29
643
11
49
128
-
-
187
39.60
16.60
41.59
4.60
0.03
102
12.6
0.0
0.0
0.0
0.0
12.6
65470
43593
38117
21535
2827
160654
22:20
10:29
10:00
21:20
19:41
21:26
Region
FVI
< 49.7
49.7 - 49.8
49.8 - 49.9
< 49.9
49.9 - 50.05
> 50.05
All India
0.057
0.16
1.81
13.19
15.16
76.52
8.32
Max.Demand
Shortage during
Energy Met
Drawal
OD(+)/UD(-)
Max OD
Energy
Region
States
Met during the 
day(MW)
maximum 
Demand(MW)
(MU)
Schedule
(MU)
(MU)
(MW)
Shortage 
(MU)
Punjab
11090
0
237.9
146.8
-1.8
49
0.0
Haryana
9388
0
209.4
152.8
0.7
325
1.9
Rajasthan
12087
0
262.4
119.7
5.4
809
0.0
Delhi
5726
0
118.6
102.8
-1.4
228
0.0
NR
UP
22873
0
448.9
208.5
2.0
546
0.4
Uttarakhand
1899
0
42.8
20.7
0.8
111
0.0
HP
1366
0
28.6
-2.6
-0.2
91
0.0
J&K(UT) & Ladakh(UT)
2177
544
43.1
20.3
0.4
502
10.3
Chandigarh
295
0
6.0
5.9
0.2
61
0.0
Chhattisgarh
3685
0
86.9
36.8
0.8
468
0.0
Gujarat
13478
0
286.2
87.6
4.0
527
0.0
MP
9547
0
214.7
113.8
-3.8
198
0.0
WR
Maharashtra
16964
0
365.1
138.1
-1.9
457
0.0
Goa
405
0
8.5
8.2
-0.2
33
0.0
DD
246
0
5.3
5.3
0.0
19
0.0
DNH
614
0
14.0
13.8
0.2
44
0.0
AMNSIL
777
0
17.1
4.2
0.7
272
0.0
Andhra Pradesh
6439
0
141.0
45.6
-1.3
607
0.0
Telangana
8614
0
167.3
81.6
-2.5
385
0.0
SR
Karnataka
8486
0
155.1
51.1
-3.4
650
0.0
Kerala
3077
0
65.2
46.1
0.5
179
0.0
Tamil Nadu
12371
0
271.3
125.9
-3.7
573
0.0
Puducherry
349
0
7.5
7.5
-0.1
35
0.0
Bihar
5740
0
111.5
106.0
-0.3
386
0.0
DVC
2989
0
62.7
-42.6
-0.7
206
0.0
Jharkhand
1438
0
26.3
18.5
-1.0
124
0.0
ER
Odisha
3983
0
82.2
-0.2
-0.2
325
0.0
West Bengal
7917
0
162.6
47.2
-0.8
303
0.0
Sikkim
100
0
1.4
1.5
-0.1
17
0.0
Arunachal Pradesh
120
3
2.0
1.8
0.2
40
0.0
Assam
1759
23
30.0
27.1
-0.1
135
0.0
Manipur
183
1
2.6
2.3
0.3
37
0.0
NER
Meghalaya
307
2
5.3
-1.3
0.3
52
0.0
Mizoram
89
1
1.5
1.2
0.0
13
0.0
Nagaland
140
2
2.2
2.3
-0.2
23
0.0
Tripura
298
7
4.9
5.9
0.7
66
0.0
Bhutan
Nepal
Bangladesh
53.3
-1.5
-19.1
2337.0
-271.3
-1110.0
NR
WR
SR
ER
NER
TOTAL
352.1
-295.4
95.0
-145.8
-6.0
0.0
359.2
-293.7
84.6
-152.6
-3.4
-6.0
7.1
1.6
-10.5
-6.9
2.6
-6.0
NR
WR
SR
ER
NER
TOTAL
3838
14847
11792
3445
677
34598
9289
23225
14423
4892
47
51876
13127
38072
26215
8337
723
86473
NR
WR
SR
ER
NER
All India
546
1080
370
482
7
2486
25
13
14
0
0
52
355
33
77
149
29
643
26
33
47
0
0
106
40
82
19
0
22
163
71
73
210
5
0
359
1063
1314
737
636
58
3809
6.71
5.54
28.51
0.73
0.05
9.43
42.55
10.54
45.35
24.19
49.63
29.09
1.068
1.102
Based on State Max Demands
Diversity factor = Sum of regional or state maximum demands / All India maximum demand
*Source: RLDCs for solar connected to ISTS; SLDCs for embedded solar. Limited visibility of embedded solar data.
Executive Director-NLDC
Share of RES in total generation (%)
Share of Non-fossil fuel (Hydro,Nuclear and RES) in total generation(%)
H. All India Demand Diversity Factor
Based on Regional Max Demands
Lignite
Hydro
Nuclear
Gas, Naptha & Diesel
RES (Wind, Solar, Biomass & Others)
Total
State Sector
Total
G. Sourcewise generation (MU)
Coal
Actual(MU)
O/D/U/D(MU)
F. Generation Outage(MW)
Central Sector
Day Peak (MW)
E. Import/Export by Regions (in MU) - Import(+ve)/Export(-ve); OD(+)/UD(-)
Schedule(MU)
D. Transnational Exchanges (MU) - Import(+ve)/Export(-ve)€€€
Actual (MU)
Energy Shortage (MU)
Maximum Demand Met During the Day (MW) (From NLDC SCADA)
Time Of Maximum Demand Met (From NLDC SCADA)
B. Frequency Profile (%)
C. Power Supply Position in States
Demand Met during Evening Peak hrs(MW) (at 2000 hrs; from RLDCs)
Peak Shortage (MW)
Energy Met (MU)
Hydro Gen (MU)
Wind Gen (MU)
Solar Gen (MU)*
Report for previous day
Date of Reporting:
16-Jul-2020
A. Power Supply Position at All India and Regional level

In [27]:
def chunk(items, n, count):
    """
    [i1, i2, i3, i4, i5, i6, i7 ....i100]
    will break it into pieces of size n
    """
    s = 0 
    for i in range(count):
        yield items[s:s+n] # n items from items, starting s position
        s = (i+1)*n
        

def extract_table_A(pagetext):
    lines = pagetext.split("\n") # one extracted value per line
    header = "NR WR SR ER NER TOTAL"
    headers = header.strip().split()
    data = {}
    steps = chunk(lines, len(headers), 9) # header row plus 8 data rows
In [28]:
for row in chunk(list(range(20)), 3, 4):
    print(row)
[0, 1, 2]
[3, 4, 5]
[6, 7, 8]
[9, 10, 11]
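
The chunk generator above needs the caller to say how many pieces to produce. A minimal sketch of a variant that works this out from the length of the list. The name chunks is made up for these notes and is not part of the session code.

```python
def chunks(items, n):
    """Yield successive pieces of size n from items; the last piece may be shorter."""
    for start in range(0, len(items), n):
        yield items[start:start + n]


for row in chunks(list(range(8)), 3):
    print(row)
```

With 8 items and n = 3 this gives three rows, the last one holding only two items.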
In [29]:
import random

def randomnums(n):
    for i in range(n):
        yield random.random() 
In [32]:
ran = randomnums(5)
In [33]:
ran
Out[33]:
<generator object randomnums at 0x7f45eda8a270>
In [34]:
r = reversed([1, 2, 3, 4, 5])
In [35]:
r
Out[35]:
<list_reverseiterator at 0x7f45edf5f550>
In [36]:
next(r)
Out[36]:
5
In [37]:
next(r)
Out[37]:
4
In [38]:
next(r)
Out[38]:
3
In [39]:
next(r)
Out[39]:
2
In [40]:
next(r)
Out[40]:
1
In [41]:
next(r)
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-41-8ebe59a56b1d> in <module>
----> 1 next(r)

StopIteration: 
In [42]:
ran
Out[42]:
<generator object randomnums at 0x7f45eda8a270>
In [43]:
next(ran)
Out[43]:
0.7359782505244229
In [44]:
next(ran)
Out[44]:
0.34407781572007
In [45]:
next(ran)
Out[45]:
0.07464743675189789
In [46]:
ran
Out[46]:
<generator object randomnums at 0x7f45eda8a270>
In [47]:
def randomnums(n):
    print("Start generator")
    for i in range(n):
        print("yielding ...", i)
        yield random.random() 
        print("Back to loop")
        
    print("End of generator")
In [48]:
ran = randomnums(3)
In [49]:
next(ran)
Start generator
yielding ... 0
Out[49]:
0.800996994201215
In [50]:
next(ran)
Back to loop
yielding ... 1
Out[50]:
0.9485583938907229
In [51]:
random.random()
Out[51]:
0.700528870612659
In [53]:
def nhellos(n):
    print("Start generator")
    for i in range(n):
        print("yielding ...", i)
        yield "hello"
        print("Back to loop")  
    print("End of generator")
In [54]:
h = nhellos(2)
In [55]:
next(h)
Start generator
yielding ... 0
Out[55]:
'hello'
In [56]:
s = next(h)
Back to loop
yielding ... 1
In [57]:
s
Out[57]:
'hello'
In [58]:
next(h)
Back to loop
End of generator
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-58-31146b9ab14d> in <module>
----> 1 next(h)

StopIteration: 
In [59]:
for r in randomnums(4):
    print(r)
Start generator
yielding ... 0
0.12238524801463746
Back to loop
yielding ... 1
0.15188283688094728
Back to loop
yielding ... 2
0.4091636583979146
Back to loop
yielding ... 3
0.5480814846207586
Back to loop
End of generator
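
The traces above show that the body of a generator function runs only when a value is requested, and pauses at each yield. Because of that, a generator can even describe an endless sequence. A small sketch with made-up names (naturals, finished):

```python
import itertools


def naturals():
    """Yield 1, 2, 3, ... forever; each value is produced only when asked for."""
    i = 1
    while True:
        yield i
        i += 1


# islice stops asking after five values, so the endless loop is never a problem
first_five = list(itertools.islice(naturals(), 5))
print(first_five)

# next() with a second argument returns that default instead of raising StopIteration
finished = (x for x in [])  # a generator with nothing to yield
print(next(finished, "no more values"))
```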
In [64]:
def chunk(items, n, count):
    """
    [i1, i2, i3, i4, i5, i6, i7 ....i100]
    will break it into pieces of size n
    """
    s = 0 
    for i in range(count):
        yield items[s:s+n] # n items from items, starting s position
        s = (i+1)*n
        

def extract_table_A(pagetext):
    lines = pagetext.split("\n")
    header = "NR WR SR ER NER TOTAL"
    headers = header.strip().split()
    data = {}
    steps = chunk(lines, len(headers), 9)
    next(steps)
    for row in steps:
        print(row)
    
In [65]:
extract_table_A(get_page("demanddata.pdf", 1))
['59882', '41115', '34238', '21526', '2730', '159491']
['1114', '0', '0', '0', '6', '1120']
['1398', '998', '807', '447', '48', '3698']
['355', '33', '77', '149', '29', '643']
['11', '49', '128', '-', '-', '187']
['39.60', '16.60', '41.59', '4.60', '0.03', '102']
['12.6', '0.0', '0.0', '0.0', '0.0', '12.6']
['65470', '43593', '38117', '21535', '2827', '160654']
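
extract_table_A relies on the header labels being the very first lines of the page, which is why the first chunk is thrown away with next(steps). A hedged sketch of looking for the header instead of assuming its position. find_header is a hypothetical helper, and the short lines list is sample data in the same shape as the extracted text.

```python
def find_header(lines, headers):
    """Return the index where the run of header labels starts, or -1 if absent."""
    width = len(headers)
    for start in range(len(lines) - width + 1):
        if lines[start:start + width] == headers:
            return start
    return -1


headers = "NR WR SR ER NER TOTAL".split()
lines = ["Report", "NR", "WR", "SR", "ER", "NER", "TOTAL", "59882", "41115"]

start = find_header(lines, headers)
print(start)                          # position of "NR"
print(lines[start + len(headers):])   # the values that follow the header
```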
In [72]:
import pandas as pd
def chunk(items, n, count):
    """
    [i1, i2, i3, i4, i5, i6, i7 ....i100]
    will break it into pieces of size n
    """
    s = 0 
    for i in range(count):
        yield items[s:s+n] # n items from items, starting s position
        s = (i+1)*n
        

def extract_table_A(pagetext):
    lines = pagetext.split("\n")
    header = "NR WR SR ER NER TOTAL"
    headers = header.strip().split()
    data = {}
    steps = chunk(lines, len(headers), 9)
    next(steps)
    for row in steps:
        for h, d in zip(headers, row):
            data.setdefault(h , []).append(d)
            
    return pd.DataFrame(data) # without return the function gives back None
In [66]:
d = {}
In [67]:
d['x']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-67-7657742692bd> in <module>
----> 1 d['x']

KeyError: 'x'
In [68]:
d.get('x', 0)
Out[68]:
0
In [69]:
d
Out[69]:
{}
In [70]:
d.setdefault("x", 0)
Out[70]:
0
In [71]:
d
Out[71]:
{'x': 0}
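
d.get returns a default without touching the dictionary, while d.setdefault also stores the default under the key. That is what makes data.setdefault(h, []).append(d) work in extract_table_A. A small sketch using the first two columns of the first two rows:

```python
headers = ["NR", "WR"]
rows = [["59882", "41115"], ["1114", "0"]]

data = {}
for row in rows:
    for h, value in zip(headers, row):
        # first time: stores [] under h and returns it; afterwards returns the stored list
        data.setdefault(h, []).append(value)

print(data)

lookup = {}
lookup.get("x", [])  # returns [] but leaves lookup empty
print(lookup)
```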
In [77]:
import pandas as pd
def chunk(items, n, count):
    """
    [i1, i2, i3, i4, i5, i6, i7 ....i100]
    will break it into pieces of size n
    """
    s = 0 
    for i in range(count):
        yield items[s:s+n] # n items from items, starting s position
        s = (i+1)*n
        

def get_page(pdffile, pageno):
    with open(pdffile, "rb") as f: # use the argument, not a hard-coded name
        pdfreader = PyPDF2.PdfFileReader(f)
        page = pdfreader.getPage(pageno)
        return page.extractText()
        
        
def extract_table_A(pagetext):
    lines = pagetext.split("\n")
    header = "NR WR SR ER NER TOTAL"
    headers = header.strip().split()
    data = {}
    steps = chunk(lines, len(headers), 9)
    next(steps)
    for row in steps:
        for h, d in zip(headers, row):
            data.setdefault(h , []).append(d)
            
    return pd.DataFrame(data)
In [78]:
extract_table_A(get_page("demanddata.pdf", 1))
Out[78]:
NR WR SR ER NER TOTAL
0 59882 41115 34238 21526 2730 159491
1 1114 0 0 0 6 1120
2 1398 998 807 447 48 3698
3 355 33 77 149 29 643
4 11 49 128 - - 187
5 39.60 16.60 41.59 4.60 0.03 102
6 12.6 0.0 0.0 0.0 0.0 12.6
7 65470 43593 38117 21535 2827 160654
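
Every value in this table is still a string, and some cells hold "-" where the report gives no figure. A sketch of a converter that could be applied before doing arithmetic. to_number is a hypothetical helper, and treating "-" as a missing value (None) is an assumption about what the report means.

```python
def to_number(text):
    """Convert a cell to int or float; return None when it is not numeric."""
    text = text.strip()
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return None  # assumption: "-" and other non-numeric cells mean "no value"


row = ['11', '49', '128', '-', '-', '187']
print([to_number(cell) for cell in row])
```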
In [83]:
%%file extract_tableA.py
"""Extract table A from a PDF file. The script assumes a certain
format; tested with the file https://posoco.in/download/16-07-20_nldc_psp/?wpdmdl=30215
"""
import pandas as pd
import PyPDF2
import typer

app = typer.Typer()


def chunk(items, n, count):
    """
    [i1, i2, i3, i4, i5, i6, i7 ....i100]
    will break it into pieces of size n
    """
    s = 0 
    for i in range(count):
        yield items[s:s+n] # n items from items, starting s position
        s = (i+1)*n
        

def get_page(pdffile, pageno):
    with open(pdffile, "rb") as f: # use the argument, not a hard-coded name
        pdfreader = PyPDF2.PdfFileReader(f)
        page = pdfreader.getPage(pageno)
        return page.extractText()
        
        
def extract_table_A(pagetext):
    lines = pagetext.split("\n")
    header = "NR WR SR ER NER TOTAL"
    headers = header.strip().split()
    data = {}
    steps = chunk(lines, len(headers), 9)
    next(steps)
    for row in steps:
        for h, d in zip(headers, row):
            data.setdefault(h , []).append(d)
            
    return pd.DataFrame(data)


@app.command()
def extract_tableA(pdffile, csvfile):
    """
    extracts table A from pdffile and saves it in csvfile
    """
    page = get_page(pdffile, 1)
    df = extract_table_A(page)
    df.to_csv(csvfile)
    
    
if __name__ == "__main__":
    app()
    
    
Overwriting extract_tableA.py

Any command-line tool built with typer gets elaborate command-line options, including --help, without extra code.

In [84]:
!python extract_tableA.py --help
Usage: extract_tableA.py [OPTIONS] PDFFILE CSVFILE

  extracts table A from pdffile and saves it in csvfile

Arguments:
  PDFFILE  [required]
  CSVFILE  [required]

Options:
  --install-completion [bash|zsh|fish|powershell|pwsh]
                                  Install completion for the specified shell.
  --show-completion [bash|zsh|fish|powershell|pwsh]
                                  Show completion for the specified shell, to
                                  copy it or customize the installation.

  --help                          Show this message and exit.
In [85]:
!python extract_tableA.py demanddata.pdf demanddata.csv
In [87]:
!pip install typer
Requirement already satisfied: typer in /home/vikrant/anaconda3/lib/python3.8/site-packages (0.3.1)
Requirement already satisfied: click<7.2.0,>=7.1.1 in /home/vikrant/anaconda3/lib/python3.8/site-packages (from typer) (7.1.2)
WARNING: You are using pip version 20.2.3; however, version 20.3.3 is available.
You should consider upgrading via the '/home/vikrant/anaconda3/bin/python -m pip install --upgrade pip' command.
In [86]:
!cat demanddata.csv
,NR,WR,SR,ER,NER,TOTAL
0,59882,41115,34238,21526,2730,159491
1,1114,0,0,0,6,1120
2,1398,998,807,447,48,3698
3,355,33,77,149,29,643
4,11,49,128,-,-,187
5,39.60,16.60,41.59,4.60,0.03,102
6,12.6,0.0,0.0,0.0,0.0,12.6
7,65470,43593,38117,21535,2827,160654
In [88]:
%%file head.py
import typer

app = typer.Typer()

@app.command()
def head(filename:str, n:int=5):
    with open(filename) as f:
        for i in range(n):
            print(f.readline(), end="")
            
if __name__ == "__main__":
    app()
Overwriting head.py
In [89]:
!python head.py --help
Usage: head.py [OPTIONS] FILENAME

Arguments:
  FILENAME  [required]

Options:
  --n INTEGER                     [default: 5]
  --install-completion [bash|zsh|fish|powershell|pwsh]
                                  Install completion for the specified shell.
  --show-completion [bash|zsh|fish|powershell|pwsh]
                                  Show completion for the specified shell, to
                                  copy it or customize the installation.

  --help                          Show this message and exit.
In [90]:
!python head.py demanddata.csv
,NR,WR,SR,ER,NER,TOTAL
0,59882,41115,34238,21526,2730,159491
1,1114,0,0,0,6,1120
2,1398,998,807,447,48,3698
3,355,33,77,149,29,643
In [91]:
!python head.py --n 3 demanddata.csv
,NR,WR,SR,ER,NER,TOTAL
0,59882,41115,34238,21526,2730,159491
1,1114,0,0,0,6,1120
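
head.py calls readline() exactly n times, even when the file has fewer lines. A sketch of the reading part using itertools.islice, which simply stops at the end of the file. first_lines is a made-up name, typer is left out to keep the example free of dependencies, and the temporary file holds sample data only.

```python
import itertools
import os
import tempfile


def first_lines(filename, n=5):
    """Return up to n lines from the start of the file."""
    with open(filename) as f:
        return list(itertools.islice(f, n))


# write a small sample file, read two lines back, then clean up
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write(",NR,WR\n0,59882,41115\n1,1114,0\n")

for line in first_lines(tmp.name, 2):
    print(line, end="")

os.remove(tmp.name)
```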

Date patterns

In [92]:
import datetime
In [93]:
datetime.datetime.today().strftime("%Y")
Out[93]:
'2021'
In [94]:
datetime.datetime.today()
Out[94]:
datetime.datetime(2021, 1, 15, 13, 6, 57, 298854)
In [100]:
datetime.datetime.today().strftime("%Y-%m-%d %H:%M:%S")
Out[100]:
'2021-01-15 13:07:35'
In [101]:
time = '2021-01-15 13:07:35'
In [102]:
datetime.datetime.strptime(time, "%Y-%m-%d %H:%M:%S")
Out[102]:
datetime.datetime(2021, 1, 15, 13, 7, 35)
In [103]:
time1 = "2030/01/15"

datetime.datetime.strptime(time1, "%Y/%m/%d")
Out[103]:
datetime.datetime(2030, 1, 15, 0, 0)
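
strftime and strptime share the same format codes: one turns a datetime into text, the other parses text into a datetime. A sketch using the report date "16-Jul-2020" from the PDF and the "16-07-20" fragment seen in the download URL. Note that %b depends on the locale; English month names are assumed here.

```python
import datetime

# parse the date as it is printed in the report
report_date = datetime.datetime.strptime("16-Jul-2020", "%d-%b-%Y")
print(report_date)

# format it as two-digit day, month and year
url_part = report_date.strftime("%d-%m-%y")
print(url_part)

# date arithmetic works on the parsed value, not on the string
next_day = report_date + datetime.timedelta(days=1)
print(next_day.strftime("%d-%m-%y"))
```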
In [105]:
import re #regular expression module
In [106]:
multiplestring = """
fjdsaf hdsg kjfhdsf kjhfds ds
dhf kdsjh

def hello():
    print("hello")

sadsad kjshdf jsfkjdhfs kjdshafkjhdsa f
kjhfds kd
kjhdsfkj 
kjhd f
kkdjkfj
"""
In [116]:
empty = re.compile(r"^$") # empty line
ninechars = re.compile(r"^.........$") # exactly nine characters
p1 = re.compile(r"\d+.+") # one or more digits followed by one or more characters
P2 = re.compile(r"^\d+$") # only digits, one or more
In [117]:
p1.match("hello") # if there is no match it returns None
In [118]:
p1.match("2kjfdkjf")
Out[118]:
<re.Match object; span=(0, 8), match='2kjfdkjf'>
In [119]:
P2.match("fdfd")
In [120]:
P2.match("5")
Out[120]:
<re.Match object; span=(0, 1), match='5'>
In [121]:
P2.match("5556575")
Out[121]:
<re.Match object; span=(0, 7), match='5556575'>
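
Patterns anchored with ^ and $ describe a whole line, so one way to use them on a multi-line string is to test it line by line. The sketch below also contrasts match, which looks only at the start of the string, with search, which looks anywhere. The sample text and the name digits_first are made up for illustration.

```python
import re

text = "abc\n\n123456789\n42 apples\nsee 7up"

empty = re.compile(r"^$")             # empty line
ninechars = re.compile(r"^.{9}$")     # exactly nine characters, same as nine dots
digits_first = re.compile(r"\d+.+")   # one or more digits, then at least one more character

for line in text.split("\n"):
    print(repr(line),
          bool(empty.match(line)),
          bool(ninechars.match(line)),
          bool(digits_first.match(line)),    # match: only at the start of the line
          bool(digits_first.search(line)))   # search: anywhere in the line
```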
In [122]:
s = "<c>text</c>"
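
The notes stop at this string. One thing a regular expression can do with it is pull out the tag name and the text between the tags using groups. This is a sketch of that idea, not necessarily what the session went on to do; the pattern and the name tagged are illustrative, and a regular expression is not a general XML parser.

```python
import re

s = "<c>text</c>"
tagged = re.compile(r"^<(\w+)>(.*)</\1>$")  # \1 must repeat the opening tag name

m = tagged.match(s)
if m:
    print(m.group(1))  # tag name
    print(m.group(2))  # text between the tags

print(tagged.match("<c>text</d>"))  # closing tag differs, so there is no match
```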