Session 5 - Dictionaries & Sets
Dictionaries
Dictionaries are used to store name-value pairs.
d = {"x": 1, "y": 2, "z": 3}
d
{'x': 1, 'y': 2, 'z': 3}
d['x']
1
d['x'] = 11
d
{'x': 11, 'y': 2, 'z': 3}
Dictionary Usage Patterns
There are two common ways to use dictionaries.
- as a record
- as a lookup-table
# as a record
person = {
    "name": "Alice",
    "email": "alice@example.com",
    "phone": "9876543110"
}
person
{'name': 'Alice', 'email': 'alice@example.com', 'phone': '9876543110'}
person['name']
'Alice'
person['email']
'alice@example.com'
When we use a dictionary as a record, we know the possible keys in advance.
# as a lookup table
phone_numbers = {
    "alice": 1234,
    "bob": 2345
}
phone_numbers['alice']
1234
phone_numbers['charlie'] = 3456
phone_numbers
{'alice': 1234, 'bob': 2345, 'charlie': 3456}
When using a dictionary as a lookup-table, the keys are not known upfront.
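Because the keys of a lookup table are not fixed, a lookup can ask for a key that was never added. The following is a minimal sketch of two common ways to handle that, reusing the phone_numbers table from above. The name "dave" and the default value 0 are illustrative choices only.

```python
phone_numbers = {"alice": 1234, "bob": 2345}

# Indexing with a missing key raises KeyError, so test membership first...
if "dave" in phone_numbers:
    print(phone_numbers["dave"])
else:
    print("dave is not in the table")

# ...or use get, which returns a default instead of raising an error.
print(phone_numbers.get("alice"))     # 1234
print(phone_numbers.get("dave"))      # None
print(phone_numbers.get("dave", 0))   # 0
```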
Example: Greeting in multiple languages
Let’s write a function greet to greet a person in any language.
If we only had to greet in one language, English, we could write it as:
def greet(name):
    print("Hello", name)
greet("Alice")
Hello Alice
Let’s add support for multiple languages.
def greet(name, lang):
    if lang == "en":
        print("Hello", name)
    elif lang == "hi":
        print("Namaste", name)
greet("Alice", "en")
Hello Alice
greet("Alice", "hi")
Namaste Alice
We need to modify the code every time we want to add support for a new language. That doesn’t look very nice.
Wouldn’t it be nice if we could manage the translations outside the greet function?
prefixes = {
    "en": "Hello",
    "hi": "Namaste",
    "it": "Ciao",
}
def greet(name, lang):
    prefix = prefixes[lang]
    print(prefix, name)
greet("Alice", "it")
Ciao Alice
We can go even one step further and move translations into a text file.
%%file greetings.txt
en Hello
hi Namaste
it Ciao
ka Namaskara
te Namaskaram
ta Vanakkam
Overwriting greetings.txt
prefixes = {}
for line in open("greetings.txt"):
    lang, prefix = line.strip().split()
    prefixes[lang] = prefix
prefixes
{'en': 'Hello',
 'hi': 'Namaste',
 'it': 'Ciao',
 'ka': 'Namaskara',
 'te': 'Namaskaram',
 'ta': 'Vanakkam'}
greet("Alice", "ta")
Vanakkam Alice
greet("Alice", "ka")
Namaskara Alice
Problem: Read prices
%load_problem read-prices
Write a function read_prices to read prices of items from a text file.
The function takes the path of a file as its argument, parses the contents of the file, and returns the prices of the items mentioned in the file as a dictionary.
The file has one row for each item, containing the name of the item followed by its price. The price may have a decimal component.
For example, the following is a sample prices file.
$ cat files/prices.txt
Apple 150.0
Banana 12.0
Carrot 75.5
Guava 124.5
And here is the expected output when read_prices is called with that file as the argument.
>>> read_prices("files/prices.txt")
{"Apple": 150.0, "Banana": 12.0, "Carrot": 75.5, "Guava": 124.5}
>>> prices = read_prices("files/prices.txt")
>>> prices['Apple']
150.0
You can assume that item names do not contain spaces, that the file is well formatted, and that it has no empty lines.
You can verify your solution using:
%verify_problem read-prices
# your code here
!cat files/prices.txt
Apple 150.0
Banana 12.0
Carrot 75.5
Guava 124.5
Creating Dictionaries
Literal Syntax
d = {"x": 1, "y": 2}
d
{'x': 1, 'y': 2}
Using dict function
dict(x=1, y=2)
{'x': 1, 'y': 2}
The dict function can also be used to create a new dict from an existing one and add/update some entries.
d = {"x": 1, "y": 2}
dict(d, z=3)
{'x': 1, 'y': 2, 'z': 3}
dict(d, x=10, z=3)
{'x': 10, 'y': 2, 'z': 3}
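Note that dict(d, ...) builds a new dictionary and leaves d untouched. A small sketch to confirm this, along with an equivalent spelling using ** unpacking (the names d2 and d3 are only for illustration):

```python
d = {"x": 1, "y": 2}

# dict(d, ...) returns a new dictionary; d itself is not modified.
d2 = dict(d, x=10, z=3)
print(d2)   # {'x': 10, 'y': 2, 'z': 3}
print(d)    # {'x': 1, 'y': 2}

# The same copy-and-update can be written with ** unpacking.
d3 = {**d, "x": 10, "z": 3}
print(d3 == d2)   # True
```

The keyword form only works when the keys are strings that are valid Python identifiers. The unpacking form has no such restriction.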
The dict function also accepts a sequence of key-value pairs.
pairs = [("one", 1), ("two", 2), ("three", 3)]
dict(pairs)
{'one': 1, 'two': 2, 'three': 3}
names = ["Alice", "Bob", "Charlie"]
scores = [10, 20, 30]
dict(zip(names, scores))
{'Alice': 10, 'Bob': 20, 'Charlie': 30}
Dictionary Comprehensions
squares = {i: i*i for i in range(5)}
squares
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
import os
{f: os.path.getsize(f) for f in os.listdir(".") if f.endswith(".py")}
{'hello3.py': 515,
'copy_file.py': 20,
'sq.py': 396,
'head.py': 152,
'date.py': 35,
'args.py': 28,
'echo.py': 55,
'countdown.py': 12,
'square.py': 16,
'sum.py': 84,
'hello.py': 292,
'hello2.py': 454}
py_files = {f: os.path.getsize(f) for f in os.listdir(".") if f.endswith(".py")}
py_files
{'hello3.py': 515,
'copy_file.py': 20,
'sq.py': 396,
'head.py': 152,
'date.py': 35,
'args.py': 28,
'echo.py': 55,
'countdown.py': 12,
'square.py': 16,
'sum.py': 84,
'hello.py': 292,
'hello2.py': 454}
Iterating over Dictionaries
d = {"x": 1, "y": 2, "z": 3}
d.keys()
dict_keys(['x', 'y', 'z'])
d.values()
dict_values([1, 2, 3])
d.items()
dict_items([('x', 1), ('y', 2), ('z', 3)])
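The objects returned by keys, values and items are views rather than lists. A brief sketch of what that means in practice, using the same d as above: a view can be converted to a list, and it reflects later changes to the dictionary.

```python
d = {"x": 1, "y": 2, "z": 3}
keys = d.keys()

# A view can be converted to a list when a real list is needed.
print(list(keys))     # ['x', 'y', 'z']

# A view reflects later changes to the dictionary.
d["w"] = 4
print(list(keys))     # ['x', 'y', 'z', 'w']
print("w" in keys)    # True
```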
To iterate over keys:
for k in d.keys():
    print(k)
x
y
z
for k in d:
    print(k)
x
y
z
for k in d:
    print(k, d[k])
x 1
y 2
z 3
To iterate over values:
for v in d.values():
    print(v)
1
2
3
To iterate over key-value pairs:
for k, v in d.items():
    print(k, v)
x 1
y 2
z 3
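Iterating over items combines naturally with the dictionary comprehensions shown earlier. The following sketch builds new dictionaries from an existing one. The names big and scaled, and the threshold and factor used, are illustrative assumptions.

```python
d = {"x": 1, "y": 2, "z": 3}

# Keep only the entries whose value is greater than 1.
big = {k: v for k, v in d.items() if v > 1}
print(big)      # {'y': 2, 'z': 3}

# Build a new dictionary with the same keys and scaled values.
scaled = {k: v * 10 for k, v in d.items()}
print(scaled)   # {'x': 10, 'y': 20, 'z': 30}
```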
Example: Marks of a student
marks = {
    "English": 89,
    "Maths": 87,
    "Science": 65
}
marks
{'English': 89, 'Maths': 87, 'Science': 65}
for subject, score in marks.items():
    print(subject, score)
English 89
Maths 87
Science 65
How to find the total marks?
sum(marks.values())
241
for subject, score in marks.items():
    print(subject, score)
print("---")
print("Total", sum(marks.values()))
English 89
Maths 87
Science 65
---
Total 241
A small puzzle now: in which subject did the student get the highest marks?
marks
{'English': 89, 'Maths': 87, 'Science': 65}
# this gives the highest score, not the subject
max(marks.values())
89
# this compares the subject names alphabetically
max(marks.keys())
'Science'
# this compares the subjects by their scores
def get_score(subject):
    return marks[subject]
max(marks.keys(), key=get_score)
'English'
marks = {
    "English": 79,
    "Maths": 87,
    "Science": 65
}
max(marks.keys(), key=get_score)
'Maths'
How to handle ties?
# when there is a tie, max returns the first of the tied keys
marks = {
    "English": 87,
    "Maths": 87,
    "Science": 65
}
def get_score(subject):
    return marks[subject]
max(marks, key=get_score)
'English'
marks = {
    "Maths": 87,
    "English": 87,
    "Science": 65
}
def get_score(subject):
    return marks[subject]
max(marks, key=get_score)
'Maths'
What if we want to take the subject with the longest name in case of ties?
marks = {
    "Maths": 87,
    "English": 87,
    "Science": 65
}
def get_score(subject):
    return marks[subject], len(subject)
max(marks, key=get_score)
'English'
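Another way to deal with ties is to report every subject that shares the highest score instead of picking one. A minimal sketch, assuming the same marks as above (the names top_score and top_subjects are illustrative):

```python
marks = {"Maths": 87, "English": 87, "Science": 65}

top_score = max(marks.values())

# Collect every subject whose score equals the highest score.
top_subjects = [subject for subject, score in marks.items() if score == top_score]
print(top_subjects)   # ['Maths', 'English']

# marks.get can serve as the key function instead of a helper like get_score.
print(max(marks, key=marks.get))   # Maths
```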
Problem: invertdict
%load_problem invertdict
Write a function invertdict to interchange the keys and values in a dictionary.
The function should take a dictionary as an argument and return a new dictionary with keys and values interchanged. For simplicity, assume that the values in the dictionary are unique.
>>> invertdict({"x": 1, "y": 2, "z": 3})
{1: "x", 2: "y", 3: "z"}
You can verify your solution using:
%verify_problem invertdict
# your code here
A note on performance
numbers = list(range(1000000))
1000000 in numbers
False
%time 1000000 in numbers
CPU times: user 10.6 ms, sys: 0 ns, total: 10.6 ms
Wall time: 12.9 ms
False
%timeit 1000000 in numbers
9.57 ms ± 483 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
numdict = {n: n for n in numbers}
%timeit 1000000 in numdict
50.6 ns ± 4.04 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
numset = set(range(1000000))
%timeit 1000000 in numset
45.8 ns ± 3.29 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
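The difference comes from how membership is tested: a list is searched element by element, while dictionaries and sets use hashing to jump to the right slot. The %time and %timeit magics only work in IPython or Jupyter. A rough sketch of the same comparison in plain Python, using the standard timeit module, could look like this. The size n and the repeat count of 100 are arbitrary choices, and the timings will differ from machine to machine.

```python
import timeit

n = 100_000
numbers = list(range(n))
numset = set(numbers)

# n itself is not present, so the list search has to scan every element.
list_time = timeit.timeit(lambda: n in numbers, number=100)
set_time = timeit.timeit(lambda: n in numset, number=100)

print(f"list: {list_time:.6f} s for 100 lookups")
print(f"set:  {set_time:.6f} s for 100 lookups")
```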
Sets
A set is an unordered collection of unique values.
x = {1, 2, 3}
1 in x
True
type(x)
set
How to create an empty set?
x = {}
type(x)
dict
So {} creates a dict, not a set.
x = set()
x
set()
How do we find the unique elements in a list?
names = ["a", "b", "c", "d", "a"]
# naive way
def unique(values):
    result = []
    for v in values:
        if v not in result:
            result.append(v)
    return result
unique(names)
['a', 'b', 'c', 'd']
%time unique(range(1000));
CPU times: user 4.12 ms, sys: 0 ns, total: 4.12 ms
Wall time: 4.12 ms
%time unique(range(10000));
CPU times: user 461 ms, sys: 3.79 ms, total: 465 ms
Wall time: 481 ms
%time unique(range(20000));
CPU times: user 1.76 s, sys: 5.57 ms, total: 1.76 s
Wall time: 1.79 s
%time unique(range(50000));
CPU times: user 11 s, sys: 22.4 ms, total: 11 s
Wall time: 11.2 s
# using a set
def unique(values):
    return list(set(values))
%time unique(range(10000));
CPU times: user 566 µs, sys: 0 ns, total: 566 µs
Wall time: 565 µs
%time unique(range(100000));
CPU times: user 11.3 ms, sys: 0 ns, total: 11.3 ms
Wall time: 11.3 ms
%time unique(range(1000000));
CPU times: user 63.7 ms, sys: 28.6 ms, total: 92.3 ms
Wall time: 105 ms
%time unique(range(10000000));
CPU times: user 529 ms, sys: 683 ms, total: 1.21 s
Wall time: 1.28 s
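One caveat: list(set(values)) does not promise to keep the elements in their original order. If the order matters, a set can still be used to remember what has been seen while a list collects the result. The following is one possible sketch. The name unique_ordered is made up for this example, and it assumes the values are hashable.

```python
def unique_ordered(values):
    """Return the unique values, keeping the order of first appearance."""
    seen = set()      # fast membership tests
    result = []       # preserves the original order
    for v in values:
        if v not in seen:
            seen.add(v)
            result.append(v)
    return result

names = ["a", "b", "c", "d", "a"]
print(unique_ordered(names))         # ['a', 'b', 'c', 'd']

# Dictionaries keep insertion order, so this one-liner gives the same result.
print(list(dict.fromkeys(names)))    # ['a', 'b', 'c', 'd']
```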
products1 = {"Apple", "Banana", "Guava"}
products2 = {"Apple", "Carrot", "Guava"}
products1 - products2
{'Banana'}
products2 - products1
{'Carrot'}
products1 | products2
{'Apple', 'Banana', 'Carrot', 'Guava'}
products1 & products2
{'Apple', 'Guava'}
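The operators above are difference (-), union (|) and intersection (&). As a short supplementary sketch, each operator also has a method form, and sets support a few more operations such as symmetric difference and subset tests. sorted is used here only to make the printed order predictable.

```python
products1 = {"Apple", "Banana", "Guava"}
products2 = {"Apple", "Carrot", "Guava"}

# Each operator has an equivalent method.
print(products1 - products2 == products1.difference(products2))     # True
print(products1 | products2 == products1.union(products2))          # True
print(products1 & products2 == products1.intersection(products2))   # True

# Symmetric difference: items that are in exactly one of the two sets.
print(sorted(products1 ^ products2))     # ['Banana', 'Carrot']

# Subset test: is every item of the left set also in the right set?
print({"Apple", "Guava"} <= products1)   # True
```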
Understanding Python Execution Environment
x = 1
x = 1 + 2
%%file variables.py
x = 1
name = "python"
g = globals()
print(g['x'])
print(g['name'])
print(type(g))
g['x'] = 10
print(x)
Overwriting variables.py
!python variables.py
1
python
<class 'dict'>
10
x
3
z
NameError: name 'z' is not defined
g = globals()
g['z'] = 5
z
5
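The cells above show that global variables live in an ordinary dictionary: reading a variable is a lookup in that dictionary, and adding a key creates a variable. One way to see this in isolation is to run code against a dictionary of our own using the built-in exec function. This is only an illustrative sketch, and the name namespace is an assumption made for the example.

```python
namespace = {}

# Run two assignments using `namespace` as the global namespace.
exec("x = 1\nname = 'python'", namespace)
print(namespace["x"])      # 1
print(namespace["name"])   # python

# Changing the dictionary changes what later code sees as the variable x.
namespace["x"] = 10
exec("y = x + 1", namespace)
print(namespace["y"])      # 11
```

Python also adds a `__builtins__` entry to the dictionary so that functions like print are available to the executed code.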