Lists, Tuples, Dictionaries & Sets
Real programs don’t work with one value at a time; they work with collections. A list of servers. A dictionary of ticket details. A set of unique user IDs. This module teaches you Python’s four built-in collection types and, crucially, when to reach for each one. Get this right and your code becomes dramatically easier to write and read.
So far you’ve stored one value per variable. That works fine for a single server name or a single ticket count. But what about a list of 200 servers? Or a record with a server’s name, IP address, CPU usage, and online status all together? You need collections, and Python gives you four excellent ones.
Lists: Your Most-Used Collection
A list is an ordered sequence of items. The items can be of any type: strings, numbers, booleans, or even other lists. You create a list with square brackets and separate items with commas.
Lists are indexed, meaning every item has a position number starting at zero. The first item is at index 0, the second at index 1, and so on. Python also supports negative indexing: index -1 always refers to the last item, -2 to the second-to-last, regardless of how long the list is.
[Figure: index diagram for a list. Blue = positive index · Orange = negative index (counted from the end)]
Slicing lets you extract a portion of a list: servers[1:3] gives items at index 1 and 2 (the stop index is excluded). servers[:2] gives the first two items. servers[-2:] gives the last two. This notation feels strange at first but becomes second nature very quickly.
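To make the indexing and slicing rules concrete, here is a minimal sketch. The server names are illustrative sample data, and the result variables exist only for this demonstration.

```python
servers = ["web-01", "db-01", "api-01", "cache-01"]

first = servers[0]        # index 0 is the first item
last = servers[-1]        # index -1 is always the last item
middle = servers[1:3]     # items at index 1 and 2; index 3 is excluded
first_two = servers[:2]   # omitting the start means "from the beginning"
last_two = servers[-2:]   # omitting the stop means "through the end"

print(first, last)
print(middle, first_two, last_two)
```

A slice always returns a new list, so the original servers list is left untouched.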
Lists come with a rich set of methods. The ones you’ll use most: .append() adds one item to the end, .extend() adds multiple items, .insert(index, item) adds at a specific position, .remove(item) deletes the first matching item, .pop() removes and returns the last item, .sort() sorts in place, and .reverse() flips the order.
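Two details of these methods tend to trip people up: the difference between .append() and .extend(), and the fact that .sort() modifies the list rather than returning a new one. The sketch below illustrates both; the names used are made up for the example.

```python
servers = ["web-01", "db-01"]

servers.append("api-01")                # adds one item to the end
servers.extend(["cache-01", "lb-01"])   # adds each item from another list
servers.insert(1, "proxy-01")           # later items shift one place right
servers.remove("db-01")                 # deletes the first match by value
last = servers.pop()                    # removes and returns the last item

# append() with a list adds that list as a single nested item
nested = ["a"]
nested.append(["b", "c"])

# sort() changes the list in place and returns None
result = servers.sort()

print(servers)
print(last, nested, result)
```

If you need a sorted copy while keeping the original order, the built-in sorted() function returns a new list instead.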
Tuples: Fixed Collections That Don’t Change
A tuple looks just like a list but uses parentheses instead of square brackets, and, crucially, once created it cannot be changed. You can’t add items, remove items, or change any item in a tuple. This immutability is the whole point.
Use tuples for data that represents a fixed “record” or “coordinate”: things like (latitude, longitude), (host, port, database), or (255, 128, 0) for an RGB colour. If someone reads your code and sees a tuple, they immediately know: this data is not supposed to change. A list signals flexibility; a tuple signals permanence.
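As a quick illustration of what immutability means in practice, the sketch below tries to modify the RGB tuple mentioned above and catches the resulting error. The variable names are hypothetical.

```python
rgb = (255, 128, 0)

try:
    rgb[0] = 0                     # tuples do not support item assignment
except TypeError as exc:
    error = str(exc)
    print(f"Cannot modify tuple: {error}")

# To "change" a tuple, build a new one instead
darker = (rgb[0] // 2, rgb[1] // 2, rgb[2] // 2)
print(darker)
```

The original tuple is never altered; the name darker simply refers to a second, separate tuple.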
Tuples support indexing and slicing just like lists, and they can be unpacked (split into separate variables) in one line:
db_config = ("ora-prod-01", 1521, "ORCL")
# Unpack into individual variables
host, port, service = db_config
print(f"Connecting to {host}:{port}/{service}")
# Connecting to ora-prod-01:1521/ORCL
Dictionaries: Structured Records with Named Fields
A dictionary stores key-value pairs. Instead of accessing items by position (like a list), you access them by name. This is enormously useful for representing structured objects: a server record, a ticket, a user profile, a configuration block.
Keys are usually strings (though they can be any immutable type). Values can be anything at all: strings, numbers, booleans, lists, even other dictionaries. You create a dictionary with curly braces: {"key": value}. You access a value with dict["key"] or the safer dict.get("key"), which returns None instead of raising an error if the key doesn’t exist.
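The difference between square-bracket access and .get() is easiest to see side by side. This is a minimal sketch with made-up server data:

```python
server = {"name": "prod-db-01", "cpu": 74}

name = server["name"]                  # key exists, so this is safe
disk = server.get("disk")              # missing key gives None
disk_or_zero = server.get("disk", 0)   # missing key gives the default

try:
    server["disk"]                     # missing key raises KeyError
except KeyError as exc:
    missing = exc.args[0]
    print(f"Missing key: {missing}")

print(name, disk, disk_or_zero)
```

Square brackets are a reasonable choice when a missing key indicates a real bug; .get() suits optional fields.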
Dictionaries are the Python equivalent of a JSON object, and since most APIs return JSON, you’ll be working with dictionaries constantly if you’re in DevOps, AI/ML, or automation work.
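To show the JSON connection, the sketch below parses a JSON string with the standard library json module. The payload is a hypothetical API response invented for this example, not output from any real service.

```python
import json

# Hypothetical API response body, made up for illustration
payload = '{"name": "web-01", "online": true, "tags": ["prod", "web"], "owner": null}'

record = json.loads(payload)   # JSON object becomes a Python dict

print(type(record).__name__)   # dict
print(record["name"], record["online"], record["tags"], record["owner"])

# Going the other way: dict back to a JSON string
text = json.dumps(record)
```

Note how JSON true becomes Python True, null becomes None, and a JSON array becomes a list.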
Sets: Unique Items & Membership Operations
A set is an unordered collection that automatically eliminates duplicates. Every item in a set appears exactly once, no matter how many times you add it. Sets are created with curly braces (like dicts) but with no key-value pairs, just values: {1, 2, 3}. To create an empty set you must use set(), not {}, which creates an empty dict.
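A short sketch of both points, duplicate elimination and the empty-set pitfall, using illustrative values:

```python
ids = {3, 1, 2, 3, 1}      # duplicates are dropped on creation
ids.add(2)                 # adding an existing item changes nothing

empty_set = set()          # the only way to write an empty set
empty_dict = {}            # empty braces create a dict, not a set

print(len(ids))
print(type(empty_set).__name__, type(empty_dict).__name__)
```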
Sets shine in two situations: when you need to deduplicate a list quickly, and when you need to compare two collections using set operations. Python’s set operations are clean and fast:
| Operation | Python syntax | What it gives you |
|---|---|---|
| Union | A \| B or A.union(B) | All items in A or B (or both) |
| Intersection | A & B or A.intersection(B) | Items that appear in both A and B |
| Difference | A - B or A.difference(B) | Items in A that are NOT in B |
| Symmetric diff | A ^ B | Items in A or B but NOT both |
| Subset check | A.issubset(B) | True if every item in A is also in B |
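Each row of the table can be tried directly. The following sketch uses two small, made-up sets of names:

```python
a = {"alice", "bob", "carol"}
b = {"carol", "dave"}

union = a | b              # everyone in either set
common = a & b             # only items present in both
only_a = a - b             # items in a that are not in b
either = a ^ b             # items in exactly one of the two sets
is_sub = {"carol"}.issubset(a)

# sorted() gives a predictable order for printing, since sets are unordered
print(sorted(union))
print(sorted(common), sorted(only_a), sorted(either), is_sub)
```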
Nested Structures: Lists of Dicts & Dicts of Lists
The real power of Python’s data structures comes from combining them. A list of dictionaries is one of the most common patterns in professional Python; it’s how you represent a table of records. Each dictionary is one row; each key is a column name.
This pattern matches exactly how data arrives from databases (list of rows), REST APIs (list of JSON objects), and CSV files (list of records). Once you understand list-of-dicts, you’ll recognise it everywhere.
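As one example of the pattern appearing in the wild, the standard library csv.DictReader yields one dictionary per row. The sketch below reads from an in-memory string rather than a real file; the CSV content is invented for the illustration.

```python
import csv
import io

# Made-up CSV data held in memory so the example needs no file on disk
csv_text = "name,env,cpu\nweb-01,prod,45\ndb-01,prod,82\n"

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Every value read from CSV is a string, so convert numbers explicitly
for row in rows:
    row["cpu"] = int(row["cpu"])

busy = [row["name"] for row in rows if row["cpu"] > 50]
print(rows)
print(busy)
```

Once the rows are loaded, the same filtering and sorting techniques apply regardless of where the data came from.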
A dictionary of lists is also very common; it groups items by category. For example, {"production": ["web-01", "db-01"], "dev": ["dev-01", "dev-02"]} maps environments to their server lists. You’ll build these patterns constantly once you reach the file-handling and database modules.
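A minimal sketch of working with a dictionary of lists follows. It uses dict.setdefault() as one convenient way to add to a group that may not exist yet; the environment and server names are illustrative.

```python
servers_by_env = {
    "production": ["web-01", "db-01"],
    "dev": ["dev-01", "dev-02"],
}

# Add to an existing group
servers_by_env["dev"].append("dev-03")

# setdefault() returns the existing list, or inserts and returns a new one
servers_by_env.setdefault("staging", []).append("stg-01")

# Summarise: number of servers per environment
counts = {env: len(names) for env, names in servers_by_env.items()}
print(counts)
```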
Lists: create, access, modify, loop
# Create
servers = ["web-01", "db-01", "api-01", "cache-01"]
empty = []
# Access by index
servers[0] # "web-01" (first)
servers[-1] # "cache-01" (last)
servers[1:3] # ["db-01", "api-01"]
servers[:2] # ["web-01", "db-01"]
servers[-2:] # ["api-01", "cache-01"]
# Modify
servers.append("lb-01") # add to end
servers.insert(1, "proxy-01") # insert at index 1
servers.remove("cache-01") # remove by value
servers.pop() # remove & return last item
servers.pop(0) # remove & return item at index 0
servers[0] = "web-prod-01" # replace by index
# Useful operations
len(servers) # number of items
"db-01" in servers # True / False membership check
servers.sort() # sort alphabetically in-place
servers.sort(reverse=True) # reverse sort
sorted(servers) # returns new sorted list (original unchanged)
servers.reverse() # reverse in-place
servers.count("db-01") # how many times item appears
servers.index("api-01") # index position of item
# List comprehensions
upper = [s.upper() for s in servers]
db_only = [s for s in servers if s.startswith("db")]
lengths = [len(s) for s in servers]
Tuples: create, access, unpack
# Create
db_config = ("ora-prod-01", 1521, "ORCL")
single = ("only-one",) # trailing comma required for single-item tuple
coords = 19.0760, 72.8777 # parentheses optional; still a tuple
# Access (same as lists)
db_config[0] # "ora-prod-01"
db_config[-1] # "ORCL"
# Unpack: a clean way to split a tuple into named variables
host, port, service = db_config
lat, lon = coords
# Useful operations
len(db_config) # 3
1521 in db_config # True
db_config.count("ORCL") # 1
db_config.index(1521) # 1
# Convert between list and tuple
as_list = list(db_config)
as_tuple = tuple(as_list)
Dictionaries: create, access, modify, loop
# Create
server = {
    "name": "prod-db-01",
    "ip": "10.0.1.15",
    "cpu": 74,
    "online": True
}
empty = {}
# Access
server["name"] # "prod-db-01"; raises KeyError if missing
server.get("cpu") # 74; returns None if missing
server.get("disk", 0) # 0; returns default if missing
# Add & update
server["disk"] = 88 # add new key
server["cpu"] = 81 # update existing key
server.update({"mem": 62, "env": "production"}) # bulk update
# Delete
del server["disk"] # remove key; raises KeyError if missing
server.pop("mem", None) # remove & return; safe, no error if missing
# Check membership
"cpu" in server # True; checks keys only
"cpu" in server.values() # False; checks values
# Loop patterns
for key in server: # iterate over keys
    print(key)
for key, value in server.items(): # iterate over key-value pairs
    print(f"{key}: {value}")
for value in server.values(): # iterate over values only
    print(value)
# Useful operations
len(server) # number of key-value pairs
server.keys() # dict_keys([...]): all key names
server.values() # dict_values([...]): all values
server.items() # dict_items([('name','prod-db-01'),...]): both
Sets: create, modify, operations
# Create
team_a = {"alice", "bob", "carol"}
team_b = {"carol", "dave", "eve"}
empty = set() # NOT {}; that creates an empty dict
# Deduplicate a list instantly
ip_list = ["10.0.1.1", "10.0.1.2", "10.0.1.1", "10.0.1.3"]
unique_ips = set(ip_list) # {'10.0.1.1', '10.0.1.2', '10.0.1.3'} (order may vary)
# Modify
team_a.add("frank") # add one item
team_a.update(["grace", "henry"]) # add multiple items
team_a.remove("bob") # remove; raises KeyError if missing
team_a.discard("nobody") # remove safely; no error if missing
# Set operations
team_a | team_b # union: everyone in either team
team_a & team_b # intersection: in both teams
team_a - team_b # difference: in A but not B
team_a ^ team_b # symmetric diff: in one but not both
# Membership check (very fast; much faster than a list for large collections)
"carol" in team_a # True
"dave" in team_a # False
Nested structures: list of dicts
# The most common pattern: list of dictionaries
fleet = [
    {"name": "web-01", "env": "prod", "cpu": 45, "online": True},
    {"name": "db-01", "env": "prod", "cpu": 82, "online": True},
    {"name": "dev-01", "env": "dev", "cpu": 12, "online": False},
]
# Access a single field
fleet[0]["name"] # "web-01"
fleet[1]["cpu"] # 82
# Loop and filter
online_servers = [s for s in fleet if s["online"]]
# Sort by a field
by_cpu = sorted(fleet, key=lambda s: s["cpu"], reverse=True)
# Group by environment (dict of lists)
by_env = {}
for s in fleet:
    env = s["env"]
    if env not in by_env:
        by_env[env] = []
    by_env[env].append(s["name"])
# {'prod': ['web-01', 'db-01'], 'dev': ['dev-01']}
Each example uses the data structure that’s genuinely the right fit for the problem, not just the first one that comes to mind. Noticing why each type was chosen is as important as reading the code itself.
Uses a list to manage an IT ticket queue: adding new tickets, escalating one to the front, closing resolved ones, and printing the current queue. Lists are right here because order matters and items change constantly.
# IT Support: Ticket queue as a list
queue = ["TKT-1041", "TKT-1035", "TKT-1029"]
# Add a new ticket to the back of the queue
queue.append("TKT-1055")
# Escalate a critical ticket: move it to the front
critical = "TKT-1062"
queue.insert(0, critical)
# Resolve the ticket at the front of the queue
resolved = queue.pop(0)
print(f"Resolved: {resolved}")
# Remove a ticket that was cancelled
if "TKT-1029" in queue:
    queue.remove("TKT-1029")
    print("TKT-1029 cancelled and removed")
# List comprehension: extract only high-priority IDs (ticket num > 1040)
high_priority = [t for t in queue if int(t.split("-")[1]) > 1040]
print(f"\nCurrent queue ({len(queue)} tickets):")
for i, t in enumerate(queue, 1):
    print(f" {i}. {t}")
print(f"\nHigh priority: {high_priority}")
Output:
Resolved: TKT-1062
TKT-1029 cancelled and removed

Current queue (3 tickets):
 1. TKT-1041
 2. TKT-1035
 3. TKT-1055

High priority: ['TKT-1041', 'TKT-1055']
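Represents database query results as a list of dictionaries. Python database libraries such as cx_Oracle and psycopg2 return rows as tuples by default, but they can be configured to return dictionaries, and the list-of-dicts shape is a common way to work with rows by column name. The example filters, sorts, and summarises the data before display. A tuple holds the column definitions, which shouldn’t change.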
# Database: Process query results (list of dicts pattern)
# Simulates rows returned from: SELECT * FROM employees
employees = [
    {"id": 1, "name": "Priya Sharma", "dept": "IT", "salary": 85000, "active": True},
    {"id": 2, "name": "Rahul Mehta", "dept": "IT", "salary": 92000, "active": True},
    {"id": 3, "name": "Anita Patel", "dept": "Finance", "salary": 78000, "active": True},
    {"id": 4, "name": "Suresh Kumar", "dept": "IT", "salary": 67000, "active": False},
    {"id": 5, "name": "Meera Nair", "dept": "Finance", "salary": 95000, "active": True},
]
# Immutable column definition (tuple: should never change)
columns = ("id", "name", "dept", "salary", "active")
# Filter: active IT employees only
it_active = [e for e in employees
             if e["dept"] == "IT" and e["active"]]
# Sort by salary descending
it_active.sort(key=lambda e: e["salary"], reverse=True)
# Summary stats
salaries = [e["salary"] for e in it_active]
total_sal = sum(salaries)
avg_sal = total_sal / len(salaries)
print(f"{'Name':16} {'Dept':10} {'Salary':>10}")
print("-" * 40)
for e in it_active:
    print(f"{e['name']:16} {e['dept']:10} {e['salary']:>10,}")
print("-" * 40)
print(f"{'Total':27} {total_sal:>10,}")
print(f"{'Average':27} {avg_sal:>10,.0f}")
Output:
Name             Dept           Salary
----------------------------------------
Rahul Mehta      IT             92,000
Priya Sharma     IT             85,000
----------------------------------------
Total                          177,000
Average                         88,500
Stores environment configuration in a dictionary of dictionaries, demonstrates safe key access with .get(), and generates a deployment summary from the nested structure. DevOps engineers work with config dicts constantly, from YAML to environment variables.
# DevOps: Environment config as nested dict
config = {
    "production": {
        "db_host": "ora-prod-01.internal",
        "db_port": 1521,
        "replicas": 3,
        "ssl": True,
        "log_level": "WARNING",
    },
    "staging": {
        "db_host": "ora-stg-01.internal",
        "db_port": 1521,
        "replicas": 1,
        "ssl": True,
        "log_level": "INFO",
    },
    "dev": {
        "db_host": "localhost",
        "db_port": 5432,
        "replicas": 1,
        "ssl": False,
        "log_level": "DEBUG",
    },
}
target_env = "staging"
env_cfg = config.get(target_env)
if not env_cfg:
    print(f"Error: environment '{target_env}' not found in config.")
else:
    print(f"=== Deployment Config: {target_env.upper()} ===")
    for key, value in env_cfg.items():
        ssl_note = " (warning: disabled)" if key == "ssl" and not value else ""
        print(f" {key:14}: {value}{ssl_note}")
    # Safe access for optional key
    timeout = env_cfg.get("timeout_secs", 30)
    print(f" {'timeout_secs':14}: {timeout} (default)")
Output:
=== Deployment Config: STAGING ===
 db_host       : ora-stg-01.internal
 db_port       : 1521
 replicas      : 1
 ssl           : True
 log_level     : INFO
 timeout_secs  : 30 (default)
Uses sets to build a clean vocabulary from text data: deduplicating tokens, finding words unique to one corpus versus another, and identifying shared terms. Set operations are a fundamental tool in NLP preprocessing.
# AI/ML: Set operations for NLP vocabulary analysis
# Two document corpora (simplified; normally thousands of docs)
corpus_support = [
    "reset password account login failed error system",
    "network timeout connection error server down reset",
    "account locked password reset request user",
]
corpus_billing = [
    "invoice payment failed account error billing",
    "refund request account credit card payment",
    "subscription renewal billing account update",
]
def build_vocab(corpus):
    words = []
    for doc in corpus:
        words.extend(doc.split())
    return set(words) # set() removes all duplicates
vocab_support = build_vocab(corpus_support)
vocab_billing = build_vocab(corpus_billing)
# Set operations
shared_terms = vocab_support & vocab_billing # intersection
support_only = vocab_support - vocab_billing # difference
billing_only = vocab_billing - vocab_support # difference
all_terms = vocab_support | vocab_billing # union
print(f"Support vocabulary : {len(vocab_support)} unique terms")
print(f"Billing vocabulary : {len(vocab_billing)} unique terms")
print(f"Shared terms : {sorted(shared_terms)}")
print(f"Support-only terms : {sorted(support_only)}")
print(f"Billing-only terms : {sorted(billing_only)}")
print(f"Total unique terms : {len(all_terms)}")
Output:
Support vocabulary : 15 unique terms
Billing vocabulary : 13 unique terms
Shared terms : ['account', 'error', 'failed', 'request']
Support-only terms : ['connection', 'down', 'locked', 'login', 'network', 'password', 'reset', 'server', 'system', 'timeout', 'user']
Billing-only terms : ['billing', 'card', 'credit', 'invoice', 'payment', 'refund', 'renewal', 'subscription', 'update']
Total unique terms : 24
Combines all four data structure types in one program: a tuple for immutable version info, a list to track deployment order, a dict of lists to group deployments by environment, and a set to track which services have been deployed. This is the kind of structure a release automation script would maintain.
# Automation: Release tracker using all four data structures
# Tuple: immutable release metadata
release = ("v2.4.1", "2026-06-14", "hotfix")
version, date, rel_type = release
# List: ordered deployment steps
deploy_order = ["auth-service", "api-gateway", "user-service",
                "notification-service", "report-service"]
# Dict of lists: services by environment
environments = {
    "dev": ["auth-service", "api-gateway"],
    "staging": ["auth-service", "api-gateway", "user-service"],
    "production": ["auth-service", "api-gateway", "user-service",
                   "notification-service", "report-service"],
}
# Set: services already deployed (updated as we go)
deployed = set()
print(f"Release {version} ({rel_type}) - {date}\n")
for env, services in environments.items():
    print(f"[{env.upper()}]")
    for svc in deploy_order:
        if svc not in services:
            continue # skip services not in this env
        if svc in deployed:
            print(f" >> {svc} (already deployed)")
        else:
            print(f" -> Deploying {svc}...")
            deployed.add(svc)
    print()
pending = set(deploy_order) - deployed
print(f"Deployed : {len(deployed)}/{len(deploy_order)} services")
if pending:
    print(f"Pending : {sorted(pending)}")
Output:
Release v2.4.1 (hotfix) - 2026-06-14

[DEV]
 -> Deploying auth-service...
 -> Deploying api-gateway...

[STAGING]
 >> auth-service (already deployed)
 >> api-gateway (already deployed)
 -> Deploying user-service...

[PRODUCTION]
 >> auth-service (already deployed)
 >> api-gateway (already deployed)
 >> user-service (already deployed)
 -> Deploying notification-service...
 -> Deploying report-service...

Deployed : 5/5 services
Each exercise is focused on one specific structure. This is intentional: you need to build intuition for each type individually before you start mixing them. Aim to complete each one without looking at the Syntax Reference first; consult it only if you get stuck.
IT Asset Inventory System
Your company needs a script to manage its IT asset inventory. Build a Python script called asset_inventory.py that uses all four data structure types appropriately.
- Assets list: create a list of at least 8 asset dictionaries. Each asset should have: asset_id (str), type (str: “laptop”, “server”, “network”, or “phone”), owner (str: employee name or “unassigned”), location (str: city name), purchase_year (int), and value_inr (int).
- Immutable config tuple: create a tuple called asset_meta holding the current inventory date, the company name, and the currency code (“INR”). Unpack it into three variables and use them in your output headers.
- Group by type: build a dictionary where each key is an asset type and the value is a list of asset IDs of that type. Print this grouped view.
- Unique locations: use a set to find all unique office locations in the inventory. Print the count and the sorted list of locations.
- Financial summary: calculate and print the total asset value, the average value per asset, the most expensive asset (asset ID and value), and the total value grouped by asset type.
- Ageing report: assets older than 4 years (purchase_year < 2022) should be flagged for review. Use a list comprehension to extract them and print a “Flagged for replacement” list.
Some questions ask you to choose the right data structure for a situation. This is a design skill, and the most important thing you take away from M3. Think about the properties of each type before selecting your answer.
servers = ["web-01", "db-01", "api-01", "cache-01", "lb-01"]
print(servers[1:4])
fleet = [{"name": "web-01", "cpu": 45}, {"name": "db-01", "cpu": 82}, {"name": "api-01", "cpu": 33}]
