Python 3–oriented. Good for data engineering / backend screens. Verify version-specific details (typing, match) before the interview.
Indentation defines blocks (no braces). Convention: 4 spaces.
name = "Ada"
scores = [10, 20, 30]
user = {"id": 1, "active": True}
if user["active"]:
    total = sum(scores)
    print(f"{name}: {total}")
int, float, str, bool, None.
list (ordered, mutable), tuple (ordered, immutable), dict (key → value), set (unique elements).
for x in items:, while cond:, range(n).
def name(a, b=1): … return (optional).
import os, from pathlib import Path.
squares = [x * x for x in range(10) if x % 2 == 0]
by_id = {u["id"]: u for u in users if u.get("active")}
Context managers (with)
Guarantee cleanup (close files, release locks), even if an error happens.
with open("data.csv", encoding="utf-8") as f:
    first = f.readline()
try:
    risky()
except ValueError as e:
    log.warning("bad value: %s", e)
finally:
    cleanup()
*args and **kwargs
Variable positional and keyword arguments, common in wrappers and APIs.
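A minimal sketch of how the two collectors behave, and how the same symbols unpack at a call site. The names `describe_call` and `area` are made up for illustration.

```python
def describe_call(*args, **kwargs):
    """Return what was collected, to show the container types."""
    # args is always a tuple, kwargs is always a dict
    return args, kwargs


positional, keyword = describe_call(1, 2, sep=",")


# Unpacking goes the other way: * spreads a sequence, ** spreads a dict.
def area(width, height):
    return width * height


dims = {"width": 3, "height": 4}
result = area(**dims)
```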
if __name__ == "__main__": runs only when the file is executed directly, not when imported—clean pattern for CLI tools.
A decorator is a function that takes a function and returns a new function (often adding logging, timing, or auth). Know the idea and the @decorator sugar even if you don’t write one from scratch cold.
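One way the timing case might look, as a sketch rather than a canonical implementation. The `timer` decorator and its `last_elapsed` attribute are hypothetical names chosen for this example.

```python
import functools
import time


def timer(func):
    """Hypothetical timing decorator: records the duration of the last call."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            wrapper.last_elapsed = time.perf_counter() - start

    wrapper.last_elapsed = None
    return wrapper


@timer  # same effect as writing: add = timer(add)
def add(a, b):
    return a + b


value = add(2, 3)
```

Using `functools.wraps` is a common choice here so that `add.__name__` still reads as `add` in logs and tracebacks.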
yield
Lazy iteration: one item at a time, low memory. Streaming large files/lines is a classic win.
def lines(path):
    with open(path, encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n")
def total(xs: list[int]) -> int:
    return sum(xs)
Hints are not enforced at runtime; a static checker such as mypy validates them before the code runs. They document contracts and help IDEs.
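A small illustration of what the interpreter does with hints on its own: it stores them and moves on. The function `double` is a made-up example.

```python
def double(x: int) -> int:
    return x * 2


# The annotation says int, but nothing stops a str at call time.
result = double("ab")

# Hints are kept as plain metadata on the function object.
hints = double.__annotations__
```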
asyncio for I/O-bound async code (single-threaded event loop).
threading / multiprocessing cover the I/O-bound vs CPU-bound cases; know which answer interviewers usually want (often: “use multiprocessing for CPU-bound, threads/async for I/O”).

| Topic | Why it shows up |
|---|---|
| venv + pip | Reproducible environments; “works on my machine” antidote. |
| pytest | Small unit tests for transforms (pure functions) catch pipeline bugs early. |
| pandas (if DE role) | Vectorized transforms vs row loops; know when to push work to SQL/Snowflake instead. |
| logging, not print | Structured logs in jobs running in Airflow / containers. |
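For the pytest row, a sketch of what a small test for a pure transform could look like. `normalize_amount` is a hypothetical transform invented for this example; pytest itself only needs functions named `test_*` containing plain `assert` statements.

```python
def normalize_amount(raw: str) -> float:
    """Hypothetical transform: parse a currency string such as ' $1,200.50 '."""
    return float(raw.strip().lstrip("$").replace(",", ""))


# pytest collects functions named test_* and reports any failing assert.
def test_normalize_amount_strips_symbols():
    assert normalize_amount(" $1,200.50 ") == 1200.50


def test_normalize_amount_plain_number():
    assert normalize_amount("42") == 42.0
```

Because the transform takes a value and returns a value, the tests need no fixtures, files, or database.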
Mutable default arguments (def f(x=[])): say why it’s a footgun.
Model answers for rapid recall; deep dives depend on role.
Immutable: int, float, str, tuple, frozenset—new objects on “change”. Mutable: list, dict, set—in-place updates affect all references to that object.
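The distinction can be seen in a few lines, including the mutable-default-argument footgun (`def f(x=[])`). The function names are illustrative only.

```python
a = [1, 2]
b = a            # second name for the same list
b.append(3)      # in-place update is visible through both names

s = "ab"
t = s
t += "c"         # builds a new str; s is untouched


def append_bad(item, bucket=[]):   # default list is created once, at def time
    bucket.append(item)
    return bucket


def append_ok(item, bucket=None):  # usual fix: None sentinel
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_bad(1)
second = append_bad(2)   # the same list object as `first`
```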
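Difference between == and is?
== compares value (calls __eq__). is compares object identity (the same object in memory). Use is / is not for None, never == None in idiomatic Python; for booleans, PEP 8 recommends a plain truthiness test (if flag:) rather than comparing to True or False.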
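A short sketch of value equality versus identity, using two equal lists and an alias. `describe` is a hypothetical helper.

```python
x = [1, 2, 3]
y = [1, 2, 3]   # equal contents, separate object
z = x           # another name for the same object

equal_values = x == y
same_object = x is y
alias = z is x


def describe(value=None):
    # Identity check against the None singleton
    return "missing" if value is None else "present"
```

Note that `describe(0)` counts as present: `is None` does not confuse falsy values with a missing one.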
*args and **kwargs.
*args collects extra positional arguments into a tuple. **kwargs collects keyword arguments into a dict. Used in decorators and flexible APIs.
A decorator is a callable that wraps a function to add behavior (logging, caching, auth). Writing @timer above def f() is (conceptually) the same as f = timer(f) after the definition.
yield?
A generator function uses yield to produce a stream of values lazily. This saves memory vs building a giant list and enables pipeline-style processing.
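Pipeline-style processing might be sketched as chained generators, where each stage pulls from the previous one on demand. The stage names and the in-memory `raw` input are stand-ins for a real file or stream.

```python
def read_numbers(lines):
    for line in lines:
        yield int(line)


def only_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            yield n


raw = iter(["1", "2", "3", "4"])         # stand-in for a large file
pipeline = only_even(read_numbers(raw))  # nothing has run yet

first = next(pipeline)   # pulls just enough input to produce one value
rest = list(pipeline)    # drains whatever is left
```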
The Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time in standard CPython builds, so threads don’t parallelize CPU-bound Python work. Use multiprocessing or native extensions for parallel CPU work, and threads or async for I/O-bound concurrency.
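A sketch of the I/O-bound side of that advice using a thread pool. `fake_fetch` is hypothetical, and `time.sleep` stands in for a network or disk wait.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(n):
    """Hypothetical I/O-bound task; sleep stands in for a network call."""
    time.sleep(0.05)  # sleeping releases the GIL, so other threads can run
    return n * 2


with ThreadPoolExecutor(max_workers=4) as pool:
    # map returns results in input order, even if tasks finish out of order
    results = list(pool.map(fake_fetch, range(4)))
```

For CPU-bound work, `concurrent.futures.ProcessPoolExecutor` offers the same `map` interface but runs tasks in separate processes, each with its own interpreter.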
Shallow: new outer container, inner objects shared. Deep: recursive copy of nested structures—use copy.deepcopy when nested mutables must be independent.
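The difference shows up as soon as a nested object is mutated. A minimal sketch with a made-up `original` dict:

```python
import copy

original = {"tags": ["a", "b"]}
shallow = copy.copy(original)    # new dict, same inner list
deep = copy.deepcopy(original)   # new dict, new inner list

original["tags"].append("c")     # visible through the shallow copy only
```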
What does with open(...) as f do under the hood?
It uses a context manager (__enter__ / __exit__) so the file is closed reliably, including on exceptions.
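To make the protocol visible, here is a sketch of a hand-written context manager that records when each hook runs. The `Tracked` class is hypothetical; file objects implement the same two methods.

```python
class Tracked:
    """Hypothetical context manager that records enter/exit calls."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self  # this is what gets bound to the name after `as`

    def __exit__(self, exc_type, exc, tb):
        self.events.append("exit")
        return False  # False means: do not swallow the exception


res = Tracked()
try:
    with res as r:
        r.events.append("body")
        raise ValueError("boom")
except ValueError:
    pass  # __exit__ has already run by the time we get here
```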
Keys must be hashable (immutable types like str, int, tuple of immutables). Lists and dicts cannot be keys.
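A quick sketch of the hashability rule in action. The `grid` dict is a made-up example, and the exact wording of the error message varies between Python versions.

```python
grid = {(0, 1): "wall"}  # a tuple of ints is hashable

error = None
try:
    grid[[0, 1]] = "wall"  # a list is mutable, so it is unhashable
except TypeError as exc:
    error = str(exc)

# frozenset is the hashable counterpart of set
frozen = {frozenset({"a", "b"}): 1}
```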
“Pass by object reference”: names bind to objects; assigning inside a function rebinds the local name—mutating a mutable object passed in is visible outside, rebinding the parameter is not.
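The two cases side by side, as a minimal sketch with made-up function names:

```python
def mutate(items):
    items.append(99)  # changes the object the caller also holds


def rebind(items):
    items = [0]       # local name now points at a new list
    return items


data = [1, 2]
mutate(data)
returned = rebind(data)  # `data` itself is unaffected by the rebinding
```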
Comprehensions vs map/filter?
Comprehensions are idiomatic and often clearer; map/filter with lambdas can be harder to read. Performance is usually comparable for small data.
What is if __name__ == "__main__" for?
It guards code that should run only when the script is executed directly, not when it is imported as a module, which lets you mix library + CLI in one file.
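A sketch of the library-plus-CLI layout. `normalize` and `main` are hypothetical names; importing the module gives access to `normalize` without triggering `main`.

```python
def normalize(name: str) -> str:
    """Reusable library function: importable without side effects."""
    return name.strip().title()


def main() -> None:
    # CLI entry point: only runs when the file is executed directly
    print(normalize("  ada lovelace "))


if __name__ == "__main__":
    main()
```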
Comprehensions are a compact syntax for building lists/dicts from iterables with optional filtering. They are readable when kept short; nested comprehensions can harm readability.
Robust pipelines combine specific exceptions, retries with backoff for transient failures, dead-letter paths, structured logging, and idempotent stages so replays are safe.
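Retries with backoff could be sketched as follows. `TransientError`, `with_retries`, and the delay values are assumptions made for this example, not a standard API; the `sleep` parameter is injectable so the helper can be tested without waiting.

```python
import logging
import time

log = logging.getLogger(__name__)


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, throttling)."""


def with_retries(task, attempts=3, base_delay=0.01, sleep=time.sleep):
    """Call task(); on TransientError, wait with exponential backoff and retry."""
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except TransientError as exc:
            if attempt == attempts:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * 2 ** (attempt - 1)
            log.warning("attempt %d failed: %s; retrying in %.2fs", attempt, exc, delay)
            sleep(delay)
```

Only `TransientError` is caught, so any other exception fails fast instead of being retried.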
Pair with SQL Reference Guide and SnowPro study for warehouse-side questions.