Discrete Mathematics
Discrete Math in Interviews
A Google Staff Engineer interview can ask 'prove the algorithm is optimal' or 'what is the lower bound here?' A Meta system design round may ask 'calculate the Bloom filter size for 10 billion URLs at 0.1% false positive rate.' Discrete math is not background knowledge - it is the language of engineering arguments at senior levels.
- **Google L6 coding interview:** grid path counting via C(m+n,m), no-consecutive-1s strings via recurrence - tests depth of mathematical thinking beyond LeetCode patterns
- **Meta system design:** 'How many bits for a Bloom filter serving 10^9 URLs at fp=0.01?' - direct application of m = -n*ln(eps)/(ln2)^2
- **Amazon operational excellence:** 'Prove the load balancing strategy avoids hotspots' - pigeonhole principle applied to hash distribution
- **Staff/Principal interviews:** loop invariant proofs for correctness, decision tree arguments for optimality - expected at L6+ levels
Предварительные знания
Counting Problems: Combinatorics in Interviews
In FAANG interviews, 80% of counting problems reduce to 5 patterns: permutation, combination, partition, Catalan, Stirling , collectively appearing in over 1,200 LeetCode problems. Counting problems appear in coding interviews in disguise: 'how many ways to arrange', 'how many paths in a grid', 'how many valid combinations'. The key skill is pattern recognition: identify the counting model (permutations, combinations, stars-and-bars) and translate it to a formula.
**Core counting formulas:** - **Permutations:** P(n,k) = n!/(n-k)! - ordered selection of k from n - **Combinations:** C(n,k) = n!/(k!(n-k)!) - unordered selection of k from n - **Stars and bars:** C(n+k-1, k) - distribute k identical objects into n distinct bins - **Inclusion-exclusion:** |AuBuC| = sum|Ai| - sum|AinAj| + |AnBnC| - **Multinomial:** n!/(n1!*n2!*...*nk!) - permutations of a multiset
Interview pattern recognition: 'choose k from n, order matters' = permutations. 'Choose k from n, order does not matter' = combinations. 'Distribute identical items into distinct bins' = stars and bars. 'At least one / at most k' = inclusion-exclusion or complement counting.
Grid path problems are a standard interview question. The DP solution (O(mn) time and space) is good, but knowing the closed form C(m+n, m) computed in O(min(m,n)) shows mathematical depth. Explain both and let the interviewer choose.
How many ways can one choose a team of 3 developers from a group of 10, if the order of selection does not matter?
Graph Problems: Bipartiteness and Strongly Connected Components
Graph theory appears in FAANG interviews through concrete algorithmic problems. Two fundamental patterns: bipartite checking (can we 2-color the graph?) and finding strongly connected components (SCC). Both use graph traversal but for different structural questions.
**Bipartite graph:** vertices split into two sets U and V, with edges only between U and V. Equivalent characterization: no odd-length cycles. Detection: BFS with 2-coloring in O(V+E). **Strongly connected components (SCC):** in a directed graph, a maximal subgraph where every vertex is reachable from every other. Kosaraju's algorithm: two DFS passes plus transpose graph, O(V+E).
Real applications of SCC: package dependency resolution (circular imports = one SCC), finding mutual friends clusters in social networks (Meta interview favorite), detecting deadlock cycles in distributed systems, and condensation DAGs for dependency ordering.
Always clarify in interviews: directed or undirected? Simple or multigraph? Connected or possibly disconnected? SCC only makes sense for directed graphs. For undirected graphs, use Union-Find for connected components in O(alpha(n)) per query.
A graph has an odd-length cycle. What can one conclude about bipartiteness?
Probability Problems and E[X] in Interviews
Probability puzzles test understanding of conditional probability, Bayes, and linearity of E[X]. Many seem hard but have elegant solutions once one pick the right probability space or apply the indicator variable trick.
**Classic interview probability patterns:** - **Coupon collector:** E[trials to collect all n coupons] = n * H(n) where H(n) = 1 + 1/2 + ... + 1/n - **Geometric retries:** E[attempts until success] = 1/p - **Monty Hall:** P(win by switching) = 2/3 via Bayes update - **Expected comparisons:** quicksort E[comparisons] via indicator variables - **Birthday paradox:** E[people until collision] = sqrt(pi*N/2) for N slots
Monty Hall explained for interviews: before the host opens a door, the initial choice covers 1/3 of probability. The remaining 2/3 is spread across the other two doors. The host's action is constrained (cannot open the door, cannot open the car door), so all the 2/3 probability collapses onto the single remaining door. Switching wins 2/3 of the time.
Coupon collector for interviews: E[X] = n*H(n) ≈ n*ln(n). For n=52 (playing cards), expect ~236 purchases to complete the set. This is why trading card sets sell so well: one need 4.5x as many packs as distinct cards. Frame the math as a product insight, not just a formula.
There are 4 equally likely coupon types. What is E[purchases needed to collect all 4 types]?
Mathematical Reasoning Under Interview Pressure
At Staff/Senior Engineer level, Google and Meta expect one to prove algorithm correctness, justify lower bounds, and explain why the approach is optimal. This is not about memorizing theorems - it is about having a structured reasoning framework one can deploy under pressure.
**Proof techniques for interviews:** - **Loop invariant:** what holds before / during / after each iteration - **Induction:** base case + inductive step (if true for n, prove for n+1) - **Contradiction:** assume A, derive impossibility - **Pigeonhole principle:** n+1 objects in n bins implies some bin has >= 2 - **Double counting:** count the same quantity two different ways
How to structure a proof in an interview: 1. name the technique ('I will use a loop invariant / induction / contradiction') 2. state the invariant or base case precisely 3. show the inductive step or derive the contradiction 4. summarize the conclusion in one sentence. The interviewer checks structure, not perfect formality.
Pigeonhole is the single most underused tool in technical interviews. 'Prove that any hash function with N outputs has a collision for N+1 inputs' - pigeonhole. 'Prove that in any sequence of n^2+1 distinct numbers there is a monotone subsequence of length n+1' - pigeonhole (Erdos-Szekeres theorem). Recognize these patterns and name the principle explicitly.
Why does comparison-based sorting require Omega(n log n) comparisons in the worst case? Which argument is correct?
Key Ideas
- **Counting toolkit:** C(n,k) for unordered selection, P(n,k) for ordered, stars-and-bars C(n+k-1,k) for distributing identical objects, inclusion-exclusion for constrained counts.
- **Bipartite checking:** graph is bipartite iff no odd cycles. BFS 2-coloring in O(V+E). Applications: scheduling, matching, conflict graphs.
- **Strongly connected components:** Kosaraju's two-DFS algorithm in O(V+E). Applications: circular dependencies, deadlock detection, DAG condensation.
- **Probability patterns:** coupon collector E[X]=n*H(n), Monty Hall via Bayes update (switch wins 2/3), quicksort E[comparisons] via indicator variables.
- **Proof techniques:** loop invariants for correctness, decision trees for lower bounds, pigeonhole for collision/existence arguments, induction for recurrences.
Related Topics
This final lesson synthesizes the entire course into interview readiness:
- Discrete Math in CS — NP-completeness recognition and Bloom filter design appear directly in system design interviews
- Random Variables — E[X] via indicator variables and coupon collector are direct applications of linearity of expectation
- Combinatorics — The counting formulas (C(n,k), stars-and-bars) are the foundation for all interview counting problems
Вопросы для размышления
- One are given the problem 'count the number of ways to place n non-attacking rooks on an n x n board.' How do one recognize the counting model in under 30 seconds, and what formula do applying?
- How would one explain the pigeonhole principle to a non-technical interviewer who asks one to prove that any hash function with m output slots must have a collision when m+1 distinct inputs are hashed?
- At a Staff Engineer interview, one are asked to prove the distributed task scheduling algorithm is load-balanced. Which mathematical technique would this leads to for, and how would one structure the argument in under 3 minutes?