Formal Languages
Formal Languages in FAANG Interviews
Google L5 system design: 'Design a query language for a distributed key-value store'. The right answer starts from the Chomsky hierarchy: regular for simple lookups, CFG for conditional queries. Knowing the theory gives you a structured answer where others are guessing.
- LeetCode 10 (Regex Matching): NFA via DP. Shows up in Google, Meta, and Amazon screens
- Amazon system design: workflow state machine, a statechart with explicit transitions
- Meta infrastructure: a query language for logging. You need to know the trade-offs of Turing-complete vs constrained languages
- Google Research: NP-hardness proofs for scheduling problems in interviews
Automata in Interviews
Automata questions show up at Google and Meta as tasks about regular expressions, lexers, and state machines. Direct 'build a DFA for this language' questions are rarer.
- LeetCode 10 (Regex Matching): NFA via DP, a typical Google L5+ question
- LeetCode 44 (Wildcard Matching): DFA with backtracking or DP, similar in spirit
- Valid Number (LeetCode 65): build an explicit DFA for numeric literals, pure automaton
- IP address validation: a simple DFA, comes up in Meta screens
- State machine for workflow: Amazon system design often asks for a statechart
Why is regex matching solved via DP rather than by constructing an explicit NFA?
Complexity Theory in Interviews
Complexity theory in a FAANG interview shows up as system design questions ('why don't we use algorithm X here?') and as questions about the complexity of specific tasks. Direct theoretical questions are less common but do appear in research roles.
- P vs NP: be able to explain it in 30 seconds without jargon. Required for L6+
- NP-completeness: know 5-6 examples and how to discuss approximations
- Reduction: know how to prove a problem NP-hard via reduction from SAT or 3-SAT
- Undecidability: understand why some static analysis problems are undecidable (Rice)
- Halting problem: the practical corollary is that you cannot statically detect all infinite loops
How would you explain P vs NP in 30 seconds without theoretical jargon?
Language Design: System Design Questions
System design questions at Amazon and Google sometimes ask you to design a DSL or query language. Knowledge of language theory gives concrete leverage here.
Why can a SQL optimizer reorder WHERE conditions while a Python interpreter cannot?
Proofs and Pitfalls
At research-level interviews (Google Research, DeepMind, Meta AI Research) you may be asked to sketch a proof. It pays to know the standard techniques and the common pitfalls.
- Pitfall: 'exponential' does not mean 'NP-hard'. There are exponential problems that are not NP-complete
- Pitfall: NP-hard != NP-complete. NP-hard can sit outside NP (for instance PSPACE-hard)
- Pitfall: reduce FROM an easy problem TO a hard one (not the other way) to prove hardness
- Pitfall: 2-SAT is in P, 3-SAT is NP-complete. Know the boundary
- Pitfall: co-NP != NP (conjecturally). co-NP-complete != NP-complete
How do you prove a new problem B is NP-hard via reduction from 3-SAT?
Key Takeaways
- Automata: regex matching as an implicit NFA via DP. LeetCode Hard, Google classic
- Complexity: P vs NP in 30 seconds, plus 5-6 NP-complete examples
- Language design: the Chomsky hierarchy structures the choice of parser and optimizer
- Proofs: pumping lemma, diagonalization, reduction. Three standard techniques
Related Topics
Interview questions cover the entire formal languages curriculum:
- Finite automata — Regex matching (LeetCode 10), IP validation, number parsing. All implicit DFA/NFA
- NP-completeness — Scheduling, packing, and partitioning problems. Hardness proofs via reduction
- Rice's theorem — Static analysis and linters cannot be complete. Undecidability bounds tooling
- Parser combinators — DSL design: choosing between regex, CFG parser, and a full language defines optimizability
Вопросы для размышления
- How do you explain the difference between 'our algorithm is NP-hard' and 'our problem is NP-hard' to an interviewer?
- Why is GraphQL intentionally not Turing-complete, and what advantages does that give the server?
- How does Rice's theorem explain why TypeScript cannot fully prove the absence of runtime errors?