Formal Languages

Formal Languages in FAANG Interviews

Google L5 system design: 'Design a query language for a distributed key-value store'. The right answer starts from the Chomsky hierarchy: regular for simple lookups, CFG for conditional queries. Knowing the theory gives you a structured answer where others are guessing.

  • LeetCode 10 (Regex Matching): NFA via DP. Shows up in Google, Meta, and Amazon screens
  • Amazon system design: workflow state machine, a statechart with explicit transitions
  • Meta infrastructure: a query language for logging. You need to know the trade-offs of Turing-complete vs constrained languages
  • Google Research: NP-hardness proofs for scheduling problems in interviews

Automata in Interviews

Automata questions show up at Google and Meta as tasks about regular expressions, lexers, and state machines. Direct 'build a DFA for this language' questions are rarer.

  • LeetCode 10 (Regex Matching): NFA via DP, a typical Google L5+ question
  • LeetCode 44 (Wildcard Matching): DFA with backtracking or DP, similar in spirit
  • Valid Number (LeetCode 65): build an explicit DFA for numeric literals, pure automaton
  • IP address validation: a simple DFA, comes up in Meta screens
  • State machine for workflow: Amazon system design often asks for a statechart

Why is regex matching solved via DP rather than by constructing an explicit NFA?

Complexity Theory in Interviews

Complexity theory in a FAANG interview shows up as system design questions ('why don't we use algorithm X here?') and as questions about the complexity of specific tasks. Direct theoretical questions are less common but do appear in research roles.

  • P vs NP: be able to explain it in 30 seconds without jargon. Required for L6+
  • NP-completeness: know 5-6 examples and how to discuss approximations
  • Reduction: know how to prove a problem NP-hard via reduction from SAT or 3-SAT
  • Undecidability: understand why some static analysis problems are undecidable (Rice)
  • Halting problem: the practical corollary is that you cannot statically detect all infinite loops

How would you explain P vs NP in 30 seconds without theoretical jargon?

Language Design: System Design Questions

System design questions at Amazon and Google sometimes ask you to design a DSL or query language. Knowledge of language theory gives concrete leverage here.

Why can a SQL optimizer reorder WHERE conditions while a Python interpreter cannot?

Proofs and Pitfalls

At research-level interviews (Google Research, DeepMind, Meta AI Research) you may be asked to sketch a proof. It pays to know the standard techniques and the common pitfalls.

  • Pitfall: 'exponential' does not mean 'NP-hard'. There are exponential problems that are not NP-complete
  • Pitfall: NP-hard != NP-complete. NP-hard can sit outside NP (for instance PSPACE-hard)
  • Pitfall: reduce FROM an easy problem TO a hard one (not the other way) to prove hardness
  • Pitfall: 2-SAT is in P, 3-SAT is NP-complete. Know the boundary
  • Pitfall: co-NP != NP (conjecturally). co-NP-complete != NP-complete

How do you prove a new problem B is NP-hard via reduction from 3-SAT?

Key Takeaways

  • Automata: regex matching as an implicit NFA via DP. LeetCode Hard, Google classic
  • Complexity: P vs NP in 30 seconds, plus 5-6 NP-complete examples
  • Language design: the Chomsky hierarchy structures the choice of parser and optimizer
  • Proofs: pumping lemma, diagonalization, reduction. Three standard techniques

Related Topics

Interview questions cover the entire formal languages curriculum:

  • Finite automata — Regex matching (LeetCode 10), IP validation, number parsing. All implicit DFA/NFA
  • NP-completeness — Scheduling, packing, and partitioning problems. Hardness proofs via reduction
  • Rice's theorem — Static analysis and linters cannot be complete. Undecidability bounds tooling
  • Parser combinators — DSL design: choosing between regex, CFG parser, and a full language defines optimizability

Вопросы для размышления

  • How do you explain the difference between 'our algorithm is NP-hard' and 'our problem is NP-hard' to an interviewer?
  • Why is GraphQL intentionally not Turing-complete, and what advantages does that give the server?
  • How does Rice's theorem explain why TypeScript cannot fully prove the absence of runtime errors?

Связанные уроки

  • comp-07-regex-in-lexer
  • comp-10-cfg
Formal Languages in FAANG Interviews

0

1

Sign In