Discrete Mathematics

Graph Theory Fundamentals

Graphs are not a textbook abstraction. Git history is a DAG of commits. npm is a dependency graph. The internet is a graph of web pages. Social networks, maps, compilers, databases: all of these reduce to graph problems.

**npm/pip:** when installing a package, the manager topologically sorts the dependency graph to install everything in the correct order
**Google Maps:** BFS or Dijkstra on a weighted road graph builds the optimal route
**Git:** commit history is a DAG, and `git log --graph` visualizes it
**LinkedIn:** 'degrees of separation' is BFS on the professional connection graph

Предварительные знания

Inclusion-Exclusion Principle

What a Graph Is

A graph G = (V, E) is a pair: a set of vertices V and a set of edges E, where each edge connects two vertices. Graphs are a universal language for describing relationships: road networks, module dependencies, social connections, and anything that can be modeled as "things and links between them."

**Undirected graph:** edge (u, v) = (v, u), the relationship is symmetric (friendship on a social network). **Directed graph (digraph):** edge (u → v) ≠ (v → u), the relationship is one-way (package imports, subscriptions, hyperlinks).

Graph type	Edges	Example
Undirected	(u, v) = (v, u)	Road network
Directed	(u → v) ≠ (v → u)	Package dependencies
Weighted	edge carries weight w	Distances between cities
Multigraph	multiple edges between a pair	Flight routes between airports

A graph can be stored in three ways: **adjacency matrix** (O(V²) space, O(1) edge lookup), **adjacency list** (O(V+E) space, efficient for sparse graphs), or **edge list** (compact, convenient for algorithms like Kruskal's). The right choice depends on the graph's density.

Real-world graphs are almost always **sparse** (E << V²): each social network user has hundreds of friends, not billions. Adjacency list is the default choice.

A Python module import system (A imports B, B imports C) needs to be modeled as a graph. Which type fits best?

Vertex Degree and Adjacency

The degree of a vertex deg(v) is the number of edges incident to v. In directed graphs, distinguish in-degree (incoming edges) from out-degree (outgoing edges). Degree is a basic characteristic: high degree means a hub in the network.

**Handshaking lemma:** the sum of all vertex degrees equals twice the number of edges: Σ deg(v) = 2|E|. Consequence: the number of vertices with odd degree is always even.

The adjacency matrix A[i][j] = 1 if there is an edge between vertices i and j, 0 otherwise. For weighted graphs, A[i][j] = edge weight. The k-th power of A gives the number of paths of length k between all pairs of vertices, a powerful analytical tool.

Google's PageRank is based on in-degree: a page is 'important' if edges to it come from other important pages. This generalizes simple degree counting into a recursive importance measure.

An undirected graph has 6 vertices and 9 edges. What is the sum of all vertex degrees?

Traversals: DFS and BFS

Graph traversal means systematically visiting all reachable vertices. Two fundamental methods: **DFS** (depth-first search) goes as deep as possible before backtracking. **BFS** (breadth-first search) explores all vertices at the current distance before moving further.

Criterion	DFS	BFS
Data structure	Stack (recursion)	Queue (deque)
Shortest path	No	Yes (by edge count)
Cycle detection	Yes	Yes
Topological sort	Yes	No
Time complexity	O(V + E)	O(V + E)

DFS on a large graph can cause **stack overflow**: recursion depth reaches V. Use iterative DFS with an explicit stack for production code on graphs with millions of vertices.

BFS is the natural choice for 'find the nearest' problems: shortest path in a maze, minimum hops between network nodes, degree of separation on a social graph. DFS is better for topological sorting, cycle detection, and exhaustive path enumeration.

The task is to find the shortest path between two users in a social network (minimum intermediate connections). Which algorithm fits?

Connected Components

A graph is **connected** if there is a path between every pair of vertices. If the graph is disconnected, it splits into **connected components**: maximal connected subgraphs. Finding components is the first step in analyzing any real-world graph.

**Strong connectivity** in directed graphs: a strongly connected component (SCC) is a maximal set of vertices where every vertex is reachable from every other. Kosaraju's algorithm finds all SCCs in O(V+E) using two DFS passes.

Union-Find (Disjoint Set Union, DSU) is an efficient data structure for tracking components dynamically. It supports 'merge two components' and 'are these two vertices in the same component?' in amortized O(α(V)) ≈ O(1) with path compression and union by rank.

When analyzing legacy codebases, connected components of the dependency graph reveal islands: isolated modules with no connections to the rest. Multiple components often indicate dead code or unused libraries.

A graph with 8 vertices has edges: {1-2, 2-3, 4-5, 6-7, 7-8}. Vertex 8 is not mentioned otherwise. How many connected components does it have?

Key Ideas

**G = (V, E):** a graph is a set of vertices and edges; directed or undirected depending on the problem
**Adjacency list:** the standard representation for sparse graphs, O(V+E) space
**Handshaking lemma:** Σ deg(v) = 2|E|, a fundamental identity about vertex degrees
**BFS:** shortest path by edge count, explores level by level
**DFS:** topological sort, cycle detection, exhaustive path search
**Connected components:** find isolated islands in O(V+E) via DFS or BFS

Вопросы для размышления

Which representation (matrix or adjacency list) is more efficient for a graph with 10,000 vertices and 15,000 edges, and why?
How is a cycle detected in a directed dependency graph, and what does its presence imply in practice?
If BFS and DFS have the same complexity O(V+E), why does the choice between them still matter?

Связанные уроки

Discrete Mathematics

Graph Theory Fundamentals

**npm/pip:** when installing a package, the manager topologically sorts the dependency graph to install everything in the correct order
**Google Maps:** BFS or Dijkstra on a weighted road graph builds the optimal route
**Git:** commit history is a DAG, and `git log --graph` visualizes it
**LinkedIn:** 'degrees of separation' is BFS on the professional connection graph

Предварительные знания

Inclusion-Exclusion Principle

What a Graph Is

Graph type	Edges	Example
Undirected	(u, v) = (v, u)	Road network
Directed	(u → v) ≠ (v → u)	Package dependencies
Weighted	edge carries weight w	Distances between cities
Multigraph	multiple edges between a pair	Flight routes between airports

Real-world graphs are almost always **sparse** (E << V²): each social network user has hundreds of friends, not billions. Adjacency list is the default choice.

A Python module import system (A imports B, B imports C) needs to be modeled as a graph. Which type fits best?

Vertex Degree and Adjacency

**Handshaking lemma:** the sum of all vertex degrees equals twice the number of edges: Σ deg(v) = 2|E|. Consequence: the number of vertices with odd degree is always even.

Google's PageRank is based on in-degree: a page is 'important' if edges to it come from other important pages. This generalizes simple degree counting into a recursive importance measure.

An undirected graph has 6 vertices and 9 edges. What is the sum of all vertex degrees?

Traversals: DFS and BFS

Criterion	DFS	BFS
Data structure	Stack (recursion)	Queue (deque)
Shortest path	No	Yes (by edge count)
Cycle detection	Yes	Yes
Topological sort	Yes	No
Time complexity	O(V + E)	O(V + E)

DFS on a large graph can cause **stack overflow**: recursion depth reaches V. Use iterative DFS with an explicit stack for production code on graphs with millions of vertices.

The task is to find the shortest path between two users in a social network (minimum intermediate connections). Which algorithm fits?

Connected Components

A graph with 8 vertices has edges: {1-2, 2-3, 4-5, 6-7, 7-8}. Vertex 8 is not mentioned otherwise. How many connected components does it have?

Key Ideas

**G = (V, E):** a graph is a set of vertices and edges; directed or undirected depending on the problem
**Adjacency list:** the standard representation for sparse graphs, O(V+E) space
**Handshaking lemma:** Σ deg(v) = 2|E|, a fundamental identity about vertex degrees
**BFS:** shortest path by edge count, explores level by level
**DFS:** topological sort, cycle detection, exhaustive path search
**Connected components:** find isolated islands in O(V+E) via DFS or BFS

Вопросы для размышления

Which representation (matrix or adjacency list) is more efficient for a graph with 10,000 vertices and 15,000 edges, and why?
How is a cycle detected in a directed dependency graph, and what does its presence imply in practice?
If BFS and DFS have the same complexity O(V+E), why does the choice between them still matter?

Graph Theory Fundamentals

Предварительные знания

What a Graph Is

Vertex Degree and Adjacency

Traversals: DFS and BFS

Connected Components

Key Ideas

Related Topics

Вопросы для размышления

Связанные уроки

Graph Theory Fundamentals

Предварительные знания

What a Graph Is

Vertex Degree and Adjacency

Traversals: DFS and BFS

Connected Components

Key Ideas

Related Topics

Вопросы для размышления

Связанные уроки