diff --git a/58-mst.tex b/58-mst.tex index 2326b8fec8b7db5a2a10f79124ae0c76cdaa514d..3568a3c30f692bb0a7741501d9662401f26ed918 100644 --- a/58-mst.tex +++ b/58-mst.tex @@ -2,3 +2,118 @@ \chapter{Minimum Spanning Trees} +We turn to another classic algorithmic problem on graphs. +Suppose we have a collection of cities $C_1, C_2, \dots, C_n$, +and we would like to build a railroad network. +In principle, we can build a connection between any pair of +cities, but due to geographic realities, some connections can be +much cheaper than others. +Suppose that we have already hired a construction company, and for each +pair $C_i, C_j$ of cities, the company has determined an estimate +$\ell_{ij}$ of the cost for building a direct railroad connection +between $C_i$ and $C_j$. Now, our task is as follows: we would like +to select a set of railroad connections such that (i) it is possible +to travel from each city to every other city; and (ii) the total cost +of the railroad connections is as small as possible.\footnote{In this +version of the problem, we just care about connectivity. We do not +consider the travel time between cities.} + +This problem can be formalized as follows: we are given +an undirected, connected graph $G = (V, E)$ and a +\emph{weight function} $\ell: E \rightarrow \mathbb{R}^+$ +that assigns a positive cost to each edge. The goal is to +select a set $T \subseteq E$ of edges such that (i) the +subgraph $(V, T)$ of $G$ is connected; and (ii) the total +cost $\sum_{e \in T} \ell(e)$ of $T$ is as small as possible, +among all $T \subseteq E$ that lead to a connected subgraph. + +Since all weights are positive, we see that for an optimal $T$, the subgraph $(V, T)$ +cannot have any cycles: if there were a cycle $C$ in $(V, T)$, +we could remove an arbitrary edge $e$ of $C$. The resulting graph +$(V, T \setminus \{e\})$ would +still be connected, and the total edge weight would be strictly smaller +than for $(V, T)$. Thus, the subgraph $(V, T)$ is a \emph{spanning tree} +for $G$. 
Because it has the smallest weight, +it is called a \emph{minimum spanning tree} (MST) for $G$. + +We would now like to design an algorithm to construct a minimum +spanning tree for $G$. Our approach is to try a \emph{greedy} strategy. +We would like to start with the empty set $A = \emptyset$, and to add +edges to $A$, one by one, until $A$ is a minimum spanning tree. We capture this +idea in the following definition: + +\textbf{Definition:} Let $G = (V, E)$ be an undirected, connected graph, +and let $\ell: E \rightarrow \mathbb{R}^+$ be a positive weight function. +A set $A \subseteq E$ of edges is called \emph{safe} if there exists an MST +$T \subseteq E$ such that $A \subseteq T$. + +In other words, an edge set $A$ is safe if and only if it is still possible +to extend $A$ to an MST for $G$ by adding more edges. Since $G$ is connected, +an MST for $G$ always exists, and hence $A = \emptyset$, the set that contains +no edges, is safe. This is our starting point. + +Now, all we need is a criterion that allows us to extend a given safe set by one more edge, +resulting in a larger safe set. To state such a criterion, we need one more definition: + +\textbf{Definition:} Let $G = (V, E)$ be an undirected, connected graph. +Let $S \subset V$, $S \neq \emptyset, V$ be a nontrivial +subset of vertices in $G$. +Then, the partition $(S, V \setminus S)$ is called a \emph{cut} of $G$. +We say that an edge $e \in E$ \emph{crosses} the cut $(S, V \setminus S)$ +if $e$ has exactly one endpoint in $S$ and the other endpoint in $V \setminus S$. +Otherwise, we say that $e$ \emph{respects} the cut. + +The following lemma tells us when we can add an edge to a safe set $A$: +we choose a cut $(S, V \setminus S)$ that is respected by all edges in $A$, and we choose +an edge of minimum weight that crosses the cut. 
More formally, the statement is as follows: + +\begin{lemma} + \label{lem:safe_set} +Let $G = (V, E)$ be an undirected, connected graph, +and let $\ell: E \rightarrow \mathbb{R}^+$ be a positive weight function. +Let $A \subseteq E$ be a safe set of edges. +Let $S \subset V, S \neq \emptyset, V$ be a nontrivial subset of vertices of $G$, +such that all edges of $A$ respect the cut $(S, V \setminus S)$. + +Now, the following holds: among all edges $f$ that cross the cut $(S, V \setminus S)$, +let $e$ be an edge such that the weight $\ell(e)$ is minimum. Then, the set +$A \cup \{e\}$ is safe. +\end{lemma} + +\begin{proof} +Since $A$ is safe, there exists an MST $T \subseteq E$ such that $A \subseteq T$. + +If $e \in T$, then we are done: in this case, we have $A \cup \{e\} \subseteq T$, +and $A \cup \{e\}$ is safe, as witnessed by $T$. + +Thus, suppose that $e \not\in T$. Consider the subgraph $(V, T \cup \{e \})$. +Since $T$ is a spanning tree, and since $e \not \in T$, +it follows that adding $e$ to $T$ creates a cycle. We call +this cycle $C$, and we note that $e$ is an edge of $C$. +Since (i) $C$ is a cycle; (ii) $e$ is an edge of $C$; and (iii) $e$ crosses +the cut $(S, V \setminus S)$, it follows that $C$ contains another edge +$f \neq e$ that also crosses the cut $(S, V \setminus S)$. +By the choice of $e$, we have $\ell(e) \leq \ell(f)$. + +Now, consider the edge set $T' = (T \cup \{e\}) \setminus \{f\}$. +Then, since $e$ and $f$ lie in a common cycle, the subgraph +$(V, T')$ is connected. As $T'$ has as many edges as $T$, it is a spanning tree for $G$. Furthermore, +because $\ell(e) \leq \ell(f)$, the total weight of $T'$ is not larger +than the total weight of $T$. Thus, $T'$ is also an MST for $G$. +Finally, since $f$ crosses the cut and all edges of $A$ respect it, we have $f \not\in A$, and hence $A \cup \{e\} \subseteq T'$. Thus, +$A \cup \{e\}$ is safe, as claimed. 
+\end{proof} + +As with Huffman codes, Lemma~\ref{lem:safe_set} is an \emph{exchange lemma}: +it shows that if we have an MST $T$, we can locally modify $T$ to obtain +a new MST $T'$ that contains the desired edge~$e$. + +With Lemma~\ref{lem:safe_set} at hand, the strategy for a general MST algorithm +is clear: we start with the empty set $A = \emptyset$, and we successively add +edges to $A$, until the MST is complete. In each step, we must construct a suitable +cut $(S, V \setminus S)$ that all edges of $A$ respect, and find an edge of minimum weight +that crosses this cut. There are different ways to do this. Depending on +our choice of the cut, there are different concrete MST algorithms that follow this +general strategy. We will look at a few of them in more detail. + +\paragraph{The algorithm of Prim-Jarn\'ik-Dijkstra.}