Kruskal's algorithm.

Commit de56055b authored 7 months ago by Wolfgang Mulzer
--- a/58-mst.tex
+++ b/58-mst.tex
@@ -195,4 +195,81 @@ Using the best available results for this, we achieve a running time
 of $O(|V| \log |V| + |E|)$.

 \paragraph{Kruskal's algorithm.} Kruskal's algorithm uses a more global strategy
-for selecting the next safe edge.
+for selecting the next safe edge. The idea is to take in each step the \emph{lightest}
+edge in $E$ that crosses a cut that respects the current set $A$. What does such an
+edge look like? In general, the subgraph $(V, A)$ is a \emph{forest}: a graph on
+$V$ that does not contain any cycles, but that may consist of several connected components
+(some of them may be a single vertex). For an edge $e \in E \setminus A$, there
+is a cut $(S, V \setminus S)$ that respects $A$ and that is crossed by $e$ if and only
+if the endpoints of $e$ lie in different components of the forest $(V, A)$ (this cut
+must separate the components that contain the endpoints of $e$, and the other components
+can be assigned arbitrarily).
+
+Thus, Kruskal's algorithm works as follows: sort the edges in $E$ by weight, from the lightest
+edge to the heaviest edge. Set $A = \emptyset$. For each edge $e$ in this order, check if
+$e$ connects two different components of the current forest $(V, A)$. If so, add $e$ to $A$.
+Otherwise, do nothing. Continue until all edges have been considered.
+In other words: in each step, Kruskal's algorithm adds to $A$ the lightest remaining
+edge that connects two different components of $(V, A)$.
+The pseudocode is as follows:
+\begin{verbatim}
+  A <- {}
+  Sort E by weight
+  for each edge e in sorted order do
+    if e connects two different components of (V, A) then
+      A <- A + {e}
+  return A
+\end{verbatim}
+We still need to explain how to check whether an edge $e$ connects
+two different components of $(V, A)$. For this, we need a way
+to track the connected components of $(V, A)$, as edges
+are added to $A$. More precisely, we need to support the
+following process: initially, we have $A = \emptyset$, and
+each component of $(V, A)$ consists of a single vertex. 
+Then, in each round, we are given an edge $\{v, w\}$ in $E$,
+and we need to \emph{find} the components of $(V, A)$ that contain
+the endpoints $v$ and $w$. If these two components are distinct,
+we need to construct the \emph{union} of the two connected components
+for $v$ and $w$.
+
+This is called the \emph{union-find problem}. The abstract problem is as follows:
+we are given a set $U = \{1, \dots, n\}$ with $n$ elements, and our task is to
+maintain a \emph{partition} $\mathcal{P} = \{U_1, U_2, \dots, U_k\}$ of $U$ under the following
+operations\footnote{Recall that a partition of $U$ is a set
+$\mathcal{P} = \{U_1, U_2, \dots, U_k\}$
+of nonempty, pairwise disjoint subsets of $U$ such that every element of $U$ occurs
+in exactly one set.
+Formally, we have (i) $U_i \subseteq U$ and $U_i \neq \emptyset$,
+for $i = 1, \dots, k$; (ii) $U_i \cap U_j = \emptyset$, for $i, j = 1, \dots, k$, $i \neq j$;
+and (iii) $\bigcup_{i = 1}^k U_i = U$.
+}:
+(i) \texttt{initialize}$(U)$: create a new partition $\mathcal{P} = \{\{1\}, \{2\}, \dots, \{n\}\}$
+in which every set consists of a single element; (ii) \texttt{find}$(u)$, for
+$u \in U$: return a \emph{representative element} $v \in U_i$, where
+$U_i$ is the set in the current partition $\mathcal{P}$ that contains $u$. Crucially, we require
+that for all $u \in U_i$, the function \texttt{find}$(u)$ returns
+\emph{the same} representative $v \in U_i$ (note that it also follows that
+for $u_1, u_2 \in U$ that lie in \emph{different} sets of the current partition,
+\texttt{find}$(u_1)$ and \texttt{find}$(u_2)$ return \emph{different} representatives);
+and (iii) \texttt{union}$(v_1, v_2)$, where $v_1 \in U$ is the representative element for a
+set $U_i$ in the current partition and $v_2$ is the representative element for a
+different set $U_j$ in the current partition: replace the sets $U_i, U_j$ in the 
+current partition $\mathcal{P}$ by their union $U_i \cup U_j$. That is,
+the new partition after \texttt{union}$(v_1, v_2)$  is 
+$\left(\mathcal{P} \setminus \{U_i, U_j\}\right) \cup \{U_i \cup U_j\}$.
+
+Given a data structure that supports the operations of the union-find problem,
+we can now give more detailed pseudocode for Kruskal's algorithm:
+\begin{verbatim}
+  A <- {}
+  initialize(V)
+  Sort E by weight
+  for each edge e = {v, w} in sorted order do
+    a <- find(v)
+    b <- find(w)
+    if a != b then
+      A <- A + {e}
+      union(a, b)
+  return A
+\end{verbatim}
+
--- a/skript.pdf
+++ b/skript.pdf
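
As a quick sanity check of the detailed pseudocode added in this commit, here is a minimal Python sketch of Kruskal's algorithm on top of a union-find structure. The names `UnionFind` and `kruskal`, the `(weight, v, w)` edge format, and the concrete union-find implementation (union by size with path compression) are assumptions made for illustration; the text itself only specifies the abstract operations `initialize`, `find`, and `union`.

```python
class UnionFind:
    """Hypothetical union-find sketch: maintains a partition of a finite set."""

    def __init__(self, elements):
        # initialize(U): every element starts in its own singleton set.
        self.parent = {x: x for x in elements}
        self.size = {x: 1 for x in elements}

    def find(self, u):
        # First pass: walk up the parent pointers to the representative.
        root = u
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression, an assumed optimization):
        # point every node on the path directly at the representative.
        while self.parent[u] != root:
            self.parent[u], u = root, self.parent[u]
        return root

    def union(self, a, b):
        # As in the text, a and b must be representatives of different sets.
        # Union by size (assumed): attach the smaller set below the larger.
        if self.size[a] < self.size[b]:
            a, b = b, a
        self.parent[b] = a
        self.size[a] += self.size[b]


def kruskal(vertices, edges):
    """Return the edges chosen by Kruskal's algorithm.

    edges is an iterable of (weight, v, w) triples (assumed format).
    """
    uf = UnionFind(vertices)
    chosen = []  # plays the role of the set A
    for weight, v, w in sorted(edges, key=lambda e: e[0]):
        a = uf.find(v)
        b = uf.find(w)
        if a != b:  # e connects two different components of (V, A)
            chosen.append((weight, v, w))
            uf.union(a, b)
    return chosen
```

For example, on the vertices `[1, 2, 3, 4]` with edges `(1, 1, 2)`, `(2, 2, 3)`, `(3, 1, 3)`, `(4, 3, 4)`, `(5, 2, 4)`, the sketch skips `(3, 1, 3)` and `(5, 2, 4)` because their endpoints already lie in the same component, and it returns the remaining three edges with total weight 7.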