Graph Search Techniques
- A graph search (or traversal) technique visits every node exactly once in a systematic fashion.
- Two standard graph search techniques have been widely used:
- Depth-First Search (DFS)
- Breadth-First Search (BFS)
- Depth-First Search (DFS)
- In the case of rooted binary trees, three recursive traversal techniques are widely used:
- Inorder Traversal
- Preorder Traversal
- Postorder Traversal
- Inorder Traversal
- The tree traversal techniques will be reviewed very briefly. BFS and DFS will ve covered in detail.
- Inorder Traversal
Procedure inorder(input: T) begin if T = nil then return; endif inorder(T --> left); // traverse the left subtree recursively visit(T); // visit/process the root inorder(T --> right); // traverse the right subtree recursively end
- Preorder Traversal
Procedure preorder(input: T) begin if T = nil then return; endif visit(T); // visit/process the root preorder(T --> left); // traverse the left subtree recursively preorder(T --> right); // traverse the right subtree recursively end
- Postorder Traversal
Procedure postorder(input: T) begin if T = nil then return; endif postorder(T --> left); // traverse the left subtree recursively postorder(T --> right); // traverse the right subtree recursively visit(T); // visit/process the root end
- DFS follows the following rules:
- Select an unvisited node s, visit it, and treat as the current node
- Find an unvisited neighbor of the current node, visit it, and make it the new current node;
- If the current node has no unvisited neighbors, backtrack to the its parent, and make that the new current node;
Repeat the above two steps until no more nodes can be visited.
- If there are still unvisited nodes, repeat from step 1.
- Select an unvisited node s, visit it, and treat as the current node
- Example in class
- Implementation
- Observations:
the last node visited is the first node from which to proceed.
Also, the backtracking proceeds on the basis of "last visited, first to backtrack too". - This suggests that a stack is the proper data structure to remember the current node and how to backtrack.
- Observations:
- Repeat the same example, but this time using a stack
- Algorithm
Procedure DFS(input: graph G) begin Stack S; Integer s,x; while (G has an unvisited node) do s := an unvisited node; //Corrected "v" to "s" from AY's page visit(s); push(s,S); While (S is not empty) do x := top(S); if (x has an unvisited neighbor y) then visit(y); push(y,S); else pop(S); endif endwhile endwhile end
- Time Complexity:
- Every node is visited once. Also, every edge (x,y) is "crossed" twice: one time when node y is checked from x to see if it is visited (if not visited, then y would be visited from x), and another time, when we backtrack from y to x.
- Therefore, the time of DFS is O(n+|E|).
- If the graph is connected, the time is O(|E|) because the graph has at least n-1 edges, and so n+|E| <= 2|E| -1, implying that n+|E| is O(|E|).
- Every node is visited once. Also, every edge (x,y) is "crossed" twice: one time when node y is checked from x to see if it is visited (if not visited, then y would be visited from x), and another time, when we backtrack from y to x.
- First Application of DFS: Connectivity
- by having a counter count the number of time the outer while-loop iterates in the DFS algorithm, the counter value at the end will be equal to the number of connected components of the input graph G.
- This is because the body of the outer loop, that is, every iteration, fully traverses the connected component that contains the node v.
- If the counter value is 1, then the graph is connected. That is, the DFS algorithm becomes a connectedness-testing algorithm that tells if a graph is connected, in O(n+|E|) time
- If the counter value is > 1, the algorithm will indicate that the graph is disconnected, and the nodes visited in each iteration constitute a separate connected component. In other terms, the DFS algorithm identifies the various connected components.
- by having a counter count the number of time the outer while-loop iterates in the DFS algorithm, the counter value at the end will be equal to the number of connected components of the input graph G.
- Second Application of DFS: minimum spanning trees in uniformly weighted graphs
- Assume that a graph is weighted, but all the weights are equal.
- In that case, all spanning trees are of the same weight (because all trees of n nodes have exactly n-1 edges)
- Thus, to find a minimum spanning tree in such graphs, it suffices to find any spanning tree.
- DFS yields a spanning tree (if the input graph is connected, otherwise, it is a spanning forest). That tree is then a minimum spanning tree. The time to compute the tree is O(|E|), which is better than the O(|E| log |E|) time MST algorithm for general weighted graphs.
- Assume that a graph is weighted, but all the weights are equal.
- Third Application of DFS: Biconnectivity (to be addressed later)
- BFS follows the following rules:
- Select an unvisited node s, visit it, have it be the root in a BFS tree being formed. Its level is called the current level.
- From each node x in the current level, in the order in which the level nodes were visited, visit all the unvisited neighbors of x. The newly visited nodes from this level form a new level that becomes the next current level.
- Repeat the previous step until no more nodes can be visited.
- If there are still unvisited nodes, repeat from Step 1.
- Select an unvisited node s, visit it, have it be the root in a BFS tree being formed. Its level is called the current level.
- Example in class
- Implementation
- Observations:
the first node visited in each level is the first node from which to proceed to visit new nodes. - This suggests that a queue is the proper data structure to remember the order of the steps.
- Observations:
- Repeat the same example, but this time using a stack
- Algorithm
Procedure BFS(input: graph G) begin Queue Q; Integer s,x; while (G has an unvisited node) do s := an unvisited node; visit(s); Enqueue(s,Q); While (Q is not empty) do x := Dequeue(Q); For (unvisited neighbor y of x) do visit(y); Enqueue(y,Q); endfor endwhile endwhile end
- Time Complexity:
- Every node is visited once. Also, every edge (x,y) is "crossed" once when node y is checked from x to see if it is visited (if not visited, then y would be visited from x).
- Therefore, the time of BFS is O(n+|E|).
- Every node is visited once. Also, every edge (x,y) is "crossed" once when node y is checked from x to see if it is visited (if not visited, then y would be visited from x).
- Applications of BFS:
Connectivity and spanning trees are easily done with BFS much as with DFS. The details are left as an exercise.V. Biconnectivity: A Major Application of DFS
- Definition: A node in a connected graph is called an articulation point if the deletion of that node disconnects the graph.
- Definition: A connected graph is called biconnected if it has no articulation points. That is, the deletion of any single node leaves the graph connected.
- In the case of networks, an articulation point is referred to as a single point of failure.
- The Biconnectivity Problem:
- Input: a connected graph G
- Problem: Determine whether or not the graph is biconncted. If not biconnected, find all the articulation
points.
Note the following classiffication of edges in a directed graph G after performing DFS starting at node 1, and explored in numerical order:
- Input: a connected graph G
- DFS on a connected graph G yields a DFS tree whose edges are from the graph. Draw those edges as straight edges. Add the remaining edges of the graph as dashed edges in the tree.
- Theorem: Each dashed edge goes from a descendant to an ancestor. For that reason, the dashed edges are called backward edges (or simply back edges).
- Proof: The proof is by contradiction.
Let (x,y) be a dashed edge between nodes that are not ancestor-descendent, that is, x and y are in separate subtrees of the DFS tree. Assume x was visited before y.
At the very last time (say time t) when a search for unvisited neighbors of x was conducted and none found, x was backtracked from, never to return to x again.
Well, since x is visited before y and y is not a descendant of x, y has not been visited at time t.
But since y is a neighbor of x and y is not visited at time t, y would have to be visited from x before the algorithm backtracks from x. That would make y a descendent of x.
Contradiction.Therefore, no such cross edge (x,y) can exist in a DFS tree.
Q.E.D. - Observations:
- the DFS tree along with the back edges form a new layout of the entire graph. In other terms, we don't have to refer to the original layout of G anymore. We can talk about the new DFS Tree plus Back-edges as the support graph. In the following sections, any mention of G refers to this support graph.
- If the root has more than one child, then the root is an articulation point. That is because the removal of the root makes the subtrees of that root disconnected from one another since there are no cross dashed edges between them.
- On the other hand, if the root has a single child, removing the root leaves a single tree in place, and thus the remaining graph is still connected.
- This suggests a first algorithm for identifying articulation points: Do a DFS from each node and check that node to see if it has more than one child. This algorithm, however, takes O(n|E|) time. A better algorithm will be designed that takes only O(|E|) time.
A better approach:
- the DFS tree along with the back edges form a new layout of the entire graph. In other terms, we don't have to refer to the original layout of G anymore. We can talk about the new DFS Tree plus Back-edges as the support graph. In the following sections, any mention of G refers to this support graph.
- The nodes of the graph will be relabeled so that the new labels carry meaningful information. Indeed, each node i will have two new labels: DFN[i] and L[i].
- DFN[i] is the time at which i is visited. Thus, the first node visited (i.e., the root) has its DFN = 1. The second node visited has a DFN = 2, and so on.
- To define L[w] for every node w, let's denote by w-tree the subtree rooted by w.
- Define L[w]=the DFN of the highest node (closest to the root) reachable from the w-tree by a back edge
or from w by a null path. - Note that if no back edge originates from the w-tree, L[w]=DFN[w].
- From the definition of L[w], it can be seen that
L[w]=the DFN of the highest node reachable from
- w by a back edge, or
- a v-tree by a back edge where v is a child of w, or
- w by a null path (i.e., w itself)
- w by a back edge, or
- This definition of L[w] is motivated by the observation that a non-root node x is an articulation point if and only if x has a subtree (w-tree) such every backward edge that originates from that subtree ends at x or below, i.e., L[w] is at x or below.
- Observe that L[w] is at x or below if and only if L[w] >= DFN[x].
- Therefore, we have:
a non-root node x is an articulation point if and only if x has a child w
such that L[w] >= DFN[x]. - Therefore, once we compute the DFNs and L's of all the nodes, it is easy to determine which node is an articulation point. What remains is how to compute L.
- By the definition of DFN (where the nodes visited earlier have smaller DFN values), the definition of L[w] can be restated as:
L[w]=min{DFN[y] where y is w or is reachable from
- w by a back edge, or
- a v-tree by a back edge where v is a child of w}
- w by a back edge, or
- observe that the last line can be replaced by L[v]
- Therefore,
L[w]=min{DFN[w], min{DFN[y] | where y is reachable from w}, {L[v]|v is a child of w}}
- The latter definition allows for a bottom-up computation of L[x] using DFS. Once the Ls of the children have been computed, L[x] can easily be derived.
Take the following Graph:
Low points (updated as shown) of each node may be obtained when edges are examined by the
following sequence: (a; b); (b; c); (c; d), (d; e); (e; c); (e; f), (f; d); (c; a); (c; g), (a; h); (h; i); (i; j),
(j; k); (k; l); (l; j), (k; i); (j; h). - The computation of L[x], being a minimum, will be carried out in progression: we initialize L[x] to DFN[x], and thereafter, everytime a relevant term (to be minimized over) becomes available, the value of L[x] is updated by comparing it to that term and assigning the term to L[x] if the term is smaller.
- The Algorithm for finding the articulation points is therefore based on DFS. The algorithm is presented next, as a modification of the DFS algorithm. The changes are put in green color.
Procedure DFS(input: graph G,) begin Stack S; Integer s,x; Integer DFN[1:n], L[1:n], Parent[1:n]; Integer num := 1; s := an unvisited node; visit(s); push(s,S); DFN[s] := num; num++; L[s] := DFN[s]; While (S is not empty) do x := top(S); if (x has an unvisited neighbor y) then visit(y); push(y,S); DFN[y] := num; num++; Parent[y] := x; L[y] := DFN[y]; else pop(S); for (every neighbor y of x) do if (y != parent[x] and DFN[y] < DFN[x]) then /* y is an ancestor of x, and (x,y) is a back edge*/ L[x] := min(L[x],DFN[y]); else if (x = Parent[y]) then L[x] : min(L[x],L[y]); if (L[y] >= DFN[x] and x is not root) then x is an articulation point; endif endif endif endfor endif endwhile if (s has more than one child) then s is an articulation point; endif end
- Time Complexity:
The new statements add constant-time operations except for the new for loop at the time of backtracking.
This new for-loop crosses the edges one more time to update the L values and check for articulation points. This increases the time by another O(|E|).
The final if-statement to check for the status of the root can be done by scanning the Parent array to count the number of children of the root s ( by counting the number of nodes whose Parent is s). This takes O(n) = O(|E|) time.
Therefore, the time complexity of the whole algorithm is O(|E|).
- Definition: A node in a connected graph is called an articulation point if the deletion of that node disconnects the graph.
I. Introduction
II. Tree Traversal Techniques
III. Depth-First Search
IV. Breadth-First Search
V. Biconnectivity: A Major Application of DFS
I. Introduction
II. Tree Traversal Techniques
III. Depth-First Search
IV. Breadth-First Search