gogoWebsite

Topological sorting (idea + code + example)

Updated to 2 hours ago

Table of contents

1. What is a directional acyclic diagram

1.1 Image understanding

1.2 Formal definition

1.3DAG features

1.4 DAG practical application

2. What is the idea of ​​topological sorting?

2.1 The core idea of ​​topological sorting

2.2 Implementation method of topological sorting

2.3 Application of topological sorting

3. Methods based on Kahn algorithm

4. Depth-first search (DFS)-based method


1. What is a directional acyclic diagram

Before understanding the sorting of graphs, we first understand what a directional acyclic graph is.

Directed Amorphous Graph, abbreviationDAG(Directed Acyclic Graph), a concept in graph theory, has the following two main characteristics:

  1. Directed(Directed):

    Each edge in the graph has a direction, that is, each edge points from one node to another. For example, if there is an edge u→v, this means that from node u points to node v.
  2. Ringless(Acyclic):

    There is no loop in the figure that contains multiple nodes. In other words, starting from one node and among other nodes that can be reached through several edges, it is impossible to return to this starting node.

1.1 Image understanding

  • DirectedIt means that you can only "walking" along the edge in the picture. Just like a one-way street network, you can only drive in the designated direction and not in the opposite direction.

  • RinglessIt means that in this graph, you cannot start from one node, and after passing several nodes, you will return to the starting point. It's like you're in a one-way street network, it's impossible to go around and go back to the starting point.

1.2 Formal definition

  • A graph G=(V,E) consists of a set of nodes V and a set of directed edges E. If graph G is a directed acyclic graph, then for each path v1→v2→⋯→vn in the graph, it is impossible to v1=vn in the path, that is, it is impossible to form a closed path from a certain node to its own.

1.3DAG features

  1. Topological sorting:DAG is the only graph structure that can perform topological sorting. Because there are no rings, nodes can be sorted by dependencies.

  2. Dependencies:DAG is often used to represent dependencies. For example, task scheduling, compilation sequence, data flow chart, etc. can all be represented by DAG.

  3. The relationship between undirected acyclic graph (tree) and DAG

    A tree is a special undirected and acyclic diagram. If each undirected edge in the tree is turned into a directed edge, and all edges point from the parent to the child, it becomes a DAG.

1.4 DAG practical application

  1. Task Scheduling: In the operating system, the dependencies of tasks are usually represented by DAG, ensuring that the dependent tasks are executed sequentially.

  2. Compilation dependencies: When compiling large software, there may be dependencies between different modules, and DAG can be used to determine the compilation order.

  3. Version control: In distributed version control systems (such as Git), commit history can be represented by DAG, and branching and merge operations create or connect nodes in DAG.

  4. Data flow analysis: In compiler optimization, DAG is used to represent the dependencies between expressions and variables, and optimize the calculation process.

Example

Consider the following DAG:

1 → 2 → 3 → 5
     ↘ 4 ↗
  • Here, node 1 points to node 2, node 2 points to node 3 and 4, and node 4 and 3 points to node 5.
  • There is no ring in this diagram, so it is a DAG.
  • One of the possible results of topological sorting is:1 2 4 3 5or1 2 3 4 5, both sorts conform to the nature of DAG.

2. What is the idea of ​​topological sorting?

After knowing the directed acyclic graph, let’s understand topological sorting.

Topological Sorting is a targetDirected Amorphous Graph(DAG, Directed Acyclic Graph) sorting method. It arranges all nodes in the graph in a linear order, so that for each directed edge u→v in the graph, node u is ranked before node v.

2.1 The core idea of ​​topological sorting

  1. Dependencies sorting

    The goal of topological sorting is to sort nodes, ensuring that in the sorting result, each node is behind all its dependent nodes. That is, if there is a directed edge u→v, u must be before v in the sorting result.
  2. Acyclic

    Topological sorting can only be applied to directed acyclic graphs (DAGs). If there are loops in the graph (i.e., loops), topological sorting cannot be completed because the nodes in the ring depend on each other and an appropriate sorting order cannot be determined.
  3. Utilization of entry

    EntryIndicates the number of edges pointing to a node. Topological sorting starts with nodes with an incoming degree of 0, because these nodes have no dependencies and can be processed first. After sorting a node with an entry of 0, delete the edges related to that node and update the entry of other nodes. For each node whose incoming degree becomes 0, add it to the next round of sorting.
  4. Recursively eliminate dependencies

    By constantly removing nodes with a degree of 0 and their edges, the nodes and edges in the graph gradually decrease, and eventually all nodes are sorted, or the remaining nodes are detected to form a ring.

2.2 Implementation method of topological sorting

There are two common methods to implement topological sorting:

  1. Kahn algorithm (entry-based algorithm)

    This method directly uses the concept of entry:
    • If all nodes are processed, a valid topological sort is obtained; if any nodes are not processed, it means that there is a ring in the figure.
    • Repeat steps 2 and 3 until the queue is empty.
    • Delete all outgoing edges of this node and update the incoming degree of the target node. If the incoming degree of a node becomes 0, it is added to the queue.
    • Take out nodes from the queue and add them to the topological sorting results.
    • Find all nodes with an incoming degree of 0 and put them in a queue.
  2. DFS (Deep First Search) Algorithm

    Topologies are achieved through Depth First Search (DFS):
    • This method can also detect rings. If a node that is already in access during the DFS process is encountered, it means that there is a ring in the figure.
    • The order of nodes in the final stack is the result of topological sorting.
    • When DFS completes access to a node, adds the node to a stack.
    • Perform DFS on each node in the graph and record the nodes in reverse order of completion time.

2.3 Application of topological sorting

  • Task Scheduling: When there are dependencies between certain tasks, topological sorting can determine the order in which tasks are executed.
  • Course Schedule: If some courses have pre-requisite requirements, topological sorting can be used to determine the order of learning of courses.
  • Build the system: When compiling or building a software project, topological sorting can be used to determine the compilation order of modules or files.

III. Based onMethods of Kahn algorithm

The core idea of ​​Kahn algorithm

  1. Initialize the entry table: Calculate and store the incoming degree of each node, that is, how many edges point to the node.

  2. Find nodes with entry 0: Add all nodes with an entry degree of 0 to a queue, because these nodes have no pre-dependencies and can be processed first.

  3. Gradually delete nodes

    • Take a node with an in-degree of 0 from the queue and add it to the topology sorting result.
    • Delete all outgoing edges of the node (i.e., reduce the incoming degree of its adjacent node by 1).
    • If the incoming degree of an adjacent node is therefore reduced to 0, it is added to the queue.
  4. Check the sort results

    • Repeat the above steps until the queue is empty. If the final topological sort result contains all nodes, the sorting is successful.
    • If there is a ring in the graph, some nodes will never make their entry to 0, and the queue will be empty in advance, and the topological sorting fails at this time.

Advantages of Kahn algorithm

  • Efficiency: Each node and each edge are processed only once, and the time complexity is O(V+E)O(V+E)O(V+E), where VVV is the number of nodes and EEE is the number of edges.
  • Simple and easy to implement: Using the concept of queue and incoming, the algorithm is clear and logical, and is suitable for dependency sorting in actual engineering problems.

Code:

#include <cstdio>

 const int MAXN = 200005; // Maximum number of nodes
 const int MAXM = 200005; // Maximum number of edges

 // Topological sorting requires quick collection of nodes with real-time entry 0
 int indegree[MAXN]; // Input of each node (number of edges pointing to that node)
 int queue[MAXN]; // Queue used to store nodes with incoming degree 0
 int front, back; // head and tail pointers of queue

 // Create a picture requires
 int head[MAXN] = { 0 }; // Store the head node number of each node's edge link list
 int to[MAXM]; // Storing the target node of each edge
 int next[MAXM]; // Store the number of the next edge of each edge (link list structure)
 int cnt; // The counter of the current edge, used to uniquely identify each edge

 // Collect answers required
 int ans[MAXN]; // Store the order of nodes after topology sorting

 // Add a directed edge from u to v
 void addEdge(int u, int v) {
     to[cnt] = v; // The target node with edge number cnt is v
     next[cnt] = head[u]; // The next edge with edge numbered cnt is the existing first edge starting from u
     head[u] = cnt++; // Update head[u] to the current new edge and add an edge counter
 }

 // Topological sorting function, return whether the sorting is successful or not
 bool topologicalSort(int n) {
     front = back = 0; // Initialize the queue pointer
     // Add all nodes with a degree of 0 to the queue
     for (int i = 1; i <= n; i++) {
         if (indegree[i] == 0)
             queue[back++] = i;
     }

     int size = 0; // Used to record the number of nodes that have been sorted
     // Process each node in the queue
     while (front < back) {
         int u = queue[front++]; // Take out a node from the head of the queue
         ans[size++] = u; // Add it to the result array
         // traverse all outgoing edges of node u
         for (int edgeId = head[u]; edgeId; edgeId = next[edgeId]) {
             // Reduce the incoming point node
             if (--indegree[to[edgeId]] == 0)
                 // If the incoming point node is reduced to 0, add it to the queue
                 queue[back++] = to[edgeId];
         }
     }
     // If the number of nodes after sorting is equal to the total number of nodes, the sorting is successful
     return size == n;
 }

 int main() {
     cnt = 1; // Initialize edge counter
     int n, m, u, v;
     scanf("%d%d", &n, &m); // Read the number of nodes n and edges m
     // Read each edge and build a graph
     while (m--) {
         scanf("%d%d", &u, &v);
         addEdge(u, v); // Add an edge from u to v
         indegree[v]++; // Update the entry level of the target node v
     }
     // Perform topological sorting, if successful, output the sorting result
     if (topologicalSort(n)) {
         for (int i = 0; i < n - 1; i++)
             printf("%d", ans[i]); // Output topology sorting results
         printf("%d\n", ans[n - 1]);
     }
     else
         printf("-1\n"); // If sorting fails, output -1

     return 0;
 }

Example:

enter:

6 6
1 2
2 3
3 4
4 5
5 6
1 3

Here it means there are 6 nodes and 6 edges:

  • Edge 1 → 2
  • Side 2 → 3
  • Side 3 → 4
  • Edge 4 → 5
  • Side 5 → 6
  • Edge 1 → 3

Structure of the diagram:

  • 1 → 2
  • 1 → 3
  • 2 → 3
  • 3 → 4
  • 4 → 5
  • 5 → 6

Topological sorting process:

  1. Calculate the entry:

    • Node 1: Entry 0
    • Node 2: Incoming degree 1 (from 1)
    • Node 3: Incoming degree 2 (from 1, 2)
    • Node 4: Incoming degree 1 (from 3)
    • Node 5: Incoming degree 1 (from 4)
    • Node 6: Incoming degree 1 (from 5)
  2. Initialize the queue:

    • Node with an entry degree of 0: Node 1.
  3. Processing queues:

    • Take node 1 from the queue and add it to the topology sequence:

      • Topological sequence:1
      • Update node 2's incoming degree to 0 and add it to the queue.
      • The entry of update node 3 is 1.
    • Take node 2 from the queue and add it to the topology sequence:

      • Topological sequence:1, 2
      • Update node 3's incoming degree to 0 and add it to the queue.
    • Take node 3 from the queue and add it to the topology sequence:

      • Topological sequence:1, 2, 3
      • Update node 4's incoming degree to 0 and add it to the queue.
    • Take node 4 from the queue and add it to the topology sequence:

      • Topological sequence:1, 2, 3, 4
      • Update node 5's incoming degree to 0 and add it to the queue.
    • Take node 5 from the queue and add it to the topology sequence:

      • Topological sequence:1, 2, 3, 4, 5
      • Update node 6's incoming degree to 0 and add it to the queue.
    • Take node 6 from the queue and add it to the topology sequence:

      • Topological sequence:1, 2, 3, 4, 5, 6

result:

The last topological sequence is1 2 3 4 5 6. This sorting meets the requirements of topological sorting, that is, for each directed edge u→v, node u is before node v.

4. Depth-first search (DFS)-based method

DFS's idea of ​​implementing topological sorting

  1. The representation of the picture

    • Using adjacency tablegraph[]Storage map, wheregraph[u]is a slave nodeuList of nodes pointed to.
  2. Recursive DFS traversal

    • For each unvisited nodeu, perform depth-first search.
    • In DFS, recursively access all adjacent nodes, and then push the current node into the stack.
    • In this way, the order of nodes in the stack is exactly the reverse order of topological sorting.
  3. Output topology sort

    • Finally, the topological sort results are output by popping up the nodes in the stack, ensuring that the predecessor node always appears before the successor node.

Features of DFS topological sorting

  • Test ring: If a node that has been in the stack is detected in DFS and is accessed again, it means that there is a ring in the figure. However, this code version does not include ring detection function.
  • Reverse output: The topological sorting results of DFS are completed by inverse stack output, which is different from the Kahn algorithm.

Code:

#include <cstdio>
 #include <vector>
 #include <stack>
 #include <cstring>

 const int MAXN = 200005; // Maximum number of nodes
 std::vector<int> graph[MAXN]; // Adjacent table used to store graphs
 bool visited[MAXN]; // Mark whether the node has been accessed
 std::stack<int> topoStack; // Stack used to store topological sorting results

 // Add a directed edge from u to v to the graph
 void addEdge(int u, int v) {
     graph[u].push_back(v);
 }

 // Topological sorting using DFS
 void dfs(int u) {
     visited[u] = true; // Mark node u as accessed
     // traverse all adjacent nodes of node u
     for (int v : graph[u]) {
         if (!visited[v]) {
             dfs(v); // If node v is not accessed, it is accessed recursively
         }
     }
     (u); // The current node has been processed and pushed into the stack
 }

 // Main function
 int main() {
     int n, m, u, v;
     scanf("%d%d", &n, &m); // Read the number of nodes n and edges m

     // Initialization
     memset(visited, false, sizeof(visited));

     // Read each edge and build a graph
     while (m--) {
         scanf("%d%d", &u, &v);
         addEdge(u, v); // Add an edge from u to v
     }

     // DFS for all nodes. If the node has not been accessed, call dfs
     for (int i = 1; i <= n; i++) {
         if (!visited[i]) {
             dfs(i);
         }
     }

     // Output topology sorting results
     while (!()) {
         printf("%d ", ()); // Output the top element of the stack
         (); // Pop up the top element of the stack
     }
     printf("\n");

     return 0;
 }

Example:

enter:

6 6
1 2
2 3
3 4
4 5
5 6
1 3

The graph represented by the input

  • Number of nodes: 6
  • Number of edges: 6
  • Directed edges:
    • 1 → 2
    • 2 → 3
    • 3 → 4
    • 4 → 5
    • 5 → 6
    • 1 → 3

The structure of this diagram is as follows:

1 → 2 → 3 → 4 → 5 → 6
 \____/

Steps to implement topological sorting in DFS

  1. The representation of the picture

    • We use adjacency tablegraph[]Store the structure of this graph.
    • For Node 1,graph[1] = {2, 3}Represents the two sides starting from 1, pointing to 2 and 3 respectively.
  2. Start DFS

    • initializationvisited[]The array isfalse, means that all nodes are not accessed.
    • We start with Node 1 DFS.
  3. DFS Process

    • Node 1: First access node 1. markvisited[1] = true. Recursively access 1's neighbor node 2.
    • Node 2: Access node 2. markvisited[2] = true. Recursively access 2's neighbor node 3.
    • Node 3: Access node 3. markvisited[3] = true. Recursively access 3's neighbor node 4.
    • Node 4: Access node 4. markvisited[4] = true. Recursively access 4's neighbor node 5.
    • Node 5: Access node 5. markvisited[5] = true. Recursively access 5 neighbor node 6.
    • Node 6: Access node 6. markvisited[6] = true. Node 6 has no neighbors, ends recursion, and node 6 is pushed into the stack.

    Stack content: [6]

    • End the recursion of node 5 and node 5 is pushed into the stack.

    Stack content: [6, 5]

    • End the recursion of node 4, and node 4 is pushed into the stack.

    Stack content: [6, 5, 4]

    • End the recursion of node 3 and node 3 is pushed into the stack.

    Stack content: [6, 5, 4, 3]

    • Return to Node 2, Node 2 is pushed into the stack.

    Stack content: [6, 5, 4, 3, 2]

    • Go back to Node 1, and another neighbor node 3 of Node 1 is accessed, but 3 has been accessed and does not require repeated access.
    • Node 1 is pushed into the stack.

    Stack content: [6, 5, 4, 3, 2, 1]

  4. Continue DFS

    Since the DFS of Node 1 already covers all nodes of the entire graph, the DFS processing of the remaining nodes is no longer necessary.
  5. Output topology sort

    Finally, the elements in the stack are output in the pop-up order, that is, the topological sorting result:1 2 3 4 5 6