# Minimum Spanning Tree in Data Structure and its Algorithms

A spanning tree is a subset of the undirected graph. This has all the vertices connected by a minimum edge. When all these vertices are connected in a graph then there is one spanning tree, and more such trees can exist in a graph.

What is a minimum spanning tree?

A minimum spanning tree or MST is a subset of the edges of a connected yet undirected graph. It connects all the vertices using a minimum possible total weight edge. To derive an MST, algorithms like Prims or Kruskals are used.

Applications of a minimum spanning tree

A minimum spanning tree can be used to-

• Find the paths on the map
• To design networks especially ones like telecommunication, water supply network or electrical grids
• These are also used to find approximate solutions for complex mathematical issues including the travelling salesman problem.
• The same is also used for –
• Cluster analysis
• Real-time face tracking
• Real-time verification by locating faces in the video stream
• Max bottleneck paths
• Entropy-based image registration
• Adding white noise to digital recording to overcome distortion (Dithering)
• Protocols in a computer network to avoid network cycles

Further, a graph can have more than one spanning tree and if there are n number of vertices, the resulting spanning tree can have n-1 edges. There may exist more than one spanning tree in this context if the edge of the graph is associated with the weight, so you need to determine which one is the right spanning tree. In case there are any duplicated weighted edges then the graph may have multiple minima spanning trees.

Properties of a minimum spanning tree

• On removing any edge from the tree, it becomes disconnected and thus one cannot remove any edge of this tree. Similarly, adding an edge creates a loop and thus it is avoided
• In the graph, each edge has a distinct weight. If the same is not present and the edge has a minimum weight that is not distinct then it is possible to have more than one tree. Thus a connected graph G can have more than one spanning tree
• All possible spanning trees of this graph can have the same number of edges or vertices
• The spanning-tree will not have any cycle or loops
• The spanning tree also has an n-1 edge where n is the number of vertices or nodes of the minimum spanning tree
• A complete graph can have an nn-2 number of spanning trees
• For a complete graph, it is possible to remove the e-n+1 edge to achieve a spanning tree
• Undirected graphs do not have a spanning tree
• The weight of a spanning tree is equal to the sum of weights given on each edge of the tree
• If G(V, E) is a graph then every other spanning tree of the graph will have a (V-1) edge. Here V is the number of vertices and E is the number of edges

Effective ways to find a minimum spanning tree using algorithms?

The minimum spanning tree from a graph is found using the following algorithms –

• Kruskal’s algorithm

An algorithm is a form of minimum spanning tree that is used in programming languages like Java, C++, etc. This takes the graph as an input and then uses a subset of the edges of the graph that in turn has the following –

• Forms a tree that includes every vertex
• Has a minimum sum of weights among all the trees that can be formed from the graph

How does this algorithm work?

The algorithm is the part of algorithms termed greedy algorithms. These find the local optimum in the hopes of finding a global optimum.

The idea is to start with the edges using the lowest weight and subsequently add to the edges till we reach the goals. The steps for implementing such algorithms include –

• Sorting all the edges from low to high weight

Take the edges with the lowest weight and subsequently add them to the spanning tree. Adding the edge creates a cycle and then rejects the edge.

• Keep adding edges until you reach all the vertices
• Any Minimum spanning tree algorithm also revolves around checking if the edge creates a loop. The method used for the purpose is called Union FInd. The same divide the vertices into clusters and allows one to check if the two vertices belong to the same cluster. Hence it is important to decide whether adding an edge creates a cycle.

• Prim’s algorithm

Using Prim’s algorithm is recommended in languages like C, C++, Java and Python. The algorithm is another Minimum spanning tree algorithm that takes any graph as an input and uses it to find a subset of the edge of the graph.

This includes –

• Form a tree that includes every vertex
• Has a minimum sum of weights among all the trees that can be formed from the graph

How does this algorithm work?

This is another form of a greedy algorithm that can be used to find the local optimum.

For implementing such an algorithm –

• Initialise the minimum spanning tree with a vertex chosen at random
• Find the edges that connect the tree to the new vertices, find the minimum and add it to the tree.
• Repeat the above step to get a minimum spanning tree

Comparing the prims and Kruskal’s algorithm

Both the Prims and Kruskal’s algorithms use separate techniques to achieve a minimum spanning tree. Instead of starting with the vertex, the latter uses sorting the edges from a low to high weight. The idea is to keep adding the lowest edges while ignoring the ones that create a cycle.

 Prims algorithm Kruskal’s algorithm It starts with building the tree from any vertex in the graph It uses sorting the edges from a low to high weight for building the tree It traverses a node more than once to get a minimum distance It traverses one node at a time Prim’s algorithm gives a connected algorithm and works well with graphs It may create forests or disconnected algorithms. It works well for both connected and disconnected elements. It runs faster in the dense graphs It works well with sparse graphs The resulting tree always remains connected The resulting tree always remains disconnected It grows by adding to a random vertex or the cheapest vertex of an existing tree It grows by adding the next cheapest edge to the existing tree or vertex The same initialised within a node The same initiates with an edge It has a time complexity of O(V2) Here the time complexity is O(logV)

Indirect applications of minimum spanning tree

• Max bottleneck paths
• LDPC codes for error correction
• Real-time face verification and tracking
• Cluster analysis
• Image registration using Renyi entropy
• Reducing data storage in sequencing the amino acids
• autoconfig protocols for Ethernet bridging to avoid cycles in the given network

In conclusion

The Minimum Spanning Tree algorithm is best known for its use in finding the shortest path between nodes in a given graph. The algorithm ensures that there is no smaller tree with fewer edges connecting the same nodes and that either all distances are different or no two paths share an edge. 