Disjoint Set / Union Find

Data Structure: What is a disjoint set (union find data structure)? What are the operations it supports and give their runtimes. Implement it
  • https://www.youtube.com/watch?v=ID00PMy0-vE https://www.youtube.com/watch?v=n_t0a_8H8VY
  • Operations:
  • make_set - Creates a set with one element union_by_rank - two sets and merge them into one find_set - operation to return an identity of a set which usually acts as a representative
  • Application:
    • Kruskals
    • Cycle in a undirected graph
  • Implementations:
  • The most popular one uses union by rank and path compression to run these operations efficiently
  • Path compression is an optimization
class Node:
    def __init__(self, data, rank=0, parent=self):
        self.rank = rank # int: approximate depth in tree
        self.data = data # int: data / label of node
        self.parent = parent # Node: reference to representative element

    def _find_set(self):
        Find the representative recursively and does path
        compression as well.
        parent = self.parent

        if parent == self:
            return parent

        node.parent = self.find_set(parent) 
				# path compression in line above: once parent is found, each nodes' 
				# parent pointer gets changed to the "highest rank node"
				# or "representative" of the set

				return node.parent

class DisjointSet: # Tree
		def __init__(self):
        self.map = {} # key, value = data, node

		def make_set(self, data):
        node = Node(data)
        self.map[data] = node

    def union_by_rank(self, data1, data2):
        Combines two sets together to one.
        Does union by rank return true if data1 and data2 
				are in different set before union else false.
        n1, n2 = self.map[data1], self.map[data2]

        parent1, parent2 = self.find_set(n1), self.find_set(n2)

        if(parent1.data == parent2.data):
            return False

        if (parent1.rank >= parent2.rank):
            parent1.rank = parent1.rank + 1 if parent1.rank == parent2.rank else parent1.rank
            parent2.parent = parent1
            parent1.parent = parent2

        return True

    def find_set(self, data):
        Finds the representative of this set
        node = self.map[data]
        return node._find_set()

  • Complexity: O(nα(n))O(n \cdot \alpha (n))
  • Related: Find a cycle in an undirected graph