System-Design-Question

Top K Frequent Elements

Category: dsa Date: 2026-02-24

Top K Frequent Elements System Design Discussion

1. Requirements (Functional + Non-functional)

Functional Requirements:

Non-functional Requirements:

2. High-Level Architecture

We will use a distributed system architecture to handle large inputs. The architecture consists of three layers:

  1. Load Balancer: Distributes incoming requests across multiple worker nodes.
  2. Worker Node: Processes its subset of the input array, counts element frequencies, and stores the counts in a distributed hash table (DHT).
  3. Result Node: Combines the frequency counts from all worker nodes and returns the top K elements.
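The Result Node's job can be sketched in a few lines. This is a minimal in-process illustration, not a networked implementation: it assumes each worker hands back its partial counts as a plain Map[Int, Int], and the names ResultNodeSketch, mergeCounts, and topK are hypothetical.

```scala
object ResultNodeSketch {
  // Merge per-worker partial counts by summing the counts of matching elements.
  def mergeCounts(partials: Seq[Map[Int, Int]]): Map[Int, Int] =
    partials.foldLeft(Map.empty[Int, Int]) { (merged, partial) =>
      partial.foldLeft(merged) { case (acc, (element, count)) =>
        acc.updated(element, acc.getOrElse(element, 0) + count)
      }
    }

  // Return the k (element, count) pairs with the highest counts, most frequent first.
  def topK(counts: Map[Int, Int], k: Int): List[(Int, Int)] =
    counts.toList.sortBy { case (_, count) => -count }.take(k)
}
```

Because the same element may be counted by several workers, the merge has to sum counts before any ranking takes place; ranking each worker's output separately could miss an element that is moderately frequent everywhere.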

Technology Stack:

3. Database Design

We will use a distributed hash table (DHT) to store the frequency counts. Each worker node will store the frequency counts for a subset of the input array.
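One way to picture the DHT is as a set of count tables, with each element routed to exactly one table by hashing its key. The sketch below simulates that in memory. It assumes simple modulo hashing (a real DHT would more likely use consistent hashing), and the names ShardRoutingSketch, shardFor, and countsByShard are hypothetical.

```scala
object ShardRoutingSketch {
  // Map an element to a shard index in [0, numShards). floorMod keeps the
  // result non-negative even when the hash code is negative.
  def shardFor(element: Int, numShards: Int): Int =
    java.lang.Math.floorMod(element.hashCode, numShards)

  // Build one count table per shard; each element's count lives on exactly one shard.
  def countsByShard(input: Seq[Int], numShards: Int): Vector[Map[Int, Int]] = {
    val empty = Vector.fill(numShards)(Map.empty[Int, Int])
    input.foldLeft(empty) { (shards, element) =>
      val index = shardFor(element, numShards)
      val table = shards(index)
      shards.updated(index, table.updated(element, table.getOrElse(element, 0) + 1))
    }
  }
}
```

Routing by key means no element's count is split across shards, so the tables can be read without a further summing step.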

DHT Design:

4. Scaling Strategy

To handle large inputs, we can scale the system horizontally by adding more worker nodes. Each worker node will process a subset of the input array and store the frequency counts in the DHT.
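The effect of adding workers can be approximated on a single machine with Scala futures standing in for worker nodes. This is only a sketch under that assumption: ParallelCountSketch, countChunk, and parallelCounts are hypothetical names, and the 30-second timeout is an arbitrary choice.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ParallelCountSketch {
  // Count occurrences within one chunk; stands in for a single worker node.
  def countChunk(chunk: Array[Int]): Map[Int, Int] =
    chunk.groupBy(identity).map { case (element, occurrences) => element -> occurrences.length }

  // Split the input into roughly equal chunks, count each chunk concurrently,
  // then sum the partial counts per element.
  def parallelCounts(input: Array[Int], numWorkers: Int): Map[Int, Int] = {
    val chunkSize = math.max(1, math.ceil(input.length.toDouble / numWorkers).toInt)
    val partials = input.grouped(chunkSize).toList.map(chunk => Future(countChunk(chunk)))
    val all = Await.result(Future.sequence(partials), 30.seconds)
    all.flatten.groupBy(_._1).map { case (element, pairs) => element -> pairs.map(_._2).sum }
  }
}
```

The merged counts do not depend on numWorkers, which is the property that lets worker nodes be added without changing the result.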

Scaling Strategy:

5. Bottlenecks

The following are potential bottlenecks in the system:

6. Trade-offs

The following are trade-offs in the system design:

Top K Frequent Elements Solution using the First Principle of System Design

A common first principle of system design is "Keep It Simple, Stupid" (KISS). Applied here, a simple three-step algorithm finds the top K frequent elements:

  1. Count the frequency of each element in the input array.
  2. Sort the elements by frequency in descending order.
  3. Return the top K elements.

However, this algorithm sorts every distinct element, which costs O(m log m) for m distinct elements, and it assumes the whole input fits on one machine. For inputs too large for that, we can use a distributed system architecture and a distributed hash table (DHT) to store the frequency counts.

Code Example:

// A single element together with the number of times it occurs
case class FrequencyCount(elementId: Int, frequency: Int)

// The top K result: the requested k and up to k entries, most frequent first
case class TopKFrequentElements(k: Int, result: List[FrequencyCount])

// Count how many times each element occurs in the input array.
// A strict map is built with map here because mapValues is deprecated in
// Scala 2.13 and returns a lazy view rather than a strict Map.
def countFrequency(inputArray: Array[Int]): Map[Int, Int] = {
  inputArray.groupBy(identity).map { case (element, occurrences) => element -> occurrences.length }
}

// Find the top K frequent elements by sorting all distinct elements by frequency, descending
def findTopKFrequentElements(inputArray: Array[Int], k: Int): TopKFrequentElements = {
  val frequencyCounts = countFrequency(inputArray)
  val sortedFrequencyCounts = frequencyCounts.toList.sortBy(_._2)(Ordering.Int.reverse)
  val topKFrequentElements = sortedFrequencyCounts.take(k).map { case (elementId, frequency) => FrequencyCount(elementId, frequency) }
  TopKFrequentElements(k, topKFrequentElements)
}

Note that this is a simplified, single-machine example and may not be suitable for production use. A more robust solution for large inputs would distribute the counting across worker nodes and store the frequency counts in a distributed hash table (DHT), as outlined in the architecture above.
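As a single-machine refinement of the sort-based example above, the full sort can be replaced with a min-heap bounded to k entries, which brings the selection step down to O(m log k) for m distinct elements. This is a swapped-in standard technique rather than part of the original example, and the names HeapTopKSketch and topK are hypothetical.

```scala
import scala.collection.mutable

object HeapTopKSketch {
  // Select the k most frequent elements from a count table using a heap of at most k entries.
  def topK(counts: Map[Int, Int], k: Int): List[(Int, Int)] =
    if (k <= 0) Nil
    else {
      // PriorityQueue is a max-heap; reversing the ordering on the count
      // places the entry with the smallest count at the head.
      val byCountReversed = Ordering.by[(Int, Int), Int](_._2).reverse
      val heap = mutable.PriorityQueue.empty[(Int, Int)](byCountReversed)
      counts.foreach { entry =>
        if (heap.size < k) heap.enqueue(entry)
        else if (entry._2 > heap.head._2) {
          heap.dequeue() // evict the current least frequent entry
          heap.enqueue(entry)
        }
      }
      // Entries come out in ascending count order; reverse for most frequent first.
      List.fill(heap.size)(heap.dequeue()).reverse
    }
}
```

The input here is the count table produced by countFrequency, so the heap only replaces the sorting step; the counting pass is unchanged.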