mediumHeap and Priority QueuePattern: Heap Operations

Top K Elements in Stream Solution

Problem Statement

You are tasked with designing a data structure, `TopKElements`, that efficiently tracks the `k` largest integer elements encountered from a stream of incoming values. Your implementation will be initialized with a specific `k` and then receive a series of values through an `add` operation. After all values from an initial array have been processed by the `add` method, the data structure should be able to report its currently held `k` largest elements. Specifically, you need to implement the following: 1. A class `TopKElements`. 2. Its constructor `__init__(self, k: int)`: Initializes the data structure to maintain the `k` largest elements. 3. A method `add(self, value: int)`: Processes a new integer `value` from the stream, updating the collection of top `k` elements if necessary. After all `add` operations for a given set of input values are complete, you should return a list containing the `k` largest elements present in the collection. If the total number of distinct values added is less than `k`, return all elements that were added. The order of elements in the returned list does not matter.

Examples

Example 1:
Input:k = 3 values_to_add = [10, 4, 15, 8, 20]
Output:[20, 15, 10]
Explanation: A `TopKElements` object is initialized with `k=3`. After sequentially adding 10, 4, 15, 8, and 20, the three largest elements maintained are 20, 15, and 10.
Example 2:
Input:k = 4 values_to_add = [-5, 2, 1, -3]
Output:[2, 1, -3, -5]
Explanation: With `k=4`, we track the 4 largest elements. Since only 4 elements [-5, 2, 1, -3] are added, all of them are considered the largest.

Constraints

  • 1 <= k <= 10^5
  • 1 <= N <= 10^5 (where N is the total count of values added)
  • -10^9 <= value <= 10^9
Time: O(N log k) for N add operations, where each add takes O(log k) time due to heap operations. Initialization is O(1). Retrieving the k elements is O(k). Space: O(k) to store the k largest elements in the min-heap.
The optimal approach uses a min-heap (priority queue) of size `k`. For each incoming value, if the heap has fewer than `k` elements, the value is added. If the heap is full, the value is compared with the smallest element in the heap (the heap's root); if the new value is larger, the smallest element is removed and the new value is added. This ensures the heap always contains the `k` largest elements seen so far, with each `add` operation taking `O(log k)` time.

Run, Test & Submit Code

Ready to practice this challenge? Launch our interactive compilation environment with compiler validation.

Solve on Interactive Workspace

Tested Solutions

import heapq import sys class TopKElements: def __init__(self, k: int): self.k = k self.min_heap = [] # This will store the k largest elements def add(self, value: int) -> None: # If the heap has less than k elements, just add it. if len(self.min_heap) < self.k: heapq.heappush(self.min_heap, value) # If the heap is full and the new value is larger than the smallest # element in the heap (which is at index 0 for a min-heap), # replace the smallest with the new value. elif value > self.min_heap[0]: heapq.heapreplace(self.min_heap, value) # Pops smallest, pushes new value def get_top_k(self) -> list[int]: # The order of elements in the returned list does not matter. # The min_heap itself contains the k largest elements. return self.min_heap def main(): input = sys.stdin.readline # Read k k = int(input().strip()) # Read the initial array of values # Assuming values are space-separated on a single line values_str = input().strip() initial_values = [] if values_str: # Handle case where the line might be empty initial_values = list(map(int, values_str.split())) # Create the TopKElements instance top_k_tracker = TopKElements(k) # Add initial values for val in initial_values: top_k_tracker.add(val) # Get the top k elements result = top_k_tracker.get_top_k() # The problem states the order doesn't matter. # For deterministic output in testing environments, it's common to sort. result.sort() print(*(result)) # Unpack and print elements separated by space if __name__ == "__main__": main()