Full question:
You have a vector of signed integers and you need to support an API with two exposed methods, insert(int) and getmedian().
Describe a data structure you would use to support this API and describe the running time of the two methods.
Sample Answer
What is a median ?
Median is s the number separating the higher half of a sample, from the lower half.
Thus, the question actually needs some clarification, as median might be treated differently in case of even number of elements in an array.
Two most common approaches are:
1. Calculate average of the two middle numbers
2. Use largest(smallest) as median
Then you agree on the definition, the answer is quite simple and very elegant. You should use min-max-median heap ADS (abstract data structure).
Min-max-median heap abstract data structure consists of two heaps connected via root node. Such as, root is a median and right child of the root is min heap of all elements from the array bigger than median, while left node is a max heap of all elements less than a median.
Formal definition:
An mmm-heap is a binary tree with the following properties:
1) The median of all elements is located at the root.
2) The left subtree of the root is a min-max heap H1 of size r((n - 1)/2)1, containing elements less than or equal to the median.
3) The right subtree is a max-min heap H, of size L((n - 1)/2)J containing only elements greater than or equal to the median.
Here is the research article on Min-Max heaps and the usage of such data structures:MinMaxHeaps.PDF
Runing times:
GetMedian() - constant time O(1), as we need just get the tree root
Inset(int) - log(n) by using standard heap insertion
Interviewed for position: SDE
You have a vector of signed integers and you need to support an API with two exposed methods, insert(int) and getmedian().
Describe a data structure you would use to support this API and describe the running time of the two methods.
Sample Answer
What is a median ?
Median is s the number separating the higher half of a sample, from the lower half.
Thus, the question actually needs some clarification, as median might be treated differently in case of even number of elements in an array.
Two most common approaches are:
1. Calculate average of the two middle numbers
2. Use largest(smallest) as median
Then you agree on the definition, the answer is quite simple and very elegant. You should use min-max-median heap ADS (abstract data structure).
Min-max-median heap abstract data structure consists of two heaps connected via root node. Such as, root is a median and right child of the root is min heap of all elements from the array bigger than median, while left node is a max heap of all elements less than a median.
Formal definition:
An mmm-heap is a binary tree with the following properties:
1) The median of all elements is located at the root.
2) The left subtree of the root is a min-max heap H1 of size r((n - 1)/2)1, containing elements less than or equal to the median.
3) The right subtree is a max-min heap H, of size L((n - 1)/2)J containing only elements greater than or equal to the median.
Here is the research article on Min-Max heaps and the usage of such data structures:MinMaxHeaps.PDF
Runing times:
GetMedian() - constant time O(1), as we need just get the tree root
Inset(int) - log(n) by using standard heap insertion
Interviewed for position: SDE