This content is fantastic! It's very informative and well-explained. Your channel is a hidden gem.
Glad you liked it. Much appreciated!
There is another very easy solution to this problem (based on a somewhat greedy approach).
It goes like this:
First, make two copies of the given input array. ~O(N)
Now sort one of them by x coordinate and the other by y coordinate. ~O(N log N)
Now, in both lists, compute the pairwise distances (consecutive pairs only) and record the minimum found in each sorted list. ~O(N)
Now simply return the smaller of the two minimums. ~O(1)
So the overall complexity comes out to O(N log N). Hope this helps!
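The steps above can be sketched in a few lines of Python (function and variable names are my own; points are assumed to be (x, y) tuples):

```python
import math

def greedy_closest_pair(points):
    def min_consecutive(pts):
        # pairwise distances between consecutive points only, ~O(N)
        return min(math.dist(pts[i], pts[i + 1]) for i in range(len(pts) - 1))

    by_x = sorted(points)                              # copy sorted by x, ~O(N log N)
    by_y = sorted(points, key=lambda p: (p[1], p[0]))  # copy sorted by y, ~O(N log N)
    return min(min_consecutive(by_x), min_consecutive(by_y))
```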
Thank you for sharing this alternative approach! It's an interesting solution, but it's NOT guaranteed to find the answer!
- Correctness: While this approach will often find the closest pair, it's not guaranteed to always do so. Consider a case where the closest pair of points is diagonal: the two points might not be adjacent in either the x-sorted or the y-sorted list.
- Space Complexity: This solution requires O(N) extra space for the additional sorted arrays, whereas the divide-and-conquer approach can be implemented with O(log N) extra space (for the recursion stack).
- Generalization: The divide-and-conquer approach we discussed can be more easily generalized to higher dimensions, while this approach might be more challenging to extend.
@RealRyanRad From your reply, I can see that you are doubting the correctness of the greedy algorithm.
For example, take the closest pair of points to be diagonally placed, as you stated in your reply, say: (3,6), (3,3), (4,4), (4,8)
Then if we sort according to x, we get: (3,3), (3,6), (4,4), (4,8)
And if we sort according to y, we get: (3,3), (4,4), (3,6), (4,8)
Now if we find the pairwise minimum in the 1st array, we get: sqrt(5)
Now if we find the pairwise minimum in the 2nd array, we get: sqrt(2)
We return the minimum of the two, which is sqrt(2), and that is also the correct answer. Please correct me if I understood your 1st point wrong, and provide the test case you have in mind where this greedy algorithm fails?
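For what it's worth, the example above can be checked by brute force over all pairs (a quick sketch):

```python
import math
from itertools import combinations

pts = [(3, 6), (3, 3), (4, 4), (4, 8)]
# check every pair, not just consecutive ones in some sorted order
best = min(math.dist(a, b) for a, b in combinations(pts, 2))
print(best)  # sqrt(2) = 1.4142..., achieved by (3,3) and (4,4)
```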
Coming to your second point: in order for the divide and conquer approach to work in O(N log N) time,
- you must sort the array by x coordinate (to find the partition line in constant time). Here you can reuse the input array to store the result.
- you must also sort the array by y coordinate, so that the minimum distance among the 14 (or 7) adjacent coordinates in the region of width 2(delta) can be found in linear time. So you will require ~O(N) extra space. Note that if you skip this preprocessing step, you will need to sort the 14 (or 7) points in every recursive call, which actually makes the time complexity of the algorithm O(N(log N)^2).
Hence, your algorithm takes ~O(N) space as well.
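The divide and conquer with this preprocessing can be sketched as follows (one sort by x, one by y, both reused down the recursion; names are mine, and distinct points are assumed):

```python
import math

def closest_pair(points):
    px = sorted(points)                              # by x: partition line in O(1)
    py = sorted(points, key=lambda p: (p[1], p[0]))  # by y: linear strip scan
    return _solve(px, py)

def _solve(px, py):
    n = len(px)
    if n <= 3:
        # base case: brute force over all pairs
        return min(math.dist(px[i], px[j])
                   for i in range(n) for j in range(i + 1, n))
    mid = n // 2
    mid_x = px[mid][0]
    left_set = set(px[:mid])
    # split the y-sorted list without re-sorting (assumes distinct points)
    py_left = [p for p in py if p in left_set]
    py_right = [p for p in py if p not in left_set]
    d = min(_solve(px[:mid], py_left), _solve(px[mid:], py_right))
    # strip around the dividing line, already in y order
    strip = [p for p in py if abs(p[0] - mid_x) < d]
    for i in range(len(strip)):
        for j in range(i + 1, min(i + 8, len(strip))):  # constant number of neighbors
            d = min(d, math.dist(strip[i], strip[j]))
    return d
```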
Coming to your third point, I would say that the problem can have at most 3 dimensions (x, y, z), in which case we would sort the list by all 3 coordinates and compute the minimum among the 3, similar to what I did for 2-D. I'm not sure how this divide and conquer approach would scale to higher dimensions, so I can't comment on that. Hope this clarifies any issues!
Aren't there 8 points, not 6? Four in the middle (one bordering the midpoint line from each side, on both the top and bottom of the strip) and two on each respective side. Like this --> : :|: : . We would need to look at 7 points when doing so, not 5.
Thank you for your thoughtful question and the neat visualization (: :|: :)!
- Remember, δ is the smallest distance we've found so far. Any two points closer than δ would have been caught in our earlier comparisons.
- Consider a 2δ x δ rectangle centered on our dividing line. We can think of this as two δ x δ squares side by side.
- In a single δ x δ square, we can fit at most 4 points (one in each corner) without any pair being closer than δ.
- However, when we combine two such squares (forming a 2δ x δ rectangle centered on our dividing line), we can't maintain 4 points in each square without violating our δ distance. Here's why:
o If we try to place 4 points around the dividing line (2 on each side), these points would be within δ of the points in the far left and right corners, contradicting our assumption.
o We can only have points exactly on the dividing line, not slightly to the left or right. If we placed points at (x-ε) or (x+ε), even where ε is very small, they would still violate the δ distance with other points in their respective squares.
It's a common misconception, but the maximum is indeed 6 points, not 8. Finally, 6 or 8, there will still be a constant number of points to check in the worst case in each strip, so the complexity of the solution remains about the same.