Note: This post is over two years old and so the information contained here might be out of date. If you do spot something please leave a comment and we will endeavour to correct.
19th June 2018 - 5 minutes read time
I have been doing some reading and watching lectures of programming theory recently and I was reminded of this algorithm I learned about in university. Binary searching an array is a divide and conquer algorithm that takes an array and searches for a value in that array by splitting the array into halves. The algorithm works like this.
Given a sorted array, find the midpoint.
If the value at the mid point is greater than the value being searched for then the value must be in the first half of the array.
If the value at the mid point is less than the value being searched for then the value must be in the first half of the array.
Take the half of the array in question and repeat the first step by re-pointing the mid point.
Repeat this until the length of the array is 1 or 0. If the array length is 1 then this is the value. If the length of the array is 0 then the value was not in the array.
There is some implementation details around ensuring that the middle value is correct and to avoid running off the end of the array.
Here is an implementation of the algorithm in PHP.
/**
* Use binary search to find a key of a value in an array.
*
* @param array $array
* The array to search for the value.
* @param int $value
* A value to be searched.
*
* @return int|null
* Returns the key of the value in the array, or null if the value is not found.
*/
function binarySearch($array, $value) {
// Set the left pointer to 0.
$left = 0;
// Set the right pointer to the length of the array -1.
$right = count($array) - 1;
while ($left <= $right) {
// Set the initial midpoint to the rounded down value of half the length of the array.
$midpoint = (int) floor(($left + $right) / 2);
if ($array[$midpoint] < $value) {
// The midpoint value is less than the value.
$left = $midpoint + 1;
} elseif ($array[$midpoint] > $value) {
// The midpoint value is greater than the value.
$right = $midpoint - 1;
} else {
// This is the key we are looking for.
return $midpoint;
}
}
// The value was not found.
return NULL;
}
To run some tests on this algorithm I ran the following code. All of which prints true.
// Generate an array.
$array = range(0, 10);
// Loop through the array, searching for each value.
foreach ($array as $key => $value) {
echo var_export(binarySearch($array, $value) === $value, TRUE) . PHP_EOL;
}
// Search for values outside of the array.
echo var_export(binarySearch($array, -1) === NULL, TRUE) . PHP_EOL;
echo var_export(binarySearch($array, 11) === NULL, TRUE) . PHP_EOL;
This is a neat little algorithm that is great for searching over arrays of sorted data where the keys of the array are sequential. Much more performant than simply looping through the array to find the value.
This is because the count() function will return the length of the array, but if we want to get the last item in the array using this value it will be 1 item beyond the end of the array.
The array I'm creating here starts from 0 so although the length is 10, the index of the last item is 9.
Name
Philip Norton
Submitted by giHlZp8M8D on Tue, 03/31/2020 - 15:17
A good binary search implementation does not return NULL if element is not found. It should return the correct insertion index for the element instead. This way it can serve both purposes (retrieval and insertion), because it only costs O(1) to check if the element at the returned index is the needle.
That's actually a really good point. I initially though "how would you know if the value was not found versus a found index", but you are right in that you would only need to go and check the index for the return value. I might have another go at this.
Thank you for the comment!
Name
Philip Norton
Submitted by giHlZp8M8D on Mon, 06/21/2021 - 08:37
XML is a useful format for configuration, data storage, and transmitting data from one system to another. As a human readable format that can be easily read by machines it quickly gained favor in lots of different systems as a mechanism for data storage.
Pi day was a few weeks ago, but I came across this simple approximation of pi recently and decided to put together an example in PHP since it seemed pretty simple.
This approximation of pi centers around a real world example, but we can simulate this using some code.
A parabolic curve is a type of curve where every point is an equal distance from a focal point. There a number of different way to generate this sort of curve using math, but one of the simplest is to use straight lines to create the illusion of the curve.
I quite like the end of the year report from Spotify that they call "Wrapped". This is a little application in which they tell you what your favorite artist was and what sort of genres you listened to the most during the year.
Let's say you had a class that you wanted to use, but there was some sort of error in creating the object. This might be that the wrong parameters were passed, or the third party service (eg. a database) wasn't available at the time of creation.
Comments
Why do you count right as count($array) - 1. Why do you need this -1 ?
Submitted by Valeriy on Mon, 03/30/2020 - 12:27
PermalinkThis is because the count() function will return the length of the array, but if we want to get the last item in the array using this value it will be 1 item beyond the end of the array.
The array I'm creating here starts from 0 so although the length is 10, the index of the last item is 9.
Submitted by giHlZp8M8D on Tue, 03/31/2020 - 15:17
PermalinkA good binary search implementation does not return NULL if element is not found. It should return the correct insertion index for the element instead. This way it can serve both purposes (retrieval and insertion), because it only costs O(1) to check if the element at the returned index is the needle.
Submitted by BASTA! on Sun, 06/20/2021 - 21:03
PermalinkThat's actually a really good point. I initially though "how would you know if the value was not found versus a found index", but you are right in that you would only need to go and check the index for the return value. I might have another go at this.
Thank you for the comment!
Submitted by giHlZp8M8D on Mon, 06/21/2021 - 08:37
PermalinkAdd new comment