tag:blogger.com,1999:blog-8858921836121088131.post2361728546414156045..comments2024-03-24T09:39:21.170-07:00Comments on Adam Horvath's blog: Beating the binary search algorithm – interpolation search, galloping searchAdamhttp://www.blogger.com/profile/09720738376586525885noreply@blogger.comBlogger1125tag:blogger.com,1999:blog-8858921836121088131.post-79287672770499945092017-05-27T22:28:37.828-07:002017-05-27T22:28:37.828-07:00You state: "One of the big confusions around ...You state: "One of the big confusions around interpolation search is that the O(log(log(n)) number of comparisons may yield O(log(log(n)) runtime" and "If the guesses are accurate, the number of comparisons can be around O(log(log(n)), runtime around O(log(n))"<br /><br />This is not correct.<br /><br />Mind you that both 2*log(log(n)) and 1000000*log(log(n)) are both O(log(log(n)). For any constant c, c*f(n) is O(f(n)).<br /><br />The c tends to be comparatively large in case of interpolation search vs other usual search algorithms. And it depends on implementation[*]. But with large enough n asymptotically faster algorithm will win. <br /><br />If one looks at the performance data presented one could see that, when compared with example binary search code, example interpolation search has a lead of ~15% for n > 500000. Even built-in binary search (reliably faster that anything else) is virtually equal at the biggest n (n=81920000).<br /><br />So yes, interpolation search will be faster when data is big enough. How big depends on how optimized implementation is[*], here it looks that it would be for n somewhere between 0.5 million and 100+ million.<br /><br /><br />There is other issue with interpolation search, not mentioned here:<br />while it's expected cost is O(log(log(n))), it's pessimistic cost is O(n). Try filling test arrays with numbers obtained the following way (of course sort it): <br />(int)Math.pow(62*Math.random(), 1.41)<br /><br /><br />*] Example implementation is not optimized. It converts back and forth between int and float (which is slow), it has extraneous ifs, etc...Anonymoushttps://www.blogger.com/profile/05589634664900182465noreply@blogger.com