Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. General Programming
  3. Algorithms
  4. Boundless Binary Search

Boundless Binary Search

Scheduled Pinned Locked Moved Algorithms
comdata-structuresquestion
10 Posts 4 Posters 3 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • I Offline
    I Offline
    Igor van den Hoven
    wrote on last edited by
    #1

    I've been playing around with binary search algorithms and have created a novel new variant that appears to be significantly faster than any traditional implementations I've seen.

    int boundless_binary_search(int *array, int array_size, int key)
    {
    register int mid, i;

        mid = i = array\_size - 1;
    
        while (mid > 7)
        {
                mid = mid / 2;
    
                if (key < array\[i - mid\]) i -= mid;
        }
        while (i && key < array\[i\]) --i;
    
        if (key == array\[i\])
        {
                return i;
        }
        return -1;
    

    }

    I'm wondering if this is a novel binary search implementation, or has this approach been used by someone before? Some variants and benchmark graphs: https://sites.google.com/site/binarysearchcube/binary-search[^]

    L A A 3 Replies Last reply
    0
    • I Igor van den Hoven

      I've been playing around with binary search algorithms and have created a novel new variant that appears to be significantly faster than any traditional implementations I've seen.

      int boundless_binary_search(int *array, int array_size, int key)
      {
      register int mid, i;

          mid = i = array\_size - 1;
      
          while (mid > 7)
          {
                  mid = mid / 2;
      
                  if (key < array\[i - mid\]) i -= mid;
          }
          while (i && key < array\[i\]) --i;
      
          if (key == array\[i\])
          {
                  return i;
          }
          return -1;
      

      }

      I'm wondering if this is a novel binary search implementation, or has this approach been used by someone before? Some variants and benchmark graphs: https://sites.google.com/site/binarysearchcube/binary-search[^]

      L Offline
      L Offline
      Lost User
      wrote on last edited by
      #2

      Perhaps I'm not thinking about it right (I have a cold so I'm not particularly sharp right now), but it seems to me that if the item is in the second half of the array, it will essentially devolve into a linear search.

      I 1 Reply Last reply
      0
      • L Lost User

        Perhaps I'm not thinking about it right (I have a cold so I'm not particularly sharp right now), but it seems to me that if the item is in the second half of the array, it will essentially devolve into a linear search.

        I Offline
        I Offline
        Igor van den Hoven
        wrote on last edited by
        #3

        It switches to a linear search when roughly 8 elements are left. The i != 0 check is there in case the key is smaller than the value at index 0.

        L 1 Reply Last reply
        0
        • I Igor van den Hoven

          It switches to a linear search when roughly 8 elements are left. The i != 0 check is there in case the key is smaller than the value at index 0.

          L Offline
          L Offline
          Lost User
          wrote on last edited by
          #4

          Yes ok, I guess the cold got to me. I was thinking something weird based around the assumption that mid started in the middle, which it obviously doesn't. So it works, that's good. It seems closely related to the variant which keeps a "midpoint" and a "span" (here it's the next span (mid), and the midpoint plus that next span (i)). Same pros and cons too (needs fixup at the end, but inner loop is simple), the "midpoint/span" variant is usually seen (when seen at all) in its "more complicated math in the inner loop"-form which doesn't need fixup, but then what's the point.

          I 1 Reply Last reply
          0
          • L Lost User

            Yes ok, I guess the cold got to me. I was thinking something weird based around the assumption that mid started in the middle, which it obviously doesn't. So it works, that's good. It seems closely related to the variant which keeps a "midpoint" and a "span" (here it's the next span (mid), and the midpoint plus that next span (i)). Same pros and cons too (needs fixup at the end, but inner loop is simple), the "midpoint/span" variant is usually seen (when seen at all) in its "more complicated math in the inner loop"-form which doesn't need fixup, but then what's the point.

            I Offline
            I Offline
            Igor van den Hoven
            wrote on last edited by
            #5

            Using a midpoint and a span is slower because it requires 2 assignments per loop opposed to 1.5 (on average) in my implementation. I assume it's the fixup and assignment issue that left academics stumped for the past 60 years. There are also caching issues for larger arrays, for which I've created a variant that is mindful in that regard.

            L 1 Reply Last reply
            0
            • I Igor van den Hoven

              Using a midpoint and a span is slower because it requires 2 assignments per loop opposed to 1.5 (on average) in my implementation. I assume it's the fixup and assignment issue that left academics stumped for the past 60 years. There are also caching issues for larger arrays, for which I've created a variant that is mindful in that regard.

              L Offline
              L Offline
              Lost User
              wrote on last edited by
              #6

              Gregorius van den Hoven wrote:

              requires 2 assignments per loop opposed to 1.5 (on average) in my implementation.

              Well you realize it'll be a conditional move, right? But only one, whereas regular binary search would have two (or an unpredictable branch, yuck). So you'd get something like (from GCC output)

              .L4:
              sar edx
              mov ecx, eax
              sub ecx, edx
              cmp [esi+ecx*4], ebx
              cmovg eax, ecx
              cmp edx, 7
              jg .L4

              In a quick test, GCC didn't feel like using cmovs for plain old "left and right bounds" binary search. You can easily measure a huge difference due to that sort of thing, and I'm not sure that's 100% fair, after all you could implement ye olde binary search with cmovs.

              I 1 Reply Last reply
              0
              • L Lost User

                Gregorius van den Hoven wrote:

                requires 2 assignments per loop opposed to 1.5 (on average) in my implementation.

                Well you realize it'll be a conditional move, right? But only one, whereas regular binary search would have two (or an unpredictable branch, yuck). So you'd get something like (from GCC output)

                .L4:
                sar edx
                mov ecx, eax
                sub ecx, edx
                cmp [esi+ecx*4], ebx
                cmovg eax, ecx
                cmp edx, 7
                jg .L4

                In a quick test, GCC didn't feel like using cmovs for plain old "left and right bounds" binary search. You can easily measure a huge difference due to that sort of thing, and I'm not sure that's 100% fair, after all you could implement ye olde binary search with cmovs.

                I Offline
                I Offline
                Igor van den Hoven
                wrote on last edited by
                #7

                I'm not well versed in that area. Looks like the compiler is doing something differently when you free up a registry space, but it's still slower.

                int tailing_binary_search(int *array, int array_size, int key)
                {
                register int bot, mid, i;

                    if (key < array\[0\])
                    {
                            return -1;
                    }
                
                    bot = 0;
                    i = array\_size - 1;
                    mid = i / 2;
                
                    while (bot + 7 < i)
                    {
                            if (key < array\[mid\])
                            {
                                    i = mid - 1;
                                    mid -= (i - bot) / 2;
                            }
                            else
                            {
                                    bot = mid;
                                    mid += (i - bot) / 2;
                            }
                    }
                    while (key < array\[i\]) --i;
                
                    if (key == array\[i\])
                    {
                            return i;
                    }
                    return -1;
                

                }

                L 1 Reply Last reply
                0
                • I Igor van den Hoven

                  I'm not well versed in that area. Looks like the compiler is doing something differently when you free up a registry space, but it's still slower.

                  int tailing_binary_search(int *array, int array_size, int key)
                  {
                  register int bot, mid, i;

                      if (key < array\[0\])
                      {
                              return -1;
                      }
                  
                      bot = 0;
                      i = array\_size - 1;
                      mid = i / 2;
                  
                      while (bot + 7 < i)
                      {
                              if (key < array\[mid\])
                              {
                                      i = mid - 1;
                                      mid -= (i - bot) / 2;
                              }
                              else
                              {
                                      bot = mid;
                                      mid += (i - bot) / 2;
                              }
                      }
                      while (key < array\[i\]) --i;
                  
                      if (key == array\[i\])
                      {
                              return i;
                      }
                      return -1;
                  

                  }

                  L Offline
                  L Offline
                  Lost User
                  wrote on last edited by
                  #8

                  Yea GCC at least makes some branches there in the loop, and it's using more complicated divisions by 2 (that work when negative).

                  1 Reply Last reply
                  0
                  • I Igor van den Hoven

                    I've been playing around with binary search algorithms and have created a novel new variant that appears to be significantly faster than any traditional implementations I've seen.

                    int boundless_binary_search(int *array, int array_size, int key)
                    {
                    register int mid, i;

                        mid = i = array\_size - 1;
                    
                        while (mid > 7)
                        {
                                mid = mid / 2;
                    
                                if (key < array\[i - mid\]) i -= mid;
                        }
                        while (i && key < array\[i\]) --i;
                    
                        if (key == array\[i\])
                        {
                                return i;
                        }
                        return -1;
                    

                    }

                    I'm wondering if this is a novel binary search implementation, or has this approach been used by someone before? Some variants and benchmark graphs: https://sites.google.com/site/binarysearchcube/binary-search[^]

                    A Offline
                    A Offline
                    Alan Balkany
                    wrote on last edited by
                    #9

                    This approach has been done before, e.g. the implementation of qsort in the C runtime library transitions from quicksort to a (linear) insertion sort when the partition size gets below a certain threshold.

                    1 Reply Last reply
                    0
                    • I Igor van den Hoven

                      I've been playing around with binary search algorithms and have created a novel new variant that appears to be significantly faster than any traditional implementations I've seen.

                      int boundless_binary_search(int *array, int array_size, int key)
                      {
                      register int mid, i;

                          mid = i = array\_size - 1;
                      
                          while (mid > 7)
                          {
                                  mid = mid / 2;
                      
                                  if (key < array\[i - mid\]) i -= mid;
                          }
                          while (i && key < array\[i\]) --i;
                      
                          if (key == array\[i\])
                          {
                                  return i;
                          }
                          return -1;
                      

                      }

                      I'm wondering if this is a novel binary search implementation, or has this approach been used by someone before? Some variants and benchmark graphs: https://sites.google.com/site/binarysearchcube/binary-search[^]

                      A Offline
                      A Offline
                      alexpeter
                      wrote on last edited by
                      #10

                      The code given here and within the analysis is flawed, not that much in the sense that it is not going to work, but in the sense that it follows the idea explained in a wrong way. If the idea was to reach and analyze last about 6 elements, that is not what this code is doing. The bug is subtle. For example this code, which executes if we have an element we want somewhere within first few elements

                          int mid, i;
                          
                          mid = i = 2147483647-1;
                      
                          while (mid > 7)
                          {
                                  mid = mid / 2;
                      
                                  i -= mid;
                          }
                          
                          printf("%d\\n",i);
                      

                      is giving 34, which means that we will have 34 elements(!) to parse more if the element we are looking for is somewhere at the beginning. This is not even near 7. Only if this code is corrected and tested, we can really compare the idea. To have an algorithm that occasionally parse through 34 elements is really not a binary search even if this is happening here and there. So a very BIG caution about implementing any new idea regarding binary search. Yes, there could be the way and this idea above could be something useful but that must be verified. It seems from this perspective that the flaw of finding first few elements significantly slower is what made this algorithm faster on average. In any case the code above cannot be taken as correct. The gymnastic of pointer and size (here incorrectly named mid) is much more complicated. Known solution with pointers and size with deferred detection looks more like this

                      int binary_search3s(int a[], int n, int t)
                      {
                      int p=0;

                      while(n>1)
                      {
                          if(a\[p + n/2\]<=t)
                          {
                              p+=n/2;
                              n++;
                          }
                      
                          n=n/2;
                      
                      }
                      
                      if(a\[p\]==t) return p;
                      
                      return -1;
                      

                      }

                      This should be a starting point if anything is to be improved. (3s in the function name is from a family of solutions.)

                      Aleksandar

                      1 Reply Last reply
                      0
                      Reply
                      • Reply as topic
                      Log in to reply
                      • Oldest to Newest
                      • Newest to Oldest
                      • Most Votes


                      • Login

                      • Don't have an account? Register

                      • Login or register to search.
                      • First post
                        Last post
                      0
                      • Categories
                      • Recent
                      • Tags
                      • Popular
                      • World
                      • Users
                      • Groups