Friday's Coding Challenge

The Lounge
c++, algorithms, architecture, performance, help
48 Posts 22 Posters
  • C Chris Maunder

    Read the fine print here[^]. :|

    cheers, Chris Maunder The Code Project | Co-founder Microsoft C++ MVP

Richard Andrew x64
#21

    Whoa, somebody missed the joke icon!

    The difficult we do right away... ...the impossible takes slightly longer.

    • C Chris Maunder

      Here's a more involved problem that is suitable for a lazy Friday afternoon. Suppose you have a table (or other structure) that stores a trillion name/value pairs. You need to look up values from this table millions of times as fast as possible, but you don't have enough memory to simply store the table in memory. One thing you do notice, though, is that the same values tend to be requested multiple times over short periods of time. So for 1 minute you may only be accessing 1000 values, repeatedly, then another minute - or hour (who knows) - you may be accessing an entirely different set of 1000 values. You can't cache the entire table. The challenge is to provide a caching algorithm that will automatically adapt to the changing subset of values being requested. Pseudo code is fine but ASM gets you Man Points.

      cheers, Chris Maunder The Code Project | Co-founder Microsoft C++ MVP

Michael Bergman
#22

I would use an LRFU[^] algorithm (a hybrid of LRU, least recently used, and LFU, least frequently used).
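
For reference, a minimal C++ sketch of the LRFU idea: every entry carries a combined recency/frequency score that decays over time, and the entry with the lowest score is evicted. The class name, the lambda constant, and the linear-scan eviction are illustrative choices, not something from this thread.

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <string>
#include <unordered_map>

// LRFU sketch: crf(t) = F(0) + F(t - t_last) * crf(t_last), F(x) = 0.5^(lambda*x).
// lambda near 0 behaves like LFU; larger lambda behaves more like LRU.
class LrfuCache {
    static constexpr double kLambda = 0.1;  // recency/frequency trade-off (illustrative)

    struct Entry {
        std::string value;
        double      crf  = 0.0;  // combined recency/frequency score
        long        last = 0;    // tick of the last access
    };

    std::unordered_map<std::string, Entry> entries_;
    std::size_t capacity_;
    long        now_ = 0;

    static double F(long age) { return std::pow(0.5, kLambda * age); }

public:
    explicit LrfuCache(std::size_t capacity) : capacity_(capacity) {}

    // Returns the cached value, or nullptr on a miss.
    const std::string* get(const std::string& key) {
        ++now_;
        auto it = entries_.find(key);
        if (it == entries_.end()) return nullptr;
        Entry& e = it->second;
        e.crf  = F(0) + F(now_ - e.last) * e.crf;  // fold the old score into the new one
        e.last = now_;
        return &e.value;
    }

    void put(const std::string& key, std::string value) {
        ++now_;
        if (entries_.size() >= capacity_ && entries_.find(key) == entries_.end()) {
            // Evict the entry whose score, aged to "now", is lowest.
            auto victim = std::min_element(
                entries_.begin(), entries_.end(),
                [this](const auto& a, const auto& b) {
                    return F(now_ - a.second.last) * a.second.crf
                         < F(now_ - b.second.last) * b.second.crf;
                });
            entries_.erase(victim);
        }
        entries_[key] = Entry{std::move(value), F(0), now_};
    }
};

A production version would keep the entries in a heap keyed on the score rather than scanning on every eviction.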

      m.bergman

      For Bruce Schneier, quanta only have one state : afraid.

      To succeed in the world it is not enough to be stupid, you must also be well-mannered. -- Voltaire

      Honesty is the best policy, but insanity is a better defense. -- Steve Landesberg

• Richard Andrew x64 wrote: "Whoa, somebody missed the joke icon!"

Rajesh R Subramanian
#23

        I did see the joke icon, but I'm sick of seeing someone or the other replying with this same "joke" every time a programming related thread is started in the lounge. Not that I'm voting on that post, but if it really is meant to be a joke, it's not even mildly funny.

        "Real men drive manual transmission" - Rajesh.

        • S Simon_Whale

I like these challenges as they give me a chance to try something beyond what I do at work! I would also like to see what the possible answer could be, too.

          Lobster Thermidor aux crevettes with a Mornay sauce, served in a Provençale manner with shallots and aubergines, garnished with truffle pate, brandy and a fried egg on top and Spam - Monty Python Spam Sketch

Chris Maunder
#24

OK, I'll throw one of our solutions into the ring, seeing as we're not getting any actual code, nor even pseudo-code (though Chris Losinger[^] was closest):

Create a nice linked list - say 5000 elements.
Decide on the number of common requests (say 1000).
For every request, check to see if it's in the list by traversing from the head element.
If the element is in the list
    If the element is in the first 1000 items
        return the value
    else
        move the value to the head of the cache
        and drop the last item in the cache if we have more than 5000 items
        and return the value
else
    Look up the value from the table
    and add it to the head of the list
    and drop the last item in the cache if we have more than 5000 items
    and return the value

The specific situation that motivated this problem was IP lookups and spiders. Generally IP lookups were random, but occasionally we'd have a single IP generating tens of thousands of lookups. We ended up running a very small (500-1000 entry) cache with a "quick lookup" section of 300 items at the head of the list. This ran faster than any other caching method we used at the time. We have since moved to a more general caching method that combines a linked list and a dictionary, so we get much faster lookup, a nice "quick lookup" area, and fast reordering. I keep meaning to post the code. One of these days...
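
For reference, a minimal C++ sketch of that pseudo-code. The names (QuickLookupCache, kQuickZone, kCapacity) and the lookup callback standing in for the big table are illustrative, and the linear scan is deliberate, as in the original; the linked-list-plus-dictionary variant mentioned above would add an unordered_map from key to list iterator to eliminate the scan.

#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <utility>

class QuickLookupCache {
    static constexpr std::size_t kQuickZone = 1000;  // "the first 1000 items"
    static constexpr std::size_t kCapacity  = 5000;  // total list size

    using Pair = std::pair<std::string, std::string>;          // name/value
    std::list<Pair> items_;                                    // head = hottest
    std::function<std::string(const std::string&)> lookup_;    // the trillion-row table

public:
    explicit QuickLookupCache(std::function<std::string(const std::string&)> table_lookup)
        : lookup_(std::move(table_lookup)) {}

    std::string get(const std::string& key) {
        std::size_t pos = 0;
        for (auto it = items_.begin(); it != items_.end(); ++it, ++pos) {
            if (it->first != key) continue;
            if (pos < kQuickZone)                       // hit in the quick zone:
                return it->second;                      // return without reordering
            items_.splice(items_.begin(), items_, it);  // else move it to the head
            return items_.front().second;
        }
        items_.emplace_front(key, lookup_(key));            // miss: hit the real table
        if (items_.size() > kCapacity) items_.pop_back();   // drop the last item
        return items_.front().second;
    }
};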

          cheers, Chris Maunder The Code Project | Co-founder Microsoft C++ MVP

          • C Chris Maunder

            :rolleyes: I'm pulling out small puzzles we have already solved and that I enjoyed solving. It's easier for me to pose a question that I have already solved (at least to a point where it works sufficiently) than to rip off programming challenges from other sites and books that people can simply Google to get the answer to. So how about a different challenge for you: come up with your own programming challenge.

            cheers, Chris Maunder The Code Project | Co-founder Microsoft C++ MVP

jesarg
#25

            I love programming problems, but I have meetings all afternoon long today and won't be able to do anything on the forums until this evening. Try me again next Friday.

• Chris Maunder (quoting the caching challenge above)

ErnestoNet
#26

The solution to that problem is "memcached" (http://memcached.org/[^]). Of course, you can write your own, but since the code is open source I'd check what they're doing. They explain some of how it works here: http://amix.dk/blog/post/19356[^]. Basically, they focus primarily on memory fragmentation. About the algorithm: "why would you waste processor cycles on finding expired items when you're not receiving any requests for it (as in, no one sees the data) *and* you haven't reached your memory constraints yet?"
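
For reference, the lazy-expiry behaviour described in that quote is easy to sketch: stamp each entry with a deadline and only check it when the entry is next requested. The names here are invented, and real memcached is considerably more involved (slab allocation, per-slab LRU, and so on).

#include <chrono>
#include <optional>
#include <string>
#include <unordered_map>

// Lazy expiration: expired entries are only noticed (and erased) when
// someone asks for them, exactly as the quoted blog post describes.
class LazyTtlCache {
    using Clock = std::chrono::steady_clock;
    struct Entry { std::string value; Clock::time_point expires; };
    std::unordered_map<std::string, Entry> map_;

public:
    void put(const std::string& key, std::string value, std::chrono::milliseconds ttl) {
        map_[key] = Entry{std::move(value), Clock::now() + ttl};
    }

    std::optional<std::string> get(const std::string& key) {
        auto it = map_.find(key);
        if (it == map_.end()) return std::nullopt;
        if (Clock::now() >= it->second.expires) {  // expired: evict lazily, report a miss
            map_.erase(it);
            return std::nullopt;
        }
        return it->second.value;
    }
};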

• Chris Maunder (quoting his cache solution above)

Simon_Whale
#27

Thanks for that, Chris. Even from that pseudo-code I could implement a coded solution. It's always good to learn something new!

                Lobster Thermidor aux crevettes with a Mornay sauce, served in a Provençale manner with shallots and aubergines, garnished with truffle pate, brandy and a fried egg on top and Spam - Monty Python Spam Sketch

• ErnestoNet (quoting the memcached reply above)

Chris Maunder
#28

                  And what caching algorithm would you use with MemCache? There are trillions of values you need to store. Assume you can't hold them all in memory, even with a distributed cache. This isn't a hardware / memory / processor problem. It's about thinking through the actual problem.

                  cheers, Chris Maunder The Code Project | Co-founder Microsoft C++ MVP

• Chris Maunder (quoting the caching challenge above)

Lost User
#29

So the first reads are from file. Cache the results in an MRU cache. Subsequent reads should hit the cache. If not, go back to the file read, then dump the LRU item off the MRU cache.

                    ============================== Nothing to say.

                    • A Andrew Rissing

                      This sounds oddly like something for CodeProject. Are you trying to cut overhead costs by outsourcing to the people who visit this site? :D Diabolical! [Edit: Ha...sounds like I wasn't the first to think such[^].]

PIEBALDconsult
#30

                      Andrew Rissing wrote:

                      Diabolical!

                      "Inconceivable!"

• Chris Maunder (quoting his cache solution above)

Bassam Abdul Baki
#31

                        Chris Maunder wrote:

                        If the element is in the first 1000 items return the value

Why not move it up the chain by adding a count of how many times it has been requested? You'd need to sort the 5,000 elements each time, but that shouldn't be a problem. To minimize sorting, you can bucket the counts (take them modulo 100) so that every fifty or so share the same number and you don't have to sort them each time.
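
For reference, a hedged sketch of that bucketing idea: compare bucketed counts (count / 100) and bubble an entry forward one slot per hit, so the 5,000 elements never need a full resort. The names are illustrative.

#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Counted {
    std::string key, value;
    unsigned    hits = 0;
};

inline unsigned bucket(const Counted& c) { return c.hits / 100; }  // ~fifty share a number

// Called on a cache hit at position i: one swap per access keeps the
// vector roughly ordered by popularity without ever sorting it outright.
void promote(std::vector<Counted>& cache, std::size_t i) {
    ++cache[i].hits;
    if (i > 0 && bucket(cache[i]) > bucket(cache[i - 1]))
        std::swap(cache[i], cache[i - 1]);
}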

                        Web - BM - RSS - Math - LinkedIn

• PIEBALDconsult (quoting the "Inconceivable!" reply above)

Andrew Rissing
#32

                          PIEBALDconsult wrote:

                          "Inconceivable!"

                          I don't think that word means what you think it means....[^]

• Bassam Abdul Baki (quoting the count/bucket suggestion above)

Chris Maunder
#33

                            I like the chunking idea, but resorting 5000 elements each time is onerous.

                            cheers, Chris Maunder The Code Project | Co-founder Microsoft C++ MVP

• Chris Maunder (quoting his memcached follow-up above)

ErnestoNet
#34

Memcache sets a TTL (in milliseconds) when it adds an entry; after it expires, it re-queries. The parameters for that TTL should reflect how often data changes in the table. I guess you could keep track of how many "visits" each item gets and how often it changes in the original table. So a simple algorithm would set the TTL from a formula: items that are visited a lot get a longer TTL, and items that change quickly get a shorter one. Memcache itself uses MRU, LRU and lazy expired-LRU cleanup when memory is full.
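
For reference, one way such a TTL formula might look. The base value, the clamping bounds, and the function name are invented for illustration.

#include <algorithm>
#include <chrono>

// More visits stretch the TTL; faster upstream churn shrinks it.
std::chrono::milliseconds pick_ttl(double visits_per_min, double changes_per_min) {
    const double base_ms = 60000.0;  // 1 minute default
    double ttl = base_ms * (1.0 + visits_per_min) / (1.0 + changes_per_min);
    ttl = std::clamp(ttl, 1000.0, 3600000.0);  // keep between 1 s and 1 h
    return std::chrono::milliseconds(static_cast<long long>(ttl));
}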

it's the journey, not the destination that matters

• Chris Maunder (quoting the "pose your own challenge" reply above)

Andrew Rissing
#35

                                I can come up with a programming challenge. The Codeproject site is down. The owner of the site would rather sleep in with the minus-silly degree weather outside. You have one phone and you have to find a way to determine his number out of the X potential numbers in Canada. Find the quickest way to wake him up before he wakes up on his own.

• Andrew Rissing (quoting his wake-Chris-up challenge above)

Chris Maunder
#36

That's too easy. You know the location is in Toronto, so hack into the PetSmart online order system, look up the last year's orders in Toronto, filter by hamster feed, order by volume descending, then take the first order, match it to the client details, and you have a phone number.

                                  cheers, Chris Maunder The Code Project | Co-founder Microsoft C++ MVP

• Chris Maunder (quoting the "resorting is onerous" reply above)

Bassam Abdul Baki
#37

                                    I agree. I added the chunking after the sorting, but kept it since sorting would still have to be done once the chunkiness gets defragmented, which shouldn't be often in theory. So do I get anything for improving on your code? Huh, huh? :D

                                    Web - BM - RSS - Math - LinkedIn

                                    • N Nagy Vilmos

                                      Chris Maunder wrote:

                                      ASM gets you Man Points.

                                      And brainfuck? Do we get points for using brainfuck?


                                      Panic, Chaos, Destruction. My work here is done. Drink. Get drunk. Fall over - P O'H OK, I will win to day or my name isn't Ethel Crudacre! - DD Ethel Crudacre I cannot live by bread alone. Bacon and ketchup are needed as well. - Trollslayer Have a bit more patience with newbies. Of course some of them act dumb - they're often *students*, for heaven's sake - Terry Pratchett

Gary Wheeler
#38

                                      You're just trying to see how long it takes Chris to modify the vulgarity filter to remove 'Brainfuck' ;).

                                      Software Zen: delete this;

• Chris Maunder (quoting the caching challenge above)

Slacker007
#39

                                        I don't have a Perl script for this so I can't help you. Next challenge... :-D

                                        Just along for the ride. "the meat from that butcher is just the dogs danglies, absolutely amazing cuts of beef." - DaveAuld (2011)
                                        "No, that is just the earthly manifestation of the Great God Retardon." - Nagy Vilmos (2011) "It is the celestial scrotum of good luck!" - Nagy Vilmos (2011)

                                        • C Chris Losinger

i'd try it with a fixed-size double-ended list. get a request for A: check the list for A, starting at the 'front'. if A is in the list, move it to the front. if A isn't in the list, add it to the front of the list. if you just added and the list has more than N items, pull the item off the back end, and discard. frequently-used items will stay near the front of the list. infrequently-used items will get pushed out, eventually. (you could probably also do this with a circular buffer.)
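
For reference, a sketch of that fixed-size list, with the dictionary index Chris Maunder mentions moving to above, so lookup is O(1) instead of a front-to-back scan. The names are illustrative.

#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

class LruList {
    std::size_t capacity_;
    std::list<std::pair<std::string, std::string>> order_;              // front = most recent
    std::unordered_map<std::string, decltype(order_)::iterator> index_; // key -> list node

public:
    explicit LruList(std::size_t capacity) : capacity_(capacity) {}

    const std::string* get(const std::string& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return nullptr;
        order_.splice(order_.begin(), order_, it->second);  // move to the front
        return &it->second->second;
    }

    void put(const std::string& key, std::string value) {
        auto it = index_.find(key);
        if (it != index_.end()) {                           // already cached: update
            it->second->second = std::move(value);
            order_.splice(order_.begin(), order_, it->second);
            return;
        }
        order_.emplace_front(key, std::move(value));
        index_[key] = order_.begin();
        if (order_.size() > capacity_) {                    // pull the back item off
            index_.erase(order_.back().first);
            order_.pop_back();
        }
    }
};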

                                          image processing toolkits | batch image processing

Marc Clifton
#40

                                          Chris Losinger wrote:

                                          frequently-used items will stay near the front of the list. infrequently-used items will get pushed out, eventually.

                                          That seems like a nice approach. Avoid the temporal issue entirely. Marc

                                          My Blog
                                          An Agile walk on the wild side with Relationship Oriented Programming
