Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. General Programming
  3. C / C++ / MFC
  4. 'diff' algorithm

'diff' algorithm

Scheduled Pinned Locked Moved C / C++ / MFC
performancealgorithmsquestionlearning
5 Posts 3 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • M Offline
    M Offline
    moliate
    wrote on last edited by
    #1

    I need to find differences between two large chunks of memory where the differences themselves are fairly small. I have downloaded GNU diff sources (Eugene Meyers?) and need to compare performance on my data with other diff-algorithms, like McIlroy-Hunt. I have only basic knowleadge on the topic, so any webresources or book recommendations would be greatly appreciated. Thanks /moliate


    The corners of my eyes catch hasty, bloodless motion - a mouse? Well, certainly a peripheral of some kind.

    Neil Gaiman - Cold Colours

    S S 2 Replies Last reply
    0
    • M moliate

      I need to find differences between two large chunks of memory where the differences themselves are fairly small. I have downloaded GNU diff sources (Eugene Meyers?) and need to compare performance on my data with other diff-algorithms, like McIlroy-Hunt. I have only basic knowleadge on the topic, so any webresources or book recommendations would be greatly appreciated. Thanks /moliate


      The corners of my eyes catch hasty, bloodless motion - a mouse? Well, certainly a peripheral of some kind.

      Neil Gaiman - Cold Colours

      S Offline
      S Offline
      Shog9 0
      wrote on last edited by
      #2

      This may be overkill for your needs, but check out: http://sourceforge.net/projects/xdelta/[^] Shog9 ------

      That why you have a dual processor system. One for system, one for the screen saver - Mark Nischalke on Win2k server administration

      M 1 Reply Last reply
      0
      • S Shog9 0

        This may be overkill for your needs, but check out: http://sourceforge.net/projects/xdelta/[^] Shog9 ------

        That why you have a dual processor system. One for system, one for the screen saver - Mark Nischalke on Win2k server administration

        M Offline
        M Offline
        moliate
        wrote on last edited by
        #3

        Looks very interesting. I've printed the article and will read it tonight. Thanks a lot! /moliate

        1 Reply Last reply
        0
        • M moliate

          I need to find differences between two large chunks of memory where the differences themselves are fairly small. I have downloaded GNU diff sources (Eugene Meyers?) and need to compare performance on my data with other diff-algorithms, like McIlroy-Hunt. I have only basic knowleadge on the topic, so any webresources or book recommendations would be greatly appreciated. Thanks /moliate


          The corners of my eyes catch hasty, bloodless motion - a mouse? Well, certainly a peripheral of some kind.

          Neil Gaiman - Cold Colours

          S Offline
          S Offline
          Scott H Settlemier
          wrote on last edited by
          #4

          nice discussion here[^] The problem with a big chunk of memory is the pure size of the string. Most diff-like programs use something other than a byte or character as the basic string element. (e.g. a line of text is probably the most common string element for the diff algorithm. hence the 'string' is really only as long as the number of lines in the file.) If you can give up the minimum # of byte changes in the two memory blocks, you can get better performance by using some suitable 'chunk' definition to break up memory into the elements of the strings to be passed for diff analysis. I'm no expert on this but was implementing something similar just recently where the diff was first calculated on our chunks, and only if the changed regions were small, would a finer intra-chunk diff be performed. Be careful though, your chunk definition should have the property that changes naturally fall within chunks, not across chunk boundaries.

          M 1 Reply Last reply
          0
          • S Scott H Settlemier

            nice discussion here[^] The problem with a big chunk of memory is the pure size of the string. Most diff-like programs use something other than a byte or character as the basic string element. (e.g. a line of text is probably the most common string element for the diff algorithm. hence the 'string' is really only as long as the number of lines in the file.) If you can give up the minimum # of byte changes in the two memory blocks, you can get better performance by using some suitable 'chunk' definition to break up memory into the elements of the strings to be passed for diff analysis. I'm no expert on this but was implementing something similar just recently where the diff was first calculated on our chunks, and only if the changed regions were small, would a finer intra-chunk diff be performed. Be careful though, your chunk definition should have the property that changes naturally fall within chunks, not across chunk boundaries.

            M Offline
            M Offline
            moliate
            wrote on last edited by
            #5

            Thanks. My problem is mostly that the string cannot be divided into natural chunks, meaning that the insertion of a single byte could impose a lot of extra processing. I'll look more closely at the PERL code for the LCS. It seems to have some memory improvements over full table LCS, which is very nice when working with strings of 100kb each. Cheers /moliate


            The corners of my eyes catch hasty, bloodless motion - a mouse? Well, certainly a peripheral of some kind.

            Neil Gaiman - Cold Colours

            1 Reply Last reply
            0
            Reply
            • Reply as topic
            Log in to reply
            • Oldest to Newest
            • Newest to Oldest
            • Most Votes


            • Login

            • Don't have an account? Register

            • Login or register to search.
            • First post
              Last post
            0
            • Categories
            • Recent
            • Tags
            • Popular
            • World
            • Users
            • Groups