Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. General Programming
  3. C#
  4. List compare algorithm

List compare algorithm

Scheduled Pinned Locked Moved C#
algorithmsregextutoriallounge
16 Posts 4 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • S Stevo Z

    Hi, I would like to compare 2 instances of List < Point > which meet following specification. - both lists contain uneven count of random Points (List < Point > ) - special method evaluates two points (bool MatchPoints(ref Point p1, ref Point p2); ) - there is no similarity in points in those lists. They can be perfectly similar or they can have no points in commmon at all. - it is possible to do whatever processing (hashing etc...) of those lists when created, because they won't change much (if at all) after created. - result should be saying percentage of points that match from all evaluated points - there will be up to 1000 points approx in a single list. simplyfied Example:

    bool MatchPoints(ref Point p1, ref Point p2)
    {
    return (p1.X == p2.X && p1.Y == p2.Y);
    }

    List<point> l1, l2;

    l1 = {[0,0], [1,1], [2,2], [3,3]}
    l2 = {[1,1], [3,3], [4,4]}

    points that match = {[1,1] , [3,3]}
    points that don't match = {[0,0], [2,2], [4,4]}

    percentage matched = 2*2 / 7 = 4/7;

    any ideas :) thank you

    zilo

    P Offline
    P Offline
    Pete OHanlon
    wrote on last edited by
    #2

    Very good. Now what's your question? What have you tried and where have you failed?

    Deja View - the feeling that you've seen this post before.

    My blog | My articles

    S 1 Reply Last reply
    0
    • P Pete OHanlon

      Very good. Now what's your question? What have you tried and where have you failed?

      Deja View - the feeling that you've seen this post before.

      My blog | My articles

      S Offline
      S Offline
      Stevo Z
      wrote on last edited by
      #3

      My question is what would be the best algorithm in terms of speed to do that. I haven't tried anything yet, although I have some ideas. I real, the MatchPoints method evaluates distance between two points . They don't have to be exactly the same to match, they just need to be whithin a specified distance. Now what I'm trying to do is create a Region from rectangles (should be circles but I guess rectangles approach will be faster and much less expensive) where each point is the center of it's rectangle. Union all rectangles to region. So after I create the list of points, I also create Region that these points 'cover'. After that I just need to run points from one list on the other's list Region for IsVisible(Point point) constraint. I need to test it to see how it performs...

      zilo

      1 Reply Last reply
      0
      • S Stevo Z

        Hi, I would like to compare 2 instances of List < Point > which meet following specification. - both lists contain uneven count of random Points (List < Point > ) - special method evaluates two points (bool MatchPoints(ref Point p1, ref Point p2); ) - there is no similarity in points in those lists. They can be perfectly similar or they can have no points in commmon at all. - it is possible to do whatever processing (hashing etc...) of those lists when created, because they won't change much (if at all) after created. - result should be saying percentage of points that match from all evaluated points - there will be up to 1000 points approx in a single list. simplyfied Example:

        bool MatchPoints(ref Point p1, ref Point p2)
        {
        return (p1.X == p2.X && p1.Y == p2.Y);
        }

        List<point> l1, l2;

        l1 = {[0,0], [1,1], [2,2], [3,3]}
        l2 = {[1,1], [3,3], [4,4]}

        points that match = {[1,1] , [3,3]}
        points that don't match = {[0,0], [2,2], [4,4]}

        percentage matched = 2*2 / 7 = 4/7;

        any ideas :) thank you

        zilo

        C Offline
        C Offline
        carbon_golem
        wrote on last edited by
        #4

        Do you have to use your own Point class? I think System.Windows.Point has the == operator overloaded already so matching code would be a little simpler, if you're already using it, then disregard. I ran into this problem before, and I started with the smaller of the 2 lists to use as a starting point as you're not going to have any match lists longer than the shortest list. But you're still stuck with a runtime of O(n2). ( I think that's right, but it's been a while for O notation). You could also put your two point lists into Generic Lists and if you're using .NET3.5 there is a Union<> generic extension method that can be used, but I'm unsure if that'll buy you any performance. I looked at the IL generated by the MatchPoints function, and it's 37 Bytes. If you drop the ref and do p1 == p2 the codesize drops to 13 bytes in IL.

        "Run for your life from any man who tells you that money is evil. That sentence is the leper's bell of an approaching looter." --Ayn Rand

        S 1 Reply Last reply
        0
        • C carbon_golem

          Do you have to use your own Point class? I think System.Windows.Point has the == operator overloaded already so matching code would be a little simpler, if you're already using it, then disregard. I ran into this problem before, and I started with the smaller of the 2 lists to use as a starting point as you're not going to have any match lists longer than the shortest list. But you're still stuck with a runtime of O(n2). ( I think that's right, but it's been a while for O notation). You could also put your two point lists into Generic Lists and if you're using .NET3.5 there is a Union<> generic extension method that can be used, but I'm unsure if that'll buy you any performance. I looked at the IL generated by the MatchPoints function, and it's 37 Bytes. If you drop the ref and do p1 == p2 the codesize drops to 13 bytes in IL.

          "Run for your life from any man who tells you that money is evil. That sentence is the leper's bell of an approaching looter." --Ayn Rand

          S Offline
          S Offline
          Stevo Z
          wrote on last edited by
          #5

          Hi, thanx for answer, I've created my own Point2D struct because accessing fields directly without Properties is much faster when considering a lot of iterations. I can overload the == operator in my Point2D struct same way as it is in Point struct if I needed... yes my lists are Generic lists (using .NET 2.0). I don't understand what's O(n2) :) .

          zilo

          C D 2 Replies Last reply
          0
          • S Stevo Z

            Hi, thanx for answer, I've created my own Point2D struct because accessing fields directly without Properties is much faster when considering a lot of iterations. I can overload the == operator in my Point2D struct same way as it is in Point struct if I needed... yes my lists are Generic lists (using .NET 2.0). I don't understand what's O(n2) :) .

            zilo

            C Offline
            C Offline
            carbon_golem
            wrote on last edited by
            #6

            Could you sort the lists? Maybe adding X+Y and sorting the lists on that parameter, and then doing a comparison to see if a match is possible may prune the searches down. The worst case is that all the points add to the same number and you're doing an extra comparison. But practically, you'd probably cut on average half of the possible cases. And there's definitely some tricks you can do by saving positions in the list where you know values can't match. For example, when comparing near the end of the list, you know that sorted elements in the other list's beginning aren't going to match, so skip over them using a fencepost parameter you have saved. Hope this helps.

            "Run for your life from any man who tells you that money is evil. That sentence is the leper's bell of an approaching looter." --Ayn Rand

            S 1 Reply Last reply
            0
            • C carbon_golem

              Could you sort the lists? Maybe adding X+Y and sorting the lists on that parameter, and then doing a comparison to see if a match is possible may prune the searches down. The worst case is that all the points add to the same number and you're doing an extra comparison. But practically, you'd probably cut on average half of the possible cases. And there's definitely some tricks you can do by saving positions in the list where you know values can't match. For example, when comparing near the end of the list, you know that sorted elements in the other list's beginning aren't going to match, so skip over them using a fencepost parameter you have saved. Hope this helps.

              "Run for your life from any man who tells you that money is evil. That sentence is the leper's bell of an approaching looter." --Ayn Rand

              S Offline
              S Offline
              Stevo Z
              wrote on last edited by
              #7

              I was thiking of that too but I can't figure out how to sort it. Just adding x+y doesn't tell a thing about that point location. e.g you have two points [0,20] and [20,0]. They both have sum of 20 hence are far away from each other.

              zilo

              C 1 Reply Last reply
              0
              • S Stevo Z

                I was thiking of that too but I can't figure out how to sort it. Just adding x+y doesn't tell a thing about that point location. e.g you have two points [0,20] and [20,0]. They both have sum of 20 hence are far away from each other.

                zilo

                C Offline
                C Offline
                carbon_golem
                wrote on last edited by
                #8

                [0, 20][20, 0] have the same sum, but aren't close... but if they have the same sum, they could be the same. you know for a fact that if 2 points do not have the same sum, then they aren't ever going to be equivalent. Now, knowing that, if you sort on the sum of the two you would have that to use also. So the sums, would be ascending ala.. List A:[1][2][3][3][3][4][5][5][5][78][9000]... (Longer) List B:[1][1][3][4][4][4][5][5][6][78][9000]... (Shorter) What I'm saying is this. Sort both on the sum. Pick the shorter list to use for the base comparison. The following code won't work, but you should get the idea from that. This doesn't use any kind of fencepost. If it were, we'd save the index of the last matched in the list and use that as a starting point instead of beginning at the front of the longer list. foreach(Point a in shorter){ foreach(Point b in longer){ if(a.Sum != b.Sum) break; if(a == b) union.Add(a); break; } }

                "Run for your life from any man who tells you that money is evil. That sentence is the leper's bell of an approaching looter." --Ayn Rand

                S 1 Reply Last reply
                0
                • S Stevo Z

                  Hi, thanx for answer, I've created my own Point2D struct because accessing fields directly without Properties is much faster when considering a lot of iterations. I can overload the == operator in my Point2D struct same way as it is in Point struct if I needed... yes my lists are Generic lists (using .NET 2.0). I don't understand what's O(n2) :) .

                  zilo

                  D Offline
                  D Offline
                  DavidNohejl
                  wrote on last edited by
                  #9

                  Zilo(svk) wrote:

                  I've created my own Point2D struct because accessing fields directly without Properties is much faster when considering a lot of iterations.

                  Are you sure? I belive JIT compiler does optimize it away, so you are only doing extra work...

                  Zilo(svk) wrote:

                  I don't understand what's O(n2)

                  http://en.wikipedia.org/wiki/Big_O_notation[^] Oversimplification: for N points, it will do N*N "operations".


                  [My Blog]
                  "Visual studio desperately needs some performance improvements. It is sometimes almost as slow as eclipse." - Rüdiger Klaehn
                  "Real men use mspaint for writing code and notepad for designing graphics." - Anna-Jayne Metcalfe

                  S 1 Reply Last reply
                  0
                  • C carbon_golem

                    [0, 20][20, 0] have the same sum, but aren't close... but if they have the same sum, they could be the same. you know for a fact that if 2 points do not have the same sum, then they aren't ever going to be equivalent. Now, knowing that, if you sort on the sum of the two you would have that to use also. So the sums, would be ascending ala.. List A:[1][2][3][3][3][4][5][5][5][78][9000]... (Longer) List B:[1][1][3][4][4][4][5][5][6][78][9000]... (Shorter) What I'm saying is this. Sort both on the sum. Pick the shorter list to use for the base comparison. The following code won't work, but you should get the idea from that. This doesn't use any kind of fencepost. If it were, we'd save the index of the last matched in the list and use that as a starting point instead of beginning at the front of the longer list. foreach(Point a in shorter){ foreach(Point b in longer){ if(a.Sum != b.Sum) break; if(a == b) union.Add(a); break; } }

                    "Run for your life from any man who tells you that money is evil. That sentence is the leper's bell of an approaching looter." --Ayn Rand

                    S Offline
                    S Offline
                    Stevo Z
                    wrote on last edited by
                    #10

                    Your approach is right but, ... I dont know if you red my second post. Matching points is not a matter if they are the same, it is a matter of the distance between then. They may be the same, but they may not e.g. lets say const distance = 3; p1 [0,0] p2 [2,2] distance between p1 and p2 is: d = Math.Sqrt((p2.X-p1.X)^2 + (p2.Y-p1.Y)^2) = 2,3... (2,3 < 3) == true so p1 and p2 match.

                    zilo

                    C 1 Reply Last reply
                    0
                    • D DavidNohejl

                      Zilo(svk) wrote:

                      I've created my own Point2D struct because accessing fields directly without Properties is much faster when considering a lot of iterations.

                      Are you sure? I belive JIT compiler does optimize it away, so you are only doing extra work...

                      Zilo(svk) wrote:

                      I don't understand what's O(n2)

                      http://en.wikipedia.org/wiki/Big_O_notation[^] Oversimplification: for N points, it will do N*N "operations".


                      [My Blog]
                      "Visual studio desperately needs some performance improvements. It is sometimes almost as slow as eclipse." - Rüdiger Klaehn
                      "Real men use mspaint for writing code and notepad for designing graphics." - Anna-Jayne Metcalfe

                      S Offline
                      S Offline
                      Stevo Z
                      wrote on last edited by
                      #11

                      Run this code

                      using System.Drawing;
                      using System.Diagnostics;
                      using System;

                      namespace ConsoleApplication1
                      {
                      class Program
                      {
                      static void Main(string[] args)
                      {
                      const int count = 100000000;
                      int x = 0;
                      Stopwatch watch = new Stopwatch();

                              Console.WriteLine("Starting Point iteration");
                              Point p = new Point(1, 1);
                              Point2D p2d = new Point2D(1, 1);
                              watch.Start();            
                              for (int i = 0; i < count; i++)
                              { 
                                  x = p.X + p.Y;
                              }
                              watch.Stop();
                              Console.WriteLine("Point iteration took {0} mls\\n\\r", watch.ElapsedMilliseconds);
                              watch.Reset();
                              Console.WriteLine("Starting Point2D iteration");
                              watch.Start();     
                              for (int i = 0; i < count; i++)
                              {               
                                  x = p2d.X + p2d.Y;
                              }
                              watch.Stop();
                              Console.WriteLine("Point2D iteration took {0} mls", watch.ElapsedMilliseconds);
                      
                              Console.ReadKey();
                          }
                      
                          struct Point2D
                          {
                              public int X, Y;
                      
                              public Point2D(int x, int y)
                              {
                                  this.X = x;
                                  this.Y = y;
                              }
                          }
                      }   
                      

                      }

                      zilo

                      D 1 Reply Last reply
                      0
                      • S Stevo Z

                        Your approach is right but, ... I dont know if you red my second post. Matching points is not a matter if they are the same, it is a matter of the distance between then. They may be the same, but they may not e.g. lets say const distance = 3; p1 [0,0] p2 [2,2] distance between p1 and p2 is: d = Math.Sqrt((p2.X-p1.X)^2 + (p2.Y-p1.Y)^2) = 2,3... (2,3 < 3) == true so p1 and p2 match.

                        zilo

                        C Offline
                        C Offline
                        carbon_golem
                        wrote on last edited by
                        #12

                        Sorry for the confusion there. I presume that you're familiar with connected graphs. I'd pre-generate the graph and it's weights to start from. You'll be stuck with the math of making the distances for every node to every other node, but you won't be doing it more than once. But once you have a list of nodes and distances to other nodes you can simply sort out the distances to get the number of nodes that have distances less than what you're looking for. Now you're stuck with a math optimization problem, and there's lots on the net about squeezing performance out of a distance equation.

                        "Run for your life from any man who tells you that money is evil. That sentence is the leper's bell of an approaching looter." --Ayn Rand

                        S 1 Reply Last reply
                        0
                        • C carbon_golem

                          Sorry for the confusion there. I presume that you're familiar with connected graphs. I'd pre-generate the graph and it's weights to start from. You'll be stuck with the math of making the distances for every node to every other node, but you won't be doing it more than once. But once you have a list of nodes and distances to other nodes you can simply sort out the distances to get the number of nodes that have distances less than what you're looking for. Now you're stuck with a math optimization problem, and there's lots on the net about squeezing performance out of a distance equation.

                          "Run for your life from any man who tells you that money is evil. That sentence is the leper's bell of an approaching looter." --Ayn Rand

                          S Offline
                          S Offline
                          Stevo Z
                          wrote on last edited by
                          #13

                          hmmm. this would mean to build a graph of all points(nodes) in graphics rectangle? I dnon't think I got you right, because this would mean to connect all points on screen, which is awful lot. But you gave me good idea. Maybe I could hash allready calculated paths and just get the result next time same path will be calculated.

                          zilo

                          C 1 Reply Last reply
                          0
                          • S Stevo Z

                            hmmm. this would mean to build a graph of all points(nodes) in graphics rectangle? I dnon't think I got you right, because this would mean to connect all points on screen, which is awful lot. But you gave me good idea. Maybe I could hash allready calculated paths and just get the result next time same path will be calculated.

                            zilo

                            C Offline
                            C Offline
                            carbon_golem
                            wrote on last edited by
                            #14

                            Awesome... Here's a link to a good article about using some .NET 3.5 stuff to cache results. I know you're using 2.0, but it's worth the read and it may give you more ideas. http://msdn2.microsoft.com/en-us/vcsharp/bb870976.aspx[^]

                            "Run for your life from any man who tells you that money is evil. That sentence is the leper's bell of an approaching looter." --Ayn Rand

                            1 Reply Last reply
                            0
                            • S Stevo Z

                              Run this code

                              using System.Drawing;
                              using System.Diagnostics;
                              using System;

                              namespace ConsoleApplication1
                              {
                              class Program
                              {
                              static void Main(string[] args)
                              {
                              const int count = 100000000;
                              int x = 0;
                              Stopwatch watch = new Stopwatch();

                                      Console.WriteLine("Starting Point iteration");
                                      Point p = new Point(1, 1);
                                      Point2D p2d = new Point2D(1, 1);
                                      watch.Start();            
                                      for (int i = 0; i < count; i++)
                                      { 
                                          x = p.X + p.Y;
                                      }
                                      watch.Stop();
                                      Console.WriteLine("Point iteration took {0} mls\\n\\r", watch.ElapsedMilliseconds);
                                      watch.Reset();
                                      Console.WriteLine("Starting Point2D iteration");
                                      watch.Start();     
                                      for (int i = 0; i < count; i++)
                                      {               
                                          x = p2d.X + p2d.Y;
                                      }
                                      watch.Stop();
                                      Console.WriteLine("Point2D iteration took {0} mls", watch.ElapsedMilliseconds);
                              
                                      Console.ReadKey();
                                  }
                              
                                  struct Point2D
                                  {
                                      public int X, Y;
                              
                                      public Point2D(int x, int y)
                                      {
                                          this.X = x;
                                          this.Y = y;
                                      }
                                  }
                              }   
                              

                              }

                              zilo

                              D Offline
                              D Offline
                              DavidNohejl
                              wrote on last edited by
                              #15

                              Right... Debug, without optimalizations: Starting Point iteration Point iteration took 477 mls Starting Point2D iteration Point2D iteration took 263 mls Release, with optimalizations on: Starting Point iteration Point iteration took 45 mls Starting Point2D iteration Point2D iteration took 43 mls And same configuration, second run: Starting Point iteration Point iteration took 43 mls Starting Point2D iteration Point2D iteration took 43 mls


                              [My Blog]
                              "Visual studio desperately needs some performance improvements. It is sometimes almost as slow as eclipse." - Rüdiger Klaehn
                              "Real men use mspaint for writing code and notepad for designing graphics." - Anna-Jayne Metcalfe

                              S 1 Reply Last reply
                              0
                              • D DavidNohejl

                                Right... Debug, without optimalizations: Starting Point iteration Point iteration took 477 mls Starting Point2D iteration Point2D iteration took 263 mls Release, with optimalizations on: Starting Point iteration Point iteration took 45 mls Starting Point2D iteration Point2D iteration took 43 mls And same configuration, second run: Starting Point iteration Point iteration took 43 mls Starting Point2D iteration Point2D iteration took 43 mls


                                [My Blog]
                                "Visual studio desperately needs some performance improvements. It is sometimes almost as slow as eclipse." - Rüdiger Klaehn
                                "Real men use mspaint for writing code and notepad for designing graphics." - Anna-Jayne Metcalfe

                                S Offline
                                S Offline
                                Stevo Z
                                wrote on last edited by
                                #16

                                You're right, thank you.

                                zilo

                                1 Reply Last reply
                                0
                                Reply
                                • Reply as topic
                                Log in to reply
                                • Oldest to Newest
                                • Newest to Oldest
                                • Most Votes


                                • Login

                                • Don't have an account? Register

                                • Login or register to search.
                                • First post
                                  Last post
                                0
                                • Categories
                                • Recent
                                • Tags
                                • Popular
                                • World
                                • Users
                                • Groups