List compare algorithm

Pete OHanlon

Very good. Now what's your question? What have you tried and where have you failed?

Deja View - the feeling that you've seen this post before.

Stevo Z

My question is what would be the best algorithm in terms of speed to do that. I haven't tried anything yet, although I have some ideas. I real, the MatchPoints method evaluates distance between two points . They don't have to be exactly the same to match, they just need to be whithin a specified distance. Now what I'm trying to do is create a Region from rectangles (should be circles but I guess rectangles approach will be faster and much less expensive) where each point is the center of it's rectangle. Union all rectangles to region. So after I create the list of points, I also create Region that these points 'cover'. After that I just need to run points from one list on the other's list Region for IsVisible(Point point) constraint. I need to test it to see how it performs...

zilo

carbon_golem

Do you have to use your own Point class? I think System.Windows.Point has the == operator overloaded already so matching code would be a little simpler, if you're already using it, then disregard. I ran into this problem before, and I started with the smaller of the 2 lists to use as a starting point as you're not going to have any match lists longer than the shortest list. But you're still stuck with a runtime of O(n2). ( I think that's right, but it's been a while for O notation). You could also put your two point lists into Generic Lists and if you're using .NET3.5 there is a Union<> generic extension method that can be used, but I'm unsure if that'll buy you any performance. I looked at the IL generated by the MatchPoints function, and it's 37 Bytes. If you drop the ref and do p1 == p2 the codesize drops to 13 bytes in IL.

"Run for your life from any man who tells you that money is evil. That sentence is the leper's bell of an approaching looter." --Ayn Rand

Stevo Z

Hi, thanx for answer, I've created my own Point2D struct because accessing fields directly without Properties is much faster when considering a lot of iterations. I can overload the == operator in my Point2D struct same way as it is in Point struct if I needed... yes my lists are Generic lists (using .NET 2.0). I don't understand what's O(n2) :) .

zilo

carbon_golem

Could you sort the lists? Maybe adding X+Y and sorting the lists on that parameter, and then doing a comparison to see if a match is possible may prune the searches down. The worst case is that all the points add to the same number and you're doing an extra comparison. But practically, you'd probably cut on average half of the possible cases. And there's definitely some tricks you can do by saving positions in the list where you know values can't match. For example, when comparing near the end of the list, you know that sorted elements in the other list's beginning aren't going to match, so skip over them using a fencepost parameter you have saved. Hope this helps.

"Run for your life from any man who tells you that money is evil. That sentence is the leper's bell of an approaching looter." --Ayn Rand

Stevo Z

I was thiking of that too but I can't figure out how to sort it. Just adding x+y doesn't tell a thing about that point location. e.g you have two points [0,20] and [20,0]. They both have sum of 20 hence are far away from each other.

zilo

carbon_golem

[0, 20][20, 0] have the same sum, but aren't close... but if they have the same sum, they could be the same. you know for a fact that if 2 points do not have the same sum, then they aren't ever going to be equivalent. Now, knowing that, if you sort on the sum of the two you would have that to use also. So the sums, would be ascending ala.. List A:[1][2][3][3][3][4][5][5][5][78][9000]... (Longer) List B:[1][1][3][4][4][4][5][5][6][78][9000]... (Shorter) What I'm saying is this. Sort both on the sum. Pick the shorter list to use for the base comparison. The following code won't work, but you should get the idea from that. This doesn't use any kind of fencepost. If it were, we'd save the index of the last matched in the list and use that as a starting point instead of beginning at the front of the longer list. foreach(Point a in shorter){ foreach(Point b in longer){ if(a.Sum != b.Sum) break; if(a == b) union.Add(a); break; } }

"Run for your life from any man who tells you that money is evil. That sentence is the leper's bell of an approaching looter." --Ayn Rand

DavidNohejl

Zilo(svk) wrote:

I've created my own Point2D struct because accessing fields directly without Properties is much faster when considering a lot of iterations.

Are you sure? I belive JIT compiler does optimize it away, so you are only doing extra work...

Zilo(svk) wrote:

I don't understand what's O(n2)

http://en.wikipedia.org/wiki/Big_O_notation[^] Oversimplification: for N points, it will do N*N "operations".

[My Blog]
"Visual studio desperately needs some performance improvements. It is sometimes almost as slow as eclipse." - Rüdiger Klaehn
"Real men use mspaint for writing code and notepad for designing graphics." - Anna-Jayne Metcalfe

Stevo Z

Your approach is right but, ... I dont know if you red my second post. Matching points is not a matter if they are the same, it is a matter of the distance between then. They may be the same, but they may not e.g. lets say const distance = 3; p1 [0,0] p2 [2,2] distance between p1 and p2 is: d = Math.Sqrt((p2.X-p1.X)^2 + (p2.Y-p1.Y)^2) = 2,3... (2,3 < 3) == true so p1 and p2 match.

zilo

Stevo Z

Run this code

using System.Drawing;
using System.Diagnostics;
using System;

namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
const int count = 100000000;
int x = 0;
Stopwatch watch = new Stopwatch();

        Console.WriteLine("Starting Point iteration");
        Point p = new Point(1, 1);
        Point2D p2d = new Point2D(1, 1);
        watch.Start();            
        for (int i = 0; i < count; i++)
        { 
            x = p.X + p.Y;
        }
        watch.Stop();
        Console.WriteLine("Point iteration took {0} mls\\n\\r", watch.ElapsedMilliseconds);
        watch.Reset();
        Console.WriteLine("Starting Point2D iteration");
        watch.Start();     
        for (int i = 0; i < count; i++)
        {               
            x = p2d.X + p2d.Y;
        }
        watch.Stop();
        Console.WriteLine("Point2D iteration took {0} mls", watch.ElapsedMilliseconds);

        Console.ReadKey();
    }

    struct Point2D
    {
        public int X, Y;

        public Point2D(int x, int y)
        {
            this.X = x;
            this.Y = y;
        }
    }
}

}

zilo

carbon_golem

Sorry for the confusion there. I presume that you're familiar with connected graphs. I'd pre-generate the graph and it's weights to start from. You'll be stuck with the math of making the distances for every node to every other node, but you won't be doing it more than once. But once you have a list of nodes and distances to other nodes you can simply sort out the distances to get the number of nodes that have distances less than what you're looking for. Now you're stuck with a math optimization problem, and there's lots on the net about squeezing performance out of a distance equation.

"Run for your life from any man who tells you that money is evil. That sentence is the leper's bell of an approaching looter." --Ayn Rand

Stevo Z

hmmm. this would mean to build a graph of all points(nodes) in graphics rectangle? I dnon't think I got you right, because this would mean to connect all points on screen, which is awful lot. But you gave me good idea. Maybe I could hash allready calculated paths and just get the result next time same path will be calculated.

zilo

carbon_golem

Awesome... Here's a link to a good article about using some .NET 3.5 stuff to cache results. I know you're using 2.0, but it's worth the read and it may give you more ideas. http://msdn2.microsoft.com/en-us/vcsharp/bb870976.aspx[^]

"Run for your life from any man who tells you that money is evil. That sentence is the leper's bell of an approaching looter." --Ayn Rand

DavidNohejl

Right... Debug, without optimalizations: Starting Point iteration Point iteration took 477 mls Starting Point2D iteration Point2D iteration took 263 mls Release, with optimalizations on: Starting Point iteration Point iteration took 45 mls Starting Point2D iteration Point2D iteration took 43 mls And same configuration, second run: Starting Point iteration Point iteration took 43 mls Starting Point2D iteration Point2D iteration took 43 mls

[My Blog]
"Visual studio desperately needs some performance improvements. It is sometimes almost as slow as eclipse." - Rüdiger Klaehn
"Real men use mspaint for writing code and notepad for designing graphics." - Anna-Jayne Metcalfe

Stevo Z

You're right, thank you.

zilo