State-of-the-art computer/machine vision system

BupeChombaDerrick

Hi, I'm a researcher in computer vision (an electronics engineer by profession) designing a system capable of outperforming the current state-of-the-art vision systems. OpenCV 2.2 did not impress me; vision by machines seems to lag behind the simplest animal you can think of (a cat, for example). I think computers are powerful enough to handle vision nearly as well as humans. Why are state-of-the-art vision systems so task-specific and not as robust as they should be? Any suggestions?

Lost User

Let's see your system and then we can judge.

Unrequited desire is character building. OriginalGriff

BupeChombaDerrick

Well, first I have to deal with patent issues, plus I'm writing a journal paper on it. I have written a proprietary vision library and will be ready to show my system to the world when all the legal issues are done and I finalise the design. These legal issues make innovation very difficult.

Lost User

BCDXBOX360 wrote:

I'm designing a system capable of outperforming the current state-of-the-art vision systems.

BCDXBOX360 wrote:

These legal issues make innovation very difficult.

You seem to have two conflicting statements here.

Bernhard Hiller

BCDXBOX360 wrote:

Why are state-of-the-art vision systems so task-specific and not as robust as they should be?

Looks like you haven't even started your prestigious project. Otherwise you would have seen that with "real" items there are no perfect matches to stored representations. Any animal is capable of recognizing a tree from the data its eyes send to its brain. Write software that will detect that tree in a bitmap. And then detect it in a picture of the same tree taken from a different place, and recognize that it's the same tree...

Lost User

I was tempted to say that but thought I would give OP the benefit of the doubt.

killabyte

Yup, image recognition is still half good DSP and half black magic.

BupeChombaDerrick

Well, I wanted other views from the CodeProject community on the limitations of computer/machine vision algorithms. I have been researching current developments in vision systems for 2 years now and have been iteratively refining my design over time, based on new and promising heuristics of vision.

BupeChombaDerrick

Richard MacCutchan wrote:

You seem to have two conflicting statements here.

Maybe I should have written that the legal process of getting patents and other rights to an invention discourages innovation but does not make it impossible.

BupeChombaDerrick

Bernhard Hiller wrote:

Write software that will detect that tree in a bitmap. And then detect it in a picture of the same tree taken from a different place, and recognize that it's the same tree...

It has been found that neurons called view-tuned units exist in animal/human brains. Each encodes only one view of a given object (in this case a tree), and these feed into a view-invariant unit. The principal design criterion for my vision system is based on that same principle, but the secret is to encode those views with an algorithm that is efficient in both time and space (memory), similar to the one in S. Hinterstoisser, V. Lepetit, S. Ilic, P. Fua, and N. Navab, “Dominant orientation templates for real-time detection of texture-less objects,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010. They used different views of the same object encoded in a very compact and efficient way. Their method is aimed at texture-less objects, but it is efficient even for a very large database of objects.

mikemar

This is in response to all your posts. I gather that you haven't started, or are in the early stages of, your project. I'm also working on some vision recognition stuff, but I'm pretty far along, and I can tell you you'll find a lot more complications than you realize as you go. That's the reason why a lot of systems are domain specific: it allows them to take advantage of certain known facts and "cheat", so to speak, since no one has created a general-purpose system yet. In addition to the difficulty that one poster already mentioned, here are just a few of the other things you need to consider:

1) Defining the edges of objects: Most objects in the real world will have areas where the edges are blurred rather than sharp color changes. Look up Canny edge detection and it will explain some of this stuff.

2) Recognizing that 2 areas are part of the same object: Consider a cat with black and white patches. How is a vision system supposed to know that 2 areas with radically different colors are part of the same object?

3) Depth perception: If you use 2 cameras, similar to our 2 eyes, you can match 2 objects and then compare the parallax shift. However, this only works at certain distances. Our brains probably only use this at short distances; several other methods are used at long distances, where the parallax shift isn't large enough to judge.

Also, why are you worried about patents at this stage? I doubt you are going to get sued for simply experimenting with something. If your system does end up working and you want to commercialize it, then buy or license the rights from the existing patent holders that are in your way. In addition, you may find your idea changes a lot as you work on it and run into difficulties; it did with me.
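On the depth perception point (3) above, here is a minimal sketch of why parallax stops being useful at long range. It assumes the textbook pinhole stereo model, where depth is focal length times baseline divided by disparity. The focal length of 800 pixels and the 6.5 cm baseline (roughly human eye spacing) are illustrative assumptions, not measurements from any system discussed in this thread.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Pinhole stereo model: depth Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def disparity_from_depth(focal_px, baseline_m, depth_m):
    """Inverse of the model above: d = f * B / Z."""
    return focal_px * baseline_m / depth_m


def depth_error_for_one_pixel(focal_px, baseline_m, depth_m):
    """Depth overestimate caused by measuring the disparity one pixel too small."""
    d = disparity_from_depth(focal_px, baseline_m, depth_m)
    if d <= 1.0:
        # The parallax shift is below one pixel, so depth cannot be resolved.
        return float("inf")
    return depth_from_disparity(focal_px, baseline_m, d - 1.0) - depth_m


if __name__ == "__main__":
    FOCAL_PX = 800.0    # assumed focal length in pixels
    BASELINE_M = 0.065  # assumed camera spacing in metres
    for z in (0.5, 2.0, 10.0, 50.0):
        d = disparity_from_depth(FOCAL_PX, BASELINE_M, z)
        err = depth_error_for_one_pixel(FOCAL_PX, BASELINE_M, z)
        print(f"depth {z:5.1f} m  disparity {d:7.2f} px  one-pixel error {err:.3f} m")
```

Under these assumptions the disparity shrinks in proportion to 1/depth, so a single pixel of matching error costs millimetres at arm's length but far more than the true distance at tens of metres.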

BupeChombaDerrick

mikemarquard wrote:

Defining the edges of objects: Most objects in the real world will have areas where the edges are blurred rather than sharp color changes. Look up Canny edge detection and it will explain some of this stuff.

Recognizing that 2 areas are part of the same object: Consider a cat with black and white patches. How is a vision system supposed to know that 2 areas with radically different colors are part of the same object?

Depth perception: If you use 2 cameras, similar to our 2 eyes, you can match 2 objects and then compare the parallax shift. However, this only works at certain distances. Our brains probably only use this at short distances; several other methods are used at long distances, where the parallax shift isn't large enough to judge.

I would agree that my ideas will change in time, because they already have, but for the better.

1) At first I started off trying edge detection methods, but later on realised that edge detection is not necessary. Descriptors such as SIFT, SURF, DOT, HOG and many more use orientation and not contours. This is supported by biological vision in simple and complex cells, and my system follows this trend. Orientation is not affected by blurring and is thus more robust and descriptive.

2) My system uses local image patches and a part-based recognition infrastructure without segmentation. Since segmentation is a by-product of recognition, the vision system is not supposed to segment out scenes or potential objects before recognizing them.

3) My system is not currently designed to use stereo cameras. It uses a single camera and does not need depth or a captured 3D representation to aid recognition.

My project has actually evolved. I'm using my own vision library to implement the system, and I have figured out how to encode image data in an efficient and robust manner for building a generic object recognition system. How do I know that it will work? Well, I have been progressively testing simple building blocks of the system, and now I'm certain that it will work when the whole system is put together. I am optimizing my vision library for the final implementation, with probably months remaining before completion.
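On the claim above that orientation is not affected by blurring, here is a toy sketch of the idea, not the proprietary library mentioned in this post. It assumes a synthetic image with one straight diagonal step edge, a 3x3 box blur, and central-difference gradients. All function names are made up for illustration. In this simple case the gradient magnitude drops after blurring while the gradient orientation stays the same. Real images with several nearby structures will not behave this cleanly.

```python
import math


def make_diagonal_edge(size=16):
    """Synthetic image: dark above the anti-diagonal, bright below it."""
    return [[255.0 if x + y > size - 1 else 0.0 for x in range(size)]
            for y in range(size)]


def box_blur(img):
    """3x3 mean filter; border pixels reuse the nearest valid neighbour."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


def gradient_at(img, x, y):
    """Central-difference gradient: returns (magnitude, orientation in degrees)."""
    gx = img[y][x + 1] - img[y][x - 1]
    gy = img[y + 1][x] - img[y - 1][x]
    return math.hypot(gx, gy), math.degrees(math.atan2(gy, gx))


if __name__ == "__main__":
    sharp = make_diagonal_edge()
    blurred = box_blur(sharp)
    for name, img in (("sharp", sharp), ("blurred", blurred)):
        mag, ang = gradient_at(img, 8, 8)
        print(f"{name:8s} magnitude {mag:7.2f}  orientation {ang:6.2f} deg")
```

An edge detector that thresholds the magnitude would respond more weakly to the blurred edge, whereas a descriptor that bins the orientation would still vote for the same direction.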

mikemar

It sounds like your ideas and my ideas are a lot different. Actually, my ideas are a lot different from any of the ideas I've read about, and my image segmentation is technically not an edge detection algorithm either. I wish you the best of luck, and if you have some big successes I'd love to hear about them.

BCDXBOX360 wrote:

My system is not currently designed to use stereo cameras. It uses a single camera and does not need depth or a captured 3D representation to aid recognition.

You might have more limited aims than I do, but I have to question this one. If you're trying to build something that is capable of doing what a human or animal can do, I don't see how this can work, because clearly humans and animals see in 3D. Also, if you choose this path, keep in mind that objects look radically different from different views. Without some sort of 3D perception it is going to be difficult to get the system to recognize multiple views as being part of the same object.

Lost User

I rather meant that, having claimed that you were going to create a state of the art system that would beat anything currently available, you are now saying that you cannot do it because of the difficulty of getting a patent. That sounds like an excuse not a reason.

YvesDaoust

My answer will be simple: artificial intelligence is still nowhere.

BupeChombaDerrick

mikemarquard wrote:

Objects look radically different from different views. Without some sort of 3D perception it is going to be difficult to get the system to recognize multiple views as being part of the same object.

That's why my system uses a multi-view representation, as I explained earlier. During the learning phase, multiple views of the same object are learned and efficiently encoded for fast retrieval. This is supported by biological vision: neurons called view-tuned units respond only to a single view of a given 3D object, but a collection of them gives view-invariant behaviour. My system also implements a knowledge transfer technique for one-shot learning (this reduces training sets as the system learns more and more things, just like humans!). Animals and humans can see effectively with a single eye, proving that depth adds very little information (maybe little enough to be ignored for now). We see in what I call false 3D: it's only experience with this world that enables the brain to encode multiple views of various objects and tricks us into thinking we see in 3D. The truth of the matter is that we see in a 2D representation, especially for recognition purposes. I think depth is used to tell more accurately how far the recognized object is from your eyes, but this information is not used in the actual recognition of the object.
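The multi-view idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the system described in this post: each stored view is a plain feature vector, a "view-tuned unit" is modelled as the cosine similarity between the input and one stored view, and the "view-invariant unit" is the maximum over an object's views. The class name `MultiViewMemory` and the 4-number feature vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class MultiViewMemory:
    """Stores several views per object and pools over them at recognition time."""

    def __init__(self):
        self.views = {}  # label -> list of stored view feature vectors

    def learn(self, label, feature):
        """Add one view of an object (one 'view-tuned unit')."""
        self.views.setdefault(label, []).append(list(feature))

    def view_responses(self, label, feature):
        """Response of every stored view of `label` to the input feature."""
        return [cosine(v, feature) for v in self.views[label]]

    def recognize(self, feature):
        """View-invariant response: best-matching view per object, then best object."""
        best_label, best_score = None, -1.0
        for label in self.views:
            score = max(self.view_responses(label, feature))
            if score > best_score:
                best_label, best_score = label, score
        return best_label, best_score


if __name__ == "__main__":
    memory = MultiViewMemory()
    memory.learn("tree", [1.0, 0.0, 0.2, 0.0])  # made-up "front" view
    memory.learn("tree", [0.0, 1.0, 0.2, 0.0])  # made-up "side" view
    memory.learn("cat", [0.0, 0.0, 1.0, 0.1])
    memory.learn("cat", [0.1, 0.0, 0.0, 1.0])
    print(memory.recognize([0.1, 0.9, 0.3, 0.0]))
```

The linear scan over every stored view is the part that would need the compact, fast encoding the post talks about. The sketch makes no attempt at that.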

BupeChombaDerrick

YvesDaoust wrote:

My answer will be simple: artificial intelligence is still nowhere.

I can't fully agree with you. Look at chess: can you beat a computer at its highest skill level? I don't think so! The problem with today's machines is that they lack perceptual skills, or what is called sensory perception. They are given buttons for people to push rather than a complex sensing device. And the belief among humans that there is no such thing as artificial intelligence discourages researchers. Remember, people in the old days thought that people would never fly, but we now have very heavy man-made machines called planes that can fly. All we need is a breakthrough, especially in machine perception, to make everybody's jaw drop.

BupeChombaDerrick

Richard MacCutchan wrote:

I rather meant that, having claimed that you were going to create a state of the art system that would beat anything currently available, you are now saying that you cannot do it because of the difficulty of getting a patent. That sounds like an excuse not a reason.

Okay, let's forget about the patent issues for now. I was initially worried about ideas being stolen, but right now, as I write this, I'm sitting in front of a laptop with a vision library (designed and coded by me) capable of outperforming the current state-of-the-art vision systems. (I'm just optimizing the library and adding some final touches.)

Lost User

BCDXBOX360 wrote:

I'm sitting in front of a laptop with a vision library (designed and coded by me) capable of outperforming the current state-of-the-art vision systems.

In that case I'll go back to my first comment: "Let us see it in action and then we can judge."

mikemar

BCDXBOX360 wrote:

My system also implements a knowledge transfer technique for one-shot learning (this reduces training sets as the system learns more and more things, just like humans!). Animals and humans can see effectively with a single eye, proving that depth adds very little information (maybe little enough to be ignored for now).

Actually, it's been found that if a person is born blind in one eye, they never develop proper depth perception. The reason you and I can see in 3D if we cover an eye is that as children we learned other cues for judging depth. However, we needed 2 eyes to learn these cues, because without them we have very little information to accurately gauge where an object is, and thus to know how the other cues correspond to a particular location. I would say this: I don't know of any animal that has only 1 eye, so I think depth perception must be important, and parallax shift (I think the proper term is stereopsis) is important. PS: If you were the person who downvoted me, I'm not trying to be critical or discourage you. I just enjoy debating these topics with people of similar interests and hearing their opinions.