Computer vision is literally taking every industry by storm and is maturing very rapidly. After AlexNet in 2012, machine learning algorithms have gained success. New algorithms and models are surfacing each day. Competition held by DeepGlobe and SpaceNet give a continuous insight into Object Detection. With Machine Learning trends sky rocketing, the availability of pre-trained models has become excessive. Winners of competitions all around the globe upload their working files on GitHub . This makes training your models easier and time efficient. If you are restricted by time or machine specs, pre-trained models are your go to but having access just to models and data is not enough.
The question is: Do you know how to use them? Well, as if that wasn’t obvious! If you are a newbie, you might know what object detection is. You do know that it is “Detecting Objects” but there’s more to it. If that was a shocker to you, read on to find out what object detection is and how many forms does it exist in.
Object Recognition is a relatively generic term to make you geared up for object detection. Although both the terms might look exactly the same but that is not so. Object Recognition simply refers to identifying objects in a image or video in a digital format.
There are two ways an object can be recognized in an image, either through classification/segmentation or localization. The next paragraphs explain what these alien terms mean.
The most common and widely applied application of Computer Vision is Image Classification. Image Classification algorithms simply “look”at the image and answer “Is this so and so object”? To help you understand this better, here’s an example: Let’s just say that you trained your algorithm on a lot of pictures of dogs. Now when you feed a new image to your algorithm, it will tell yo whether the new image has a dog in it or not. Simple enough? This has its applications in medical imaging, facial recognition, building detection and other such things.
Now when you know what image segmentation is, it should be easier to take a guess on this. If you said, localization means to know where the object is located, then congratulations!! You have a knack for Machine Learning and you might even do better at it. Localization means to see where the object actually is. One thing to be clear of at this point is that localization does not mean to just locate one object. No. Localization means to locate every possible object in the image.
So your algorithm draw a bounding box around the object and tells you “Hey, there’s an object here”. At this point, the algorithm doesn’t know what sort of object it is. It just know there is an object in the image. If there are more than detectable objects in one image, it will draw more than one bounding box. Common sense?
Coming to the monster term now. If you have any confusion about image classification or localization, I would suggest to go back up on the article. Why? Because object detection is a combination of both image classification and localization. In simple words, Object Detection means to recognize that there is an object an image and then assigning it a label.
Label is the name of an object. Taking the example of dog. Now let’s say your algorithm recognizes that there is a dog and a ball in the image. That’s image classification. Next you do some improvements in your algorithm and it recognizes where exactly that dog and ball is. That’s localization. Next, you integrate both things. Now your algorithm can tell there’s a dog and a ball in he image AND exactly where it is. THAT’s detection. I hope that gives you an insight on Object Detection.
Now you know what classification, localization and detection is, be ready to know what segmentation is.
Segmentation is a more sophisticated form of detection. Segmentation is of two types; Semantic and Instance but we don’t need to worry about it. Segmentation means to detect an object in terms of pixels. Instead of a rough boundary of a bounding box, it highlights the pixels that contain that object. In case of our example, the segmentation result will contain a neatly drawn pixel-by-pixel boundary of the dog and tell you where it is.
After this insight on Object Detection, you are ready to go explore more on your own and learn new things. It is necessary that you are clear on these terms because these will help you choose the purpose of your task.