Hi everyone! This post is the first part of a series in which I will discuss a generic object detection algorithm using Python and some external libraries.
Digital image processing is one of the most important fields today. Facebook detects the faces of our friends automatically, Google lets us search using images, and Google Glass and Microsoft’s HoloLens have initiated a new era of augmented reality. Big data science is blooming, and one of its major strengths is processing non-trivial data, including images and videos. Facebook, Instagram and Twitter have millions of images posted every day; what if one could extract all the information from these images? Just think about it and you will understand why, in today’s age, every tech person should at least have a basic understanding of how images are actually interpreted, stored and matched. So here I will first give an abstract idea of basic image matching, and then, in a later post, some Python code to actually implement it.
One thing we know for sure is that images are stored in binary (1s and 0s), where the sequence of bits depends upon the format (JPG, PNG, etc.) and on whether it is an 8-bit or 16-bit image. But either way, an image is stored as a sequence of 1s and 0s. So if we had to compare two identical images, it would be a trivial task: just treat them as two arrays, apply any naive comparison, and there we have it, we can easily tell whether the two images match or not.
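For example, once the two images are decoded into pixel arrays, the exact comparison really is a one-liner (I am using NumPy here purely for illustration; any array comparison would do):

```python
import numpy as np

# Two decoded images as pixel arrays (tiny 8-bit grayscale examples).
img_a = np.array([[0, 255], [128, 64]], dtype=np.uint8)
img_b = img_a.copy()

# Exact comparison: same shape and every pixel identical.
print(np.array_equal(img_a, img_b))   # exact copies match

img_b[0, 0] = 1                       # change a single pixel
print(np.array_equal(img_a, img_b))   # no longer an exact match
```

This is why exact matching is trivial, and also why it is so fragile: a single changed pixel breaks it.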
But in reality it is extremely rare to be comparing two identical images. Most of the time you have to match two similar images.
That trivial technique will not work here, because these images are not identical, only similar.
So this is where digital image processing comes in. The task now is to find the similarity between the images, and if the similarity is greater than a certain threshold, we can say that the images are the same. The steps to do that are as follows:
1. Find unique features of each image.
2. Match the features of the first image to the second.
3. Calculate the number of features matched.
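The three steps above can be sketched as a skeleton, independent of any particular detector or matcher (the function names here are placeholders I made up, not a real library API):

```python
def images_match(img1, img2, find_features, match_features, min_matches=10):
    """Generic matching pipeline: detect features in each image,
    match them, and compare the match count against a threshold."""
    f1 = find_features(img1)            # step 1: unique features of each image
    f2 = find_features(img2)
    matches = match_features(f1, f2)    # step 2: match the first image's features to the second
    return len(matches) >= min_matches  # step 3: enough matches => similar
```

Any concrete detector and matcher can be plugged into this skeleton; the rest of the post fills in what those two pieces do.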
The word feature is the key here; if one understands what a feature is, the rest of the task is a piece of cake. A feature is a point in an image that can uniquely identify that image against another. For example, consider the region in the white rectangle: can any point residing in this rectangle be called a feature? No, because any point selected would have identically coloured points all around it, and there would be a lot of such points, so it would not be a unique point.
In comparison, if we select a point which has differently coloured points around it, such as any point in the yellow rectangle, with different intensities and a change in colour in a specific direction, there will be very few points that exactly match it. And if we consider enough information about the points around it, it will match no other point but itself, so it would be a good feature. Once you find such a point, you can store information like its colour intensity, the intensities of the points around it, the gradient angle and so on in any data structure, and we have a feature!
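As a rough sketch of that last step, here is one way to package such information into a data structure (a dictionary here; the exact fields and the patch radius are my own assumptions, real descriptors like SIFT's are far more elaborate):

```python
import numpy as np

def describe_point(img, y, x, radius=4):
    """Collect simple information about a candidate feature point:
    its intensity, the surrounding patch, and the local gradient angle."""
    patch = img[y - radius:y + radius + 1, x - radius:x + radius + 1].astype(float)
    gy, gx = np.gradient(patch)  # vertical and horizontal intensity gradients
    c = radius                   # index of the patch centre
    return {
        "intensity": float(img[y, x]),
        "patch": patch,                                   # neighbouring intensities
        "angle": float(np.arctan2(gy[c, c], gx[c, c])),   # gradient direction at the point
    }
```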
It is very common to consider points on edges as features, and by an edge I mean a region where colour intensity changes rapidly. Hence many feature detection algorithms are also called edge detection algorithms. There are a number of these algorithms available; some of them are Harris, Hessian, SIFT and FAST. All of these do the same task with slight variations, and their implementations can be found in any programming language. Now, by applying any of these algorithms, we can get something like this, where the red dots are the features found.
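To make the idea concrete, here is a bare-bones Harris-style corner response in plain NumPy (a sketch, not a production detector; the window size and the constant k = 0.04 are conventional choices, and real implementations use Gaussian weighting and non-maximum suppression on top of this):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def harris_response(img, window=5, k=0.04):
    """Harris corner response: high where intensity changes in
    two directions (a corner), low or negative on plain edges."""
    Iy, Ix = np.gradient(img.astype(float))   # intensity gradients
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy # products of gradients

    def window_sum(a):
        # sum each gradient product over a sliding window
        return sliding_window_view(a, (window, window)).sum(axis=(2, 3))

    Sxx, Syy, Sxy = window_sum(Ixx), window_sum(Iyy), window_sum(Ixy)
    det = Sxx * Syy - Sxy * Sxy
    trace = Sxx + Syy
    return det - k * trace ** 2
```

On a synthetic image of a white square, this response is strongly positive at the square's corners, negative along its straight edges, and zero in the flat regions, which is exactly the "unique point" property described above.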
Now, since all of these points are unique features, we can compare the features of the two images and count how many of them match, and if enough matches are found we can say that the images are similar.
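A minimal version of that matching step might look like this (assuming each feature has already been reduced to a fixed-length numeric descriptor, one row per feature; the distance threshold and the match-ratio limit are arbitrary illustrative values):

```python
import numpy as np

def count_matches(desc1, desc2, max_dist=1.0):
    """For each feature of the first image, find its nearest feature in the
    second (Euclidean distance between descriptors) and count how many
    nearest neighbours fall below a distance threshold."""
    # pairwise distances between every descriptor pair
    d = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    nearest = d.min(axis=1)
    return int((nearest <= max_dist).sum())

def images_similar(desc1, desc2, max_dist=1.0, ratio=0.5):
    """Declare the images similar if enough of the features match."""
    return count_matches(desc1, desc2, max_dist) / len(desc1) >= ratio
```

Real matchers (e.g. the ones shipped with OpenCV) add refinements such as cross-checking and ratio tests, but the core idea is exactly this: nearest-neighbour search over descriptors, then a count against a threshold.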
So that is the basic idea of how two images can be matched. I hope I was successful in explaining it; in case of any questions, do contact me!