Template Matching - Template-based Matching and Convolution

A basic method of template matching uses a convolution mask (template) tailored to the specific feature of the search image that we want to detect. This technique is easy to apply to grayscale images or edge images. The convolution output is highest where the image structure matches the mask structure, because there large image values are multiplied by large mask values.

This method is normally implemented by first picking out a part of the search image to use as a template. We will call the search image S(x, y), where (x, y) are the coordinates of each pixel in the search image, and the template T(x_t, y_t), where (x_t, y_t) are the coordinates of each pixel in the template. We then move the center (or the origin) of the template T(x_t, y_t) over each point (x, y) in the search image and calculate the sum of products between the coefficients of S(x, y) and T(x_t, y_t) over the whole area spanned by the template. Since every possible position of the template with respect to the search image is considered, the position with the highest score is taken as the best match. This method is sometimes referred to as 'Linear Spatial Filtering', and the template is called a filter mask.
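The sliding sum of products described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `correlation_scores` and `best_position` are hypothetical, images are lists of rows, the template's top-left corner serves as its origin, and only positions where the template fits entirely inside the search image are scored (the text does not say how borders are handled). The template is not flipped, so strictly speaking the operation is a cross-correlation rather than a convolution.

```python
def correlation_scores(search, template):
    """Return a grid of sum-of-products scores, one per template position.

    Assumption: only positions where the template lies fully inside the
    search image are evaluated, with the template's top-left corner as origin.
    """
    s_rows, s_cols = len(search), len(search[0])
    t_rows, t_cols = len(template), len(template[0])
    scores = []
    for r in range(s_rows - t_rows + 1):
        row_scores = []
        for c in range(s_cols - t_cols + 1):
            total = 0
            # Multiply overlapping pixels and accumulate the products.
            for dr in range(t_rows):
                for dc in range(t_cols):
                    total += search[r + dr][c + dc] * template[dr][dc]
            row_scores.append(total)
        scores.append(row_scores)
    return scores


def best_position(scores):
    """Return the (row, col) offset with the highest score."""
    positions = (
        (r, c) for r in range(len(scores)) for c in range(len(scores[0]))
    )
    return max(positions, key=lambda rc: scores[rc[0]][rc[1]])


search = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 4, 0],
    [0, 0, 0, 0],
]
template = [[1, 2], [3, 4]]

scores = correlation_scores(search, template)
print(best_position(scores))  # (1, 1)
```

In this toy example the highest score (30) occurs at offset (1, 1), where the template coincides with the matching patch of the search image.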

For example, one way to handle translation problems in images with template matching is to compare the intensities of the pixels using the SAD (sum of absolute differences) measure.

A pixel in the search image with coordinates (x_s, y_s) has intensity I_s(x_s, y_s), and a pixel in the template with coordinates (x_t, y_t) has intensity I_t(x_t, y_t). Thus the absolute difference in the pixel intensities is defined as Diff(x_s, y_s, x_t, y_t) = | I_s(x_s, y_s) - I_t(x_t, y_t) |.

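Looping through the pixels of the search image, translating the origin of the template to each pixel in turn and taking the SAD measure there, can be written as follows (with indices counted from zero):

    SAD(x, y) = \sum_{i=0}^{T_{rows}-1} \sum_{j=0}^{T_{cols}-1} Diff(x + i, y + j, i, j)

This quantity is evaluated for every offset (x, y) in the search image, with x running over its rows and y over its columns.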

S_rows and S_cols denote the number of rows and columns of the search image, and T_rows and T_cols denote the number of rows and columns of the template image, respectively. In this method the lowest SAD score gives the estimate of the best position of the template within the search image. The method is simple to implement and understand, but it is one of the slowest: every template pixel is compared at every candidate position, so the cost is on the order of S_rows × S_cols × T_rows × T_cols operations.
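The SAD search described above can be sketched as follows. This is an illustrative sketch only: the names `sad_scores` and `best_match` are hypothetical, images are plain lists of rows, and offsets where the template would extend past the border of the search image are skipped, which is an assumption the text does not spell out.

```python
def sad_scores(search, template):
    """Return SAD(x, y) for every offset where the template fits.

    x indexes rows and y indexes columns of the search image.
    """
    s_rows, s_cols = len(search), len(search[0])
    t_rows, t_cols = len(template), len(template[0])
    scores = []
    for x in range(s_rows - t_rows + 1):
        row_scores = []
        for y in range(s_cols - t_cols + 1):
            sad = 0
            # Sum of absolute intensity differences over the template area.
            for i in range(t_rows):
                for j in range(t_cols):
                    sad += abs(search[x + i][y + j] - template[i][j])
            row_scores.append(sad)
        scores.append(row_scores)
    return scores


def best_match(search, template):
    """Return (sad, x, y) for the offset with the lowest SAD score."""
    best = None
    for x, row_scores in enumerate(sad_scores(search, template)):
        for y, sad in enumerate(row_scores):
            if best is None or sad < best[0]:
                best = (sad, x, y)
    return best


search = [
    [9, 9, 9, 9],
    [9, 1, 2, 9],
    [9, 3, 4, 9],
    [9, 9, 9, 9],
]
template = [[1, 2], [3, 4]]

print(best_match(search, template))  # (0, 1, 1)
```

Here the lowest score is 0 at offset (1, 1), where the template matches the search image exactly. The four nested loops make the cost of the exhaustive search visible.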
