CN112801141A - Heterogeneous image matching method based on template matching and twin neural network optimization - Google Patents

Heterogeneous image matching method based on template matching and twin neural network optimization

Info

Publication number
CN112801141A
CN112801141A (application CN202110022206.4A)
Authority
CN
China
Prior art keywords
image
matching
point
neural network
pixel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110022206.4A
Other languages
Chinese (zh)
Other versions
CN112801141B (en)
Inventor
赵岩
林建宇
李灵珊
王世刚
***
Current Assignee
Jilin University
Original Assignee
Jilin University
Priority date
Filing date
Publication date
Application filed by Jilin University
Priority to CN202110022206.4A
Publication of CN112801141A
Application granted
Publication of CN112801141B
Active legal status
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V 10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4007 Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/20 Image enhancement or restoration using local operators
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10032 Satellite or aerial image; Remote sensing
    • G06T 2207/10044 Radar image

Abstract

A heterogeneous image matching method based on template matching and twin neural network optimization belongs to the field of computer image processing. It addresses the difficulty of matching heterogeneous images and the thick contour edges and low positioning accuracy produced when the Sobel operator is used for feature extraction. The method comprises the following steps: preprocessing the SAR image; image half-pixel processing; sketch-image conversion based on the Sobel operator; template matching; and fine matching with a twin neural network. The method is suitable for matching, on the basis of template matching, two images of different sizes acquired by different sensors over the same area scene. During template matching, the heterogeneous images undergo sketch-image conversion based on the Sobel operator, which enhances the matching; an improved twin neural network is subsequently applied for fine matching, so the heterogeneous images are matched better and with higher precision.

Description

Heterogeneous image matching method based on template matching and twin neural network optimization
Technical Field
The invention belongs to the technical field of computer image processing, and particularly relates to a heterogeneous image matching method based on template matching and twin neural network optimization.
Background
Image matching is a very important task in the fields of computer vision and image processing; it is mainly used to match two or more images acquired at different times, by different sensors, from different viewing angles or under different shooting conditions. Image matching is the basis of the processing and application of many kinds of images, and the matching result directly affects all subsequent processing and application work.
Heterogeneous image matching analyzes the similarity and consistency of the correspondences in content, features, structure, relations, texture and gray level between images acquired by different sensors, in order to find the same image target. Heterogeneous image matching methods fall roughly into two categories: region-based methods and feature-based methods. Region-based methods measure similarity directly from the gray information of the image, or from some transformation of it. Pixels are grouped by gray level, and a correspondence often exists between the groups; exploiting this correspondence requires neither a gray-mapping constraint nor feature extraction when matching heterogeneous images. Taking a template as the unit, such a method computes, at every position of the image, the similarity between the current window and the template according to some similarity measurement criterion, so the focus of the method is the design of that criterion.
The gray levels of corresponding pixel points differ greatly between heterogeneous images, but since the images cover the same area, the edge information of the target region can still be clearly distinguished after edge detection. Commonly used edge detectors include the Canny operator, the Prewitt operator and the Sobel operator. The Sobel operator is a discrete difference operator that approximates the gradient of the image brightness function; applied at any point of the image, it yields the corresponding gradient vector or its normal vector. There are two Sobel kernels, one for detecting horizontal edges and one for detecting vertical edges. Compared with the Prewitt operator, the Sobel operator weights pixels by position, which reduces edge blurring and gives a better result. However, the Sobel operator detects edges from the weighted gray differences of the upper, lower, left and right neighbors of each pixel point, which reach an extremum at an edge; although it smooths noise and provides fairly accurate edge direction information, its edge positioning accuracy is insufficient. Moreover, the edges it extracts are coarse, so the edge image is not the expected result and the error range in matching is large.
Disclosure of Invention
In view of the difficulty of heterogeneous image matching and the thick contour edges and low positioning precision of feature extraction with the Sobel operator, the invention provides a heterogeneous image matching method based on template matching and twin neural network optimization. The method is mainly suitable for matching, on the basis of template matching, two images of different sizes acquired by different sensors over the same area scene. During template matching, the heterogeneous images undergo sketch-image conversion based on the Sobel operator, which enhances the matching; an improved twin neural network is subsequently applied for fine matching, so the heterogeneous images are matched better and with higher precision.
The technical scheme adopted by the invention for solving the technical problem is as follows:
the invention discloses a heterogeneous image matching method based on template matching and twin neural network optimization, which comprises the following steps:
step one, preprocessing the SAR image;
step two, image half-pixel processing;
step three, converting the sketch image based on a Sobel operator;
step four, template matching;
and step five, fine matching with a twin neural network.
Further, the specific operation steps of the first step are as follows:
and carrying out median filtering processing on the SAR image to be matched so as to remove light spots and speckle noise of the SAR image to obtain a de-noised image.
Further, the formula of the median filtering is as follows:
y(i)=Med[x(i-N),...,x(i),...,x(i+N)]
in the formula, a window of odd length L = 2N + 1 is defined, N being a positive integer; at a given instant the signal samples in the window are x(i-N), …, x(i), …, x(i+N), where x(i) is the sample at the center of the window, y(i) is the output of the median filter, and Med denotes taking the median.
Further, the specific operation steps of the second step are as follows:
and respectively carrying out half-pixel processing on the optical image and the preprocessed SAR image by adopting a bilinear interpolation method, and converting the size of the image into four times of the original image.
Further, the bilinear interpolation method has the following formula:
f(R1) = ((x2 - m)/(x2 - x1))·f(A1) + ((m - x1)/(x2 - x1))·f(A2)

f(R2) = ((x2 - m)/(x2 - x1))·f(A3) + ((m - x1)/(x2 - x1))·f(A4)

f(P) = ((y2 - n)/(y2 - y1))·f(R1) + ((n - y1)/(y2 - y1))·f(R2)
in the formula, f(R1) and f(R2) denote the pixel values of points R1 and R2; f(A1), f(A2), f(A3) and f(A4) denote the pixel values of points A1, A2, A3 and A4; f(P) denotes the pixel value of the unknown point P. x1 is the abscissa of the known points A1 and A3, x2 the abscissa of A2 and A4, y1 the ordinate of A1 and A2, and y2 the ordinate of A3 and A4; m and n are the abscissa and ordinate of the unknown point P. R1 is the point with the same ordinate as A1 and the same abscissa as P; R2 is the point with the same ordinate as A3 and the same abscissa as P. A1, A2, A3 and A4 are the known pixel points at the upper-left, upper-right, lower-left and lower-right corners of the image, respectively;
with A1, A2, A3 and A4 known, the value of the unknown point P within the region they enclose is obtained.
Further, the specific operation steps of the third step are as follows:
s3.1, utilizing Sobel operator to extract edge characteristics
A Sobel edge filter is used to extract the edge features of the optical image and the SAR image: the convolution kernels of the Sobel operator are a horizontal detection convolution kernel and a vertical detection convolution kernel, applied separately to the input image to generate independent measurements of the gradient component in each direction. Let A denote the original image and Gx and Gy the images of gray values detected by the horizontal and vertical edge kernels respectively; the calculation formula is as follows:
Gx = [[-1, 0, +1], [-2, 0, +2], [-1, 0, +1]] * A

Gy = [[-1, -2, -1], [0, 0, 0], [+1, +2, +1]] * A

(where * denotes two-dimensional convolution with the image A)
g is to bexAnd GyCombining together to find the magnitude of the absolute value of the gradient G and the direction of the gradient G at each point; comparing the absolute value of the gradient G at a certain point with the middle value of the gray value of the whole image, and if the gradient G at the point is greater than the middle value of the gray value of the whole image, considering the point as an edge point;
s3.2 non-maximum suppression
The image obtained in S3.1 is processed with non-maximum suppression, and the edge-thinned image is smoothed to obtain the corresponding sketch image.
Further, the specific operation steps of the step four are as follows:
and carrying out template matching according to the obtained sketch image, matching the region of the SAR sketch image in the optical sketch image, framing the SAR sketch image by using a rectangular frame, outputting an upper left-corner pixel point of the rectangular frame in the optical sketch image, and outputting nine adjacent pixel points taking the upper left-corner pixel point as a center.
Furthermore, the template matching adopts a normalized correlation coefficient matching algorithm, and the calculation formula is as follows:
R(x, y) = Σx',y'[T'(x', y')·I'(x + x', y + y')] / sqrt(Σx',y'T'(x', y')² · Σx',y'I'(x + x', y + y')²)
in the formula, R (x, y) represents a matching result, T 'represents a template image, I' represents an image to be matched, x represents an abscissa of the upper left corner of the window dividing block in the image to be matched, y represents an ordinate of the upper left corner of the window dividing block in the image to be matched, x 'represents an abscissa of each pixel point in the window dividing block, and y' represents an ordinate of each pixel point in the window dividing block.
Further, the specific operation steps of the step five are as follows:
and inputting the registered optical image and SAR image in the training set as the training set into two input ends of a twin neural network for training, outputting the trained model, calling the model to perform image matching on nine pixel points which are output by template matching and take the upper left corner pixel point as the center, outputting the image with the highest degree of acquaintance as the final SAR image in the area of the optical image, and outputting the pixel point coordinate corresponding to the image with the highest degree of acquaintance as the final result.
Furthermore, the twin neural network comprises two input ends, two neural networks and a Loss module; respectively inputting the two input images into two neural networks, and respectively mapping the input images to new spaces by the two neural networks to form the representation of the input images in the new spaces; and finally, evaluating the similarity of the two input images through Loss calculation.
The invention has the beneficial effects that:
the invention starts from edge detection and processes the heterogeneous images so as to facilitate subsequent matching work. According to the method, the medium value filtering processing is carried out on the SAR image to be matched, so that light spots and speckle noise of the SAR image can be removed, and a denoised image is obtained; respectively carrying out half-pixel processing on the optical image and the SAR image after median filtering processing by adopting a bilinear interpolation method, and then carrying out sketch image conversion based on a Sobel operator, wherein in the step, the image after edge thinning is obtained by further processing by using non-maximum value inhibition after the Sobel operator so as to solve the problem that the edge image extracted by the conventional Sobel operator is thicker; performing template matching on the processed sketch image to obtain an approximate matching point coordinate; and respectively matching nine pixel points with the obtained coordinate point as the center by adopting a twin neural network, selecting an optimal point for outputting, and obtaining a final matching coordinate so as to solve the problem of insufficient edge positioning precision. According to the method, edge detection and template matching are carried out on different source images, the problem of difficulty in matching caused by different heterogeneous image imaging principles in the existing matching method is solved, meanwhile, the error of a matching result is reduced by adopting template matching, and matching comparison is carried out in the neighborhood of the matching result by adopting a twin neural network subsequently, so that the matching accuracy is further improved.
Drawings
FIG. 1 is a flow chart of the heterogeneous image matching method based on template matching and twin neural network optimization according to the present invention.
FIG. 2 is a schematic structural diagram of a twin neural network.
FIG. 3 is a comparison of the two methods in a harbor scenario. a represents the matching result of the SIFT matching method of the existing original brightness image, b represents the SAR image to be matched, and c represents the matching result of the invention.
Fig. 4 is a comparison diagram of two methods in an urban scene. a represents the matching result of the SIFT matching method of the existing original brightness image, b represents the SAR image to be matched, and c represents the matching result of the invention.
Fig. 5 is a comparison graph of two methods in a river scene. a represents the matching result of the SIFT matching method of the existing original brightness image, b represents the SAR image to be matched, and c represents the matching result of the invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
The core of the invention is that: for the heterogeneous images to be matched, in the process of template matching, the sketch image conversion based on the Sobel operator is adopted, so that the matching difficulty and complexity are reduced, the sketch image more suitable for template matching is formed, and then the precise matching is further performed through the twin neural network, so that the matching accuracy is improved.
As shown in FIG. 1, the heterogeneous image matching method based on template matching and twin neural network optimization of the present invention mainly comprises the following steps:
step one, SAR image preprocessing
When sketch images are matched, the contour information in the images allows more accurate matching. Since the invention matches heterogeneous images, namely an optical image and an SAR image, and the imaging principle of the SAR image burdens it with coherent speckle noise, the SAR image is preprocessed first. The Sobel operator used later is a discrete difference operator that detects image edges from the weighted gray-value differences of the four neighbors (above, below, left and right) of each pixel, which reach an extremum at an edge. The SAR image is preprocessed as follows:
carrying out median filtering processing on the SAR image, wherein the formula of the median filtering is as follows:
y(i)=Med[x(i-N),...,x(i),...,x(i+N)]
in the formula, a window of odd length L = 2N + 1 is defined, N being a positive integer; at a given instant the signal samples in the window are x(i-N), …, x(i), …, x(i+N), where x(i) is the sample at the center of the window, y(i) is the output of the median filter, and Med denotes taking the median.
Median filtering of the SAR image to be matched removes its light spots and speckle noise, giving the denoised image.
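As an illustrative sketch of step one (the patent fixes only the one-dimensional form y(i) = Med[x(i-N), …, x(i+N)]; the 3×3 square window and edge padding below are assumptions), the median filtering can be written as:

```python
import numpy as np

def median_filter(img, k=3):
    """Sliding-window median filter with an odd window size k.

    The patent gives the 1-D form y(i) = Med[x(i-N), ..., x(i+N)] with
    window length L = 2N + 1; the square 2-D window used here is an
    assumption for illustration.
    """
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")   # replicate the borders
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.median(padded[i:i + k, j:j + k])
    return out
```

A single bright speckle surrounded by darker pixels is replaced by the neighborhood median, which is how the speckle noise of the SAR image is suppressed.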
Step two, image half-pixel processing
Half-pixel processing is applied, by bilinear interpolation, to the optical image and to the median-filtered SAR image respectively, enlarging each image to four times its original size (double in each dimension). The formula of the bilinear interpolation is as follows:
f(R1) = ((x2 - m)/(x2 - x1))·f(A1) + ((m - x1)/(x2 - x1))·f(A2)

f(R2) = ((x2 - m)/(x2 - x1))·f(A3) + ((m - x1)/(x2 - x1))·f(A4)

f(P) = ((y2 - n)/(y2 - y1))·f(R1) + ((n - y1)/(y2 - y1))·f(R2)
in the formula, f(R1) and f(R2) denote the pixel values of points R1 and R2; f(A1), f(A2), f(A3) and f(A4) denote the pixel values of points A1, A2, A3 and A4; f(P) denotes the pixel value of the unknown point P. x1 is the abscissa of the known points A1 and A3, x2 the abscissa of A2 and A4, y1 the ordinate of A1 and A2, and y2 the ordinate of A3 and A4; m and n are the abscissa and ordinate of the unknown point P. R1 is the point with the same ordinate as A1 and the same abscissa as P; R2 is the point with the same ordinate as A3 and the same abscissa as P. A1, A2, A3 and A4 are the known pixel points at the upper-left, upper-right, lower-left and lower-right corners of the image, respectively.
With A1, A2, A3 and A4 known, the value of the unknown point P within the region they enclose is obtained.
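The half-pixel processing of step two can be sketched as follows; `bilinear_sample` and `upsample_2x` are illustrative names, and neighboring pixels are taken one unit apart so that x2 - x1 = y2 - y1 = 1:

```python
import numpy as np

def bilinear_sample(img, n, m):
    """Pixel value at fractional position (row n, column m) via the
    f(R1), f(R2), f(P) formulas; A1..A4 are the four surrounding pixels."""
    h, w = img.shape
    y1, x1 = min(int(n), h - 2), min(int(m), w - 2)   # upper-left point A1
    y2, x2 = y1 + 1, x1 + 1
    fA1, fA2 = img[y1, x1], img[y1, x2]
    fA3, fA4 = img[y2, x1], img[y2, x2]
    fR1 = (x2 - m) * fA1 + (m - x1) * fA2   # horizontal pass, top row
    fR2 = (x2 - m) * fA3 + (m - x1) * fA4   # horizontal pass, bottom row
    return (y2 - n) * fR1 + (n - y1) * fR2  # vertical pass between R1, R2

def upsample_2x(img):
    """Half-pixel processing: double each dimension, i.e. four times the
    number of pixels, as required in step two."""
    h, w = img.shape
    out = np.empty((2 * h, 2 * w))
    for i in range(2 * h):
        for j in range(2 * w):
            out[i, j] = bilinear_sample(img, min(i / 2, h - 1), min(j / 2, w - 1))
    return out
```

At the half-pixel positions the interpolated value is the weighted mean of the four known corner pixels, exactly as the three formulas above prescribe.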
Step three, sketch image conversion based on Sobel operator
S3.1, utilizing Sobel operator to extract edge characteristics
Compared with other operators such as the Canny operator, the Sobel operator gives a good feature-extraction effect and accurate edge localization when extracting the edge contours of real SAR images, and better handles the gradually changing gray levels and heavy noise of SAR images; the Sobel edge filter is therefore chosen to extract the edge features of the optical image and the SAR image.
The convolution kernels employed by the Sobel operator can be divided into a horizontal detection convolution kernel and a vertical detection convolution kernel, each applied to the input image separately to generate a separate measurement of the gradient component in that direction. If A denotes the original image and Gx and Gy denote the images of gray values detected by the horizontal and vertical edge kernels respectively, the calculation formula is as follows:
Gx = [[-1, 0, +1], [-2, 0, +2], [-1, 0, +1]] * A

Gy = [[-1, -2, -1], [0, 0, 0], [+1, +2, +1]] * A

(where * denotes two-dimensional convolution with the image A)
Gx and Gy are combined to obtain, at each point, the absolute value (magnitude) of the gradient G and its direction. The absolute value of the gradient G at a point is compared with the median gray value of the whole image; if it is greater, the point is regarded as an edge point.
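A minimal sketch of S3.1, combining the Sobel gradients with the whole-image threshold test. Using |Gx| + |Gy| as the gradient magnitude and `np.median` for the "middle value of the gray value of the whole image" are assumptions made for illustration:

```python
import numpy as np

KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])   # horizontal detection kernel
KY = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])   # vertical detection kernel

def sobel_edges(img):
    """Boolean edge map: gradient magnitude above the image's median gray."""
    img = img.astype(float)
    h, w = img.shape
    g = np.zeros((h, w))
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            win = img[i - 1:i + 2, j - 1:j + 2]
            gx = np.sum(KX * win)           # horizontal gradient component
            gy = np.sum(KY * win)           # vertical gradient component
            g[i, j] = abs(gx) + abs(gy)     # |G| ~ |Gx| + |Gy| (assumption)
    # a point is an edge point if |G| exceeds the median gray value
    return g > np.median(img)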
S3.2 non-maximum suppression
Because the Sobel operator produces thick contour edges when extracting edge features, the invention applies non-maximum suppression to the image obtained in S3.1; after edge thinning, the image is smoothed to obtain the sketch image.
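The patent does not detail the suppression procedure; the sketch below assumes the common Canny-style scheme, comparing each pixel's gradient magnitude against its two neighbors along the gradient direction:

```python
import numpy as np

def non_max_suppress(mag, gx, gy):
    """Keep a pixel only if its gradient magnitude is not smaller than its
    two neighbors along the gradient direction; this thins the thick edges
    produced by the Sobel operator (Canny-style NMS, an assumption here)."""
    h, w = mag.shape
    out = np.zeros_like(mag)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            ang = np.degrees(np.arctan2(gy[i, j], gx[i, j])) % 180.0
            if ang < 22.5 or ang >= 157.5:        # gradient ~horizontal
                n1, n2 = mag[i, j - 1], mag[i, j + 1]
            elif ang < 67.5:                      # gradient ~45 degrees
                n1, n2 = mag[i - 1, j + 1], mag[i + 1, j - 1]
            elif ang < 112.5:                     # gradient ~vertical
                n1, n2 = mag[i - 1, j], mag[i + 1, j]
            else:                                 # gradient ~135 degrees
                n1, n2 = mag[i - 1, j - 1], mag[i + 1, j + 1]
            if mag[i, j] >= n1 and mag[i, j] >= n2:
                out[i, j] = mag[i, j]
    return out
```

A three-pixel-wide vertical response ridge collapses to its single strongest column, which is exactly the edge-thinning effect sought in S3.2.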
Step four, template matching
And carrying out template matching according to the obtained sketch image, matching the region of the SAR sketch image in the optical sketch image, framing the SAR sketch image by using a rectangular frame, and outputting the upper left corner pixel point of the rectangular frame in the optical sketch image as a matching result and a precision index.
The applied Template Matching algorithm is one of the methods for finding a specific target in an image. Its principle is simple: traverse every possible position in the image, compute the similarity to the template at each one, and output the region with the highest similarity as the target region. Comparison shows that, among the template matching variants, the normalized correlation coefficient matching algorithm (CV_TM_CCOEFF_NORMED) has the highest matching precision; its calculation formula is as follows:
R(x, y) = Σx',y'[T'(x', y')·I'(x + x', y + y')] / sqrt(Σx',y'T'(x', y')² · Σx',y'I'(x + x', y + y')²)
in the formula, R (x, y) represents a matching result, T 'represents a template image, I' represents an image to be matched, x represents an abscissa of the upper left corner of the window dividing block in the image to be matched, y represents an ordinate of the upper left corner of the window dividing block in the image to be matched, x 'represents an abscissa of each pixel point in the window dividing block, and y' represents an ordinate of each pixel point in the window dividing block.
After template matching, the nine adjacent pixel points centered on the upper-left-corner pixel point are output.
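A brute-force sketch of the normalized correlation coefficient matching; interpreting T' and I' as the mean-subtracted template and window follows OpenCV's TM_CCOEFF_NORMED convention and is an assumption here:

```python
import numpy as np

def match_template(img, tmpl):
    """Slide the template over the image and return the map of
    normalized correlation coefficients R(x, y)."""
    img, tmpl = img.astype(float), tmpl.astype(float)
    th, tw = tmpl.shape
    H, W = img.shape
    t = tmpl - tmpl.mean()                     # T': mean-subtracted template
    out = np.zeros((H - th + 1, W - tw + 1))
    for y in range(H - th + 1):
        for x in range(W - tw + 1):
            win = img[y:y + th, x:x + tw]
            w0 = win - win.mean()              # I': mean-subtracted window
            denom = np.sqrt((t ** 2).sum() * (w0 ** 2).sum())
            out[y, x] = (t * w0).sum() / denom if denom > 0 else 0.0
    return out

def best_match(img, tmpl):
    """Upper-left corner (y, x) of the best-matching window."""
    r = match_template(img, tmpl)
    return np.unravel_index(np.argmax(r), r.shape)
```

The returned corner is the rough result that step five then refines by testing the nine neighboring pixel points with the twin neural network.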
Step five, fine matching of twin neural networks
First, the corresponding registered optical images and SAR images serving as the training set are input into the two input ends of a twin neural network for training, and the trained model is output. The model is then called to match the images of the nine pixel points, centered on the upper-left-corner pixel point, that were output by template matching; the image with the highest similarity is output as the final region of the SAR image within the optical image, and the pixel point coordinate corresponding to that image is output as the final result.
The structure of the twin neural network is shown in Fig. 2. The purpose of a twin neural network is to measure how similar two inputs are. It has two inputs (Input1 and Input2); the two input images are fed into two neural networks (Network1 and Network2), which map them into new spaces to form representations of the inputs in those spaces. The similarity of the two input images is then evaluated by computing the Loss.
Picture matching is carried out on the nine pixel points centered on the upper-left-corner pixel point using the improved twin neural network, and the optimal pixel point coordinate is selected and output as the result.
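The twin-network idea, one shared mapping applied to both inputs with a loss that scores similarity in the new space, can be sketched as follows; the tanh embedding, the weight matrix W and the contrastive loss are illustrative assumptions, since the patent does not fix the network architecture or the Loss:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))   # shared weights: the "twin" property

def embed(x):
    """Map an input vector (e.g. a flattened image patch) into the new
    space; both branches use the SAME weights W."""
    return np.tanh(W @ x)

def distance(x1, x2):
    """Distance between the two representations in the new space."""
    return float(np.linalg.norm(embed(x1) - embed(x2)))

def contrastive_loss(x1, x2, same, margin=1.0):
    """same = 1 for a matching pair, 0 for a non-matching pair
    (contrastive loss, an assumed choice of Loss)."""
    d = distance(x1, x2)
    return same * d ** 2 + (1 - same) * max(margin - d, 0.0) ** 2
```

At matching time, the nine candidate patches around the template-matching result would be scored with `distance()`, and the candidate with the smallest distance (highest similarity) output as the final coordinate.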
Detailed description of the invention
The feasibility of the proposed method is verified by the following experiments, which compare it with the existing template matching method in three respects: feature point extraction, correct matching rate and matching speed.
1. Working conditions
The experiments were run on a Windows 10 PC with an Intel Core i9-9900K CPU @ 3.60 GHz (16 logical processors) and two GeForce RTX 1080Ti graphics cards; the programming language is Python.
2. Analysis of experimental content and results
As shown in Fig. 3, comparison with the original SAR image shows that the image matched by the present invention is more similar. The upper-left pixel points of the two matching results and of the actual SAR image are then output for testing, and the accuracy with a pixel-point error of less than 5 is reported in Table 1, demonstrating that the accuracy of the proposed method is greatly improved.
Wherein, the calculation formula of the correct matching rate is as follows:
P = (Q / N) × 100%
in the formula, P is the correct matching rate, Q is the correct matching number with a matching error of 5 pixels or less, and N is the total matching number.
In different cases, the comparison results of the present invention with the existing template matching method are shown in fig. 3 to 5. The results of the experiments were compared in three scenes, namely, port (fig. 3), city (fig. 4) and river (fig. 5), and the results of the correct matching rates are shown in table 1.
As can be seen from fig. 3 to 5 and table 1, the correct matching rate of the present invention is higher than that of the existing template matching method under different conditions.
TABLE 1 Comparison of correct matching rate (%)
Scene     Traditional template matching     The invention
Harbor    61.33                             91.58
City      65.47                             88.72
River     52.19                             79.65
The three experiments show that the sketch-image conversion based on the Sobel operator overcomes the difficulty that heterogeneous images, owing to their different imaging principles, share few common characteristics in gray scale, brightness and color and are therefore hard to match; it forms a sketch image that retains the image information and lowers the matching difficulty, while the added twin-neural-network fine matching solves the problem of low matching accuracy. Moreover, the adopted twin neural network structure is simple, so matching is fast in application.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and decorations can be made without departing from the principle of the present invention, and these modifications and decorations should also be regarded as the protection scope of the present invention.

Claims (10)

1. The heterogeneous image matching method based on template matching and twin neural network optimization is characterized by comprising the following steps of:
firstly, preprocessing an SAR image;
step two, image half-pixel processing;
step three, converting the sketch image based on a Sobel operator;
step four, template matching;
and step five, fine matching using a twin neural network.
2. The heterogeneous image matching method based on template matching and twin neural network optimization according to claim 1, wherein the specific operation steps of step one are as follows:
carrying out median filtering on the SAR image to be matched to remove bright spots and speckle noise, obtaining a de-noised image.
3. The heterogeneous image matching method based on template matching and twin neural network optimization according to claim 2, wherein the formula of the median filter is as follows:
y(i)=Med[x(i-N),...,x(i),...,x(i+N)]
in the formula, a window of odd length L = 2N + 1 is defined, where N is a positive integer; at a given time, the signal sample values in the window are x(i-N), …, x(i), …, x(i+N), where x(i) is the sample value at the center of the window, y(i) is the output value of the median filter, and Med denotes taking the median.
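The window-median formula of claim 3 can be sketched as follows (a minimal NumPy illustration, not the patent's implementation; replicate-padding at the signal borders is an added assumption):

```python
import numpy as np

def median_filter_1d(x, N):
    """y(i) = Med[x(i-N), ..., x(i), ..., x(i+N)] over a window of
    odd length L = 2N + 1; borders are padded by edge replication."""
    L = 2 * N + 1
    padded = np.pad(np.asarray(x, dtype=float), N, mode="edge")
    # Collect every length-L window, then take the median of each.
    windows = np.lib.stride_tricks.sliding_window_view(padded, L)
    return np.median(windows, axis=1)

# An isolated spike (speckle-like outlier) is removed, while the
# genuine step edge between the 1s and the 5s is preserved.
signal = [1.0, 1.0, 9.0, 1.0, 1.0, 5.0, 5.0, 5.0]
print(median_filter_1d(signal, N=1))  # [1. 1. 1. 1. 1. 5. 5. 5.]
```

On a 2-D SAR image the same idea would use a 2-D window (e.g. `scipy.ndimage.median_filter`); the 1-D version above mirrors the claim's notation.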
4. The heterogeneous image matching method based on template matching and twin neural network optimization according to claim 1, wherein the specific operation steps of step two are as follows:
respectively carrying out half-pixel processing on the optical image and the preprocessed SAR image by bilinear interpolation, enlarging each image to four times its original size.
5. The heterogeneous image matching method based on template matching and twin neural network optimization according to claim 4, wherein the bilinear interpolation method has the following formula:
f(R1) ≈ ((x2 - m)/(x2 - x1))·f(A1) + ((m - x1)/(x2 - x1))·f(A2)

f(R2) ≈ ((x2 - m)/(x2 - x1))·f(A3) + ((m - x1)/(x2 - x1))·f(A4)

f(P) ≈ ((y2 - n)/(y2 - y1))·f(R1) + ((n - y1)/(y2 - y1))·f(R2)
in the formula, f(R1) and f(R2) are the pixel values of points R1 and R2; f(A1), f(A2), f(A3) and f(A4) are the pixel values of points A1, A2, A3 and A4; f(P) is the pixel value of the unknown point P; x1 is the abscissa of the known points A1 and A3, x2 the abscissa of A2 and A4, y1 the ordinate of A1 and A2, and y2 the ordinate of A3 and A4; m and n are the abscissa and ordinate of the unknown point P; R1 is the point with the same ordinate as A1 and the same abscissa as P, and R2 the point with the same ordinate as A3 and the same abscissa as P; A1, A2, A3 and A4 are the coordinate points of the pixels at the upper left, upper right, lower left and lower right corners of the cell respectively, and are all known;
with the points A1 to A4 known, the value of the unknown point P within the area is obtained.
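Claim 5's three interpolation formulas can be sketched directly (an illustrative Python version under the claim's corner layout; the dictionary-based interface is an assumption made for readability):

```python
def bilinear(f, x1, x2, y1, y2, m, n):
    """Bilinear interpolation at unknown point P = (m, n) inside the cell
    with corners A1 = (x1, y1), A2 = (x2, y1), A3 = (x1, y2), A4 = (x2, y2).
    `f` maps a corner name to its known pixel value."""
    dx = x2 - x1
    dy = y2 - y1
    # Interpolate along x on the two rows: R1 on the A1-A2 row, R2 on A3-A4.
    fR1 = (x2 - m) / dx * f["A1"] + (m - x1) / dx * f["A2"]
    fR2 = (x2 - m) / dx * f["A3"] + (m - x1) / dx * f["A4"]
    # Then interpolate along y between R1 and R2.
    return (y2 - n) / dy * fR1 + (n - y1) / dy * fR2

# Sanity check: the midpoint of a unit cell is the mean of the four corners.
corners = {"A1": 10.0, "A2": 20.0, "A3": 30.0, "A4": 40.0}
print(bilinear(corners, 0, 1, 0, 1, 0.5, 0.5))  # 25.0
```

Evaluating P at the cell midpoint returning the average of the four corner values is a quick check that the weights sum correctly.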
6. The heterogeneous image matching method based on template matching and twin neural network optimization according to claim 1, wherein the specific operation steps of step three are as follows:
S3.1, extracting edge features with the Sobel operator
Utilizing a Sobel edge filter to extract the edge features of the optical image and the SAR image: the convolution kernels of the Sobel operator are a horizontal detection convolution kernel and a vertical detection convolution kernel; the two kernels are applied separately to the input image to produce separate measures of the gradient component in each direction. Denoting the original image by A, and the horizontally and vertically edge-detected gray-value images by Gx and Gy respectively, the calculation formulas are:
Gx = [ -1  0 +1 ; -2  0 +2 ; -1  0 +1 ] * A

Gy = [ -1 -2 -1 ;  0  0  0 ; +1 +2 +1 ] * A

where * denotes the two-dimensional convolution;
g is to bexAnd GyCombining together to find the magnitude of the absolute value of the gradient G and the direction of the gradient G at each point; comparing the absolute value of the gradient G at a certain point with the middle value of the gray value of the whole image, and if the gradient G at the point is greater than the middle value of the gray value of the whole image, considering the point as an edge point;
S3.2, non-maximum suppression
processing the image obtained in S3.1 with non-maximum suppression, and smoothing the image after edge thinning to obtain the corresponding sketch image.
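Step S3.1 can be sketched as follows (a minimal NumPy illustration of the Sobel kernels and the median-gray-value threshold of claim 6; the naive sliding-window loop and edge-replicated borders are assumptions, and the non-maximum suppression of S3.2 is omitted):

```python
import numpy as np

def conv2_same(img, kernel):
    """Naive 'same'-size 2-D correlation with edge replication
    (the convolution/correlation sign flip does not affect |G|)."""
    kh, kw = kernel.shape
    p = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.empty(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(p[i:i + kh, j:j + kw] * kernel)
    return out

def sobel_edges(img):
    """Sobel gradients Gx, Gy; a pixel is kept as an edge point when the
    gradient magnitude exceeds the median gray value of the whole image."""
    Kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    Ky = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], float)
    Gx = conv2_same(img, Kx)
    Gy = conv2_same(img, Ky)
    G = np.hypot(Gx, Gy)            # gradient magnitude at each pixel
    return G > np.median(img)       # threshold at the image median

img = np.zeros((6, 6))
img[:, 3:] = 100.0                  # vertical step edge between columns 2 and 3
edges = sobel_edges(img)
print(edges.astype(int))            # columns 2 and 3 are marked as edges
```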
7. The heterogeneous image matching method based on template matching and twin neural network optimization according to claim 1, wherein the specific operation steps of step four are as follows:
and carrying out template matching according to the obtained sketch image, matching the region of the SAR sketch image in the optical sketch image, framing the SAR sketch image by using a rectangular frame, outputting an upper left-corner pixel point of the rectangular frame in the optical sketch image, and outputting nine adjacent pixel points taking the upper left-corner pixel point as a center.
8. The heterogeneous image matching method based on template matching and twin neural network optimization according to claim 7, wherein the template matching adopts a normalized correlation coefficient matching algorithm, and the calculation formula is as follows:
R(x,y) = Σx′,y′ [ T′(x′,y′) · I′(x+x′, y+y′) ] / sqrt( Σx′,y′ T′(x′,y′)² · Σx′,y′ I′(x+x′, y+y′)² )
in the formula, R (x, y) represents a matching result, T 'represents a template image, I' represents an image to be matched, x represents an abscissa of the upper left corner of the window dividing block in the image to be matched, y represents an ordinate of the upper left corner of the window dividing block in the image to be matched, x 'represents an abscissa of each pixel point in the window dividing block, and y' represents an ordinate of each pixel point in the window dividing block.
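The normalized-correlation matching of claim 8 can be sketched as a sliding-window loop (a pure-NumPy illustration in the spirit of OpenCV's TM_CCOEFF_NORMED; taking T′ and I′ as the mean-subtracted template and window is an assumption about the primed notation):

```python
import numpy as np

def ncc_match(image, template):
    """Slide the template over the image and return the normalized
    correlation coefficient map R and the top-left corner (x, y)
    of the best match."""
    th, tw = template.shape
    ih, iw = image.shape
    Tp = template - template.mean()          # T' (mean-subtracted template)
    R = np.zeros((ih - th + 1, iw - tw + 1))
    for y in range(R.shape[0]):
        for x in range(R.shape[1]):
            window = image[y:y + th, x:x + tw]
            Ip = window - window.mean()      # I' (mean-subtracted window)
            denom = np.sqrt((Tp ** 2).sum() * (Ip ** 2).sum())
            R[y, x] = (Tp * Ip).sum() / denom if denom > 0 else 0.0
    best = np.unravel_index(np.argmax(R), R.shape)
    return R, (int(best[1]), int(best[0]))   # (x, y) of the best match

rng = np.random.default_rng(0)
image = rng.random((20, 20))
template = image[5:10, 7:12].copy()          # cut the template out of the image
R, corner = ncc_match(image, template)
print(corner)  # (7, 5): the template is found exactly where it was cut out
```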
9. The heterogeneous image matching method based on template matching and twin neural network optimization according to claim 8, wherein the specific operation steps of step five are as follows:
inputting the registered optical images and SAR images of the training set into the two input ends of a twin neural network for training, and outputting the trained model; the model is then called to perform image matching on the nine pixel points, centered on the upper left corner pixel point, output by template matching; the image with the highest recognition degree is output as the final area of the SAR image within the optical image, and the pixel point coordinates corresponding to that image are output as the final result.
10. The heterogeneous image matching method based on template matching and twin neural network optimization of claim 9, wherein the twin neural network comprises two inputs, two neural networks, a Loss module; respectively inputting the two input images into two neural networks, and respectively mapping the input images to new spaces by the two neural networks to form the representation of the input images in the new spaces; and finally, evaluating the similarity of the two input images through Loss calculation.
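The twin structure of claim 10 — two inputs mapped through the same network into a new space, then compared by a Loss — can be sketched with a single shared weight matrix (a toy NumPy illustration; the tanh embedding, the contrastive form of the Loss and the margin value are assumptions, not the patent's trained network):

```python
import numpy as np

rng = np.random.default_rng(42)
W = rng.standard_normal((16, 64)) * 0.1   # ONE weight matrix shared by both branches

def embed(x):
    """Shared branch: both inputs pass through the SAME weights,
    mapping each input into a common embedding space."""
    return np.tanh(W @ x)

def contrastive_loss(x1, x2, same, margin=1.0):
    """Contrastive Loss: pull matching pairs together, push
    non-matching pairs at least `margin` apart in embedding space."""
    d = np.linalg.norm(embed(x1) - embed(x2))
    return d ** 2 if same else max(0.0, margin - d) ** 2

a = rng.standard_normal(64)
b = a + 0.01 * rng.standard_normal(64)    # near-duplicate of patch a
c = rng.standard_normal(64)               # unrelated patch
# A matching pair yields a much smaller embedding distance (hence loss)
# than a random pair, which is what the fine-matching step exploits.
print(contrastive_loss(a, b, same=True) < contrastive_loss(a, c, same=True))
```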
CN202110022206.4A 2021-01-08 2021-01-08 Heterogeneous image matching method based on template matching and twin neural network optimization Active CN112801141B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110022206.4A CN112801141B (en) 2021-01-08 2021-01-08 Heterogeneous image matching method based on template matching and twin neural network optimization


Publications (2)

Publication Number Publication Date
CN112801141A true CN112801141A (en) 2021-05-14
CN112801141B CN112801141B (en) 2022-12-06

Family

ID=75809145


Country Status (1)

Country Link
CN (1) CN112801141B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115862377A (en) * 2023-02-08 2023-03-28 山东科技大学 Intelligent parking reservation method and system
CN117037159A (en) * 2023-10-09 2023-11-10 网思科技股份有限公司 Oil painting authenticity identification method, device and storage medium based on convolutional neural network

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110142297A1 (en) * 2009-12-16 2011-06-16 Eye Controls, Llc Camera Angle Compensation in Iris Identification
CN102654902A (en) * 2012-01-16 2012-09-05 江南大学 Contour vector feature-based embedded real-time image matching method
CN107680095A (en) * 2017-10-25 2018-02-09 哈尔滨理工大学 The electric line foreign matter detection of unmanned plane image based on template matches and optical flow method
CN108960258A (en) * 2018-07-06 2018-12-07 江苏迪伦智能科技有限公司 A kind of template matching method based on self study depth characteristic
CN109409292A (en) * 2018-10-26 2019-03-01 西安电子科技大学 The heterologous image matching method extracted based on fining characteristic optimization
CN110674719A (en) * 2019-09-18 2020-01-10 北京市商汤科技开发有限公司 Target object matching method and device, electronic equipment and storage medium
CN111783548A (en) * 2020-06-04 2020-10-16 河海大学 SAR image and visible light image matching method based on improved feature extraction and game theory hypergraph


Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
BRANDON懂你: "图像处理之双线性插值法", 《HTTPS://BLOG.CSDN.NET/QQ_37577735/ARTICLE/DETAILS/80041586》 *
JUNG ROK KIM等: "Sobel Edge-based Image Template Matching in FPGA", 《2020 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS - ASIA (ICCE-ASIA)》 *
刘国成著: "《人群异常行为数字图像处理与分析》", 31 January 2020 *
宋伟东等: "《遥感影像几何纠正与三维重建》", 30 April 2011 *
李佳芮等: "融合边缘信息和扩展邻域的异源图像匹配", 《计算机工程与应用》 *
谢志华等: "可见光图像与合成孔径雷达图像的快速配准", 《激光与光电子学进展》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115862377A (en) * 2023-02-08 2023-03-28 山东科技大学 Intelligent parking reservation method and system
CN117037159A (en) * 2023-10-09 2023-11-10 网思科技股份有限公司 Oil painting authenticity identification method, device and storage medium based on convolutional neural network
CN117037159B (en) * 2023-10-09 2024-03-19 网思科技股份有限公司 Oil painting authenticity identification method, device and storage medium based on convolutional neural network

Also Published As

Publication number Publication date
CN112801141B (en) 2022-12-06

Similar Documents

Publication Publication Date Title
CN109410207B (en) NCC (non-return control) feature-based unmanned aerial vehicle line inspection image transmission line detection method
CN112819772B (en) High-precision rapid pattern detection and recognition method
CN110866924B (en) Line structured light center line extraction method and storage medium
CN108596197B (en) Seal matching method and device
CN107301661A (en) High-resolution remote sensing image method for registering based on edge point feature
CN112801141B (en) Heterogeneous image matching method based on template matching and twin neural network optimization
CN109034017A (en) Head pose estimation method and machine readable storage medium
CN110070574B (en) Binocular vision stereo matching method based on improved PSMAT net
CN110348263A (en) A kind of two-dimensional random code image recognition and extracting method based on image recognition
CN116758077B (en) Online detection method and system for surface flatness of surfboard
CN108470338B (en) A kind of water level monitoring method
CN111783583B (en) SAR image speckle suppression method based on non-local mean algorithm
CN104200434B (en) Non-local mean image denoising method based on noise variance estimation
CN111369605A (en) Infrared and visible light image registration method and system based on edge features
CN111354047B (en) Computer vision-based camera module positioning method and system
CN112017130B (en) Image restoration method based on self-adaptive anisotropic total variation regularization
CN107944497A (en) Image block method for measuring similarity based on principal component analysis
CN116503462A (en) Method and system for quickly extracting circle center of circular spot
CN111027637A (en) Character detection method and computer readable storage medium
CN109671084B (en) Method for measuring shape of workpiece
CN113642397B (en) Object length measurement method based on mobile phone video
CN114332079A (en) Plastic lunch box crack detection method, device and medium based on image processing
CN108647605B (en) Human eye gaze point extraction method combining global color and local structural features
CN117315670A (en) Water meter reading area detection method based on computer vision
CN111951178A (en) Image processing method and device for remarkably improving image quality and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant