CN111401203A - Target identification method based on multi-dimensional image fusion - Google Patents

Target identification method based on multi-dimensional image fusion

Info

Publication number
CN111401203A
Authority
CN
China
Prior art keywords
image
pyramid
fusion
laplace
reference image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010165922.3A
Other languages
Chinese (zh)
Inventor
李良福
刘培祯
王娇颖
高强
钱钧
周国良
何曦
侯瑞
刘轩
王超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian institute of Applied Optics
Original Assignee
Xian institute of Applied Optics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian institute of Applied Optics filed Critical Xian institute of Applied Optics
Priority to CN202010165922.3A priority Critical patent/CN111401203A/en
Publication of CN111401203A publication Critical patent/CN111401203A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/10: Terrestrial scenes
    • G06V20/13: Satellite images
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/25: Fusion techniques
    • G06F18/251: Fusion techniques of input or preprocessed data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of artificial intelligence and computer vision, and particularly relates to a target identification method based on multi-dimensional image fusion. By studying the correlation between the imaging characteristics and image features of different sensors, a fusion algorithm with a Laplacian pyramid decomposition structure based on multi-resolution analysis is realized. Pyramid decomposition of the image makes it possible to analyze objects of different sizes in the image. At the same time, information obtained by analyzing the high-resolution lower layers can be used to guide the analysis of the low-resolution upper layers, which greatly simplifies analysis and computation. Because the image fusion method of the invention matches the natural conditions actually found in a complex battlefield environment, it offers, compared with other existing image fusion methods, a good fusion effect, rich detail and other advantages.

Description

Target identification method based on multi-dimensional image fusion
Technical Field
The invention belongs to the technical field of artificial intelligence and computer vision, and particularly relates to a target identification method based on multi-dimensional image fusion.
Background
Image target identification is realized by comparing stored target information with the current image information. Description of the image is a prerequisite for target recognition: by representing the relevant features of individual objects in the image or scene, and even the relationships between objects, with numbers or symbols, an abstract representation of the object features and their relationships is obtained. When image recognition technology extracts individual features from an image, a template matching model can be adopted. In some applications, image recognition must give the position of an object in addition to identifying what the object is. At present, image recognition technology is widely applied in fields such as biomedicine, satellite remote sensing, robot vision, cargo inspection, target tracking, autonomous vehicle navigation, public security, banking, transportation, the military, electronic commerce and multimedia network communication. With the rapid development of artificial intelligence and computer vision, target recognition based on machine vision and on deep learning has appeared, greatly improving the accuracy and efficiency of image recognition.
However, the image information acquired by a single-band sensor has shortcomings. For example, visible-light images are rich in detail but cannot be formed at night or in weak light; infrared images can be formed around the clock, but they only reflect the temperature distribution of objects and do not show details. By adopting image fusion, the multiband information of a single sensor, or the information provided by different sensors, can be integrated, and the redundancy and contradictions that may exist among multi-sensor information can be eliminated, so that the clarity of the information in the image is enhanced, the interpretation accuracy, reliability and utilization rate are improved, and a clear, complete and accurate description of the target is formed. An efficient image fusion method can comprehensively process the information of multi-source channels as required, thereby effectively improving the utilization rate of image information, the reliability of the system in target identification, and the degree of automation of the system.
In systems such as unmanned reconnaissance aircraft, vehicle-mounted panoramic situation awareness, and ship-borne electro-optical search and tracking, target identification based on multi-dimensional image fusion can meet multiple requirements of military electro-optical systems and provides automatic, intelligent perception of external scenes. It also has wide application in aerial survey and industrial measurement in the civil field, and can therefore bring considerable social and economic benefits to China's military and reconnaissance fields.
The Chinese journal Command Control & Simulation (2019, Vol. 28, No. 1, pp. 1-5) published a paper entitled "Sea battlefield image target recognition based on deep learning", in which Suiping et al. analyze the advantages and shortcomings of the region-proposal-based R-CNN family of models and the regression-based YOLO model, and survey the application status of deep learning technology in sea battlefield image target recognition.
Disclosure of Invention
Technical problem to be solved
The technical problem to be solved by the invention is: to provide, for unmanned systems, a target identification method based on multi-dimensional image fusion that meets the target identification requirements in complex environments.
(II) technical scheme
In order to solve the above technical problem, the present invention provides a target identification method based on multi-dimensional image fusion, wherein the method comprises:
step 1: preprocessing the image; the method comprises the following steps:
step 11: calculating an image relative parameter transformation matrix;
when an identification command sent by the unmanned reconnaissance system device is received, the visible-light image is acquired through the corresponding sensor as the reference image g_B, and the infrared image as the candidate image g_C;
Assume the number of matching points is N, where N is at least 3. Taking N = 3, 3 feature points B_1, B_2 and B_3 are selected from the reference image; in the candidate image g_C, the 3 corresponding matching points C_1, C_2 and C_3 are selected;
For the reference image g_B and the candidate image g_C, N pairs of matching points are found in the two multi-source images; the relative parameter transformation matrix P_{C←B} between the reference image g_B and the candidate image g_C is calculated with the following least-squares formula:
P_{C←B} = C * B^T * (B * B^T)^(-1)
C = [C_1 C_2 … C_N] = [x_C1 x_C2 … x_CN; y_C1 y_C2 … y_CN; 1 1 … 1]
B = [B_1 B_2 … B_N] = [x_B1 x_B2 … x_BN; y_B1 y_B2 … y_BN; 1 1 … 1]
where C denotes the homogeneous coordinates of the matching points in the candidate image coordinate system and is a 3 × N matrix; B denotes the homogeneous coordinates of the matching points in the reference image coordinate system and is a 3 × N matrix; B^T is the transpose of B; and the relative parameter transformation matrix P_{C←B} is a 3 × 3 matrix;
step 12: transforming the candidate image into a reference image coordinate system;
Using inverse mapping, starting from the reference image g_B, the position in the candidate image g_C corresponding to each pixel point of the reference image g_B is solved through the transformation function; from the homogeneous coordinates of each point B_0 in the reference image g_B, the homogeneous coordinates of the corresponding point C_0 in g_C can be calculated according to the following formula:
C_0 = P_{C←B} * B_0, with B_0 = (x_B, y_B, 1)^T and C_0 = (x_C, y_C, 1)^T
where the horizontal coordinate x_B of B_0 takes values in (1, 2, …, W) and its vertical coordinate y_B in (1, 2, …, H); the horizontal coordinate x_C of C_0 takes values in (1, 2, …, W) and its vertical coordinate y_C in (1, 2, …, H);
the pixel gray value of the point C_0 in the candidate image g_C is assigned to the corresponding pixel point B_0 of the reference image g_B; the transformed candidate image g_{B←C} is thereby obtained, and the image g_{B←C} is taken as output and fused with the reference image g_B;
step 2: performing image fusion on the image based on the multi-source characteristics; the method comprises the following steps:
the reference image g_B and the transformed candidate image g_{B←C} are fused by adopting an image fusion algorithm with a pyramid decomposition structure based on multi-resolution analysis, which is realized through the following steps:
step 21: carrying out Gaussian pyramid decomposition on the image;
the reference image g_B serving as the source image is taken as G_0, the zero layer of the Gaussian pyramid; the l-th layer image G_l of the Gaussian pyramid is:
G_l(i, j) = Σ_{m=-2}^{2} Σ_{n=-2}^{2} w(m, n) · G_{l-1}(2i + m, 2j + n), 0 ≤ l ≤ N1, 0 ≤ i < C_l, 0 ≤ j < R_l
in the formula: N1 is the number of the top layer of the Gaussian pyramid; C_l is the number of columns of the l-th layer image of the Gaussian pyramid; R_l is the number of rows of the l-th layer image; w(m, n) is a 5 × 5 window function with a low-pass characteristic;
step 22: establishing the Laplace pyramid of the image;
from G_l, an interpolated and enlarged image G_l* is obtained:
G_l*(i, j) = 4 · Σ_{m=-2}^{2} Σ_{n=-2}^{2} w(m, n) · G_l((i + m)/2, (j + n)/2), 0 < l ≤ N1, 0 ≤ i < C_{l-1}, 0 ≤ j < R_{l-1}
where only the terms for which (i + m)/2 and (j + n)/2 are integers are accumulated;
in the formula, G_l* is the image obtained from G_l by interpolation and enlargement; its size is the same as that of G_{l-1}, but the gray values of the new pixels of G_l* interpolated between the original pixels are determined by a weighted average of the gray values of the original pixels; since G_l is obtained from G_{l-1} by low-pass filtering, i.e. G_l is a blurred, down-sampled G_{l-1}, the detail information of G_l* is less than that of G_{l-1};
thus, the decomposed image LP_l of each Laplace pyramid layer is obtained:
LP_l = G_l - G_{l+1}*, 0 ≤ l < N2; LP_{N2} = G_{N2}
where N2 is the number of the top layer of the Laplace pyramid, and LP_l represents the l-th layer image of the Laplace pyramid decomposition;
step 23: reconstructing the source image from the Laplace pyramid;
by transforming the above formula, the Gaussian pyramid and finally the source image G_0 are recovered from the top layer downwards:
G_{N2} = LP_{N2}; G_l = LP_l + G_{l+1}*, 0 ≤ l < N2
Step 24, fusing images based on L aplace pyramid decomposition;
and setting A and B as two source images and F as a fused image, wherein the fusion process is as follows:
241, performing L aplace pyramid decomposition on each source image to establish respective L aplace pyramids;
242, respectively fusing all decomposition layers of the image pyramid to obtain an L aplace pyramid of the fused image;
step 243, carrying out image reconstruction on the fused L aplace pyramid to obtain a final fused image;
step 3: carrying out target identification; the method comprises the following steps:
step 31: carrying out big data annotation;
for the M collected big-data images, rectangular areas are selected with a marking tool; the label of a background area is defined as 0 and the label of a target area as 1; the labeled data are divided into a training set and a verification set of a certain scale for training the deep learning model, realizing identification of the class-1 target; M is at least 12000;
step 32: carrying out data training;
training a target classification model by using the data set marked in the previous step;
step 33: performing real-time image fusion;
collecting a visible light image and an infrared image in real time, and fusing to obtain a fused image;
step 34: carrying out target identification and positioning;
the trained classification model is used to carry out ship target detection on the fusion image obtained in the previous step; the sizes and positions of all ship targets identified by the classification model are recorded, and the rectangular areas are marked on the image with rectangular identification frames.
(III) advantageous effects
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention provides a method for fusion based on multi-dimensional image information. By studying the correlation between the imaging characteristics and image features of different sensors, a fusion algorithm with a Laplacian pyramid decomposition structure based on multi-resolution analysis is realized. With pyramid decomposition of the image, objects of different sizes in the image can be analyzed: high-resolution layers (lower layers) can be used to analyze details, while low-resolution layers (upper layers) can be used to analyze larger objects. At the same time, information obtained by analyzing the high-resolution lower layers can be used to guide the analysis of the low-resolution upper layers, greatly simplifying analysis and computation. Pyramid decomposition provides a convenient and flexible method of multi-resolution image analysis, and Laplacian decomposition distributes the important features of the image (such as edges) over the decomposition layers according to their scale. Because the image fusion method of the invention matches the natural conditions actually found in a complex battlefield environment, it offers, compared with other existing image fusion methods, a good fusion effect, rich detail and other advantages.
(2) In the invention, a convolutional neural network based on deep learning is used to perform target identification on the fused image; the method has the characteristics of accurate identification and strong robustness to scale and illumination changes.
(3) In the invention, feature points are used to register the heterogeneous images, and the least-squares method is used to calculate the image transformation parameters, which gives high calculation accuracy, high speed and a good fusion and recognition effect.
Drawings
Fig. 1(a) -1 (d) are graphs showing the results of multi-dimensional image fusion experiments. Wherein, fig. 1(a) is a multi-dimensional image fusion software start interface; FIG. 1(b) is an interface of the multi-dimensional image fusion software after loading the image video; FIG. 1(c) is a visible light and infrared image registration point selection interface; fig. 1(d) is a multi-dimensional image fusion result.
FIG. 2 is a flowchart illustrating the operation of the method for identifying an object based on multi-dimensional image fusion according to the present invention.
FIG. 3 is a schematic diagram of image fusion based on pyramid decomposition according to the present invention.
Fig. 4(a) and 4(b) are graphs of experimental results of object recognition on a ship object image video according to a preferred embodiment of the present invention.
Detailed Description
In order to make the objects, contents, and advantages of the present invention clearer, the following detailed description of the embodiments of the present invention will be made in conjunction with the accompanying drawings and examples.
In order to meet the target identification requirement in a complex environment, the invention provides a target identification method based on multi-dimensional image fusion for an unmanned system.
The main aim of the method is to fuse multiband video image sequences with an image fusion method, and finally to identify targets intelligently with a deep learning method. Video image sequences are therefore the objects that the invention needs to process. The target identification technology based on multi-dimensional image fusion labels, trains on and learns from big data of multiband fused images, and automatically identifies multiple targets in an image.
The image acquisition device adopts a full-Rui video visible-light camera; for infrared, an EXA-IR2 uncooled thermal imager is used. The computer in the preferred embodiment uses an I7-7700 processor with a main frequency of 2.80 GHz and a 1 TB hard disk; with this computer the target identification algorithm takes only about 0.1 second per frame. One frame of visible-light image and one frame of infrared image are collected for each image fusion operation. The visible-light image is taken as the reference image and the infrared image as the candidate image. The basic flow of the image fusion technique is: first, the reference image and the candidate image are registered, establishing the mathematical transformation model of the two images; then, according to the established mathematical transformation model, a unified coordinate transformation is carried out, that is, all candidate image sequences are transformed into the coordinate system of the reference image to form the fused image. Because scale, rotation and translation transformations exist between two images, at least 3 pairs of matching points are needed to solve the transformation relation.
The target identification method based on multi-dimensional image fusion provided by the preferred embodiment of the present invention completes the real-time identification of the image according to the workflow shown in fig. 2, and the identification process includes the following four major contents.
Specifically, the target identification method based on multi-dimensional image fusion provided by the invention comprises the following steps:
step 1: preprocessing the image; the method comprises the following steps:
step 11: calculating an image relative parameter transformation matrix;
when an identification command sent by the unmanned reconnaissance system device is received, the visible-light image is acquired through the corresponding sensor as the reference image g_B, and the infrared image as the candidate image g_C;
Assume the number of matching points is N, where N is at least 3. Taking N = 3, 3 feature points B_1, B_2 and B_3 are selected from the reference image; in the candidate image g_C, the 3 corresponding matching points C_1, C_2 and C_3 are selected;
For the reference image g_B and the candidate image g_C, N pairs of matching points are found in the two multi-source images; if N ≥ 3, the relative parameter transformation matrix P_{C←B} between the reference image g_B and the candidate image g_C is calculated with the following least-squares formula:
P_{C←B} = C * B^T * (B * B^T)^(-1)
C = [C_1 C_2 … C_N] = [x_C1 x_C2 … x_CN; y_C1 y_C2 … y_CN; 1 1 … 1]
B = [B_1 B_2 … B_N] = [x_B1 x_B2 … x_BN; y_B1 y_B2 … y_BN; 1 1 … 1]
where C denotes the homogeneous coordinates of the matching points in the candidate image coordinate system and is a 3 × N matrix; B denotes the homogeneous coordinates of the matching points in the reference image coordinate system and is a 3 × N matrix; B^T is the transpose of B; and the relative parameter transformation matrix P_{C←B} is a 3 × 3 matrix;
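For concreteness, the least-squares computation above can be written directly in NumPy. The following is a minimal sketch, not part of the patent: the function name and the (N, 2) point format are assumptions, and the matching points are taken as already paired.

import numpy as np

def estimate_transform(ref_pts, cand_pts):
    """Least-squares estimate of the relative parameter transformation matrix
    P_{C<-B} mapping reference-image points to candidate-image points.

    ref_pts, cand_pts: (N, 2) arrays of matched pixel coordinates, N >= 3.
    Returns a 3x3 matrix P such that C ~ P @ B in homogeneous coordinates.
    """
    ref_pts = np.asarray(ref_pts, dtype=np.float64)
    cand_pts = np.asarray(cand_pts, dtype=np.float64)
    n = ref_pts.shape[0]
    # Homogeneous coordinates stacked column-wise: B and C are 3 x N matrices.
    B = np.vstack([ref_pts.T, np.ones((1, n))])
    C = np.vstack([cand_pts.T, np.ones((1, n))])
    # P_{C<-B} = C * B^T * (B * B^T)^{-1}
    return C @ B.T @ np.linalg.inv(B @ B.T)

With exactly three non-collinear point pairs this reduces to an exact solution of the transformation; more pairs give a least-squares fit.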
step 12: transforming the candidate image into a reference image coordinate system;
Because a certain transformation relation exists between the reference image g_B and the candidate image g_C, the two must be transformed into the same coordinate system before image fusion. In the present invention, the candidate image g_C is transformed into the coordinate system of the reference image g_B.
From the reference image g, using inverse mappingBStarting from this, the reference image g is solved by a transformation functionBOn each pixel point in the candidate image gCA corresponding position on; from the reference image gBIn each point B0The odd coordinates of (g) can be calculated according to the following formulaCPoint C corresponding thereto0Odd-order coordinates of (c):
Figure BDA0002407456030000091
Figure BDA0002407456030000092
wherein, B0Has a horizontal coordinate of
Figure BDA0002407456030000093
The value range is (1,2, …, W); vertical coordinate is
Figure BDA0002407456030000094
The value range is (1,2, …, H); c0Has a horizontal coordinate of
Figure BDA0002407456030000095
The value range is (1,2, …, W); vertical coordinate is
Figure BDA0002407456030000096
The value range is (1,2, …, H);
candidate image gCMidpoint C0Pixel gray value gc(i, j) assigning a reference image gBCorresponding pixel point B0Then, a transformed candidate image g is obtainedB←CAnd combine the image gB←CAs output and reference image gBCarrying out fusion;
Generally, image transformation can adopt two mapping modes: forward mapping and reverse mapping. Forward mapping transforms the candidate image into the coordinate space of the reference image according to the calculated image transformation parameters; that is, each pixel of the candidate image is scanned and its position in the reference image is calculated in turn through the transformation function. When two adjacent pixel points of the candidate image are mapped to two non-adjacent pixel points of the reference image, discrete mosaics and empty "hole" points appear. A change of approach is therefore required: the reverse idea is to find, for each point of the reference image, the corresponding coordinates in the candidate image. Reverse mapping starts from the reference image g_B and solves, through the transformation function, the position in the candidate image g_C corresponding to each pixel point of the reference image g_B. First the reference image g_B is scanned; then, according to the transformation function, the corresponding sampled pixel point of the candidate image g_C is calculated, and the gray value of that point is assigned to the corresponding pixel point of the reference image g_B.
The reverse mapping is better than the forward mapping because each pixel of the reference image can be scanned to obtain an appropriate gray value, thereby avoiding the situation that some points of the output image in the forward mapping may not be assigned and virtual point holes and mosaics appear.
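A minimal sketch of this reverse-mapping step, assuming the 3 × 3 matrix from the previous sketch and nearest-neighbour resampling (the patent does not specify the interpolation); pixel indices start at 0 here rather than 1, and the function name is illustrative.

import numpy as np

def warp_candidate_to_reference(cand_img, P_c_from_b, ref_shape):
    """Reverse mapping: scan every pixel of the reference grid, look up the
    corresponding candidate-image pixel through P_{C<-B}, and copy its gray value.

    cand_img:    candidate (infrared) image, 2-D array
    P_c_from_b:  3x3 transform from reference to candidate coordinates
    ref_shape:   (H, W) of the reference image
    Returns g_{B<-C}, the candidate image resampled on the reference grid.
    """
    H, W = ref_shape
    out = np.zeros((H, W), dtype=cand_img.dtype)
    ys, xs = np.mgrid[0:H, 0:W]                      # row and column indices
    ref_h = np.stack([xs.ravel(), ys.ravel(), np.ones(H * W)], axis=0)
    cand_h = P_c_from_b @ ref_h
    cand_h = cand_h / cand_h[2]                      # normalise homogeneous scale
    cx = np.rint(cand_h[0]).astype(int).reshape(H, W)
    cy = np.rint(cand_h[1]).astype(int).reshape(H, W)
    # Ignore reference pixels whose look-up falls outside the candidate image.
    valid = (cx >= 0) & (cx < cand_img.shape[1]) & (cy >= 0) & (cy < cand_img.shape[0])
    out[valid] = cand_img[cy[valid], cx[valid]]
    return out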
Step 2: performing image fusion on the image based on the multi-source characteristics; the method comprises the following steps:
the reference image g_B and the transformed candidate image g_{B←C} are fused by adopting an image fusion algorithm with a pyramid decomposition structure based on multi-resolution analysis, comprising the following steps:
step 21: carrying out Gaussian pyramid decomposition on the image;
the reference image g_B serving as the source image is taken as G_0, the zero (bottom) layer of the Gaussian pyramid; the l-th layer image G_l of the Gaussian pyramid is:
G_l(i, j) = Σ_{m=-2}^{2} Σ_{n=-2}^{2} w(m, n) · G_{l-1}(2i + m, 2j + n), 0 ≤ l ≤ N1, 0 ≤ i < C_l, 0 ≤ j < R_l
in the formula: N1 is the number of the top layer of the Gaussian pyramid; C_l is the number of columns of the l-th layer image of the Gaussian pyramid; R_l is the number of rows of the l-th layer image; w(m, n) is a 5 × 5 window function (generating kernel) with a low-pass characteristic;
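The Gaussian pyramid construction can be sketched as follows. The exact 5 × 5 generating kernel appears only as a figure in the original, so the standard Burt-Adelson window is used here as an assumption; function names are illustrative.

import numpy as np
from scipy.ndimage import convolve

# Assumed 5x5 generating kernel w(m, n): separable low-pass window, sums to 1.
_w1d = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
W5 = np.outer(_w1d, _w1d)

def gaussian_pyramid(img, levels):
    """Step 21: G_0 is the source image; each higher layer is the previous one
    low-pass filtered with w(m, n) and sub-sampled by a factor of 2."""
    G = [np.asarray(img, dtype=np.float64)]
    for _ in range(levels):
        blurred = convolve(G[-1], W5, mode='nearest')
        G.append(blurred[::2, ::2])   # keep every second row and column
    return G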
step 22: establishing the Laplace pyramid of the image;
from G_l, an interpolated and enlarged image G_l* is obtained:
G_l*(i, j) = 4 · Σ_{m=-2}^{2} Σ_{n=-2}^{2} w(m, n) · G_l((i + m)/2, (j + n)/2), 0 < l ≤ N1, 0 ≤ i < C_{l-1}, 0 ≤ j < R_{l-1}
where only the terms for which (i + m)/2 and (j + n)/2 are integers are accumulated;
in the formula, G_l* is the image obtained from G_l by interpolation and enlargement; its size is the same as that of G_{l-1}, but the gray values of the new pixels of G_l* interpolated between the original pixels are determined by a weighted average of the gray values of the original pixels; since G_l is obtained from G_{l-1} by low-pass filtering, i.e. G_l is a blurred, down-sampled G_{l-1}, the detail information of G_l* is less than that of G_{l-1};
thus, the decomposed image LP_l of each Laplace pyramid layer is obtained:
LP_l = G_l - G_{l+1}*, 0 ≤ l < N2; LP_{N2} = G_{N2}
where N2 is the number of the top layer of the Laplace pyramid, and LP_l represents the l-th layer image of the Laplace pyramid decomposition;
step 23: reconstructing the source image from the Laplace pyramid;
by transforming the above formula, the Gaussian pyramid and finally the source image G_0 are recovered from the top layer downwards:
G_{N2} = LP_{N2}; G_l = LP_l + G_{l+1}*, 0 ≤ l < N2
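Steps 22 and 23 can be sketched as below, under the same assumed kernel as the previous sketch; expand implements the interpolation-and-enlargement operator G_l*, and the function names are illustrative.

import numpy as np
from scipy.ndimage import convolve

# Same assumed 5x5 low-pass window w(m, n) as in the Gaussian pyramid sketch.
_w1d = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
W5 = np.outer(_w1d, _w1d)

def expand(img, target_shape):
    """Interpolate a pyramid layer back to the size of the layer below by
    zero-upsampling and filtering with 4 * w(m, n), i.e. a weighted average
    of the original pixels as described in step 22."""
    up = np.zeros(target_shape, dtype=np.float64)
    up[::2, ::2] = img
    return convolve(up, 4.0 * W5, mode='nearest')

def laplacian_pyramid(G):
    """Step 22: LP_l = G_l - expand(G_{l+1}) for l < N2, and LP_{N2} = G_{N2}."""
    LP = [G[l] - expand(G[l + 1], G[l].shape) for l in range(len(G) - 1)]
    LP.append(G[-1])
    return LP

def reconstruct(LP):
    """Step 23: recover the source image top-down via G_l = LP_l + expand(G_{l+1})."""
    img = LP[-1]
    for l in range(len(LP) - 2, -1, -1):
        img = LP[l] + expand(img, LP[l].shape)
    return img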
Step 24, fusing images based on L aplace pyramid decomposition;
an image fusion method based on L aplace pyramid decomposition is shown in FIG. 3, wherein A and B are two source images, F is a fused image, and the fusion process is as follows:
241, performing L aplace pyramid decomposition on each source image to establish respective L aplace pyramids;
242, respectively fusing all decomposition layers of the image pyramid to obtain an L aplace pyramid of the fused image;
step 243, carrying out image reconstruction on the fused L aplace pyramid to obtain a final fused image;
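A sketch of steps 241-243, built on the helper functions in the previous sketches. The patent does not state the per-layer fusion rule, so a common choice is assumed here: average the top (approximation) layer and keep the larger-magnitude coefficient on the detail layers.

import numpy as np

def fuse_pyramids(LP_A, LP_B):
    """Layer-wise fusion rule (assumed): average the top approximation layer,
    keep the larger-magnitude coefficient on every detail layer."""
    fused = []
    top = len(LP_A) - 1
    for l, (a, b) in enumerate(zip(LP_A, LP_B)):
        if l == top:
            fused.append(0.5 * (a + b))
        else:
            fused.append(np.where(np.abs(a) >= np.abs(b), a, b))
    return fused

def fuse_images(img_a, img_b, levels=4):
    """Steps 241-243: decompose both registered, same-size source images,
    fuse them layer by layer, then reconstruct the fused image.
    The number of levels is a free choice."""
    LP_A = laplacian_pyramid(gaussian_pyramid(img_a, levels))
    LP_B = laplacian_pyramid(gaussian_pyramid(img_b, levels))
    return reconstruct(fuse_pyramids(LP_A, LP_B))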
step 3: carrying out target identification; the method comprises the following steps:
step 31: carrying out big data annotation;
for the M collected big-data images, rectangular areas are selected with a marking tool; the label of a background area is defined as 0 and the label of a ship target area as 1; the labeled data are divided into a training set and a verification set of a certain scale for training the deep learning model, realizing identification of the class-1 ship target; M is at least 12000;
step 32: carrying out data training;
training a ship target classification model by using the data set marked in the previous step;
step 33: performing real-time image fusion;
collecting a visible light image and an infrared image in real time, and fusing to obtain a fused image;
step 34: carrying out target identification and positioning;
the trained classification model is used to carry out ship target detection on the fusion image obtained in the previous step; the sizes and positions of all ship targets identified by the classification model are recorded, and the rectangular areas are marked on the image with rectangular identification frames.
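A sketch of step 34 using OpenCV. The trained classification model is represented by a placeholder callable detect_ships, assumed to return (x, y, w, h) boxes in pixel coordinates; the drawing colour and line width are arbitrary.

import cv2
import numpy as np

def annotate_detections(fused_img, detect_ships):
    """Run the trained model over the fused image and mark every detected
    ship with a rectangular identification frame."""
    img8 = np.clip(fused_img, 0, 255).astype(np.uint8)
    annotated = cv2.cvtColor(img8, cv2.COLOR_GRAY2BGR)
    boxes = detect_ships(img8)
    for (x, y, w, h) in boxes:
        # Record size/position via the box and draw the identification frame.
        cv2.rectangle(annotated, (x, y), (x + w, y + h), (0, 0, 255), 2)
    return annotated, boxes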
Fig. 4(a) and 4(b) show experimental results of target recognition based on multi-dimensional image fusion using the present preferred embodiment. There are two ship targets in fig. 4(a) and one ship target in fig. 4(b). It can be seen that the invention achieves a good target identification effect thanks to the target identification method based on multi-dimensional image fusion.
The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.

Claims (1)

1. A target identification method based on multi-dimensional image fusion is characterized by comprising the following steps:
step 1: preprocessing the image; the method comprises the following steps:
step 11: calculating an image relative parameter transformation matrix;
when an identification command sent by the unmanned reconnaissance system device is received, the visible-light image is acquired through the corresponding sensor as the reference image g_B, and the infrared image as the candidate image g_C;
Assume the number of matching points is N, where N is at least 3. Taking N = 3, 3 feature points B_1, B_2 and B_3 are selected from the reference image; in the candidate image g_C, the 3 corresponding matching points C_1, C_2 and C_3 are selected;
For the reference image g_B and the candidate image g_C, N pairs of matching points are found in the two multi-source images; the relative parameter transformation matrix P_{C←B} between the reference image g_B and the candidate image g_C is calculated with the following least-squares formula:
P_{C←B} = C * B^T * (B * B^T)^(-1)
C = [C_1 C_2 … C_N] = [x_C1 x_C2 … x_CN; y_C1 y_C2 … y_CN; 1 1 … 1]
B = [B_1 B_2 … B_N] = [x_B1 x_B2 … x_BN; y_B1 y_B2 … y_BN; 1 1 … 1]
where C denotes the homogeneous coordinates of the matching points in the candidate image coordinate system and is a 3 × N matrix; B denotes the homogeneous coordinates of the matching points in the reference image coordinate system and is a 3 × N matrix; B^T is the transpose of B; and the relative parameter transformation matrix P_{C←B} is a 3 × 3 matrix;
step 12: transforming the candidate image into a reference image coordinate system;
Using inverse mapping, starting from the reference image g_B, the position in the candidate image g_C corresponding to each pixel point of the reference image g_B is solved through the transformation function; from the homogeneous coordinates of each point B_0 in the reference image g_B, the homogeneous coordinates of the corresponding point C_0 in g_C can be calculated according to the following formula:
C_0 = P_{C←B} * B_0, with B_0 = (x_B, y_B, 1)^T and C_0 = (x_C, y_C, 1)^T
where the horizontal coordinate x_B of B_0 takes values in (1, 2, …, W) and its vertical coordinate y_B in (1, 2, …, H); the horizontal coordinate x_C of C_0 takes values in (1, 2, …, W) and its vertical coordinate y_C in (1, 2, …, H);
the pixel gray value of the point C_0 in the candidate image g_C is assigned to the corresponding pixel point B_0 of the reference image g_B; the transformed candidate image g_{B←C} is thereby obtained, and the image g_{B←C} is taken as output and fused with the reference image g_B;
step 2: performing image fusion on the image based on the multi-source characteristics; the method comprises the following steps:
the reference image g_B and the transformed candidate image g_{B←C} are fused by adopting an image fusion algorithm with a pyramid decomposition structure based on multi-resolution analysis, comprising the following steps:
step 21: carrying out Gaussian pyramid decomposition on the image;
the reference image g_B serving as the source image is taken as G_0, the zero layer of the Gaussian pyramid; the l-th layer image G_l of the Gaussian pyramid is:
G_l(i, j) = Σ_{m=-2}^{2} Σ_{n=-2}^{2} w(m, n) · G_{l-1}(2i + m, 2j + n), 0 ≤ l ≤ N1, 0 ≤ i < C_l, 0 ≤ j < R_l
in the formula: N1 is the number of the top layer of the Gaussian pyramid; C_l is the number of columns of the l-th layer image of the Gaussian pyramid; R_l is the number of rows of the l-th layer image; w(m, n) is a 5 × 5 window function with a low-pass characteristic;
step 22: establishing the Laplace pyramid of the image;
from G_l, an interpolated and enlarged image G_l* is obtained:
G_l*(i, j) = 4 · Σ_{m=-2}^{2} Σ_{n=-2}^{2} w(m, n) · G_l((i + m)/2, (j + n)/2), 0 < l ≤ N1, 0 ≤ i < C_{l-1}, 0 ≤ j < R_{l-1}
where only the terms for which (i + m)/2 and (j + n)/2 are integers are accumulated;
in the formula, G_l* is the image obtained from G_l by interpolation and enlargement; its size is the same as that of G_{l-1}, but the gray values of the new pixels of G_l* interpolated between the original pixels are determined by a weighted average of the gray values of the original pixels; since G_l is obtained from G_{l-1} by low-pass filtering, i.e. G_l is a blurred, down-sampled G_{l-1}, the detail information of G_l* is less than that of G_{l-1};
thus, the decomposed image LP_l of each Laplace pyramid layer is obtained:
LP_l = G_l - G_{l+1}*, 0 ≤ l < N2; LP_{N2} = G_{N2}
where N2 is the number of the top layer of the Laplace pyramid, and LP_l represents the l-th layer image of the Laplace pyramid decomposition;
step 23: reconstructing the source image from the Laplace pyramid;
by transforming the above formula, the Gaussian pyramid and finally the source image G_0 are recovered from the top layer downwards:
G_{N2} = LP_{N2}; G_l = LP_l + G_{l+1}*, 0 ≤ l < N2
Step 24, fusing images based on L aplace pyramid decomposition;
and setting A and B as two source images and F as a fused image, wherein the fusion process is as follows:
241, performing L aplace pyramid decomposition on each source image to establish respective L aplace pyramids;
242, respectively fusing all decomposition layers of the image pyramid to obtain an L aplace pyramid of the fused image;
step 243, carrying out image reconstruction on the fused L aplace pyramid to obtain a final fused image;
step 3: carrying out target identification; the method comprises the following steps:
step 31: carrying out big data annotation;
for the M collected big-data images, rectangular areas are selected with a marking tool; the label of a background area is defined as 0 and the label of a target area as 1; the labeled data are divided into a training set and a verification set of a certain scale for training the deep learning model, realizing identification of the class-1 target; M is at least 12000;
step 32: carrying out data training;
training a target classification model by using the data set marked in the previous step;
step 33: performing real-time image fusion;
collecting a visible light image and an infrared image in real time, and fusing to obtain a fused image;
step 34: carrying out target identification and positioning;
the trained classification model is used to carry out ship target detection on the fusion image obtained in the previous step; the sizes and positions of all ship targets identified by the classification model are recorded, and the rectangular areas are marked on the image with rectangular identification frames.
CN202010165922.3A 2020-03-11 2020-03-11 Target identification method based on multi-dimensional image fusion Pending CN111401203A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010165922.3A CN111401203A (en) 2020-03-11 2020-03-11 Target identification method based on multi-dimensional image fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010165922.3A CN111401203A (en) 2020-03-11 2020-03-11 Target identification method based on multi-dimensional image fusion

Publications (1)

Publication Number Publication Date
CN111401203A true CN111401203A (en) 2020-07-10

Family

ID=71430666

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010165922.3A Pending CN111401203A (en) 2020-03-11 2020-03-11 Target identification method based on multi-dimensional image fusion

Country Status (1)

Country Link
CN (1) CN111401203A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113283411A (en) * 2021-07-26 2021-08-20 中国人民解放军国防科技大学 Unmanned aerial vehicle target detection method, device, equipment and medium
CN114842427A (en) * 2022-03-31 2022-08-02 南京邮电大学 Intelligent traffic-oriented complex multi-target self-adaptive detection method

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609928A (en) * 2012-01-12 2012-07-25 中国兵器工业第二0五研究所 Visual variance positioning based image mosaic method
CN103778616A (en) * 2012-10-22 2014-05-07 中国科学院研究生院 Contrast pyramid image fusion method based on area
US8885976B1 (en) * 2013-06-20 2014-11-11 Cyberlink Corp. Systems and methods for performing image fusion
CN104616273A (en) * 2015-01-26 2015-05-13 电子科技大学 Multi-exposure image fusion method based on Laplacian pyramid decomposition
CN106960428A (en) * 2016-01-12 2017-07-18 浙江大立科技股份有限公司 Visible ray and infrared double-waveband image co-registration Enhancement Method
CN107609601A (en) * 2017-09-28 2018-01-19 北京计算机技术及应用研究所 A kind of ship seakeeping method based on multilayer convolutional neural networks
CN109492700A (en) * 2018-11-21 2019-03-19 西安中科光电精密工程有限公司 A kind of Target under Complicated Background recognition methods based on multidimensional information fusion
CN109558848A (en) * 2018-11-30 2019-04-02 湖南华诺星空电子技术有限公司 A kind of unmanned plane life detection method based on Multi-source Information Fusion
CN110111581A (en) * 2019-05-21 2019-08-09 哈工大机器人(山东)智能装备研究院 Target identification method, device, computer equipment and storage medium
CN110322423A (en) * 2019-04-29 2019-10-11 天津大学 A kind of multi-modality images object detection method based on image co-registration

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609928A (en) * 2012-01-12 2012-07-25 中国兵器工业第二0五研究所 Visual variance positioning based image mosaic method
CN103778616A (en) * 2012-10-22 2014-05-07 中国科学院研究生院 Contrast pyramid image fusion method based on area
US8885976B1 (en) * 2013-06-20 2014-11-11 Cyberlink Corp. Systems and methods for performing image fusion
CN104616273A (en) * 2015-01-26 2015-05-13 电子科技大学 Multi-exposure image fusion method based on Laplacian pyramid decomposition
CN106960428A (en) * 2016-01-12 2017-07-18 浙江大立科技股份有限公司 Visible ray and infrared double-waveband image co-registration Enhancement Method
CN107609601A (en) * 2017-09-28 2018-01-19 北京计算机技术及应用研究所 A kind of ship seakeeping method based on multilayer convolutional neural networks
CN109492700A (en) * 2018-11-21 2019-03-19 西安中科光电精密工程有限公司 A kind of Target under Complicated Background recognition methods based on multidimensional information fusion
CN109558848A (en) * 2018-11-30 2019-04-02 湖南华诺星空电子技术有限公司 A kind of unmanned plane life detection method based on Multi-source Information Fusion
CN110322423A (en) * 2019-04-29 2019-10-11 天津大学 A kind of multi-modality images object detection method based on image co-registration
CN110111581A (en) * 2019-05-21 2019-08-09 哈工大机器人(山东)智能装备研究院 Target identification method, device, computer equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
李良福 et al.: "Intelligent target recognition of electro-optical *** based on deep learning", Acta Armamentarii (兵工学报), vol. 43, pp. 162-168 *
江海军 et al.: "Application of Laplacian pyramid fusion in infrared nondestructive testing", Infrared Technology (红外技术), vol. 41, no. 12, pp. 1151-1155 *
韩潇 et al.: "Image fusion method based on an improved Laplacian pyramid", Automation & Instrumentation (自动化与仪器仪表), no. 5, pp. 191-194 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113283411A (en) * 2021-07-26 2021-08-20 中国人民解放军国防科技大学 Unmanned aerial vehicle target detection method, device, equipment and medium
CN113283411B (en) * 2021-07-26 2022-01-28 中国人民解放军国防科技大学 Unmanned aerial vehicle target detection method, device, equipment and medium
CN114842427A (en) * 2022-03-31 2022-08-02 南京邮电大学 Intelligent traffic-oriented complex multi-target self-adaptive detection method

Similar Documents

Publication Publication Date Title
CN111862126A (en) Non-cooperative target relative pose estimation method combining deep learning and geometric algorithm
CN110728200A (en) Real-time pedestrian detection method and system based on deep learning
CN112883850B (en) Multi-view space remote sensing image matching method based on convolutional neural network
CN111582232A (en) SLAM method based on pixel-level semantic information
CN109977834B (en) Method and device for segmenting human hand and interactive object from depth image
CN110598564A (en) OpenStreetMap-based high-spatial-resolution remote sensing image transfer learning classification method
CN113160150A (en) AI (Artificial intelligence) detection method and device for invasion of foreign matters in wire network based on multi-type sample fusion and multi-complex network
CN115457396B (en) Surface target ground object detection method based on remote sensing image
CN115861619A (en) Airborne LiDAR (light detection and ranging) urban point cloud semantic segmentation method and system of recursive residual double-attention kernel point convolution network
CN111401203A (en) Target identification method based on multi-dimensional image fusion
CN114639115A (en) 3D pedestrian detection method based on fusion of human body key points and laser radar
CN116994135A (en) Ship target detection method based on vision and radar fusion
Sun et al. IRDCLNet: Instance segmentation of ship images based on interference reduction and dynamic contour learning in foggy scenes
CN117274627A (en) Multi-temporal snow remote sensing image matching method and system based on image conversion
Ji et al. Dbenet: Dual-branch ensemble network for sea-land segmentation of remote sensing images
Zhang et al. Optical and SAR image dense registration using a robust deep optical flow framework
CN114067273A (en) Night airport terminal thermal imaging remarkable human body segmentation detection method
Narayanan et al. A multi-purpose realistic haze benchmark with quantifiable haze levels and ground truth
Li et al. Progressive attention-based feature recovery with scribble supervision for saliency detection in optical remote sensing image
CN116486238B (en) Target fine granularity identification method combining point set representation and graph classification
Jiang et al. Semantic segmentation network combined with edge detection for building extraction in remote sensing images
CN109117852B (en) Unmanned aerial vehicle image adaptation area automatic extraction method and system based on sparse representation
CN113537397B (en) Target detection and image definition joint learning method based on multi-scale feature fusion
Zheng et al. Multiscale fusion network for rural newly constructed building detection in unmanned aerial vehicle imagery
Liangjun et al. MSFA-YOLO: A Multi-Scale SAR Ship Detection Algorithm Based on Fused Attention

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination