CN114648475A - Infrared and visible light image fusion method and system based on low-rank sparse representation - Google Patents

Infrared and visible light image fusion method and system based on low-rank sparse representation

Info

Publication number
CN114648475A
CN114648475A (application CN202210246475.3A)
Authority
CN
China
Prior art keywords
image
infrared
visible light
fusion
representing
Prior art date
Legal status
Pending
Application number
CN202210246475.3A
Other languages
Chinese (zh)
Inventor
陶体伟
刘明霞
王琳琳
王倩倩
刘明慧
杨德运
Current Assignee
Taishan University
Original Assignee
Taishan University
Priority date
Filing date
Publication date
Application filed by Taishan University filed Critical Taishan University
Priority to CN202210246475.3A priority Critical patent/CN114648475A/en
Publication of CN114648475A publication Critical patent/CN114648475A/en
Pending legal-status Critical Current

Classifications

    • G06T 5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06F 18/253: Fusion techniques of extracted features
    • G06N 3/045: Combinations of networks
    • G06T 2207/10024: Color image
    • G06T 2207/10048: Infrared image
    • G06T 2207/20084: Artificial neural networks [ANN]
    • G06T 2207/20221: Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention provides an infrared and visible light image fusion method based on low-rank sparse representation, comprising the following steps: taking the infrared and visible light images as inputs of different tasks and decomposing them into a salient part and a base part with a sparse-consistency latent low-rank representation decomposition module; further mining the decomposed base part with a deep neural network model to extract more effective fusion features; and, in an image fusion module, applying different fusion strategies to the base part and the salient part to obtain the final fused image. The invention further provides an infrared and visible light image fusion system based on low-rank sparse representation. The final fused image exploits the complementary information of the multi-modal sensor image data, so that it has high definition and distinct layers, and the salient regions and targets in the scene are more prominent.

Description

Infrared and visible light image fusion method and system based on low-rank sparse representation
Technical Field
The invention relates to the technical field of image fusion, in particular to an infrared and visible light image fusion method and system based on low-rank sparse representation.
Background
Image fusion is an image enhancement technique that combines images from different sensors to generate a fused image with complementary information and a salient visual effect, i.e. better target saliency together with comprehensive, detailed background information. It can therefore provide reliable and information-rich images for advanced vision tasks in fields such as object detection, computer-aided diagnosis and semantic segmentation.
Generally, image fusion methods are classified into three main categories: (1) methods based on multi-scale decomposition; (2) methods based on sparse representation; (3) methods based on deep learning. Multi-scale decomposition is the most widely studied and applied approach in the image fusion field; it decomposes an image into low-frequency and high-frequency sub-bands and then performs weighted fusion. However, such methods easily cause the pseudo-Gibbs phenomenon because they lack shift invariance. The key to sparse-representation-based methods is how to construct redundant dictionaries, and dictionary learning is very time-consuming. Deep-learning-based fusion methods mainly use the deep features of the source images to generate the fused image, but rarely pay attention to image decomposition.
In order to obtain a fused image with rich image information and a good visual effect, the latent low-rank representation (LatLRR) method has been widely applied in the field of infrared and visible light image fusion, because it can effectively decompose an image into a salient part and a base part. LatLRR-based algorithms focus on decomposing the images and then combine the decomposed base and salient parts with averaging and summing strategies to obtain the final fused image. Meanwhile, with the rise of deep learning, methods that combine LatLRR with deep neural networks have also achieved considerable success.
However, existing LatLRR-based image fusion methods do not take into account that the infrared and visible light images are obtained by different sensors at the same location. When image fusion is carried out, the infrared image and the visible light image are decomposed separately and the spatial consistency relationship between them is ignored, so that additional feature information cannot be captured effectively. How to obtain an image with a better fusion effect therefore remains a major challenge in the field of infrared and visible light image fusion.
Disclosure of Invention
Aiming at the technical problems, the technical scheme adopted by the invention is as follows:
An embodiment of the invention provides an infrared and visible light image fusion method based on low-rank sparse representation, which comprises the following steps:
S10, decomposing the infrared image and the visible light image respectively by using a sparse-consistency latent low-rank representation decomposition model;
the sparse-consistency latent low-rank representation decomposition model satisfies the following conditions:

(1)  min_{Z_m, L_m, E_m}  Σ_{m=1}^{2} ( ||Z_m||_* + ||L_m||_* + γ ||E_m||_{2,1} ) + α ||Z||_{2,1}

(2)  s.t.  X_m = X_m Z_m + L_m X_m + E_m

wherein X_m represents a source image, Z_m represents the low-rank matrix of the image, L_m represents the saliency matrix of the image, E_m represents the noise part of the image, and m = 1, 2 denotes the infrared image and the visible light image, respectively; ||Z_m||_* represents the nuclear norm of Z_m and ||L_m||_* represents the nuclear norm of L_m; ||Z||_{2,1} represents a simultaneous L_{2,1}-norm regularization constraint on the low-rank matrix of the infrared image and the low-rank matrix of the visible light image; ||E_m||_{2,1} represents an L_{2,1}-norm regularization constraint applied separately to the noise part of the infrared image and the noise part of the visible light image; γ is a noise-balancing parameter and α is a regularization balance coefficient;
S20, solving the sparse-consistency latent low-rank representation decomposition model by an alternating direction method of multipliers to obtain the base part F_a1 and the salient part F_b1 of the infrared image and the base part F_a2 and the salient part F_b2 of the visible light image, respectively;
S30, performing depth feature extraction and fusion on F_a1 and F_a2 by using a deep neural network model to obtain a base-part fused image F_a;
S40, fusing F_b1 and F_b2 to obtain a salient-part fused image F_b;
S50, fusing F_a and F_b to obtain the infrared and visible light fused image.
Another embodiment of the present invention provides an infrared and visible light image fusion system based on low-rank sparse representation, comprising an image decomposition module, a feature extraction module and an image fusion module;
the image decomposition module is used for decomposing the infrared image and the visible light image by using a sparse-consistency latent low-rank representation decomposition model, and solving the model by an alternating direction method of multipliers to obtain the base part F_a1 and the salient part F_b1 of the infrared image and the base part F_a2 and the salient part F_b2 of the visible light image, respectively;
wherein the sparse-consistency latent low-rank representation decomposition model satisfies the following conditions:

min_{Z_m, L_m, E_m}  Σ_{m=1}^{2} ( ||Z_m||_* + ||L_m||_* + γ ||E_m||_{2,1} ) + α ||Z||_{2,1}

s.t.  X_m = X_m Z_m + L_m X_m + E_m

wherein X_m represents a source image, Z_m represents the low-rank matrix of the image, L_m represents the saliency matrix of the image, E_m represents the noise part of the image, and m = 1, 2 denotes the infrared image and the visible light image, respectively; ||Z_m||_* and ||L_m||_* represent the nuclear norms of Z_m and L_m; ||Z||_{2,1} represents a simultaneous L_{2,1}-norm regularization constraint on the low-rank matrices of the infrared and visible light images; ||E_m||_{2,1} represents an L_{2,1}-norm regularization constraint applied separately to the noise parts of the infrared and visible light images; γ is a noise-balancing parameter and α is a regularization balance coefficient;
the feature extraction module is used for performing depth feature extraction and fusion on F_a1 and F_a2 with a deep neural network model to obtain a base-part fused image F_a;
the image fusion module is used for fusing F_b1 and F_b2 to obtain a salient-part fused image F_b, and fusing F_a and F_b to obtain the infrared and visible light fused image.
According to the low-rank sparse representation-based infrared and visible light image fusion method and system, the infrared and visible light images are first taken as inputs of different tasks and decomposed by the sparse-consistency latent low-rank representation decomposition module. During the decomposition, the rank is constrained with the L_{2,1} norm to maintain rank-sparse consistency, and the low-rank consensus representations of the infrared and visible light images are obtained synchronously. Secondly, the base part is further mined with a deep neural network model to extract more effective fusion features. Finally, in the image fusion module, different fusion strategies are applied to the base part and the salient part: the salient part is fused with a weighted summation strategy, the base part is fused with a maximum selection strategy, and the resulting base part and salient part are summed to obtain the final fused image. In other words, the final fused image exploits the complementary information of the multi-modal sensor image data, so that it has high definition and distinct layers, and the salient regions and targets in the scene are more prominent. In addition, the synchronous low-rank decomposition of the multi-modal images improves the running efficiency of the algorithm.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic flow chart of an infrared and visible light image fusion method based on low-rank sparse representation according to an embodiment of the present invention;
fig. 2a to 2c are fused image diagrams obtained by using the low-rank sparse representation-based infrared and visible light image fusion method provided by the embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments of the present invention, are within the scope of protection of the present invention.
Fig. 1 is a schematic flow chart of an infrared and visible light image fusion method based on low-rank sparse representation according to an embodiment of the present invention. As shown in fig. 1, an infrared and visible light image fusion method based on low-rank sparse representation according to an embodiment of the present invention may include the following steps:
S10, decomposing the infrared image and the visible light image respectively by using a sparse-consistency latent low-rank representation decomposition model.
In the embodiment of the present invention, the sparse-consistency latent low-rank representation decomposition model satisfies the following conditions:

(1)  min_{Z_m, L_m, E_m}  Σ_{m=1}^{2} ( ||Z_m||_* + ||L_m||_* + γ ||E_m||_{2,1} ) + α ||Z||_{2,1}

(2)  s.t.  X_m = X_m Z_m + L_m X_m + E_m

wherein X_m represents a source image, Z_m represents the low-rank matrix of the image, L_m represents the saliency matrix of the image, E_m represents the noise part of the image, and m = 1, 2 denotes the infrared image and the visible light image, respectively; ||Z_m||_* represents the nuclear norm of Z_m and ||L_m||_* represents the nuclear norm of L_m; ||Z||_{2,1} represents a simultaneous L_{2,1}-norm regularization constraint on the low-rank matrix of the infrared image and the low-rank matrix of the visible light image; ||E_m||_{2,1} represents an L_{2,1}-norm regularization constraint applied separately to the noise part of the infrared image and the noise part of the visible light image. γ is a noise-balancing parameter used to balance the influence of noise, and α is a regularization balance coefficient used to constrain Z_m and L_m. γ and α can be empirical values; in one exemplary embodiment, both γ and α can be selected from {0.001, 0.01, 0.1, 1, 10, 100}, and preferably γ and α are both equal to 100.

Through the above conditions (1) and (2), the base part X_m Z_m, the salient part L_m X_m and the noise part E_m of each image can be obtained.
In the embodiment of the invention, the latent low-rank representation method is improved: the improved Scc-LatLRR method makes full use of the common information among data of different modalities, and the double-rank (Z, L) parts of the input images are solved with a sparse consistency constraint applied to the shared part. The invention applies the sparse consistency constraint to the matrix Z of the double rank, because constraining Z allows more spatial-consistency information to be captured across the matrices. The reason why L is not constrained is that, once the matrix Z has captured a certain amount of spatial-consistency information, and given that the information content of an image is limited, the Scc-LatLRR constraint X = XZ + LX + E causes the information content of the salient part to increase, and the salient part is the key part in image fusion.
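As an illustration of how the solved decomposition is used downstream, the short sketch below recombines a solved pair (Z, L) into the base part XZ, the salient part LX and the residual noise for one image. It is not part of the original patent text; the function and variable names are assumptions, and the solver that produces Z and L is the one described in S20 below.

```python
import numpy as np

def split_base_salient(X, Z, L):
    """Recombine a solved Scc-LatLRR decomposition X = X Z + L X + E.

    X : (n, n) source image treated as a data matrix
    Z : (n, n) low-rank coefficient matrix
    L : (n, n) saliency projection matrix
    Returns the base part X Z, the salient part L X and the noise part E.
    """
    base = X @ Z        # base part: a smoothed version of the source image
    salient = L @ X     # salient part: structures and targets
    noise = X - base - salient
    return base, salient, noise

# Hypothetical usage for the infrared image once (Z_ir, L_ir) have been solved:
# Fa1, Fb1, E1 = split_base_salient(X_ir, Z_ir, L_ir)
```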
S20, solving the sparse-consistency latent low-rank representation decomposition model by an alternating direction method of multipliers to obtain the base part F_a1 and the salient part F_b1 of the infrared image and the base part F_a2 and the salient part F_b2 of the visible light image, respectively.
In the model training phase, the invention uses the Alternating Direction Method of Multipliers (ADMM) to optimize the sparse-consistency latent low-rank representation decomposition model. Specifically, by constructing an augmented Lagrangian function, a large global problem is decomposed into several smaller, easily solved local sub-problems, and the solution of the global problem is obtained by coordinating the solutions of the sub-problems. The specific solving process comprises the following steps:
S201, obtaining an augmented Lagrangian function model defined by the following conditions (3) and (4), in which J_m, S_m and K_m are auxiliary variables:

(3)  [formula image not reproduced in the text: the augmented Lagrangian function of conditions (1) and (2)]

(4)  s.t.  X_m = X_m S_m + K_m X_m + E_m,  Z_m = J_m,  Z_m = S_m,  L_m = K_m
S202, iteratively and alternately updating each variable according to the following conditions (5) to (11) until a convergence condition is met:

(5)–(11)  [formula images not reproduced in the text: the closed-form updates of the auxiliary and primal variables J_m, S_m, K_m, Z_m, L_m and E_m at the (k+1)-th iteration]
wherein k is the iteration index, k takes values from 1 to N, N is the total number of iterations, and ρ is the iteration step size; preferably, ρ = 1.1. u^k is the balance factor at the k-th iteration; its initial value is set to 0.0001, and it grows slowly as the iterations proceed until u_max = 10^10 is reached, after which it stops growing.
[Formula images not reproduced in the text: the augmented Lagrange multipliers at the k-th iteration.] I is an identity matrix, and X_m' represents the transpose of X_m. G is the matrix formed by pulling the low-rank matrices derived from the individual images into row vectors and stacking them together, and n denotes the matrix dimension. R(·) denotes the operation that reshapes each 1 × n² row vector of Z^{k+1} into an n × n matrix; specifically, after Z^{k+1} has been solved under the L_{2,1}-norm regularization constraint, each row vector of Z^{k+1} is reshaped back into an n × n low-rank matrix.
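To illustrate the L_{2,1}-norm regularization step and the reshaping operation R(·) just described, the sketch below applies the standard column-wise shrinkage operator that solves min_Z τ||Z||_{2,1} + ½||Z − Q||_F² and then reshapes each row back into an n × n matrix. This is a generic illustration under the assumption that the Z-update reduces to such a shrinkage; the names prox_l21 and reshape_rows are not from the patent.

```python
import numpy as np

def prox_l21(Q, tau):
    """Column-wise shrinkage: the proximal operator of tau * ||Z||_{2,1}.

    Each column of Q is scaled towards zero by its l2 norm; columns whose
    norm is below tau are set to zero. This is the standard closed form for
        argmin_Z  tau * ||Z||_{2,1} + 0.5 * ||Z - Q||_F**2
    """
    norms = np.linalg.norm(Q, axis=0, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return Q * scale

def reshape_rows(Z_stacked, n):
    """R(.): reshape each 1 x n^2 row of the stacked matrix back to n x n."""
    return [row.reshape(n, n) for row in Z_stacked]

# Hypothetical usage: G stacks the vectorized low-rank matrices of the two
# modalities as rows, and alpha/u plays the role of the shrinkage threshold.
# G = np.vstack([Z_ir.reshape(1, -1), Z_vis.reshape(1, -1)])
# Z_new = prox_l21(G, alpha / u)
# Z1_new, Z2_new = reshape_rows(Z_new, n)
```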
In the embodiment of the present invention, the closed-form solutions of the nuclear-norm sub-problems are obtained by singular value decomposition (SVD); the closed-form solution of the second such sub-problem is obtained in the same way. [The explicit expressions are given as formula images in the original publication and are not reproduced here.]
The above condition (8) can be obtained by the following procedure: the partial derivative of the augmented Lagrangian with respect to the corresponding variable is computed and set equal to zero, which yields the update of condition (8). Similarly, the above condition (9) is obtained by setting the partial derivative with respect to its variable to zero. [The intermediate expressions are given as formula images in the original publication and are not reproduced here.]

S203, updating the augmented Lagrange multipliers according to the following conditions (12) to (15):

(12)–(15)  [formula images not reproduced in the text: each of the multipliers W_m, M_m, Y_m and V_m is updated by adding u^k times the residual of its constraint]

S204, updating the balance factor according to the following condition (16):

(16)  u^{k+1} = min(ρ u^k, u_max)

In the embodiment of the present invention, preferably, u_max = 10^10.

Further, in the embodiment of the present invention, the convergence condition satisfies:

[formula images not reproduced in the text: the residuals of the equality constraints must all fall below a preset threshold ε]

ε is a preset value; preferably, ε = 10^-6.

Further, in the embodiment of the present invention, the initial values of Z_m, J_m, S_m, L_m, K_m, E_m, W_m, M_m, Y_m and V_m may all be set to 0.
S30, performing depth feature extraction and fusion on F_a1 and F_a2 by using a deep neural network model to obtain a base-part fused image F_a.
The base part produced by the LatLRR decomposition is similar to a smoothed version of the source image and contains most of the effective image information. To fuse it better, the present invention introduces a deep neural network model for further feature mining.
In an exemplary embodiment of the invention, the deep neural network model may be a convolutional neural network VGG model. The convolution groups extract image features through convolution operations, and as the number of convolution groups increases, more abstract features are extracted. Because the size of the feature maps output by the 5th convolution group of the VGG19 network differs too much from that of the detail content map of the source image, the weight maps are constructed from the outputs of the first 4 convolution groups; that is, the first four convolution groups are used to fuse the base parts of the infrared and visible light images.
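As an illustration of how the outputs of the first four convolution groups could be collected, the sketch below walks the feature layers of a pretrained torchvision VGG19 and records an activation after each of the first four groups. The layer indices, the use of torchvision and the replication of a single-channel base part to three channels are all assumptions made for this example; the patent only states that the first four convolution groups are used.

```python
import torch
import torchvision.models as models

# Indices into vgg19.features after which an activation is recorded,
# one per convolution group (assumed cut points, not specified by the patent).
GROUP_END_INDICES = [3, 8, 17, 26]

def vgg_group_features(image_1chw):
    """Return the multi-channel feature maps of the first 4 convolution groups.

    image_1chw: float tensor of shape (1, 3, H, W); a single-channel base
    part would be replicated to 3 channels before calling this.
    """
    vgg = models.vgg19(weights=models.VGG19_Weights.DEFAULT).features.eval()
    feats, x = [], image_1chw
    with torch.no_grad():
        for idx, layer in enumerate(vgg):
            x = layer(x)
            if idx in GROUP_END_INDICES:
                feats.append(x)          # feature map of convolution group i
            if idx >= GROUP_END_INDICES[-1]:
                break
    return feats
```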
Further, S30 may include:
S301, inputting F_a1 and F_a2 into the convolutional neural network VGG model.
In an embodiment of the invention, the convolutional neural network VGG model may be a trained model. The training of the convolutional neural network VGG model may employ existing techniques.

S302, obtaining the multi-channel feature maps Φ_1^i and Φ_2^i output by convolution group i of the convolutional neural network VGG model, where i takes values from 1 to 4 and k is the number of channels of the feature map output by convolution group i;

S302, obtaining the single-channel feature maps F̂_1^i and F̂_2^i of convolution group i according to the following conditions (17) and (18):

(17)  F̂_1^i(x, y) = || Φ_1^i(x, y) ||_1

(18)  F̂_2^i(x, y) = || Φ_2^i(x, y) ||_1

wherein F̂_1^i(x, y) and F̂_2^i(x, y) respectively represent the values of F̂_1^i and F̂_2^i at position (x, y), and Φ_m^i(x, y) denotes the k-dimensional feature vector at position (x, y). Conditions (17) and (18) compress the multi-channel feature maps with the L1 norm to obtain the single-channel feature maps F̂_1^i and F̂_2^i.

S303, smoothing F̂_1^i and F̂_2^i according to the following conditions (19) and (20) to obtain the smoothed single-channel feature maps F̄_1^i and F̄_2^i:

(19)  F̄_1^i(x, y) = (1 / (2r + 1)^2) Σ_{p=-r}^{r} Σ_{q=-r}^{r} F̂_1^i(x + p, y + q)

(20)  F̄_2^i(x, y) = (1 / (2r + 1)^2) Σ_{p=-r}^{r} Σ_{q=-r}^{r} F̂_2^i(x + p, y + q)

where r is a preset region size; preferably, r = 1. Conditions (19) and (20) perform a region-mean operation, i.e. they smooth the single-channel feature maps so that the fused image looks more natural.

S304, obtaining the first infrared weight map W_1^i and the first visible light weight map W_2^i according to the following conditions (21) and (22):

(21)  W_1^i(x, y) = F̄_1^i(x, y) / ( F̄_1^i(x, y) + F̄_2^i(x, y) )

(22)  W_2^i(x, y) = F̄_2^i(x, y) / ( F̄_1^i(x, y) + F̄_2^i(x, y) )

The weights obtained through conditions (21) and (22) are initial normalized weights.

S305, obtaining the second infrared weight map Ŵ_1^i and the second visible light weight map Ŵ_2^i according to the following conditions (23) and (24):

(23)  Ŵ_1^i(x + p, y + q) = W_1^i(x, y),  p, q ∈ {0, 1, …, 2^{i-1} − 1}

(24)  Ŵ_2^i(x + p, y + q) = W_2^i(x, y),  p, q ∈ {0, 1, …, 2^{i-1} − 1}

Since the pooling layers of the convolutional neural network downsample the data, the feature maps become smaller from one convolution group to the next. In order to obtain a consistent fusion size, the weight maps are adjusted by upsampling through conditions (23) and (24), so that they have the same size as the base parts of the decomposed images.

S306, obtaining the fused image F_a^i produced by convolution group i from F_a1, F_a2 and the weight maps Ŵ_1^i and Ŵ_2^i [the combination formula is given as a formula image in the original publication], so that 4 fused images are obtained, one per convolution group.
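The sketch below strings steps S302 to S306 together for one convolution group: L1-norm compression across channels, a box mean of radius r, normalization into weight maps, nearest-neighbour upsampling by 2^{i-1}, and a weighted combination of the two base parts. It is an illustrative reading of conditions (17)–(24) in numpy, not the patent's reference implementation; in particular, the final weighted combination is an assumption about the formula image in S306.

```python
import numpy as np

def fuse_base_parts_one_group(feat_ir, feat_vis, base_ir, base_vis, i, r=1):
    """Fuse the base parts using the features of convolution group i.

    feat_ir, feat_vis : (k, h, w) multi-channel feature maps of group i
    base_ir, base_vis : (H, W) base parts of the infrared / visible images
    i                 : index of the convolution group (1..4)
    r                 : radius of the region-mean window (r=1 -> 3x3 mean)
    """
    # (17)-(18): L1-norm over the channel axis -> single-channel activity maps
    act_ir = np.abs(feat_ir).sum(axis=0)
    act_vis = np.abs(feat_vis).sum(axis=0)

    # (19)-(20): region mean with a (2r+1)x(2r+1) box filter
    def box_mean(a):
        pad = np.pad(a, r, mode="edge")
        out = np.zeros_like(a)
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                out += pad[r + dy : r + dy + a.shape[0],
                           r + dx : r + dx + a.shape[1]]
        return out / (2 * r + 1) ** 2
    act_ir, act_vis = box_mean(act_ir), box_mean(act_vis)

    # (21)-(22): normalized weight maps
    denom = act_ir + act_vis + 1e-12
    w_ir, w_vis = act_ir / denom, act_vis / denom

    # (23)-(24): replicate each weight over a 2^(i-1) x 2^(i-1) block
    # (assumes the upsampled map covers the base-part size before cropping)
    s = 2 ** (i - 1)
    w_ir = np.kron(w_ir, np.ones((s, s)))[: base_ir.shape[0], : base_ir.shape[1]]
    w_vis = np.kron(w_vis, np.ones((s, s)))[: base_ir.shape[0], : base_ir.shape[1]]

    # weighted combination of the two base parts (one fused image per group)
    return w_ir * base_ir + w_vis * base_vis
```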
S307, obtaining MI^i according to the following conditions (25) to (27):
(25)  MI^i = MI_{X1}^i + MI_{X2}^i

(26)  MI_{X1}^i = Σ_{x1, f} P_{X1, Fa^i}(x1, f) · log( P_{X1, Fa^i}(x1, f) / ( P_{X1}(x1) · P_{Fa^i}(f) ) )

(27)  MI_{X2}^i = Σ_{x2, f} P_{X2, Fa^i}(x2, f) · log( P_{X2, Fa^i}(x2, f) / ( P_{X2}(x2) · P_{Fa^i}(f) ) )

wherein MI^i represents the amount of source-image information contained in the fused image F_a^i; that is, the size of this index value reflects how much source-image information the fused image carries, and it measures the information correlation between the source images and the fused image. MI_{X1}^i and MI_{X2}^i respectively represent the amounts of information of the infrared image and of the visible light image contained in F_a^i. P(X1), P(X2) and P(Fa^i) respectively represent the normalized marginal (edge) histograms of the infrared source image X1, the visible light source image X2 and the fused image F_a^i. P_{X1, Fa^i} and P_{X2, Fa^i} respectively represent the normalized joint gray-level histograms of the infrared source image X1 and of the visible light source image X2 with the fused image F_a^i.

S308, obtaining MI_max = max_i MI^i and designating the fused image corresponding to MI_max as F_a.

A larger MI value indicates that the fused image contains more source-image information and that the fusion performance is better. Thus, the final base part F_a is obtained from MI^i (i = 1, …, 4) according to the maximum-selection principle.
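A minimal sketch of the mutual-information criterion of S307 and S308, computed from normalized joint and marginal gray-level histograms; the bin count and the image value range handling are assumptions.

```python
import numpy as np

def mutual_information(img_a, img_b, bins=256):
    """MI between two images from their normalized joint gray-level histogram."""
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)   # marginal histogram of img_a
    p_b = p_ab.sum(axis=0, keepdims=True)   # marginal histogram of img_b
    nz = p_ab > 0
    return float((p_ab[nz] * np.log(p_ab[nz] / (p_a @ p_b)[nz])).sum())

def select_base_fusion(candidates, x_ir, x_vis):
    """Pick, among the per-group fused images, the one with the largest
    MI^i = MI(X1, F^i) + MI(X2, F^i)."""
    scores = [mutual_information(x_ir, f) + mutual_information(x_vis, f)
              for f in candidates]
    return candidates[int(np.argmax(scores))]
```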
In this embodiment, depth feature extraction is performed with the convolutional neural network model on the base parts of the two images decomposed in the previous step; since the base-part information contains the consistency information of the two images, continuing to mine it enhances, to a certain extent, the fusion quality of the base-part images.
In another embodiment of the present invention, the deep neural network model may employ a residual network (ResNet50) model.

S40, fusing F_b1 and F_b2 to obtain a salient-part fused image F_b.

The manner in which F_b1 and F_b2 are fused into the salient-part fused image F_b may be any existing technique known to those skilled in the art. For example, in one exemplary embodiment, F_b = F_b1 + F_b2. In another exemplary embodiment, F_b = k1·F_b1 + k2·F_b2, where k1 and k2 are the weight coefficients of F_b1 and F_b2, respectively.

S50, fusing F_a and F_b to obtain the infrared and visible light fused image.

In the embodiment of the invention, the image fusion can be carried out with a weighted summation strategy, i.e. F = F_a + F_b.
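Putting S10 to S50 together, the following outline (illustrative only; the decomposition and VGG helpers are the sketches given earlier, and their exact forms are assumptions wherever the patent supplies formula images) shows the end-to-end flow from the two source images to the fused result F = F_a + F_b.

```python
def fuse_infrared_visible(x_ir, x_vis, decompose, fuse_base):
    """End-to-end flow of S10-S50.

    decompose : callable(x_ir, x_vis) -> (Fa1, Fb1, Fa2, Fb2), the
                sparse-consistency LatLRR decomposition of S10-S20
    fuse_base : callable(Fa1, Fa2) -> Fa, the VGG-based base-part fusion of S30
    """
    Fa1, Fb1, Fa2, Fb2 = decompose(x_ir, x_vis)   # S10-S20
    Fa = fuse_base(Fa1, Fa2)                      # S30
    Fb = Fb1 + Fb2                                # S40 (simple summation variant)
    return Fa + Fb                                # S50: weighted summation with unit weights
```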
[ Examples ]
In an embodiment of the invention, the infrared and visible fusion dataset uses TNO, available at https://figshare.com/articles/Dataset/TNO_Image_Fusion_Dataset/1008029. From it, 21 pairs of infrared and visible images were selected as the original images, based on image quality and frequency of appearance in the literature. Evaluation is carried out with the information-theory-based indexes CE, EN, MI, FMIdct and Qabf, the information-theory-based human visual perception index VIF, the index SCD based on the source and generated images, and the human visual sensitivity index Qcb. For all indexes except CE, a larger value generally means a better fusion effect.
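As one concrete example of these indexes, the sketch below computes EN, the Shannon entropy of the fused image's gray-level histogram; the formula used here is the usual definition and an assumption, since the patent does not spell it out.

```python
import numpy as np

def entropy_en(img, bins=256):
    """EN: Shannon entropy of the normalized gray-level histogram (bits)."""
    hist, _ = np.histogram(img.ravel(), bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())
```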
All experiments were performed in a hardware and software environment of Matlab 2018a on a Windows 10 system with an Intel Core i5-8400 CPU and 32 GB of RAM. Taking the images shown in fig. 2a and fig. 2b as an example (corresponding to the 14th image in Table 1, with a size of 768 × 576), the running times of the existing LatLRR method and of the improved LatLRR method of the present invention (the Scc-LatLRR method) are as shown in Table 1. As can be seen from Table 1, the running time of the conventional LatLRR method is 164.263553 seconds, while that of the Scc-LatLRR method of the present invention is 64.060377 seconds, i.e. roughly 2.5 times faster. The averages of the running times also show that the method of the present invention achieves higher running efficiency. The fused image obtained by the method of the present invention is shown in fig. 2c. The evaluation values of the images processed by the LatLRR method and by the improved Scc-LatLRR method of the present invention are shown in Table 2. As can be seen from Table 2, every evaluation index of the image obtained by the invention is better than that of the LatLRR method.
Table 1: running time on the dataset pictures

[The contents of Table 1 are provided as images in the original publication and are not reproduced here.]
Table 2: evaluation index values

Method       EN        MI         VIF       Qabf      FMIdct    SCD       Qcb       CE
LatLRR       6.19917   12.39833   0.30237   0.34063   0.30742   1.60412   0.48911   1.50908
Scc-LatLRR   6.76293   13.52587   0.89563   0.48190   0.40875   1.69510   0.51472   1.30792
The application scenarios of the low-rank sparse representation-based infrared and visible light image fusion method provided by the embodiment of the invention include military reconnaissance, medical diagnosis, artificial intelligence and other fields. Taking military reconnaissance as an example, fusing the infrared image and the visible light image with the fusion method provided by the invention can give combat personnel a realistic, reliable and distinct basis for judging targets, thereby improving combat efficiency.
Another embodiment of the present invention further provides a low-rank sparse representation-based infrared and visible light image fusion system, comprising an image decomposition module, a feature extraction module and an image fusion module;

the image decomposition module is used for decomposing the infrared image and the visible light image by using a sparse-consistency latent low-rank representation decomposition model, and solving the model by an alternating direction method of multipliers to obtain the base part F_a1 and the salient part F_b1 of the infrared image and the base part F_a2 and the salient part F_b2 of the visible light image, respectively;

wherein the sparse-consistency latent low-rank representation decomposition model satisfies the following conditions:

min_{Z_m, L_m, E_m}  Σ_{m=1}^{2} ( ||Z_m||_* + ||L_m||_* + γ ||E_m||_{2,1} ) + α ||Z||_{2,1}

s.t.  X_m = X_m Z_m + L_m X_m + E_m

wherein X_m represents a source image, Z_m represents the low-rank matrix of the image, L_m represents the saliency matrix of the image, E_m represents the noise part of the image, and m = 1, 2 denotes the infrared image and the visible light image, respectively; ||Z_m||_* and ||L_m||_* represent the nuclear norms of Z_m and L_m; ||Z||_{2,1} represents a simultaneous L_{2,1}-norm regularization constraint on the low-rank matrices of the infrared and visible light images; ||E_m||_{2,1} represents an L_{2,1}-norm regularization constraint applied separately to the noise parts of the infrared and visible light images; γ is a noise-balancing parameter and α is a regularization balance coefficient;

the feature extraction module is used for performing depth feature extraction and fusion on F_a1 and F_a2 with a deep neural network model to obtain a base-part fused image F_a;

the image fusion module is used for fusing F_b1 and F_b2 to obtain a salient-part fused image F_b, and fusing F_a and F_b to obtain the infrared and visible light fused image.
Further, the image decomposition module is specifically configured to perform the following operations:
s1, obtaining an augmented Lagrangian function model defined by the following conditions:
[formula image not reproduced in the text: the augmented Lagrangian function, in which J_m, S_m and K_m are auxiliary variables]

s.t.  X_m = X_m S_m + K_m X_m + E_m,  Z_m = J_m,  Z_m = S_m,  L_m = K_m

S2, iteratively and alternately updating each variable according to the following conditions until a convergence condition is met:

[formula images not reproduced in the text: the closed-form updates of the variables J_m, S_m, K_m, Z_m, L_m and E_m at the (k+1)-th iteration]

wherein k is the iteration index, k takes values from 1 to N, N is the total number of iterations, ρ is the iteration step size, and u^k is the balance factor at the k-th iteration; the augmented Lagrange multipliers at the k-th iteration are given as formula images; I is an identity matrix and X_m' represents the transpose of X_m; G is the matrix formed by pulling the low-rank matrices derived from the individual images into row vectors and stacking them together, n denotes the matrix dimension, and R(·) reshapes each 1 × n² row vector of Z^{k+1} into an n × n matrix.
And S3, updating the augmented Lagrangian multiplier according to the following conditions:
[formula images not reproduced in the text: each of the multipliers W_m, M_m, Y_m and V_m is updated by adding u^k times the residual of its constraint]

S4, updating the balance factor according to the following condition:

u^{k+1} = min(ρ u^k, u_max)
further, the convergence condition satisfies:
[formula images not reproduced in the text: the residuals of the equality constraints must all fall below the preset threshold ε]

ε is a preset value.
Further, the deep neural network model may be a convolutional neural network VGG model.
Further, the image fusion module is specifically configured to perform the following operations:
S11, inputting F_a1 and F_a2 into the convolutional neural network VGG model;
S12, obtaining the multi-channel feature maps output by convolution group i of the convolutional neural network VGG model
Φ_1^i and Φ_2^i, where i takes values from 1 to 4 and k is the number of channels of the feature map output by convolution group i;

S13, obtaining the single-channel feature maps F̂_1^i and F̂_2^i of convolution group i according to the following conditions:

F̂_1^i(x, y) = || Φ_1^i(x, y) ||_1
F̂_2^i(x, y) = || Φ_2^i(x, y) ||_1

wherein F̂_1^i(x, y) and F̂_2^i(x, y) respectively represent the values of F̂_1^i and F̂_2^i at position (x, y);

S14, smoothing F̂_1^i and F̂_2^i according to the following conditions to obtain the smoothed single-channel feature maps F̄_1^i and F̄_2^i:

F̄_1^i(x, y) = (1 / (2r + 1)^2) Σ_{p=-r}^{r} Σ_{q=-r}^{r} F̂_1^i(x + p, y + q)
F̄_2^i(x, y) = (1 / (2r + 1)^2) Σ_{p=-r}^{r} Σ_{q=-r}^{r} F̂_2^i(x + p, y + q)

wherein r is the set region size;

S15, obtaining the first infrared weight map W_1^i and the first visible light weight map W_2^i according to the following conditions:

W_1^i(x, y) = F̄_1^i(x, y) / ( F̄_1^i(x, y) + F̄_2^i(x, y) )
W_2^i(x, y) = F̄_2^i(x, y) / ( F̄_1^i(x, y) + F̄_2^i(x, y) )

S16, obtaining the second infrared weight map Ŵ_1^i and the second visible light weight map Ŵ_2^i according to the following conditions:

Ŵ_1^i(x + p, y + q) = W_1^i(x, y),  p, q ∈ {0, 1, …, 2^{i-1} − 1}
Ŵ_2^i(x + p, y + q) = W_2^i(x, y),  p, q ∈ {0, 1, …, 2^{i-1} − 1}

S17, obtaining the fused image F_a^i produced by convolution group i from F_a1, F_a2 and the weight maps Ŵ_1^i and Ŵ_2^i; 4 fused images are obtained.
S18, obtaining MI^i according to the following conditions:
MI^i = MI_{X1}^i + MI_{X2}^i
MI_{X1}^i = Σ_{x1, f} P_{X1, Fa^i}(x1, f) · log( P_{X1, Fa^i}(x1, f) / ( P_{X1}(x1) · P_{Fa^i}(f) ) )
MI_{X2}^i = Σ_{x2, f} P_{X2, Fa^i}(x2, f) · log( P_{X2, Fa^i}(x2, f) / ( P_{X2}(x2) · P_{Fa^i}(f) ) )

wherein MI^i represents the amount of source-image information contained in the fused image F_a^i; MI_{X1}^i and MI_{X2}^i respectively represent the amounts of information of the infrared image and of the visible light image contained in F_a^i; P(X1), P(X2) and P(Fa^i) respectively represent the normalized marginal (edge) histograms of the infrared source image X1, the visible light source image X2 and the fused image F_a^i; P_{X1, Fa^i} and P_{X2, Fa^i} respectively represent the normalized joint gray-level histograms of the infrared source image X1 and of the visible light source image X2 with the fused image F_a^i;

S19, obtaining MI_max = max_i MI^i and designating the fused image corresponding to MI_max as F_a.
In summary, the infrared and visible light image fusion method and system based on low-rank sparse representation provided by the embodiments of the present invention have at least the following advantages:

(1) By exploiting the complementary information of multi-modal sensor image data, a latent low-rank representation multi-modal image fusion framework based on a sparse consistency constraint is developed, and deep feature information is further mined through the VGG network to obtain a better image fusion effect. Unlike previous methods that decompose the infrared and visible light images separately with LatLRR, the method provided by the invention makes better use of the two image modalities through synchronous low-rank decomposition, which helps capture the common part of the base parts of different sensor images of the same scene.

(2) Considering that the base part of the latent low-rank representation decomposition is similar to a smoothed version of the source image and carries more information common to the images, a feature extraction algorithm based on the VGG neural network is selected in order to obtain effective fusion information, enabling deep feature extraction for the two heterogeneous modalities.

(3) The synchronous low-rank decomposition of the multi-modal images improves the image processing efficiency.
Although some specific embodiments of the present invention have been described in detail by way of illustration, it should be understood by those skilled in the art that the above illustration is only for the purpose of illustration and is not intended to limit the scope of the invention. It will also be appreciated by those skilled in the art that various modifications may be made to the embodiments without departing from the scope and spirit of the invention. The scope of the present disclosure is defined by the appended claims.

Claims (8)

1. A low-rank sparse representation-based infrared and visible light image fusion method, characterized by comprising the following steps:

S10, decomposing the infrared image and the visible light image respectively by using a sparse-consistency latent low-rank representation decomposition model;

the sparse-consistency latent low-rank representation decomposition model satisfies the following conditions:

(1)  min_{Z_m, L_m, E_m}  Σ_{m=1}^{2} ( ||Z_m||_* + ||L_m||_* + γ ||E_m||_{2,1} ) + α ||Z||_{2,1}

(2)  s.t.  X_m = X_m Z_m + L_m X_m + E_m

wherein X_m represents a source image, Z_m represents the low-rank matrix of the image, L_m represents the saliency matrix of the image, E_m represents the noise part of the image, and m = 1, 2 denotes the infrared image and the visible light image, respectively; ||Z_m||_* represents the nuclear norm of Z_m and ||L_m||_* represents the nuclear norm of L_m; ||Z||_{2,1} represents a simultaneous L_{2,1}-norm regularization constraint on the low-rank matrix of the infrared image and the low-rank matrix of the visible light image; ||E_m||_{2,1} represents an L_{2,1}-norm regularization constraint applied separately to the noise part of the infrared image and the noise part of the visible light image; γ is a noise-balancing parameter and α is a regularization balance coefficient;

S20, solving the sparse-consistency latent low-rank representation decomposition model by an alternating direction method of multipliers to obtain the base part F_a1 and the salient part F_b1 of the infrared image and the base part F_a2 and the salient part F_b2 of the visible light image, respectively;

S30, performing depth feature extraction and fusion on F_a1 and F_a2 by using a deep neural network model to obtain a base-part fused image F_a;

S40, fusing F_b1 and F_b2 to obtain a salient-part fused image F_b;

S50, fusing F_a and F_b to obtain the infrared and visible light fused image.
2. The method of claim 1, wherein S20 further comprises:

S201, obtaining an augmented Lagrangian function model defined by the following conditions (3) and (4), in which J_m, S_m and K_m are auxiliary variables:

(3)  [formula image not reproduced in the text: the augmented Lagrangian function]

(4)  s.t.  X_m = X_m S_m + K_m X_m + E_m,  Z_m = J_m,  Z_m = S_m,  L_m = K_m

S202, iteratively and alternately updating each variable according to the following conditions (5) to (11) until a convergence condition is met:

(5)–(11)  [formula images not reproduced in the text: the closed-form updates of the variables J_m, S_m, K_m, Z_m, L_m and E_m at the (k+1)-th iteration]

wherein k is the iteration index, k takes values from 1 to N, N is the total number of iterations, ρ is the iteration step size, and u^k is the balance factor at the k-th iteration; the augmented Lagrange multipliers at the k-th iteration are given as formula images; I is an identity matrix and X_m' represents the transpose of X_m; G is the matrix formed by pulling the low-rank matrices derived from the individual images into row vectors and stacking them together, n denotes the matrix dimension, and R(·) reshapes each 1 × n² row vector of Z^{k+1} into an n × n matrix.
3. The method of claim 2, wherein S20 further comprises:

S203, updating the augmented Lagrange multipliers according to the following conditions (12) to (15):

(12)–(15)  [formula images not reproduced in the text: each of the multipliers W_m, M_m, Y_m and V_m is updated by adding u^k times the residual of its constraint]
4. The method of claim 2, wherein S20 further comprises:

S204, updating the balance factor according to the following condition (16):

u^{k+1} = min(ρ u^k, u_max)  (16)
5. The method of claim 2, wherein the convergence condition satisfies:

[formula images not reproduced in the text: the residuals of the equality constraints must all fall below the preset threshold ε]

ε is a preset value.
6. The method of claim 1, wherein in S30, the deep neural network model is a convolutional neural network VGG model.
7. The method of claim 6, wherein S30 further comprises:

S301, inputting F_a1 and F_a2 into the convolutional neural network VGG model;

S302, obtaining the multi-channel feature maps Φ_1^i and Φ_2^i output by convolution group i of the convolutional neural network VGG model, where i takes values from 1 to 4 and k is the number of channels of the feature map output by convolution group i;

S302, obtaining the single-channel feature maps F̂_1^i and F̂_2^i of convolution group i according to the following conditions (17) and (18):

(17)  F̂_1^i(x, y) = || Φ_1^i(x, y) ||_1

(18)  F̂_2^i(x, y) = || Φ_2^i(x, y) ||_1

wherein F̂_1^i(x, y) and F̂_2^i(x, y) respectively represent the values of F̂_1^i and F̂_2^i at position (x, y);

S303, smoothing F̂_1^i and F̂_2^i according to the following conditions (19) and (20) to obtain the smoothed single-channel feature maps F̄_1^i and F̄_2^i:

(19)  F̄_1^i(x, y) = (1 / (2r + 1)^2) Σ_{p=-r}^{r} Σ_{q=-r}^{r} F̂_1^i(x + p, y + q)

(20)  F̄_2^i(x, y) = (1 / (2r + 1)^2) Σ_{p=-r}^{r} Σ_{q=-r}^{r} F̂_2^i(x + p, y + q)

wherein r is the set region size;

S304, obtaining the first infrared weight map W_1^i and the first visible light weight map W_2^i according to the following conditions (21) and (22):

(21)  W_1^i(x, y) = F̄_1^i(x, y) / ( F̄_1^i(x, y) + F̄_2^i(x, y) )

(22)  W_2^i(x, y) = F̄_2^i(x, y) / ( F̄_1^i(x, y) + F̄_2^i(x, y) )

S305, obtaining the second infrared weight map Ŵ_1^i and the second visible light weight map Ŵ_2^i according to the following conditions (23) and (24):

(23)  Ŵ_1^i(x + p, y + q) = W_1^i(x, y),  p, q ∈ {0, 1, …, 2^{i-1} − 1}

(24)  Ŵ_2^i(x + p, y + q) = W_2^i(x, y),  p, q ∈ {0, 1, …, 2^{i-1} − 1}

S306, obtaining the fused image F_a^i produced by convolution group i from F_a1, F_a2 and the weight maps Ŵ_1^i and Ŵ_2^i, so that 4 fused images are obtained;

S307, obtaining MI^i according to the following conditions (25) to (27):

(25)  MI^i = MI_{X1}^i + MI_{X2}^i

(26)  MI_{X1}^i = Σ_{x1, f} P_{X1, Fa^i}(x1, f) · log( P_{X1, Fa^i}(x1, f) / ( P_{X1}(x1) · P_{Fa^i}(f) ) )

(27)  MI_{X2}^i = Σ_{x2, f} P_{X2, Fa^i}(x2, f) · log( P_{X2, Fa^i}(x2, f) / ( P_{X2}(x2) · P_{Fa^i}(f) ) )

wherein MI^i represents the amount of source-image information contained in the fused image F_a^i; MI_{X1}^i and MI_{X2}^i respectively represent the amounts of information of the infrared image and of the visible light image contained in F_a^i; P(X1), P(X2) and P(Fa^i) respectively represent the normalized marginal (edge) histograms of the infrared source image X1, the visible light source image X2 and the fused image F_a^i; P_{X1, Fa^i} and P_{X2, Fa^i} respectively represent the normalized joint gray-level histograms of the infrared source image X1 and of the visible light source image X2 with the fused image F_a^i;

S308, obtaining MI_max = max_i MI^i and designating the fused image corresponding to MI_max as F_a.
8. An infrared and visible light image fusion system based on low-rank sparse representation, characterized by comprising: an image decomposition module, a feature extraction module and an image fusion module;

the image decomposition module is used for decomposing the infrared image and the visible light image by using a sparse-consistency latent low-rank representation decomposition model, and solving the model by an alternating direction method of multipliers to obtain the base part F_a1 and the salient part F_b1 of the infrared image and the base part F_a2 and the salient part F_b2 of the visible light image, respectively;

wherein the sparse-consistency latent low-rank representation decomposition model satisfies the following conditions:

min_{Z_m, L_m, E_m}  Σ_{m=1}^{2} ( ||Z_m||_* + ||L_m||_* + γ ||E_m||_{2,1} ) + α ||Z||_{2,1}

s.t.  X_m = X_m Z_m + L_m X_m + E_m

wherein X_m represents a source image, Z_m represents the low-rank matrix of the image, L_m represents the saliency matrix of the image, E_m represents the noise part of the image, and m = 1, 2 denotes the infrared image and the visible light image, respectively; ||Z_m||_* and ||L_m||_* represent the nuclear norms of Z_m and L_m; ||Z||_{2,1} represents a simultaneous L_{2,1}-norm regularization constraint on the low-rank matrices of the infrared and visible light images; ||E_m||_{2,1} represents an L_{2,1}-norm regularization constraint applied separately to the noise parts of the infrared and visible light images; γ is a noise-balancing parameter and α is a regularization balance coefficient;

the feature extraction module is used for performing depth feature extraction and fusion on F_a1 and F_a2 with a deep neural network model to obtain a base-part fused image F_a;

the image fusion module is used for fusing F_b1 and F_b2 to obtain a salient-part fused image F_b, and fusing F_a and F_b to obtain the infrared and visible light fused image.
CN202210246475.3A 2022-03-14 2022-03-14 Infrared and visible light image fusion method and system based on low-rank sparse representation Pending CN114648475A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210246475.3A CN114648475A (en) 2022-03-14 2022-03-14 Infrared and visible light image fusion method and system based on low-rank sparse representation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210246475.3A CN114648475A (en) 2022-03-14 2022-03-14 Infrared and visible light image fusion method and system based on low-rank sparse representation

Publications (1)

Publication Number Publication Date
CN114648475A true CN114648475A (en) 2022-06-21

Family

ID=81994004

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210246475.3A Pending CN114648475A (en) 2022-03-14 2022-03-14 Infrared and visible light image fusion method and system based on low-rank sparse representation

Country Status (1)

Country Link
CN (1) CN114648475A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116109539A (en) * 2023-03-21 2023-05-12 智洋创新科技股份有限公司 Infrared image texture information enhancement method and system based on generation of countermeasure network
CN116485694A (en) * 2023-04-25 2023-07-25 中国矿业大学 Infrared and visible light image fusion method and system based on variation principle
CN116485694B (en) * 2023-04-25 2023-11-07 中国矿业大学 Infrared and visible light image fusion method and system based on variation principle
CN116681637A (en) * 2023-08-03 2023-09-01 国网安徽省电力有限公司超高压分公司 Ultra-high voltage converter transformer infrared and visible light monitoring image fusion method and system
CN116681637B (en) * 2023-08-03 2024-01-02 国网安徽省电力有限公司超高压分公司 Ultra-high voltage converter transformer infrared and visible light monitoring image fusion method and system

Similar Documents

Publication Publication Date Title
CN114648475A (en) Infrared and visible light image fusion method and system based on low-rank sparse representation
Guo et al. LIME: Low-light image enhancement via illumination map estimation
Jiang et al. Matrix factorization for low-rank tensor completion using framelet prior
Yin et al. Highly accurate image reconstruction for multimodal noise suppression using semisupervised learning on big data
CN102246204B (en) Devices and methods for processing images using scale space
EP3532997B1 (en) Training and/or using neural network models to generate intermediary output of a spectral image
Ghorai et al. Multiple pyramids based image inpainting using local patch statistics and steering kernel feature
Vitoria et al. Semantic image inpainting through improved wasserstein generative adversarial networks
Shen et al. Convolutional neural pyramid for image processing
CN111046868B (en) Target significance detection method based on matrix low-rank sparse decomposition
Liu et al. Painting completion with generative translation models
CN111986132A (en) Infrared and visible light image fusion method based on DLatLRR and VGG & Net
CN110992367A (en) Method for performing semantic segmentation on image with shielding area
CN116012255A (en) Low-light image enhancement method for generating countermeasure network based on cyclic consistency
CN112767297A (en) Infrared unmanned aerial vehicle group target simulation method based on image derivation under complex background
Li et al. Nonconvex L1/2-regularized nonlocal self-similarity denoiser for compressive sensing based CT reconstruction
CN113706407B (en) Infrared and visible light image fusion method based on separation characterization
CN114092610B (en) Character video generation method based on generation of confrontation network
Sun et al. Tensor Gaussian process with contraction for multi-channel imaging analysis
Ren et al. Medical image super-resolution based on semantic perception transfer learning
Tao et al. Latent low-rank representation with sparse consistency constraint for infrared and visible image fusion
Wu et al. Coarse-to-Fine Low-Light Image Enhancement With Light Restoration and Color Refinement
CN112734655B (en) Low-light image enhancement method for enhancing CRM (customer relationship management) based on convolutional neural network image
Hongying et al. Image completion by a fast and adaptive exemplar-based image inpainting
Li et al. A review of image colourisation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination