CN111402306A - Low-light-level/infrared image color fusion method and system based on deep learning - Google Patents

Low-light-level/infrared image color fusion method and system based on deep learning

Info

Publication number
CN111402306A
CN111402306A CN202010175703.3A CN202010175703A CN111402306A CN 111402306 A CN111402306 A CN 111402306A CN 202010175703 A CN202010175703 A CN 202010175703A CN 111402306 A CN111402306 A CN 111402306A
Authority
CN
China
Prior art keywords
convolution
network
infrared image
image
light
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010175703.3A
Other languages
Chinese (zh)
Inventor
刘超
胡清平
姚远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chinese People's Liberation Army 32801
Original Assignee
Chinese People's Liberation Army 32801
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chinese People's Liberation Army 32801 filed Critical Chinese People's Liberation Army 32801
Priority to CN202010175703.3A priority Critical patent/CN111402306A/en
Publication of CN111402306A publication Critical patent/CN111402306A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/90Determination of colour characteristics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10048Infrared image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a low-light/infrared image color fusion method and system based on deep learning, wherein the fusion method comprises the following steps: acquiring training data; preprocessing the training data; constructing a low-light/infrared image chromaticity prediction network; constructing a loss function; training the low-light/infrared image chromaticity prediction network based on the preprocessed training data and the loss function to obtain a trained low-light/infrared image chromaticity prediction network; extracting the chromaticity information of a target image by using the trained network; extracting the brightness information of the target image; and finally synthesizing a color image based on the brightness information and the chrominance information. The method can predict natural and stable colors and improves the observability of the color fusion image.

Description

Low-light-level/infrared image color fusion method and system based on deep learning
Technical Field
The invention relates to the field of night vision, in particular to a low-light/infrared image color fusion method and system based on deep learning.
Background
The night vision technology is a photoelectric technology for realizing night observation by means of a photoelectric imaging device, and the existing night vision equipment mainly comprises a low-light level night vision device and a thermal infrared imager. The low-light level night vision device detects reflected light of a target on moonlight and night sky light in a passive working mode, the obtained image is strong in layering sense, scene and target details are clear, the human eye observation habit is met, and the low-light level night vision device has the defects of poor contrast, limited gray level, large weather influence, easiness in interference of external environment light, limited detection distance and the like; the infrared thermal imager generates a scene thermal image according to the temperature and radiation difference between a target and a background, can work day and night, and has the advantages of high temperature resolution, good image contrast, large dynamic range, smoke penetration and haze penetration, and the like. In order to effectively solve the above problems of night vision equipment, color fusion of a low-light image and an infrared image has become an important development direction of night vision technology.
The existing color-transfer-based low-light/infrared color fusion technology selects different modes such as ocean, desert and snowfield for color transfer. However, because it uses only a small number of simple statistical characteristics, such as the pixel neighborhood gray-level mean and standard deviation, as matching parameters, good pixel matching and color transfer effects are difficult to obtain, and the resulting low-light/infrared color fusion images still suffer from unnatural and unstable colors.
Disclosure of Invention
The invention aims to provide a low-light/infrared image color fusion method and system based on deep learning, which can realize the prediction of natural and stable colors and improve the observability of a color fusion image.
In order to achieve the purpose, the invention provides the following scheme:
a low-light/infrared image color fusion method based on deep learning, the fusion method comprises the following steps:
acquiring training data;
preprocessing the training data;
constructing a low-light/infrared image chromaticity prediction network;
constructing a loss function;
training the dim light/infrared image chromaticity prediction network based on the preprocessed training data and the loss function to obtain a trained dim light/infrared image chromaticity prediction network;
extracting the chromaticity information of the target image by adopting the trained dim light/infrared image chromaticity prediction network;
extracting brightness information of a target image;
and finally synthesizing the color image based on the brightness information and the chrominance information.
Optionally, the preprocessing the training data specifically includes:
and continuously preprocessing the training data by adopting a multimode image calibration technology.
Optionally, the dim light/infrared image chromaticity prediction network includes: a feature extraction sub-network and a chroma prediction sub-network.
Optionally, the loss function specifically adopts the following formula:
L(\hat{Z}, Z) = -\sum_{h,w} v(Z_{h,w}) \sum_{q} Z_{h,w,q} \log \hat{Z}_{h,w,q}
where h denotes the height coordinate of the training image, w denotes the width coordinate of the training image, v(Z_{h,w}) denotes the weight reflecting the sparsity of the chrominance information in the training set, Z_{h,w} denotes the chrominance value of maximum probability for the pixel at coordinates (h, w), q denotes a quantized value of the chrominance space, Z_{h,w,q} denotes the probability that the chrominance of the pixel at coordinates (h, w) is q, and \hat{Z}_{h,w,q} denotes the predicted value of that probability.
Optionally, the network type of the feature extraction sub-network is an encoder-decoder network type, and includes an encoding network unit and a decoding network unit; the coding network unit adopts a conventional convolution unit and a hole convolution unit; the decoding network unit adopts a transposition convolution unit;
the chrominance extraction sub-network adopts a transposed convolution unit.
Optionally, the coding network units in the feature extraction sub-network specifically include a first conventional convolution unit, a second conventional convolution unit, a third conventional convolution unit, and a fourth hole convolution unit;
the decoding network unit in the feature extraction sub-network specifically includes: a fifth transposed convolution unit and a sixth transposed convolution unit;
the transposed convolution unit in the chrominance extraction sub-network specifically includes a seventh transposed convolution unit.
Optionally, the first conventional convolution unit Conv1 includes a first conventional convolution layer Conv1_1 and a second conventional convolution layer Conv1_2, where the Conv1_1 has a convolution kernel size of 2, a convolution kernel number of 64, a step size of 1, and an input channel of 3, and the Conv1_2 has a convolution kernel size of 3, a convolution kernel number of 64, a step size of 2, and an input channel of 64;
the second conventional convolution unit Conv2 has a convolution kernel size of 3, a convolution kernel number of 128, a step size of 2, and an input channel of 64;
the convolution kernel size of the third conventional convolution unit Conv3 is 2, the number of convolution kernels is 64, the step is 1, and the input channel is 3;
the fourth hole convolution unit DilaConv4 includes a first hole convolution layer DilaConv4_1 and a second hole convolution layer DilaConv4_2, the convolution kernel size of the first hole convolution layer DilaConv4_1 is 3, the number of convolution kernels is 256, the step is 1, and the input channel is 256, the convolution kernel size of the second hole convolution layer DilaConv4_2 is 3, the number of convolution kernels is 256, the step is 1, and the input channel is 256;
the size of a convolution kernel of the fifth transpose convolution unit TransConv5 is 3, the number of the convolution kernels is 128, the stride is 1, and the input channel is 128;
the size of a convolution kernel of the sixth transpose convolution unit TransConv6 is 3, the number of the convolution kernels is 64, the stride is 1, and the input channel is 256;
the seventh transpose convolution unit transcconv 7 has a convolution kernel size of 3, a number of convolution kernels of 313, a stride of 1, and an input channel of 128.
Optionally, the method further includes, after constructing the low-light/infrared image chromaticity prediction network: normalizing the output features of each level in the low-light/infrared image chromaticity prediction network using the BN (Batch Normalization) technique.
The invention additionally provides a dim light/infrared image color fusion system based on deep learning, which comprises:
the training data acquisition module is used for acquiring training data;
the preprocessing module is used for preprocessing the training data;
the prediction network construction module is used for constructing a low-light/infrared image chromaticity prediction network;
the loss function constructing module is used for constructing a loss function;
the training module is used for training the dim light/infrared image chromaticity prediction network based on the preprocessed training data and the loss function to obtain the trained dim light/infrared image chromaticity prediction network;
the chrominance information extraction module is used for extracting chrominance information of the target image by adopting the trained dim light/infrared image chrominance prediction network;
the brightness information extraction module is used for extracting the brightness information of the target image;
and the color image synthesis module is used for finally synthesizing a color image based on the brightness information and the chrominance information.
Optionally, the preprocessing module specifically includes continuously preprocessing the training data by using a multi-mode image calibration technique.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
according to the method, training data are obtained, and the training data are preprocessed; constructing a low-light/infrared image chromaticity prediction network; constructing a loss function; training the dim light/infrared image chromaticity prediction network based on the preprocessed training data and the loss function to obtain a trained dim light/infrared image chromaticity prediction network; the trained dim light/infrared image chromaticity prediction network is adopted to extract chromaticity information of a target image, luminance information of the target image is extracted, a color image is finally synthesized based on the luminance information and the chromaticity information, a large number of features in an original image and a training set image can be automatically obtained to serve as matching parameters, therefore, good pixel matching and color transfer effects can be obtained, color information is obtained by learning the mapping relation between a dim light image, an infrared image and a daytime color image, a color fusion image with natural and stable colors is obtained, the understanding of an observer to a scene can be improved, the identification degree of the target is improved, and the scene perception capability of the observer is effectively enhanced.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without inventive exercise.
FIG. 1 is a flow chart of a low-light/infrared image color fusion method based on deep learning according to an embodiment of the present invention;
FIG. 2 is a training set for low light/infrared color fusion according to an embodiment of the present invention;
FIG. 3 is a low-light/infrared color fusion test set according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating a chrominance prediction network according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a deep learning-based dim light/infrared image color fusion system according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention aims to provide a low-light/infrared image color fusion method and system based on deep learning, which can realize the prediction of natural and stable colors and improve the observability of a color fusion image.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Fig. 1 is a flowchart of a low-light/infrared image color fusion method based on deep learning according to an embodiment of the present invention, as shown in fig. 1, the method includes:
step 101: training data is acquired.
Step 102: and preprocessing the training data.
Step 103: and constructing a low-light/infrared image chromaticity prediction network.
Step 104: a loss function is constructed.
Step 105: and training the dim light/infrared image chromaticity prediction network based on the preprocessed training data and the loss function to obtain the trained dim light/infrared image chromaticity prediction network.
Step 106: and extracting the chromaticity information of the target image by adopting the trained glimmer/infrared image chromaticity prediction network.
Step 107: luminance information of the target image is extracted.
Step 108: and finally synthesizing the color image based on the brightness information and the chrominance information.
The following steps are described in detail:
step 101: training data is acquired.
In this step, low-light, infrared and color images of different scenes in a given area are acquired. To ensure that the data set has good adaptability and quality, the images are acquired at early dawn or at dusk, and a sufficiently large number of images is collected. This step yields the raw low-light/infrared color fusion data for the same scenes, comprising a low-light image set L, an infrared image set I and a color image set C, as shown in fig. 2.
Step 102: and preprocessing the training data.
Data quality is important for a deep-learning-based image processing method, so the acquired raw low-light/infrared color fusion data are preprocessed. Because the field of view and the optical axis position of the low-light, infrared and color image acquisition devices differ, the data of the low-light image set L, the infrared image set I and the color image set C need to be calibrated using a multimodal image calibration technique.
The acquired low-light, infrared and color images are registered in a pairwise manner: taking the infrared image as the reference, the color image and the low-light image are mapped onto the infrared reference so that the three types of images are registered, and the same registration processing is finally applied to all images in the data set. One possible implementation of this registration step is sketched below.
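The following minimal sketch illustrates such pairwise registration onto the infrared reference. It assumes OpenCV is available and uses ORB features with a RANSAC homography; this is only an illustrative choice, not necessarily the exact multimodal calibration procedure used by the invention.

```python
import cv2
import numpy as np

def register_to_infrared(moving, infrared):
    """Warp a low-light or color image onto the infrared reference frame."""
    to_gray = lambda im: cv2.cvtColor(im, cv2.COLOR_BGR2GRAY) if im.ndim == 3 else im
    orb = cv2.ORB_create(2000)
    kp1, des1 = orb.detectAndCompute(to_gray(moving), None)
    kp2, des2 = orb.detectAndCompute(to_gray(infrared), None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
    matches = sorted(matches, key=lambda m: m.distance)[:200]      # keep the best matches
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = infrared.shape[:2]
    return cv2.warpPerspective(moving, H, (w, h))
```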
Step 103: and constructing a low-light/infrared image chromaticity prediction network.
The invention designs a low-light/infrared image chrominance prediction network (CP-Net), whose structure is shown in fig. 4 (each convolution arrow in the figure represents one or more convolution operations). The whole network consists of a feature extraction sub-network and a chroma prediction sub-network. The feature extraction sub-network adopts an encoder-decoder style network: the coding network units use conventional convolution and hole (dilated) convolution to extract the multi-scale spatial features of the low-light and infrared images layer by layer, with deeper convolutions expressing larger-scale, higher-level complex features; the decoding network units expand the coding layers through transposed convolution, increasing the spatial resolution, and integrate the multi-scale feature information of each level through residual connections, providing more accurate position information for the subsequent chroma prediction sub-network. The chroma prediction sub-network follows the decoding network units and performs color information prediction using a conventional convolution with 313 convolution kernels (313 is the number of colors after color space quantization). Since pooling layers cause partial information loss, the network does not use pooling; the reduction in spatial resolution is achieved with conventional convolutions whose stride is greater than 1. Because feature values at different scales differ, the output features of each level are normalized with the BN technique.
For ease of understanding, a conventional convolution layer is denoted as Conv(f_i, n_i, s_i, c_i), a hole (dilated) convolution layer as DilaConv(f_i, n_i, s_i, c_i), and a transposed convolution layer as TransConv(f_i, n_i, s_i, c_i), where f_i, n_i, s_i and c_i denote the kernel size, the number of kernels, the stride and the number of input channels, respectively (the number of input channels of the current layer equals the number of convolution kernels of the previous layer, i.e. c_i = n_{i-1}, where i is the layer index). The spatial resolution of the network input data is 128 × 128.
The whole network in the feature extraction sub-network adopts convolution with a size of 3 × 3, the coding network units in the feature extraction sub-network specifically comprise a first conventional convolution unit, a second conventional convolution unit, a third conventional convolution unit and a fourth hole convolution unit, the decoding network units in the feature extraction sub-network specifically comprise a fifth transposed convolution unit and a sixth transposed convolution unit, and the transposed convolution unit in the chrominance extraction sub-network specifically comprises a seventh transposed convolution unit.
The first conventional convolution unit is mainly used to extract low-level features of the multiband night vision image and to enlarge the receptive field by reducing the spatial resolution of the output features. It comprises two convolution layers, a first conventional convolution layer Conv1_1 and a second conventional convolution layer Conv1_2. Conv1_1 mainly extracts low-level features; following most networks, the number of convolution kernels is set to n1_1 = 64, and to make full use of the low-level feature information the stride of this layer is set to s1_1 = 1. If the data set contains data in two wavebands, such as low-light and infrared, then c1_1 = 2, so the first layer of the first conventional convolution unit can be represented as Conv1_1(3, 64, 1, 2) and the output feature size is unchanged. A convolutional neural network has good adaptability: it can accept input with any number of channels without changing its structure and produce output of any dimension, so CP-Net can also be applied to finding the mapping relationship between multiband data sets, such as three-band fusion.
Conv1_2 enlarges the receptive field by increasing the stride to reduce the output feature size, so s1_2 = 2; the number of convolution kernels is kept the same as in the previous layer, n1_2 = 64. This layer can therefore be denoted as Conv1_2(3, 64, 2, 64), and the spatial size of the output features is halved to 64 × 64. The second convolution unit uses a conventional convolution with a stride of 2, so the output feature size is halved again while the number of kernels is doubled to n2 = 128; the second convolution unit can thus be denoted as Conv2(3, 128, 2, 64), with the output features reduced to 32 × 32. The third convolution unit can be denoted as Conv3(3, 256, 2, 128), with the output features reduced to 16 × 16.
As a newer convolution method, hole (dilated) convolution can enlarge the receptive field of the output features without reducing the spatial resolution, and the number of hole convolution layers determines the maximum receptive field that can finally be obtained. Hole convolution with a dilation factor of 2 is therefore used, represented as DilaConv4_1(3, 256, 1, 256) and DilaConv4_2(3, 256, 1, 256); the output feature map size remains 16 × 16. The fifth and sixth convolution units are arranged symmetrically with the coding network units: the preceding features are upsampled and aggregated by transposed convolution, which increases the spatial resolution of the feature map and halves the number of feature channels. To synthesize feature information of different scales, the outputs of the first and second convolution units are connected to the sixth and fifth convolution units, respectively, through residual connections, so the fifth and sixth transposed convolution units can be denoted as TransConv5(3, 128, 1, 128) and TransConv6(3, 64, 1, 256), and the multi-scale feature information is thus extracted. A Keras sketch of this sub-network is given after Table 1.
Table 1 Detailed configuration information of the feature extraction sub-network (the table is provided as an image in the original publication)
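To make the layer configuration concrete, a minimal Keras sketch of the feature extraction sub-network follows. It uses the layer parameters described above; the padding mode, the placement of activations and the use of stride-2 transposed convolutions for upsampling (the text describes upsampling, while the listed stride is 1) are assumptions of this sketch rather than details confirmed by the patent.

```python
from tensorflow.keras import Input, Model, layers

def conv_bn_relu(x, filters, kernel=3, stride=1, dilation=1, transposed=False, name=None):
    """Convolution (regular, dilated or transposed) followed by BN and ReLU."""
    if transposed:
        x = layers.Conv2DTranspose(filters, kernel, strides=stride, padding="same", name=name)(x)
    else:
        x = layers.Conv2D(filters, kernel, strides=stride, padding="same",
                          dilation_rate=dilation, name=name)(x)
    x = layers.BatchNormalization()(x)          # BN on every level, as described in the text
    return layers.ReLU()(x)

def build_feature_extractor(input_shape=(128, 128, 2)):
    inputs = Input(shape=input_shape)                          # low-light + infrared channels
    c1 = conv_bn_relu(inputs, 64, stride=1, name="Conv1_1")
    c1 = conv_bn_relu(c1, 64, stride=2, name="Conv1_2")        # 128 -> 64
    c2 = conv_bn_relu(c1, 128, stride=2, name="Conv2")         # 64 -> 32
    c3 = conv_bn_relu(c2, 256, stride=2, name="Conv3")         # 32 -> 16
    d4 = conv_bn_relu(c3, 256, dilation=2, name="DilaConv4_1")
    d4 = conv_bn_relu(d4, 256, dilation=2, name="DilaConv4_2")
    t5 = conv_bn_relu(d4, 128, stride=2, transposed=True, name="TransConv5")   # 16 -> 32
    t5 = layers.Concatenate()([t5, c2])                        # skip connection from Conv2
    t6 = conv_bn_relu(t5, 64, stride=2, transposed=True, name="TransConv6")    # 32 -> 64
    t6 = layers.Concatenate()([t6, c1])                        # skip connection from Conv1
    return Model(inputs, t6, name="cp_net_features")
```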
The chroma prediction sub-network generates a color probability distribution for each pixel using a transposed convolution module with 313 feature channels and increases the spatial resolution back to the original input size; this layer can be denoted as TransConv7(3, 313, 1, 128).
The probability distribution over the colors is predicted with a softmax output layer and converted back to the original image dimensions by a Reshape layer. Let p_i(k) denote the probability that the ith pixel belongs to the kth quantized value; the softmax output layer is defined as
p_i(k) = \frac{\exp(o_i(k))}{\sum_{j=1}^{Q} \exp(o_i(j))}
where o_i(k) is the output of the preceding layer for the ith pixel and the kth quantized value, and Q = 313.
table 2 gives detailed configuration information of the chroma prediction sub-network. It can be seen that the core of the chroma prediction sub-network is to perform some kind of color prediction through each layer of features, and to realize the probability distribution calculation of colors through the Softmax layer.
Table 2 Detailed configuration information of the chroma prediction sub-network (the table is provided as an image in the original publication)
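Continuing the sketch above, the chroma prediction sub-network can be attached as follows; the stride-2 transposed convolution that restores the 128 × 128 resolution is again an assumption, and the softmax corresponds to the output layer defined earlier.

```python
from tensorflow.keras import Model, layers

def build_cp_net(feature_extractor, num_colors=313):
    x = layers.Conv2DTranspose(num_colors, 3, strides=2, padding="same",
                               name="TransConv7")(feature_extractor.output)   # 64 -> 128
    probs = layers.Softmax(axis=-1, name="color_distribution")(x)  # per-pixel distribution over 313 bins
    return Model(feature_extractor.input, probs, name="CP_Net")

cp_net = build_cp_net(build_feature_extractor())
```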
Step 104: a loss function is constructed.
Color information is inherently a multi-modal problem: many objects can plausibly take different colors, and the MSE loss function is not robust to this multi-modality. If a target can take a set of different chroma values, the optimal solution under MSE is the average of that set, which leads to grayish, unsaturated colorization results; if the set of plausible colors is non-convex, the true color may even lie outside the set, giving unreliable results. The chrominance problem is therefore converted into a multi-classification problem that predicts a color probability distribution for each pixel; a class-balancing idea is introduced during training, adjusting the loss weights to emphasize colors that occur with low frequency, which ensures that the designed network has color diversity and authenticity.
Let the multi-band night vision image be X ∈ R^{H×W×2} (the two channels represent the low-light and infrared bands, respectively). The goal is then to learn a mapping \hat{Y} = F(X) to the corresponding chroma channels Y ∈ R^{H×W×2} (quantities marked with a hat denote predicted values; quantities without a hat denote true values).
To convert the chrominance prediction problem into a multi-classification problem, the ab chroma space is first quantized into bins with a grid size of 10, giving a total of Q = 313 quantized values over the whole gamut; the task then becomes learning a mapping from X to a probability distribution over the possible colors, \hat{Z} ∈ [0,1]^{H×W×Q}.
Since the true value Y_{h,w} is the chrominance of the actual image, its value is generally not one of the quantized values and cannot be compared directly with the predicted distribution. A soft-encoding scheme Z = H^{-1}(Y) is therefore defined to convert the true chroma into a vector Z: the 5 quantized values nearest to Y_{h,w} in the output space (313 quantized values) are found, the distance d(k) from each of these neighbors to the true value is computed (k = 1, 2, 3, …, 313), and each distance is weighted proportionally with a Gaussian kernel of σ = 5. The probability that the ith pixel belongs to the kth quantized value can then be expressed as
Z_i(k) = \frac{\exp(-d(k)^2 / (2\sigma^2))}{\sum_{k' \in N_5} \exp(-d(k')^2 / (2\sigma^2))} for k in the set N_5 of the 5 nearest quantized values, and Z_i(k) = 0 otherwise.
The probability distribution of all colors over an image, Z ∈ [0,1]^{H×W×Q}, is obtained from the above formula. The multi-class cross-entropy loss function can then be designed as
L(\hat{Z}, Z) = -\sum_{h,w} v(Z_{h,w}) \sum_{q} Z_{h,w,q} \log \hat{Z}_{h,w,q}
where v(·) is a weighting term used to balance the loss function, which can be determined from the sparsity of the color classes. Finally, a function \hat{Y} = H(\hat{Z}) maps the predicted color probability distribution \hat{Z} to color values \hat{Y}.
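A NumPy sketch of the soft encoding and of the class-rebalanced cross-entropy described above follows. The array of 313 quantized ab bin centers and the per-bin rarity weights v are assumed to be precomputed from the training set; σ = 5 and 5 nearest neighbors follow the text.

```python
import numpy as np

def soft_encode(ab, bin_centers, sigma=5.0, k=5):
    """Convert true ab chroma (H, W, 2) to a soft distribution over Q quantized bins."""
    h, w, _ = ab.shape
    q = bin_centers.shape[0]                                      # Q = 313
    flat = ab.reshape(-1, 2)
    d = np.linalg.norm(flat[:, None, :] - bin_centers[None, :, :], axis=2)   # (HW, Q)
    nn = np.argsort(d, axis=1)[:, :k]                             # indices of the 5 nearest bins
    z = np.zeros((flat.shape[0], q), dtype=np.float32)
    rows = np.arange(flat.shape[0])[:, None]
    weights = np.exp(-d[rows, nn] ** 2 / (2 * sigma ** 2))        # Gaussian-weighted distances
    z[rows, nn] = weights / weights.sum(axis=1, keepdims=True)
    return z.reshape(h, w, q)

def rebalanced_cross_entropy(z_true, z_pred, v):
    """Multi-class cross entropy weighted per pixel; v holds the per-bin rarity weights (Q,)."""
    eps = 1e-8
    # v(Z_{h,w}) approximated here as the soft-encoding-weighted average of the bin weights
    pixel_weight = (z_true * v[None, None, :]).sum(axis=-1)
    ce = -(z_true * np.log(z_pred + eps)).sum(axis=-1)            # per-pixel cross entropy
    return (pixel_weight * ce).sum()
```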
Step 105: and training the dim light/infrared image chromaticity prediction network based on the preprocessed training data and the loss function to obtain the trained dim light/infrared image chromaticity prediction network.
On the basis of the network model, hyper-parameters of the low-light/infrared image natural-color fusion network such as the optimization function, learning rate, batch size and number of filters are set: the optimization function uses the Adam algorithm, the learning rate is set to l = 0.001, the network weights are initialized with the MSRA method, the biases are all initialized to 0, the activation function is ReLU, the batch size is set to 16 and the training period is set to 80 epochs. Finally, the network is trained with the Keras open-source framework on an NVIDIA GTX850 graphics card; the trained network can predict the chrominance information for a given pair of low-light and infrared input images.
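The training setup can be summarized in a few Keras lines matching the reported hyper-parameters; x_train (two-channel low-light/infrared patches) and z_train (soft-encoded chroma targets) are assumed to have been prepared as described, and the model builders come from the sketches above.

```python
from tensorflow.keras.optimizers import Adam

# MSRA weight initialization corresponds to kernel_initializer="he_normal" in Keras layers.
model = build_cp_net(build_feature_extractor())
model.compile(optimizer=Adam(learning_rate=0.001),       # Adam, l = 0.001
              loss="categorical_crossentropy")           # class rebalancing can be added via sample weights
model.fit(x_train, z_train, batch_size=16, epochs=80)    # batch size 16, 80 training epochs
```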
Step 106: and extracting the chromaticity information of the target image by adopting the trained glimmer/infrared image chromaticity prediction network.
Step 107: luminance information of the target image is extracted.
The invention uses the luminance information of the NRL pseudo-color fusion image as the source of luminance information for the final color fusion.
Step 108: and finally synthesizing the color image based on the brightness information and the chrominance information.
The final natural-color fusion image is obtained by combining this luminance information with the predicted chrominance information.
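A hedged sketch of this final synthesis step follows: the per-pixel color distribution is collapsed to an ab value (here simply its expectation over the bin centers, one common choice), combined with the luminance channel, and converted from Lab to RGB; skimage is assumed to be available for the conversion.

```python
import numpy as np
from skimage.color import lab2rgb

def synthesize_color(luminance, z_pred, bin_centers):
    """luminance: (H, W) in [0, 100]; z_pred: (H, W, Q) probabilities; bin_centers: (Q, 2)."""
    ab = z_pred @ bin_centers                        # expected ab value per pixel, (H, W, 2)
    lab = np.concatenate([luminance[..., None], ab], axis=-1)
    return lab2rgb(lab)                              # (H, W, 3) RGB image in [0, 1]
```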
The method selects an image fusion data set [10,11] provided by TNO Human Factors (the Netherlands Organisation for Applied Scientific Research). The data set contains registered low-light and infrared images together with color images of the corresponding scenes, but the color images are not registered with the low-light and infrared images. The multimodal image calibration technique described above is therefore first used to register the color images with the low-light/infrared images of the corresponding scenes, and finally 5 groups of images with similar scenes are selected as the training set. A typical group of training images comprises three parts (a), (b) and (c), which are, from left to right, the low-light image, the infrared image and the color image of the same scene.
During training, 5000 sets of low-light, infrared and color image patches of size 128 × 128 are randomly extracted from the training sample set, and data augmentation is applied with methods such as rotation and translation. The low-light and infrared patch pairs are then combined into a two-channel image as the network input, the chrominance information ab of the color patch is extracted as the output, and the network parameters are set as described above; this patch preparation is sketched below.
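The following illustrative sketch shows one way to prepare such patches; function and variable names are assumptions. Random 128 × 128 crops are taken from registered low-light, infrared and color images, the low-light and infrared crops are stacked into the two-channel network input, and the ab channels of the color crop form the target.

```python
import numpy as np
from skimage.color import rgb2lab

def sample_patch(lowlight, infrared, color_rgb, size=128, rng=np.random.default_rng()):
    """Return a (128, 128, 2) low-light/infrared input and its (128, 128, 2) ab chroma target."""
    h, w = lowlight.shape[:2]
    y = rng.integers(0, h - size + 1)
    x = rng.integers(0, w - size + 1)
    x_in = np.stack([lowlight[y:y+size, x:x+size],
                     infrared[y:y+size, x:x+size]], axis=-1)      # two-channel network input
    ab = rgb2lab(color_rgb[y:y+size, x:x+size])[..., 1:]          # ab chroma of the color patch
    return x_in, ab
```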
Finally, the trained model is tested on the test set to obtain natural-color fusion images, as shown in fig. 3, which comprises three parts (a), (b) and (c): part (a) is the low-light image, part (b) is the infrared image and part (c) is the color image. The color fusion image in fig. 3 exhibits the color characteristics of the reference images in the training set: houses are red, the sky is light blue, roads are gray and trees and grass are green; the colors are natural and rich and accord with human observation habits.
Fig. 5 is a schematic structural diagram of a deep learning-based dim light/infrared image color fusion system according to an embodiment of the present invention, where the system shown in fig. 5 includes: a training data acquisition module 201, a preprocessing module 202, a prediction network construction module 203, a loss function construction module 204, a training module 205, a chrominance information extraction module 206, a luminance information extraction module 207, and a color image synthesis module 208.
The training data obtaining module 201 is configured to obtain training data.
The preprocessing module 202 is configured to preprocess the training data.
The prediction network construction module 203 is used for constructing a low light/infrared image chromaticity prediction network.
The loss function building block 204 is used to build a loss function.
The training module 205 is configured to train the dim light/infrared image chromaticity prediction network based on the preprocessed training data and the loss function, so as to obtain a trained dim light/infrared image chromaticity prediction network.
The chrominance information extraction module 206 is configured to extract chrominance information of the target image by using the trained dim light/infrared image chrominance prediction network;
the brightness information extraction module 207 is used for extracting the brightness information of the target image;
the color image synthesis module 208 is configured to finally synthesize a color image based on the luminance information and the chrominance information.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (10)

1. A dim light/infrared image color fusion method based on deep learning is characterized in that the fusion method comprises the following steps:
acquiring training data;
preprocessing the training data;
constructing a low-light/infrared image chromaticity prediction network;
constructing a loss function;
training the dim light/infrared image chromaticity prediction network based on the preprocessed training data and the loss function to obtain a trained dim light/infrared image chromaticity prediction network;
extracting the chromaticity information of the target image by adopting the trained dim light/infrared image chromaticity prediction network;
extracting brightness information of a target image;
and finally synthesizing the color image based on the brightness information and the chrominance information.
2. The deep learning-based micro/infrared image color fusion method according to claim 1, wherein preprocessing the training data specifically comprises:
and continuously preprocessing the training data by adopting a multimode image calibration technology.
3. The deep learning-based micro light/infrared image color fusion method according to claim 1, wherein the micro light/infrared image chromaticity prediction network comprises: a feature extraction sub-network and a chroma prediction sub-network.
4. The deep learning-based micro/infrared image color fusion method according to claim 1, wherein the loss function specifically adopts the following formula:
L(\hat{Z}, Z) = -\sum_{h,w} v(Z_{h,w}) \sum_{q} Z_{h,w,q} \log \hat{Z}_{h,w,q}
where h denotes the height coordinate of the training image, w denotes the width coordinate of the training image, v(Z_{h,w}) denotes the weight reflecting the sparsity of the chrominance information in the training set, Z_{h,w} denotes the chrominance value of maximum probability for the pixel at coordinates (h, w), q denotes a quantized value of the chrominance space, Z_{h,w,q} denotes the probability that the chrominance of the pixel at coordinates (h, w) is q, and \hat{Z}_{h,w,q} denotes the predicted value of that probability.
5. The deep learning based micro-optic/infrared image color fusion method as claimed in claim 3, wherein the network type of the feature extraction sub-network is an encoder-decoder network type, comprising an encoding network unit and a decoding network unit; the coding network unit adopts a conventional convolution unit and a hole convolution unit; the decoding network unit adopts a transposition convolution unit;
the chrominance extraction sub-network adopts a transposed convolution unit.
6. The deep learning-based micro/infrared image color fusion method as claimed in claim 5, wherein the coding network units in the feature extraction sub-network specifically comprise a first conventional convolution unit, a second conventional convolution unit, a third conventional convolution unit and a fourth hole convolution unit;
the decoding network unit in the feature extraction sub-network specifically includes: a fifth transposed convolution unit and a sixth transposed convolution unit;
the transposed convolution unit in the chrominance extraction sub-network specifically includes a seventh transposed convolution unit.
7. The deep learning based micro/infrared image color fusion method according to claim 6, wherein the first conventional convolution unit Conv1 comprises a first conventional convolution layer Conv1_1 and a second conventional convolution layer Conv1_2, the Conv1_1 has a convolution kernel size of 2, a convolution kernel number of 64, a step size of 1, and an input channel of 3, the Conv1_2 has a convolution kernel size of 3, a convolution kernel number of 64, a step size of 2, and an input channel of 64;
the second conventional convolution unit Conv2 has a convolution kernel size of 3, a convolution kernel number of 128, a step size of 2, and an input channel of 64;
the convolution kernel size of the third conventional convolution unit Conv3 is 2, the number of convolution kernels is 64, the step is 1, and the input channel is 3;
the fourth hole convolution unit DilaConv4 includes a first hole convolution layer DilaConv4_1 and a second hole convolution layer DilaConv4_2, the convolution kernel size of the first hole convolution layer DilaConv4_1 is 3, the number of convolution kernels is 256, the step is 1, and the input channel is 256, the convolution kernel size of the second hole convolution layer DilaConv4_2 is 3, the number of convolution kernels is 256, the step is 1, and the input channel is 256;
the size of a convolution kernel of the fifth transpose convolution unit TransConv5 is 3, the number of the convolution kernels is 128, the stride is 1, and the input channel is 128;
the size of a convolution kernel of the sixth transpose convolution unit TransConv6 is 3, the number of the convolution kernels is 64, the stride is 1, and the input channel is 256;
the seventh transpose convolution unit transcconv 7 has a convolution kernel size of 3, a number of convolution kernels of 313, a stride of 1, and an input channel of 128.
8. The deep learning based micro light/infrared image color fusion method according to claim 1, further comprising, after constructing a micro light/infrared image chroma prediction network: and adopting BN technology to carry out normalization processing on the output characteristics of each level in the dim light/infrared image chromaticity prediction network.
9. A low-light/infrared image color fusion system based on deep learning, the system comprising:
the training data acquisition module is used for acquiring training data;
the preprocessing module is used for preprocessing the training data;
the prediction network construction module is used for constructing a low-light/infrared image chromaticity prediction network;
the loss function constructing module is used for constructing a loss function;
the training module is used for training the dim light/infrared image chromaticity prediction network based on the preprocessed training data and the loss function to obtain the trained dim light/infrared image chromaticity prediction network;
the chrominance information extraction module is used for extracting chrominance information of the target image by adopting the trained dim light/infrared image chrominance prediction network;
the brightness information extraction module is used for extracting the brightness information of the target image;
and the color image synthesis module is used for finally synthesizing a color image based on the brightness information and the chrominance information.
10. The deep learning based micro-optic/infrared image color fusion system of claim 9 wherein the pre-processing module specifically includes continuing pre-processing the training data using a multi-modal image calibration technique.
CN202010175703.3A 2020-03-13 2020-03-13 Low-light-level/infrared image color fusion method and system based on deep learning Pending CN111402306A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010175703.3A CN111402306A (en) 2020-03-13 2020-03-13 Low-light-level/infrared image color fusion method and system based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010175703.3A CN111402306A (en) 2020-03-13 2020-03-13 Low-light-level/infrared image color fusion method and system based on deep learning

Publications (1)

Publication Number Publication Date
CN111402306A true CN111402306A (en) 2020-07-10

Family

ID=71432416

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010175703.3A Pending CN111402306A (en) 2020-03-13 2020-03-13 Low-light-level/infrared image color fusion method and system based on deep learning

Country Status (1)

Country Link
CN (1) CN111402306A (en)



Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102298769A (en) * 2011-06-11 2011-12-28 浙江理工大学 Colored fusion method of night vision low-light image and infrared image based on color transmission
US20150326863A1 (en) * 2012-06-29 2015-11-12 Canon Kabushiki Kaisha Method and device for encoding or decoding and image
US20190378258A1 (en) * 2017-02-10 2019-12-12 Hangzhou Hikvision Digital Technology Co., Ltd. Image Fusion Apparatus and Image Fusion Method
CN108133470A (en) * 2017-12-11 2018-06-08 深圳先进技术研究院 Infrared image and low-light coloured image emerging system and method
CN110120028A (en) * 2018-11-13 2019-08-13 中国科学院深圳先进技术研究院 A kind of bionical rattle snake is infrared and twilight image Color Fusion and device
CN109712105A (en) * 2018-12-24 2019-05-03 浙江大学 A kind of image well-marked target detection method of combination colour and depth information
CN110276731A (en) * 2019-06-17 2019-09-24 艾瑞迈迪科技石家庄有限公司 Endoscopic image color restoring method and device

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
RICHARD ZHANG, PHILLIP ISOLA, ALEXEI A. EFROS: "Colorful Image Colorization", vol. 9907, pages 649 - 666 *
YUE LI, LI LI, ZHU LI, JIANCHAO YANG, NING XU, DONG LIU, HOUQIANG LI: "A HYBRID NEURAL NETWORK FOR CHROMA INTRA PREDICTION", pages 1797 - 1801 *
何芳州: "Color face recognition algorithm fusing chrominance and luminance features", vol. 42, no. 07, pages 74 - 78 *
徐中辉, 吕维帅: "Image colorization based on convolutional neural networks", vol. 44, no. 10, pages 19 - 22 *
潘豪亮, 闫青, 徐奕, 梁龙飞, 杨小康: "Image gradient extraction method fusing luminance and chrominance information", vol. 38, no. 09, pages 18 - 24 *
苗启广 et al.: "Multi-Sensor Image Fusion Technology and Applications", vol. 1, Xidian University, pages 29 - 31 *
黄应清, 齐鸥, 蒋晓瑜, 刘中喧: "New objective evaluation index for pseudo-color image fusion quality", vol. 47, no. 8, pages 368 - 394 *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111898671B (en) * 2020-07-27 2022-05-24 中国船舶工业综合技术经济研究院 Target identification method and system based on fusion of laser imager and color camera codes
CN111898671A (en) * 2020-07-27 2020-11-06 中国船舶工业综合技术经济研究院 Target identification method and system based on fusion of laser imager and color camera codes
CN111967530A (en) * 2020-08-28 2020-11-20 南京诺源医疗器械有限公司 Fluorescence area identification method of medical fluorescence imaging system
CN112164017A (en) * 2020-09-27 2021-01-01 中国兵器工业集团第二一四研究所苏州研发中心 Deep learning-based polarization colorization method
CN112164017B (en) * 2020-09-27 2023-11-17 中国兵器工业集团第二一四研究所苏州研发中心 Polarization colorization method based on deep learning
TWI773526B (en) * 2020-09-29 2022-08-01 大陸商北京靈汐科技有限公司 Image processing method, device, computer equipment and storage medium
CN113012087A (en) * 2021-03-31 2021-06-22 中南大学 Image fusion method based on convolutional neural network
CN113012087B (en) * 2021-03-31 2022-11-04 中南大学 Image fusion method based on convolutional neural network
CN113191970A (en) * 2021-04-24 2021-07-30 北京理工大学 Orthogonal color transfer network and method
CN113191970B (en) * 2021-04-24 2022-10-21 北京理工大学 Orthogonal color transfer network and method
WO2022257184A1 (en) * 2021-06-09 2022-12-15 烟台艾睿光电科技有限公司 Method for acquiring image generation apparatus, and image generation apparatus
CN113609893A (en) * 2021-06-18 2021-11-05 大连民族大学 Low-illuminance indoor human body target visible light feature reconstruction method and network based on infrared camera
CN113609893B (en) * 2021-06-18 2024-04-16 大连民族大学 Low-illuminance indoor human body target visible light characteristic reconstruction method and network based on infrared camera
CN113643202A (en) * 2021-07-29 2021-11-12 西安理工大学 Low-light-level image enhancement method based on noise attention map guidance
CN113920172A (en) * 2021-12-14 2022-01-11 成都睿沿芯创科技有限公司 Target tracking method, device, equipment and storage medium
CN116309216A (en) * 2023-02-27 2023-06-23 南京博视医疗科技有限公司 Pseudo-color image fusion method and image fusion system based on multiple wave bands
CN116309216B (en) * 2023-02-27 2024-01-09 南京博视医疗科技有限公司 Pseudo-color image fusion method and image fusion system based on multiple wave bands

Similar Documents

Publication Publication Date Title
CN111402306A (en) Low-light-level/infrared image color fusion method and system based on deep learning
CN111709902B (en) Infrared and visible light image fusion method based on self-attention mechanism
CN112507793B (en) Ultra-short term photovoltaic power prediction method
CN107392130B (en) Multispectral image classification method based on threshold value self-adaption and convolutional neural network
CN113283444B (en) Heterogeneous image migration method based on generation countermeasure network
CN109472837A (en) The photoelectric image conversion method of confrontation network is generated based on condition
CN109949353A (en) A kind of low-light (level) image natural sense colorization method
CN110930308B (en) Structure searching method of image super-resolution generation network
CN117726550B (en) Multi-scale gating attention remote sensing image defogging method and system
CN113506275B (en) Urban image processing method based on panorama
CN113112441B (en) Multi-band low-resolution image synchronous fusion method based on dense network and local brightness traversal operator
CN114067018A (en) Infrared image colorization method for generating countermeasure network based on expansion residual error
Qian et al. Fast color contrast enhancement method for color night vision
CN114119356A (en) Method for converting thermal infrared image into visible light color image based on cycleGAN
CN116189021B (en) Multi-branch intercrossing attention-enhanced unmanned aerial vehicle multispectral target detection method
CN116258653B (en) Low-light level image enhancement method and system based on deep learning
CN114638764B (en) Multi-exposure image fusion method and system based on artificial intelligence
Richart et al. Image colorization with neural networks
CN115587961A (en) Cell imaging method based on multi-exposure image fusion technology
CN111881924B (en) Dark-light vehicle illumination identification method combining illumination invariance and short-exposure illumination enhancement
CN114529488A (en) Image fusion method, device and equipment and storage medium
CN112203072A (en) Aerial image water body extraction method and system based on deep learning
CN102667853A (en) Filter setup learning for binary sensor
CN111402223A (en) Transformer substation defect problem detection method using transformer substation video image
Makwana et al. Efficient color transfer method based on colormap clustering for night vision applications

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination