CN110443865B - Multispectral imaging method and device based on RGB camera and deep neural network

Info

Publication number
CN110443865B
Authority
CN
China
Prior art keywords
image
layer
multispectral
neural network
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910690025.1A
Other languages
Chinese (zh)
Other versions
CN110443865A (en
Inventor
边丽蘅
傅毫
张军
曹先彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT
Priority to CN201910690025.1A
Publication of CN110443865A
Application granted
Publication of CN110443865B
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)
  • Color Television Image Signal Generators (AREA)
  • Processing Of Color Television Signals (AREA)

Abstract

The invention discloses a multispectral imaging method and device based on an RGB camera and a deep neural network. The method comprises the following steps: designing a convolutional neural network whose input is a mosaic image and whose output is the corresponding multispectral image; training the convolutional neural network on an existing multispectral data set to obtain an optimal network model; and capturing a current mosaic image of an actual scene with an RGB color camera and feeding the mosaic image to the optimal network model to obtain a target multispectral image. The method uses the deep neural network to perform computational imaging directly from the mosaic image to the multispectral image, is better suited to practical application, and is simple to implement.

Description

Multispectral imaging method and device based on RGB camera and deep neural network
Technical Field
The invention relates to the technical field of computational photography, in particular to a multispectral imaging method and device based on an RGB (red, green and blue) camera and a deep neural network.
Background
Multispectral imaging, which emerged in the 1980s, combines spectroscopy with imaging to obtain information in multiple wavelength bands for every pixel of an image. Whereas an ordinary color camera such as an RGB camera has three spectral channels, a multispectral imaging system generally has dozens or even hundreds. Each spectral band can be regarded as a static grayscale image representing the intensity in that band, so the set of images across bands carries more spatial and spectral information about the observed object. Compared with traditional imaging, multispectral imaging therefore allows an observation target to be understood more comprehensively, clearly and accurately, and it is widely used in military, industrial, agricultural and other fields. Current mainstream multispectral imaging techniques gain spectral resolution by sacrificing spatial or temporal resolution. How to acquire images with high temporal, spatial and spectral resolution at the same time has become a research hotspot in computational photography, which shows that work in this direction is important and widely applicable.
At present, a single-sensor color camera places a color filter array (CFA) in front of the detector array and thereby acquires an image in which each of the three channels, red, green and blue (R, G, B), has incomplete spatial information, namely a mosaic image. A corresponding algorithm then fills in the missing information of the three channels to achieve color imaging.
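The mosaic sampling described above can be illustrated with a minimal sketch. It assumes an RGGB Bayer layout (the text here only says "color filter array") and uses plain Python lists to stay dependency-free; the function name `bayer_mosaic` and the toy image are hypothetical.

```python
def bayer_mosaic(rgb):
    """Keep one color sample per pixel, following an assumed RGGB Bayer layout.

    rgb is a list of rows; each row is a list of (r, g, b) tuples.
    Returns a single-channel mosaic with the same height and width.
    """
    mosaic = []
    for i, row in enumerate(rgb):
        out_row = []
        for j, (r, g, b) in enumerate(row):
            if i % 2 == 0 and j % 2 == 0:
                out_row.append(r)   # red site: even row, even column
            elif i % 2 == 1 and j % 2 == 1:
                out_row.append(b)   # blue site: odd row, odd column
            else:
                out_row.append(g)   # the two remaining sites are green
        mosaic.append(out_row)
    return mosaic


# 2 x 2 toy image: every pixel has a full (r, g, b) triple before filtering.
image = [[(10, 20, 30), (11, 21, 31)],
         [(12, 22, 32), (13, 23, 33)]]
print(bayer_mosaic(image))  # [[10, 21], [22, 33]]
```

Each output pixel keeps only one of its three color values; the other two are the missing information that a demosaicing algorithm would normally have to fill in.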
On this basis, studies have reconstructed a multispectral image directly from an RGB color image using either compressive sensing or deep learning. The compressive sensing algorithms exploit the sparsity of images: a complete set of sparse bases is first trained on an existing multispectral data set; the RGB image is then represented in these bases and the sparsest representation coefficients are solved for; finally the coefficients are used to solve back for the corresponding multispectral image. Such an algorithm first requires the sparse basis to be complete, otherwise the reconstruction quality degrades. Second, it requires a complete three-channel RGB spatial-domain image, whereas in practice a camera only acquires incomplete spatial information in the three RGB channels (the mosaic image) and then fills in the missing information algorithmically (demosaicing), so the chosen demosaicing algorithm also affects the reconstruction result. The deep learning algorithms first interpolate the complete three-channel RGB spatial information to obtain spatial information for multiple channels, and then feed this multi-channel information to a network that outputs the corresponding multispectral image. These algorithms also require a complete three-channel RGB spatial-domain image. At present, no algorithm can recover a multispectral image directly from the data collected by the detector (the mosaic image).
Disclosure of Invention
The present application is based on the recognition and discovery by the inventors of the following problems:
recently, neural networks have achieved great success in computer vision tasks such as object classification, target detection and super resolution, and algorithms based on neural networks mostly outperform traditional algorithms. Deep neural networks mainly take the form of convolutional neural networks, in which convolving the image extracts its features; an optimal model is obtained by training on existing data and is then used to carry out the corresponding task. In recent years, with improvements in algorithms and hardware, neural network structures have become increasingly complex, deeper and wider, and their practical performance has improved greatly. One characteristic of deep neural networks is that the more similar the test data are to the training set, the better the network tends to perform; since the spectra of substances in nature share a certain similarity, the present invention uses an algorithm based on a deep neural network for multispectral imaging.
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, one objective of the present invention is to provide a multispectral imaging method based on an RGB camera and a deep neural network, which can directly complete the computational imaging from a mosaic image to a multispectral image by using the deep neural network, and is more suitable for practical application and simple and easy to implement.
Another objective of the present invention is to provide a multispectral imaging device based on an RGB camera and a deep neural network.
In order to achieve the above object, an embodiment of the present invention provides a multispectral imaging method based on an RGB camera and a deep neural network, including the following steps: designing a convolutional neural network with input being a mosaic image and output being a corresponding multispectral image; training the convolutional neural network by utilizing the existing multispectral data set to obtain an optimal network model; and shooting by adopting an RGB color camera to obtain a current mosaic image of an actual scene, and taking the mosaic image as the input of the optimal network model to obtain a target multispectral image.
According to the multispectral imaging method based on the RGB camera and the deep neural network, a mosaic image is captured with an RGB color camera and the deep neural network reconstructs the multispectral image from it, finally achieving joint spatial-spectral acquisition of a natural scene. Computational imaging from the mosaic image to the multispectral image is thus completed directly by the deep neural network, which is better suited to practical application and simple to implement.
In addition, the multispectral imaging method based on the RGB camera and the deep neural network according to the above embodiment of the present invention may further have the following additional technical features:
further, in one embodiment of the present invention, the convolutional neural network includes a residual learning-based neural network, a multi-scale structure-based neural network, and a parallel-multi-scale structure-based neural network.
Further, in an embodiment of the present invention, training the convolutional neural network by using the existing multispectral dataset includes: taking the training data set $\{x_i, y_i\}$ ($i = 1, 2, 3, \dots, N$) as the multispectral dataset, where $x_i$ is the input mosaic image and $y_i$ is the corresponding multispectral image, and feeding $x$ into the network model as input to obtain the corresponding output:
y′=f(x,p),
wherein p is a parameter corresponding to the network model, and y' is a multispectral image predicted by the model.
Further, in one embodiment of the present invention, a corresponding loss function is defined and its value is then minimized to optimize the network, wherein the definition of the loss function includes a one-norm based algorithm, a two-norm based algorithm, a structural similarity index based algorithm, and a multi-scale structural similarity index based algorithm.
Further, in an embodiment of the present invention, the method further includes: collecting the mosaic image with an RGB color camera.
In order to achieve the above object, another embodiment of the present invention provides a multispectral imaging device based on an RGB camera and a deep neural network, including: the design module is used for designing a convolutional neural network with input being a mosaic image and output being a corresponding multispectral image; the training module is used for training the convolutional neural network by utilizing the existing multispectral data set to obtain an optimal network model; and the shooting module is used for obtaining a current mosaic image of an actual scene by adopting an RGB (red, green and blue) color camera for shooting, and obtaining a target multispectral image by taking the mosaic image as the input of the optimal network model.
According to the multispectral imaging device based on the RGB camera and the deep neural network, a mosaic image is captured with an RGB color camera and the deep neural network reconstructs the multispectral image from it, finally achieving joint spatial-spectral acquisition of a natural scene. Computational imaging from the mosaic image to the multispectral image is thus completed directly by the deep neural network, which is better suited to practical application and simple to implement.
In addition, the multispectral imaging device based on the RGB camera and the deep neural network according to the above embodiment of the present invention may further have the following additional technical features:
further, in one embodiment of the present invention, the convolutional neural network includes a residual learning-based neural network, a multi-scale structure-based neural network, and a parallel-multi-scale structure-based neural network.
Further, in one embodiment of the present invention, the training module is further configured to take the training data set $\{x_i, y_i\}$ ($i = 1, 2, 3, \dots, N$) as the multispectral dataset, where $x_i$ is the input mosaic image and $y_i$ is the corresponding multispectral image, and to feed $x$ into the network model as input to obtain the corresponding output:
y′=f(x,p),
wherein p is a parameter corresponding to the network model, and y' is a multispectral image predicted by the model.
Further, in one embodiment of the present invention, a corresponding loss function is defined and its value is then minimized to optimize the network, wherein the definition of the loss function includes a one-norm based algorithm, a two-norm based algorithm, a structural similarity index based algorithm, and a multi-scale structural similarity index based algorithm.
Further, in an embodiment of the present invention, the device further includes: an acquisition module for acquiring the mosaic image with the RGB color camera.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flowchart of a multispectral imaging method based on an RGB camera and a deep neural network according to an embodiment of the present invention;
FIG. 2 is a flow chart of obtaining a network model according to an embodiment of the invention;
FIG. 3 is a schematic diagram of a multispectral imaging process according to an embodiment of the invention;
FIG. 4 is a block diagram of a deep neural network based on residual learning according to an embodiment of the present invention;
FIG. 5 is a block diagram of a deep neural network based on a multi-scale structure according to an embodiment of the present invention;
FIG. 6 is a block diagram of a deep neural network based on a parallel-multiscale structure according to an embodiment of the present invention;
FIG. 7 is a diagram of an imaging model based on a Bayer filter array according to an embodiment of the invention;
FIG. 8 is a schematic structural diagram of a multispectral imaging device based on an RGB camera and a deep neural network according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary, are intended to explain the invention, and are not to be construed as limiting it.
Hereinafter, a multispectral imaging method and device based on an RGB camera and a deep neural network according to embodiments of the present invention are described with reference to the accompanying drawings, beginning with the method.
Fig. 1 is a flowchart of a multispectral imaging method based on an RGB camera and a deep neural network according to an embodiment of the present invention.
As shown in fig. 1, the multispectral imaging method based on an RGB camera and a deep neural network includes the following steps:
in step S101, a convolutional neural network is designed whose input is a mosaic image and whose output is the corresponding multispectral image.
It will be appreciated that the convolutional neural network is first structured so that its input is a mosaic image and its output is a corresponding multi-spectral image. The embodiment of the invention designs a deep neural network with a specific structure to realize the joint acquisition of a space domain and a spectral domain of a natural scene.
Wherein, in one embodiment of the present invention, the mosaic image may be captured by an RGB color camera.
Specifically, the embodiment of the present invention may design a convolutional neural network whose input is the mosaic image acquired directly by an RGB color camera and whose output is the corresponding multispectral image. The sensor surface of the RGB color camera is covered by a color filter array; commonly used color filter arrays include, but are not limited to, the Bayer color filter array (Bayer CFA).
Further, in one embodiment of the present invention, the convolutional neural network includes a residual learning-based neural network, a multi-scale structure-based neural network, and a parallel-multi-scale structure-based neural network.
Specifically, fig. 2 shows how the optimal network model is obtained, and fig. 3 shows the multispectral imaging process. First, a deep neural network is designed whose input is the mosaic image acquired directly by an RGB color camera and whose output is the corresponding multispectral image. Commonly used structures include, but are not limited to: a neural network based on residual learning, a neural network based on a multi-scale structure, and a neural network based on a parallel-multi-scale structure. Each of the three is described below.
FIG. 4 shows a network designed on the basis of residual learning; the network comprises two parts. The input of the first part is the mosaic image acquired directly by the RGB color camera. The mosaic image first passes through an input module containing $c_1$ convolutional layers and $a_1$ activation layers, then through K feature extraction layers, each containing $c_2$ convolutional layers and $a_2$ activation layers, and finally through $c_3$ convolutional layers and $a_3$ activation layers. The output of these layers is added to the input of the first part, and the sum is sent to the second part. The second part comprises two processes. The first is a dimension-raising process: the 3 channels represent different spectral channels of the image, each channel is processed separately by $c_4$ convolutional layers and $a_4$ activation layers, and the results are added and passed to the next process. The second process first passes through $c_5$ convolutional layers and $a_5$ activation layers, then through M feature extraction layers, each containing $c_6$ convolutional layers and $a_6$ activation layers, and finally through $c_7$ convolutional layers and $a_7$ activation layers. The output of this process is added to the dimension-raised data to obtain the final multispectral image.
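The two skip connections in this description can be made concrete with a small data-flow sketch. It is only an illustration under stated assumptions: tensors are flattened to Python lists, and the callables `part1_body`, `lift` and `part2_body` are hypothetical stand-ins for the convolution and activation stacks, not the layers of FIG. 4.

```python
def add(a, b):
    """Element-wise sum of two equally long flat lists (a skip connection)."""
    if len(a) != len(b):
        raise ValueError("a skip connection needs matching sizes")
    return [u + v for u, v in zip(a, b)]


def residual_network(x, part1_body, lift, part2_body):
    """Data flow of the two-part residual design, with stand-in callables."""
    refined = add(x, part1_body(x))          # part 1: layer output + part-1 input
    lifted = lift(refined)                   # part 2, process 1: dimension raising
    return add(lifted, part2_body(lifted))   # part 2, process 2: output + lifted data


# Toy stand-ins, chosen only so the arithmetic is easy to follow.
out = residual_network(
    [1.0, 2.0],
    part1_body=lambda v: [0.5 * e for e in v],
    lift=lambda v: [e for e in v for _ in range(2)],   # doubles the data size
    part2_body=lambda v: [1.0 for _ in v],
)
print(out)  # [2.5, 2.5, 4.0, 4.0]
```

In both parts the stacked layers only have to learn a correction that is added to data passed around them, which is the defining feature of residual learning.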
Fig. 5 shows a network designed on the basis of a multi-scale structure; it comprises two parts. The first part is a dimension-raising part: the 3 channels represent the mosaic images of the red, green and blue channels of the RGB image, convolution and activation are applied to each of the three channels through $c_1$ convolutional layers and $a_1$ activation layers, and the results are added and passed to the second part. The second part contains K down-sampling operations and K up-sampling operations. Each down-sampling operation halves the resolution of the image and doubles the number of channels. A down-sampling operation comprises a Conv_block and a pooling layer, where one Conv_block contains $c_2$ convolutional layers, $a_2$ activation layers and $b_2$ normalization layers. An up-sampling operation comprises an Up_block, a cross-layer connection and a Conv_block, where one Up_block contains an interpolation operation together with $c_3$ convolutional layers, $a_3$ activation layers and $b_3$ normalization layers. The cross-layer connection joins, in parallel, the image from the down-sampling process with the image of the same resolution in the up-sampling process. Finally, $c_4$ convolutional layers output the multispectral image.
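The shape bookkeeping of this multi-scale structure can be sketched as follows. The halving of resolution and doubling of channels on the way down come from the description; that each up-sampling step exactly mirrors a down-sampling step (doubling resolution, halving channels) is an assumption made here, and the function names and the example sizes are hypothetical.

```python
def multiscale_shapes(height, width, channels, k):
    """List (height, width, channels) before and after each of the 2*k steps."""
    if k < 0 or height % (2 ** k) or width % (2 ** k):
        raise ValueError("height and width must be divisible by 2**k")
    shapes = [(height, width, channels)]
    for _ in range(k):                       # K down-sampling operations
        h, w, c = shapes[-1]
        shapes.append((h // 2, w // 2, c * 2))
    for _ in range(k):                       # K up-sampling operations (assumed mirror)
        h, w, c = shapes[-1]
        shapes.append((h * 2, w * 2, c // 2))
    return shapes


def skip_pairs(k):
    """Index pairs (down, up) into the shape list that share a resolution."""
    return [(i, 2 * k - i) for i in range(k)]


print(multiscale_shapes(64, 64, 32, 2))
# [(64, 64, 32), (32, 32, 64), (16, 16, 128), (32, 32, 64), (64, 64, 32)]
print(skip_pairs(2))  # [(0, 4), (1, 3)]
```

The pairs returned by `skip_pairs` are the positions where a cross-layer connection can join feature maps, because only those have matching resolution.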
Fig. 6 shows a network designed on the basis of a parallel-multi-scale structure; it comprises two parts. The first part is a dimension-raising part: the 3 channels represent the mosaic images of the red, green and blue channels of the RGB image, convolution and activation are applied to each of the three channels through $c_1$ convolutional layers and $a_1$ activation layers, and the results are added and passed to the second part. The second part first passes through a Conv_block and 4 Bottlenecks. The Conv_block contains $c_2$ convolutional layers, $a_2$ activation layers and $b_2$ normalization layers, and one Bottleneck contains $c_3$ convolutional layers, $a_3$ activation layers and $b_3$ normalization layers. Then 4 parallel sub-networks follow, where each row represents one sub-network (K sub-networks in the general case); with each successive row the resolution of the image is halved and the number of channels is doubled. In each sub-network, features are extracted by 4 basic blocks, each containing $c_4$ convolutional layers, $a_4$ activation layers and $b_4$ normalization layers. Information is then fused between the different sub-networks: a strided 3×3 convolutional layer halves the image resolution and doubles the number of channels, while one interpolation operation followed by one convolutional layer doubles the image resolution and halves the number of channels. Finally, $c_5$ convolutional layers output the multispectral image.
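As a rough illustration of the parallel branches and their fusion, the sketch below lists the feature-map shape of each row and the resampling steps needed to move information from one row to another. Chaining one step per row crossed is an assumption made here; the description only states what a single step does. The names `branch_shapes` and `fusion_path` and the example sizes are hypothetical.

```python
def branch_shapes(height, width, channels, k):
    """Shape (height, width, channels) of each of the k parallel sub-networks.

    Row r has its resolution halved r times and its channel count doubled r times.
    """
    return [(height >> r, width >> r, channels << r) for r in range(k)]


def fusion_path(src_row, dst_row):
    """Resampling steps that carry features from src_row to dst_row."""
    if dst_row > src_row:
        # Going down: halve resolution, double channels, once per row.
        return ["strided 3x3 conv"] * (dst_row - src_row)
    # Going up (or staying): double resolution, halve channels, once per row.
    return ["interpolate + conv"] * (src_row - dst_row)


print(branch_shapes(64, 64, 32, 4))
# [(64, 64, 32), (32, 32, 64), (16, 16, 128), (8, 8, 256)]
print(fusion_path(0, 2))  # ['strided 3x3 conv', 'strided 3x3 conv']
print(fusion_path(3, 2))  # ['interpolate + conv']
```

Unlike the multi-scale structure of Fig. 5, every resolution is kept alive in its own row, so fusion is what lets high-resolution and low-resolution features inform each other.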
Of the three structures, the network based on residual learning has relatively few parameters and is simpler to train, but its final performance is lower than that of the other two; the networks based on the multi-scale structure and the parallel-multi-scale structure achieve better performance, but have more parameters, are harder to train and take longer.
In step S102, the convolutional neural network is trained using the existing multispectral data set to obtain an optimal network model.
It can be understood that, in the embodiment of the present invention, the network may be trained through the existing multispectral dataset on the basis of step S101, so as to obtain an optimal network model. That is, after a network of a particular architecture is designed, it is trained end-to-end. The network is trained by using the existing multispectral data set as a training set, and finally an optimal network model is obtained.
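The text does not say how the mosaic inputs of the training set are derived from the multispectral data. One possible approach, offered here purely as an assumption, is to simulate them: weight each pixel's spectrum by a camera spectral response and keep only the channel selected by an assumed RGGB Bayer layout. The response values, band ordering and function name below are hypothetical.

```python
def simulate_mosaic(cube, response):
    """Simulate a single-channel mosaic from a multispectral cube.

    cube[i][j] is a list of B band intensities for pixel (i, j).
    response holds 3 rows (R, G, B), each with B hypothetical band weights.
    An RGGB Bayer layout is assumed.
    """
    mosaic = []
    for i, row in enumerate(cube):
        out_row = []
        for j, spectrum in enumerate(row):
            if i % 2 == 0 and j % 2 == 0:
                ch = 0                      # red site
            elif i % 2 == 1 and j % 2 == 1:
                ch = 2                      # blue site
            else:
                ch = 1                      # green site
            out_row.append(sum(w * s for w, s in zip(response[ch], spectrum)))
        mosaic.append(out_row)
    return mosaic


# Toy 2 x 2 cube with 4 bands per pixel and made-up response weights.
spectrum = [1.0, 2.0, 3.0, 4.0]
cube = [[spectrum, spectrum], [spectrum, spectrum]]
response = [[0.0, 0.0, 0.0, 1.0],    # R: last band only
            [0.0, 0.5, 0.5, 0.0],    # G: average of the middle bands
            [1.0, 0.0, 0.0, 0.0]]    # B: first band only
print(simulate_mosaic(cube, response))  # [[4.0, 2.5], [2.5, 1.0]]
```

Under this assumption each multispectral image in the dataset yields one training pair: the simulated mosaic as input and the original cube as target.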
In one embodiment of the present invention, a corresponding loss function is defined and its value is then minimized to optimize the network, wherein the loss function may be defined using, among others: a one-norm based algorithm, a two-norm based algorithm, a structural similarity index based algorithm, and a multi-scale structural similarity index based algorithm.
Specifically, the network is trained end to end: the input is a mosaic image and the output is the corresponding multispectral image. The training data set $\{x_i, y_i\}$ ($i = 1, 2, 3, \dots, N$) is taken as the multispectral dataset, where $x_i$ is the input mosaic image and $y_i$ is the corresponding multispectral image, and $x$ is fed into the network model as input to obtain the corresponding output:
y′=f(x,p),
wherein p is a parameter corresponding to the network model, and y′ is the multispectral image predicted by the model. The optimal network is obtained by solving an optimization problem, including but not limited to the following:
(1) $\min_p \|y' - y\|_1$;
(2) $\min_p \|y' - y\|_2$;
(3) $\min_p |1 - \mathrm{SSIM}(y', y)|$, where SSIM is the structural similarity index.
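The three objectives above can be written out as a minimal sketch. Images are flattened to plain Python lists, and the SSIM here is computed once over the whole signal with the commonly used constants 0.01 and 0.03, which is a simplification: the usual SSIM averages the same expression over local windows. The function names are hypothetical.

```python
import math


def l1_loss(pred, target):
    """One-norm of the difference between prediction and ground truth."""
    return sum(abs(p - t) for p, t in zip(pred, target))


def l2_loss(pred, target):
    """Two-norm of the difference between prediction and ground truth."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)))


def global_ssim(pred, target, data_range=1.0):
    """SSIM evaluated over the whole signal (no sliding window)."""
    n = len(pred)
    mu_p = sum(pred) / n
    mu_t = sum(target) / n
    var_p = sum((p - mu_p) ** 2 for p in pred) / n
    var_t = sum((t - mu_t) ** 2 for t in target) / n
    cov = sum((p - mu_p) * (t - mu_t) for p, t in zip(pred, target)) / n
    c1 = (0.01 * data_range) ** 2   # stabilizes the luminance term
    c2 = (0.03 * data_range) ** 2   # stabilizes the contrast/structure term
    return ((2 * mu_p * mu_t + c1) * (2 * cov + c2)) / (
        (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2))


def ssim_loss(pred, target):
    """Objective (3): distance of the SSIM from its ideal value 1."""
    return abs(1.0 - global_ssim(pred, target))


y = [0.2, 0.4, 0.6]
y_pred = [0.2, 0.5, 0.6]
print(l1_loss(y_pred, y), l2_loss(y_pred, y), ssim_loss(y_pred, y))
```

All three losses are zero for a perfect prediction; the norm-based ones penalize per-pixel error, while the SSIM-based one penalizes loss of mean, contrast and structure agreement.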
The optimization is mainly realized by back propagation of a neural network, and the adopted algorithm comprises, but is not limited to, the following algorithms: SGD, Adam, etc.
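To make the optimization step tangible, here is a deliberately tiny sketch of stochastic gradient descent. It replaces the convolutional network with a hypothetical one-parameter model f(x, p) = p * x and minimizes the squared two-norm, whose gradient is written by hand; in a real network the same gradient would be obtained by back propagation. The function name, learning rate and data are assumptions for illustration.

```python
def sgd_fit(samples, lr=0.05, epochs=100):
    """Fit y' = f(x, p) = p * x by SGD on the per-sample loss (y' - y)**2."""
    p = 0.0                                  # initial parameter value
    for _ in range(epochs):
        for x, y in samples:                 # one update per training pair
            grad = 2.0 * (p * x - y) * x     # d/dp of (p * x - y)**2
            p -= lr * grad                   # gradient descent step
    return p


# Training pairs generated with the "true" parameter p = 2.
pairs = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
p_hat = sgd_fit(pairs)
print(round(p_hat, 6))  # 2.0
```

Adam follows the same loop but rescales each step using running averages of the gradient and its square, which often makes training less sensitive to the choice of learning rate.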
In step S103, a current mosaic image of the actual scene is obtained by using an RGB color camera to capture, and the mosaic image is used as an input of the optimal network model to obtain a target multispectral image.
It can be understood that after the optimal network model is obtained, in the embodiment of the present invention, the RGB color camera is used to capture a mosaic image, the mosaic image is used as an input of the network model, and an output of the network model is a corresponding multispectral image. That is to say, after the optimal network model is obtained through training, in practical use, a mosaic image is obtained through shooting by using an RGB color camera, and then the mosaic image is used as the input of the neural network, and the output is the corresponding multispectral image.
For example, fig. 7 illustrates an actual imaging process using a Bayer filter array: the detector array first acquires a mosaic image through the Bayer filter array; this mosaic image is then used as the input of the neural network, whose output is the corresponding multispectral image.
According to the multispectral imaging method based on the RGB camera and the deep neural network provided by the embodiment of the invention, a mosaic image is captured with an RGB color camera and the deep neural network reconstructs the multispectral image from it, finally achieving joint spatial-spectral acquisition of a natural scene. Computational imaging from the mosaic image to the multispectral image is thus completed directly by the deep neural network, which is better suited to practical application and simple to implement.
Next, a proposed multispectral imaging device based on an RGB camera and a deep neural network according to an embodiment of the present invention is described with reference to the drawings.
Fig. 8 is a schematic structural diagram of a multispectral imaging device based on an RGB camera and a deep neural network according to an embodiment of the present invention.
As shown in fig. 8, the multispectral imaging device 10 based on an RGB camera and a deep neural network includes: a design module 100, a training module 200, and a shooting module 300.
The design module 100 is configured to design a convolutional neural network whose input is a mosaic image and whose output is the corresponding multispectral image. The training module 200 is configured to train the convolutional neural network using the existing multispectral data set to obtain an optimal network model. The shooting module 300 is configured to capture a current mosaic image of an actual scene with an RGB color camera and to obtain a target multispectral image by using the mosaic image as the input of the optimal network model. The device 10 of the embodiment of the invention uses the deep neural network to perform computational imaging directly from the mosaic image to the multispectral image, is better suited to practical application, and is simple to implement.
Further, in one embodiment of the present invention, the convolutional neural network includes a residual learning-based neural network, a multi-scale structure-based neural network, and a parallel-multi-scale structure-based neural network.
Further, in one embodiment of the invention, the training module 200 is further configured to take the training data set $\{x_i, y_i\}$ ($i = 1, 2, 3, \dots, N$) as the multispectral dataset, where $x_i$ is the input mosaic image and $y_i$ is the corresponding multispectral image, and to feed $x$ into the network model as input to obtain the corresponding output:
y′=f(x,p),
wherein p is a parameter corresponding to the network model, and y' is a multispectral image predicted by the model.
Further, in one embodiment of the present invention, a corresponding loss function is defined and its value is then minimized to optimize the network, wherein the definition of the loss function includes a one-norm based algorithm, a two-norm based algorithm, a structural similarity index based algorithm, and a multi-scale structural similarity index based algorithm.
Further, in one embodiment of the present invention, the device 10 further comprises an acquisition module, which is used for acquiring the mosaic image through the RGB color camera.
It should be noted that the foregoing explanation of the method embodiment also applies to the multispectral imaging device based on the RGB camera and the deep neural network of this embodiment, and the details are not repeated here.
According to the multispectral imaging device based on the RGB camera and the depth neural network provided by the embodiment of the invention, a mosaic image is captured by the RGB color camera, and the multispectral image is restored and reconstructed from the mosaic image by the deep neural network, so that joint spatial-spectral acquisition of a natural scene is realized. The computational imaging from the mosaic image to the multispectral image is thus completed directly by the deep neural network, which makes the device better suited to practical application and simple to implement.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the description herein, references to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, those skilled in the art may combine the various embodiments or examples described in this specification, and the features of different embodiments or examples, provided that they do not contradict one another.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (6)

1. A multispectral imaging method based on an RGB camera and a depth neural network is characterized by comprising the following steps:
designing a convolutional neural network whose input is a mosaic image directly acquired by an RGB color camera and whose output is the corresponding multispectral image, wherein the convolutional neural network comprises a neural network based on residual learning, and the neural network based on residual learning comprises two parts: the input of the first part is the mosaic image directly acquired by the RGB color camera, which first passes through an input module comprising c_1 convolutional layers and a_1 activation layers, then through K feature extraction layers, each feature extraction layer containing c_2 convolutional layers and a_2 activation layers, and finally through c_3 convolutional layers and a_3 activation layers; the output of the first part and the input are added and fed into the second part; the second part comprises two processes; the first process is a dimension-raising process in which 3 channels respectively represent different spectral channels of the image, each channel is operated on separately and comprises c_4 convolutional layers and a_4 activation layers, and the results are added and passed to the next process; the second process first passes through c_5 convolutional layers and a_5 activation layers, then through M feature extraction layers, each feature extraction layer containing c_6 convolutional layers and a_6 activation layers, and finally through c_7 convolutional layers and a_7 activation layers, and the output of these layers is added to the data obtained from the preceding dimension-raising process to obtain the final multispectral image;
training the convolutional neural network by using an existing multispectral data set to obtain an optimal network model; wherein training the convolutional neural network by using the existing multispectral data set comprises: using a training data set {x_i, y_i}, i = 1, 2, 3, …, N, as the multispectral data set, wherein x_i is the input mosaic image and y_i is the corresponding multispectral image, and inputting x into the network model to obtain the corresponding output: y′ = f(x, p), wherein p is a parameter corresponding to the network model and y′ is the multispectral image predicted by the model; and defining a corresponding loss function and then minimizing the loss function value to optimize the network, wherein the definition of the loss function includes a one-norm based algorithm, a two-norm based algorithm, a structural similarity index based algorithm, and a multi-scale structural similarity index based algorithm; and
shooting an actual scene with an RGB color camera to obtain a current mosaic image, and using the mosaic image as the input of the optimal network model to obtain a target multispectral image.
2. A multispectral imaging method based on an RGB camera and a depth neural network is characterized by comprising the following steps:
designing a convolutional neural network whose input is a mosaic image directly acquired by an RGB color camera and whose output is the corresponding multispectral image, wherein the convolutional neural network comprises a network designed based on a multi-scale structure, and the network designed based on the multi-scale structure comprises two parts: the first part is a dimension-raising part in which 3 channels respectively represent the mosaic images of the red, green and blue channels of an RGB image, convolution and activation operations are performed on the three channels, each channel comprising c_1 convolutional layers and a_1 activation layers, and the results are added and passed to the second part; the second part comprises K down-sampling operations and K up-sampling operations, wherein each down-sampling operation halves the resolution of the image and doubles the number of channels; one down-sampling operation comprises one Conv_block and one pooling layer, and one Conv_block comprises c_2 convolutional layers, a_2 activation layers and b_2 normalization layers; one up-sampling operation comprises one Up_block, one cross-layer connection operation and one Conv_block, and one Up_block comprises one interpolation operation, c_3 convolutional layers, a_3 activation layers and b_3 normalization layers; the cross-layer connection connects in parallel the images of the same resolution from the down-sampling process and the up-sampling process; and finally the multispectral image is output through c_4 convolutional layers;
training the convolutional neural network by using an existing multispectral data set to obtain an optimal network model; wherein training the convolutional neural network by using the existing multispectral data set comprises: using a training data set {x_i, y_i}, i = 1, 2, 3, …, N, as the multispectral data set, wherein x_i is the input mosaic image and y_i is the corresponding multispectral image, and inputting x into the network model to obtain the corresponding output: y′ = f(x, p), wherein p is a parameter corresponding to the network model and y′ is the multispectral image predicted by the model; and defining a corresponding loss function and then minimizing the loss function value to optimize the network, wherein the definition of the loss function includes a one-norm based algorithm, a two-norm based algorithm, a structural similarity index based algorithm, and a multi-scale structural similarity index based algorithm; and
shooting an actual scene with an RGB color camera to obtain a current mosaic image, and using the mosaic image as the input of the optimal network model to obtain a target multispectral image.
3. A multispectral imaging method based on an RGB camera and a depth neural network is characterized by comprising the following steps:
designing a convolutional neural network whose input is a mosaic image directly acquired by an RGB color camera and whose output is the corresponding multispectral image, wherein the convolutional neural network comprises a network designed based on a parallel-multi-scale structure, and the network designed based on the parallel-multi-scale structure comprises two parts: the first part is a dimension-raising part in which 3 channels respectively represent the mosaic images of the red, green and blue channels of an RGB image, convolution and activation operations are performed on the three channels, each channel comprising c_1 convolutional layers and a_1 activation layers, and the results are added and passed to the second part; the second part first passes through one Conv_block and 4 Bottlenecks, wherein one Conv_block comprises c_2 convolutional layers, a_2 activation layers and b_2 normalization layers, and one Bottleneck comprises c_3 convolutional layers, a_3 activation layers and b_3 normalization layers; the second part then passes through parallel sub-networks, wherein each row represents one sub-network, there are K sub-networks in total, and with each successive row the resolution of the image is generally reduced and the number of channels is doubled; in each sub-network, feature extraction is performed through 4 basic blocks, each basic block comprising c_4 convolutional layers, a_4 activation layers and b_4 normalization layers; information fusion is then performed among the different sub-networks, wherein the image resolution is halved and the number of channels is doubled by a strided 3×3 convolutional layer, and the image resolution is doubled and the number of channels is halved by one interpolation operation and one convolutional layer; and finally the multispectral image is output through c_5 convolutional layers;
training the convolutional neural network by using an existing multispectral data set to obtain an optimal network model; wherein training the convolutional neural network by using the existing multispectral data set comprises: using a training data set {x_i, y_i}, i = 1, 2, 3, …, N, as the multispectral data set, wherein x_i is the input mosaic image and y_i is the corresponding multispectral image, and inputting x into the network model to obtain the corresponding output: y′ = f(x, p), wherein p is a parameter corresponding to the network model and y′ is the multispectral image predicted by the model; and defining a corresponding loss function and then minimizing the loss function value to optimize the network, wherein the definition of the loss function includes a one-norm based algorithm, a two-norm based algorithm, a structural similarity index based algorithm, and a multi-scale structural similarity index based algorithm; and
shooting an actual scene with an RGB color camera to obtain a current mosaic image, and using the mosaic image as the input of the optimal network model to obtain a target multispectral image.
4. An RGB camera and depth neural network based multispectral imaging device, comprising:
a design module configured to design a convolutional neural network whose input is a mosaic image directly acquired by an RGB color camera and whose output is the corresponding multispectral image, wherein the convolutional neural network comprises a neural network based on residual learning, and the neural network based on residual learning comprises two parts: the input of the first part is the mosaic image directly acquired by the RGB color camera, which first passes through an input module comprising c_1 convolutional layers and a_1 activation layers, then through K feature extraction layers, each feature extraction layer containing c_2 convolutional layers and a_2 activation layers, and finally through c_3 convolutional layers and a_3 activation layers; the output of the first part and the input are added and fed into the second part; the second part comprises two processes; the first process is a dimension-raising process in which 3 channels respectively represent different spectral channels of the image, each channel is operated on separately and comprises c_4 convolutional layers and a_4 activation layers, and the results are added and passed to the next process; the second process first passes through c_5 convolutional layers and a_5 activation layers, then through M feature extraction layers, each feature extraction layer containing c_6 convolutional layers and a_6 activation layers, and finally through c_7 convolutional layers and a_7 activation layers, and the output of these layers is added to the data obtained from the preceding dimension-raising process to obtain the final multispectral image;
a training module configured to train the convolutional neural network by using an existing multispectral data set to obtain an optimal network model; the training module being further configured to use a training data set {x_i, y_i}, i = 1, 2, 3, …, N, as the multispectral data set, wherein x_i is the input mosaic image and y_i is the corresponding multispectral image, and to input x into the network model to obtain the corresponding output: y′ = f(x, p), wherein p is a parameter corresponding to the network model and y′ is the multispectral image predicted by the model; and to define a corresponding loss function and then minimize the loss function value to optimize the network, wherein the definition of the loss function includes a one-norm based algorithm, a two-norm based algorithm, a structural similarity index based algorithm, and a multi-scale structural similarity index based algorithm; and
a shooting module configured to shoot an actual scene with an RGB color camera to obtain a current mosaic image, and to obtain a target multispectral image by using the mosaic image as the input of the optimal network model.
5. An RGB camera and depth neural network based multispectral imaging device, comprising:
a design module configured to design a convolutional neural network whose input is a mosaic image directly acquired by an RGB color camera and whose output is the corresponding multispectral image, wherein the convolutional neural network comprises a network designed based on a multi-scale structure, and the network designed based on the multi-scale structure comprises two parts: the first part is a dimension-raising part in which 3 channels respectively represent the mosaic images of the red, green and blue channels of an RGB image, convolution and activation operations are performed on the three channels, each channel comprising c_1 convolutional layers and a_1 activation layers, and the results are added and passed to the second part; the second part comprises K down-sampling operations and K up-sampling operations, wherein each down-sampling operation halves the resolution of the image and doubles the number of channels; one down-sampling operation comprises one Conv_block and one pooling layer, and one Conv_block comprises c_2 convolutional layers, a_2 activation layers and b_2 normalization layers; one up-sampling operation comprises one Up_block, one cross-layer connection operation and one Conv_block, and one Up_block comprises one interpolation operation, c_3 convolutional layers, a_3 activation layers and b_3 normalization layers; the cross-layer connection connects in parallel the images of the same resolution from the down-sampling process and the up-sampling process; and finally the multispectral image is output through c_4 convolutional layers;
a training module configured to train the convolutional neural network by using an existing multispectral data set to obtain an optimal network model; the training module being further configured to use a training data set {x_i, y_i}, i = 1, 2, 3, …, N, as the multispectral data set, wherein x_i is the input mosaic image and y_i is the corresponding multispectral image, and to input x into the network model to obtain the corresponding output: y′ = f(x, p), wherein p is a parameter corresponding to the network model and y′ is the multispectral image predicted by the model; and to define a corresponding loss function and then minimize the loss function value to optimize the network, wherein the definition of the loss function includes a one-norm based algorithm, a two-norm based algorithm, a structural similarity index based algorithm, and a multi-scale structural similarity index based algorithm; and
a shooting module configured to shoot an actual scene with an RGB color camera to obtain a current mosaic image, and to obtain a target multispectral image by using the mosaic image as the input of the optimal network model.
6. An RGB camera and depth neural network based multispectral imaging device, comprising:
a design module configured to design a convolutional neural network whose input is a mosaic image directly acquired by an RGB color camera and whose output is the corresponding multispectral image, wherein the convolutional neural network comprises a network designed based on a parallel-multi-scale structure, and the network designed based on the parallel-multi-scale structure comprises two parts: the first part is a dimension-raising part in which 3 channels respectively represent the mosaic images of the red, green and blue channels of an RGB image, convolution and activation operations are performed on the three channels, each channel comprising c_1 convolutional layers and a_1 activation layers, and the results are added and passed to the second part; the second part first passes through one Conv_block and 4 Bottlenecks, wherein one Conv_block comprises c_2 convolutional layers, a_2 activation layers and b_2 normalization layers, and one Bottleneck comprises c_3 convolutional layers, a_3 activation layers and b_3 normalization layers; the second part then passes through parallel sub-networks, wherein each row represents one sub-network, there are K sub-networks in total, and with each successive row the resolution of the image is generally reduced and the number of channels is doubled; in each sub-network, feature extraction is performed through 4 basic blocks, each basic block comprising c_4 convolutional layers, a_4 activation layers and b_4 normalization layers; information fusion is then performed among the different sub-networks, wherein the image resolution is halved and the number of channels is doubled by a strided 3×3 convolutional layer, and the image resolution is doubled and the number of channels is halved by one interpolation operation and one convolutional layer; and finally the multispectral image is output through c_5 convolutional layers;
a training module configured to train the convolutional neural network by using an existing multispectral data set to obtain an optimal network model; the training module being further configured to use a training data set {x_i, y_i}, i = 1, 2, 3, …, N, as the multispectral data set, wherein x_i is the input mosaic image and y_i is the corresponding multispectral image, and to input x into the network model to obtain the corresponding output: y′ = f(x, p), wherein p is a parameter corresponding to the network model and y′ is the multispectral image predicted by the model; and to define a corresponding loss function and then minimize the loss function value to optimize the network, wherein the definition of the loss function includes a one-norm based algorithm, a two-norm based algorithm, a structural similarity index based algorithm, and a multi-scale structural similarity index based algorithm; and
a shooting module configured to shoot an actual scene with an RGB color camera to obtain a current mosaic image, and to obtain a target multispectral image by using the mosaic image as the input of the optimal network model.
CN201910690025.1A 2019-07-29 2019-07-29 Multispectral imaging method and device based on RGB camera and depth neural network Active CN110443865B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910690025.1A CN110443865B (en) 2019-07-29 2019-07-29 Multispectral imaging method and device based on RGB camera and depth neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910690025.1A CN110443865B (en) 2019-07-29 2019-07-29 Multispectral imaging method and device based on RGB camera and depth neural network

Publications (2)

Publication Number Publication Date
CN110443865A CN110443865A (en) 2019-11-12
CN110443865B true CN110443865B (en) 2021-10-15

Family

ID=68432019

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910690025.1A Active CN110443865B (en) 2019-07-29 2019-07-29 Multispectral imaging method and device based on RGB camera and depth neural network

Country Status (1)

Country Link
CN (1) CN110443865B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110880162B (en) * 2019-11-22 2023-03-10 中国科学技术大学 Snapshot spectrum depth combined imaging method and system based on deep learning
CN111695407B (en) * 2020-04-23 2023-04-07 西安电子科技大学 Gender identification method, system, storage medium and terminal based on multispectral fusion
CN113008371B (en) * 2021-03-05 2022-02-08 南京大学 Hyperspectral imaging method for deep learning dispersion-based fuzzy solution
CN115587949A (en) * 2022-10-27 2023-01-10 贵州大学 Agricultural multispectral visual reconstruction method based on visible light image

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SE0402576D0 (en) * 2004-10-25 2004-10-25 Forskarpatent I Uppsala Ab Multispectral and hyperspectral imaging
US9336570B2 (en) * 2014-05-15 2016-05-10 The United States Of America, As Represented By The Secretary Of The Navy Demosaicking system and method for color array based multi-spectral sensors
CN106840398B (en) * 2017-01-12 2018-02-02 南京大学 A kind of multispectral light-field imaging method
CN109146831A (en) * 2018-08-01 2019-01-04 武汉大学 Remote sensing image fusion method and system based on double branch deep learning networks
CN109410164B (en) * 2018-11-14 2019-10-22 西北工业大学 The satellite PAN and multi-spectral image interfusion method of multiple dimensioned convolutional neural networks

Also Published As

Publication number Publication date
CN110443865A (en) 2019-11-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant