CN110443865B - Multispectral imaging method and device based on RGB camera and deep neural network

Info

Publication number: CN110443865B
Application number: CN201910690025.1A
Authority: CN
Inventors: 边丽蘅; 傅毫; 张军; 曹先彬
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2019-07-29
Filing date: 2019-07-29
Publication date: 2021-10-15
Anticipated expiration: 2039-07-29
Also published as: CN110443865A

Abstract

The invention discloses a multispectral imaging method and a multispectral imaging device based on an RGB camera and a deep neural network. The method comprises the following steps: designing a convolutional neural network whose input is a mosaic image and whose output is the corresponding multispectral image; training the convolutional neural network on an existing multispectral data set to obtain an optimal network model; and capturing a current mosaic image of an actual scene with an RGB color camera and taking the mosaic image as the input of the optimal network model to obtain a target multispectral image. The method uses the deep neural network to perform computational imaging directly from the mosaic image to the multispectral image, is better suited to practical application, and is simple and easy to implement.

Description

Multispectral imaging method and device based on RGB camera and deep neural network

Technical Field

The invention relates to the technical field of computational photography, in particular to a multispectral imaging method and device based on an RGB (red, green and blue) camera and a deep neural network.

Background

Multispectral imaging, which rose to prominence in the 1980s, combines spectroscopy with imaging to obtain information in multiple wavelength bands for each pixel of an image. Whereas a common color camera such as an RGB camera has three spectral channels, a multispectral imaging system generally has dozens or even hundreds. Each spectral band can be regarded as a static grayscale image that represents the intensity in that band, and together the images across bands carry more spatial and spectral information about the observed object. Compared with traditional imaging, multispectral imaging therefore allows an observation target to be understood more comprehensively, clearly and accurately, and it is widely applied in military, industrial, agricultural and other fields. Current mainstream multispectral imaging techniques obtain spectral resolution by sacrificing spatial or temporal resolution. How to acquire images with high temporal, spatial and spectral resolution at the same time has become a research hotspot in computational photography, with important and wide-ranging applications.

At present, a single-sensor color camera places a color filter array (CFA) in front of the detector array, so that it captures an image with incomplete spatial information in each of the red, green and blue (R, G, B) channels, namely a mosaic image. A corresponding algorithm then fills in the missing information of the three channels to achieve color imaging.
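As a rough illustration of how a color filter array yields a mosaic image, the sketch below keeps one color value per pixel of a full RGB image. It assumes NumPy and an RGGB Bayer layout; the function name `bayer_mosaic` and the layout are illustrative choices and are not taken from the patent.

```python
import numpy as np

def bayer_mosaic(rgb):
    """Sample an (H, W, 3) RGB image through an assumed RGGB Bayer pattern.

    Returns an (H, W) mosaic in which each pixel keeps only one color value.
    """
    h, w, _ = rgb.shape
    mosaic = np.zeros((h, w), dtype=rgb.dtype)
    mosaic[0::2, 0::2] = rgb[0::2, 0::2, 0]  # red at even rows, even columns
    mosaic[0::2, 1::2] = rgb[0::2, 1::2, 1]  # green at even rows, odd columns
    mosaic[1::2, 0::2] = rgb[1::2, 0::2, 1]  # green at odd rows, even columns
    mosaic[1::2, 1::2] = rgb[1::2, 1::2, 2]  # blue at odd rows, odd columns
    return mosaic

rgb = np.arange(4 * 4 * 3, dtype=np.float64).reshape(4, 4, 3)
mosaic = bayer_mosaic(rgb)
```

Two thirds of the color samples are discarded at every pixel, which is the missing information that a demosaicing algorithm would otherwise have to estimate.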

On this basis, research has been carried out on directly reconstructing a multispectral image from an RGB color image, using methods from compressive sensing theory and from deep learning theory. The compressive sensing algorithms mainly exploit the sparsity of images: a set of complete sparse bases is first trained on an existing multispectral data set; the RGB image is then represented in these bases and the sparsest representation coefficients are solved for; finally, the coefficients are used to solve back for the corresponding multispectral image. Such an algorithm first needs the sparse basis to be complete, otherwise the reconstruction quality degrades. Second, it needs a complete RGB three-channel spatial-domain image, whereas in practice a camera acquires only incomplete spatial information of the three RGB channels (namely a mosaic image) and fills in the missing information by an algorithm (namely demosaicing), so the chosen demosaicing algorithm also affects the reconstruction result. The deep learning algorithms first interpolate the complete RGB three-channel spatial information to obtain spatial information for multiple channels, and then feed this multi-channel information to a network that outputs the corresponding multispectral image. These algorithms likewise require a complete RGB three-channel spatial-domain image. At present, no algorithm exists that can recover a multispectral image directly from the data collected by the detector (namely a mosaic image).

Disclosure of Invention

The present application is based on the recognition and discovery by the inventors of the following problems:

Recently, neural networks have achieved great success in computer vision tasks such as object classification, target detection and super-resolution, and neural-network-based algorithms mostly outperform traditional algorithms. Deep neural networks mainly adopt convolutional neural networks: convolving an image extracts features from it, an optimal model is obtained by training on existing data, and the model is then used to carry out the corresponding task. In recent years, with improvements in algorithms and hardware, neural network structures have become increasingly complex, deeper and wider, and their practical performance has improved greatly. One characteristic of deep neural networks is that the more similar the test data are to the training set, the better the network tends to perform. Because the spectra of substances in nature share a certain similarity, the present invention uses a deep-neural-network-based algorithm for multispectral imaging.

The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.

Therefore, one objective of the present invention is to provide a multispectral imaging method based on an RGB camera and a deep neural network, which can directly complete the computational imaging from a mosaic image to a multispectral image by using the deep neural network, and is more suitable for practical application and simple and easy to implement.

Another objective of the present invention is to provide a multispectral imaging device based on an RGB camera and a deep neural network.

In order to achieve the above object, an embodiment of the present invention provides a multispectral imaging method based on an RGB camera and a deep neural network, including the following steps: designing a convolutional neural network whose input is a mosaic image and whose output is the corresponding multispectral image; training the convolutional neural network on an existing multispectral data set to obtain an optimal network model; and capturing a current mosaic image of an actual scene with an RGB color camera and taking the mosaic image as the input of the optimal network model to obtain a target multispectral image.

According to the multispectral imaging method based on an RGB camera and a deep neural network of the embodiment of the present invention, a mosaic image is captured by an RGB color camera and a deep neural network recovers and reconstructs a multispectral image from it, which achieves joint spatial-spectral acquisition of a natural scene. The computational imaging from the mosaic image to the multispectral image is thus completed directly by the deep neural network, and the method is better suited to practical application and is simple and easy to implement.

In addition, the multispectral imaging method based on the RGB camera and the deep neural network according to the above embodiment of the present invention may further have the following additional technical features:

Further, in one embodiment of the present invention, the convolutional neural network includes a neural network based on residual learning, a neural network based on a multi-scale structure, and a neural network based on a parallel-multi-scale structure.

Further, in an embodiment of the present invention, training the convolutional neural network by using the existing multispectral data set includes: taking the training data set {xⁱ, yⁱ} (i = 1, 2, 3, …, N) as the multispectral data set, where xⁱ is the input mosaic image and yⁱ is the corresponding multispectral image, and feeding x into the network model as input to obtain the corresponding output:

y′ = f(x, p),

wherein p is a parameter corresponding to the network model, and y' is a multispectral image predicted by the model.

Further, in one embodiment of the present invention, a corresponding loss function is defined, and then the loss function value is minimized to achieve the purpose of optimizing the network, wherein the definition of the loss function includes a one-norm based algorithm, a two-norm based algorithm, a structure similarity index based algorithm, and a multi-scale structure similarity index based algorithm.

Further, in an embodiment of the present invention, the method further includes: collecting the mosaic image through an RGB color camera.

In order to achieve the above object, another embodiment of the present invention provides a multispectral imaging device based on an RGB camera and a deep neural network, including: the design module is used for designing a convolutional neural network with input being a mosaic image and output being a corresponding multispectral image; the training module is used for training the convolutional neural network by utilizing the existing multispectral data set to obtain an optimal network model; and the shooting module is used for obtaining a current mosaic image of an actual scene by adopting an RGB (red, green and blue) color camera for shooting, and obtaining a target multispectral image by taking the mosaic image as the input of the optimal network model.

According to the multispectral imaging device based on an RGB camera and a deep neural network of the embodiment of the present invention, a mosaic image is captured by an RGB color camera and a deep neural network recovers and reconstructs a multispectral image from it, which achieves joint spatial-spectral acquisition of a natural scene. The computational imaging from the mosaic image to the multispectral image is thus completed directly by the deep neural network, and the device is better suited to practical application and is simple and easy to implement.

In addition, the multispectral imaging device based on the RGB camera and the deep neural network according to the above embodiment of the present invention may further have the following additional technical features:

Further, in one embodiment of the present invention, the training module is further configured to take the training data set {xⁱ, yⁱ} (i = 1, 2, 3, …, N) as the multispectral data set, where xⁱ is the input mosaic image and yⁱ is the corresponding multispectral image, and to feed x into the network model as input to obtain the corresponding output:

y′ = f(x, p),

where p denotes the parameters of the network model and y′ is the multispectral image predicted by the model.

Further, in an embodiment of the present invention, the device further includes: an acquisition module, used for acquiring the mosaic image through the RGB color camera.

Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Drawings

The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

FIG. 1 is a flowchart of a multispectral imaging method based on an RGB camera and a deep neural network according to an embodiment of the present invention;

FIG. 2 is a flow chart of obtaining a network model according to an embodiment of the invention;

FIG. 3 is a schematic diagram of a multispectral imaging process according to an embodiment of the invention;

FIG. 4 is a block diagram of a deep neural network based on residual learning according to an embodiment of the present invention;

FIG. 5 is a block diagram of a deep neural network based on a multi-scale structure according to an embodiment of the present invention;

FIG. 6 is a block diagram of a deep neural network based on a parallel-multiscale structure according to an embodiment of the present invention;

FIG. 7 is a diagram of an imaging model based on a Bayer filter array according to an embodiment of the invention;

FIG. 8 is a schematic structural diagram of a multispectral imaging device based on an RGB camera and a deep neural network according to an embodiment of the present invention.

Detailed Description

Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer throughout to the same or similar elements or to elements having the same or similar function. The embodiments described below with reference to the drawings are exemplary, are intended to illustrate the invention, and are not to be construed as limiting the invention.

Hereinafter, a multispectral imaging method and apparatus based on an RGB camera and a deep neural network according to an embodiment of the present invention will be described with reference to the accompanying drawings, and first, a multispectral imaging method based on an RGB camera and a deep neural network according to an embodiment of the present invention will be described with reference to the accompanying drawings.

FIG. 1 is a flowchart of a multispectral imaging method based on an RGB camera and a deep neural network according to an embodiment of the present invention.

As shown in FIG. 1, the multispectral imaging method based on an RGB camera and a deep neural network includes the following steps:

In step S101, a convolutional neural network is designed whose input is a mosaic image and whose output is the corresponding multispectral image.

It will be appreciated that the convolutional neural network is first structured so that its input is a mosaic image and its output is a corresponding multi-spectral image. The embodiment of the invention designs a deep neural network with a specific structure to realize the joint acquisition of a space domain and a spectral domain of a natural scene.

Wherein, in one embodiment of the present invention, the mosaic image may be captured by an RGB color camera.

Specifically, the embodiment of the present invention may design a convolutional neural network whose input is the mosaic image directly acquired by an RGB color camera and whose output is the corresponding multispectral image. The sensor surface of the RGB color camera is covered by a color filter array; commonly used color filter arrays include, but are not limited to, the Bayer color filter array (Bayer CFA).

Specifically, FIG. 2 shows how the optimal network model is obtained, and FIG. 3 shows the multispectral imaging process. First, a deep neural network is designed whose input is the mosaic image directly acquired by the RGB color camera and whose output is the corresponding multispectral image. Commonly used structures include, but are not limited to: a neural network based on residual learning, a neural network based on a multi-scale structure, and a neural network based on a parallel-multi-scale structure. Each of the three is described below.

FIG. 4 shows a network designed based on residual learning, which comprises two parts. The input of the first part is the mosaic image directly acquired by the RGB color camera. The mosaic image first passes through an input module containing c₁ convolutional layers and a₁ activation layers, then through K feature extraction layers, each containing c₂ convolutional layers and a₂ activation layers, and finally through c₃ convolutional layers and a₃ activation layers. The output of these layers is added to the input of the first part, and the sum is sent to the second part. The second part comprises two processes. The first process raises the dimension: 3 channels respectively represent different spectral channels of the image, each channel is processed separately by c₄ convolutional layers and a₄ activation layers, and the results are then passed to the next process. In the second process, the data first pass through c₅ convolutional layers and a₅ activation layers, then through M feature extraction layers, each containing c₆ convolutional layers and a₆ activation layers, and finally through c₇ convolutional layers and a₇ activation layers. The output of this process is added to the data from the preceding dimension-raising process to obtain the final multispectral image.
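The residual addition at the end of the first part can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the patented network: the 3×3 kernel size, the ReLU activation, the single-channel mosaic input, the layer widths and the random weights are all hypothetical, and only one feature extraction layer (K = 1) is shown.

```python
import numpy as np

def conv3x3(x, w):
    """Same-padded 3x3 convolution; x is (C_in, H, W), w is (C_out, C_in, 3, 3)."""
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((w.shape[0], h, wd))
    for dy in range(3):
        for dx in range(3):
            # Accumulate the contribution of one kernel tap over all input channels.
            out += np.einsum("oc,chw->ohw", w[:, :, dy, dx],
                             xp[:, dy:dy + h, dx:dx + wd])
    return out

def relu(x):
    return np.maximum(x, 0.0)

def residual_part(x, weights):
    """Run a conv + activation stack and add its output back to the input x."""
    h = x
    for w in weights:
        h = relu(conv3x3(h, w))
    return x + h  # skip connection: the stack only has to model the residual

rng = np.random.default_rng(0)
mosaic = rng.random((1, 8, 8))        # single-channel mosaic (assumed layout)
widths = [1, 16, 16, 1]               # input module, one feature layer, output layers
weights = [0.1 * rng.standard_normal((widths[i + 1], widths[i], 3, 3))
           for i in range(len(widths) - 1)]
refined = residual_part(mosaic, weights)
```

Because the stack output is added to its own input, the last layer must produce as many channels as the input has, which is why the hypothetical `widths` list starts and ends with 1.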

FIG. 5 shows a network designed based on a multi-scale structure, which comprises two parts. The first part raises the dimension: 3 channels respectively represent the mosaic images of the red, green and blue channels of the RGB image, convolution and activation operations are applied to the three channels, each channel passing through c₁ convolutional layers and a₁ activation layers, and the result is then passed to the second part. The second part contains K downsampling operations and K upsampling operations. Each downsampling operation halves the image resolution and doubles the number of channels. One downsampling operation comprises one Conv_block and one pooling layer. One Conv_block contains c₂ convolutional layers, a₂ activation layers and b₂ normalization layers. One upsampling operation comprises one Up_block, one cross-layer connection and one Conv_block. One Up_block contains one interpolation operation together with c₃ convolutional layers, a₃ activation layers and b₃ normalization layers. The cross-layer connection joins in parallel the images of the same resolution from the downsampling process and the upsampling process. Finally, c₄ convolutional layers output the multispectral image.
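To make the bookkeeping of the multi-scale structure concrete, the sketch below traces tensor shapes through K downsampling and K upsampling operations. The halving of resolution and doubling of channels on the way down follows the description; treating the cross-layer connection as channel concatenation, and letting the Up_block halve the channels and the Conv_block restore the skip width, are assumptions made for illustration. The function name `multiscale_shapes` is hypothetical.

```python
def multiscale_shapes(channels, height, width, k):
    """Trace (channels, height, width) through K down- and K upsampling steps."""
    down = [(channels, height, width)]
    for _ in range(k):
        c, h, w = down[-1]
        down.append((c * 2, h // 2, w // 2))  # half the resolution, twice the channels

    up = []
    c, h, w = down[-1]
    for level in range(k - 1, -1, -1):
        skip_c, skip_h, skip_w = down[level]
        # Up_block: interpolate to the skip resolution, channels halved (assumed).
        c, h, w = c // 2, skip_h, skip_w
        merged_c = c + skip_c   # cross-layer connection as channel concatenation
        c = skip_c              # Conv_block maps back to the skip width (assumed)
        up.append({"merged": (merged_c, h, w), "out": (c, h, w)})
    return down, up

down, up = multiscale_shapes(64, 256, 256, k=3)
```

With these assumptions the output of the last upsampling step has the same shape as the input of the second part, so the final convolutional layers only need to change the channel count to the number of spectral bands.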

FIG. 6 shows a network designed based on a parallel-multi-scale structure, which comprises two parts. The first part raises the dimension: 3 channels respectively represent the mosaic images of the red, green and blue channels of the RGB image, convolution and activation operations are applied to the three channels, each channel passing through c₁ convolutional layers and a₁ activation layers, and the result is then passed to the second part. The second part first passes through one Conv_block and 4 Bottleneck blocks. The Conv_block contains c₁ convolutional layers, a₂ activation layers and b₂ normalization layers. One Bottleneck contains c₃ convolutional layers, a₃ activation layers and b₃ normalization layers. Then 4 parallel sub-networks follow, where each row represents one sub-network and there are K sub-networks in total; with each successive row the image resolution is halved and the number of channels is doubled. In each sub-network, feature extraction is performed by 4 basic blocks (each basic block contains c₄ convolutional layers, a₄ activation layers and b₄ normalization layers). Information is then fused between the different sub-networks: one convolutional layer (strided 3×3) halves the image resolution and doubles the number of channels, while one interpolation operation followed by one convolutional layer doubles the image resolution and halves the number of channels. Finally, c₅ convolutional layers output the multispectral image.
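The information fusion between two neighboring sub-networks can be sketched in NumPy as below. The strided 3×3 convolution and the interpolation followed by a convolution come from the description; the nearest-neighbor interpolation, the 1×1 kernel after it, the summation used to merge the branches, and all sizes and weights are assumptions chosen to keep the example short.

```python
import numpy as np

def downsample(x, w):
    """Stride-2, padding-1 3x3 convolution: (C, H, W) -> (C_out, H/2, W/2)."""
    _, h, wd = x.shape
    ho, wo = h // 2, wd // 2
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((w.shape[0], ho, wo))
    for dy in range(3):
        for dx in range(3):
            # Every second pixel of the padded input, shifted by the kernel tap.
            patch = xp[:, dy:dy + 2 * ho:2, dx:dx + 2 * wo:2]
            out += np.einsum("oc,chw->ohw", w[:, :, dy, dx], patch)
    return out

def upsample(x, w):
    """Nearest-neighbor interpolation by 2, then a 1x1 convolution (assumed)."""
    x_up = np.repeat(np.repeat(x, 2, axis=1), 2, axis=2)
    return np.einsum("oc,chw->ohw", w, x_up)

rng = np.random.default_rng(0)
high = rng.random((16, 32, 32))   # higher-resolution sub-network
low = rng.random((32, 16, 16))    # next row: half the resolution, twice the channels
w_down = 0.1 * rng.standard_normal((32, 16, 3, 3))
w_up = 0.1 * rng.standard_normal((16, 32))

# Exchange information between the two rows (summation is an assumption).
fused_high = high + upsample(low, w_up)
fused_low = low + downsample(high, w_down)
```

Each branch keeps its own resolution and channel count after fusion, so the exchange can be repeated after every group of basic blocks.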

In the three structural networks, the network based on residual learning has relatively few parameters, and training is simpler, but the final performance is lower than that of the other two networks; the network based on the multi-scale structure and the network based on the parallel-multi-scale structure can obtain relatively optimal performance, but have more parameters, and have higher training difficulty and longer time.

In step S102, the convolutional neural network is trained using the existing multispectral data set to obtain an optimal network model.

It can be understood that, in the embodiment of the present invention, the network may be trained through the existing multispectral dataset on the basis of step S101, so as to obtain an optimal network model. That is, after a network of a particular architecture is designed, it is trained end-to-end. The network is trained by using the existing multispectral data set as a training set, and finally an optimal network model is obtained.

In one embodiment of the present invention, a corresponding loss function is defined and its value is then minimized to optimize the network, wherein the definition of the loss function includes: a one-norm based algorithm, a two-norm based algorithm, a structural similarity index based algorithm, and a multi-scale structural similarity index based algorithm.

Specifically, the network is trained end to end: the input is a mosaic image and the output is the corresponding multispectral image. The training data set {xⁱ, yⁱ} (i = 1, 2, 3, …, N) is taken as the multispectral data set, where xⁱ is the input mosaic image and yⁱ is the corresponding multispectral image, and x is fed into the network model as input to obtain the corresponding output:

y′ = f(x, p),

wherein p is a parameter corresponding to the network model, and y' is a multispectral image predicted by the model. The optimal network is obtained by optimizing the following models, including but not limited to the following ones:

(1) min_p ‖y′ − y‖₁;

(2) min_p ‖y′ − y‖₂²;

(3) min_p |1 − SSIM(y′, y)|, where SSIM is the structural similarity index.
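The three objectives can be written as plain NumPy functions, as in the sketch below. This is an illustration rather than the patent's implementation: in particular, `global_ssim` computes SSIM from whole-image statistics instead of the usual sliding window, and the constants 0.01 and 0.03 are the conventional SSIM stabilizers, which the patent does not specify.

```python
import numpy as np

def l1_loss(y_pred, y):
    """Objective (1): one-norm of the prediction error."""
    return np.abs(y_pred - y).sum()

def l2_loss(y_pred, y):
    """Objective (2): squared two-norm of the prediction error."""
    return ((y_pred - y) ** 2).sum()

def global_ssim(a, b, data_range=1.0):
    """SSIM from whole-image statistics (simplification: no sliding window)."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))

def ssim_loss(y_pred, y):
    """Objective (3): |1 - SSIM(y', y)|."""
    return abs(1.0 - global_ssim(y_pred, y))

rng = np.random.default_rng(0)
y = rng.random((8, 16, 16))                          # stand-in multispectral cube
y_pred = y + 0.05 * rng.standard_normal(y.shape)     # stand-in network prediction
```

All three functions return zero (up to rounding) for a perfect prediction and grow as `y_pred` departs from `y`, so any of them can serve as the quantity minimized over p.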

The optimization is mainly realized by back propagation of a neural network, and the adopted algorithm comprises, but is not limited to, the following algorithms: SGD, Adam, etc.
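The idea of minimizing a loss over the parameters p with a gradient-based optimizer can be shown on a toy problem. The sketch below is deliberately not the patent's network: f is replaced by a per-pixel linear map, the data are random, and plain full-batch gradient descent stands in for SGD or Adam. The sizes, the learning rate and the name `p_true` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_bands = 500, 8
x = rng.random((3, n_pixels))          # stand-in inputs, one column per pixel
p_true = rng.random((n_bands, 3))      # hypothetical ground-truth parameters
y = p_true @ x                         # stand-in multispectral targets

def f(x, p):
    """Toy model y' = f(x, p): a per-pixel linear map, not a CNN."""
    return p @ x

p = np.zeros((n_bands, 3))
learning_rate = 0.5
losses = []
for _ in range(200):
    residual = f(x, p) - y
    losses.append((residual ** 2).sum() / n_pixels)  # mean squared two-norm loss
    grad = 2.0 * residual @ x.T / n_pixels           # gradient of the loss w.r.t. p
    p -= learning_rate * grad                        # gradient-descent update
```

In a real network the gradient is obtained by back propagation through all layers, and the update rule is supplied by the chosen optimizer, but the loop has the same three steps: predict, measure the loss, and move p against the gradient.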

In step S103, a current mosaic image of the actual scene is obtained by using an RGB color camera to capture, and the mosaic image is used as an input of the optimal network model to obtain a target multispectral image.

It can be understood that after the optimal network model is obtained, in the embodiment of the present invention, the RGB color camera is used to capture a mosaic image, the mosaic image is used as an input of the network model, and an output of the network model is a corresponding multispectral image. That is to say, after the optimal network model is obtained through training, in practical use, a mosaic image is obtained through shooting by using an RGB color camera, and then the mosaic image is used as the input of the neural network, and the output is the corresponding multispectral image.

For example, FIG. 7 shows an actual imaging process that takes a Bayer filter array as an example: a mosaic image based on the Bayer filter array is first obtained with the detector array and is then used as the input of the neural network, whose output is the corresponding multispectral image.
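The multi-scale and parallel-multi-scale networks described earlier take three channels that hold the red, green and blue mosaic samples. One possible way to prepare such an input from a raw Bayer frame is sketched below; the RGGB layout, the zero filling of unsampled positions and the function name `split_bayer_planes` are assumptions, since the patent does not describe this preprocessing.

```python
import numpy as np

def split_bayer_planes(mosaic):
    """Split an (H, W) mosaic into three sparse (H, W) planes, assuming RGGB."""
    h, w = mosaic.shape
    planes = np.zeros((3, h, w), dtype=mosaic.dtype)
    planes[0, 0::2, 0::2] = mosaic[0::2, 0::2]  # red samples
    planes[1, 0::2, 1::2] = mosaic[0::2, 1::2]  # green samples on red rows
    planes[1, 1::2, 0::2] = mosaic[1::2, 0::2]  # green samples on blue rows
    planes[2, 1::2, 1::2] = mosaic[1::2, 1::2]  # blue samples
    return planes

mosaic = np.arange(1, 17, dtype=np.float64).reshape(4, 4)
planes = split_bayer_planes(mosaic)
```

No value is interpolated here: every pixel of the raw frame lands in exactly one plane, so the network still sees only what the detector measured.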

According to the multispectral imaging method based on an RGB camera and a deep neural network provided by the embodiment of the present invention, a mosaic image is captured by an RGB color camera and a deep neural network recovers and reconstructs a multispectral image from it, which achieves joint spatial-spectral acquisition of a natural scene. The computational imaging from the mosaic image to the multispectral image is thus completed directly by the deep neural network, and the method is better suited to practical application and is simple and easy to implement.

Next, a proposed multispectral imaging device based on an RGB camera and a deep neural network according to an embodiment of the present invention is described with reference to the drawings.

As shown in FIG. 8, the multispectral imaging device 10 based on an RGB camera and a deep neural network includes: a design module 100, a training module 200, and a shooting module 300.

The design module 100 is configured to design a convolutional neural network whose input is a mosaic image and whose output is the corresponding multispectral image. The training module 200 is configured to train the convolutional neural network on an existing multispectral data set to obtain an optimal network model. The shooting module 300 is configured to capture a current mosaic image of an actual scene with an RGB color camera and to obtain a target multispectral image by taking the mosaic image as the input of the optimal network model. The device 10 of the embodiment of the invention uses the deep neural network to perform computational imaging directly from the mosaic image to the multispectral image, is better suited to practical application, and is simple and easy to implement.

Further, in one embodiment of the present invention, the convolutional neural network includes a neural network based on residual learning, a neural network based on a multi-scale structure, and a neural network based on a parallel-multi-scale structure.

Further, in one embodiment of the invention, the training module 200 is further configured to take the training data set {xⁱ, yⁱ} (i = 1, 2, 3, …, N) as the multispectral data set, where xⁱ is the input mosaic image and yⁱ is the corresponding multispectral image, and to feed x into the network model as input to obtain the corresponding output:

y′ = f(x, p),

where p denotes the parameters of the network model and y′ is the multispectral image predicted by the model.

Further, in one embodiment of the present invention, the device 10 further comprises an acquisition module, which is used for acquiring the mosaic image through the RGB color camera.

It should be noted that the foregoing explanation of the embodiment of the multispectral imaging method based on an RGB camera and a deep neural network also applies to the multispectral imaging device of this embodiment, and details are not repeated here.

According to the multispectral imaging device based on an RGB camera and a deep neural network provided by the embodiment of the present invention, a mosaic image is captured by an RGB color camera and a deep neural network recovers and reconstructs a multispectral image from it, which achieves joint spatial-spectral acquisition of a natural scene. The computational imaging from the mosaic image to the multispectral image is thus completed directly by the deep neural network, and the device is better suited to practical application and is simple and easy to implement.

Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.

In the description herein, reference to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic uses of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art can combine the various embodiments or examples described in this specification, and the features of different embodiments or examples, provided they do not contradict one another.

Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims

1. A multispectral imaging method based on an RGB camera and a depth neural network is characterized by comprising the following steps:

designing a convolutional neural network with the input of a mosaic image directly acquired by an RGB color camera and the output of the mosaic image corresponding to a multispectral image, wherein the convolutional neural network comprises a neural network based on residual error learning, and the neural network based on residual error learning comprises two parts: the first part is input as mosaic image directly collected by RGB color camera, and the mosaic image is first input via input module including c₁A convolutional layer and a₁An activation layer, then K feature extraction layers, each feature extraction layer containing c₂A convolutional layer and a₂An active layer, finally pass c₃A convolutional layer and a₃An active layer for adding the output of the first part and the input of the second part into a second part, the second part comprises two processes, the first process is a dimension raising process, 3 channels respectively represent different spectral channels of the image, each channel is respectively operated, and each channel comprises c₄A convolutional layer and a₄Activating the layers, adding the results to the next process, the second process, first passing through c₅A convolutional layer and a₅An active layer; then passes through M feature extraction layers, each feature extraction layer containing c₆A convolutional layer and a₆An active layer; finally pass through c₇A convolutional layer and a₇The activation layer adds the output of the activation layer and the data after the previous dimensionality lifting to obtain a final multispectral image;

training the convolutional neural network by utilizing the existing multispectral data set to obtain an optimal network model; the training of the convolutional neural network by using the existing multispectral data set comprises the following steps: will train the data set { xⁱ，yⁱN as the multispectral dataset, wherein x is 1,2,3ⁱFor the input mosaic image, yⁱFor the corresponding multispectral image, inputting x as an input into the network model to obtain a corresponding output: f (x, p), wherein p is a parameter corresponding to the network model, and y' is a multispectral image predicted by the model; defining a corresponding loss function, and then minimizing the loss function value for the purpose of optimizing the network, wherein the definition of the loss function includes a norm-based calculationA method, an algorithm based on a two-norm, an algorithm based on a structural similarity index, and an algorithm based on a multi-scale structural similarity index; and

photographing an actual scene with an RGB color camera to obtain a current mosaic image, and using the mosaic image as the input of the optimal network model to obtain a target multispectral image.
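The training step recited above (predict y' = f(x, p), then minimize a loss over the data set) can be illustrated with the following minimal sketch. It is not the patented implementation: the flattened-list image representation, the function names and `toy_model` are hypothetical stand-ins, and only the one-norm and two-norm losses are shown (the structural-similarity-based losses are omitted).

```python
from typing import Callable, List, Sequence, Tuple

Image = Sequence[float]  # assumption: an image flattened into a list of pixel values


def l1_loss(pred: Image, target: Image) -> float:
    """One-norm based loss: mean absolute error between y' and y."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


def l2_loss(pred: Image, target: Image) -> float:
    """Two-norm based loss: mean squared error between y' and y."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def dataset_loss(
    f: Callable[[Image, float], List[float]],
    p: float,
    data: Sequence[Tuple[Image, Image]],
    loss: Callable[[Image, Image], float],
) -> float:
    """Average loss of y' = f(x, p) over all training pairs (x, y)."""
    return sum(loss(f(x, p), y) for x, y in data) / len(data)


def toy_model(x: Image, p: float) -> List[float]:
    """Hypothetical stand-in for the network: scales every input value by p."""
    return [p * v for v in x]


# Two toy training pairs whose targets are exactly twice the inputs.
data = [([1.0, 2.0], [2.0, 4.0]), ([0.5, 1.5], [1.0, 3.0])]
print(dataset_loss(toy_model, 2.0, data, l1_loss))  # p = 2 reproduces the targets
print(dataset_loss(toy_model, 1.0, data, l2_loss))  # p = 1 leaves a residual error
```

In practice the parameter p would be a set of network weights updated by a gradient-based optimizer; the sketch only shows the quantity being minimized.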

2. A multispectral imaging method based on an RGB camera and a deep neural network, characterized by comprising the following steps:

designing a convolutional neural network whose input is a mosaic image directly acquired by an RGB color camera and whose output is the corresponding multispectral image, wherein the convolutional neural network comprises a network designed based on a multi-scale structure, and the network designed based on the multi-scale structure comprises two parts: the first part is a dimension-raising part, in which 3 channels respectively represent the mosaic images of the red, green and blue channels of the RGB image, convolution and activation operations are performed on each of the three channels, each channel comprising c₁ convolutional layers and a₁ activation layers, and the results are passed to the second part; the second part comprises K down-sampling operations and K up-sampling operations, wherein each down-sampling operation halves the resolution of the image and doubles the number of channels; one down-sampling operation comprises one Conv_block and one pooling layer, one Conv_block comprising c₂ convolutional layers, a₂ activation layers and b₂ normalization layers; one up-sampling operation comprises one Up_block, one cross-layer connection operation and one Conv_block, one Up_block comprising one interpolation operation, c₃ convolutional layers, a₃ activation layers and b₃ normalization layers; the cross-layer connection is realized by connecting the images of the same resolution from the down-sampling process and the up-sampling process; and finally c₄ convolutional layers output the multispectral image;

training the convolutional neural network by using an existing multispectral data set to obtain an optimal network model, wherein the training comprises: taking a training data set {xⁱ, yⁱ}, i = 1, 2, 3, ..., N, as the multispectral data set, wherein xⁱ is an input mosaic image and yⁱ is the corresponding multispectral image; inputting x into the network model to obtain the corresponding output y' = f(x, p), wherein p denotes the parameters of the network model and y' is the multispectral image predicted by the model; and defining a corresponding loss function and then minimizing the loss function value so as to optimize the network, wherein the definition of the loss function comprises a one-norm-based algorithm, a two-norm-based algorithm, a structural-similarity-index-based algorithm and a multi-scale-structural-similarity-index-based algorithm; and
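The resolution and channel bookkeeping of the multi-scale structure in claim 2 can be traced with a short sketch. It only tracks tensor shapes and is not the claimed network; the function name is hypothetical, and the assumption that each up-sampling operation doubles the resolution and halves the number of channels (mirroring the down-sampling path) is ours, since the claim states the factors only for down-sampling.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int]  # (height, width, channels)


def multiscale_shapes(
    height: int, width: int, channels: int, k: int
) -> Tuple[List[Shape], List[Tuple[Shape, Shape]]]:
    """Trace feature-map shapes through K down- and K up-sampling operations.

    Returns the shapes along the down-sampling path and, for each up-sampling
    operation, the pair (up-sampled shape, shape joined by the cross-layer
    connection).
    """
    if height % (2 ** k) or width % (2 ** k):
        raise ValueError("height and width must be divisible by 2**k")

    # Down-sampling: each operation halves the resolution and doubles the channels.
    down = [(height, width, channels)]
    for _ in range(k):
        h, w, c = down[-1]
        down.append((h // 2, w // 2, c * 2))

    # Up-sampling (assumed symmetric): double the resolution, halve the channels,
    # then pair with the down-path feature map of the same resolution.
    up = []
    h, w, c = down[-1]
    for level in range(k - 1, -1, -1):
        h, w, c = h * 2, w * 2, c // 2
        up.append(((h, w, c), down[level]))
    return down, up


down, up = multiscale_shapes(64, 64, 32, 3)
print(down)  # shapes after 0, 1, 2 and 3 down-sampling operations
print(up)    # (up-sampled shape, cross-layer partner) for each up-sampling operation
```

With K = 3 and a 64 × 64 input carrying 32 channels, the deepest feature map is 8 × 8 with 256 channels, and every up-sampled map has a down-path partner of identical resolution for the cross-layer connection.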

3. A multispectral imaging method based on an RGB camera and a deep neural network, characterized by comprising the following steps:

designing a convolutional neural network whose input is a mosaic image directly acquired by an RGB color camera and whose output is the corresponding multispectral image, wherein the convolutional neural network comprises a network designed based on a parallel-multiscale structure, and the network designed based on the parallel-multiscale structure comprises two parts: the first part is a dimension-raising part, in which 3 channels respectively represent the mosaic images of the red, green and blue channels of the RGB image, convolution and activation operations are performed on each of the three channels, each channel comprising c₁ convolutional layers and a₁ activation layers, and the results are passed to the second part; the second part first passes through one Conv_block and 4 Bottlenecks, one Conv_block comprising c₂ convolutional layers, a₂ activation layers and b₂ normalization layers, and one Bottleneck comprising c₃ convolutional layers, a₃ activation layers and b₃ normalization layers; it then passes through parallel sub-networks, wherein each row represents one sub-network, there are K sub-networks in total, and with each successive row the resolution of the image is reduced and the number of channels is doubled; in each sub-network, feature extraction is performed by 4 basic blocks, each basic block comprising c₄ convolutional layers, a₄ activation layers and b₄ normalization layers; information fusion is then performed among the different sub-networks, wherein a strided 3 × 3 convolutional layer halves the image resolution and doubles the number of channels, and one interpolation operation together with one convolutional layer doubles the image resolution and halves the number of channels; and finally c₅ convolutional layers output the multispectral image;
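The information fusion among the parallel sub-networks in claim 3 can be sketched as shape arithmetic. This is an illustration under stated assumptions, not the claimed network: the function names are hypothetical, each successive row is assumed to halve the resolution, and fusion between rows more than one level apart is assumed to repeat the single-step operation.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int]  # (height, width, channels)

DOWN = "strided 3x3 conv"     # halves the resolution, doubles the channels
UP = "interpolation + conv"   # doubles the resolution, halves the channels


def subnetwork_shapes(height: int, width: int, channels: int, k: int) -> List[Shape]:
    """Shape carried by each of the K parallel sub-networks (rows)."""
    return [(height >> r, width >> r, channels << r) for r in range(k)]


def fusion_steps(src: int, dst: int) -> List[str]:
    """Operations that map features of row `src` to the shape of row `dst`."""
    if dst > src:
        return [DOWN] * (dst - src)
    if dst < src:
        return [UP] * (src - dst)
    return []  # same row: features are used unchanged


def apply_steps(shape: Shape, steps: List[str]) -> Shape:
    """Apply the shape change of each fusion operation in order."""
    h, w, c = shape
    for step in steps:
        if step == DOWN:
            h, w, c = h // 2, w // 2, c * 2
        else:
            h, w, c = h * 2, w * 2, c // 2
    return (h, w, c)


shapes = subnetwork_shapes(64, 64, 32, 3)
print(shapes)
for src in range(3):
    for dst in range(3):
        # After fusion, features from row `src` match the shape of row `dst`.
        assert apply_steps(shapes[src], fusion_steps(src, dst)) == shapes[dst]
```

The loop confirms that, under these assumptions, features from any row can be brought to the shape of any other row before they are combined.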

4. A multispectral imaging device based on an RGB camera and a deep neural network, comprising:

a design module, configured to design a convolutional neural network whose input is a mosaic image directly acquired by an RGB color camera and whose output is the corresponding multispectral image, wherein the convolutional neural network comprises a neural network based on residual learning, and the neural network based on residual learning comprises two parts: the input of the first part is the mosaic image directly acquired by the RGB color camera, which first passes through an input module comprising c₁ convolutional layers and a₁ activation layers, then through K feature extraction layers, each feature extraction layer comprising c₂ convolutional layers and a₂ activation layers, and finally through c₃ convolutional layers and a₃ activation layers; the output of the first part is added to its input, and the sum is fed into the second part; the second part comprises two processes; the first process is a dimension-raising process, in which 3 channels respectively represent different spectral channels of the image and each channel is processed separately by c₄ convolutional layers and a₄ activation layers, the results being passed to the second process; the second process first passes through c₅ convolutional layers and a₅ activation layers, then through M feature extraction layers, each feature extraction layer comprising c₆ convolutional layers and a₆ activation layers, and finally through c₇ convolutional layers and a₇ activation layers, whose output is added to the data obtained from the preceding dimension-raising process to obtain the final multispectral image;

a training module, configured to train the convolutional neural network by using an existing multispectral data set to obtain an optimal network model, wherein the training module is further configured to: take a training data set {xⁱ, yⁱ}, i = 1, 2, 3, ..., N, as the multispectral data set, wherein xⁱ is an input mosaic image and yⁱ is the corresponding multispectral image; input x into the network model to obtain the corresponding output y' = f(x, p), wherein p denotes the parameters of the network model and y' is the multispectral image predicted by the model; and define a corresponding loss function and then minimize the loss function value so as to optimize the network, wherein the definition of the loss function comprises a one-norm-based algorithm, a two-norm-based algorithm, a structural-similarity-index-based algorithm and a multi-scale-structural-similarity-index-based algorithm; and

a shooting module, configured to photograph an actual scene with an RGB color camera to obtain a current mosaic image, and to use the mosaic image as the input of the optimal network model to obtain a target multispectral image.
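The data flow of the residual-learning network produced by the design module of claim 4 can be sketched with plain functions. The sketch shows only where the two additions occur; the three callables are hypothetical stand-ins for the convolution and activation stacks, and reading the first addition as "output of the first part plus its own input" is an assumption drawn from the residual-learning wording.

```python
from typing import Callable, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise addition used by both residual connections."""
    if len(a) != len(b):
        raise ValueError("residual addition needs equally sized operands")
    return [x + y for x, y in zip(a, b)]


def residual_network(
    mosaic: Vector, first_part: Layer, raise_dim: Layer, second_process: Layer
) -> Vector:
    """Compose the two parts with their residual additions."""
    # Part 1: the output of the first part is added to its input (assumed reading).
    stage1 = add(first_part(mosaic), mosaic)
    # Part 2, first process: raise the dimensionality (more spectral channels).
    raised = raise_dim(stage1)
    # Part 2, second process: its output is added to the dimension-raised data.
    return add(second_process(raised), raised)


# Toy stand-ins so the flow can be run end to end.
first_part = lambda v: [0.5 * x for x in v]             # keeps the size
raise_dim = lambda v: [x for x in v for _ in range(2)]  # 3 values -> 6 values
second_process = lambda v: [-1.0 for _ in v]            # constant correction

print(residual_network([2.0, 4.0, 6.0], first_part, raise_dim, second_process))
```

Because each addition requires matching sizes, the first part must preserve the size of the mosaic input and the second process must preserve the size of the dimension-raised data.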

5. A multispectral imaging device based on an RGB camera and a deep neural network, comprising:

a design module, configured to design a convolutional neural network whose input is a mosaic image directly acquired by an RGB color camera and whose output is the corresponding multispectral image, wherein the convolutional neural network comprises a network designed based on a multi-scale structure, and the network designed based on the multi-scale structure comprises two parts: the first part is a dimension-raising part, in which 3 channels respectively represent the mosaic images of the red, green and blue channels of the RGB image, convolution and activation operations are performed on each of the three channels, each channel comprising c₁ convolutional layers and a₁ activation layers, and the results are passed to the second part; the second part comprises K down-sampling operations and K up-sampling operations, wherein each down-sampling operation halves the resolution of the image and doubles the number of channels; one down-sampling operation comprises one Conv_block and one pooling layer, one Conv_block comprising c₂ convolutional layers, a₂ activation layers and b₂ normalization layers; one up-sampling operation comprises one Up_block, one cross-layer connection operation and one Conv_block, one Up_block comprising one interpolation operation, c₃ convolutional layers, a₃ activation layers and b₃ normalization layers; the cross-layer connection is realized by connecting the images of the same resolution from the down-sampling process and the up-sampling process; and finally c₄ convolutional layers output the multispectral image;

a training module, configured to train the convolutional neural network by using an existing multispectral data set to obtain an optimal network model, wherein the training module is further configured to: take a training data set {xⁱ, yⁱ}, i = 1, 2, 3, ..., N, as the multispectral data set, wherein xⁱ is an input mosaic image and yⁱ is the corresponding multispectral image; input x into the network model to obtain the corresponding output y' = f(x, p), wherein p denotes the parameters of the network model and y' is the multispectral image predicted by the model; and define a corresponding loss function and then minimize the loss function value so as to optimize the network, wherein the definition of the loss function comprises a one-norm-based algorithm, a two-norm-based algorithm, a structural-similarity-index-based algorithm and a multi-scale-structural-similarity-index-based algorithm; and

6. A multispectral imaging device based on an RGB camera and a deep neural network, comprising:

a design module, configured to design a convolutional neural network whose input is a mosaic image directly acquired by an RGB color camera and whose output is the corresponding multispectral image, wherein the convolutional neural network comprises a network designed based on a parallel-multiscale structure, and the network designed based on the parallel-multiscale structure comprises two parts: the first part is a dimension-raising part, in which 3 channels respectively represent the mosaic images of the red, green and blue channels of the RGB image, convolution and activation operations are performed on each of the three channels, each channel comprising c₁ convolutional layers and a₁ activation layers, and the results are passed to the second part; the second part first passes through one Conv_block and 4 Bottlenecks, one Conv_block comprising c₂ convolutional layers, a₂ activation layers and b₂ normalization layers, and one Bottleneck comprising c₃ convolutional layers, a₃ activation layers and b₃ normalization layers; it then passes through parallel sub-networks, wherein each row represents one sub-network, there are K sub-networks in total, and with each successive row the resolution of the image is reduced and the number of channels is doubled; in each sub-network, feature extraction is performed by 4 basic blocks, each basic block comprising c₄ convolutional layers, a₄ activation layers and b₄ normalization layers; information fusion is then performed among the different sub-networks, wherein a strided 3 × 3 convolutional layer halves the image resolution and doubles the number of channels, and one interpolation operation together with one convolutional layer doubles the image resolution and halves the number of channels; and finally c₅ convolutional layers output the multispectral image;