CN114723611A - Image reconstruction model training method, reconstruction method, device, equipment and medium - Google Patents


Info

Publication number
CN114723611A
CN114723611A
Authority
CN
China
Prior art keywords
image information
reconstruction
reconstruction model
preliminary
discriminator
Prior art date
Legal status
Granted
Application number
CN202210653453.9A
Other languages
Chinese (zh)
Other versions
CN114723611B (en)
Inventor
张晟东
邓涛
张立华
李志建
蔡维嘉
王济宇
古家威
Current Assignee
Ji Hua Laboratory
Original Assignee
Ji Hua Laboratory
Priority date
Filing date
Publication date
Application filed by Ji Hua Laboratory
Priority to CN202210653453.9A
Publication of CN114723611A
Application granted
Publication of CN114723611B
Status: Active
Anticipated expiration

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/4053 — Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06T 3/4046 — Scaling of whole images or parts thereof using neural networks
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/045 — Neural networks; combinations of networks
    • G06N 3/08 — Neural networks; learning methods


Abstract

The invention relates to the technical field of robot vision and discloses an image reconstruction model training method, a reconstruction method, an apparatus, a device, and a medium. The training method comprises the following steps: acquiring two pieces of reference image information and a plurality of pieces of grayscale image information captured between them; reconstructing a plurality of pieces of preliminary reconstructed image information from the grayscale image information using an initial reconstruction model; establishing an adversarial discriminator according to the preliminary reconstructed image information and the reference image information; establishing a loss function according to the preliminary reconstructed image information, the reference image information, and the adversarial discriminator; and training the initial reconstruction model with the loss function to generate the reconstruction model. Because the loss function is built on the discrimination results of an adversarial discriminator, the reconstruction output of the initial reconstruction model quickly approaches a real picture and the model training process is smoothed.

Description

Image reconstruction model training method, reconstruction method, device, equipment and medium
Technical Field
The application relates to the technical field of robot vision, in particular to an image reconstruction model training method, a reconstruction method, a device, equipment and a medium.
Background
With the continuously growing demand for electric power in China and rapid advances in science and technology, unmanned-aerial-vehicle (UAV) power-line inspection has emerged as a novel and efficient inspection mode. Compared with traditional manual inspection, power-line inspection by robots (mobile robots or UAVs) offers higher safety, higher inspection efficiency, and better inspection results.
To support functions such as power-line fault detection, power-line image recording, autonomous obstacle avoidance, positioning, and path planning during UAV power inspection, machine vision is essential to intelligent power inspection by robots.
At present, robot vision generally adopts frame-image-based camera schemes such as binocular RGB cameras, RGB-D cameras, and infrared cameras combined with visible light. Visual images acquired with these schemes frequently suffer from insufficient resolution and insufficient definition caused by environmental changes, robot motion, and other factors.
In view of the above problems, no effective technical solution exists at present.
Disclosure of Invention
An object of the present application is to provide an image reconstruction model training method, a reconstruction method, an apparatus, a device, and a medium, so as to acquire super-resolution images with low delay.
In a first aspect, the present application provides an image reconstruction model training method for training an initial reconstruction model into a reconstruction model capable of reconstructing super-resolution images, the method comprising the following steps:
acquiring two pieces of reference image information and a plurality of pieces of grayscale image information captured between the two pieces of reference image information;
reconstructing a plurality of pieces of preliminary reconstructed image information from the grayscale image information using the initial reconstruction model;
establishing an adversarial discriminator according to the preliminary reconstructed image information and the reference image information;
establishing a loss function according to the preliminary reconstructed image information, the reference image information, and the adversarial discriminator;
training the initial reconstruction model with the loss function to generate the reconstruction model.
In this image reconstruction model training method, the loss function is built on the discrimination results of an adversarial discriminator, whose adversarial characteristic quickly drives the reconstruction output of the initial reconstruction model toward a real picture, so that the trained reconstruction model can rapidly reconstruct super-resolution images from image information acquired at high frequency.
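The training steps above can be sketched as a toy adversarial loop. Everything here — the linear "generator" `W_g`, the logistic "discriminator" `w_d`, the data shapes, and the learning rate — is an illustrative stand-in, not the patent's neural networks:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# toy data: one flattened reference image r (prior image) and
# 8 grayscale frames K captured between two reference frames
r = rng.normal(size=16)
K = rng.normal(size=(8, 16))

W_g = np.eye(16) * 0.1              # "initial reconstruction model" parameters
w_d = rng.normal(size=16) * 0.01    # "adversarial discriminator" parameters

lr = 0.05
for step in range(200):
    recon = K @ W_g                 # S2: preliminary reconstructed image information
    # S3/S4: the discriminator tries to score the reference high ("true")
    # and the reconstructions low ("false")
    d_real = sigmoid(w_d @ r)
    d_fake = sigmoid(recon @ w_d)
    grad_d = r * (1.0 - d_real) - recon.T @ d_fake / len(K)
    w_d += lr * grad_d              # discriminator ascent step
    # S5: the generator is driven toward reconstructions resembling the reference
    grad_g = K.T @ (recon - r) / len(K)
    W_g -= lr * grad_g

final_err = float(np.mean((K @ W_g - r) ** 2))
```

After training, the reconstruction error toward the reference image has shrunk substantially, mirroring the claim that the reconstruction output quickly approaches a real picture.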
The image reconstruction model training method, wherein the initial reconstruction model comprises a plurality of convolution layers and a plurality of deconvolution layers arranged in sequence, the number of convolution layers being greater than the number of deconvolution layers.
In this method, convolution feature extraction over the grayscale image information by the plurality of convolution layers gradually removes unnecessary features and produces a convolution feature image; the up-sampling performed by the deconvolution layers then gradually converts the convolution feature image into an image whose resolution matches that of the reference image information, so that it can be compared adversarially against the reference image information.
The image reconstruction model training method, wherein the adversarial discriminator comprises a first adversarial discriminator and a second adversarial discriminator; the first adversarial discriminator compares the preliminary reconstructed image information with the earlier reference image information, and the second adversarial discriminator compares the preliminary reconstructed image information with the later reference image information.
In this method, the first adversarial discriminator and the second adversarial discriminator are designed to discriminate the degree of difference between the preliminary reconstructed image information generated by the initial reconstruction model and the two pieces of reference image information. It should be understood that in this example the loss function includes the discrimination results of both discriminators, so that the super-resolution image reconstructed by the finally trained reconstruction model from the input grayscale image information is highly similar to both pieces of reference image information, improving the fidelity of the reconstructed super-resolution image.
The image reconstruction model training method, wherein the step of establishing a loss function according to the preliminary reconstructed image information, the reference image information, and the adversarial discriminator comprises:
establishing a first objective function according to the preliminary reconstructed image information, the reference image information, and the first adversarial discriminator;
establishing a second objective function according to the preliminary reconstructed image information, the reference image information, and the second adversarial discriminator;
and establishing the loss function from the first objective function and the second objective function based on preset weights.
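The weighted combination in the claim above reduces to a simple form; the functional shapes of the two objectives and the weight values below are placeholder assumptions, since the patent does not fix them here:

```python
def combined_loss(obj1, obj2, w1=0.5, w2=0.5):
    """Loss = w1 * (first-discriminator objective) + w2 * (second-discriminator objective).

    obj1/obj2 stand in for the two objective-function values; w1/w2 are the
    preset weights from the claim. All numeric values are illustrative.
    """
    return w1 * obj1 + w2 * obj2

# example: weight the earlier-reference objective more heavily
loss = combined_loss(obj1=0.8, obj2=0.4, w1=0.6, w2=0.4)
```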
The image reconstruction model training method, wherein the grayscale image information is generated based on acquisition by a binocular DAVIS event camera.
The image reconstruction model training method, wherein each piece of preliminary reconstructed image information is generated by the initial reconstruction model from at least one piece of grayscale image information.
In a second aspect, the present application further provides an image reconstruction method for reconstructing and acquiring a super-resolution image, the method comprising the following steps:
reconstructing the grayscale image information to be processed using the reconstruction model trained by the image reconstruction model training method provided in the first aspect, to generate the super-resolution image.
In this image reconstruction method, image reconstruction with the reconstruction model trained in the first aspect can generate high-frequency super-resolution images from high-frequency grayscale image information for use as robot vision data. The robot can thereby quickly obtain super-resolution images free of motion blur under low delay and high dynamic range, which improves the accuracy of the robot's environment recognition and suits a wide range of mobile robots.
In a third aspect, the present application further provides an image reconstruction model training apparatus for training an initial reconstruction model into a reconstruction model capable of reconstructing super-resolution images, the apparatus comprising:
an acquisition module, used for acquiring two pieces of reference image information and a plurality of pieces of grayscale image information captured between the two pieces of reference image information;
a preliminary reconstruction module, used for reconstructing a plurality of pieces of preliminary reconstructed image information from the grayscale image information using the initial reconstruction model;
a discriminator module, used for establishing an adversarial discriminator according to the preliminary reconstructed image information and the reference image information;
a loss module, used for establishing a loss function according to the preliminary reconstructed image information, the reference image information, and the adversarial discriminator;
a training module, used for training the initial reconstruction model with the loss function to generate the reconstruction model.
In this image reconstruction model training apparatus, the loss function is built on the discrimination results of the adversarial discriminator, whose adversarial characteristic quickly drives the reconstruction output of the initial reconstruction model toward a real picture and smooths the model training process, so that the trained reconstruction model can rapidly reconstruct super-resolution images from image information acquired at high frequency.
In a fourth aspect, the present application further provides an electronic device comprising a processor and a memory, the memory storing computer-readable instructions which, when executed by the processor, perform the steps of the method provided in the first or second aspect.
In a fifth aspect, the present application also provides a storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method as provided in the first or second aspect above.
In summary, the present application provides an image reconstruction model training method, a reconstruction method, an apparatus, a device, and a medium. The training method uses the reference image information as a prior image, together with the preliminary reconstructed image information reconstructed from the grayscale image information by the initial reconstruction model, to establish an adversarial discriminator, and builds a loss function on the discrimination results of that adversarial discriminator to perform model training. The reconstruction output of the initial reconstruction model thereby quickly approaches a real picture, the training process is smoothed, and the trained reconstruction model can rapidly reconstruct super-resolution images from image information acquired at high frequency, realizing super-resolution reconstruction at low cost and low delay.
Drawings
Fig. 1 is a flowchart of an image reconstruction model training method provided in an embodiment of the present application.
Fig. 2 is a schematic diagram of a network layer structure of an initial reconstruction model.
Fig. 3 is a schematic diagram of the connection structure among the initial reconstruction model, the first adversarial discriminator, and the second adversarial discriminator.
Fig. 4 is a schematic structural diagram of an image reconstruction model training device provided in an embodiment of the present application.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Reference numerals are as follows: 201. an acquisition module; 202. a preliminary reconstruction module; 203. a discriminator module; 204. a loss module; 205. a training module; 301. a processor; 302. a memory; 303. a communication bus.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, as presented in the figures, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
At present, robot vision generally adopts frame-image-based camera schemes such as binocular RGB cameras, RGB-D cameras, and infrared cameras combined with visible light, from which images can be continuously extracted as environmental information. However, the images extracted by existing robot vision generally suffer from the following problems: 1. image blur caused by rapid robot motion or vibration; 2. in overexposed or dim scenes, the vision sensor easily loses information about the photographed object, giving low robustness.
The embodiments of the present application aim to obtain a reconstruction model for super-resolution images usable in power inspection, in particular one deployable on robotic inspection equipment whose mobile body is a UAV.
In a first aspect, referring to figs. 1-3, fig. 1 shows an image reconstruction model training method of some embodiments of the present application for training an initial reconstruction model into a reconstruction model capable of reconstructing super-resolution images, the method comprising the following steps:
s1, acquiring two pieces of reference image information and a plurality of pieces of gray scale image information shot between the two pieces of reference image information;
specifically, the reference image information is a clearly visible image, which may be a grayscale image or a color image, and in this embodiment of the application, the reference image information is regarded as a prior image with sufficient resolution and definition for training the initial reconstruction model, so that an image obtained by reconstructing the finally obtained reconstruction model can be similar to the reference image as much as possible.
More specifically, the two pieces of reference image information may be two images that are continuously captured, or may be two images captured at the same position with a short interval; in the embodiment of the present application, it is preferable that two consecutive image frames are continuously captured by using the same camera, and the grayscale image information is a grayscale image captured by an image sensor with a higher capture frequency, which is preferably all grayscale images (which may include grayscale images captured simultaneously with the reference image information) between the two consecutive image frames, and these grayscale images can be used as reconstruction materials or training materials when the initial reconstructed model reconstructs an image.
More specifically, since the reference image information is a priori image used for training the initial reconstruction model, it should be understood that the reference image information is collected at the same position as the grayscale image information, so that the reference image information can be used for verifying that the initial reconstruction model reconstructs the acquired image.
More specifically, the reference image information is used as an image for evaluating the definition or resolution of the reconstructed image, so that it only needs to be a two-dimensional image with sufficiently high definition and resolution, and in the embodiment of the present application, it is preferably a real picture acquired by using an RGB camera.
More specifically, the grayscale image information needs to be acquired by a camera with an acquisition frequency higher than that of the RBG camera, and in this embodiment, it is preferable to acquire by an event camera with a grayscale image acquisition function, where the event camera performs image acquisition based on the principle of pixel brightness change and has an acquisition frequency much higher than that of the RGB camera, so that multiple pieces of grayscale image information can be acquired and generated between two consecutive pieces of reference image information acquired and generated by the RGB camera.
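A minimal sketch of how the grouping in step S1 might look: all event-camera grayscale frames whose timestamps fall between two consecutive RGB reference frames are collected as one training batch. The function name, timestamp values, and inclusive bounds are illustrative assumptions:

```python
def frames_between(ref_t0, ref_t1, gray_frames):
    """Return the grayscale images captured between two reference frames.

    gray_frames: list of (timestamp, image) tuples from the event camera.
    Bounds are inclusive, since the text allows grayscale frames captured
    simultaneously with a reference frame.
    """
    return [img for t, img in gray_frames if ref_t0 <= t <= ref_t1]

# event-camera frames arrive far more often than the ~30 Hz reference frames
gray = [(0.000, "k1"), (0.004, "k2"), (0.008, "k3"), (0.033, "k4"), (0.040, "k5")]
batch = frames_between(0.000, 0.033, gray)   # material for one reference pair
```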
S2, reconstructing a plurality of pieces of preliminary reconstructed image information from the grayscale image information using the initial reconstruction model;
specifically, in the embodiment of the present application, the initial reconstruction model may be a preliminary reconstruction image information reconstructed according to a gray image information, or may be a preliminary reconstruction image information reconstructed according to a plurality of gray image information, in the method of the embodiment of the present application, different amounts of gray image information are input to the initial reconstruction model for image reconstruction, so that preliminary reconstruction image information with different degrees of reconstruction effects can be obtained, the subsequent steps are facilitated to verify different preliminary reconstruction image information generated by the initial reconstruction model under different conditions, and the omnidirectional training of the initial reconstruction model is realized, so that the reconstruction model obtained by the final training can reconstruct an ultra-resolution image according to single or plural gray image information.
S3, establishing an adversarial discriminator according to the preliminary reconstructed image information and the reference image information;
specifically, the initial reconstruction model is a prototype of the reconstruction model, the initial reconstruction image information is an image with super-resolution characteristics generated by the reconstruction of the initial reconstruction model according to the gray image information, but the reconstruction effect of the initial reconstruction image information is not verified, so that the reconstruction effect of the initial reconstruction image information needs to be verified by using the reference image information as the prior image.
More specifically, the countermeasure discriminator is a discriminator based on the countermeasure principle, and is capable of discriminating the reference image information as true and the preliminary reconstructed image information as false as possible, and therefore it should be understood that the countermeasure discriminator is a previously trained discriminator or a discriminator trained based on the preliminary reconstructed image information and the reference image information acquired at present; the countermeasure discriminator can be used for evaluating the training effect of the initial reconstruction model, and if the countermeasure discriminator discriminates the preliminary reconstruction image information as true, the countermeasure discriminator can not distinguish the difference between the reference image information and the preliminary reconstruction image information, and the preliminary reconstruction image information is highly similar to the reference image information and has super-resolution and definition equivalent to the reference image information.
More specifically, since the plurality of preliminary reconstructed image information is generated in step S2, it is to be understood that the confrontation discriminator in step S3 can be used for different preliminary reconstructed image information.
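The adversarial-discriminator principle above — score references as "true" (1) and preliminary reconstructions as "false" (0) — can be illustrated with a stand-in logistic discriminator. The real discriminator is a neural network; the linear model, feature dimensions, and synthetic data below are assumptions made purely to show the principle:

```python
import numpy as np

rng = np.random.default_rng(1)
real = rng.normal(loc=1.0, size=(32, 8))    # features of reference images
fake = rng.normal(loc=-1.0, size=(32, 8))   # features of preliminary reconstructions

X = np.vstack([real, fake])
y = np.concatenate([np.ones(32), np.zeros(32)])  # true = 1, false = 0

# logistic-regression discriminator trained by gradient descent
w = np.zeros(8)
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    w -= 0.1 * X.T @ (p - y) / len(y)

score_real = float((1.0 / (1.0 + np.exp(-(real @ w)))).mean())
score_fake = float((1.0 / (1.0 + np.exp(-(fake @ w)))).mean())
```

As long as reconstructions remain distinguishable from references, the discriminator scores them apart; the generator's training goal is to close that gap.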
S4, establishing a loss function according to the preliminary reconstructed image information, the reference image information, and the adversarial discriminator;
Specifically, since the adversarial discriminator can evaluate the training progress of the initial reconstruction model, a loss function built on the adversarial discriminator can evaluate the similarity between the preliminary reconstructed image information and the reference image information.
More specifically, the adversarial discriminator produces different discrimination results for different input data; this step therefore establishes the loss function over the discriminator's different results on the preliminary reconstructed image information and the reference image information.
More specifically, the loss function may be formulated over the adversarial discriminator's results for a specific number of pieces of preliminary reconstructed image information, or over its results for all of them.
S5, training the initial reconstruction model with the loss function to generate the reconstruction model.
Specifically, the loss function contains the adversarial discriminator's results on the preliminary reconstructed image information, and hence the initial reconstruction model's reconstruction mapping over the grayscale images. The training in step S5 adjusts the parameters of the initial reconstruction model until it reconstructs preliminary reconstructed image information highly similar to the reference image information — i.e. reconstructed pictures nearly indistinguishable from real pictures — at which point the output of the loss function converges; the parameters are then fixed and the initial reconstruction model is taken as the reconstruction model.
More specifically, the trained reconstruction model can reconstruct a super-resolution image from one or more pieces of grayscale image information, so it can be deployed on a robot equipped with a grayscale-image acquisition device to help the robot reconstruct super-resolution images.
More specifically, it should be understood that the image reconstruction model training method of the embodiment may execute steps S1-S5 multiple times to obtain a more accurate reconstruction model: step S1 repeatedly acquires two different pieces of reference image information, along with the multiple pieces of grayscale image information captured between them, as model training data.
In this image reconstruction model training method, the reference image information serves as a prior image and, together with the preliminary reconstructed image information reconstructed from the grayscale image information by the initial reconstruction model, establishes an adversarial discriminator; a loss function is built on the discrimination results of this adversarial discriminator to perform model training. The reconstruction output of the initial reconstruction model thereby quickly approaches a real picture, the training process is smoothed, and the trained reconstruction model can rapidly reconstruct super-resolution images from image information acquired at high frequency, realizing super-resolution reconstruction at low cost and low delay.
In some preferred embodiments, the grayscale image information is generated based on acquisition by a binocular DAVIS event camera.
Specifically, the binocular DAVIS event camera is an event camera that can output both event information and grayscale images. The image reconstruction model training method of the embodiment uses the grayscale images output by the binocular DAVIS event camera as the grayscale image information; this grayscale image information carries no depth information. Step S1 records the set of all grayscale image information as $K_T$, where $T$ is the acquisition time of the grayscale image information (in this embodiment, $T$ also equals the number of pieces of grayscale image information collected), so that $K_T = (k_{j1}, k_{j2}, \ldots, k_{jT})$, where $j$ is the acquisition-time index of the earlier reference image information and $k_{ji}$ is an element of $K_T$, i.e. one piece of grayscale image information, satisfying $k_{ji} \in R^{m \times n}$, $i = 1, 2, \ldots, T$, with $R^{m \times n}$ the set of $m \times n$ real matrices:

$$k_{ji} = \begin{pmatrix} g_{11} & g_{12} & \cdots & g_{1n} \\ g_{21} & g_{22} & \cdots & g_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ g_{m1} & g_{m2} & \cdots & g_{mn} \end{pmatrix} \qquad (1)$$

where $g$ is a grayscale value and $m$, $n$ are pixel coordinates.
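In NumPy terms, the notation above might be represented as follows; the shapes and value range are illustrative assumptions, not taken from the patent:

```python
import numpy as np

# Each grayscale frame k_ji is an m-by-n real matrix of gray values g,
# and K_T stacks the T frames captured between two reference frames.
m, n, T = 4, 6, 5
K_T = np.random.default_rng(2).integers(0, 256, size=(T, m, n)).astype(float)

k_j1 = K_T[0]   # one piece of grayscale image information, k_j1 ∈ R^(m×n)
```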
More specifically, it should be understood that the reconstruction model trained by this method can be applied to any robot capable of acquiring grayscale image information, preferably one equipped with a binocular DAVIS event camera. This ensures that the size specification and acquisition frequency of the grayscale image information acquired in deployment match those used in training, further guaranteeing the super-resolution reconstruction quality.
More specifically, the event camera guarantees sufficiently clear grayscale image information and mitigates the low robustness of conventional vision sensors, which easily lose information about the photographed object in overexposed or dark scenes.
In some preferred embodiments, the initial reconstruction model includes a plurality of convolution layers and a plurality of deconvolution layers arranged in sequence, and the number of convolution layers is greater than that of deconvolution layers.
Specifically, convolution feature extraction over several convolution layers progressively removes unnecessary features from the grayscale image information and produces convolution feature maps; up-sampling through the deconvolution layers then progressively converts these feature maps into an image whose resolution matches the reference image information, so that it can be compared adversarially against the reference image information.
In some preferred embodiments, as shown in fig. 2, each deconvolution layer (deconv 1-deconv 4) is connected to the convolution layer in the symmetric position (conv 4-conv 1); the deconvolution and convolution layers are symmetric about conv5, e.g., conv1 is symmetric to deconv4. In this way the up-sampling result of each deconvolution layer fuses the output of its connected convolution layer with the up-sampled output of the preceding network layer, raising the resolution of its output while avoiding distortion.
Specifically, the number of convolution and deconvolution layers can be adjusted to the required reconstruction quality: more layers generally yield a higher-resolution super-resolution image, but lengthen model training and image reconstruction, and require reference image information of correspondingly higher resolution for training. In this embodiment, 5 convolution layers and 4 deconvolution layers are preferred, which suffices for the robot's environment-recognition requirements.
In some preferred embodiments, as shown in fig. 2, the initial reconstruction model in the training method of the image reconstruction model in the embodiment of the present application has 5 convolutional layers (conv 1-conv 5) and 4 deconvolution layers (deconv 1-deconv 4), wherein 5 convolutional layers and 4 deconvolution layers are sequentially connected, and conv1 is connected with deconv4, conv2 is connected with deconv3, conv3 is connected with deconv2, and conv4 is connected with deconv 1; frame img is the image input end of the initial reconstruction model, and out img is the image output end of the initial reconstruction model.
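The layer layout can be traced with a minimal shape sketch. The stand-ins below replace learned convolutions with plain down/up-sampling and assume a weighted fusion at each skip connection, so only the shape flow and the conv1-deconv4 through conv4-deconv1 pairing are meaningful, not the actual image quality:

```python
import numpy as np

def down(x):
    # stand-in for a stride-2 convolution layer (conv1..conv5)
    return x[::2, ::2]

def up(x):
    # stand-in for a stride-2 deconvolution layer (deconv1..deconv4)
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def reconstruct(frame, lam=0.5):
    # frame img -> conv1..conv5 -> deconv1..deconv4 -> out img
    O = [frame]
    for _ in range(5):                       # conv1..conv5 outputs O1..O5
        O.append(down(O[-1]))
    d = O[5]
    for skip in (O[4], O[3], O[2], O[1]):    # fuse conv4..conv1 outputs
        d = lam * up(d) + (1 - lam) * skip   # assumed fusion rule, lam = 0.5
    return d

y = reconstruct(np.random.default_rng(1).random((64, 64)))
assert y.shape == (32, 32)   # each skip shape lines up layer by layer
```

With real learned layers the strides and padding determine the output resolution; here the sketch only verifies that each deconvolution output matches the shape of its symmetric convolution output.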
Specifically, in this embodiment the outputs of conv1-conv5 are denoted O_1, O_2, O_3, O_4 and O_5, and the stand-alone outputs of deconv1-deconv4 are denoted d_1, d_2, d_3 and d_4. For deconv1 the inputs are O_5 and O_4: the layer produces d_1 by up-sampling O_5 and then fuses in the conv4 output O_4, so that the final output of deconv1 is
d′_1 = λ·d_1 + (1 − λ)·O_4
where λ is a fusion weight, typically set to 0.5. Applying the same rule layer by layer, the output at out img, i.e., the output of the whole initial reconstruction model, is
PG(frame) = λ·d_4 + (1 − λ)·O_1
The two pieces of reference image information are two consecutively captured image frames and are therefore highly similar, so a super-resolution image reconstructed by the trained reconstruction model can be highly similar to either of them. A countermeasure discriminator can judge the similarity of the preliminary reconstruction image information to the preceding reference image information, or to the succeeding reference image information; further, the reconstructed super-resolution image may be highly similar to both. Therefore, in some preferred embodiments, the countermeasure discriminator comprises a first countermeasure discriminator for comparing the preliminary reconstruction image information with the preceding reference image information, and a second countermeasure discriminator for comparing the preliminary reconstruction image information with the succeeding reference image information.
Specifically, the first and second countermeasure discriminators are designed to judge the degree of difference between the preliminary reconstruction image information generated by the initial reconstruction model and the two pieces of reference image information. In this embodiment, the loss function established in step S4 includes the discrimination results of both discriminators, so that the super-resolution image reconstructed by the finally trained model from the input grayscale image information is highly similar to both reference images, improving the realism of the reconstructed super-resolution image.
In some preferred embodiments, each piece of preliminary reconstruction image information is generated by the initial reconstruction model from at least one piece of grayscale image information.
Specifically, step S2 may also generate the preliminary reconstruction image information from integrated images: more than one piece of grayscale image information is merged into an integrated image, which is fed into the initial reconstruction model at frame img, and the model reconstructs the corresponding preliminary reconstruction image information at out img.
More specifically, as shown in fig. 3, frames 1 to T are all integrated images. Generating integrated images from more than one piece of grayscale image information makes the input data of the initial reconstruction model more complete and its output more effective, improving the reconstruction quality of the trained model.
More specifically, as shown in fig. 3, in this embodiment the integrated images are overlaid cumulatively in order of grayscale acquisition time: frame 1 is generated directly from the first grayscale image, frame 2 from the first and second grayscale images together, and so on, with the last frame T integrating all the grayscale image information.
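A minimal sketch of this cumulative overlay follows; the patent does not fix the overlay operator, so a running average is assumed here:

```python
import numpy as np

def integrate(grays):
    # frame t integrates gray images 1..t (running average as the assumed
    # overlay operator; frame 1 is the first gray image itself)
    frames, acc = [], np.zeros_like(grays[0], dtype=float)
    for t, g in enumerate(grays, start=1):
        acc += g
        frames.append(acc / t)
    return frames

grays = [np.full((2, 3), v) for v in (10.0, 20.0, 30.0)]
frames = integrate(grays)
assert np.allclose(frames[0], 10.0)   # frame 1 = first gray image
assert np.allclose(frames[1], 15.0)   # frame 2 integrates images 1 and 2
assert np.allclose(frames[2], 20.0)   # frame T integrates all images
```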
More specifically, for a given piece of reference image information, the further away a grayscale image's acquisition time, the less similar its captured content, and hence the less similar the preliminary reconstruction image generated from it will be to that reference image. The training method therefore sets up a first and a second countermeasure discriminator to judge similarity against both pieces of reference image information, so that the super-resolution image produced by the finally trained reconstruction model is similar to the corresponding reference image information both forward and backward in time. In actual use, the reconstruction model can then map grayscale image information acquired within a given interval to a super-resolution image that captures the environmental features of that interval with high accuracy.
More specifically, because the initial reconstruction model is trained with two discriminators against reference images captured before and after the grayscale image information, the finally trained reconstruction model performs smooth reconstruction: the generated super-resolution image can be regarded as a transition image at a moment between the two reference images, i.e., the model smoothly reconstructs the real picture at moments within the period spanned by the consecutive grayscale images.
In some preferred embodiments, the step of establishing a loss function from the preliminary reconstructed image information, the reference image information and the confrontation discriminator comprises:
S41, establishing a first objective function according to the preliminary reconstruction image information, the reference image information and the first countermeasure discriminator;
Specifically, as noted above, the first countermeasure discriminator outputs a true result for the preceding reference image information and, as far as possible, a false result for preliminary reconstruction images, so it can judge the similarity between the preliminary reconstruction image information and the preceding reference image information. The first objective function based on it can be formed either from the difference (subtraction) or from the sum (addition) of its discrimination results; in this embodiment it is defined as:
L_D1 = D_1(img_j) − D_1(PG(k_ji))    (2)
where L_D1 is the first objective function, D_1 is the first countermeasure discriminator, img_j is the preceding reference image information, j is its acquisition-time index, and PG is the initial reconstruction model, which in this embodiment reconstructs preliminary reconstruction image information directly from the grayscale image information. The first objective function compares the discriminator's outputs for a piece of preliminary reconstruction image information and for the preceding reference image information; the closer the result is to 0, the more similar the preliminary reconstruction image is to the reference image.
More specifically, substituting different grayscale image information into equation (2) gives the similarity between each preliminary reconstruction image generated by the initial reconstruction model and the preceding reference image information; the initial reconstruction model PG is then trained and adjusted until the first objective function converges.
More specifically, as noted above, using integrated images as the input of the initial reconstruction model improves the reconstruction quality; therefore, in some embodiments where the initial reconstruction model reconstructs the preliminary reconstruction image information from integrated images, the first objective function of equation (2) becomes:
L_D1 = D_1(img_j) − D_1(PG(frame_i))    (3)
where frame_i is the i-th integrated image.
Substituting different integrated images into equation (3) gives the similarity between each preliminary reconstruction image and the preceding reference image information; the initial reconstruction model PG is trained and adjusted until the first objective function converges.
More specifically, equations (2) and (3) each accept only a single input to the initial reconstruction model, so guaranteeing the reconstruction quality over all grayscale image information would require many separate inputs and corresponding training passes, complicating model training; thus, in some preferred embodiments, the first objective function is more preferably defined as:
L_D1 = Σ_{i=1..T} [ D_1(img_j) − D_1(PG(k_ji)) ]    (4)
Specifically, equation (4) takes all the grayscale image information as input, and the first countermeasure discriminator judges every piece of preliminary reconstruction image information the initial reconstruction model generates from it. The output of the first objective function thus represents the accumulated similarity between all preliminary reconstruction images and the preceding reference image information; training against the objective of equation (4) gives the reconstruction model a better overall reconstruction effect and lets it reconstruct grayscale image information from different moments.
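The accumulated objective can be sketched numerically. The difference form of the discrimination results described above is assumed, and the discriminator and reconstruction model are trivial stand-ins (a mean-intensity score and the identity map), not the trained networks:

```python
import numpy as np

def D1(img):
    # stand-in discriminator: scores an image by its mean intensity
    return float(img.mean())

def PG(k):
    # stand-in for the initial reconstruction model (identity map)
    return k

img_j = np.full((2, 2), 5.0)                       # preceding reference image
K = [np.full((2, 2), v) for v in (5.0, 4.0, 3.0)]  # grayscale inputs k_j1..k_jT

# summed objective: each term shrinks as PG's output approaches img_j
L1 = sum(D1(img_j) - D1(PG(k)) for k in K)
assert L1 == 3.0
```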
More specifically, to give the first countermeasure discriminator better discrimination ability, the initial reconstruction model may first be frozen while the first countermeasure discriminator is trained on equation (4) (its parameters adjusted to make the output of equation (4) as small as possible), so that it distinguishes the preceding reference image information from preliminary reconstruction images more clearly; the initial reconstruction model is then unfrozen, the first countermeasure discriminator's parameters fixed, and the initial reconstruction model trained, further improving the training result. In some other embodiments, the first countermeasure discriminator, the second countermeasure discriminator, and the initial reconstruction model may instead be trained simultaneously in step S5.
More specifically, as noted above, using integrated images as the input of the initial reconstruction model improves the reconstruction quality; therefore, in some embodiments where the initial reconstruction model reconstructs the preliminary reconstruction image information from integrated images, the first objective function of equation (4) becomes:
L_D1 = Σ_{i=1..T} [ D_1(img_j) − D_1(PG(frame_i)) ]    (5)
specifically, substituting the integrated image for the reconstruction of the preliminary reconstruction image information can improve the reconstruction effect of the preliminary reconstruction image information, but an additional image integration processing process is required, and the formula (4) or the formula (5) can be selected according to the actual reconstruction requirement.
Therefore, in the embodiment of the present application, the first objective function is preferably set to the formula (4) or the formula (5).
S42, establishing a second objective function according to the preliminary reconstruction image information, the reference image information and the second countermeasure discriminator;
Specifically, as noted above, the second countermeasure discriminator outputs a true result for the succeeding reference image information and, as far as possible, a false result for preliminary reconstruction images. Analogously to the definition of the first objective function, the second objective function can be defined in the following four forms:
L_D2 = D_2(img_{j+1}) − D_2(PG(k_ji))    (6)

L_D2 = D_2(img_{j+1}) − D_2(PG(frame_i))    (7)

L_D2 = Σ_{i=1..T} [ D_2(img_{j+1}) − D_2(PG(k_ji)) ]    (8)

L_D2 = Σ_{i=1..T} [ D_2(img_{j+1}) − D_2(PG(frame_i)) ]    (9)
where L_D2 is the second objective function, D_2 is the second countermeasure discriminator, img_{j+1} is the succeeding reference image information, j+1 is its acquisition-time index, and PG is the initial reconstruction model.
In the embodiment of the present application, the second objective function is preferably set to the formula (8) or the formula (9).
S43, establishing the loss function according to the first objective function and the second objective function based on preset weights.
Specifically, a loss function built from the first and second objective functions represents the combined similarity between the preliminary reconstruction image information and the two pieces of reference image information, so the loss function can generally be set directly as:
L = L_D1 + L_D2    (10)
where L is the loss function. In actual super-resolution reconstruction, the robot may need the reconstruction effect to lean toward a particular moment; for example, the reconstructed super-resolution image may need to be closer to the real picture at the earlier moment. In that case a larger weight is given to the first objective function during training, pulling the reconstructed super-resolution image closer to the preceding reference image information; the loss function is therefore more preferably set as:
L = μ_1·L_D1 + μ_2·L_D2    (11)
where μ_1 and μ_2 are the preset weights of the first and second objective functions, set according to actual usage requirements.
In some preferred embodiments, absent special requirements, the weight of the first objective function in the loss function equals that of the second, i.e., μ_1 and μ_2 are set to equal values, preferably 0.5 in this embodiment, so that the reconstructed super-resolution image tends toward a transition picture at the midpoint between the two pieces of reference image information.
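The weighted combination of the two objectives is then a one-liner; the weight symbols here stand in for the ones in the text:

```python
def total_loss(L1, L2, mu1=0.5, mu2=0.5):
    # preset-weight combination of the first and second objective functions
    return mu1 * L1 + mu2 * L2

assert total_loss(3.0, 1.0) == 2.0                       # equal weights
assert total_loss(3.0, 1.0, mu1=0.75, mu2=0.25) == 2.5   # lean toward preceding frame
```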
Specifically, once the structure of the loss function is determined, step S5 trains the initial reconstruction model with the loss function and a standard model-training algorithm; in this embodiment gradient descent is preferred, as follows:
Combining equations (4), (8) and (11):
L = μ_1·Σ_{i=1..T} [ D_1(img_j) − D_1(PG(k_ji)) ] + μ_2·Σ_{i=1..T} [ D_2(img_{j+1}) − D_2(PG(k_ji)) ]    (12)
The initial reconstruction model is updated until the loss L of equation (12) converges, at which point it is taken as the reconstruction model. To improve training, more reference image information and grayscale image information may be added, i.e., the initial reconstruction model is trained on several consecutive pieces of reference image information and the corresponding grayscale images; with N pieces of reference image information the loss function becomes:
L = Σ_{j=1..N−1} { μ_1·Σ_{i=1..T} [ D_1(img_j) − D_1(PG(k_ji)) ] + μ_2·Σ_{i=1..T} [ D_2(img_{j+1}) − D_2(PG(k_ji)) ] }    (13)
can be simplified as follows:
L = Σ_{j=1..N−1} Σ_{i=1..T} { μ_1·[ D_1(img_j) − D_1(PG(k_ji)) ] + μ_2·[ D_2(img_{j+1}) − D_2(PG(k_ji)) ] }    (14)
in this embodiment, the training process is to train the first reactance discriminator, the second reactance discriminator and the initial reconstruction model at the same time, and the parameters of the initial reconstruction model, the first reactance discriminator and the second reactance discriminator are respectively recorded as W1、W2And W3Three parametersSynthetic markθ t Gradient descent update (14) is performed using the following formula:
m_t = β_1·m_{t−1} + (1 − β_1)·g_t
v_t = β_2·v_{t−1} + (1 − β_2)·g_t²
θ_{t+1} = θ_t − α·m_t / (√v_t + ε)    (15)
where g_t = ∇_{θ_t} L is the gradient, t is the update step, β_1 and β_2 are the first and second update weights, m_t and v_t are the first and second iteration terms, α is the learning rate, and ε is a small constant keeping the denominator nonzero. θ_0 is initialized to match the corresponding model, with each element drawn from a 0-1 distribution, and m_0 = v_0 = 0. Equation (14) is updated with equation (15) until iterative convergence; W_1 is then fixed, and the initial reconstruction model becomes the reconstruction model, which can reconstruct high-frequency super-resolution images from high-frequency grayscale image information.
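The update loop can be sketched on a toy objective. The exact form of equation (15) is not rendered above, so a standard rule with first and second iteration terms matching the symbols in the text (an Adam-style update) is assumed:

```python
import numpy as np

def adam_step(theta, g, m, v, t, alpha=0.1, b1=0.9, b2=0.999, eps=1e-8):
    # first and second iteration terms with update weights beta1, beta2
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)          # bias-corrected terms
    v_hat = v / (1 - b2 ** t)
    return theta - alpha * m_hat / (np.sqrt(v_hat) + eps), m, v

# toy loss L(theta) = theta^2, gradient g_t = 2*theta
theta, m, v = np.array([5.0]), 0.0, 0.0
for t in range(1, 201):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)

assert abs(theta[0]) < 0.5   # iterates toward the minimum at 0
```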
In a second aspect, an embodiment of the present application provides an image reconstruction method for reconstructing and acquiring a super-resolution image, where the image reconstruction method includes the following steps:
using the reconstruction model trained by the image reconstruction model training method provided in the first aspect to reconstruct the grayscale image information to be processed and generate a super-resolution image.
In this image reconstruction method, the reconstruction model trained in the first aspect performs the image reconstruction, generating high-frequency super-resolution images from high-frequency grayscale image information for use as robot vision data. The robot can thus quickly obtain super-resolution images free of motion blur, with low latency and high dynamic range, improving the accuracy of its environment recognition; the method applies to all kinds of mobile robots, such as inspection robots.
In some preferred embodiments, the reconstruction model is preferably deployed in a robot equipped with an event camera, so that it reconstructs images from the high-frequency grayscale image information output by the event camera, ensuring the reconstructed super-resolution images are free of motion blur and obtained with low latency.
Specifically, after generating the super-resolution image, the image reconstruction method can attach the polarity and timestamp of each event point acquired by the event camera to the corresponding pixel of the super-resolution image, enriching the image with environmental parameters and enabling more accurate environment recognition by the robot.
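A sketch of that annotation step follows; the event tuple layout (x, y, polarity, timestamp) and the channel stacking are assumptions for illustration:

```python
import numpy as np

def annotate(sr_image, events):
    # attach event polarity and timestamp to the matching super-resolution pixels
    h, w = sr_image.shape
    polarity = np.zeros((h, w))
    stamp = np.zeros((h, w))
    for x, y, p, ts in events:
        polarity[y, x] = p     # +1 / -1 brightness-change polarity
        stamp[y, x] = ts       # event timestamp
    return np.stack([sr_image, polarity, stamp])  # image + event channels

out = annotate(np.zeros((4, 4)), [(1, 2, +1, 0.5), (3, 0, -1, 0.7)])
assert out.shape == (3, 4, 4)
assert out[1, 2, 1] == 1.0 and out[2, 0, 3] == 0.7
```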
In some preferred embodiments, the event camera is preferably a binocular DAVIS event camera.
In a third aspect, referring to fig. 4, some embodiments of the present application provide an image reconstruction model training apparatus for training an initial reconstruction model into a reconstruction model capable of generating super-resolution images. The apparatus includes:
an obtaining module 201, configured to obtain two pieces of reference image information and a plurality of pieces of grayscale image information captured between the two pieces of reference image information;
a preliminary reconstruction module 202, configured to reconstruct, by using an initial reconstruction model, according to the multiple pieces of grayscale image information, to generate multiple pieces of preliminary reconstruction image information;
the discriminator module 203 is used for establishing a confrontation discriminator according to the preliminary reconstruction image information and the reference image information;
a loss module 204, configured to establish a loss function according to the preliminary reconstructed image information, the reference image information, and the confrontation discriminator;
a training module 205 for training the initial reconstruction model using the loss function to generate the reconstruction model.
The image reconstruction model training apparatus of this embodiment establishes the countermeasure discriminator from the reference image information, used as prior images, and from the preliminary reconstruction image information reconstructed from the grayscale image information by the initial reconstruction model; it builds a loss function from the discrimination results of this adversarial discriminator for model training. This drives the model's reconstruction output quickly toward real pictures, smooths the training process, and yields a reconstruction model that rapidly reconstructs super-resolution images from image information acquired at high frequency, achieving super-resolution reconstruction at low cost and low latency.
In some preferred embodiments, the image reconstruction model training apparatus of the embodiments of the present application is configured to perform the image reconstruction model training method provided in the first aspect.
In a fourth aspect, please refer to fig. 5, where fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application, and the present application provides an electronic device including: the processor 301 and the memory 302, the processor 301 and the memory 302 being interconnected and communicating with each other via a communication bus 303 and/or other form of connection mechanism (not shown), the memory 302 storing a computer program executable by the processor 301, the processor 301 executing the computer program when the computing device is running to perform the method in any alternative implementation of the above-described embodiments.
In a fifth aspect, the present application provides a storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the computer program performs the method in any optional implementation manner of the foregoing embodiments. The storage medium may be implemented by any type of volatile or nonvolatile storage device or combination thereof, such as a Static Random Access Memory (SRAM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), an Erasable Programmable Read-Only Memory (EPROM), a Programmable Read-Only Memory (PROM), a Read-Only Memory (ROM), a magnetic Memory, a flash Memory, a magnetic disk, or an optical disk.
In summary, embodiments of the present application provide an image reconstruction model training method, a reconstruction method, an apparatus, a device, and a medium. The training method establishes a countermeasure discriminator from reference image information, used as prior images, and from preliminary reconstruction image information reconstructed from grayscale image information by an initial reconstruction model, and builds a loss function from the discriminator's results for model training. This drives the reconstruction output quickly toward real pictures, smooths training, and yields a reconstruction model that rapidly reconstructs super-resolution images from image information acquired at high frequency, achieving super-resolution reconstruction at low cost and low latency.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
In addition, units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
Furthermore, the functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. An image reconstruction model training method for training an initial reconstruction model to obtain a reconstruction model that can be reconstructed to generate super-resolution images, the method comprising the steps of:
acquiring two pieces of reference image information and a plurality of pieces of gray scale image information shot between the two pieces of reference image information;
reconstructing according to the gray scale image information by using the initial reconstruction model to generate a plurality of pieces of preliminary reconstruction image information;
establishing a countermeasure discriminator according to the preliminary reconstruction image information and the reference image information;
establishing a loss function according to the preliminary reconstruction image information, the reference image information and the countermeasure discriminator;
training the initial reconstruction model with the loss function to generate the reconstruction model.
2. The method according to claim 1, wherein the initial reconstruction model comprises a plurality of convolutional layers and a plurality of deconvolutional layers arranged in sequence, and the number of convolutional layers is greater than the number of deconvolutional layers.
3. The image reconstruction model training method according to claim 1, wherein the countermeasure discriminator comprises a first countermeasure discriminator and a second countermeasure discriminator, the first countermeasure discriminator being used for comparing the preliminary reconstruction image information with the preceding reference image information, and the second countermeasure discriminator being used for comparing the preliminary reconstruction image information with the subsequent reference image information.
4. The method according to claim 3, wherein the step of establishing a loss function according to the preliminary reconstruction image information, the reference image information, and the countermeasure discriminator comprises:
establishing a first objective function according to the preliminary reconstruction image information, the reference image information and the first countermeasure discriminator;
establishing a second objective function according to the preliminary reconstruction image information, the reference image information, and the second countermeasure discriminator;
establishing the loss function from the first objective function and the second objective function based on a preset weight.
5. The image reconstruction model training method according to claim 1, wherein the grayscale image information is acquired by a binocular DAVIS event camera.
6. The method according to claim 1, wherein each piece of preliminary reconstruction image information is generated by the initial reconstruction model by reconstruction from at least one piece of the grayscale image information.
7. An image reconstruction method for generating super-resolution images by reconstruction, the method comprising the steps of:
reconstructing grayscale image information to be processed by using the reconstruction model trained with the image reconstruction model training method according to any one of claims 1 to 6, to generate the super-resolution image.
8. An image reconstruction model training apparatus for training an initial reconstruction model to obtain a reconstruction model capable of generating super-resolution images by reconstruction, the apparatus comprising:
an acquisition module, configured to acquire two pieces of reference image information and a plurality of pieces of grayscale image information captured between the two pieces of reference image information;
a preliminary reconstruction module, configured to generate a plurality of pieces of preliminary reconstruction image information from the grayscale image information by reconstruction using the initial reconstruction model;
a discriminator module, configured to establish a countermeasure discriminator according to the preliminary reconstruction image information and the reference image information;
a loss module, configured to establish a loss function according to the preliminary reconstruction image information, the reference image information, and the countermeasure discriminator;
a training module to train the initial reconstruction model with the loss function to generate the reconstruction model.
9. An electronic device, comprising a processor and a memory storing computer-readable instructions which, when executed by the processor, perform the image reconstruction model training method according to any one of claims 1 to 6 or the steps in the image reconstruction method according to claim 7.
10. A storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, performs the image reconstruction model training method according to any one of claims 1 to 6 or the steps in the image reconstruction method according to claim 7.
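The dual-discriminator loss of claims 3 and 4 can be illustrated with a minimal sketch. The patent does not disclose the exact functional form of the two objective functions, so a standard non-saturating binary cross-entropy GAN objective is assumed for each discriminator, and the "preset weight" is represented by an assumed parameter `w`:

```python
import math

def gan_objective(d_real: float, d_fake: float) -> float:
    """Assumed per-discriminator objective: -log D(real) - log(1 - D(fake)).
    d_real and d_fake are the discriminator's scores in (0, 1) for a
    reference image and a preliminary reconstruction, respectively."""
    return -math.log(d_real) - math.log(1.0 - d_fake)

def dual_discriminator_loss(d1_real: float, d1_fake: float,
                            d2_real: float, d2_fake: float,
                            w: float = 0.5) -> float:
    """Weighted combination of the two objective functions (claim 4).
    d1_* comes from the first countermeasure discriminator (preliminary
    reconstruction vs. the preceding reference image), d2_* from the
    second (vs. the subsequent reference image).  The weight w stands in
    for the 'preset weight'; its value is not specified by the patent."""
    first = gan_objective(d1_real, d1_fake)    # first objective function
    second = gan_objective(d2_real, d2_fake)   # second objective function
    return w * first + (1.0 - w) * second
```

With `w = 0.5` and identical scores from both discriminators, the combined loss equals the per-discriminator objective; pushing `w` toward 1 biases training toward consistency with the preceding reference image.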
CN202210653453.9A 2022-06-10 2022-06-10 Image reconstruction model training method, reconstruction method, device, equipment and medium Active CN114723611B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210653453.9A CN114723611B (en) 2022-06-10 2022-06-10 Image reconstruction model training method, reconstruction method, device, equipment and medium


Publications (2)

Publication Number Publication Date
CN114723611A true CN114723611A (en) 2022-07-08
CN114723611B CN114723611B (en) 2022-09-30

Family

ID=82232620

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210653453.9A Active CN114723611B (en) 2022-06-10 2022-06-10 Image reconstruction model training method, reconstruction method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN114723611B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108460717A (en) * 2018-03-14 2018-08-28 儒安科技有限公司 An image generation method based on a dual-discriminator generative adversarial network
CN111091616A (en) * 2019-11-25 2020-05-01 艾瑞迈迪科技石家庄有限公司 Method and device for reconstructing three-dimensional ultrasonic image
CN111476294A (en) * 2020-04-07 2020-07-31 南昌航空大学 Zero sample image identification method and system based on generation countermeasure network
US20200311871A1 (en) * 2017-12-20 2020-10-01 Huawei Technologies Co., Ltd. Image reconstruction method and device
CN112037131A (en) * 2020-08-31 2020-12-04 上海电力大学 Single-image super-resolution reconstruction method based on generation countermeasure network
CN112508782A (en) * 2020-09-10 2021-03-16 浙江大华技术股份有限公司 Network model training method, face image super-resolution reconstruction method and equipment
CN112581370A (en) * 2020-12-28 2021-03-30 苏州科达科技股份有限公司 Training and reconstruction method of super-resolution reconstruction model of face image
CN113344793A (en) * 2021-08-04 2021-09-03 深圳市安软科技股份有限公司 Image super-resolution reconstruction method, device, equipment and storage medium
CN113837942A (en) * 2021-09-26 2021-12-24 平安科技(深圳)有限公司 Super-resolution image generation method, device, equipment and storage medium based on SRGAN
CN114283058A (en) * 2021-12-02 2022-04-05 河南农业大学 Image super-resolution reconstruction method based on countermeasure network and maximum mutual information optimization
CN114399424A (en) * 2021-12-23 2022-04-26 北京达佳互联信息技术有限公司 Model training method and related equipment

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
JIAQI MI et al.: "A Method of Plant Root Image Restoration Based on GAN", IFAC PapersOnLine *
YAN ZOU et al.: "Super-resolution reconstruction of infrared images based on a convolutional neural network with skip connections", Optics and Lasers in Engineering *
杨鼎康 et al.: "EE-GAN: facial expression recognition method based on generative adversarial network and network ensemble", Journal of Computer Applications *
袁飘逸 et al.: "Super-resolution reconstruction method of images based on a dual-discriminator generative adversarial network", Laser & Optoelectronics Progress *

Also Published As

Publication number Publication date
CN114723611B (en) 2022-09-30

Similar Documents

Publication Publication Date Title
Nie et al. Deeply learned filter response functions for hyperspectral reconstruction
CN109584248B (en) Infrared target instance segmentation method based on feature fusion and dense connection network
Rahmouni et al. Distinguishing computer graphics from natural images using convolution neural networks
Zhou et al. Semantic-supervised infrared and visible image fusion via a dual-discriminator generative adversarial network
Wang et al. 360sd-net: 360 stereo depth estimation with learnable cost volume
CN109284738B (en) Irregular face correction method and system
CN110245678B (en) Image matching method based on heterogeneous twin region selection network
CN112446380A (en) Image processing method and device
CN112446270A (en) Training method of pedestrian re-identification network, and pedestrian re-identification method and device
CN109086675B (en) Face recognition and attack detection method and device based on light field imaging technology
CN111160249A (en) Multi-class target detection method of optical remote sensing image based on cross-scale feature fusion
CN111222395A (en) Target detection method and device and electronic equipment
CN111079764B (en) Low-illumination license plate image recognition method and device based on deep learning
Liu et al. IR2VI: enhanced night environmental perception by unsupervised thermal image translation
CN107025660A (en) A kind of method and apparatus for determining binocular dynamic visual sensor image parallactic
CN112819875B (en) Monocular depth estimation method and device and electronic equipment
CN109977834B (en) Method and device for segmenting human hand and interactive object from depth image
Wan et al. Face image reflection removal
Swami et al. Candy: Conditional adversarial networks based fully end-to-end system for single image haze removal
CN111899345B (en) Three-dimensional reconstruction method based on 2D visual image
Jia et al. Effective meta-attention dehazing networks for vision-based outdoor industrial systems
Malav et al. DHSGAN: An end to end dehazing network for fog and smoke
Elmquist et al. Modeling cameras for autonomous vehicle and robot simulation: An overview
Babu et al. An efficient image dahazing using Googlenet based convolution neural networks
Shit et al. An encoder‐decoder based CNN architecture using end to end dehaze and detection network for proper image visualization and detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant