CN111340729B - Training method for depth residual error network for removing Moire pattern of two-dimensional code - Google Patents

Training method for depth residual error network for removing Moire pattern of two-dimensional code

Info

Publication number
CN111340729B
CN111340729B (application CN202010120759.9A)
Authority
CN
China
Prior art keywords: image, two-dimensional code, code image, residual, moiré
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010120759.9A
Other languages
Chinese (zh)
Other versions
CN111340729A (en
Inventor
陈昌盛
陆涵
黄继武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen University
Original Assignee
Shenzhen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen University filed Critical Shenzhen University
Publication of CN111340729A publication Critical patent/CN111340729A/en
Application granted granted Critical
Publication of CN111340729B publication Critical patent/CN111340729B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06KGRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K19/00Record carriers for use with machines and with at least a part designed to carry digital markings
    • G06K19/06Record carriers for use with machines and with at least a part designed to carry digital markings characterised by the kind of the digital marking, e.g. shape, nature, code
    • G06K19/06009Record carriers for use with machines and with at least a part designed to carry digital markings characterised by the kind of the digital marking, e.g. shape, nature, code with optically detectable marking
    • G06K19/06037Record carriers for use with machines and with at least a part designed to carry digital markings characterised by the kind of the digital marking, e.g. shape, nature, code with optically detectable marking multi-dimensional coding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06KGRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K7/00Methods or arrangements for sensing record carriers, e.g. for reading patterns
    • G06K7/10Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation
    • G06K7/14Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation using light without selection of wavelength, e.g. sensing reflected white light
    • G06K7/1404Methods for optical code recognition
    • G06K7/1408Methods for optical code recognition the method being specifically adapted for the type of code
    • G06K7/14172D bar codes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06KGRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K7/00Methods or arrangements for sensing record carriers, e.g. for reading patterns
    • G06K7/10Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation
    • G06K7/14Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation using light without selection of wavelength, e.g. sensing reflected white light
    • G06K7/1404Methods for optical code recognition
    • G06K7/146Methods for optical code recognition the method including quality enhancement steps
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks


Abstract

The disclosure describes a training method for a deep residual network for removing moiré from a two-dimensional code, which includes: preparing an original two-dimensional code image with moiré; inputting the original two-dimensional code image into a preprocessing module and then downsampling it to form a reduced preprocessed image; inputting the preprocessed image into a first residual module for upsampling to form a first output image whose size is enlarged back to that of the original two-dimensional code image; inputting the first output image into a second residual module to form a second output image that recovers the image information lost from the first output image; and performing feature fusion of the second output image and the original two-dimensional code image to form a feature-fusion image, which is input into a third residual module for refinement to form a moiré-removed image. In this way, the moiré in the original two-dimensional code image can be removed more effectively.

Description

Training method for depth residual error network for removing Moire pattern of two-dimensional code
Technical Field
The present disclosure generally relates to a training method for a deep residual network for removing moiré from two-dimensional codes.
Background
At present, three methods are commonly used for removing moiré from an image: layer decomposition on polyphase components (LDPC), used to remove moiré from camera-captured images; wavelet-domain filtering, used to remove moiré from scanned images; and image-restoration methods.
These three methods can remove moiré from an image to different degrees, but they share the following three limitations:
(1) Limited generality. Layer decomposition on polyphase components can remove moiré from camera-captured images, but in practice it handles only mild moiré interference, is not noticeably effective on images with large-scale moiré (especially colored moiré bands), and ultimately smooths away image details. Wavelet-domain filtering can remove moiré from scanned images, but only when the moiré is net-like. Traditional image restoration or an improved DnCNN (denoising convolutional neural network) can also be used for moiré removal, but because these models are not designed specifically for the moiré-removal problem, they achieve only suboptimal performance. In reality, moiré is anisotropic and varies randomly; each of the above techniques handles only a specific moiré pattern and is therefore of limited use.
(2) The influence on the original image cannot be completely avoided. Layer decomposition on polyphase components not only has a limited removal effect but also smooths the details of the original image, i.e., the restored image is somewhat distorted. Image-restoration approaches, for example a denoising autoencoder, remove moiré in the high-frequency band but also remove the high-frequency components of the original image, causing some blurring due to the loss of high-frequency information. None of these techniques can remove moiré without affecting the content of the original image.
(3) Increased complexity of use. Because moiré is anisotropic and non-uniform, removing several kinds of moiré from a set of images requires combining multiple prior-art techniques, each effective on a different moiré pattern. This increases the complexity of the moiré-removal problem, requires multiple steps, and still cannot achieve a good removal effect.
Disclosure of Invention
In view of the above circumstances, it is an object of the present invention to provide a training method for a deep residual network that can effectively remove moiré from a two-dimensional code.
Therefore, the present disclosure provides a training method for a deep residual network for removing moiré from a two-dimensional code, which includes: preparing an original two-dimensional code image with moiré; preparing a network device for processing the original two-dimensional code image, the network device comprising a preprocessing module, a first residual module, a second residual module, and a third residual module; inputting the original two-dimensional code image into the preprocessing module, blurring it to enlarge the pixels of the two-dimensional code area in the simulated two-dimensional code image, and then downsampling it to form a reduced preprocessed image; inputting the preprocessed image into the first residual module for upsampling to form a first output image whose size is enlarged back to that of the original two-dimensional code image; inputting the first output image into the second residual module to form a second output image that recovers the image information lost from the first output image; and performing feature fusion of the second output image and the original two-dimensional code image to form a feature-fusion image, which is then input into the third residual module for refinement to form a moiré-removed image.
According to the present disclosure, the preprocessed image is processed by the first, second, and third residual modules of the deep residual network, so that the moiré in the original two-dimensional code image can be removed conveniently and effectively.
In addition, in the training method for the deep residual network for two-dimensional code moiré removal according to the present disclosure, the first residual module may optionally include, connected in sequence, a first convolutional layer with a 3 × 3 kernel and 64 feature maps, a first ReLU activation layer, 16 residual blocks connected in series, a second convolutional layer with a 3 × 3 kernel and 64 feature maps, a batch normalization layer, a third convolutional layer with a 3 × 3 kernel and 256 feature maps, a fourth convolutional layer with a 1 × 1 kernel and 3 feature maps, and a tanh activation layer. This enables the moiré in the preprocessed image to be removed more effectively.
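Under the layer list above, the parameter counts of the first residual module can be sketched directly; note that the internal structure of the 16 residual blocks is not specified in this passage, so two 3 × 3 convolutions with 64 feature maps per block (a common residual-block design) is an assumption:

```python
# Layer list of the first residual module with parameter counts.
# Assumption: each of the 16 residual blocks contains two 3x3 conv layers
# with 64 feature maps (the passage does not specify the block internals).

def conv_params(k, c_in, c_out):
    """Parameters of a k x k conv layer: weights plus one bias per output map."""
    return k * k * c_in * c_out + c_out

layers = [
    ("conv1 3x3, 3 -> 64", conv_params(3, 3, 64)),
    ("16 residual blocks (assumed 2 x conv 3x3, 64 -> 64 each)",
     16 * 2 * conv_params(3, 64, 64)),
    ("conv2 3x3, 64 -> 64", conv_params(3, 64, 64)),
    ("conv3 3x3, 64 -> 256", conv_params(3, 64, 256)),
    ("conv4 1x1, 256 -> 3", conv_params(1, 256, 3)),
]

for name, count in layers:
    print(f"{name}: {count:,}")
print("total (excluding batch norm):", sum(c for _, c in layers))
```

Almost all of the module's capacity sits in the residual blocks; the 1 × 1 output convolution merely projects the 256 feature maps down to the 3 image channels.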
In addition, in the training method of the deep residual network for two-dimensional code moiré removal according to the present disclosure, optionally, the second residual module includes 10 sequentially connected first combination layers, each of which concatenates a convolutional layer with a 5 × 5 kernel and 64 feature maps with a second ReLU activation layer, followed by 10 second combination layers, each of which concatenates a convolutional layer with a 3 × 3 kernel and 64 feature maps with a third ReLU activation layer. Thereby, the image information lost from the first output image can be recovered.
In addition, in the training method for the deep residual network for two-dimensional code moiré removal according to the present disclosure, optionally, the third residual module includes, connected in sequence, a fifth convolutional layer with a 3 × 3 kernel and 128 feature maps, a fourth ReLU activation layer, and a sixth convolutional layer with a 3 × 3 kernel and 3 feature maps. This enables the moiré in the second output image to be removed effectively.
In addition, in the training method for the deep residual network for two-dimensional code moiré removal according to the present disclosure, optionally, the original two-dimensional code image is a synthesized simulated two-dimensional code image. Thus, the original two-dimensional code image can be generated more conveniently.
In addition, in the training method for the deep residual network for two-dimensional code moiré removal according to the present disclosure, optionally, forming the simulated two-dimensional code image includes the following steps: resampling the input image to generate a mosaic of RGB pixels for display on a display; randomly applying a projective transformation to simulate different relative positions and orientations between the display and the camera; simulating the distortion of the camera lens using a radial distortion function; simulating anti-aliasing filtering with a flat-top Gaussian filter; resampling the input image to simulate the input of the camera sensor; adding Gaussian noise to simulate sensor noise; demosaicing; denoising with a denoising filter; compressing the image; and outputting the decompressed image to form the simulated two-dimensional code image. In this way, the simulated two-dimensional code image can be generated more conveniently.
In addition, in the training method of the deep residual network for two-dimensional code moiré removal according to the present disclosure, optionally, the two-dimensional code image on the display corresponding to the input image is processed using the same projective transformation and the same lens distortion function. In this way, display-side two-dimensional code images in one-to-one correspondence with the input images can be obtained conveniently.
In addition, in the training method of the deep residual network for two-dimensional code moiré removal according to the present disclosure, optionally, a mean square error function

Loss = (1 / (M · N)) · Σ_{x=1..M} Σ_{y=1..N} ( H(G(I'))(x, y) − J(x, y) )²

is adopted as the loss function for training the deep residual network, where M and N are the height and width of the simulated two-dimensional code image, H(G(I')) is the two-dimensional code image output by the deep residual network, and J is the two-dimensional code image on the display corresponding to the simulated two-dimensional code image; the model and parameters of the deep residual network are saved when the training loss is minimal. In this way, the deep residual network can be trained effectively.
In addition, the training method for the deep residual network for two-dimensional code moiré removal according to the present disclosure optionally further includes preparing a real two-dimensional code image with moiré and, starting from the model and parameters of the deep residual network at minimal training loss, performing a transfer-learning operation on the deep residual network using the real two-dimensional code image. This allows the deep residual network to be trained more effectively.
In addition, in the training method for the deep residual network for two-dimensional code moiré removal according to the present disclosure, optionally, the real two-dimensional code image is subjected to an angle transformation to obtain the two-dimensional code image on the display in one-to-one correspondence with the real two-dimensional code image. Thus, display-side two-dimensional code images corresponding one-to-one to the real two-dimensional code images can be obtained.
According to the present disclosure, the preprocessed image is processed by the first, second, and third residual modules of the deep residual network, so that the moiré in the original two-dimensional code image can be removed conveniently and effectively.
Drawings
Embodiments of the present disclosure will now be explained in further detail, by way of example only, with reference to the accompanying drawings, in which:
fig. 1 is a block diagram schematically showing the deep residual network according to the present embodiment.
Fig. 2 is a schematic diagram showing a specific configuration of the deep residual network according to the present embodiment.
Fig. 3 is a flowchart illustrating a method for training the deep residual network for two-dimensional code moiré removal according to the present embodiment.
Fig. 4 (a) is a schematic diagram showing a synthesized simulated two-dimensional code image according to the present embodiment, and fig. 4 (b) is a schematic diagram showing the moiré-removed image of fig. 4 (a) according to the present embodiment.
Fig. 5 (a) is a schematic diagram showing a captured image according to the present embodiment, fig. 5 (b) is a schematic diagram showing a reconstructed image according to the present embodiment, and fig. 5 (c) is a schematic diagram showing the moiré-removed image of fig. 5 (a) according to the present embodiment.
Description of the symbols:
1 … network device; 10 … preprocessing module; 20 … first residual module; 30 … second residual module; 31 … first combination layer; 32 … second combination layer; 40 … third residual module.
Detailed Description
Hereinafter, preferred embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. In the following description, the same components are denoted by the same reference numerals, and redundant description thereof is omitted. The drawings are schematic and the ratio of the dimensions of the components and the shapes of the components may be different from the actual ones.
Fig. 1 is a block diagram schematically showing a network device 1 according to the present embodiment. Fig. 2 is a schematic diagram showing a specific configuration of the network device 1 according to the present embodiment. Fig. 3 is a flowchart illustrating the two-dimensional code moiré-removal training method using the network device 1 according to the present embodiment.
Digital cameras and smartphones are used to photograph electro-optical displays (display screens) for convenient data recording and transmission. However, because the pixel lattices of the camera sensor and of the display device cannot be perfectly matched, the captured image often suffers from moiré interference. Most existing moiré-removal techniques are aimed at natural images, and their effect is not particularly satisfactory. Building a display-to-camera communication channel based on image two-dimensional codes is a topic of active interest, but moiré distortion severely hinders the effectiveness of such a communication system. Our investigation found that no moiré-removal technique currently targets two-dimensional codes.
In order to solve the above problems, the present disclosure designs a training method for a deep residual network for two-dimensional code moiré removal based on a network device 1. Referring to figs. 1 to 3, the network device 1 of the present disclosure may include a preprocessing module 10, a first residual module 20, a second residual module 30, and a third residual module 40, connected in series.
The disclosed training method of the deep residual network for removing moiré from a two-dimensional code comprises the following steps: preparing an original two-dimensional code image I with moiré (step S100); preparing a network device for processing the original two-dimensional code image, the network device including a preprocessing module, a first residual module, a second residual module, and a third residual module (step S200); inputting the original two-dimensional code image I into the preprocessing module 10, blurring it to enlarge the pixels of the two-dimensional code area in the simulated two-dimensional code image, and then downsampling it to form a reduced preprocessed image I' (step S300); inputting the preprocessed image I' into the first residual module 20 for upsampling to form a first output image whose size is enlarged back to that of the original two-dimensional code image I (step S400); inputting the first output image into the second residual module 30 to form a second output image that recovers the image information lost from the first output image (step S500); and feature-fusing the second output image with the original two-dimensional code image I to form a feature-fusion image, which is then input into the third residual module 40 for refinement to form a moiré-removed image (step S600).
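The data flow of steps S300 to S600 can be sketched with toy stand-ins; the 2× scale factor, nearest-neighbour resampling, and identity modules below are illustrative assumptions, not the trained networks:

```python
import numpy as np

# Toy stand-ins for the four modules; only shapes are meaningful here.
def preprocess(img):                        # S300: blur + downsample (2x factor assumed)
    return img[::2, ::2]

def residual_module_1(img):                 # S400: upsample back to the input size
    return np.kron(img, np.ones((2, 2)))

def residual_module_2(img):                 # S500: recover lost detail (identity stand-in)
    return img

def residual_module_3(fused):               # S600: refine fused features to one map
    return fused.mean(axis=-1)

I = np.random.rand(8, 8)                    # toy "original" moiré image
x = residual_module_2(residual_module_1(preprocess(I)))
fused = np.stack([x, I], axis=-1)           # feature fusion with the original image
out = residual_module_3(fused)
print(out.shape)                            # matches the original image size
```

The key shape invariant this illustrates is that the first residual module restores the downsampled image to the original resolution, so it can be fused channel-wise with the original image before refinement.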
In the present disclosure, the moiré in the original two-dimensional code image I can be removed conveniently and effectively by processing the preprocessed image I' through the first residual module 20, the second residual module 30, and the third residual module 40 of the network device 1.
In some examples, the lost image information of the first output image may be, for example, missing pixels or lost pixel coordinates.
In some examples, the first residual module 20 may include, connected in sequence, a first convolutional layer with a 3 × 3 kernel and 64 feature maps, a first ReLU (rectified linear unit) activation layer, 16 residual blocks connected in series, a second convolutional layer with a 3 × 3 kernel and 64 feature maps, a batch normalization layer (Batch Norm), a third convolutional layer with a 3 × 3 kernel and 256 feature maps, a fourth convolutional layer with a 1 × 1 kernel and 3 feature maps, and a tanh activation layer. In this case, the first residual module 20 can be used to remove moiré that differs markedly from the content of the preprocessed image I', such as colored band-shaped moiré. Of course, the kernel sizes and the numbers of feature maps in the first residual module 20 are not fixed and may be adjusted for different trained network devices 1; they are not limited here.
In some examples, the second residual module 30 may include 10 sequentially connected first combination layers 31, each formed by concatenating a convolutional layer with a 5 × 5 kernel and 64 feature maps with a second ReLU activation layer, followed by 10 sequentially connected second combination layers 32, each formed by concatenating a convolutional layer with a 3 × 3 kernel and 64 feature maps with a third ReLU activation layer. In this case, the second residual module 30 can be used to recover the missing detail information of the first output image. Of course, the kernel sizes and the numbers of feature maps in the first combination layers 31 and the second combination layers 32 are not fixed and may be adjusted for different trained network devices 1; they are not limited here.
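Assuming stride 1 and no dilation (neither is stated here), the receptive field of this 20-layer convolutional stack can be computed directly, which indicates how large a neighbourhood each recovered pixel can draw on:

```python
# Receptive field of the second residual module's conv stack, assuming
# stride 1 and no dilation: each k x k conv layer adds (k - 1) pixels.
def receptive_field(kernel_sizes):
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf

stack = [5] * 10 + [3] * 10   # ten 5x5 combination layers, then ten 3x3
print(receptive_field(stack)) # 61
```

A 61 × 61 receptive field means each output pixel of the module can aggregate context well beyond a single two-dimensional code cell, which is consistent with its role of recovering lost detail.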
In some examples, the third residual module 40 may include, connected in series, a fifth convolutional layer with a 3 × 3 kernel and 128 feature maps, a fourth ReLU activation layer, and a sixth convolutional layer with a 3 × 3 kernel and 3 feature maps. This enables the moiré in the second output image to be removed effectively. Inputting the second output image and the original two-dimensional code image I into the third residual module 40 means that the two are feature-fused and then input into the third residual module 40. In this case, the fused features can be refined by the two convolutional layers to further remove some relatively regular moiré that is not easily separated from the original two-dimensional code image I, yielding the final moiré-removed image.
Fig. 4 (a) is a schematic diagram showing a synthesized simulated two-dimensional code image according to the present embodiment, and fig. 4 (b) is the corresponding moiré-removed image according to the present embodiment.
Referring to fig. 4 (a) and 4 (b), in some examples, the original two-dimensional code image I is a synthesized simulated two-dimensional code image. This enables the original two-dimensional code image I to be generated more easily.
It will be appreciated that, in order for the network device 1 of the present disclosure to recognize moiré in real environments more accurately, ideally we would train it only on image pairs formed by real captured images and the corresponding images on the display. However, it is difficult to obtain image pairs that match perfectly in spatial position; the network can easily misidentify misaligned edges as moiré, and real captured images commonly exhibit lens distortion, camera shake, and similar problems that severely affect the alignment of image pairs. Given the difficulty of constructing a large, high-quality image set from real images, we train the network on simulated two-dimensional code images with realistic moiré. To simulate the moiré generation process more accurately, we strictly follow the entire digital processing chain, from displaying the image on an LCD display, to capturing it with a camera, to the processing inside the camera.
In some examples, forming the simulated two-dimensional code image includes the following steps: resampling the input image to generate a mosaic of RGB pixels for display on a display; randomly applying a projective transformation to simulate different relative positions and orientations between the display and the camera; simulating the distortion of the camera lens using a radial distortion function; simulating anti-aliasing filtering with a flat-top Gaussian filter; resampling the input image to simulate the input of the camera sensor; adding Gaussian noise to simulate sensor noise; demosaicing; denoising with a denoising filter; compressing the image; and outputting the decompressed image to form the simulated two-dimensional code image. In this way, the simulated two-dimensional code image can be generated more conveniently.
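A toy sketch of three of the steps above (sensor resampling, sensor noise, and quantization standing in for the compress/decompress step); the 2× sampling factor and the noise level are illustrative assumptions, not values from the disclosure:

```python
import numpy as np

def simulate_sensor(img, rng, noise_sigma=0.01):
    """Toy stand-in for three pipeline steps: resample to the sensor grid,
    add Gaussian sensor noise, then quantize to 8 bits (a crude stand-in
    for compression and decompression). All parameters are assumptions."""
    sampled = img[::2, ::2]                               # crude sensor resampling
    noisy = sampled + rng.normal(0.0, noise_sigma, sampled.shape)
    quantized = np.clip(np.round(noisy * 255.0), 0, 255).astype(np.uint8)
    return quantized / 255.0                              # decoded image in [0, 1]

rng = np.random.default_rng(0)
display_img = rng.random((64, 64))                        # toy image "shown on the display"
captured = simulate_sensor(display_img, rng)
print(captured.shape)
```

The point of chaining these degradations (rather than applying one) is that real captures accumulate all of them, and a network trained on the full chain generalizes better to real photographs.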
In some examples, a Bayer CFA (color filter array) may be employed to resample the input image to simulate the input of the camera sensor.
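A minimal sketch of Bayer CFA resampling; the RGGB layout is an assumption, since the text only says "Bayer CFA" without specifying the pattern ordering:

```python
import numpy as np

def bayer_mosaic(rgb):
    """Sample an RGB image onto an RGGB Bayer pattern (one value per pixel).
    The RGGB layout is an assumption; the text does not specify it."""
    h, w, _ = rgb.shape
    mosaic = np.empty((h, w), dtype=rgb.dtype)
    mosaic[0::2, 0::2] = rgb[0::2, 0::2, 0]   # R at even rows / even cols
    mosaic[0::2, 1::2] = rgb[0::2, 1::2, 1]   # G
    mosaic[1::2, 0::2] = rgb[1::2, 0::2, 1]   # G
    mosaic[1::2, 1::2] = rgb[1::2, 1::2, 2]   # B at odd rows / odd cols
    return mosaic

img = np.zeros((4, 4, 3))
img[..., 0] = 1.0                             # pure-red test input
print(bayer_mosaic(img))                      # nonzero only at R sites
```

The subsequent demosaicing step then interpolates the two missing channels at every pixel, which is one source of the sampling mismatch that produces moiré in the first place.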
Additionally, in some examples, the input image may be compressed in a format such as JPEG, TIFF, RAW, and so forth.
In some examples, the two-dimensional code image on the display corresponding to the input image may be processed using the same projective transformation and lens distortion function. Since the network device 1 is trained on image pairs formed by the one-to-one correspondence between the simulated two-dimensional code image and the two-dimensional code image on the display, the display-side image can be processed with the same projective transformation and lens distortion function.
In some examples, a mean square error function

Loss = (1 / (M · N)) · Σ_{x=1..M} Σ_{y=1..N} ( H(G(I'))(x, y) − J(x, y) )²

may be employed as the loss function to train the network device 1, where M and N are the height and width of the simulated two-dimensional code image, H(G(I')) is the two-dimensional code image output by the network device 1, and J is the two-dimensional code image on the display corresponding to the simulated two-dimensional code image; the model and parameters of the network device 1 with the minimal training loss are saved. This enables the network device 1 to be trained effectively.
In some examples, other loss functions may also be used, such as a cross-entropy loss function, an exponential loss function, or a hinge loss function (as in SVMs); the choice may be made according to the actual situation in a specific application and is not limited here.
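The mean square error described above can be sketched directly; the image shapes used here are hypothetical:

```python
import numpy as np

def mse_loss(output, target):
    """Mean square error from the text: the average squared difference
    between the network output H(G(I')) and the display-side image J,
    over an M x N image."""
    output = np.asarray(output, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean((output - target) ** 2))

J = np.zeros((16, 16))                 # hypothetical display-side image
pred = np.full((16, 16), 0.1)          # hypothetical network output
print(mse_loss(pred, J))               # close to 0.01
```

Tracking this scalar per epoch and checkpointing whenever it reaches a new minimum implements the "save the model and parameters at minimal loss" step.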
In the present embodiment, as described above, the simulated two-dimensional code image may be input to the network device 1 as the original image I and used for training to obtain the corresponding moiré-removed image (as shown in fig. 4 (b)). This reduces the difficulty of using a large amount of real two-dimensional code image data.
Fig. 5 (a) is a schematic diagram showing a captured image according to the present embodiment, fig. 5 (b) is a schematic diagram showing a reconstructed image according to the present embodiment, and fig. 5 (c) is a schematic diagram showing the moiré-removed image of fig. 5 (a) according to the present embodiment.
In some examples, to achieve moiré removal for two-dimensional codes, the network device 1 model proposed by the present disclosure may be trained in two stages: (a) the network is pre-trained with a large number of simulated two-dimensional code images with synthesized moiré (the simulated-image training described above), so that the network learns to remove moiré from two-dimensional codes; (b) considering the characteristics of moiré in real environments, a relatively small number of actually captured real two-dimensional code images with moiré can then be used for transfer learning (fine-tuning) of the network device 1, so that the network removes moiré from real images more effectively. A network device 1 trained in these two stages achieves good moiré-removal performance on really captured two-dimensional code images with moiré.
Thus, in some examples, the present disclosure may further include preparing a real two-dimensional code image with moiré patterns and, based on the model and parameters of the network device 1 at minimum training loss, performing a transfer learning operation on the network device 1 using the real two-dimensional code image. This enables the network device 1 to be trained more effectively.
The transfer learning operation is detailed below.
After the network is pre-trained with simulated two-dimensional code images, it can already remove moiré fringes; however, the simulated images cannot fully reproduce the conditions of a real environment, so the network needs to be fine-tuned with actually photographed real two-dimensional code images so that it performs well in practice and its robustness is strengthened. The model and parameters with the minimum loss during pre-training are saved, and a transfer learning operation is then performed on the network using real two-dimensional code images with moiré patterns.
Since the network device 1 operates in a supervised learning mode, the input and output images of the network should differ only in the presence of moiré fringes, yet a matched label is difficult to obtain directly for an actually photographed two-dimensional code image. The fixed modules in a two-dimensional code, namely the positioning pattern, the calibration pattern and the detection patterns, can therefore be used to apply an angle transformation to the photographed two-dimensional code image, bringing it into one-to-one correspondence with the two-dimensional code image on the display. The reconstructed image after angle transformation (as shown in fig. 5(b)) serves as the network input in the transfer learning stage, and the corresponding on-screen two-dimensional code image serves as the target output. The mean square error (MSE) function is still used as the loss function during transfer learning, and through this transfer learning on real data (real two-dimensional code images) the network achieves a more satisfactory moiré removal effect in real shooting environments.
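The angle transformation described above amounts to estimating a projective (perspective) transform from point correspondences given by the two-dimensional code's fixed patterns. A minimal numpy sketch, assuming the four corner correspondences have already been detected from the finder/alignment patterns (detection itself is not shown, and all names are illustrative):

```python
import numpy as np

def perspective_from_points(src, dst):
    """Solve for the 3x3 homography H mapping src -> dst from exactly
    four point pairs, e.g. corners located via the code's finder and
    alignment patterns. src, dst: arrays of shape (4, 2)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in the
        # eight unknown homography entries (h22 is fixed to 1).
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h = np.linalg.solve(np.asarray(A, float), np.asarray(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, x, y):
    """Apply homography H to a single point (x, y)."""
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w, v / w
```

Warping every pixel of the captured photo through the inverse of such a transform yields the rectified, display-aligned reconstructed image used as the network input.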
In some examples, the specific training parameters may be: a batch size of 4, 12 × 10^5 iterations, a learning rate of 10^-5, and the Adam optimizer.
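For illustration only, one Adam update step with the stated learning rate can be sketched as below; the beta and epsilon values are the standard Adam defaults assumed here, not values specified by the patent:

```python
import numpy as np

# Hyperparameters as reported for training (illustrative constants).
BATCH_SIZE = 4
NUM_ITERATIONS = 12 * 10**5   # i.e. 1.2 million iterations
LEARNING_RATE = 1e-5

def adam_step(param, grad, state, lr=LEARNING_RATE,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar or array parameter.

    state is (m, v, t): first/second moment estimates and step count.
    """
    m, v, t = state
    t += 1
    m = beta1 * m + (1 - beta1) * grad            # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad**2         # second-moment estimate
    m_hat = m / (1 - beta1**t)                    # bias correction
    v_hat = v / (1 - beta2**t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, (m, v, t)
```

In a real framework such as TensorFlow the optimizer would of course be configured rather than hand-written; the sketch only makes the stated hyperparameters concrete.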
In the pre-training stage, MATLAB may be used to perform moiré simulation on 600 different two-dimensional codes displayed on a display, generating 60000 simulated two-dimensional code images of size 512 × 512 with moiré patterns, of which 50000 may be used for training and 10000 for testing.
In some examples, the network device 1 may be implemented with the deep learning framework TensorFlow. The training and testing environment of the present disclosure may be a server equipped with an NVIDIA Tesla P100 GPU and an Intel Xeon E5-2695 v4 CPU.
In some examples, a Dell U2414H display may be used to show the different two-dimensional code images, and three smartphones (e.g., Apple iPhone 8 Plus, Xiaomi Mi 8, and Meizu Metal) may be employed to photograph them.
Referring to fig. 5(a), in some examples, in the transfer learning stage, 10000 real two-dimensional code images with moiré fringes may be collected for the transfer learning operation on the network device 1, so that the network achieves a better moiré removal effect. 2000 images may be collected for each of two-dimensional code versions 2 to 6, with the same on-screen two-dimensional code image photographed 20 times by the mobile phones from different angles and distances. The 10000 real two-dimensional code images may then be reconstructed: using the fixed modules in the two-dimensional code, namely the positioning pattern, the calibration pattern, and the three detection patterns, angle transformation yields reconstructed images in one-to-one correspondence with the two-dimensional code images displayed on the screen, as shown in fig. 5(b).
Referring again to fig. 1, a real two-dimensional code image captured by a device such as a mobile phone may be input, a reconstructed image may be obtained through operations such as perspective transformation and angle transformation, and the reconstructed image may be input to the network device 1 for transfer learning. In this case, the trained network device 1 has a more satisfactory removal effect on two-dimensional code images shot in real environments.
While the present disclosure has been described in detail in connection with the drawings and examples, it should be understood that the above description is not intended to limit the disclosure in any way. Variations and changes may be made as necessary by those skilled in the art without departing from the true spirit and scope of the disclosure, which fall within the scope of the disclosure.

Claims (8)

1. A training method of a depth residual error network for removing Moire patterns of a two-dimensional code is characterized in that,
the method comprises the following steps:
preparing an original two-dimensional code image with Moire patterns;
preparing a network device for processing the original two-dimensional code image, wherein the network device comprises a preprocessing module, a first residual error module, a second residual error module and a third residual error module;
inputting the original two-dimensional code image into the preprocessing module, performing blur processing on the original two-dimensional code image to increase the pixels of the two-dimensional code area in the image, and then performing downsampling processing to form a reduced preprocessed image;
inputting the preprocessed image into the first residual error module for upsampling processing, so as to form a first output image whose size is enlarged to that of the original two-dimensional code image;
inputting the first output image into the second residual error module to form a second output image that restores the image information lost from the first output image; and
performing feature fusion on the second output image and the original two-dimensional code image to form a feature-fused image, and inputting the feature-fused image into the third residual error module for purification processing to form a moiré-removed image, wherein
the original two-dimensional code image is a synthesized simulated two-dimensional code image, and the formation of the simulated two-dimensional code image comprises the following steps:
resampling the input image to generate a mosaic of RGB pixels for display on a display;
randomly making a projection transform to simulate different relative positions and orientations between the display and camera;
simulating distortion of a lens of the camera using a radial distortion function;
simulating anti-aliasing filtering by adopting a flat-top Gaussian filter;
resampling the input image to simulate an input of the camera sensor;
adding Gaussian noise to the input image to simulate sensor noise;
performing demosaicing processing;
denoising by adopting a denoising filter;
compressing the input image; and
outputting the decompressed image to form the simulated two-dimensional code image.
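Two of the steps enumerated above, sampling the camera sensor's color filter array and adding sensor noise, might be sketched in numpy as follows (an RGGB Bayer layout is assumed for illustration; the patent does not specify the mosaic pattern):

```python
import numpy as np

def bayer_mosaic(rgb):
    """Sample an RGGB Bayer pattern from an RGB image, simulating the
    camera sensor's color filter array. rgb: (H, W, 3), H and W even."""
    h, w, _ = rgb.shape
    raw = np.zeros((h, w))
    raw[0::2, 0::2] = rgb[0::2, 0::2, 0]  # red sites
    raw[0::2, 1::2] = rgb[0::2, 1::2, 1]  # green sites (even rows)
    raw[1::2, 0::2] = rgb[1::2, 0::2, 1]  # green sites (odd rows)
    raw[1::2, 1::2] = rgb[1::2, 1::2, 2]  # blue sites
    return raw

def add_sensor_noise(raw, sigma=0.01, rng=None):
    """Additive Gaussian noise modelling sensor noise."""
    if rng is None:
        rng = np.random.default_rng(0)
    return raw + rng.normal(0.0, sigma, raw.shape)
```

The remaining steps (projective transform, radial lens distortion, flat-top Gaussian anti-aliasing, demosaicing, denoising, JPEG compression) would be chained before and after these in the full simulation pipeline.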
2. The training method of claim 1,
the first residual error module comprises, connected in sequence: a first convolution layer with a 3 × 3 kernel and 64 feature maps, a first ReLU activation layer, 16 serially connected residual blocks, a second convolution layer with a 3 × 3 kernel and 64 feature maps, a batch normalization layer, a third convolution layer with a 3 × 3 kernel and 256 feature maps, a fourth convolution layer with a 1 × 1 kernel and 3 feature maps, and a tanh activation layer.
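Purely as an illustration of the layer ordering in claim 2 (not claim language), the feature-map counts through the first residual error module can be traced as:

```python
def first_residual_module_channels():
    """Feature-map (channel) counts through the first residual error
    module of claim 2, in order. The spatial upsampling mechanism that
    enlarges the image back to the original size is not detailed in the
    claim, so only channel counts are traced here."""
    layers = [("conv3x3", 64), ("relu", 64)]
    layers += [("residual_block", 64)] * 16          # 16 serial residual blocks
    layers += [("conv3x3", 64), ("batch_norm", 64),
               ("conv3x3", 256), ("conv1x1", 3), ("tanh", 3)]
    return layers
```

The 1 × 1 convolution collapses the features back to the 3 channels of an RGB output, bounded to a fixed range by the final tanh.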
3. The training method of claim 1,
the second residual error module comprises a first combination layer formed by serially connecting 10 convolution layers with 5 × 5 kernels and 64 feature maps with a second ReLU activation layer, and a second combination layer formed by serially connecting 10 convolution layers with 3 × 3 kernels and 64 feature maps with a third ReLU activation layer.
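As an aside (not claim language), the serial convolutions of claim 3 give the second residual error module a large receptive field, which is what lets it restore image information lost during earlier processing. Assuming stride 1 throughout (the claim does not state strides), the field can be computed as:

```python
def receptive_field(kernel_sizes):
    """Receptive field of a stack of stride-1 convolutions: each
    k x k layer grows the field by k - 1 pixels."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf

# Claim 3: ten 5x5 convolution layers followed by ten 3x3 convolution layers.
module2_receptive_field = receptive_field([5] * 10 + [3] * 10)  # 61
```

Each output pixel of the module thus sees a 61 × 61 neighborhood of its input under this assumption, wide enough to span the low-frequency bands typical of moiré.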
4. The training method of claim 1,
the third residual error module comprises, connected in sequence: a fifth convolution layer with a 3 × 3 kernel and 128 feature maps, a fourth ReLU activation layer, and a sixth convolution layer with a 3 × 3 kernel and 3 feature maps.
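A minimal numpy forward pass matching the layer order of claim 4 might look like the following sketch; the weights are illustrative placeholders rather than trained parameters, and stride-1 'same' zero padding is assumed (the claim does not specify padding):

```python
import numpy as np

def conv2d_same(x, kernels):
    """Naive stride-1 'same' convolution. x: (H, W, Cin); kernels:
    (k, k, Cin, Cout) with odd k. Slow but dependency-free."""
    k = kernels.shape[0]
    pad = k // 2
    H, W, _ = x.shape
    Cout = kernels.shape[-1]
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    out = np.empty((H, W, Cout))
    for i in range(H):
        for j in range(W):
            # Dot each k x k x Cin patch with all Cout kernels at once.
            out[i, j] = np.tensordot(xp[i:i + k, j:j + k], kernels,
                                     axes=([0, 1, 2], [0, 1, 2]))
    return out

def third_residual_module(x, w5, w6):
    """Claim 4 ordering: 3x3 conv to 128 maps -> ReLU -> 3x3 conv to
    3 maps, producing the purified RGB output."""
    h = np.maximum(conv2d_same(x, w5), 0.0)  # fourth ReLU activation layer
    return conv2d_same(h, w6)
```

A 3-channel input of any even or odd size passes through and emerges with the same spatial size and 3 channels, consistent with the module producing the final moiré-removed image.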
5. The training method of claim 1,
processing the two-dimensional code image on the display corresponding to the input image with the same projective transformation and the same lens distortion function.
6. The training method of claim 1,
using a mean square error function
loss = (1 / (M · N)) · Σ_{i=1..M} Σ_{j=1..N} ( H(G(I'))_{i,j} − J_{i,j} )²
training the depth residual error network with the function as a loss function, wherein M and N are the height and width of the simulated two-dimensional code image, H(G(I')) is the two-dimensional code image output by the depth residual error network, and J is the two-dimensional code image on a display corresponding to the simulated two-dimensional code image; and saving the model and parameters of the depth residual error network at which the training loss is minimal.
7. The training method of claim 6,
the depth residual error network transfer learning method is characterized by further comprising a real two-dimensional code image with Moire patterns, based on a model and parameters of the depth residual error network when training loss is minimum, and carrying out transfer learning operation on the depth residual error network by utilizing the real two-dimensional code image.
8. The training method of claim 6,
carrying out angle transformation on the real two-dimensional code image to obtain a two-dimensional code image on a display in one-to-one correspondence with the real two-dimensional code image.
CN202010120759.9A 2019-12-31 2020-02-26 Training method for depth residual error network for removing Moire pattern of two-dimensional code Active CN111340729B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911414600 2019-12-31
CN2019114146001 2019-12-31

Publications (2)

Publication Number Publication Date
CN111340729A CN111340729A (en) 2020-06-26
CN111340729B true CN111340729B (en) 2023-04-07

Family

ID=71185725

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010120759.9A Active CN111340729B (en) 2019-12-31 2020-02-26 Training method for depth residual error network for removing Moire pattern of two-dimensional code

Country Status (2)

Country Link
CN (1) CN111340729B (en)
WO (1) WO2021134874A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112465092B (en) * 2020-10-29 2023-03-03 深圳大学 Two-dimensional code sample generation method and device, server and storage medium
CN112712467B (en) * 2021-01-11 2022-11-11 郑州科技学院 Image processing method based on computer vision and color filter array
CN113066027B (en) * 2021-03-31 2022-06-28 天津大学 Screen shot image moire removing method facing to Raw domain
CN113723515B (en) * 2021-08-31 2023-08-18 平安科技(深圳)有限公司 Moire pattern recognition method, device, equipment and medium based on image recognition
CN114418846A (en) * 2021-12-30 2022-04-29 深圳市安健科技股份有限公司 Method for removing moire of grid and terminal
CN114693558A (en) * 2022-03-31 2022-07-01 福州大学 Image Moire removing method and system based on progressive fusion multi-scale strategy
CN114881888A (en) * 2022-06-10 2022-08-09 福州大学 Video Moire removing method based on linear sparse attention transducer

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106709875A (en) * 2016-12-30 2017-05-24 北京工业大学 Compressed low-resolution image restoration method based on combined deep network
CN109949235A (en) * 2019-02-26 2019-06-28 浙江工业大学 A kind of chest x-ray piece denoising method based on depth convolutional neural networks
CN110009615A (en) * 2019-03-31 2019-07-12 深圳大学 The detection method and detection device of image angle point
CN110020684A (en) * 2019-04-08 2019-07-16 西南石油大学 A kind of image de-noising method based on residual error convolution autoencoder network
CN110599388A (en) * 2019-08-26 2019-12-20 华中科技大学 Blind robust digital watermark embedding and detecting method based on positioning point assistance

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10810721B2 (en) * 2017-03-14 2020-10-20 Adobe Inc. Digital image defect identification and correction
CN107358575A (en) * 2017-06-08 2017-11-17 清华大学 A kind of single image super resolution ratio reconstruction method based on depth residual error network
CN109345449B (en) * 2018-07-17 2020-11-10 西安交通大学 Image super-resolution and non-uniform blur removing method based on fusion network
CN109993698A (en) * 2019-03-29 2019-07-09 西安工程大学 A kind of single image super-resolution texture Enhancement Method based on generation confrontation network
CN110287969B (en) * 2019-06-14 2022-09-16 大连理工大学 Moore text image binarization system based on graph residual error attention network


Also Published As

Publication number Publication date
CN111340729A (en) 2020-06-26
WO2021134874A1 (en) 2021-07-08

Similar Documents

Publication Publication Date Title
CN111340729B (en) Training method for depth residual error network for removing Moire pattern of two-dimensional code
Syu et al. Learning deep convolutional networks for demosaicing
CN110675336A (en) Low-illumination image enhancement method and device
Liu et al. Demoir\'eing of Camera-Captured Screen Images Using Deep Convolutional Neural Network
CN111127336A (en) Image signal processing method based on self-adaptive selection module
US20150063685A1 (en) Image distortion correction method and image distortion correction device using the same
CN110992244B (en) Picture generation method, system, device and storage medium with mole pattern
CN114881888A (en) Video Moire removing method based on linear sparse attention transducer
CN114693558A (en) Image Moire removing method and system based on progressive fusion multi-scale strategy
CN113066027B (en) Screen shot image moire removing method facing to Raw domain
CN114511473B (en) Hyperspectral remote sensing image denoising method based on unsupervised adaptive learning
Mei et al. Higher-resolution network for image demosaicing and enhancing
CN110874827A (en) Turbulent image restoration method and device, terminal equipment and computer readable medium
TWI768517B (en) Image quality improvement method and image processing apparatus using the same
CN115937794A (en) Small target object detection method and device, electronic equipment and storage medium
CN116128735A (en) Multispectral image demosaicing structure and method based on densely connected residual error network
CN116092086A (en) Machine tool data panel character extraction and recognition method, system, device and terminal
Park et al. Color filter array demosaicking using densely connected residual network
CN109993701B (en) Depth map super-resolution reconstruction method based on pyramid structure
CN117333398A (en) Multi-scale image denoising method and device based on self-supervision
CN113222856A (en) Inverse halftone image processing method, terminal equipment and readable storage medium
CN117173232A (en) Depth image acquisition method, device and equipment
CN115272131B (en) Image mole pattern removing system and method based on self-adaptive multispectral coding
CN114638761B (en) Full-color sharpening method, equipment and medium for hyperspectral image
CN115829870A (en) Image denoising method based on variable scale filtering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant