WO2019192338A1 - Image processing method and apparatus, storage medium, and electronic apparatus - Google Patents

Image processing method and apparatus, storage medium, and electronic apparatus

Info

Publication number
WO2019192338A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
model
scale
target
network
Prior art date
Application number
PCT/CN2019/079332
Other languages
English (en)
French (fr)
Inventor
陶鑫 (Tao Xin)
高宏运 (Gao Hongyun)
沈小勇 (Shen Xiaoyong)
戴宇榮 (Dai Yurong)
賈佳亞 (Jia Jiaya)
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Company Limited)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Company Limited (腾讯科技(深圳)有限公司)
Priority to EP19782211.7A priority Critical patent/EP3754591A4/en
Publication of WO2019192338A1 publication Critical patent/WO2019192338A1/zh
Priority to US16/934,823 priority patent/US11354785B2/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/73Deblurring; Sharpening
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4046Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/60Image enhancement or restoration using machine learning, e.g. neural networks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/68Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations
    • H04N23/682Vibration or motion blur correction
    • H04N23/683Vibration or motion blur correction performed by a processor, e.g. controlling the readout of an image memory
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/95Computational photography systems, e.g. light-field imaging systems
    • H04N23/951Computational photography systems, e.g. light-field imaging systems by using two or more images to influence resolution, frame rate or aspect ratio
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20016Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20172Image enhancement details
    • G06T2207/20201Motion blur correction

Definitions

  • the embodiments of the present application relate to the field of image processing, and in particular, to a method, an apparatus, a storage medium, and an electronic device for processing an image.
  • Image blurring is a very common problem in everyday photography, especially in dynamic scenes or in dimly lit environments.
  • Image deblurring is a must-have and important image processing operation that recovers the details lost due to blurring.
  • The traditional single-frame image deblurring method assumes a fixed convolution kernel model and gradually improves the deblurring result by iterating between convolution kernel estimation and image deconvolution.
  • Image deblurring methods based on neural networks mostly use image convolution models with blur kernels to generate blurred images for training the neural networks.
  • Embodiments of the present application provide an image processing method, apparatus, storage medium, and electronic device to address at least the technical problem that blurred images cannot be effectively deblurred.
  • A method for processing an image is provided, comprising: acquiring, by a terminal device, an image processing instruction, wherein the image processing instruction is used to instruct deblurring of a target blurred image; acquiring, by the terminal device, a target model obtained by training an original model with sample images of different scales, wherein the sample image is a composite image, the composite image is a blurred image obtained by synthesizing a plurality of clear images, and the target model is set to deblur a blurred image to obtain a clear image; deblurring, by the terminal device, the target blurred image using the target model in response to the image processing instruction to obtain a target clear image; and outputting, by the terminal device, the target clear image.
  • An image processing apparatus is also provided, including one or more processors and one or more memories storing program units, wherein the program units are executed by the one or more processors, the program units including: a first acquiring unit configured to acquire an image processing instruction, wherein the image processing instruction is used to instruct deblurring of a target blurred image; a second acquiring unit configured to acquire a target model obtained by training an original model with sample images of different scales, wherein the sample image is a composite image, the composite image is a blurred image obtained by synthesizing a plurality of clear images, and the target model is used to deblur a blurred image to obtain a clear image; a response unit configured to deblur the target blurred image using the target model in response to the image processing instruction to obtain a target clear image; and an output unit configured to output the target clear image.
  • A non-transitory computer readable storage medium is also provided, having a computer program stored therein, wherein the computer program is configured to perform the method described above when run.
  • An electronic device is also provided, comprising a memory and a processor, wherein the memory stores a computer program and the processor is configured to perform the method described above by means of the computer program.
  • Because the sample images used to train the target model are synthesized from real captured images, they represent the features of blurred photos in real scenes, and the target model obtained by training the original model with these sample images can deblur a blurred image to obtain a clear image.
  • Compared with generating blurred images by calculation methods such as convolution kernels, this avoids the gap between the a priori hypothesis and the real situation in the process of generating blurred images, overcomes the technical problem that a target model trained with blurred images generated in the related art cannot deblur real images, and achieves the technical effect of deblurring a blurred image to obtain a clear image.
  • FIG. 1 is a schematic diagram of a hardware environment in accordance with an embodiment of the present application.
  • FIG. 2 is a flowchart of a method of processing an image according to an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a first blurred image according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a clear image obtained by deblurring the first blurred image according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of an original model according to an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a residual unit according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a second blurred image according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a clear image obtained by deblurring a second blurred image according to an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a third blurred image in accordance with an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a clear image obtained by deblurring a third blurred image according to an embodiment of the present application.
  • FIG. 11 is a schematic diagram of a fourth blurred image according to an embodiment of the present application.
  • FIG. 12 is a schematic diagram of a clear image obtained by deblurring a fourth blurred image according to an embodiment of the present application.
  • FIG. 13 is a schematic diagram of an image processing apparatus according to an embodiment of the present application.
  • FIG. 14 is a schematic diagram of an electronic device in accordance with an embodiment of the present application.
  • a method of processing an image is provided.
  • the above image processing method can be applied to a hardware environment constituted by the terminal 101 and the server 102 as shown in FIG. 1.
  • the terminal 101 is connected to the server 102 through a network.
  • the network includes but is not limited to: a wide area network, a metropolitan area network, or a local area network.
  • The terminal 101 may be a mobile terminal, or may be a PC terminal, a notebook terminal, or a tablet terminal.
  • FIG. 2 is a flowchart of a method for processing an image according to an embodiment of the present application.
  • the terminal device may be the terminal 101 shown in FIG. 1 or the server 102.
  • the processing method of the image includes:
  • the terminal device acquires an image processing instruction, where the image processing instruction is used to instruct deblurring the target blurred image.
  • Deblurring refers to processing a blurred image into a clear image.
  • The figure in the lower left corner of the image shown in FIG. 3 is an enlarged view of the lettering on the spine of the book behind the little yellow figure. As can be seen from the lower left corner, the image is relatively blurred, and the content of the spine lettering cannot be clearly seen.
  • The figure in the lower left corner of the image shown in FIG. 4 is an enlarged view of the same spine lettering. FIG. 4 is a clear image obtained by deblurring FIG. 3; comparing the lower left corners of FIG. 4 and FIG. 3 in particular, the lower left corner of FIG. 4 is clearer, and the spine lettering can be clearly read as "ROCESS".
  • the target blurred image may be the image shown in FIG. 3, and the blurred image shown in FIG. 3 is subjected to deblurring processing to obtain the image shown in FIG. 4.
  • The process of deblurring is the process of processing the image shown in FIG. 3 to obtain the image shown in FIG. 4.
  • The terminal device acquires a target model obtained by training the original model with sample images of different scales, wherein the sample image is a composite image, the composite image is a blurred image obtained by synthesizing multiple clear images, and the target model is used to deblur a blurred image to obtain a clear image.
  • the terminal device performs deblurring on the target blurred image by using the target model in response to the image processing instruction to obtain a target clear image.
  • the target blurred image is input into the target model, so that the target model processes the target blurred image and outputs a clear image of the target.
  • The target clear image may be the image shown in FIG. 4.
  • the target model can be a neural network model obtained by training the original model.
  • The sample image required to train the original model is a blurred image synthesized from multiple clear images, and the clear images corresponding to the generated blurred image are the clear images from before synthesis; that is, the composite image can be used as the sample image, and the multiple clear images serve as the training targets of the target model.
  • a clear image corresponding to the composite image may be output after inputting the composite image into the target model, and the clear image may be any one of a plurality of clear images.
  • the sample images of different scales may be images obtained by down-sampling the same sample image, and the granularity of the downsampling is different, and the scale of the obtained sample images is also different.
  • In some embodiments, before the target model obtained by training the original model with sample images of different scales is acquired, the method further includes: the terminal device acquiring consecutive multi-frame clear images from a frame picture set, wherein the frame picture set is the set of all frame images of a video segment; and combining the multi-frame clear images to obtain the sample image, wherein the sample image is a blurred image.
  • Blur in an image is usually caused by motion of the camera during shooting or by motion of objects in the scene. Both kinds of blur are essentially due to a slow shutter speed: in the short time between the shutter opening and closing, movement of the camera or displacement of the scene means that each image-sensor pixel records not the brightness of one fixed position but the integral of all the brightness passing that position during that time. This integral can be approximated by summing adjacent consecutive frames captured by a high-speed camera, which makes it possible to simulate realistic blurred pictures. This embodiment captures high-speed video with a high-speed camera to synthesize a sufficient number of blurred pictures.
  • For example, the blurred image can be synthesized from a high-speed video taken by a high-speed camera at a speed of 240 frames per second.
  • Consecutive multi-frame clear images are selected from the frame picture set; these consecutive frames may be images captured within a few hundred milliseconds, which may include tens to hundreds of clear images. All of these clear images, or a subset of them, can be combined to obtain a sample image.
  • In some embodiments, combining the multi-frame clear images to obtain the sample image includes: randomly selecting a partial set of images from the multi-frame clear images; performing, for each channel, summation and then averaging over the selected images to obtain a blurred image; and using the blurred image as the sample image.
  • A part of the consecutive multi-frame clear images is randomly selected for synthesis; specifically, a blurred image is obtained by summing and then averaging several frames of images.
  • the data of each channel of the image can be separately summed, and then the data of each channel is separately averaged, and the data obtained by the averaging can represent a generated blurred image, that is, the sample image.
  • Randomly selected partial images can generate multiple blurred images as sample images. For example, if there are 20 candidate images, 7-13 of them can be randomly selected for each synthesis, so that a different blurred image can be obtained from each selection.
  • For example, suppose the 20 images are numbered 1, 2, 3, ..., 20 in order; the first time, the images numbered 1-4 and 10-13 are selected for synthesis, and the second time, the images numbered 3, 5, 9, 15, 16, 17, 19, and 20 are synthesized; the pictures selected each time can be random.
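  • As an illustrative, non-authoritative sketch of this synthesis step (the function name, the 7-13 frame range, and the choice of ground-truth frame are assumptions for illustration, not part of the original disclosure):

```python
import numpy as np

def synthesize_blurred_sample(frames, rng=None):
    """Synthesize one blurred sample from a group of sharp high-speed video
    frames (e.g., 240 fps) by per-channel summation followed by averaging,
    as described above. `frames` is a sequence of HxWxC uint8 arrays.
    """
    rng = rng or np.random.default_rng()
    k = int(rng.integers(7, 14))                       # randomly pick 7-13 frames
    idx = np.sort(rng.choice(len(frames), size=k, replace=False))
    stack = np.stack([frames[i] for i in idx]).astype(np.float32)
    # Summing each channel and dividing by the frame count approximates the
    # sensor's brightness integration while the shutter is open.
    blurred = (stack.sum(axis=0) / k).astype(np.uint8)
    # Any one of the selected clear frames can serve as the training target.
    sharp = frames[idx[k // 2]]
    return blurred, sharp
```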
  • The terminal device outputs the target clear image.
  • Because the sample images used to train the target model are synthesized from real captured images, they represent the features of blurred photos in real scenes, and the target model obtained by training the original model with these sample images can deblur a blurred image to obtain a clear image.
  • The gap between the a priori hypothesis and the real situation in the process of generating blurred images is avoided, overcoming the technical problem that a target model trained with blurred images generated in the related art cannot deblur real images, and achieving the technical effect of deblurring a blurred image to obtain a clear image.
  • In some embodiments, before the target model obtained by training the original model with sample images of different scales is acquired, the method includes: the terminal device repeatedly performing the following steps to train the original model until the scale of the intermediate image is the same as the scale of the composite image, wherein the current scale is initialized to the scale of a first image of the composite image, the current model is initialized to the original model, the intermediate image is initialized to the first image, and the first image is a blurred image obtained by downsampling the composite image by a target multiple: acquiring, from the composite image, the first image whose scale is the current scale; deblurring the first image and the intermediate image using the current model to obtain a second image, wherein the second image is a clear image associated with the first image; enlarging the second image to obtain a third image, wherein the intermediate image is updated to the third image; updating the current scale to N times the current scale, wherein N is greater than or equal to 2; and updating the current model to a first model, wherein the first model is a model obtained by training the original model according to the first image.
  • This embodiment trains the target model with an iterative deep neural network, using images of different scales.
  • The scale can be understood as the resolution of the image.
  • Iteration proceeds in order from the coarse scale to the fine scale.
  • At a sufficiently coarse scale, the downsampled picture is considered relatively clear.
  • This embodiment uses this as a starting point: the clear picture at the current scale is optimized, and that clear picture is upsampled and used as input for estimating the clear picture at the next scale, until the scale of the output image is the same as that of the original image.
  • The blurred picture at the current scale is obtained by downsampling the original blurred picture to the current scale.
  • The clear picture at the current scale is used as the iterative target during training, so as to finally optimize the clear picture at the original scale. Deblurring is thus decomposed into a series of sub-problems at multiple scales: given the blurred image at the current scale and a preliminary deblurred image (obtained by upsampling the clear image estimated at the previous scale), estimate the clear picture at the current scale.
  • the basic model is as follows:
  • $I^i, h^i = \mathrm{Net}_{SR}(B^i, I^{i+1\uparrow}, h^{i+1\uparrow}; \theta_{SR})$
  • The meaning of this formula is that, for scale i, given the blurred picture $B^i$ at the current scale together with the upsampled clear picture and hidden state from the coarser scale as inputs to the neural network, the clear picture and hidden state at the current scale are output. In this way, a clear image is estimated successively from the coarse scale to the fine scale until a clear image of the same scale as the sample image is obtained.
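  • A minimal sketch of this coarse-to-fine loop, assuming a PyTorch-style module `net_sr` with the interface `net_sr(B_i, I_up, h_up) -> (I_i, h_i)`; the interface, the use of bilinear interpolation, and the handling of the coarsest scale are illustrative assumptions:

```python
import torch.nn.functional as F

def deblur_multiscale(net_sr, blurred, n_scales=3):
    """Evaluate I^i, h^i = Net_SR(B^i, I^{i+1}up, h^{i+1}up; theta_SR)
    from the coarsest scale to the finest. `blurred` is an NCHW tensor."""
    # Blurred-image pyramid: index 0 is full scale, the last is coarsest.
    pyramid = [blurred] + [
        F.interpolate(blurred, scale_factor=0.5 ** i, mode='bilinear',
                      align_corners=False)
        for i in range(1, n_scales)
    ]
    i_est, h = None, None
    for b in reversed(pyramid):            # coarse scale -> fine scale
        if i_est is None:
            # Coarsest scale: the downsampled blurred picture is treated as
            # relatively clear and used as the initial estimate.
            i_up, h_up = b, None
        else:
            # Enlarge the previous scale's estimate and hidden state
            # (the "up" operation in the formula).
            i_up = F.interpolate(i_est, size=b.shape[-2:], mode='bilinear',
                                 align_corners=False)
            h_up = F.interpolate(h, size=b.shape[-2:], mode='bilinear',
                                 align_corners=False)
        i_est, h = net_sr(b, i_up, h_up)   # clear picture at the current scale
    return i_est                           # same scale as the input image
```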
  • The hidden state can take the form used in recurrent neural networks, such as the recurrent neural network (RNN), the long short-term memory network (LSTM), and the gated recurrent unit (GRU).
  • For example, the scale of the sample image is 256*256, that is, there are 256 pixels in both the horizontal and vertical directions; the current scale is the scale of the first image, which is 64*64, and the first image is obtained by downsampling the sample image.
  • The downsampling can be interval sampling, in which the sample image is reduced by taking sampling points at intervals.
  • the sample image is a blurred image, and the first image obtained after downsampling is also a blurred image.
  • The first image, serving both as the sample input and as the intermediate image, is input into the original model for training, and a second image obtained by preliminary deblurring is output; the scale of the second image is 64*64, and the original model is updated to the first model after training. The first image is a coarse-scale image; the first image and the intermediate image serve as input images of the original model, and the output second image is also coarse-scale, being the output image of the original model.
  • the network structure of the first model and the second model is the same, and the parameters of the first model and the second model are different.
  • The second image is enlarged to obtain a third image with a scale of 128*128, and a fourth image with a scale of 128*128 is obtained by downsampling the sample image; the enlargement can use interpolation-based upsampling.
  • The fourth image as the sample input and the third image as the intermediate image are input into the first model for training, and a fifth image obtained by deblurring is output; the fifth image is clearer than the fourth image, the scale of the fifth image is 128*128, and the first model is updated to the second model after training. The third image and the fourth image are mesoscale images; they serve as input images of the first model, and the output fifth image is also mesoscale, being the output image of the first model.
  • the network structure of the second model and the first model are the same, and the parameters of the second model and the first model are different.
  • The fifth image is enlarged to obtain a sixth image with a scale of 256*256; the enlargement can use interpolation-based upsampling.
  • the sample image and the sixth image are input into the second model for training, and the seventh image obtained by the deblurring process is output, and the second model is updated to the third model after training.
  • the scale of the seventh image is the same as the scale of the sample image.
  • The sample image is then updated to a new image, and training continues with the updated sample image until all images in the training set have been used; the model obtained after training on all images in the training set is the target model.
  • The sixth image and the seventh image are fine-scale images; the sample image and the sixth image serve as input images of the second model, and the output seventh image is also a fine-scale image whose scale is the same as that of the sample image.
  • In this example, the scale multiple between adjacent scales is 2; it should be noted that different multiples can be used in the actual training process.
  • The scale of the sample image in this embodiment may also be larger, for example, 1024*1024; a part of the image can be cropped from the sample image to train the original model, which saves the memory required for training the model.
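  • A sketch of how such patch cropping and the per-scale blurred/sharp pyramid could be prepared for one training pair; the tensor layout (NCHW), function names, and cropping strategy are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def make_training_pyramid(blurred, sharp, patch=256, n_scales=3):
    """Crop a random patch from a large sample pair (e.g., 1024*1024) to save
    training memory, then build the per-scale inputs and targets
    (256 -> 128 -> 64 with a scale multiple of 2)."""
    _, _, h, w = blurred.shape
    top = int(torch.randint(0, h - patch + 1, (1,)))
    left = int(torch.randint(0, w - patch + 1, (1,)))
    b = blurred[..., top:top + patch, left:left + patch]
    s = sharp[..., top:top + patch, left:left + patch]
    pyramid = [(b, s)]                      # finest scale first
    for i in range(1, n_scales):
        f = 0.5 ** i                        # downsample by the scale multiple
        pyramid.append((
            F.interpolate(b, scale_factor=f, mode='bilinear', align_corners=False),
            F.interpolate(s, scale_factor=f, mode='bilinear', align_corners=False),
        ))
    return pyramid  # [(256x256), (128x128), (64x64)] blurred/sharp pairs
```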
  • In some embodiments, the current model includes a codec network, where the codec network includes an encoding network and a decoding network, and deblurring the first image and the intermediate image at the current scale using the current model to obtain the second image includes: encoding the first image and the intermediate image using the encoding network to obtain a first result, wherein each two-layer convolution of the encoding network further includes a residual unit, and the residual unit is used to add the data from before the two-layer convolution calculation to the data after the two-layer convolution calculation; and decoding the first result output by the encoding network using the decoding network to obtain the second image, wherein each two-layer convolution of the decoding network likewise includes a residual unit.
  • Figure 5 shows the three codec networks in the current model.
  • In FIG. 5, from input B3 to output I3 is one codec network.
  • From input B2 to output I2 is a codec network.
  • Input B1 to output I1 is a codec network.
  • Each codec network may perform deblurring on one image.
  • Each two-layer convolution in the coding network and the decoding network includes a residual unit, and a schematic diagram of the residual unit is shown in FIG. 6.
  • In some embodiments, the nonlinear convolution layers following each dimension-reducing or dimension-increasing convolution layer in the codec network are replaced with residual units, and the number of residual units at each spatial dimension is the same in the encoding network and the decoding network.
  • the residual unit can calculate the difference between the output and the input of a block in the codec network, so that the amount of calculation becomes smaller, it is easier to learn, and the learning ability of the network is optimized.
  • The learning ability of the network can be further optimized by skip-connecting features from the encoding network to the decoding network.
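  • A minimal PyTorch sketch of such a residual unit (the channel count, kernel size, and activation are illustrative choices, not specified by the original disclosure):

```python
import torch.nn as nn

class ResidualUnit(nn.Module):
    """Two convolution layers with a skip connection: the data from before
    the two-layer convolution is added to the data after it (FIG. 6)."""
    def __init__(self, channels=64, kernel_size=5):
        super().__init__()
        pad = kernel_size // 2
        self.conv1 = nn.Conv2d(channels, channels, kernel_size, padding=pad)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size, padding=pad)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # The block only has to learn the difference between its output and
        # input, which reduces what must be learned and eases optimization.
        return x + self.conv2(self.relu(self.conv1(x)))
```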
  • the codec network of FIG. 5 is a symmetric network, including an encoding network and a decoding network.
  • The encoding network encodes the blurred image and outputs the encoded first result to the decoding network, and the decoding network processes it to output a clear picture; this process of encoding and decoding implements deblurring.
  • the structure of the codec network of this embodiment can be decomposed into three modules, namely, an encoding network Net E , a hidden layer unit ConvLSTM, and a decoding network Net D , which are sequentially represented by the following formula:
  • $f^i = \mathrm{Net}_E(B^i, I^{i+1\uparrow}; \theta_E)$
  • $h^i, g^i = \mathrm{ConvLSTM}(h^{i+1\uparrow}, f^i; \theta_{LSTM})$
  • $I^i = \mathrm{Net}_D(g^i; \theta_D)$
  • Here $f^i$ denotes the coding feature at the i-th scale; $B^i$ is the blurred picture at the i-th scale; $I^{i+1\uparrow}$ is the enlarged version of the clear image output at the scale preceding the i-th scale; $h^i$ denotes the hidden information at the i-th scale; $h^{i+1\uparrow}$ represents the hidden information at the preceding scale; $g^i$ represents the result of optimizing $f^i$; and $\theta_E$, $\theta_{LSTM}$, $\theta_D$ respectively represent the weights of all convolutional layers in the encoding network $\mathrm{Net}_E$, the hidden-layer unit ConvLSTM, and the decoding network $\mathrm{Net}_D$. "$\uparrow$" represents the operation of magnifying the picture by a factor of two. Both the encoding network and the decoding network contain residual units to increase the network's learning capability; in the same spatial dimension, three residual units can be added to balance the deblurring effect and the computational cost.
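  • A sketch of how the three modules can be composed at one scale according to the formulas above; the encoder/decoder bodies are left as injected submodules, and a `conv_lstm` with the interface `(f, h_prev) -> (g, h)` is an assumption for illustration:

```python
import torch
import torch.nn as nn

class CodecScale(nn.Module):
    """One scale of the codec network:
    f = Net_E(B, I_up); h, g = ConvLSTM(h_up, f); I = Net_D(g)."""
    def __init__(self, net_e, conv_lstm, net_d):
        super().__init__()
        self.net_e, self.conv_lstm, self.net_d = net_e, conv_lstm, net_d

    def forward(self, b_i, i_up, h_up):
        # Encode the current-scale blurred picture together with the
        # enlarged clear estimate from the coarser scale.
        f_i = self.net_e(torch.cat([b_i, i_up], dim=1))
        # The hidden state carries information shared across scales,
        # such as image structure.
        g_i, h_i = self.conv_lstm(f_i, h_up)
        # Decode the optimized feature into the current-scale clear picture.
        i_i = self.net_d(g_i)
        return i_i, h_i
```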
  • the scale of the sample image is 1000*2000, that is, 1000 pixels in the horizontal direction and 2000 pixels in the vertical direction.
  • The current scale is the scale of the first image, which is 250*500, and the first image is downsampled from the sample image.
  • the method of downsampling can be interval sampling, where the sample image is reduced by sampling points (eg, sampling at intervals).
  • the sample image is a blurred image, and the first image obtained after downsampling is also a blurred image.
  • The first image, serving both as the sample input and as the intermediate image, is input as input B3 into the original model for training, and the second image I3 obtained by preliminary deblurring is output; the scale of the second image is 250*500, and the original model is updated to the first model after training;
  • The second image is enlarged to obtain a third image with a scale of 500*1000, and a fourth image with a scale of 500*1000 is obtained by downsampling the sample image; the enlargement can use interpolation-based upsampling.
  • The fourth image as the sample input and the third image as the intermediate image are input as input B2 into the first model for training, and the fifth image I2 obtained by deblurring is output; the fifth image is clearer than the fourth image, the scale of the fifth image is 500*1000, and the first model is updated to the second model after training;
  • the fifth image is enlarged to obtain a sixth image with a scale of 1000*2000.
  • The enlargement can use interpolation-based upsampling;
  • The sample image and the sixth image are input as input B1 into the second model for training, and the seventh image I1 obtained by deblurring is output; the second model is updated to the third model after training.
  • the scale of the seventh image is the same as the scale of the sample image, and the training is ended.
  • the image of FIG. 7 can be used as a sample image of the input original model, and the image shown in FIG. 8 can be used as the seventh image.
  • the image of FIG. 9 can be used as a sample image of the input original model, and the image shown in FIG. 10 can be used as the seventh image.
  • the image of FIG. 11 can be used as a sample image of the input original model, and the image shown in FIG. 12 can be used as the seventh image.
  • the original model is trained by the deep iterative neural network model to obtain the target model.
  • the clear image obtained by the previous scale is enlarged and used as the input of the current scale, and the blurred image of the current scale is used for training.
  • the blurred image is deblurred by the target model to obtain a clear image.
  • In some embodiments, deblurring the first image and the intermediate image at the current scale using the current model to obtain the second image includes: acquiring inherent information of images at different scales, wherein the inherent information is transmitted between the codec networks of different scales through the recurrent neural network in the current model; and deblurring the first image and the intermediate image at the current scale using the codec network, combined with the inherent information, to obtain the second image.
  • The recurrent neural network may be an LSTM (long short-term memory network) module.
  • The hidden information may be information common to pictures of different scales, such as the structure of the images at different scales.
  • The method according to the above embodiments can be implemented by software plus a necessary general hardware platform, or by hardware, although in many cases the former is the better implementation.
  • The technical solution of the embodiments of the present application may, in essence or in the part contributing to the related art, be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, magnetic disk, or optical disc), which includes a plurality of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the various embodiments of the present application.
  • FIG. 13 is a schematic diagram of an image processing apparatus according to an embodiment of the present application.
  • the apparatus includes one or more processors, and one or more memories storing program units, wherein the program units are executed by a processor, and the program units may include:
  • the first obtaining unit 1302 is configured to acquire an image processing instruction, wherein the image processing instruction is used to indicate that the target blurred image is subjected to deblurring processing.
  • Deblurring refers to processing a blurred image into a clear image.
  • The figure in the lower left corner of the image shown in FIG. 3 is an enlarged view of the lettering on the spine of the book behind the little yellow figure. As can be seen from the lower left corner, the image is relatively blurred, and the content of the spine lettering cannot be clearly seen.
  • The figure in the lower left corner of the image shown in FIG. 4 is an enlarged view of the same spine lettering. FIG. 4 is a clear image obtained by deblurring FIG. 3; comparing the lower left corners of FIG. 4 and FIG. 3 in particular, the lower left corner of FIG. 4 is clearer, and the spine lettering can be clearly read as "ROCESS".
  • the target blurred image may be the image shown in FIG. 3, and the blurred image shown in FIG. 3 is subjected to deblurring processing to obtain the image shown in FIG. 4.
  • The process of deblurring is the process of processing the image shown in FIG. 3 to obtain the image shown in FIG. 4.
  • The second acquiring unit 1304 is configured to acquire a target model obtained by training the original model with sample images of different scales, wherein the sample image is a composite image, the composite image is a blurred image obtained by synthesizing multiple clear images, and the target model is used to deblur a blurred image to obtain a clear image.
  • the response unit 1306 is configured to deblur the target blurred image with a target model in response to the image processing instruction to obtain a target sharp image.
  • the target blurred image is input into the target model, so that the target model processes the target blurred image and outputs a clear image of the target.
  • The target clear image may be the image shown in FIG. 4.
  • the target model can be a neural network model obtained by training the original model.
  • The sample image required to train the original model is a blurred image synthesized from multiple clear images, and the clear images corresponding to the generated blurred image are the clear images from before synthesis; that is, the composite image can be used as the sample image, and the multiple clear images serve as the training targets of the target model.
  • a clear image corresponding to the composite image may be output after inputting the composite image into the target model, and the clear image may be any one of a plurality of clear images.
  • In some embodiments, the apparatus further includes: a third acquiring unit configured to acquire consecutive multi-frame clear images from a frame picture set before the target model obtained by training the original model with sample images of different scales is acquired, wherein the frame picture set is the set of all frame pictures of a video segment; and a merging unit configured to combine the multi-frame clear images to obtain the sample image, wherein the sample image is a blurred image.
  • Blur in an image is usually caused by motion of the camera during shooting or by motion of objects in the scene. Both kinds of blur are essentially due to a slow shutter speed: in the short time between the shutter opening and closing, movement of the camera or displacement of the scene means that each image-sensor pixel records not the brightness of one fixed position but the integral of all the brightness passing that position during that time. This integral can be approximated by summing adjacent consecutive frames captured by a high-speed camera, which makes it possible to simulate realistic blurred pictures. This embodiment captures high-speed video with a high-speed camera to synthesize a sufficient number of blurred pictures.
  • For example, the blurred image can be synthesized from a high-speed video taken by a high-speed camera at a speed of 240 frames per second.
  • Consecutive multi-frame clear images are selected from the frame picture set; these consecutive frames may be images captured within a few hundred milliseconds, which may include tens to hundreds of clear images. All of these clear images, or a subset of them, can be combined to obtain a sample image.
  • In some embodiments, the merging unit comprises: a selecting module configured to randomly select a partial set of images from the multi-frame clear images; a second processing module configured to perform, for each channel, summation and then averaging over the selected images to obtain a blurred image; and a determining module configured to use the blurred image as the sample image.
  • A part of the consecutive multi-frame clear images is randomly selected for synthesis; specifically, a blurred image is obtained by summing and then averaging several frames of images.
  • the data of each channel of the image can be separately summed, and then the data of each channel is separately averaged, and the data obtained by the averaging can represent a generated blurred image, that is, the sample image.
  • Randomly selected partial images can generate multiple blurred images as sample images. For example, if there are 20 candidate images, 7-13 of them can be randomly selected for each synthesis, so that a different blurred image can be obtained from each selection.
  • For example, suppose the 20 images are numbered 1, 2, 3, ..., 20 in order; the first time, the images numbered 1-4 and 10-13 are selected for synthesis, and the second time, the images numbered 3, 5, 9, 15, 16, 17, 19, and 20 are synthesized; the pictures selected each time can be random.
  • the output unit 1308 is arranged to output the target sharp image.
  • Because the sample images used for training the target model are synthesized from real captured images, they represent the features of blurred photos in real scenes, and the target model obtained by training the original model using these sample images can deblur a blurred image to obtain a clear image.
  • The gap between the a priori hypothesis and the real situation in the process of generating blurred images is avoided, overcoming the technical problem that a target model trained with blurred images generated in the related art cannot deblur real images, and achieving the technical effect of deblurring a blurred image to obtain a clear image.
  • In some embodiments, the apparatus comprises: a training unit configured to repeatedly invoke the following modules to train the original model, before the target model obtained by training the original model with sample images of different scales is acquired, until the scale of the intermediate image is the same as the scale of the composite image, wherein the current scale is initialized to the scale of the first image of the composite image, the current model is initialized to the original model, the intermediate image is initialized to the first image, and the first image is a blurred image obtained by downsampling the composite image by a target multiple:
  • a first acquisition module configured to acquire the first image of the current scale from the composite image;
  • a first processing module configured to deblur the first image and the intermediate image at the current scale using the current model to obtain a second image, wherein the second image is a clear image associated with the first image;
  • an amplifying module configured to enlarge the second image to obtain a third image, wherein the intermediate image is updated to the third image;
  • a first update module configured to update the current scale to N times the current scale, wherein N is greater than or equal to 2; and
  • a second update module configured to update the current model to a first model, wherein the first model is a model obtained by training the original model according to the first image.
  • This embodiment trains the target model with an iterative deep neural network, using images of different scales.
  • The scale can be understood as the resolution of the image.
  • Iteration proceeds in order from the coarse scale to the fine scale.
  • At a sufficiently coarse scale, the downsampled picture is considered relatively clear.
  • This embodiment uses this as a starting point: the clear picture at the current scale is optimized, and that clear picture is upsampled and used as input for estimating the clear picture at the next scale, until the scale of the output image is the same as that of the original image.
  • The blurred picture at the current scale is obtained by downsampling the original blurred picture to the current scale.
  • The clear picture at the current scale is used as the iterative target during training, so as to finally optimize the clear picture at the original scale. Deblurring is thus decomposed into a series of sub-problems at multiple scales: given the blurred image at the current scale and a preliminary deblurred image (obtained by upsampling the clear image estimated at the previous scale), estimate the clear picture at the current scale.
  • the basic model is as follows:
  • $I^i, h^i = \mathrm{Net}_{SR}(B^i, I^{i+1\uparrow}, h^{i+1\uparrow}; \theta_{SR})$
  • The meaning of this formula is that, for scale i, given the blurred picture $B^i$ at the current scale together with the upsampled clear picture and hidden state from the coarser scale as inputs to the neural network, the clear picture and hidden state at the current scale are output. In this way, a clear image is estimated successively from the coarse scale to the fine scale until a clear image of the same scale as the sample image is obtained.
  • The hidden state can take the form used in recurrent neural networks, such as the recurrent neural network (RNN), the long short-term memory network (LSTM), and the gated recurrent unit (GRU).
  • This embodiment can use LSTM to represent the hidden-layer information.
  • For example, the scale of the sample image is 256*256, that is, there are 256 pixels in both the horizontal and vertical directions; the current scale is the scale of the first image, which is 64*64, and the first image is obtained by downsampling the sample image.
  • The downsampling can be interval sampling, in which the sample image is reduced by taking sampling points at intervals.
  • the sample image is a blurred image, and the first image obtained after downsampling is also a blurred image.
  • The first image, serving both as the sample input and as the intermediate image, is input into the original model for training, and a second image obtained by preliminary deblurring is output; the scale of the second image is 64*64, and the original model is updated to the first model after training. The first image is a coarse-scale image; the first image and the intermediate image serve as input images of the original model, and the output second image is also coarse-scale, being the output image of the original model.
  • the network structure of the first model and the second model is the same, and the parameters of the first model and the second model are different.
  • The second image is enlarged to obtain a third image with a scale of 128*128, and a fourth image with a scale of 128*128 is obtained by downsampling the sample image; the enlargement can use interpolation-based upsampling.
  • The fourth image as the sample input and the third image as the intermediate image are input into the first model for training, and a fifth image obtained by deblurring is output; the fifth image is clearer than the fourth image, the scale of the fifth image is 128*128, and the first model is updated to the second model after training. The third image and the fourth image are mesoscale images; they serve as input images of the first model, and the output fifth image is also mesoscale, being the output image of the first model.
  • the network structure of the second model and the first model are the same, and the parameters of the second model and the first model are different.
  • The fifth image is enlarged to obtain a sixth image with a scale of 256*256; the enlargement can use interpolation-based upsampling.
  • the sample image and the sixth image are input into the second model for training, and the seventh image obtained by the deblurring process is output, and the second model is updated to the third model after training.
  • the scale of the seventh image is the same as the scale of the sample image.
  • The sample image is then updated to a new image, and training continues with the updated sample image until all images in the training set have been used; the model obtained after training on all images in the training set is the target model.
  • The sixth image and the seventh image are fine-scale images; the sample image and the sixth image serve as input images of the second model, and the output seventh image is also a fine-scale image whose scale is the same as that of the sample image.
  • In this example, the scale multiple between adjacent scales is 2; it should be noted that different multiples can be used in the actual training process.
  • The scale of the sample image in this embodiment may also be larger, for example, 1024*1024; a part of the image can be cropped from the sample image to train the original model, which saves the memory required for training the model.
  • In some embodiments, the current model includes a codec network, the codec network includes an encoding network and a decoding network, and the first processing module includes: an encoding submodule configured to encode the first image and the intermediate image using the encoding network to obtain a first result, wherein each two-layer convolution of the encoding network further includes a residual unit, and the residual unit is configured to add the data from before the two-layer convolution calculation to the data after the two-layer convolution calculation; and a decoding submodule configured to decode the first result output by the encoding network using the decoding network to obtain the second image, wherein each two-layer convolution of the decoding network likewise includes the residual unit.
  • Figure 5 shows the three codec networks in the current model.
  • In FIG. 5, from input B3 to output I3 is one codec network.
  • From input B2 to output I2 is a codec network.
  • Input B1 to output I1 is a codec network.
  • Each codec network may perform deblurring on one image.
  • Each two-layer convolution in the coding network and the decoding network includes a residual unit, and a schematic diagram of the residual unit is shown in FIG. 6.
  • In some embodiments, the nonlinear convolution layers following each dimension-reducing or dimension-increasing convolution layer in the codec network are replaced with residual units, and the number of residual units at each spatial dimension is the same in the encoding network and the decoding network.
  • the residual unit can calculate the difference between the output and the input of one block in the codec network, so that the calculation amount becomes smaller, it is easier to learn, and the learning ability of the network is optimized.
  • The learning ability of the network can be further optimized by skip-connecting features from the encoding network to the decoding network.
  • the codec network of FIG. 5 is a symmetric network, including an encoding network and a decoding network.
  • The encoding network encodes the blurred image and outputs the encoded first result to the decoding network, and the decoding network processes it to output a clear picture; this process of encoding and decoding implements deblurring.
  • The structure of the codec network of this embodiment can be decomposed into three modules, namely an encoding network $\mathrm{Net}_E$ (including the input block, E block #1, and E block #2 in FIG. 5), a hidden-layer unit ConvLSTM (the LSTM shown in FIG. 5), and a decoding network $\mathrm{Net}_D$ (including the output block, D block #1, and D block #2 in FIG. 5), which are sequentially represented by the following formulas:
  • $f^i = \mathrm{Net}_E(B^i, I^{i+1\uparrow}; \theta_E)$
  • $h^i, g^i = \mathrm{ConvLSTM}(h^{i+1\uparrow}, f^i; \theta_{LSTM})$
  • $I^i = \mathrm{Net}_D(g^i; \theta_D)$
  • Here $f^i$ denotes the coding feature at the i-th scale; $B^i$ is the blurred picture at the i-th scale; $I^{i+1\uparrow}$ is the enlarged version of the clear image output at the scale preceding the i-th scale; $h^i$ denotes the hidden information at the i-th scale; $h^{i+1\uparrow}$ represents the hidden information at the preceding scale; $g^i$ represents the result of optimizing $f^i$; and $\theta_E$, $\theta_{LSTM}$, $\theta_D$ respectively represent the weights of all convolutional layers in the encoding network $\mathrm{Net}_E$, the hidden-layer unit ConvLSTM, and the decoding network $\mathrm{Net}_D$. "$\uparrow$" represents the operation of magnifying the picture by a factor of two. Both the encoding network and the decoding network contain residual units to increase the network's learning capability; in the same spatial dimension, three residual units can be added to balance the deblurring effect and the computational cost.
  • the scale of the sample image is 1000*2000, that is, 1000 pixels in the horizontal direction and 2000 pixels in the vertical direction.
  • The current scale is the scale of the first image, which is 250*500, and the first image is downsampled from the sample image.
  • the method of downsampling can be interval sampling, where the sample image is reduced by sampling points (eg, sampling at intervals).
  • the sample image is a blurred image, and the first image obtained after downsampling is also a blurred image.
  • The first image, serving both as the sample input and as the intermediate image, is input as input B3 into the original model for training, and the second image I3 obtained by preliminary deblurring is output; the scale of the second image is 250*500, and the original model is updated to the first model after training;
  • The second image is enlarged to obtain a third image with a scale of 500*1000, and a fourth image with a scale of 500*1000 is obtained by downsampling the sample image; the enlargement can use interpolation-based upsampling.
  • The fourth image as the sample input and the third image as the intermediate image are input as input B2 into the first model for training, and the fifth image I2 obtained by deblurring is output; the fifth image is clearer than the fourth image, the scale of the fifth image is 500*1000, and the first model is updated to the second model after training;
  • the fifth image is enlarged to obtain a sixth image with a scale of 1000*2000.
  • The enlargement can use interpolation-based upsampling;
  • The sample image and the sixth image are input as input B1 into the second model for training, and the seventh image I1 obtained by deblurring is output; the second model is updated to the third model after training.
  • the scale of the seventh image is the same as the scale of the sample image, and the training is ended.
  • the image of FIG. 7 can be used as a sample image of the input original model, and the image shown in FIG. 8 can be used as the seventh image.
  • the image of Fig. 9 can be used as a sample image of the input original model, and the image shown in Fig. 10 can be used as the seventh image.
  • the image of FIG. 11 can be used as a sample image of the input original model, and the image shown in FIG. 12 can be used as the seventh image.
  • the original model is trained by the deep iterative neural network model to obtain the target model.
  • the clear image obtained by the previous scale is enlarged and used as the input of the current scale, and the blurred image of the current scale is used for training.
  • the blurred image is deblurred by the target model to obtain a clear image.
  • In some embodiments, the first processing module includes: an acquiring submodule configured to acquire inherent information of images at different scales, wherein the inherent information is transmitted between the codec networks of different scales through the recurrent neural network in the current model; and a processing submodule configured to deblur the first image and the intermediate image at the current scale using the codec network, combined with the inherent information, to obtain the second image.
  • The recurrent neural network may be an LSTM (long short-term memory network) module.
  • The hidden information may be information common to pictures of different scales, such as the structure of the images at different scales.
  • An electronic device for implementing the above image processing method is also provided; it may be the terminal 101 shown in FIG. 1 or the server 102, as shown in FIG. 14.
  • the electronic device includes a memory and a processor having a computer program stored therein, the processor being configured to perform the steps of any of the above method embodiments by a computer program.
  • FIG. 14 is a structural block diagram of an electronic device according to an embodiment of the present application.
  • The electronic device may include one or more processors 1401 (only one is shown), at least one communication bus 1402, a user interface 1403, at least one transmission device 1404, and a memory 1405.
  • the communication bus 1402 is used to implement connection communication between these components.
  • the user interface 1403 can include a display 1406 and a keyboard 1407.
  • Transmission device 1404 can optionally include a standard wired interface and a wireless interface.
  • the foregoing electronic device may be located in at least one network device of the plurality of network devices of the computer network.
  • The foregoing processor may be configured to perform the following steps by using a computer program: acquiring an image processing instruction, wherein the image processing instruction is used to instruct deblurring of a target blurred image; acquiring a target model obtained by training an original model with sample images of different scales; deblurring the target blurred image using the target model in response to the image processing instruction to obtain a target clear image; and outputting the target clear image.
  • The structure shown in FIG. 14 is merely illustrative, and the electronic device may also be a terminal device such as a smart phone (for example, an Android mobile phone or an iOS mobile phone), a tablet computer, a palm computer, a mobile Internet device (MID), or a PAD.
  • FIG. 14 does not limit the structure of the above electronic device.
  • For example, the electronic device may also include more or fewer components (such as a network interface or display device) than shown in FIG. 14, or have a configuration different from that shown in FIG. 14.
  • The memory 1405 can be configured to store software programs and modules, such as the program instructions/modules corresponding to the image processing method and apparatus in the embodiments of the present application.
  • The processor 1401 runs the software programs and modules stored in the memory 1405, thereby performing various functional applications and data processing, that is, implementing the above image processing method.
  • Memory 1405 can include high speed random access memory, and can also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory.
  • memory 1405 can further include memory remotely located relative to processor 1401, which can be connected to the terminal over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the above described transmission device 1404 is arranged to receive or transmit data via a network.
  • Specific examples of the above network may include a wired network and a wireless network.
  • the transmission device 1404 includes a Network Interface Controller (NIC) that can be connected to other network devices and routers via a network cable to communicate with the Internet or a local area network.
  • transmission device 1404 is a radio frequency (RF) module that is configured to communicate with the Internet wirelessly.
  • the memory 1405 is set to store a sample image.
  • An embodiment of the present application further provides a storage medium, which may be a non-transitory computer readable storage medium in which a computer program is stored, wherein the computer program is configured to perform the steps in any of the above method embodiments when run.
  • The above storage medium may be configured to store a computer program for performing the following steps: acquiring an image processing instruction, wherein the image processing instruction is used to instruct deblurring of a target blurred image; acquiring a target model obtained by training an original model with sample images of different scales; deblurring the target blurred image using the target model in response to the image processing instruction to obtain a target clear image; and outputting the target clear image.
  • the storage medium is further arranged to store a computer program for performing the following steps:
  • The following steps are repeatedly performed to train the original model until the scale of the intermediate image is the same as the scale of the composite image, wherein the current scale is initialized to the scale of the first image of the composite image, the current model is initialized to the original model, the intermediate image is initialized to the first image, and the first image is a blurred image obtained by downsampling the composite image by a target multiple: acquiring, from the composite image, the first image whose scale is the current scale; deblurring the first image and the intermediate image at the current scale using the current model to obtain a second image, wherein the second image is a clear image associated with the first image; enlarging the second image to obtain a third image, wherein the intermediate image is updated to the third image; updating the current scale to N times the current scale, wherein N is greater than or equal to 2; and updating the current model to a first model, wherein the first model is a model obtained by training the original model according to the first image.
  • the storage medium is further configured to store a computer program for performing the steps of: obtaining a continuous multi-frame clear image from the set of frame pictures, wherein the set of frame pictures is a set of all frame pictures of a video segment;
  • the multi-frame clear image is subjected to a combining process to obtain the sample image, wherein the sample image is a blurred image.
  • the storage medium is further configured to store a computer program for performing the steps of: randomly selecting a partial image from the multi-frame clear image; performing a first summation and then averaging for each of the partial images Processing, obtaining a blurred image; using the blurred image as the sample image.
  • the storage medium is further configured to store a computer program for performing the following steps: encoding the first image and the intermediate image using the encoding network to obtain a first result, wherein the encoding
  • the two-layer convolution of the network further includes a residual unit for adding data before the two-layer convolution calculation to the data after the two-layer convolution calculation; using the decoding network pair
  • the first result of the encoding network output is subjected to a decoding process to obtain the second image, wherein the two-layer convolution of the decoding network includes the residual unit.
  • the storage medium is further configured to store a computer program for performing the steps of: acquiring intrinsic information of images of different scales, wherein the intrinsic information is processed at different scales by a recurrent neural network in the current model Transmitting between codec networks; using the codec network, combining the inherent information to deblurize the first image and the intermediate image of the current scale to obtain a second image.
  • the storage medium is further configured to store a computer program for performing the steps included in the method in the above embodiments, which will not be described in detail in this embodiment.
  • the storage medium may include a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and the like.
  • the integrated unit in the above embodiment if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in the above-described computer readable storage medium.
  • the technical solution of the embodiments of the present application may be embodied in the form of a software product in the form of a software product in essence or in a part contributing to the related art, and the computer software product is stored in a storage medium.
  • a number of instructions are included to cause one or more computer devices (which may be a personal computer, server or network device, etc.) to perform all or part of the steps of the methods described in various embodiments of the present application.
  • the disclosed client may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • multiple units or components may be combined or may be Integrate into another system, or some features can be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, unit or module, and may be electrical or otherwise.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the features of the blurred photos in the real scene may be represented, and the target models obtained by training the original models using the sample images may be Deblurring the blurred image to obtain a clear image.
  • the calculation method such as convolution kernel
  • the gap between the a priori hypothesis and the real situation in the process of generating the blurred image is avoided, and the target model trained by the fuzzy image generated in the related technology is avoided.
  • the technical problem of deblurring achieves the technical effect of deblurring a blurred image to obtain a clear image.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

一种图像的处理方法、装置、存储介质和电子装置。其中,该方法包括:终端设备获取图像处理指令(S202),其中,所述图像处理指令用于指示对目标模糊图像进行去模糊处理;终端设备获取利用不同尺度的样本图像对原始模型进行训练得到的目标模型(S204),其中,所述样本图像为合成图像,所述合成图像为对多张清晰图像进行合成处理得到的模糊图像,所述目标模型用于对模糊图像进行去模糊处理以得到清晰图像;终端设备响应所述图像处理指令采用目标模型对所述目标模糊图像进行去模糊处理,以得到目标清晰图像(S206);终端设备输出所述目标清晰图像(S208)。所述方法解决了无法对模糊图像进行去模糊的问题。

Description

图像的处理方法、装置、存储介质和电子装置
本申请要求于2018年04月04日提交中国专利局、申请号为201810301685.1、发明名称“图像的处理方法、装置、存储介质和电子装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请实施例涉及图像处理领域,具体而言,涉及一种图像的处理方法、装置、存储介质和电子装置。
背景技术
图像模糊在日常拍照中是一种很常见的问题,尤其是在动态场景或者光线较暗的环境中。图像的去模糊是一种必备并且重要的图像处理操作,恢复由于模糊而损失的细节信息。传统单帧图像去模糊方法是假设固定卷积核模型,通过不断迭代卷积核估计和图像反卷积两种操作来逐步优化去模糊效果。基于神经网络的图像去模糊方法大多是采用图像卷积模型用模糊核生成模糊图像来训练神经网络。
无论是传统迭代方法还是神经网络方法,都对模糊图像有着严格的卷积模型假设。其基本的解法是通过不断迭代卷积核估计和图像反卷积两种操作来逐步优化去模糊效果。不同的方法基于对自然图像不同的先验假设提出了特定优化方程,真实的模糊图像场景十分复杂,包括相机的运动和场景中物体的运动,理论上的先验假设很少满足,导致大部分去模糊方法在真实情况下的无法达到去模糊的效果,可靠性比较差。
针对上述的问题,目前尚未提出有效的解决方案。
发明内容
本申请实施例提供了一种图像的处理方法、装置、存储介质和电子装 置,以至少解决无法对模糊图像进行去模糊的技术问题。
根据本申请实施例的一个方面,提供了一种图像的处理方法,包括:终端设备获取图像处理指令,其中,所述图像处理指令用于指示对目标模糊图像进行去模糊处理;终端设备获取利用不同尺度的样本图像对原始模型进行训练得到的目标模型,其中,所述样本图像为合成图像,所述合成图像为对多张清晰图像进行合成处理得到的模糊图像,所述目标模型被设置为对模糊图像进行去模糊处理以得到清晰图像;终端设备响应所述图像处理指令采用目标模型对所述目标模糊图像进行去模糊处理,以得到目标清晰图像;终端设备输出所述目标清晰图像。
根据本申请实施例的另一方面,还提供了一种图像的处理装置,包括一个或多个处理器,以及一个或多个存储程序单元的存储器,其中,程序单元由处理器执行,程序单元包括:第一获取单元,被设置为获取图像处理指令,其中,所述图像处理指令用于指示对目标模糊图像进行去模糊处理;第二获取单元,被设置为获取利用不同尺度的样本图像对原始模型进行训练得到的目标模型,其中,所述样本图像为合成图像,所述合成图像为对多张清晰图像进行合成处理得到的模糊图像,所述目标模型用于对模糊图像进行去模糊处理以得到清晰图像;响应单元,被设置为响应所述图像处理指令采用目标模型对所述目标模糊图像进行去模糊处理,以得到目标清晰图像;输出单元,被设置为输出所述目标清晰图像。
根据本申请实施例的另一方面,还提供了一种非暂态计算机可读存储介质,所述存储介质中存储有计算机程序,其中,所述计算机程序被设置为运行时执行上述的方法。
根据本申请实施例的另一方面,还提供了一种电子装置,包括存储器和处理器,所述存储器中存储有计算机程序,所述处理器被设置为通过所述计算机程序执行上述的方法。
在本申请实施例中,由于用于训练目标模型的样本图像是根据真实拍摄的图像合成的,可以表示真实场景下模糊照片的特征,利用这些样本图 像对原始模型进行训练得到的目标模型,可以对模糊图像进行去模糊处理,得到清晰的图像。相比利用卷积核等计算方式来生成模糊图像的方式,避免了生成模糊图像过程中先验假设与真实情况的差距,也就避免了相关技术中生成的模糊图像训练出的目标模型无法实现去模糊的技术问题,达到了对模糊图像进行去模糊得到清晰图像的技术效果。
附图说明
此处所说明的附图用来提供对本申请实施例的进一步理解,构成本申请实施例的一部分,本申请的示意性实施例及其说明用于解释本申请,并不构成对本申请的不当限定。在附图中:
图1是根据本申请实施例的一种硬件环境的示意图;
图2是根据本申请实施例的图像的处理方法的流程图;
图3是根据本申请实施例的第一种模糊图像的示意图;
图4是根据本申请实施例的对第一种模糊图像去模糊后得到的清晰图像的示意图;
图5是根据本申请实施例的原始模型的示意图;
图6是根据本申请实施例的残差单元的示意图;
图7是根据本申请实施例的第二种模糊图像的示意图;
图8是根据本申请实施例的对第二种模糊图像去模糊后得到的清晰图像的示意图;
图9是根据本申请实施例的第三种模糊图像的示意图;
图10是根据本申请实施例的对第三种模糊图像去模糊后得到的清晰图像的示意图;
图11是根据本申请实施例的第四种模糊图像的示意图;
图12是根据本申请实施例的对第四种模糊图像去模糊后得到的清晰 图像的示意图;
图13是根据本申请实施例的图像的处理装置的示意图;
图14是根据本申请实施例的电子装置的示意图。
具体实施方式
为了使本技术领域的人员更好地理解本申请实施例方案,下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本申请一部分的实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都应当属于本申请实施例保护的范围。
需要说明的是,本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便这里描述的本申请的实施例能够以除了在这里图示或描述的那些以外的顺序实施。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元的过程、方法、***、产品或设备不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。
根据本申请实施例的一个方面,提供了一种图像的处理方法。在本实施例中,上述图像的处理方法可以应用于如图1所示的终端101和服务器102所构成的硬件环境中。如图1所示,终端101通过网络与服务器102进行连接,上述网络包括但不限于:广域网、城域网或局域网,终端101可以是手机终端,也可以是PC终端、笔记本终端或平板电脑终端。
图2是根据本申请实施例的图像的处理方法的流程图,下面以终端设备执行上述标识的显示方法为例进行说明,该终端设备可以是图1所示的终端101,也可以是服务器102。如图2所示,该图像的处理方法包括:
S202,终端设备获取图像处理指令,其中,图像处理指令用于指示对目标模糊图像进行去模糊处理。
去模糊处理是指将模糊的图像处理为清晰的图像。如图3所示的图像左下角的图为图中小黄人后面的书籍的书脊字母的放大图,从左下角的图可以看出来,图像比较模糊,看不清楚书脊字母的内容。
如图4所示的图像左下角的图为图中小黄人后面的书籍的书脊字母的放大图。图4是对图3进行去模糊处理后得到的清晰图像,尤其对比图4和图3左下角,图4左下角的图比图3左下角的图更清晰,已经能够清晰的显示出书脊字母为“ROCESS”。
目标模糊图像可以是图3所示的图像,对图3所示的模糊图像进行去模糊处理得到图4所示的图像。去模糊处理的过程就是将上述图3所示的图像处理得到图4所示图像的过程。
S204,终端设备获取利用不同尺度的样本图像对原始模型进行训练得到的目标模型,其中,样本图像为合成图像,合成图像为对多张清晰图像进行合成处理得到的模糊图像,目标模型用于对模糊图像进行去模糊处理以得到清晰图像。
S206,终端设备响应图像处理指令采用目标模型对目标模糊图像进行去模糊处理,以得到目标清晰图像。
将目标模糊图像输入到目标模型中,以便目标模型对目标模糊图像进行处理,输出目标清晰图像。目标清晰图像可以是图4所示的图像。目标模型可以是神经网络模型,该目标模型通过训练原始模型得到。训练原始模型需要的样本图像是通过多张清晰图像合成的模糊图像,生成的模糊图像对应的清晰图像就是合成模糊图像之前的清晰图像,也就是说,合成图像可以作为样本图像,多张清晰图像是目标模型的训练目标。在得到训练好的目标模型后,向目标模型中输入合成图像后可以输出与合成图像对应的清晰图像,该清晰图像可以是多张清晰图像中的任意一张。不同尺度的 样本图像可以是通过对同一张样本图像进行降采样得到的图像,降采样的粒度不同,得到的样本图像的尺度也不相同。
可选地,在获取利用不同尺度的样本图像对原始模型进行训练得到的目标模型之前,方法还包括:终端设备从帧画面集合中获取连续的多帧清晰图像,其中,帧画面集合为一段视频所有帧画面的集合;对多帧清晰图像进行合并处理,得到样本图像,其中,样本图像为模糊图像。
图像中模糊数据的产生通常是由于拍摄时相机的运动或者场景中物体的运动。这两种模糊本质上都是由于快门速度过慢导致。在快门打开到关闭的短时间内由于相机的运动或者场景的位移导致了相机内部的图像传感器像素采集的不只是某个固定位置的亮度而是在这个时刻内相关位置所有亮度的积分。该积分在高速相机拍摄的图像中可以近似为相邻连续图像的求和。这使得利用高速相机模拟真实模糊图片具备了可行性。本实施例采用高速相机来采集高速视频以合成足够的模糊图片。因为训练层数较深的卷积网络需要大量的数据,本实施例获取大量模糊图像进行训练。该模糊图像可以是高速相机在240帧每秒的速度下拍摄的高速视频。本实施例的帧画面集合是高速视频所有帧画面的集合,例如,一个5秒的高速视频,帧画面集合包括240*5=1200帧画面,每个帧画面就是一个清晰的图像。从帧画面集合中选择连续的多帧清晰图像,该连续的多帧清晰图像可以是在几百毫秒内拍摄得到的图像,几百毫秒内拍摄得到的图像也可以包括几十到几百张清晰图像,可以对这些清晰图像进行合成得到样本图像,也可以对这些清晰图像中的部分图像进行合成得到样本图像。
可选地,对多帧清晰图像进行合并处理,得到样本图像包括:从多帧清晰图像中随机选择部分图像;对部分图像分别针对每个通道进行先求和再取平均的处理,得到一张模糊的图像;将一张模糊的图像作为样本图像。
从连续的多帧清晰图像中随机选择部分进行合成,具体方式是对几帧图像进行求和取平均的方法得到模糊图片。在求和时可以对图像的每个通道的数据分别进行求和,然后分别对每个通道的数据进行求平均的处理, 求平均后得到的数据可以表示一个生成的模糊图像,即样本图像。
随机选择的部分图像可以生成多个模糊图像作为样本图像,例如,部分图像有20张,在合成样本图像时可以多次随机选择7-13张图像进行合成,每次选择7-13张图像就能得到一样模糊图像。比如,20张图像的编号依次为1、2、3、……20,第一次选择编号为1-4以及编号为10-13的图像进行合成,第二次可以选择编号为3、5、9、15、16、17、19和20的图像进行合成,每次选择的图片可以是随机的。
S208,终端设备输出目标清晰图像。
本实施例中,由于用于训练目标模型的样本图像是根据真实拍摄的图像合成的,可以表示真实场景下模糊照片的特征,利用这些样本图像对原始模型进行训练得到的目标模型,可以对模糊图像进行去模糊处理,得到清晰的图像。相比利用卷积核等计算方式来生成模糊图像的方式,避免了生成模糊图像过程中先验假设与真实情况的差距,也就避免了相关技术中生成的模糊图像训练出的目标模型无法实现去模糊的技术问题,达到了对模糊图像进行去模糊得到清晰图像的技术效果。
可选地,在获取利用不同尺度的样本图像对原始模型进行训练得到的目标模型之前,方法包括:终端设备重复执行以下步骤以对原始模型进行训练,直到中间图像的尺度与合成图像的尺度相同,其中,当前尺度被初始化为合成图像的第一图像的尺度,当前模型被初始化为原始模型,中间图像被初始化为第一图像,第一图像是通过对合成图像进行目标倍数的降采样得到的模糊图像:
从合成图像中获取尺度为当前尺度的第一图像;使用当前模型对尺度为当前尺度的第一图像和中间图像进行去模糊处理,得到第二图像,其中,第二图像为与第一图像关联的清晰图像;对第二图像进行放大处理,得到第三图像,其中,中间图像被更新为第三图像;将当前尺度更新为当前尺度的N倍,其中,N大于等于2;将当前模型更新为第一模型,其中,第一模型为根据第一图像对原始模型进行训练得到的模型。
本实施例采用迭代深度神经网络模型来训练目标模型。利用图像不同尺度的图像进行训练。尺度可以理解为图像的分辨率。在训练进行图像去模糊的目标模型的过程中,采用从粗尺度到细尺度迭代的顺序。在最粗的尺度(图片下采样到最小,分辨率较小),图片认为是比较清晰的,本实施例以此作为出发点,优化出当前尺度的清晰图片并上采样此清晰图片作为下一尺度的输入来估计出下一尺度的清晰图片,直到输出图像的尺度与原始图像尺度相同。其中当前尺度模糊图片为原始模糊图片降采样到当前尺度大小。通过向待训练的模型输入当前尺度模糊图片和放大的上一尺度清晰图片优化得到当前尺度的清晰图片进行训练,以此作为迭代目标最终优化出原来尺度的清晰图片。因此,去模糊分解为一系列多尺度下的子问题:输入当前尺度的模糊图片和初步去模糊图片(初步去模糊图片由上一个尺度估计出的清晰图片上采样得到),估计出当前尺度下的清晰图片。其基本模型如下:
I i,h i=Net SR(B i,I i+1↑,h i+1↑;θ SR)
其中i代表当前尺度(1代表最细的尺度);B i代表尺度i下的模糊图片;I i代表尺度i下输出的清晰图片;h i代表尺度i下的隐藏状态特征其中隐含估计了每个位置的卷积核信息;Net SR是迭代神经网络;θ SR是迭代神经网络中所有卷积层的权重;“↑”代表对图片进行放大2倍的操作,例如可以将128*128分辨率的图片放大为256*256分辨率的图片。这个公式的意义是对于尺度i,给定当前尺度的模糊图片B i和上采样上一尺度的清晰图片和隐藏状态作为神经网络的输入,输出当前尺度的清晰图片和隐藏状态。并以此从粗尺度到细尺度不断估计出清晰图像直到得到与样本图像相同尺度的清晰图像。
神经网络中的隐藏状态有几种不同选择,包括循环神经网络(RNN),长短期记忆网络(LSTM),门控制循环单元(GRU)。本实施例可以采用(LSTM)作为表示隐藏层信息的方式。对于如何从尺度i+1估计出的清晰图片放大(即上采样上一尺度的清晰图片)作为下一尺度的输入,同样 也有几种不同的选项,包括反卷积,缩放。基于效率和速度的考量,本实施例可以选择双线性插值缩放的方法。
例如:
样本图像的尺度为256*256,即水平方向和竖直方向各有256个像素,当前尺度为第一图像的尺度,该第一图像的尺度为64*64,第一图像为从样本图像降采样得到。降采样的方式可以是间隔采样,在样本图像减少采样点(例如间隔几个点进行采样)。样本图像是模糊图像,降采样后得到的第一图像也是模糊图像,步骤如下:
1、将作为样本图像的第一图像和作为中间图像的第一图像输入到原始模型中进行训练,输出初步去模糊处理得到的第二图像,第二图像的尺度为64*64,此时原始模型经过训练后更新为第一模型;第一图像为粗尺度的图像,第一图像和中间图像作为原始模型的输入图像,输出同样为粗尺度的第二图像,第二图像作为原始模型的输出图像。其中,第一模型和第二模型的网络结构相同,第一模型和第二模型的参数不同。
2、对第二图像进行放大处理,得到尺度为128*128的第三图像。放大处理可以采样插值上采样;
3、对样本图像进行降采样,得到尺度为128*128的第四图像。
4、将作为样本图像的第四图像和作为中间图像的第三图像输入到第一模型中进行训练,输出经过去模糊处理得到的第五图像,第五图像比第四图像更清晰,第五图像的尺度为128*128,此时第一模型经过训练后更新为第二模型;第三图像和第四图像为中尺度的图像,第三图像和第四图像作为第一模型的输入图像,输出同样为中尺度的第五图像,第五图像为第一模型的输出图像。其中,第二模型和第一模型的网络结构相同,第二模型和第一模型的参数不同。
5、对第五图像进行放大处理,得到尺度为256*256的第六图像。放大处理可以采样插值上采样;
6、对样本图像和第六图像输入到第二模型中进行训练,输出经过去模糊处理得到的第七图像,此时第二模型经过训练后更新为第三模型。第七图像的尺度与样本图像的尺度相同。将样本图像更新为新的图像,继续利用更新后的样本图像进行训练,直到训练集中的所有图像都完成训练。在训练集中的所有图像都完成训练后得到的模型作为目标模型。其中,第六图像和第七图像为细尺度的图像,第六图像和第七图像作为第二模型的输入图像,输出同样为细尺度的图像,输出的图像的尺度与样本图像的尺度相同。
此处尺度的倍数关系为2,需要说明的是,实际训练的过程中,可以采用不同的倍数关系。本实施例的样本图像的尺度可以更大,例如1024*1024,从样本图像上取出一部分图像对原始模型进行训练,可以节约训练模型所需的内存空间。
可选地,当前模型包括编解码网络,编解码网络包括编码网络和解码网络,使用当前模型对尺度为当前尺度的第一图像和中间图像进行去模糊处理,得到第二图像包括:使用编码网络对第一图像和中间图像进行编码处理,得到第一结果,其中,编码网络的两层卷积还包括残差单元,残差单元用于将两层卷积计算之前的数据添加到两层卷积计算之后的数据中;使用解码网络对编码网络输出的第一结果进行解码处理,得到第二图像,其中,解码网络的两层卷积包括残差单元。
当前模型如图5所示,图5示出了当前模型中的3个编解码网络,图5从输入B3到输出I3是一个编解码网络,从输入B2到输出I2是一个编解码网络,从输入B1到输出I1是一个编解码网络。每个编解码网络可以对一个图像进行去模糊处理,编码网络和解码网络中的每两层卷积包括残差单元,残差单元的示意图如图6所示。本实施例将编解码网络中降维卷积或者升维卷积层之后的非线性卷积替换为残差单元,保证编码网络和解码网络中每个空间维度下的残差单元数目一致。残差单元可以计算编解码网络中一个块的输出和输入的差值,使得计算量变小,更容易学习,优化 网络的学习能力。再通过跳跃连接编码网络和解码网络对应的特征可以进一步优化网络的学习能力。
图5的编解码网络是对称的网络,包括编码网络和解码网络,编码网络可以对模糊图像进行编码处理,并将编码处理后的第一结果输出给解码网络,由解码网络进行处理,以输出清晰的图片,编码和解码的过程实现了去模糊处理。
本实施例的编解码网络的结构,如图5所示,可以分解为三个模块,分别为编码网络Net E,隐藏层单元ConvLSTM,解码网络Net D,依次采用以下公式表示:
f i=Net E(B i,I i+1↑;θ E)
h i,g i=ConvLSTM(h i+1↑,f i;θ LSTM)
I i=Net D(g i;θ D)
其中,f i表示第i个尺度的编码特征,B i为第i个尺度下的模糊图片,I i+1为第i个尺度的上一个尺度输出的清晰图像的放大图,h i表示第i个尺度的隐藏信息,h i+1表示第i个尺度的上一个尺度的隐藏信息,g i表示对f优化后的结果,θ E、θ LSTM、θ D分别代表编码网络Net E中所有卷积层的权重、隐藏层单元ConvLSTM中所有卷积层的权重、解码网络Net D中所有卷积层的权重,“↑”代表对图片进行放大2倍的操作,其中,编码网络和解码网络都包含了残差单元来增加网络学***衡去模糊效果和计算代价。
以下结合图5对本实施例进行说明。
样本图像的尺度为1000*2000,即水平方向上有1000个像素,竖直方向有2000个像素,当前尺度为第一图像的尺度,该第一图像的尺度为250*500,第一图像为从样本图像降采样得到。降采样的方式可以是间隔采样,在样本图像减少采样点(例如间隔几个点进行采样)。样本图像是模糊图像,降采样后得到的第一图像也是模糊图像,步骤如下:
1、将作为样本图像的第一图像和作为中间图像的第一图像作为输入B 3输入到原始模型中进行训练,输出初步去模糊处理得到的第二图像I 3,第二图像的尺度为250*500,此时原始模型经过训练后更新为第一模型;
2、对第二图像进行放大处理,得到尺度为500*1000的第三图像。放大处理可以采样插值上采样;
3、对样本图像进行降采样,得到尺度为500*1000的第四图像。
4、将作为样本图像的第四图像和作为中间图像的第三图像作为输入B 2输入到第一模型中进行训练,输出经过去模糊处理得到的第五图像I 2,第五图像比第四图像更清晰,第五图像的尺度为500*1000,此时第一模型经过训练后更新为第二模型;
5、对第五图像进行放大处理,得到尺度为1000*2000的第六图像。放大处理可以采样插值上采样;
6、对样本图像和第六图像作为输入B 1输入到第二模型中进行训练,输出经过去模糊处理得到的第七图像I 1,此时第二模型经过训练后更新为第三模型。第七图像的尺度与样本图像的尺度相同,结束训练。
结合图7和图8,图7的图像可以作为输入原始模型的样本图像,图8所示的图像可以作为第七图像。
结合图9和图10,图9的图像可以作为输入原始模型的样本图像,图10所示的图像可以作为第七图像。
结合图11和图12,图11的图像可以作为输入原始模型的样本图像,图12所示的图像可以作为第七图像。
本实施例中,通过深度迭代神经网络模型对原始模型进行训练得到目标模型,在训练过程中,将上一个尺度得到的清晰图像放大后作为当前尺度的输入,结合当前尺度的模糊图片进行训练,以得到目标模型,利用目标模型将模糊图像去模糊处理得到清晰图像。
可选地,使用当前模型对尺度为当前尺度的第一图像和中间图像进行去模糊处理,得到第二图像包括:获取不同尺度的图像的固有信息,其中,固有信息通过当前模型中的递归神经网络在处理不同尺度的编解码网络之间传输;使用编解码网络、结合固有信息对尺度为当前尺度的第一图像和中间图像进行去模糊处理,得到第二图像。
本实施例需要在不同尺度之间传递关于模糊的隐藏信息,编解码网络内部需要添加相应处理迭代信息的模块。如图5所示,在解码网络中间的位置添加了LSTM模块(长短期记忆网络),使得此模块可以在不同尺度中间传递的隐藏信息。隐藏信息可以是不同尺度的图片之间的共同信息,比如不同尺度的图像的结构等信息。
需要说明的是,对于前述的各方法实施例,为了简单描述,故将其都表述为一系列的动作组合,但是本领域技术人员应该知悉,本申请实施例并不受所描述的动作顺序的限制,因为依据本申请实施例,某些步骤可以采用其他顺序或者同时进行。其次,本领域技术人员也应该知悉,说明书中所描述的实施例均属于优选实施例,所涉及的动作和模块并不一定是本申请实施例所必须的。
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到根据上述实施例的方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。基于这样的理解,本申请实施例的技术方案本质上或者说对相关技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质(如ROM/RAM、磁碟、光盘)中,包括若干指令用以使得一台终端设备(可以是手机,计算机,服务器,或者网络设备等)执行本申请各个实施例所述的方法。
根据本申请实施例的另一个方面,还提供了一种用于实施上述图像的处理方法的图像的处理装置。图13是根据本申请实施例的图像的处理装置的示意图。如图13所示,该装置包括一个或多个处理器,以及一个或 多个存储程序单元的存储器,其中,程序单元由处理器执行,程序单元可以包括:
第一获取单元1302,被设置为获取图像处理指令,其中,所述图像处理指令用于指示对目标模糊图像进行去模糊处理。
去模糊处理是指将模糊的图像处理为清晰的图像。如图3所示的图像左下角的图为图中小黄人后面的书籍的书脊字母的放大图,从左下角的图可以看出来,图像比较模糊,看不清楚书脊字母的内容。
如图4所示的图像左下角的图为图中小黄人后面的书籍的书脊字母的放大图。图4是对图3进行去模糊处理后得到的清晰图像,尤其对比图4和图3左下角,图4左下角的图比图3左下角的图更清晰,已经能够清晰的显示出书脊字母为“ROCESS”。
目标模糊图像可以是图3所示的图像,对图3所示的模糊图像进行去模糊处理得到图4所示的图像。去模糊处理的过程就是将上述图3所示的图像处理得到图4所示图像的过程。
第二获取单元1304,被设置为获取利用不同尺度的样本图像对原始模型进行训练得到的目标模型,其中,所述样本图像为合成图像,所述合成图像为对多张清晰图像进行合成处理得到的模糊图像,所述目标模型用于对模糊图像进行去模糊处理以得到清晰图像。
响应单元1306被设置为响应所述图像处理指令采用目标模型对所述目标模糊图像进行去模糊处理,以得到目标清晰图像。
将目标模糊图像输入到目标模型中,以便目标模型对目标模糊图像进行处理,输出目标清晰图像。目标清晰图像可以是图4所示的图像(d)。目标模型可以是神经网络模型,该目标模型通过训练原始模型得到。训练原始模型需要的样本图像是通过多张清晰图像合成的模糊图像,生成的模糊图像对应的清晰图像就是合成模糊图像之前的清晰图像,也就是说,合成图像可以作为样本图像,多张清晰图像是目标模型的训练目标。在得到 训练好的目标模型后,向目标模型中输入合成图像后可以输出与合成图像对应的清晰图像,该清晰图像可以是多张清晰图像中的任意一张。
可选地,所述装置还包括:第三获取单元,被设置为在获取利用不同尺度的样本图像对原始模型进行训练得到的目标模型之前,从帧画面集合中获取连续的多帧清晰图像,其中,所述帧画面集合为一段视频所有帧画面的集合;合并单元,被设置为对所述多帧清晰图像进行合并处理,得到所述样本图像,其中,所述样本图像为模糊图像。
图像中模糊数据的产生通常是由于拍摄时相机的运动或者场景中物体的运动。这两种模糊本质上都是由于快门速度过慢导致。在快门打开到关闭的短时间内由于相机的运动或者场景的位移导致了相机内部的图像传感器像素采集的不只是某个固定位置的亮度而是在这个时刻内相关位置所有亮度的积分。该积分在高速相机拍摄的图像中可以近似为相邻连续图像的求和。这使得利用高速相机模拟真实模糊图片具备了可行性。本实施例采用高速相机来采集高速视频以合成足够的模糊图片。因为训练层数较深的卷积网络需要大量的数据,本实施例获取大量模糊图像进行训练。该模糊图像可以是高速相机在240帧每秒的速度下拍摄的高速视频。本实施例的帧画面集合是高速视频所有帧画面的集合,例如,一个5秒的高速视频,帧画面集合包括240*5=1200帧画面,每个帧画面就是一个清晰的图像。从帧画面集合中选择连续的多帧清晰图像,该连续的多帧清晰图像可以是在几百毫秒内拍摄得到的图像,几百毫秒内拍摄得到的图像也可以包括几十到几百张清晰图像,可以对这些清晰图像进行合成得到样本图像,也可以对这些清晰图像中的部分图像进行合成得到样本图像。
可选地,所述合并单元包括:选择模块,被设置为从所述多帧清晰图像中随机选择部分图像;第二处理模块,被设置为对所述部分图像分别针对每个通道进行先求和再取平均的处理,得到一张模糊的图像;确定模块,被设置为将所述一张模糊的图像作为所述样本图像。
从连续的多帧清晰图像中随机选择部分进行合成,具体方式是对几帧 图像进行求和取平均的方法得到模糊图片。在求和时可以对图像的每个通道的数据分别进行求和,然后分别对每个通道的数据进行求平均的处理,求平均后得到的数据可以表示一个生成的模糊图像,即样本图像。
随机选择的部分图像可以生成多个模糊图像作为样本图像,例如,部分图像有20张,在合成样本图像时可以多次随机选择7-13张图像进行合成,每次选择7-13张图像就能得到一样模糊图像。比如,20张图像的编号依次为1、2、3、……20,第一次选择编号为1-4以及编号为10-13的图像进行合成,第二次可以选择编号为3、5、9、15、16、17、19和20的图像进行合成,每次选择的图片可以是随机的。
输出单元1308被设置为输出所述目标清晰图像。
本实施例中,由于用于训练目标模型的样本图像是根据真实拍摄的图像合成的,可以表示真实场景下模糊照片的特征,利用这些样本图像对原始模型进行训练得到的目标模型,可以对模糊图像进行去模糊处理,得到清晰的图像。相比利用卷积和等计算方式来生成模糊图像的方式,避免了生成模糊图像过程中先验假设与真实情况的差距,也就避免了相关技术中生成的模糊图像训练出的目标模型无法实现去模糊的技术问题,达到了对模糊图像进行去模糊得到清晰图像的技术效果。
可选地,所述装置包括:训练单元,被设置为在获取利用不同尺度的样本图像对原始模型进行训练得到的目标模型之前,重复调用以下模块以对所述原始模型进行训练,直到中间图像的尺度与所述合成图像的尺度相同,其中,当前尺度被初始化为所述合成图像的第一图像的尺度,当前模型被初始化为所述原始模型,中间图像被初始化为所述第一图像,所述第一图像是通过对所述合成图像进行目标倍数的降采样得到的模糊图像:
第一获取模块,被设置为从所述合成图像中获取尺度为所述当前尺度的第一图像;第一处理模块,被设置为使用所述当前模型对尺度为所述当前尺度的所述第一图像和所述中间图像进行去模糊处理,得到第二图像,其中,所述第二图像为与所述第一图像关联的清晰图像;放大模块,被设 置为对所述第二图像进行放大处理,得到第三图像,其中,所述中间图像被更新为所述第三图像;第一更新模块,被设置为将所述当前尺度更新为所述当前尺度的N倍,其中,N大于等于2;第二更新模块,被设置为将所述当前模型更新为第一模型,其中,所述第一模型为根据所述第一图像对所述原始模型进行训练得到的模型。
本实施例采用迭代深度神经网络模型来训练目标模型。利用图像不同尺度的图像进行训练。尺度可以理解为图像的分辨率。在训练进行图像去模糊的目标模型的过程中,采用从粗尺度到细尺度迭代的顺序。在最粗的尺度(图片下采样到最小,分辨率较小),图片认为是比较清晰的,本实施例以此作为出发点,优化出当前尺度的清晰图片并上采样此清晰图片作为下一尺度的输入来估计出下一尺度的清晰图片,直到输出图像的尺度与原始图像尺度相同。其中当前尺度模糊图片为原始模糊图片降采样到当前尺度大小。通过向待训练的模型输入当前尺度模糊图片和放大的上一尺度清晰图片优化得到当前尺度的清晰图片进行训练,以此作为迭代目标最终优化出原来尺度的清晰图片。因此,去模糊分解为一系列多尺度下的子问题:输入当前尺度的模糊图片和初步去模糊图片(初步去模糊图片由上一个尺度估计出的清晰图片上采样得到),估计出当前尺度下的清晰图片。其基本模型如下:
I i,h i=Net SR(B i,I i+1↑,h i+1↑;θ SR)
其中i代表当前尺度(1代表最细的尺度);B i代表尺度i下的模糊图片;I i代表尺度i下输出的清晰图片;h i代表尺度i下的隐藏状态特征其中隐含估计了每个位置的卷积核信息;Net SR是迭代神经网络;θ SR是迭代神经网络中所有卷积层的权重;“↑”代表对图片进行放大2倍的操作。这个公式的意义是对于尺度i,给定当前尺度的模糊图片B i和上采样上一尺度的清晰图片和隐藏状态作为神经网络的输入,输出当前尺度的清晰图片和隐藏状态。并以此从粗尺度到细尺度不断估计出清晰图像直到得到与样本图像 相同尺度的清晰图像。
神经网络中的隐藏状态有几种不同选择,包括循环神经网络RNN,长短期记忆网络LSTM,门控制循环单元GRU。本实施例可以采用LSTM作为表示隐藏层信息的方式。对于如何从尺度i+1估计出的清晰图片放大(即上采样上一尺度的清晰图片)作为下一尺度的输入,同样也有几种不同的选项,包括反卷积,缩放。基于效率和速度的考量,本实施例可以选择双线性插值缩放的方法。
例如:
样本图像的尺度为256*256,即水平方向和竖直方向各有256个像素,当前尺度为第一图像的尺度,该第一图像的尺度为64*64,第一图像为从样本图像降采样得到。降采样的方式可以是间隔采样,在样本图像减少采样点(例如间隔几个点进行采样)。样本图像是模糊图像,降采样后得到的第一图像也是模糊图像,步骤如下:
1、将作为样本图像的第一图像和作为中间图像的第一图像输入到原始模型中进行训练,输出初步去模糊处理得到的第二图像,第二图像的尺度为64*64,此时原始模型经过训练后更新为第一模型;第一图像为粗尺度的图像,第一图像和中间图像作为原始模型的输入图像,输出同样为粗尺度的第二图像,第二图像作为原始模型的输出图像。其中,第一模型和第二模型的网络结构相同,第一模型和第二模型的参数不同。
2、对第二图像进行放大处理,得到尺度为128*128的第三图像。放大处理可以采样插值上采样;
3、对样本图像进行降采样,得到尺度为128*128的第四图像。
4、将作为样本图像的第四图像和作为中间图像的第三图像输入到第一模型中进行训练,输出经过去模糊处理得到的第五图像,第五图像比第四图像更清晰,第五图像的尺度为128*128,此时第一模型经过训练后更新为第二模型;第三图像和第四图像为中尺度的图像,第三图像和第四图 像作为第一模型的输入图像,输出同样为中尺度的第五图像,第五图像为第一模型的输出图像。其中,第二模型和第一模型的网络结构相同,第二模型和第一模型的参数不同。
5、对第五图像进行放大处理,得到尺度为256*256的第六图像。放大处理可以采样插值上采样;
6、对样本图像和第六图像输入到第二模型中进行训练,输出经过去模糊处理得到的第七图像,此时第二模型经过训练后更新为第三模型。第七图像的尺度与样本图像的尺度相同。将样本图像更新为新的图像,继续利用更新后的样本图像进行训练,直到训练集中的所有图像都完成训练。在训练集中的所有图像都完成训练后得到的模型作为目标模型。其中,第六图像和第七图像为细尺度的图像,第六图像和第七图像作为第二模型的输入图像,输出同样为细尺度的图像,输出的图像的尺度与样本图像的尺度相同。
此处尺度的倍数关系为2,需要说明的是,实际训练的过程中,可以采用不同的倍数关系。本实施例的样本图像的尺度可以更大,例如1024*1024,从样本图像上取出一部分图像对原始模型进行训练,可以节约训练模型所需的内存空间。
可选地,所述当前模型包括编解码网络,所述编解码网络包括编码网络和解码网络,第一处理模块包括:编码子模块,被设置为使用所述编码网络对所述第一图像和所述中间图像进行编码处理,得到第一结果,其中,所述编码网络的两层卷积还包括残差单元,所述残差单元用于将所述两层卷积计算之前的数据添加到所述两层卷积计算之后的数据中;解码子模块,被设置为使用所述解码网络对所述编码网络输出的所述第一结果进行解码处理,得到所述第二图像,其中,所述解码网络的两层卷积包括所述残差单元。
当前模型如图5所示,图5示出了当前模型中的3个编解码网络,图5从输入B3到输出I3是一个编解码网络,从输入B2到输出I2是一个编 解码网络,从输入B1到输出I1是一个编解码网络。每个编解码网络可以对一个图像进行去模糊处理,编码网络和解码网络中的每两层卷积包括残差单元,残差单元的示意图如图6所示。本实施例将编解码网络中降维卷积或者升维卷积层之后的非线性卷积替换为残差单元,保证编码网络和解码网络中每个空间维度下的残差单元数目一致。残差单元可以计算编解码网络中一个块的输出和输入的差值,使得计算量变小,更容易学习,优化网络的学习能力。再通过跳跃连接编码网络和解码网络对应的特征可以进一步优化网络的学习能力。
图5的编解码网络是对称的网络,包括编码网络和解码网络,编码网络可以对模糊图像进行编码处理,并将编码处理后的第一结果输出给解码网络,由解码网络进行处理,以输出清晰的图片,编码和解码的过程实现了去模糊处理。
本实施例的编解码网络的结构,如图5所示,可以分解为三个模块,分别为编码网络Net E(包括图5中的输入块、E块#1和E块#2),隐藏层单元ConvLSTM(图5所示的LSTM),解码网络Net D(包括图5中的输出块、D块#1和D块#2),依次采用以下公式表示:
f i=Net E(B i,I i+1↑;θ E)
h i,g i=ConvLSTM(h i+1↑,f i;θ LSTM)
I i=Net D(g i;θ D)
其中,f i表示第i个尺度的编码特征,B i为第i个尺度下的模糊图片,I i+1为第i个尺度的上一个尺度输出的清晰图像的放大图,h i表示第i个尺度的隐藏信息,h i+1表示第i个尺度的上一个尺度的隐藏信息,g i表示对f优化后的结果,θ E、θ LSTM、θ D分别代表编码网络Net E中所有卷积层的权重、隐藏层单元ConvLSTM中所有卷积层的权重、解码网络Net D中所有卷积层的权重,“↑”代表对图片进行放大2倍的操作,其中,编码网络和解码网络都包含了残差单元来增加网络学***衡去模糊效果和计算代价。
以下结合图5对本实施例进行说明。
样本图像的尺度为1000*2000,即水平方向上有1000个像素,竖直方向有2000个像素,当前尺度为第一图像的尺度,该第一图像的尺度为250*500,第一图像为从样本图像降采样得到。降采样的方式可以是间隔采样,在样本图像减少采样点(例如间隔几个点进行采样)。样本图像是模糊图像,降采样后得到的第一图像也是模糊图像,步骤如下:
1、将作为样本图像的第一图像和作为中间图像的第一图像作为输入B 3输入到原始模型中进行训练,输出初步去模糊处理得到的第二图像I 3,第二图像的尺度为250*500,此时原始模型经过训练后更新为第一模型;
2、对第二图像进行放大处理,得到尺度为500*1000的第三图像。放大处理可以采样插值上采样;
3、对样本图像进行降采样,得到尺度为500*1000的第四图像。
4、将作为样本图像的第四图像和作为中间图像的第三图像作为输入B 2输入到第一模型中进行训练,输出经过去模糊处理得到的第五图像I 2,第五图像比第四图像更清晰,第五图像的尺度为500*1000,此时第一模型经过训练后更新为第二模型;
5、对第五图像进行放大处理,得到尺度为1000*2000的第六图像。放大处理可以采样插值上采样;
6、对样本图像和第六图像作为输入B 1输入到第二模型中进行训练,输出经过去模糊处理得到的第七图像I 1,此时第二模型经过训练后更新为第三模型。第七图像的尺度与样本图像的尺度相同,结束训练。
结合图7和图8,图7的图像可以作为输入原始模型的样本图像,图8所示的图像可以作为第七图像。
结合图9和图10,图9的图像可以作为输入原始模型的样本图像,图 10所示的图像可以作为第七图像。
结合图11和图12,图11的图像可以作为输入原始模型的样本图像,图12所示的图像可以作为第七图像。
本实施例中,通过深度迭代神经网络模型对原始模型进行训练得到目标模型,在训练过程中,将上一个尺度得到的清晰图像放大后作为当前尺度的输入,结合当前尺度的模糊图片进行训练,以得到目标模型,利用目标模型将模糊图像去模糊处理得到清晰图像。
可选地,所述第一处理模块包括:获取子模块,被设置为获取不同尺度的图像的固有信息,其中,所述固有信息通过所述当前模型中的递归神经网络在处理不同尺度的编解码网络之间传输;处理子模块,被设置为使用所述编解码网络、结合所述固有信息对尺度为所述当前尺度的所述第一图像和所述中间图像进行去模糊处理,得到第二图像。
本实施例需要在不同尺度之间传递关于模糊的隐藏信息,编解码网络内部需要添加相应处理迭代信息的模块。如图5所示,在解码网络中间的位置添加了LSTM模块(长短期记忆网络),使得此模块可以在不同尺度中间传递的隐藏信息。隐藏信息可以是不同尺度的图片之间的共同信息,比如不同尺度的图像的结构等信息。
根据本申请实施例的又一个方面,还提供了一种用于实施上述图像的处理方法的电子装置,该电子装置可以是图1所示的终端101,也可以是服务器102,如图14所示,该电子装置包括,包括存储器和处理器,该存储器中存储有计算机程序,该处理器被设置为通过计算机程序执行上述任一项方法实施例中的步骤。
可选地,图14是根据本申请实施例的一种电子装置的结构框图。如图14所示,该电子装置可以包括:一个或多个(图中仅示出一个)处理器1401、至少一个通信总线1402、用户接口1403、至少一个传输装置1404和存储器1405。其中,通信总线1402用于实现这些组件之间的连接通信。其中,用户接口1403可以包括显示器1406和键盘1407。传输装置1404 可选的可以包括标准的有线接口和无线接口。
可选地,在本实施例中,上述电子装置可以位于计算机网络的多个网络设备中的至少一个网络设备。
可选地,在本实施例中,上述处理器可以被设置为通过计算机程序执行以下步骤:
S1,获取图像处理指令,其中,所述图像处理指令用于指示对目标模糊图像进行去模糊处理;
S2,获取利用不同尺度的样本图像对原始模型进行训练得到的目标模型,其中,所述样本图像为合成图像,所述合成图像为对多张清晰图像进行合成处理得到的模糊图像,所述目标模型用于对模糊图像进行去模糊处理以得到清晰图像;
S3,响应所述图像处理指令采用目标模型对所述目标模糊图像进行去模糊处理,以得到目标清晰图像;
S4,输出所述目标清晰图像。
可选地,本领域普通技术人员可以理解,图14所示的结构仅为示意,电子装置也可以是智能手机(如Android手机、iOS手机等)、平板电脑、掌上电脑以及移动互联网设备(Mobile Internet Devices,MID)、PAD等终端设备。图14其并不对上述电子装置的结构造成限定。例如,电子装置还可包括比图14中所示更多或者更少的组件(如网络接口、显示装置等),或者具有与图14所示不同的配置。
其中,存储器1405可被设置为存储软件程序以及模块,如本申请实施例中的图像的处理方法和装置对应的程序指令/模块,处理器1401通过运行存储在存储器1405内的软件程序以及模块,从而执行各种功能应用以及数据处理,即实现上述的图像的处理方法。存储器1405可包括高速随机存储器,还可以包括非易失性存储器,如一个或者多个磁性存储装置、闪存、或者其他非易失性固态存储器。在一些实例中,存储器1405可进 一步包括相对于处理器1401远程设置的存储器,这些远程存储器可以通过网络连接至终端。上述网络的实例包括但不限于互联网、企业内部网、局域网、移动通信网及其组合。
上述的传输装置1404被设置为经由一个网络接收或者发送数据。上述的网络具体实例可包括有线网络及无线网络。在一个实例中,传输装置1404包括一个网络适配器(Network Interface Controller,NIC),其可通过网线与其他网络设备与路由器相连从而可与互联网或局域网进行通讯。在一个实例中,传输装置1404为射频(Radio Frequency,RF)模块,其被设置为通过无线方式与互联网进行通讯。
其中,具体地,存储器1405被设置为存储样本图像。
本申请的实施例还提供了一种存储介质,该存储介质可以是非暂态计算机可读存储介质,该存储介质中存储有计算机程序,其中,该计算机程序被设置为运行时执行上述任一项方法实施例中的步骤。
可选地,在本实施例中,上述存储介质可以被设置为存储用于执行以下步骤的计算机程序:
S1,获取图像处理指令,其中,所述图像处理指令用于指示对目标模糊图像进行去模糊处理;
S2,获取利用不同尺度的样本图像对原始模型进行训练得到的目标模型,其中,所述样本图像为合成图像,所述合成图像为对多张清晰图像进行合成处理得到的模糊图像,所述目标模型用于对模糊图像进行去模糊处理以得到清晰图像;
S3,响应所述图像处理指令采用目标模型对所述目标模糊图像进行去模糊处理,以得到目标清晰图像;
S4,输出所述目标清晰图像。
可选地,存储介质还被设置为存储用于执行以下步骤的计算机程序:
重复执行以下步骤以对所述原始模型进行训练,直到中间图像的尺度与所述合成图像的尺度相同,其中,当前尺度被初始化为所述合成图像的第一图像的尺度,当前模型被初始化为所述原始模型,中间图像被初始化为所述第一图像,所述第一图像是通过对所述合成图像进行目标倍数的降采样得到的模糊图像:从所述合成图像中获取尺度为所述当前尺度的第一图像;使用所述当前模型对尺度为所述当前尺度的所述第一图像和所述中间图像进行去模糊处理,得到第二图像,其中,所述第二图像为与所述第一图像关联的清晰图像;对所述第二图像进行放大处理,得到第三图像,其中,所述中间图像被更新为所述第三图像;将所述当前尺度更新为所述当前尺度的N倍,其中,N大于等于2;将所述当前模型更新为第一模型,其中,所述第一模型为根据所述第一图像对所述原始模型进行训练得到的模型。
可选地,存储介质还被设置为存储用于执行以下步骤的计算机程序:从帧画面集合中获取连续的多帧清晰图像,其中,所述帧画面集合为一段视频所有帧画面的集合;对所述多帧清晰图像进行合并处理,得到所述样本图像,其中,所述样本图像为模糊图像。
可选地,存储介质还被设置为存储用于执行以下步骤的计算机程序:从所述多帧清晰图像中随机选择部分图像;对所述部分图像分别针对每个通道进行先求和再取平均的处理,得到一张模糊的图像;将所述一张模糊的图像作为所述样本图像。
可选地,存储介质还被设置为存储用于执行以下步骤的计算机程序:使用所述编码网络对所述第一图像和所述中间图像进行编码处理,得到第一结果,其中,所述编码网络的两层卷积还包括残差单元,所述残差单元用于将所述两层卷积计算之前的数据添加到所述两层卷积计算之后的数据中;使用所述解码网络对所述编码网络输出的所述第一结果进行解码处理,得到所述第二图像,其中,所述解码网络的两层卷积包括所述残差单元。
可选地,存储介质还被设置为存储用于执行以下步骤的计算机程序:获取不同尺度的图像的固有信息,其中,所述固有信息通过所述当前模型中的递归神经网络在处理不同尺度的编解码网络之间传输;使用所述编解码网络、结合所述固有信息对尺度为所述当前尺度的所述第一图像和所述中间图像进行去模糊处理,得到第二图像。
可选地,存储介质还被设置为存储用于执行上述实施例中的方法中所包括的步骤的计算机程序,本实施例中对此不再赘述。
可选地,在本实施例中,本领域普通技术人员可以理解上述实施例的各种方法中的全部或部分步骤是可以通过程序来指令终端设备相关的硬件来完成,该程序可以存储于一计算机可读存储介质中,存储介质可以包括:闪存盘、只读存储器(Read-Only Memory,ROM)、随机存取器(Random Access Memory,RAM)、磁盘或光盘等。
上述实施例中的集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在上述计算机可读取的存储介质中。基于这样的理解,本申请实施例的技术方案本质上或者说对相关技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在存储介质中,包括若干指令用以使得一台或多台计算机设备(可为个人计算机、服务器或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。
在本申请的上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。
在本申请所提供的几个实施例中,应该理解到,所揭露的客户端,可通过其它的方式实现。其中,以上所描述的装置实施例仅仅是示意性的,例如所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个***,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,单元或模块的间接耦合或 通信连接,可以是电性或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
以上所述仅是本申请的优选实施方式,应当指出,对于本技术领域的普通技术人员来说,在不脱离本申请实施例原理的前提下,还可以做出若干改进和润饰,这些改进和润饰也应视为本申请实施例的保护范围。
工业实用性
在本申请实施例中,由于用于训练目标模型的样本图像是根据真实拍摄的图像合成的,可以表示真实场景下模糊照片的特征,利用这些样本图像对原始模型进行训练得到的目标模型,可以对模糊图像进行去模糊处理,得到清晰的图像。相比利用卷积核等计算方式来生成模糊图像的方式,避免了生成模糊图像过程中先验假设与真实情况的差距,也就避免了相关技术中生成的模糊图像训练出的目标模型无法实现去模糊的技术问题,达到了对模糊图像进行去模糊得到清晰图像的技术效果。

Claims (14)

  1. 一种图像的处理方法,包括:
    终端设备获取图像处理指令,其中,所述图像处理指令用于指示对目标模糊图像进行去模糊处理;
    所述终端设备获取利用不同尺度的样本图像对原始模型进行训练得到的目标模型,其中,所述样本图像为合成图像,所述合成图像为对多张清晰图像进行合成处理得到的模糊图像,所述目标模型用于对模糊图像进行去模糊处理以得到清晰图像;
    所述终端设备响应所述图像处理指令采用目标模型对所述目标模糊图像进行去模糊处理,以得到目标清晰图像
    所述终端设备输出所述目标清晰图像。
  2. 根据权利要求1所述的方法,其中,在所述终端设备获取利用不同尺度的样本图像对原始模型进行训练得到的目标模型之前,所述方法包括:
    所述终端设备重复执行以下步骤以对所述原始模型进行训练,直到中间图像的尺度与所述合成图像的尺度相同,其中,当前尺度被初始化为所述合成图像的第一图像的尺度,当前模型被初始化为所述原始模型,中间图像被初始化为所述第一图像,所述第一图像是通过对所述合成图像进行目标倍数的降采样得到的模糊图像:
    所述终端设备从所述合成图像中获取尺度为所述当前尺度的第一图像;
    所述终端设备使用所述当前模型对尺度为所述当前尺度的所述第一图像和所述中间图像进行去模糊处理,得到第二图像,其中,所述第二图像为与所述第一图像关联的清晰图像;
    所述终端设备对所述第二图像进行放大处理,得到第三图像,其中,所述中间图像被更新为所述第三图像;
    所述终端设备将所述当前尺度更新为所述当前尺度的N倍,其中,N大于等于2;
    所述终端设备将所述当前模型更新为第一模型,其中,所述第一模型为根据所述第一图像对所述原始模型进行训练得到的模型。
  3. 根据权利要求1所述的方法,其中,在所述终端设备获取利用不同尺度的样本图像对原始模型进行训练得到的目标模型之前,所述方法还包括:
    所述终端设备从帧画面集合中获取连续的多帧清晰图像,其中,所述帧画面集合为一段视频所有帧画面的集合;
    所述终端设备对所述多帧清晰图像进行合并处理,得到所述样本图像,其中,所述样本图像为模糊图像。
  4. 根据权利要求3所述的方法,其中,对所述多帧清晰图像进行合并处理,得到所述样本图像包括:
    从所述多帧清晰图像中随机选择部分图像;
    对所述部分图像分别针对每个通道进行先求和再取平均的处理,得到一张模糊的图像;
    将所述一张模糊的图像作为所述样本图像。
  5. 根据权利要求2所述的方法,其中,所述当前模型包括编解码网络,所述编解码网络包括编码网络和解码网络,使用所述当前模型对尺度为所述当前尺度的所述第一图像和所述中间图像进行去模糊处理,得到第二图像包括:
    使用所述编码网络对所述第一图像和所述中间图像进行编码处理,得到第一结果,其中,所述编码网络的两层卷积还包括残差单元,所述残差单元用于将所述两层卷积计算之前的数据添加到所述两层卷积计算之后的数据中;
    使用所述解码网络对所述编码网络输出的所述第一结果进行解码处理,得到所述第二图像,其中,所述解码网络的两层卷积包括所 述残差单元。
  6. 根据权利要求2所述的方法,其中,使用所述当前模型对尺度为所述当前尺度的所述第一图像和所述中间图像进行去模糊处理,得到第二图像包括:
    获取不同尺度的图像的固有信息,其中,所述固有信息通过所述当前模型中的递归神经网络在处理不同尺度的编解码网络之间传输;
    使用所述编解码网络、结合所述固有信息对尺度为所述当前尺度的所述第一图像和所述中间图像进行去模糊处理,得到第二图像。
  7. 一种图像的处理装置,包括一个或多个处理器,以及一个或多个存储程序单元的存储器,其中,所述程序单元由所述处理器执行,所述程序单元包括:
    第一获取单元,被设置为获取图像处理指令,其中,所述图像处理指令用于指示对目标模糊图像进行去模糊处理;
    第二获取单元,被设置为获取利用不同尺度的样本图像对原始模型进行训练得到的目标模型,其中,所述样本图像为合成图像,所述合成图像为对多张清晰图像进行合成处理得到的模糊图像,所述目标模型用于对模糊图像进行去模糊处理以得到清晰图像;
    响应单元,被设置为响应所述图像处理指令采用目标模型对所述目标模糊图像进行去模糊处理,以得到目标清晰图像;
    输出单元,被设置为输出所述目标清晰图像。
  8. 根据权利要求7所述的装置,其中,所述装置包括:
    训练单元,被设置为在获取利用不同尺度的样本图像对原始模型进行训练得到的目标模型之前,重复调用以下模块以对所述原始模型进行训练,直到中间图像的尺度与所述合成图像的尺度相同,其中,当前尺度被初始化为所述合成图像的第一图像的尺度,当前模型被初始化为所述原始模型,中间图像被初始化为所述第一图像,所述第一图像是通过对所述合成图像进行目标倍数的降采样得到的模糊图像:
    第一获取模块,被设置为从所述合成图像中获取尺度为所述当前尺度的第一图像;
    第一处理模块,被设置为使用所述当前模型对尺度为所述当前尺度的所述第一图像和所述中间图像进行去模糊处理,得到第二图像,其中,所述第二图像为与所述第一图像关联的清晰图像;
    放大模块,被设置为对所述第二图像进行放大处理,得到第三图像,其中,所述中间图像被更新为所述第三图像;
    第一更新模块,被设置为将所述当前尺度更新为所述当前尺度的N倍,其中,N大于等于2;
    第二更新模块,被设置为将所述当前模型更新为第一模型,其中,所述第一模型为根据所述第一图像对所述原始模型进行训练得到的模型。
  9. 根据权利要求7所述的装置,其中,所述装置还包括:
    第三获取单元,被设置为在获取利用不同尺度的样本图像对原始模型进行训练得到的目标模型之前,从帧画面集合中获取连续的多帧清晰图像,其中,所述帧画面集合为一段视频所有帧画面的集合;
    合并单元,被设置为对所述多帧清晰图像进行合并处理,得到所述样本图像,其中,所述样本图像为模糊图像。
  10. 根据权利要求9所述的装置,其中,所述合并单元包括:
    选择模块,被设置为从所述多帧清晰图像中随机选择部分图像;
    第二处理模块,被设置为对所述部分图像分别针对每个通道进行先求和再取平均的处理,得到一张模糊的图像;
    确定模块,被设置为将所述一张模糊的图像作为所述样本图像。
  11. 根据权利要求8所述的装置,其中,所述当前模型包括编解码网络,所述编解码网络包括编码网络和解码网络,第一处理模块包括:
    编码子模块,被设置为使用所述编码网络对所述第一图像和所述中间图像进行编码处理,得到第一结果,其中,所述编码网络的两层 卷积还包括残差单元,所述残差单元被设置为将所述两层卷积计算之前的数据添加到所述两层卷积计算之后的数据中;
    解码子模块,被设置为使用所述解码网络对所述编码网络输出的所述第一结果进行解码处理,得到所述第二图像,其中,所述解码网络的两层卷积包括所述残差单元。
  12. 根据权利要求8所述的装置,其中,所述第一处理模块包括:
    获取子模块,被设置为获取不同尺度的图像的固有信息,其中,所述固有信息通过所述当前模型中的递归神经网络在处理不同尺度的编解码网络之间传输;
    处理子模块,被设置为使用所述编解码网络、结合所述固有信息对尺度为所述当前尺度的所述第一图像和所述中间图像进行去模糊处理,得到第二图像。
  13. 一种非暂态计算机可读存储介质,所述存储介质中存储有计算机程序,其中,所述计算机程序被设置为运行时执行所述权利要求1至6任一项中所述的方法。
  14. 一种电子装置,包括存储器和处理器,所述存储器中存储有计算机程序,所述处理器被设置为通过所述计算机程序执行所述权利要求1至6任一项中所述的方法。
PCT/CN2019/079332 2018-04-04 2019-03-22 图像的处理方法、装置、存储介质和电子装置 WO2019192338A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP19782211.7A EP3754591A4 (en) 2018-04-04 2019-03-22 IMAGE PROCESSING METHOD AND DEVICE, STORAGE MEDIUM AND ELECTRONIC DEVICE
US16/934,823 US11354785B2 (en) 2018-04-04 2020-07-21 Image processing method and device, storage medium and electronic device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810301685.1 2018-04-04
CN201810301685.1A CN108629743B (zh) 2018-04-04 2018-04-04 图像的处理方法、装置、存储介质和电子装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/934,823 Continuation US11354785B2 (en) 2018-04-04 2020-07-21 Image processing method and device, storage medium and electronic device

Publications (1)

Publication Number Publication Date
WO2019192338A1 true WO2019192338A1 (zh) 2019-10-10

Family

ID=63704674

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/079332 WO2019192338A1 (zh) 2018-04-04 2019-03-22 图像的处理方法、装置、存储介质和电子装置

Country Status (4)

Country Link
US (1) US11354785B2 (zh)
EP (1) EP3754591A4 (zh)
CN (1) CN108629743B (zh)
WO (1) WO2019192338A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340722A (zh) * 2020-02-20 2020-06-26 Oppo广东移动通信有限公司 图像处理方法、处理装置、终端设备及可读存储介质
CN112488943A (zh) * 2020-12-02 2021-03-12 北京字跳网络技术有限公司 模型训练和图像去雾方法、装置、设备
US11568518B2 (en) 2019-12-11 2023-01-31 Samsung Electronics Co., Ltd. Method and electronic device for deblurring blurred image
CN116542884A (zh) * 2023-07-07 2023-08-04 合肥市正茂科技有限公司 模糊图像清晰化模型的训练方法、装置、设备及介质

Families Citing this family (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108629743B (zh) 2018-04-04 2022-03-25 腾讯科技(深圳)有限公司 图像的处理方法、装置、存储介质和电子装置
CN109525859B (zh) * 2018-10-10 2021-01-15 腾讯科技(深圳)有限公司 模型训练、图像发送、图像处理方法及相关装置设备
CN109360171B (zh) * 2018-10-26 2021-08-06 北京理工大学 一种基于神经网络的视频图像实时去模糊方法
CN109410146A (zh) * 2018-11-02 2019-03-01 北京大学深圳研究生院 一种基于Bi-Skip-Net的图像去模糊算法
CN112889069B (zh) * 2018-11-08 2024-04-05 Oppo广东移动通信有限公司 用于提高低照度图像质量的方法、***和计算机可读介质
CN110147864B (zh) 2018-11-14 2022-02-22 腾讯科技(深圳)有限公司 编码图案的处理方法和装置、存储介质、电子装置
CN109598298B (zh) * 2018-11-29 2021-06-04 上海皓桦科技股份有限公司 图像物体识别方法和***
CN111507931B (zh) * 2019-01-14 2023-04-18 阿里巴巴集团控股有限公司 一种数据处理方法和装置
CN109816659B (zh) * 2019-01-28 2021-03-23 北京旷视科技有限公司 图像分割方法、装置及***
CN109993712B (zh) * 2019-04-01 2023-04-25 腾讯科技(深圳)有限公司 图像处理模型的训练方法、图像处理方法及相关设备
CN113992848A (zh) * 2019-04-22 2022-01-28 深圳市商汤科技有限公司 视频图像处理方法及装置
US11126895B2 (en) * 2019-05-22 2021-09-21 Lawrence Livermore National Security, Llc Mimicking of corruption in images
US10984507B2 (en) * 2019-07-17 2021-04-20 Harris Geospatial Solutions, Inc. Image processing system including training model based upon iterative blurring of geospatial images and related methods
CN110443310B (zh) * 2019-08-07 2022-08-09 浙江大华技术股份有限公司 比对分析***的更新方法、服务器及计算机存储介质
CN112468830A (zh) * 2019-09-09 2021-03-09 阿里巴巴集团控股有限公司 视频图像处理方法、装置及电子设备
JP7455542B2 (ja) * 2019-09-27 2024-03-26 キヤノン株式会社 画像処理方法、プログラム、画像処理装置、学習済みモデルの製造方法、および、画像処理システム
JP7377048B2 (ja) * 2019-09-30 2023-11-09 キヤノン株式会社 画像処理装置及び方法、及び撮像装置
US11893482B2 (en) * 2019-11-14 2024-02-06 Microsoft Technology Licensing, Llc Image restoration for through-display imaging
CN110895801A (zh) * 2019-11-15 2020-03-20 北京金山云网络技术有限公司 图像处理方法、装置、设备及存储介质
US11348291B2 (en) * 2019-11-29 2022-05-31 Shanghai United Imaging Intelligence Co., Ltd. System and method for reconstructing magnetic resonance images
CN110971895B (zh) * 2019-12-18 2022-07-08 北京百度网讯科技有限公司 视频抖动检测方法和装置
CN111340694B (zh) * 2020-02-07 2023-10-27 腾讯科技(深圳)有限公司 图像处理方法、装置、计算机可读存储介质和计算机设备
CN113452898B (zh) * 2020-03-26 2023-07-18 华为技术有限公司 一种拍照方法及装置
CN111885297B (zh) * 2020-06-16 2022-09-06 北京迈格威科技有限公司 图像清晰度的确定方法、图像对焦方法及装置
CN111985565B (zh) * 2020-08-20 2023-01-10 上海风秩科技有限公司 图片分析方法和装置、存储介质及电子设备
KR102336103B1 (ko) * 2020-09-09 2021-12-08 울산과학기술원 딥러닝 기반 영상 디블러링 방법 및 이를 수행하는 장치
CN112102205B (zh) * 2020-10-15 2024-02-09 平安科技(深圳)有限公司 图像去模糊方法、装置、电子设备及存储介质
CN112053308B (zh) * 2020-10-22 2023-05-26 华润数字科技有限公司 一种图像去模糊方法、装置、计算机设备及存储介质
CN112419201A (zh) * 2020-12-04 2021-02-26 珠海亿智电子科技有限公司 一种基于残差网络的图像去模糊方法
CN112561826A (zh) * 2020-12-22 2021-03-26 杭州趣链科技有限公司 基于人工智能的图像去模糊方法、装置、设备及存储介质
CN112907450B (zh) * 2021-03-24 2023-01-06 东莞中国科学院云计算产业技术创新与育成中心 三维时序图像处理方法、装置、计算机设备和存储介质
CN113111886B (zh) * 2021-04-19 2023-03-24 太原科技大学 一种基于双层残差网络的交通场景图像语义分割方法
CN113139942B (zh) * 2021-04-21 2023-10-31 Oppo广东移动通信有限公司 图像处理模型的训练方法、装置、电子设备及存储介质
CN113256785B (zh) * 2021-05-31 2023-04-04 北京字跳网络技术有限公司 图像处理方法、装置、设备及介质
CN113724159A (zh) * 2021-08-18 2021-11-30 北京工业大学 基于模糊等级的花粉图像去模糊方法、装置、设备及介质
CN113627210A (zh) * 2021-08-19 2021-11-09 南京华捷艾米软件科技有限公司 条形码图像的生成方法、装置、电子设备及存储介质
CN114943639B (zh) * 2022-05-24 2023-03-28 北京瑞莱智慧科技有限公司 图像获取方法、相关装置及存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103854268A (zh) * 2014-03-26 2014-06-11 西安电子科技大学 基于多核高斯过程回归的图像超分辨重建方法
CN106296597A (zh) * 2016-07-25 2017-01-04 天津大学 一种基于最优化颜色修正和回归模型的水下图像复原方法
CN107220612A (zh) * 2017-05-19 2017-09-29 天津工业大学 以关键点局部邻域的高频分析为核心的模糊人脸判别方法
US20170365046A1 (en) * 2014-08-15 2017-12-21 Nikon Corporation Algorithm and device for image processing
CN107871310A (zh) * 2017-10-26 2018-04-03 武汉大学 一种基于模糊核精细化的单幅图像盲去运动模糊方法
CN108629743A (zh) * 2018-04-04 2018-10-09 腾讯科技(深圳)有限公司 图像的处理方法、装置、存储介质和电子装置

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7440634B2 (en) * 2003-06-17 2008-10-21 The Trustees Of Columbia University In The City Of New York Method for de-blurring images of moving objects
JP6870076B2 (ja) * 2016-09-26 2021-05-12 グーグル エルエルシーGoogle LLC ニューラル機械翻訳システム
CN107730459B (zh) * 2017-09-20 2022-12-06 大连理工大学 一种基于非线性动态***的图像去模糊方法

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103854268A (zh) * 2014-03-26 2014-06-11 西安电子科技大学 基于多核高斯过程回归的图像超分辨重建方法
US20170365046A1 (en) * 2014-08-15 2017-12-21 Nikon Corporation Algorithm and device for image processing
CN106296597A (zh) * 2016-07-25 2017-01-04 天津大学 一种基于最优化颜色修正和回归模型的水下图像复原方法
CN107220612A (zh) * 2017-05-19 2017-09-29 天津工业大学 以关键点局部邻域的高频分析为核心的模糊人脸判别方法
CN107871310A (zh) * 2017-10-26 2018-04-03 武汉大学 一种基于模糊核精细化的单幅图像盲去运动模糊方法
CN108629743A (zh) * 2018-04-04 2018-10-09 腾讯科技(深圳)有限公司 图像的处理方法、装置、存储介质和电子装置

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11568518B2 (en) 2019-12-11 2023-01-31 Samsung Electronics Co., Ltd. Method and electronic device for deblurring blurred image
CN111340722A (zh) * 2020-02-20 2020-06-26 Oppo广东移动通信有限公司 图像处理方法、处理装置、终端设备及可读存储介质
CN111340722B (zh) * 2020-02-20 2023-05-26 Oppo广东移动通信有限公司 图像处理方法、处理装置、终端设备及可读存储介质
CN112488943A (zh) * 2020-12-02 2021-03-12 北京字跳网络技术有限公司 模型训练和图像去雾方法、装置、设备
CN112488943B (zh) * 2020-12-02 2024-02-02 北京字跳网络技术有限公司 模型训练和图像去雾方法、装置、设备
CN116542884A (zh) * 2023-07-07 2023-08-04 合肥市正茂科技有限公司 模糊图像清晰化模型的训练方法、装置、设备及介质
CN116542884B (zh) * 2023-07-07 2023-10-13 合肥市正茂科技有限公司 模糊图像清晰化模型的训练方法、装置、设备及介质

Also Published As

Publication number Publication date
US20200349680A1 (en) 2020-11-05
EP3754591A1 (en) 2020-12-23
US11354785B2 (en) 2022-06-07
CN108629743B (zh) 2022-03-25
CN108629743A (zh) 2018-10-09
EP3754591A4 (en) 2021-11-10

Similar Documents

Publication Publication Date Title
WO2019192338A1 (zh) 图像的处理方法、装置、存储介质和电子装置
Sun et al. Learned image downscaling for upscaling using content adaptive resampler
TWI766175B (zh) 單目圖像深度估計方法、設備及儲存介質
CN110324664B (zh) 一种基于神经网络的视频补帧方法及其模型的训练方法
CN108694705B (zh) 一种多帧图像配准与融合去噪的方法
US10846836B2 (en) View synthesis using deep convolutional neural networks
CN111784578A (zh) 图像处理、模型训练方法及装置、设备、存储介质
US8594464B2 (en) Adaptive super resolution for video enhancement
US20220222776A1 (en) Multi-Stage Multi-Reference Bootstrapping for Video Super-Resolution
CN113066017B (zh) 一种图像增强方法、模型训练方法及设备
CN110956219B (zh) 视频数据的处理方法、装置和电子***
Islam et al. Super-resolution enhancement technique for low resolution video
KR20180128888A (ko) 지각 다운스케일링 방법을 사용하여 이미지를 다운스케일링하기 위한 이미지 처리 시스템
CN111259841B (zh) 一种图像处理方法及相关设备
CN113688907B (zh) 模型训练、视频处理方法,装置,设备以及存储介质
CN108876716B (zh) 超分辨率重建方法及装置
Simpkins et al. An introduction to super-resolution imaging
CN112801876A (zh) 信息处理方法、装置及电子设备和存储介质
US20230060988A1 (en) Image processing device and method
Miao et al. Snapshot compressive imaging using domain-factorized deep video prior
CN116486009A (zh) 单目三维人体重建方法、装置以及电子设备
US11871145B2 (en) Optimization of adaptive convolutions for video frame interpolation
US20230316463A1 (en) Filter for temporal noise reduction
US20230169326A1 (en) Method and apparatus for generating paired low resolution and high resolution images using a generative adversarial network
CN114140363B (zh) 视频去模糊方法及装置、视频去模糊模型训练方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19782211

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2019782211

Country of ref document: EP

Effective date: 20200918

NENP Non-entry into the national phase

Ref country code: DE