WO2021115180A1 - Sample image processing method and apparatus, electronic device, and medium - Google Patents

Sample image processing method and apparatus, electronic device, and medium

Info

Publication number
WO2021115180A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
video
sequence
processing
images
Application number
PCT/CN2020/133408
Other languages
French (fr)
Chinese (zh)
Inventor
成超
蔡媛
樊鸿飞
汪贤
鲁方波
Original Assignee
北京金山云网络技术有限公司
Application filed by 北京金山云网络技术有限公司
Publication of WO2021115180A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/36: Scalability techniques involving formatting the layers as a function of picture distortion after decoding, e.g. signal-to-noise [SNR] scalability
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00: Image enhancement or restoration
    • G06T5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression

Definitions

  • This application relates to the field of computer technology, in particular to a sample image processing method, device, electronic equipment, and medium.
  • The noise added by the above-mentioned traditional noise-adding methods differs greatly from the noise caused by video encoding and decoding. As a result, the samples used in training are inconsistent with the images the deep learning model actually processes, and the trained deep learning model has a poor ability to repair video.
  • The purpose of the embodiments of the present application is to provide a sample image processing method, device, electronic device, and medium, so that the trained deep learning model has a better ability to repair videos.
  • the specific technical solutions are as follows:
  • An embodiment of the present application provides a sample image processing method, which is applied to an electronic device. The method includes: copying the original image into a first number of original images, arranging the first number of original images into an image sequence, and performing image deformation processing on the images in the image sequence other than the specified image; performing video encoding processing on the image sequence that has undergone image deformation processing to obtain an encoded video, where each video frame in the encoded video contains video coding noise; performing video decoding processing on the encoded video to obtain a video frame sequence, where the video frame sequence contains the first number of video frames and each video frame contains video coding noise; and obtaining the specified video frame corresponding to the specified image in the video frame sequence and using the specified video frame and the original image as a training sample of the deep learning model, so that the deep learning model obtained through training on the training sample can repair video containing video coding noise.
  • An embodiment of the present application provides a sample image processing device, which is applied to an electronic device. The device includes: an image deformation module, configured to copy the original image into a first number of original images, arrange the first number of original images into an image sequence, and perform image deformation processing on the images in the image sequence other than the specified image; an encoding module, configured to perform video encoding processing on the image sequence subjected to the image deformation processing to obtain an encoded video, where each video frame in the encoded video contains video coding noise; a decoding module, configured to perform video decoding processing on the encoded video to obtain a video frame sequence, where the video frame sequence contains the first number of video frames and each video frame contains video coding noise; and a training module, configured to obtain the specified video frame corresponding to the specified image in the video frame sequence and use the specified video frame and the original image as a training sample of the deep learning model, so that the deep learning model obtained through training on the training sample can repair video containing video coding noise.
  • an embodiment of the present application provides an electronic device, including a processor, a communication interface, a memory, and a communication bus.
  • the processor, the communication interface, and the memory communicate with each other through the communication bus;
  • the memory is configured to store a computer program;
  • the processor is configured to implement the method steps of the sample image processing method provided in the embodiment of the present application when executing the program stored in the memory.
  • an embodiment of the present application provides a computer-readable storage medium, and a computer program is stored in the computer-readable storage medium.
  • When the computer program is executed by a processor, the method steps of the sample image processing method provided by the embodiment of the present application are implemented.
  • The embodiments of the present application provide a computer program product containing instructions.
  • When the computer program product containing instructions runs on a computer, the computer executes the steps of the sample image processing method provided in the embodiments of the present application.
  • An embodiment of the present application provides a computer program which, when run on a computer, causes the computer to execute the steps of the sample image processing method provided in the embodiment of the present application.
  • the electronic device can copy the original image into the first number of original images. Then the first number of original images are arranged into an image sequence, and the images in the image sequence except the specified image are subjected to image deformation processing. Then the video encoding process is performed on the image sequence that has undergone image deformation processing to obtain an encoded video. And the encoded video is processed by video decoding to obtain a sequence of video frames. Then, the specified video frame corresponding to the specified image in the video frame sequence is acquired, the specified video frame and the original image are used as the training samples of the deep learning model, and the deep learning model obtained through training of the training samples is used to repair the video containing video coding noise.
  • the original image is converted into an image sequence, and the image is deformed, so that the image sequence can be regarded as a video.
  • the electronic device compresses the image sequence into a video through video encoding, and decompresses the video into a video frame sequence through video decoding processing to obtain a designated video frame with noise caused by the video encoding. Because the source of video noise in the actual situation is video encoding and video decoding, the specified video frame can be used as a training sample consistent with the actual situation, so that the trained deep learning model has a better ability to repair the video.
  • FIG. 1 is a flowchart of a sample image processing method provided by an embodiment of the application.
  • FIG. 2 is a schematic diagram of the effect of a sample image processing method provided by an embodiment of the application.
  • FIG. 3 is a schematic diagram of the effect of another sample image processing method provided by an embodiment of the application.
  • FIG. 4 is an exemplary schematic diagram of video encoding and video decoding provided by an embodiment of the application.
  • FIG. 5 is a schematic structural diagram of a sample image processing device provided by an embodiment of the application.
  • FIG. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the application.
  • the embodiment of the present application provides a method for processing sample images, and the method is applied to an electronic device.
  • The electronic device includes a mobile terminal or a personal computer (PC) terminal that can apply a deep learning model to denoise images.
  • Step 101: Copy the original image into a first number of original images, arrange the first number of original images into an image sequence, and perform image deformation processing on the images in the image sequence other than the specified image.
  • the original image is a lossless image, that is, an image before noise is added.
  • the electronic device copies the original image into the first number of original images, and arranges the first number of original images into an image sequence, the purpose of which is to add noise to the original image through video coding.
  • the image sequence includes a second number of designated images, and the second number is less than the first number.
  • The designated image is the image to which the electronic device adds noise.
  • The electronic device can determine the designated image by marking the image at a designated position in the image sequence and then identifying the marked image.
  • The embodiment of the present application may also use other methods to determine the designated image; the embodiments of this application do not limit this.
  • After image deformation processing, the specified image in the image sequence differs in its pixels from its adjacent images; that is, the purpose of the deformation is to simulate the movement or deformation of the object in the original image.
  • A video usually shows a moving picture rather than a static picture. The electronic device can therefore perform image deformation processing on the images in the image sequence other than the specified image to simulate the movement or deformation of the object in the original image.
  • The purpose of image deformation processing on images other than the specified image is to place the specified image in a video-like context, that is, to make the specified image differ from the images before and after it.
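  • Step 101 can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: images are represented as nested lists of pixel values, and `shift_right` is a hypothetical stand-in for the image deformation processing (a cyclic row shift that merely makes neighbouring frames differ).

```python
import copy


def build_image_sequence(original, first_number, specified_positions, deform):
    """Copy `original` `first_number` times and deform every copy except
    those at the specified positions (1-based, like the serial numbers in
    the text)."""
    sequence = []
    for position in range(1, first_number + 1):
        image = copy.deepcopy(original)
        if position not in specified_positions:
            image = deform(image, position)
        sequence.append(image)
    return sequence


def shift_right(image, position):
    """Hypothetical stand-in deformation: cyclically shift each row.

    The shift is always between 1 and width - 1 pixels, so a deformed copy
    is never left in its original alignment (requires width >= 2).
    """
    width = len(image[0])
    k = 1 + position % (width - 1)
    return [row[-k:] + row[:-k] for row in image]


original = [[0, 1, 2, 3], [4, 5, 6, 7]]
sequence = build_image_sequence(original, 5, {1, 5}, shift_right)
```

  • In this sketch the copies at positions 1 and 5 stay identical to the original image, while the copies at positions 2 to 4 are shifted so that the specified images differ from their neighbours.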
  • Step 102: Perform video encoding processing on the image sequence that has undergone image deformation processing to obtain an encoded video.
  • each video frame in the coded video contains video coding noise.
  • Video coding is a video compression technology. Since the video is compressed during the video coding process, video coding noise will be generated during the video coding process.
  • the electronic device may encode an image sequence into an encoded video through video encoding processing.
  • Step 103: Perform video decoding processing on the encoded video to obtain a video frame sequence.
  • the video frame sequence includes the first number of video frames, and each video frame includes video coding noise.
  • the video encoding process mentioned in step 102 and the video decoding process mentioned in step 103 can use the H.264 (a video encoding format) video codec.
  • the electronic device may also use other versions of video codecs, which is not limited in the embodiment of the present application.
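  • One possible way to carry out steps 102 and 103 is to call an external H.264 encoder. The sketch below only builds the command lines; it assumes the `ffmpeg` command-line tool built with the `libx264` encoder is available, and the file names, frame rate, and CRF value are hypothetical choices that the application does not specify.

```python
def h264_round_trip_commands(frame_pattern, encoded_path, decoded_pattern,
                             crf=35, fps=25):
    """Return (encode, decode) ffmpeg argument lists for an H.264 round trip.

    A higher CRF means stronger compression and therefore more coding noise.
    """
    encode = [
        "ffmpeg", "-y",
        "-framerate", str(fps),   # input frame rate of the image sequence
        "-start_number", "1",     # first image is numbered 1
        "-i", frame_pattern,
        "-c:v", "libx264",        # H.264 encoder (assumed to be installed)
        "-crf", str(crf),
        "-pix_fmt", "yuv420p",
        encoded_path,
    ]
    decode = [
        "ffmpeg", "-y",
        "-i", encoded_path,
        "-start_number", "1",     # number decoded frames from 1 as well
        decoded_pattern,
    ]
    return encode, decode


encode_cmd, decode_cmd = h264_round_trip_commands(
    "frame_%03d.png", "encoded.mp4", "decoded_%03d.png")
```

  • Each list could then be passed to `subprocess.run(cmd, check=True)`; the decoded images would carry the coding noise introduced by the encoder.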
  • Step 104: Obtain the specified video frame corresponding to the specified image in the video frame sequence, and use the specified video frame and the original image as a training sample of the deep learning model, so that the deep learning model obtained through training on the training sample can repair video containing video coding noise.
  • Since each video frame in the video frame sequence obtained after the above-mentioned video encoding processing and video decoding processing contains video coding noise, the specified video frame corresponding to the specified image in the video frame sequence also contains video coding noise.
  • The electronic device can use the specified video frames in the video frame sequence and the original image corresponding to each specified video frame as training samples, and train the deep learning model on these training samples. The trained deep learning model can be used to repair video containing video coding noise.
  • The electronic device can use the specified video frame and the original image corresponding to the specified image as training samples, and train the deep learning model according to a loss function that measures the loss between the marked frame and the original image.
  • The above-mentioned marked frame refers to the marked video frame in the video frame sequence, that is, the designated video frame corresponding to the designated image in the video frame sequence.
  • the training set used when the electronic device trains the deep learning model includes multiple training samples, and each training sample includes an original image and a designated video frame corresponding to the original image.
  • The training samples in the training set are not limited to one original image. The method flow shown in FIG. 1 can be performed on multiple original images to obtain the designated video frames corresponding to each of them. Each original image and the specified video frame corresponding to that original image are then used as a training sample to train the deep learning model.
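  • The pairing of specified video frames with the original image can be sketched as below. The application does not name a particular loss function, so the mean squared error used here is an assumption chosen for illustration, and the nested-list images and helper names are hypothetical.

```python
def make_training_samples(original, specified_frames):
    """Pair each noisy specified frame (model input) with the clean
    original image (training target)."""
    return [(frame, original) for frame in specified_frames]


def mean_squared_error(noisy, clean):
    """Average squared pixel difference between two equally sized images."""
    total = 0.0
    count = 0
    for noisy_row, clean_row in zip(noisy, clean):
        for a, b in zip(noisy_row, clean_row):
            total += (a - b) ** 2
            count += 1
    return total / count


original = [[10, 20], [30, 40]]
specified_frames = [
    [[12, 20], [30, 40]],  # one pixel off by 2
    [[10, 20], [30, 36]],  # one pixel off by 4
]
samples = make_training_samples(original, specified_frames)
losses = [mean_squared_error(noisy, clean) for noisy, clean in samples]
```

  • During training, a model's output for the noisy frame would take the place of `noisy` in the loss, so that minimizing the loss pushes the output toward the original image.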
  • the electronic device can copy the original image into a first number of original images, and then arrange the first number of original images into an image sequence. Then, the images in the image sequence other than the specified image are subjected to image deformation processing, and then the image sequence subjected to the image deformation processing is subjected to video encoding processing to obtain an encoded video. And the encoded video is processed by video decoding to obtain a sequence of video frames. Then, the specified video frame corresponding to the specified image in the video frame sequence is acquired, the specified video frame and the original image are used as the training samples of the deep learning model, and the deep learning model obtained through training of the training samples is used to repair the video containing video coding noise.
  • the original image is converted into an image sequence, and the image is deformed, so that the image sequence can be regarded as a video.
  • the electronic device compresses the image sequence into a video through video encoding, and decompresses the video into a video frame sequence through video decoding processing to obtain a designated video frame with noise caused by the video encoding. Because the source of video noise in the actual situation is video encoding and video decoding, the specified video frame can be used as a training sample consistent with the actual situation, so that the trained deep learning model has a better ability to repair the video.
  • The electronic device may perform image deformation processing on the images other than the specified image in the image sequence as follows:
  • The image deformation processing includes random affine transformation and elastic transformation.
  • The random affine transformation includes one or more of the following image deformation processes: movement, rotation, and stretching.
  • Random affine transformation can be realized by an affine transformation algorithm.
  • Elastic transformation can be realized by an elastic transformation algorithm.
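  • A random affine transformation combining movement, rotation, and stretching might be parameterized as in the following sketch. The parameter ranges and function names are hypothetical; the application does not specify them.

```python
import math
import random


def random_affine_matrix(rng, max_shift=4.0, max_angle_deg=5.0,
                         max_stretch=0.1):
    """Draw a random 2x3 affine matrix: stretch, then rotate, then move."""
    tx = rng.uniform(-max_shift, max_shift)
    ty = rng.uniform(-max_shift, max_shift)
    angle = math.radians(rng.uniform(-max_angle_deg, max_angle_deg))
    sx = 1.0 + rng.uniform(-max_stretch, max_stretch)
    sy = 1.0 + rng.uniform(-max_stretch, max_stretch)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Rotation matrix times diag(sx, sy), with the translation appended.
    return [[cos_a * sx, -sin_a * sy, tx],
            [sin_a * sx, cos_a * sy, ty]]


def apply_affine(matrix, x, y):
    """Map the point (x, y) through a 2x3 affine matrix."""
    (a, b, tx), (c, d, ty) = matrix
    return a * x + b * y + tx, c * x + d * y + ty


rng = random.Random(0)
matrix = random_affine_matrix(rng)
moved_point = apply_affine(matrix, 5.0, 7.0)
```

  • To deform an image, each output pixel coordinate would be mapped through such a matrix (or its inverse) and the image resampled at the resulting position.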
  • FIG. 2 is a schematic diagram of an image sequence after elastic transformation.
  • the image sequence includes 20 images and the serial numbers corresponding to the 20 images.
  • The serial number below each image represents the order of that image in the image sequence, that is, the position of each image in the image sequence.
  • The designated images are images 1, 5, 10, 15 and 20; that is, when the electronic device elastically transforms the image sequence, it only applies the elastic transformation to the images other than images 1, 5, 10, 15 and 20.
  • The specific effect after elastic transformation is shown in FIG. 2.
  • The elastic transformation distorts the image.
  • the solid line grid is only used to conveniently indicate the result of the image distortion, and the solid line grid does not exist in practical applications.
  • the electronic device can perform both elastic transformation on the image sequence and random affine transformation on the image sequence, which is not limited in the embodiment of the present application.
  • the electronic device uses random affine transformation and elastic transformation to make the image sequence have differences between adjacent images, that is, the image sequence can be regarded as a piece of video. Furthermore, the electronic device may add video coding noise to the image sequence, so that the noise included in the image in the image sequence is closer to the noise generated in the actual application scenario.
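  • An elastic transformation can be approximated by moving each pixel according to a smooth random displacement field. The sketch below is a simplified illustration: it smooths the field with a box filter and resamples with nearest-neighbour lookup, both of which are assumptions made for brevity (Gaussian smoothing and interpolation are common alternatives), and `alpha` and `radius` are hypothetical parameters.

```python
import random


def smooth(field, radius):
    """Box-filter a 2D field so neighbouring displacements vary gradually."""
    h, w = len(field), len(field[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            values = [field[j][i] for j in ys for i in xs]
            out[y][x] = sum(values) / len(values)
    return out


def elastic_transform(image, rng, alpha=3.0, radius=2):
    """Warp `image` with a smoothed random displacement field.

    `alpha` scales the displacement strength in pixels; alpha=0 leaves the
    image unchanged.
    """
    h, w = len(image), len(image[0])
    dx = smooth([[rng.uniform(-1, 1) for _ in range(w)] for _ in range(h)],
                radius)
    dy = smooth([[rng.uniform(-1, 1) for _ in range(w)] for _ in range(h)],
                radius)
    warped = []
    for y in range(h):
        row = []
        for x in range(w):
            # Nearest-neighbour lookup, clamped to the image borders.
            src_x = min(w - 1, max(0, round(x + alpha * dx[y][x])))
            src_y = min(h - 1, max(0, round(y + alpha * dy[y][x])))
            row.append(image[src_y][src_x])
        warped.append(row)
    return warped


image = [[y * 8 + x for x in range(8)] for y in range(8)]
warped = elastic_transform(image, random.Random(0))
```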
  • the electronic device performs image deformation processing on images in the image sequence other than the specified image, which may be specifically executed as follows:
  • the images except the specified images in the image sequence are subjected to image deformation processing in a preset manner and order.
  • The electronic device may preset the image deformation rules to obtain a more realistic motion effect in the image sequence, and thereby improve the effect of adding noise to the image.
  • To perform video encoding processing on the image sequence subjected to image deformation processing and obtain the encoded video, the electronic device may specifically proceed as follows:
  • The target video encoding format is used to perform video encoding processing on the image sequence after image deformation processing to obtain an encoded video.
  • the target video encoding format includes but is not limited to: H.264 video encoding format, H.265 video encoding format, and H.266 video encoding format.
  • the embodiments of this application do not limit the video encoding format used.
  • To perform video decoding processing on the encoded video and obtain the video frame sequence, the electronic device may specifically proceed as follows:
  • The encoded video is subjected to video decoding processing to obtain a video frame sequence.
  • the video frame sequence obtained after the video decoding process of the encoded video includes the video frame corresponding to each image in the image sequence.
  • the video frame corresponding to the specified image in the image sequence is the specified video frame.
  • Fig. 3 is a schematic diagram of a designated video frame corresponding to a designated image.
  • the video frame image sequence shown in Figure 3 includes designated video frames with serial numbers 1, 5, 10, 15, and 20.
  • The serial numbers of the designated video frames correspond to the serial numbers of the images in FIG. 2; for example, image 1 in FIG. 2 corresponds to specified video frame 1.
  • The specified video frame includes the noise generated in the video encoding.
  • The final specified video frame includes the noise caused by the video encoding process, which better simulates the noise of sample images in actual applications.
  • FIG. 4 is an exemplary schematic diagram of video encoding and video decoding provided by an embodiment of the application, where the schematic flowchart shows the content of the above step 102 to step 104.
  • The image sequence shown in FIG. 4 includes 20 images, and the designated images are the 1st, 5th, 10th, 15th and 20th images in the image sequence.
  • The electronic device may first perform H.264 encoding on the image sequence to obtain the encoded video corresponding to the image sequence.
  • The electronic device then performs H.264 decoding on the encoded video to obtain a video frame sequence corresponding to the encoded video.
  • The electronic device can extract the specified video frames from the video frame sequence (i.e., the 1st, 5th, 10th, 15th and 20th video frames in the video frame sequence), and use the specified video frames and the original image as training samples of the deep learning model.
  • In this way, the electronic device can effectively add to the image the video coding noise found in actual application scenarios.
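  • Extracting the specified video frames by serial number, as in the FIG. 4 example, can be sketched as follows. The function name and the string placeholders standing in for decoded frames are hypothetical.

```python
def extract_specified_frames(frame_sequence, specified_serials):
    """Return {serial: frame} for the given 1-based serial numbers."""
    return {serial: frame_sequence[serial - 1]
            for serial in sorted(specified_serials)}


# Placeholders for the 20 decoded video frames of the FIG. 4 example.
decoded_frames = ["frame%02d" % i for i in range(1, 21)]
specified = extract_specified_frames(decoded_frames, [1, 5, 10, 15, 20])
```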
  • Before step 102, in which video encoding processing is performed on the image sequence that has undergone image deformation processing to obtain the encoded video, the electronic device may also perform noise-adding processing on the images. The specific process may be:
  • Noise addition is performed on the images in the image sequence that have undergone image deformation processing.
  • The noise addition processing includes one or more of: adding Gaussian noise, blurring, adding compression noise, and down-sampling followed by up-sampling.
  • The compression noise can be JPEG compression noise.
  • The above-mentioned noise-adding processing is a noise-adding method for images. Since the electronic device will eventually perform video encoding and video decoding processing on the image sequence, applying the above noise-adding processing to its images beforehand can better simulate the noise generated in actual application scenarios.
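  • Two of the listed noise-adding operations, Gaussian noise and down-sampling followed by up-sampling, can be sketched as below. The 8-bit pixel range, the `sigma` and `factor` parameters, and the nearest-neighbour resampling are assumptions made for illustration only.

```python
import random


def add_gaussian_noise(image, rng, sigma=5.0):
    """Add zero-mean Gaussian noise to each pixel and clamp to 0..255."""
    return [[min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
            for row in image]


def down_then_up(image, factor=2):
    """Keep every `factor`-th pixel, then enlarge back to the original size
    by repeating pixels (nearest-neighbour), which discards fine detail."""
    h, w = len(image), len(image[0])
    small = [row[::factor] for row in image[::factor]]
    return [[small[y // factor][x // factor] for x in range(w)]
            for y in range(h)]


image = [[0, 10, 20, 30],
         [40, 50, 60, 70],
         [80, 90, 100, 110],
         [120, 130, 140, 150]]
noisy = add_gaussian_noise(image, random.Random(0))
resampled = down_then_up(image)
```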
  • an embodiment of the present application also provides a sample image processing device, as shown in FIG. 5, the device includes: an image deformation module 501, an encoding module 502, a decoding module 503, and a training module 504;
  • the image deformation module 501 is configured to copy the original image into the first number of original images, arrange the first number of original images into an image sequence, and perform image deformation processing on the images in the image sequence except the specified image;
  • the encoding module 502 is configured to perform video encoding processing on the image sequence that has undergone image deformation processing to obtain an encoded video, and each video frame in the encoded video contains video encoding noise;
  • the decoding module 503 is configured to perform video decoding processing on the encoded video to obtain a video frame sequence, the video frame sequence includes a first number of video frames, and each video frame includes video encoding noise;
  • The training module 504 is configured to obtain the specified video frame corresponding to the specified image in the video frame sequence, and to use the specified video frame and the original image as a training sample of the deep learning model, so that the deep learning model obtained through training on the training sample can repair video containing video coding noise.
  • the image deformation module 501 is specifically configured as follows:
  • Random affine transformation and/or elastic transformation are performed on images other than the specified image in the image sequence. Random affine transformation includes one or more image deformation processing of moving, rotating and stretching.
  • the image deformation module 501 is specifically configured as follows:
  • the images except the specified images in the image sequence are subjected to image deformation processing in a preset manner and order.
  • The device further includes a noise adding module.
  • The noise adding module is configured to perform noise addition processing on the images in the image sequence after the image deformation processing.
  • The noise addition processing includes one or more of: adding Gaussian noise, blurring, adding compression noise, and down-sampling followed by up-sampling.
  • the encoding module 502 is specifically configured as follows:
  • the decoding module 503 is specifically set as:
  • the encoded video is subjected to video decoding processing to obtain a video frame sequence.
  • The electronic device can copy the original image into the first number of original images, arrange the first number of original images into an image sequence, and perform image deformation processing on the images in the image sequence other than the specified image. It then performs video encoding processing on the image sequence that has undergone image deformation processing to obtain an encoded video, and performs video decoding processing on the encoded video to obtain a video frame sequence. Finally, it obtains the specified video frame corresponding to the specified image in the video frame sequence and uses the specified video frame and the original image as a training sample of the deep learning model; the deep learning model trained on the training sample is used to repair video containing video coding noise.
  • The embodiment of the application converts the original image into an image sequence and deforms the images so that the image sequence can be regarded as a video. The electronic device then compresses the image sequence into a video through video encoding, and decompresses the video into a sequence of video frames through video decoding, to obtain a specified video frame with noise caused by video encoding. Because the source of video noise in the actual situation is video encoding and video decoding, the specified video frame can be used as a training sample consistent with the actual situation, so that the trained deep learning model has a better ability to repair the video.
  • An embodiment of the present application also provides an electronic device, as shown in FIG. 6, including a processor 601, a communication interface 602, a memory 603, and a communication bus 604.
  • The processor 601, the communication interface 602, and the memory 603 communicate with each other through the communication bus 604.
  • The memory 603 is configured to store a computer program.
  • The processor 601 is configured to implement the steps described in the above method embodiment when executing the program stored in the memory 603. Refer to the relevant description in the above method embodiment; details are not repeated here.
  • The communication bus mentioned for the aforementioned network device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like.
  • the communication bus can be divided into address bus, data bus, control bus and so on. For ease of representation, only one thick line is used in the figure, but it does not mean that there is only one bus or one type of bus.
  • the communication interface is used for communication between the aforementioned network device and other devices.
  • The memory may include a random access memory (RAM), and may also include a non-volatile memory (NVM), for example, at least one disk storage.
  • The memory may also be at least one storage device located remotely from the foregoing processor.
  • The above-mentioned processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • The embodiments of the present application also provide a computer-readable storage medium that stores a computer program; when the computer program is executed by a processor, the steps of the above-mentioned sample image processing method are implemented.
  • the embodiments of the present application also provide a computer program product containing instructions, which when run on a computer, cause the computer to execute the steps of the above-mentioned sample image processing method.
  • The embodiments of the present application also provide a computer program which, when run on a computer, causes the computer to execute any sample image processing method in the above embodiments.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from a website, computer, server, or data center.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or a data center integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, and a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
  • the training sample can contain noise generated by video coding.
  • the deep learning model trained based on the training samples can perform better repairs on videos containing video coding noise.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Embodiments of the present application relate to the technical field of computers, and provide a sample image processing method and apparatus, an electronic device, and a medium. The method comprises: replicating an original image into a first number of original images, arranging the first number of original images into an image sequence, and performing image distortion processing on images other than the specified image in the image sequence; performing video encoding processing on the image sequence subjected to image distortion processing to obtain an encoded video; performing video decoding processing on the encoded video to obtain a video frame sequence; and obtaining a specified video frame corresponding to the specified image in the video frame sequence, and using the specified video frame and the original image as a training sample of a deep learning model. By using the present application, a trained deep learning model can have a good video repair capability.

Description

Sample image processing method, apparatus, electronic device, and medium
This application claims priority to Chinese patent application No. 201911282864.6, entitled "Sample image processing method, apparatus, electronic device, and medium" and filed with the Chinese Patent Office on December 13, 2019, the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of computer technology, and in particular to a sample image processing method, apparatus, electronic device, and medium.
Background
At present, a video transmitted over the Internet undergoes video encoding and decoding, which introduces compression noise into the video, such as blur, jagged edges, and ringing artifacts.
To solve the above problem, related technologies use a deep learning model to repair videos that have undergone video encoding and decoding. When the deep learning model is trained, a noise-free original image and a noisy image obtained by adding noise to the original image are used as a training sample.
When noise is added to the original image, traditional noise-adding methods are generally used, such as manually adding Gaussian noise, blurring the image, adding Joint Photographic Experts Group (JPEG) compression noise, or down-sampling followed by up-sampling.
However, the noise added by these traditional methods differs considerably from the noise introduced by video encoding and decoding. As a result, the samples used in training do not match the images that the deep learning model actually has to process, and the trained deep learning model repairs videos poorly.
Summary of the invention
The purpose of the embodiments of the present application is to provide a sample image processing method, apparatus, electronic device, and medium, so that a trained deep learning model has a better video repair capability. The specific technical solutions are as follows:
In a first aspect, an embodiment of the present application provides a sample image processing method applied to an electronic device. The method includes: replicating an original image into a first number of original images, arranging the first number of original images into an image sequence, and performing image deformation processing on the images in the image sequence other than the specified images; performing video encoding processing on the image sequence that has undergone image deformation processing to obtain an encoded video, where each video frame in the encoded video contains video coding noise; performing video decoding processing on the encoded video to obtain a video frame sequence, where the video frame sequence contains the first number of video frames and each video frame contains video coding noise; and obtaining the specified video frame in the video frame sequence that corresponds to a specified image, and using the specified video frame and the original image as a training sample of a deep learning model, so that the deep learning model trained on the training sample can repair videos containing video coding noise.
In a second aspect, an embodiment of the present application provides a sample image processing apparatus applied to an electronic device. The apparatus includes: an image deformation module, configured to replicate an original image into a first number of original images, arrange the first number of original images into an image sequence, and perform image deformation processing on the images in the image sequence other than the specified images; an encoding module, configured to perform video encoding processing on the image sequence that has undergone image deformation processing to obtain an encoded video, where each video frame in the encoded video contains video coding noise; a decoding module, configured to perform video decoding processing on the encoded video to obtain a video frame sequence, where the video frame sequence contains the first number of video frames and each video frame contains video coding noise; and a training module, configured to obtain the specified video frame in the video frame sequence that corresponds to a specified image, and use the specified video frame and the original image as a training sample of a deep learning model, so that the deep learning model trained on the training sample can repair videos containing video coding noise.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with each other through the communication bus; the memory is configured to store a computer program; and the processor is configured to implement the steps of the sample image processing method provided in the embodiments of the present application when executing the program stored in the memory.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium in which a computer program is stored. When the computer program is executed by a processor, the steps of the sample image processing method provided in the embodiments of the present application are implemented.
In a fifth aspect, an embodiment of the present application provides a computer program product containing instructions. When the computer program product runs on a computer, it causes the computer to execute the steps of the sample image processing method provided in the embodiments of the present application.
In a sixth aspect, an embodiment of the present application provides a computer program. When the computer program runs on a computer, it causes the computer to execute the steps of the sample image processing method provided in the embodiments of the present application.
With the above technical solution, the electronic device can replicate an original image into a first number of original images, arrange the first number of original images into an image sequence, and perform image deformation processing on the images in the image sequence other than the specified images. It then performs video encoding processing on the deformed image sequence to obtain an encoded video, and performs video decoding processing on the encoded video to obtain a video frame sequence. Finally, it obtains the specified video frame in the video frame sequence that corresponds to a specified image and uses the specified video frame and the original image as a training sample of a deep learning model; the deep learning model trained on such samples is used to repair videos containing video coding noise. The embodiments of the present application convert the original image into an image sequence and deform the images so that the image sequence can be treated as a video. The electronic device then compresses the image sequence into a video through video encoding and decompresses that video into a video frame sequence through video decoding, obtaining specified video frames that carry the noise introduced by video encoding. Because video noise in practice comes from video encoding and video decoding, the specified video frames can serve as training samples that match real conditions, so that the trained deep learning model has a better video repair capability.
Of course, implementing any product or method of the present application does not necessarily require achieving all of the advantages described above at the same time.
Description of the drawings
To describe the technical solutions in the embodiments of the present application or in the related technologies more clearly, the drawings used in the description of the embodiments or the related technologies are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from them without creative work.
FIG. 1 is a flowchart of a sample image processing method provided by an embodiment of the application;
FIG. 2 is a schematic diagram of the effect of a sample image processing method provided by an embodiment of the application;
FIG. 3 is a schematic diagram of the effect of another sample image processing method provided by an embodiment of the application;
FIG. 4 is an exemplary schematic diagram of video encoding and video decoding provided by an embodiment of the application;
FIG. 5 is a schematic structural diagram of a sample image processing apparatus provided by an embodiment of the application;
FIG. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the application.
Detailed description
To make the purpose, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the drawings and embodiments. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this application without creative work fall within the protection scope of this application.
An embodiment of the present application provides a sample image processing method applied to an electronic device. The electronic device includes a mobile terminal or a personal computer (PC) terminal that can apply a deep learning model to denoise images.
The sample image processing method provided in the embodiments of the present application is described in detail below with reference to specific implementations. As shown in FIG. 1, the specific steps are as follows:
Step 101: Replicate the original image into a first number of original images, arrange the first number of original images into an image sequence, and perform image deformation processing on the images in the image sequence other than the specified images.
In the embodiments of the present application, the original image is a lossless image, that is, the image before noise is added. The electronic device replicates the original image into a first number of original images and arranges them into an image sequence so that noise can be added to the original image through video encoding.
The image sequence includes a second number of specified images, and the second number is less than the first number.
A specified image is the target to which the electronic device adds noise. The electronic device can mark the images at specified positions in the image sequence and treat the marked images as the specified images. Other ways of determining the specified images may also be used, which is not limited in the embodiments of the present application.
After image deformation processing, a specified image in the image sequence differs from its adjacent images in some pixels; that is, the purpose of the deformation is to simulate actions such as movement or deformation of objects in the original image.
Understandably, a video usually shows a picture in motion rather than a picture that stays still.
Because the original image is a still image, the electronic device can, in order to produce motion in the image sequence, perform image deformation processing on the images other than the specified images to simulate movement or deformation of objects in the original image. When the image sequence is played as a video frame sequence, it then shows a picture in motion.
The purpose of performing image deformation processing on the images other than the specified images is to place each specified image in a video context, that is, to make the specified image differ from the images before and after it.
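Step 101 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: images are assumed to be nested lists of pixel values, positions are 1-based as in the figures, and `build_image_sequence` and `shift_right` are hypothetical names, with `shift_right` standing in for any deformation.

```python
import copy


def build_image_sequence(original, first_number, specified_positions, deform):
    """Replicate `original` and deform every copy except the specified ones.

    `specified_positions` holds 1-based positions of the specified images.
    `deform` is any callable mapping an image to a deformed image.
    """
    # The second number (count of specified images) must be below the first number.
    if not 0 < len(specified_positions) < first_number:
        raise ValueError("second number must be positive and less than first number")
    sequence = []
    for position in range(1, first_number + 1):
        image = copy.deepcopy(original)
        if position not in specified_positions:
            image = deform(image)  # only non-specified images are deformed
        sequence.append(image)
    return sequence


def shift_right(image):
    """Placeholder deformation: rotate every row one pixel to the right."""
    return [[row[-1]] + row[:-1] for row in image]


original = [[1, 2, 3], [4, 5, 6]]
sequence = build_image_sequence(original, 5, {1, 5}, shift_right)
```

The specified images stay identical to the original, so the decoded frames at those positions can later be paired with it.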
Step 102: Perform video encoding processing on the image sequence that has undergone image deformation processing to obtain an encoded video.
Each video frame in the encoded video contains video coding noise.
Video coding is a video compression technology. Because the video is compressed during video encoding, video coding noise is generated in the process.
In the embodiments of the present application, the electronic device can encode the image sequence into an encoded video through video encoding processing.
Step 103: Perform video decoding processing on the encoded video to obtain a video frame sequence.
The video frame sequence contains the first number of video frames, and each video frame contains video coding noise.
In practical applications, the video encoding in step 102 and the video decoding in step 103 may use an H.264 video codec (H.264 is a video coding format).
Of course, the electronic device may also use other video codecs, which is not limited in the embodiments of the present application.
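One possible way to realize steps 102 and 103 is with the `ffmpeg` command-line tool and its `libx264` encoder. The application does not name a tool, so this is an assumption; the file patterns, the CRF value, and the function name `build_codec_commands` are hypothetical. The sketch only builds the argument lists and does not run them.

```python
def build_codec_commands(input_pattern, encoded_path, output_pattern, crf=35, fps=25):
    """Return (encode_cmd, decode_cmd) as argument lists for the ffmpeg CLI."""
    encode_cmd = [
        "ffmpeg", "-y",
        "-framerate", str(fps),  # treat the numbered images as video frames
        "-i", input_pattern,     # e.g. frames/img_%03d.png
        "-c:v", "libx264",       # H.264 encoder
        "-crf", str(crf),        # higher CRF means stronger compression
        "-pix_fmt", "yuv420p",
        encoded_path,
    ]
    # Decoding writes every frame of the encoded video back to numbered images.
    decode_cmd = ["ffmpeg", "-y", "-i", encoded_path, output_pattern]
    return encode_cmd, decode_cmd


encode_cmd, decode_cmd = build_codec_commands(
    "frames/img_%03d.png", "encoded.mp4", "decoded/frame_%03d.png"
)
```

Each list can be passed to `subprocess.run(cmd, check=True)` on a machine where `ffmpeg` is installed. The number of decoded frames should be checked against the first number before frames are paired with the original image.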
Step 104: Obtain the specified video frame in the video frame sequence that corresponds to a specified image, and use the specified video frame and the original image as a training sample of the deep learning model, so that the deep learning model trained on the training sample can repair videos containing video coding noise.
After the above video encoding and video decoding, every video frame in the resulting video frame sequence contains video coding noise, so the specified video frame corresponding to a specified image also contains video coding noise. The electronic device can therefore use each specified video frame in the video frame sequence, together with the original image corresponding to it, as a training sample and train the deep learning model on these samples. The trained deep learning model can be used to repair videos containing video coding noise.
When training the deep learning model, the electronic device can use the specified video frame corresponding to a specified image and the original image as a training sample, and train the deep learning model according to a loss function and the loss between the marked frame and the original image. Here, the marked frame is the marked video frame in the video frame sequence, that is, the specified video frame corresponding to a specified image.
In practical applications, the training set used to train the deep learning model includes multiple training samples, and each training sample includes an original image and a specified video frame corresponding to that original image.
The training samples in the training set are not limited to one original image. For example, in practical applications, the method flow shown in FIG. 1 can be performed separately on multiple original images to obtain the specified video frames corresponding to each of them, and the multiple original images and their corresponding specified video frames are then used as training samples to train the deep learning model.
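The pairing in step 104 can be sketched as below. The application does not name a loss function, so the mean squared error here is an assumption chosen for illustration; `make_training_samples` and `mean_squared_error` are hypothetical names, and images are nested lists of pixel values.

```python
def make_training_samples(original, frame_sequence, specified_positions):
    """Pair each specified decoded frame (model input) with the original (target).

    `specified_positions` holds 1-based positions in the video frame sequence.
    """
    return [(frame_sequence[p - 1], original) for p in sorted(specified_positions)]


def mean_squared_error(frame, target):
    """Per-pixel mean squared error between two images of the same size."""
    diffs = [
        (a - b) ** 2
        for row_f, row_t in zip(frame, target)
        for a, b in zip(row_f, row_t)
    ]
    return sum(diffs) / len(diffs)


original = [[10, 20], [30, 40]]
noisy = [[12, 20], [30, 40]]  # stands in for a decoded frame with coding noise
frames = [noisy, original, original, original, noisy]
samples = make_training_samples(original, frames, {1, 5})
loss = mean_squared_error(samples[0][0], samples[0][1])
```

In an actual training loop the first argument of the loss would typically be the model's output for the noisy frame rather than the frame itself.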
In the sample image processing method provided by the embodiments of the present application, the electronic device can replicate an original image into a first number of original images and arrange the first number of original images into an image sequence. It performs image deformation processing on the images in the image sequence other than the specified images, performs video encoding processing on the deformed image sequence to obtain an encoded video, and performs video decoding processing on the encoded video to obtain a video frame sequence. It then obtains the specified video frame in the video frame sequence that corresponds to a specified image and uses the specified video frame and the original image as a training sample of a deep learning model; the deep learning model trained on such samples is used to repair videos containing video coding noise. The embodiments of the present application convert the original image into an image sequence and deform the images so that the image sequence can be treated as a video. The electronic device then compresses the image sequence into a video through video encoding and decompresses that video into a video frame sequence through video decoding, obtaining specified video frames that carry the noise introduced by video encoding. Because video noise in practice comes from video encoding and video decoding, the specified video frames can serve as training samples that match real conditions, so that the trained deep learning model has a better video repair capability.
In one implementation of the embodiments of the present application, the process in step 101 in which the electronic device performs image deformation processing on the images in the image sequence other than the specified images may be carried out as follows:
Randomly apply random affine transformation and/or elastic transformation to the images in the image sequence other than the specified images.
Here, the image deformation processing includes random affine transformation and elastic transformation, and the random affine transformation includes one or more of translation, rotation, and stretching.
In practical applications, the random affine transformation can be implemented by an affine transformation algorithm, and the elastic transformation can be implemented by an elastic transformation algorithm.
For example, FIG. 2 is a schematic diagram of an image sequence after elastic transformation. The image sequence includes 20 images, each shown with its serial number.
The serial number below each image represents the order of the image in the image sequence, that is, its position in the image sequence.
In FIG. 2, the specified images are images 1, 5, 10, 15, and 20. That is, when the electronic device applies elastic transformation to the image sequence, it transforms only the images other than images 1, 5, 10, 15, and 20. The effect after elastic transformation is shown in FIG. 2.
Elastic transformation warps the image. In FIG. 2, the solid grid lines are drawn only to make the warping visible; they do not exist in practical applications.
In practical applications, the electronic device may apply both elastic transformation and random affine transformation to the image sequence, which is not limited in the embodiments of the present application.
In the embodiments of the present application, the electronic device uses random affine transformation and elastic transformation to create differences between adjacent images in the image sequence, so that the image sequence can be treated as a video. The electronic device can then add video coding noise to the image sequence, so that the noise in its images is closer to the noise generated in real application scenarios.
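The random affine transformation (translation, rotation, stretching) can be sketched in plain Python. This is an illustrative sketch under stated assumptions: grayscale images as nested lists, nearest-neighbor sampling, border clamping, and hypothetical parameter ranges and function names. Elastic transformation is not shown.

```python
import math
import random


def random_affine_params(rng, max_shift=2.0, max_angle_deg=5.0, max_stretch=0.05):
    """Draw a small random translation, rotation, and stretch."""
    return {
        "dx": rng.uniform(-max_shift, max_shift),
        "dy": rng.uniform(-max_shift, max_shift),
        "angle": math.radians(rng.uniform(-max_angle_deg, max_angle_deg)),
        "sx": 1.0 + rng.uniform(-max_stretch, max_stretch),
        "sy": 1.0 + rng.uniform(-max_stretch, max_stretch),
    }


def apply_affine(image, dx, dy, angle, sx, sy):
    """Warp an image about its center: stretch, then rotate, then translate."""
    h, w = len(image), len(image[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: undo the translation, the rotation, then the stretch.
            u, v = x - cx - dx, y - cy - dy
            ru = cos_a * u + sin_a * v
            rv = -sin_a * u + cos_a * v
            src_x = int(round(ru / sx + cx))
            src_y = int(round(rv / sy + cy))
            # Clamp to the border so no empty pixels appear.
            src_x = min(max(src_x, 0), w - 1)
            src_y = min(max(src_y, 0), h - 1)
            out[y][x] = image[src_y][src_x]
    return out


image = [[1, 2, 3], [4, 5, 6]]
shifted = apply_affine(image, dx=1, dy=0, angle=0.0, sx=1.0, sy=1.0)
warped = apply_affine(image, **random_affine_params(random.Random(0)))
```

Inverse mapping is used so that every output pixel receives a value; a production pipeline would more likely rely on an image library with interpolation.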
In another implementation of the embodiments of the present application, the image deformation processing in step 101 on the images in the image sequence other than the specified images may be carried out as follows:
According to a preset image deformation rule, perform image deformation processing on the images in the image sequence other than the specified images in a preset manner and order.
In the embodiments of the present application, a preset image deformation rule allows the electronic device to give the image sequence a more realistic motion effect, which in turn improves the noise-adding effect.
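The application does not spell out a preset image deformation rule. As one purely hypothetical example, the horizontal shift of each image could grow with its distance from the nearest specified image, so that specified images stay untouched and the motion around them is gradual. The function name and the rule itself are assumptions.

```python
def preset_shifts(first_number, specified_positions, pixels_per_step=1):
    """Return the horizontal shift, in pixels, for each 1-based position.

    Hypothetical rule: the shift is proportional to the distance to the
    nearest specified position, so specified images get a shift of 0.
    """
    return [
        pixels_per_step * min(abs(position - s) for s in specified_positions)
        for position in range(1, first_number + 1)
    ]


shifts = preset_shifts(5, {1, 5})
```

Each shift would then be applied to the image at that position with whatever translation routine the pipeline uses.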
In another embodiment of the present application, step 102 (performing video encoding processing on the image sequence that has undergone image deformation processing to obtain an encoded video) may be carried out by the electronic device as follows:
Use a target video coding format to perform video encoding processing on the image sequence that has undergone image deformation processing to obtain the encoded video.
The target video coding format includes but is not limited to the H.264, H.265, and H.266 video coding formats. The embodiments of the present application do not limit the video coding format used.
Step 103 (performing video decoding processing on the encoded video to obtain a video frame sequence) may be carried out by the electronic device as follows:
Use a target video decoding format to perform video decoding processing on the encoded video to obtain the video frame sequence.
The video frame sequence obtained by decoding the encoded video includes a video frame corresponding to each image in the image sequence.
The video frame corresponding to a specified image in the image sequence is a specified video frame.
FIG. 3 is a schematic diagram of the specified video frames corresponding to the specified images. The video frame sequence shown in FIG. 3 includes specified video frames numbered 1, 5, 10, 15, and 20, and these numbers correspond one-to-one to the image numbers in FIG. 2. For example, image 1 in FIG. 2 corresponds to specified video frame 1.
As FIG. 3 clearly shows, the specified video frames contain the noise generated during video encoding.
In the embodiments of the present application, because video encoding and video decoding are used to add noise to the video frame sequence, the resulting specified video frames contain the noise introduced by the video encoding process, which better simulates the noise found in sample images in practical applications.
FIG. 4 is an exemplary schematic diagram of video encoding and video decoding provided by an embodiment of the application. The flow in the diagram corresponds to steps 102 to 104 above.
The image sequence shown in FIG. 4 includes 20 images, and the specified images are images 1, 5, 10, 15, and 20 in the image sequence.
The electronic device can first perform H.264 encoding on the image sequence to obtain the corresponding encoded video.
The electronic device then performs H.264 decoding on the encoded video to obtain the corresponding video frame sequence.
Finally, the electronic device can extract the specified video frames from the video frame sequence (that is, video frames 1, 5, 10, 15, and 20) and use the specified video frames and the original image as training samples of the deep learning model.
Through the embodiments of the present application, the electronic device can effectively add to an image the video coding noise found in real application scenarios.
In another embodiment of the application, before step 102 (performing video encoding processing on the image sequence that has undergone image deformation processing to obtain the encoded video), the electronic device may also add noise to the video frames of the target video. The specific process may be:
Performing noise addition processing on the images in the image sequence that has undergone image deformation processing, where the noise addition processing includes one or more of: adding Gaussian noise, blurring, adding compression noise, and down-sampling followed by up-sampling.
In practical applications, the compression noise may be JPEG compression noise.
The noise addition processing above operates on individual images. Because the electronic device ultimately performs video encoding and video decoding on the target video, adding image-level noise to the video frames of the target video beforehand allows the noise produced in practical application scenarios to be simulated more closely.
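Three of the listed image-level degradations can be sketched in plain Python as below. This is only an illustration under stated assumptions: images are grayscale lists of rows with values 0 to 255, the blur is a 3x3 box blur, the resampling is nearest-neighbour, and the function names and parameters (`sigma`, `factor`) are hypothetical. JPEG compression noise is left out because it requires an image codec.

```python
import random


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to every pixel and clamp to 0..255."""
    return [[min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
            for row in image]


def box_blur(image):
    """Replace each pixel by the rounded mean of its 3x3 neighbourhood."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [image[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(round(sum(vals) / len(vals)))
        out.append(row)
    return out


def down_then_up(image, factor):
    """Nearest-neighbour down-sampling by `factor`, then up-sampling back."""
    h, w = len(image), len(image[0])
    small = [row[::factor] for row in image[::factor]]
    return [[small[y // factor][x // factor] for x in range(w)]
            for y in range(h)]


img = [[0, 64, 128, 255] for _ in range(4)]
noisy = add_gaussian_noise(img, 10.0, random.Random(0))
resampled = down_then_up(img, 2)
```

Any subset of these functions can be chained on each image of the deformed sequence before the sequence is handed to the video encoder.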
Based on the same technical concept, an embodiment of the application also provides a sample image processing apparatus. As shown in FIG. 5, the apparatus includes an image deformation module 501, an encoding module 502, a decoding module 503 and a training module 504;
The image deformation module 501 is configured to copy an original image into a first number of original images, arrange the first number of original images into an image sequence, and perform image deformation processing on the images in the image sequence other than the specified images;
The encoding module 502 is configured to perform video encoding processing on the image sequence that has undergone image deformation processing to obtain an encoded video, where each video frame in the encoded video contains video coding noise;
The decoding module 503 is configured to perform video decoding processing on the encoded video to obtain a video frame sequence, where the video frame sequence contains the first number of video frames and each video frame contains video coding noise;
The training module 504 is configured to obtain, from the video frame sequence, the specified video frames corresponding to the specified images, and to use the specified video frames and the original image as training samples for a deep learning model, so that the deep learning model trained on the training samples can repair video that contains video coding noise.
In another embodiment of the application, the image deformation module 501 is specifically configured to:
Perform random affine transformation and/or elastic transformation on the images in the image sequence other than the specified images, where the random affine transformation includes one or more of the following image deformation operations: translation, rotation and stretching.
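A random affine transformation that combines translation, rotation and stretching can be sketched as a 2x3 matrix. The sketch below is an assumption-laden illustration: the parameter ranges (`max_shift`, `max_angle`, `max_stretch`) and all function names are hypothetical, the specified images receive the identity matrix so that they stay undeformed, and elastic transformation is not covered.

```python
import math
import random

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]


def affine_matrix(dx, dy, angle_deg, sx, sy):
    """2x3 matrix that stretches by (sx, sy), rotates, then moves by (dx, dy)."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[c * sx, -s * sy, dx], [s * sx, c * sy, dy]]


def random_affine_matrix(rng, max_shift=4.0, max_angle=5.0, max_stretch=0.05):
    """Draw small random translation, rotation and stretch parameters."""
    return affine_matrix(
        rng.uniform(-max_shift, max_shift),
        rng.uniform(-max_shift, max_shift),
        rng.uniform(-max_angle, max_angle),
        1.0 + rng.uniform(-max_stretch, max_stretch),
        1.0 + rng.uniform(-max_stretch, max_stretch),
    )


def matrices_for_sequence(first_number, specified_numbers, rng):
    """One matrix per image (1-based numbering); specified images get identity."""
    specified = set(specified_numbers)
    return [IDENTITY if n in specified else random_affine_matrix(rng)
            for n in range(1, first_number + 1)]


def apply_to_point(m, x, y):
    """Map the point (x, y) through the 2x3 affine matrix m."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


matrices = matrices_for_sequence(20, [1, 5, 10, 15, 20], random.Random(0))
```

Keeping the perturbation small makes neighbouring frames resemble consecutive frames of a real video, which is what lets the sequence be treated as a video by the encoder.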
In another embodiment of the application, the image deformation module 501 is specifically configured to:
Perform, according to a preset image deformation rule, image deformation processing on the images in the image sequence other than the specified images in a preset manner and order.
In another embodiment of the application, the apparatus further includes a noise addition module;
The noise addition module is configured to perform noise addition processing on the images in the image sequence that has undergone image deformation processing, where the noise addition processing includes one or more of: adding Gaussian noise, blurring, adding compression noise, and down-sampling followed by up-sampling.
In another embodiment of the application, the encoding module 502 is specifically configured to:
Perform video encoding processing on the image sequence that has undergone image deformation processing, using a target video encoding format, to obtain the encoded video;
The decoding module 503 is specifically configured to:
Perform video decoding processing on the encoded video, using a target video decoding format, to obtain the video frame sequence.
With the sample image processing apparatus provided by this embodiment of the application, the electronic device can copy an original image into a first number of original images, arrange the first number of original images into an image sequence, and perform image deformation processing on the images in the image sequence other than the specified images. It then performs video encoding processing on the deformed image sequence to obtain an encoded video, and performs video decoding processing on the encoded video to obtain a video frame sequence. Finally, it obtains the specified video frames corresponding to the specified images in the video frame sequence and uses the specified video frames and the original image as training samples for a deep learning model; the deep learning model trained on these samples is used to repair video that contains video coding noise. This embodiment converts the original image into an image sequence and deforms the images so that the image sequence can be treated as a video. The electronic device then compresses the image sequence into a video through video encoding and decompresses that video into a video frame sequence through video decoding, obtaining specified video frames that carry the noise introduced by video encoding.
Because video noise in practice originates from video encoding and video decoding, the specified video frames can serve as training samples that match real conditions, so that the trained deep learning model is better able to repair video.
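The overall flow summarized above can be sketched with pluggable steps. Everything named here is hypothetical: `build_training_samples` takes the deformation, encoding and decoding steps as callables, and the toy "codec" below only quantizes pixel values as a stand-in for a real lossy video codec such as H.264 (whose noise would also depend on neighbouring frames). Images are represented as tuples of integers purely for illustration.

```python
def build_training_samples(original, first_number, specified_numbers,
                           deform, encode, decode):
    """Copy, deform, encode, decode, then pair specified frames with the original."""
    specified = set(specified_numbers)
    # Copy the original into a sequence; positions are numbered from 1.
    sequence = [original for _ in range(first_number)]
    # Deform every image except the specified ones.
    deformed = [img if n in specified else deform(img, n)
                for n, img in enumerate(sequence, start=1)]
    frames = decode(encode(deformed))
    if len(frames) != first_number:
        raise ValueError("decoder must return one frame per input image")
    return [(frames[n - 1], original) for n in sorted(specified)]


# Hypothetical stand-ins, for illustration only.
def toy_deform(img, n):
    """Cyclically shift the pixels by n positions."""
    k = n % len(img)
    return img[k:] + img[:k]


def toy_encode(seq):
    """Lossy step: keep only the 16-level quantization index of each pixel."""
    return [tuple(p // 16 for p in img) for img in seq]


def toy_decode(encoded):
    """Reconstruct each pixel at the centre of its quantization bin."""
    return [tuple(q * 16 + 8 for q in img) for img in encoded]


samples = build_training_samples((10, 50, 200, 255), 5, [1, 5],
                                 toy_deform, toy_encode, toy_decode)
```

In a real setup the three callables would wrap an image warping routine and an actual video encoder and decoder; passing them in keeps the sample-building logic independent of the chosen codec.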
An embodiment of the application also provides an electronic device. As shown in FIG. 6, it includes a processor 601, a communication interface 602, a memory 603 and a communication bus 604, where the processor 601, the communication interface 602 and the memory 603 communicate with one another through the communication bus 604;
The memory 603 is configured to store a computer program;
The processor 601 is configured to implement the following steps when executing the program stored in the memory 603:
Copying an original image into a first number of original images, arranging the first number of original images into an image sequence, and performing image deformation processing on the images in the image sequence other than the specified images;
Performing video encoding processing on the image sequence that has undergone image deformation processing to obtain an encoded video, where each video frame in the encoded video contains video coding noise;
Performing video decoding processing on the encoded video to obtain a video frame sequence, where the video frame sequence contains the first number of video frames and each video frame contains video coding noise;
Obtaining the specified video frames corresponding to the specified images in the video frame sequence, and using the specified video frames and the original image as training samples for a deep learning model, so that the deep learning model trained on the training samples can repair video that contains video coding noise.
It should be noted that, when executing the program stored in the memory 603, the processor 601 is also configured to implement the other steps described in the method embodiments above. For details, refer to the relevant description in those method embodiments, which is not repeated here.
The communication bus of the above-mentioned electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus and so on. For ease of representation, only one thick line is used in the figure, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the above-mentioned electronic device and other devices.
The memory may include random access memory (RAM) and may also include non-volatile memory (NVM), for example at least one disk storage. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The above-mentioned processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP) and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
Based on the same technical concept, an embodiment of the application also provides a computer-readable storage medium in which a computer program is stored; when the computer program is executed by a processor, the steps of the above sample image processing method are implemented.
Based on the same technical concept, an embodiment of the application also provides a computer program product containing instructions which, when run on a computer, causes the computer to execute the steps of the above sample image processing method.
Based on the same technical concept, an embodiment of the application also provides a computer program which, when run on a computer, causes the computer to execute any of the sample image processing methods in the above embodiments.
The above embodiments may be implemented in whole or in part by software, hardware, firmware or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (such as coaxial cable, optical fiber or digital subscriber line (DSL)) or wireless means (such as infrared, radio or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media.
The available medium may be a magnetic medium (for example, a floppy disk, a hard disk or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to that process, method, article or device. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
The embodiments in this specification are described in a related manner; for parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the apparatus embodiment is substantially similar to the method embodiment, it is described relatively briefly; for related details, refer to the corresponding parts of the method embodiment.
The above are only preferred embodiments of the application and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the application shall fall within its scope of protection.
Industrial applicability
The sample image processing method, apparatus, server and medium provided by the embodiments of the application allow training samples to contain the noise produced by video encoding. A deep learning model trained on such training samples can repair video that contains video coding noise more effectively.

Claims (12)

  1. A sample image processing method, the method comprising:
    copying an original image into a first number of original images, arranging the first number of original images into an image sequence, and performing image deformation processing on the images in the image sequence other than specified images;
    performing video encoding processing on the image sequence that has undergone image deformation processing to obtain an encoded video, wherein each video frame in the encoded video contains video coding noise;
    performing video decoding processing on the encoded video to obtain a video frame sequence, wherein the video frame sequence contains the first number of video frames and each video frame contains video coding noise;
    obtaining specified video frames corresponding to the specified images in the video frame sequence, and using the specified video frames and the original image as training samples for a deep learning model, so that the deep learning model trained on the training samples repairs video that contains video coding noise.
  2. The method according to claim 1, wherein performing image deformation processing on the images in the image sequence other than the specified images comprises:
    performing random affine transformation and/or elastic transformation on the images in the image sequence other than the specified images, wherein the random affine transformation comprises one or more of the following image deformation operations: translation, rotation and stretching.
  3. The method according to claim 1, wherein performing image deformation processing on the images in the image sequence other than the specified images comprises:
    performing, according to a preset image deformation rule, image deformation processing on the images in the image sequence other than the specified images in a preset manner and order.
  4. The method according to any one of claims 1 to 3, wherein, before performing video encoding processing on the image sequence that has undergone image deformation processing, the method further comprises:
    performing noise addition processing on the images in the image sequence that has undergone image deformation processing, wherein the noise addition processing comprises one or more of: adding Gaussian noise, blurring, adding compression noise, and down-sampling followed by up-sampling.
  5. The method according to claim 1, wherein performing video encoding processing on the image sequence that has undergone image deformation processing to obtain the encoded video comprises:
    performing video encoding processing on the image sequence that has undergone image deformation processing, using a target video encoding format, to obtain the encoded video;
    and wherein performing video decoding processing on the encoded video to obtain the video frame sequence comprises:
    performing video decoding processing on the encoded video, using a target video decoding format, to obtain the video frame sequence.
  6. A sample image processing apparatus, the apparatus comprising:
    an image deformation module, configured to copy an original image into a first number of original images, arrange the first number of original images into an image sequence, and perform image deformation processing on the images in the image sequence other than specified images;
    an encoding module, configured to perform video encoding processing on the image sequence that has undergone image deformation processing to obtain an encoded video, wherein each video frame in the encoded video contains video coding noise;
    a decoding module, configured to perform video decoding processing on the encoded video to obtain a video frame sequence, wherein the video frame sequence contains the first number of video frames and each video frame contains video coding noise;
    a training module, configured to obtain specified video frames corresponding to the specified images in the video frame sequence, and to use the specified video frames and the original image as training samples for a deep learning model, so that the deep learning model trained on the training samples repairs video that contains video coding noise.
  7. The apparatus according to claim 6, wherein the image deformation module is specifically configured to:
    perform random affine transformation and/or elastic transformation on the images in the image sequence other than the specified images, wherein the random affine transformation comprises one or more of the following image deformation operations: translation, rotation and stretching.
  8. The apparatus according to claim 6, wherein the image deformation module is specifically configured to:
    perform, according to a preset image deformation rule, image deformation processing on the images in the image sequence other than the specified images in a preset manner and order.
  9. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
    the memory is configured to store a computer program;
    the processor is configured to implement the method steps of any one of claims 1 to 5 when executing the program stored in the memory.
  10. A computer-readable storage medium in which a computer program is stored, wherein the computer program, when executed by a processor, implements the method steps of any one of claims 1 to 5.
  11. A computer program product containing instructions which, when run on a computer, causes the computer to execute the method steps of any one of claims 1 to 5.
  12. A computer program which, when run on a computer, causes the computer to execute the method steps of any one of claims 1 to 5.
PCT/CN2020/133408 2019-12-13 2020-12-02 Sample image processing method and apparatus, electronic device, and medium WO2021115180A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911282864.6A CN112995673B (en) 2019-12-13 2019-12-13 Sample image processing method and device, electronic equipment and medium
CN201911282864.6 2019-12-13

Publications (1)

Publication Number Publication Date
WO2021115180A1 true WO2021115180A1 (en) 2021-06-17

Family

ID=76329437

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/133408 WO2021115180A1 (en) 2019-12-13 2020-12-02 Sample image processing method and apparatus, electronic device, and medium

Country Status (2)

Country Link
CN (1) CN112995673B (en)
WO (1) WO2021115180A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113591655A (en) * 2021-07-23 2021-11-02 上海明略人工智能(集团)有限公司 Video contrast loss calculation method, system, storage medium and electronic device
CN114494967A (en) * 2022-01-27 2022-05-13 广东欧域科技有限公司 Fruit tree identification method, system, equipment and medium based on edge calculation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106709486A (en) * 2016-11-11 2017-05-24 南京理工大学 Automatic license plate identification method based on deep convolutional neural network
CN106960416A (en) * 2017-03-20 2017-07-18 武汉大学 A kind of video satellite compression image super-resolution method of content complexity self adaptation
CN108305214A (en) * 2017-12-28 2018-07-20 腾讯科技(深圳)有限公司 Image processing method, device, storage medium and computer equipment
US20190122115A1 (en) * 2017-10-24 2019-04-25 Vmaxx, Inc. Image Quality Assessment Using Similar Scenes as Reference
CN110996171A (en) * 2019-12-12 2020-04-10 北京金山云网络技术有限公司 Training data generation method and device for video tasks and server

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106952239A (en) * 2017-03-28 2017-07-14 厦门幻世网络科技有限公司 image generating method and device
KR102606200B1 (en) * 2018-03-06 2023-11-24 삼성전자주식회사 Electronic apparatus and control method thereof
CN109671026B (en) * 2018-11-28 2020-09-29 浙江大学 Gray level image noise reduction method based on void convolution and automatic coding and decoding neural network
CN109740505B (en) * 2018-12-29 2021-06-18 成都视观天下科技有限公司 Training data generation method and device and computer equipment
CN109949234B (en) * 2019-02-25 2020-10-02 华中科技大学 Video restoration model training method and video restoration method based on deep network
US10373300B1 (en) * 2019-04-29 2019-08-06 Deep Render Ltd. System and method for lossy image and video compression and transmission utilizing neural networks

Also Published As

Publication number Publication date
CN112995673B (en) 2023-04-07
CN112995673A (en) 2021-06-18

Similar Documents

Publication Publication Date Title
WO2021115180A1 (en) Sample image processing method and apparatus, electronic device, and medium
CN111491170B (en) Method for embedding watermark and watermark embedding device
US10032257B2 (en) Super resolution processing method, device, and program for single interaction multiple data-type super parallel computation processing device, and storage medium
WO2022127374A1 (en) Color image steganography method based on convolutional neural network
CN109325928A (en) A kind of image rebuilding method, device and equipment
US11514948B1 (en) Model-based dubbing to translate spoken audio in a video
WO2020172907A1 (en) Method and device for selecting context model of quantization coefficient end flag bit
US20200104711A1 (en) Method and apparatus for training a neural network used for denoising
CN110889824A (en) Sample generation method and device, electronic equipment and computer readable storage medium
US20100080283A1 (en) Processing real-time video
JP7499402B2 (en) End-to-End Watermarking System
CN112399249A (en) Multimedia file generation method and device, electronic equipment and storage medium
CN112927144A (en) Image enhancement method, image enhancement device, medium, and electronic apparatus
CN111429566A (en) Reconstruction method and device of virtual home decoration scene and electronic equipment
CN113033677A (en) Video classification method and device, electronic equipment and storage medium
CN114071190B (en) Cloud application video stream processing method, related device and computer program product
US10789769B2 (en) Systems and methods for image style transfer utilizing image mask pre-processing
CN111048065A (en) Text error correction data generation method and related device
WO2022178975A1 (en) Noise field-based image noise reduction method and apparatus, device, and storage medium
CN115393756A (en) Visual image-based watermark identification method, device, equipment and medium
CN113423016A (en) Video playing method, device, terminal and server
WO2023082773A1 (en) Video encoding method and apparatus, video decoding method and apparatus, and device, storage medium and computer program
CN114070950B (en) Image processing method, related device and equipment
WO2024007135A1 (en) Image processing method and apparatus, terminal device, electronic device, and storage medium
US12041138B1 (en) Computer-implemented methods for synthetic accuracy measurement of a content recognition system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20900515

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20900515

Country of ref document: EP

Kind code of ref document: A1