WO2024041482A1 - Image processing method, apparatus, and *** - Google Patents

Image processing method, apparatus, and ***

Info

Publication number
WO2024041482A1
Authority
WO
WIPO (PCT)
Prior art keywords
image frames
consecutive
image
downsampling
training
Prior art date
Application number
PCT/CN2023/114021
Other languages
English (en)
French (fr)
Inventor
王鑫
黄婧
郑晓旭
秘谧
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2024041482A1


Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132 — Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N19/169 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 — the unit being an image region, e.g. an object
    • H04N19/172 — the region being a picture, frame or field
    • H04N19/20 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
    • H04N19/44 — Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder

Definitions

  • the present application relates to the field of image processing technology, and in particular, to an image processing method, device and system.
  • image processing usually requires a series of processes such as compression, transmission, and decompression.
  • Compressing images can reduce the bandwidth of image transmission.
  • Specifically, the sending end usually downsamples the image first, and then compresses and transmits the sub-image obtained by the downsampling; after decompression, the receiving end uses super-resolution technology to reconstruct the decompressed image.
  • In order to restore the image information lost during downsampling and compression at the sending end, the receiving end, when reconstructing images through super-resolution technology, usually exploits the inter-frame correlation between pixels and the sub-pixel misalignment of moving objects across frames.
  • However, this method can only partially restore the information of moving regions lost during downsampling and compression; it cannot recover the information of stationary regions lost during those steps, so the restoration effect is poor.
  • the present application provides an image processing method, device and system, which solves the problem in the prior art of poor restoration effect when restoring downsampled images based on super-resolution technology.
  • In a first aspect, an image processing method is provided. The method includes: acquiring image data.
  • the image data may be high-resolution video data.
  • The image data includes a plurality of consecutive first image frames, also called consecutive multi-frame images. Downsampling is performed on the plurality of consecutive first image frames respectively to obtain a plurality of consecutive second image frames, and the plurality of consecutive second image frames correspond one-to-one to the plurality of consecutive first image frames.
  • Among the plurality of consecutive second image frames, at least two adjacent second image frames have different sampling points in the same pixel module, where a sampling point may be a pixel or a sub-pixel.
  • In this way, sampling points at different phases are captured by the downsampling, so that among the multiple consecutive second image frames at least two adjacent second image frames have different sampling points in the same pixel module; that is, the downsampling obtains pixel information at different phases of the multiple consecutive first image frames. When super-resolution processing is subsequently performed on the multiple consecutive second image frames, the correlation of pixel information between them can therefore be used effectively to compensate for the pixel information each second image frame lost during downsampling, so that the error between the restored image frames and the original first image frames is smaller, the authenticity of the restoration is improved, and the restoration effect is better.
  • In a possible implementation, performing downsampling on the plurality of consecutive first image frames to obtain the plurality of consecutive second image frames includes: performing downsampling on the plurality of consecutive first image frames based on at least two preset phases, where each first image frame corresponds to one of the at least two preset phases and two adjacent first image frames correspond to different preset phases. Each first image frame is downsampled to obtain one second image frame, so that the multiple consecutive first image frames yield multiple consecutive second image frames.
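  • The patent does not disclose concrete code; purely as an illustrative sketch, phase-alternating downsampling can be expressed as follows, where the function name, the 2×2 pixel module, and the two preset phases (0, 0) and (1, 1) are all assumptions made for the example:

```python
import numpy as np

def downsample_alternating(frames, phases=((0, 0), (1, 1)), factor=2):
    """Downsample each frame by `factor`, cycling through preset phase
    offsets so that adjacent output frames sample different positions
    within the same pixel module."""
    out = []
    for i, frame in enumerate(frames):
        dy, dx = phases[i % len(phases)]   # preset phase for this frame
        out.append(frame[dy::factor, dx::factor])
    return out

# Four consecutive 4x4 "first image frames" (frame i is offset by 100*i)
frames = [np.arange(16).reshape(4, 4) + 100 * i for i in range(4)]
small = downsample_alternating(frames)
print(small[0][0, 0], small[1][0, 0])   # adjacent frames use different phases
```

Because the phase cycles per frame, consecutive "second image frames" carry pixel information from different positions of the same pixel module, which is the property the method relies on.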
  • In another possible implementation, performing downsampling on the plurality of consecutive first image frames to obtain the plurality of consecutive second image frames includes: downsampling each of the plurality of consecutive first image frames based on at least two preset phases to obtain at least two candidate image frames per first image frame; and selecting one candidate image frame from the at least two candidates corresponding to each first image frame, such that for any two adjacent first image frames the selected candidates were downsampled with different preset phases, thereby obtaining the plurality of consecutive second image frames.
  • Among these, at least two adjacent second image frames have different sampling points in the same pixel module, so that pixel information at different phases of the multiple consecutive first image frames is sampled.
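  • As a hedged sketch only (the patent does not specify the selection rule), this candidate-based variant can be illustrated as: build one candidate sub-image per preset phase, then greedily pick one candidate per frame so that adjacent frames never reuse a phase. The function names and the two phases are assumptions for the example:

```python
import numpy as np

def candidate_frames(frame, phases, factor=2):
    # One candidate sub-image per preset phase
    return {p: frame[p[0]::factor, p[1]::factor] for p in phases}

def select_alternating(frames, phases=((0, 0), (1, 1))):
    """First build every candidate, then pick one candidate per frame
    such that two adjacent frames never use the same preset phase."""
    picked, prev = [], None
    for frame in frames:
        cands = candidate_frames(frame, phases)
        phase = next(p for p in phases if p != prev)   # differ from neighbour
        picked.append(cands[phase])
        prev = phase
    return picked

frames = [np.arange(16).reshape(4, 4) + 100 * i for i in range(4)]
picked = select_alternating(frames)
```

With more than two preset phases, any selection rule that avoids repeating the previous frame's phase would satisfy the stated constraint.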
  • In a possible implementation, a downsampling network is used to downsample the multiple consecutive first image frames to obtain the multiple consecutive second image frames, where the downsampling network is obtained by training.
  • Using a trained downsampling network improves the accuracy and efficiency of the downsampling process, which in turn improves the authenticity of the image frames obtained when the multiple consecutive second image frames are later restored through super-resolution processing, thereby ensuring the restoration effect.
  • In a possible implementation, the method further includes: performing downsampling training on multiple training image frames to obtain multiple sampled image frames, among which at least two adjacent sampled image frames have different sampling points in the same pixel module; performing super-resolution training on the multiple sampled image frames to obtain multiple training-restored image frames; and determining the downsampling network based on the multiple training-restored image frames and the multiple training image frames.
  • In this training stage, encoding and decoding operations are not yet considered; downsampling training and super-resolution training are performed directly on the multiple training image frames to obtain the downsampling network, ensuring that the trained downsampling network performs well.
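  • The patent does not disclose network architectures or loss functions. Purely to make the training idea concrete, here is a toy stand-in in which the "downsampling network" is a single learnable 2×2 weighting, the "super-resolution" step is fixed nearest-neighbour upsampling, and the weights are fitted by plain gradient descent on the reconstruction error between restored and original frames; every name and choice here is an assumption for illustration, not the patent's method:

```python
import numpy as np

rng = np.random.default_rng(0)

def down(frame, w):
    # Learnable weighted 2x2 block reduction (toy "downsampling network")
    h, wd = frame.shape
    blocks = frame.reshape(h // 2, 2, wd // 2, 2)
    return np.einsum('ibjc,bc->ij', blocks, w)

def up(small):
    # Fixed stand-in for the super-resolution step: nearest-neighbour upsampling
    return np.kron(small, np.ones((2, 2)))

frames = [rng.random((8, 8)) for _ in range(16)]
w = np.array([[1.0, 0.0], [0.0, 0.0]])   # start from naive corner sampling

def mse(w):
    return np.mean([np.mean((up(down(x, w)) - x) ** 2) for x in frames])

initial = mse(w)
for _ in range(300):                      # gradient descent, analytic gradient
    grad = np.zeros_like(w)
    for x in frames:
        err = up(down(x, w)) - x
        eh = err.reshape(4, 2, 4, 2).sum(axis=(1, 3))   # residual per low-res cell
        grad += np.einsum('ij,ibjc->bc', eh, x.reshape(4, 2, 4, 2))
    w -= 0.01 * grad / len(frames)
final = mse(w)
```

The fitted weights drift toward an averaging kernel, i.e. the downsampler is shaped by how well its output can later be restored, which is the point of training the two stages jointly.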
  • In a possible implementation, the method further includes: encoding the plurality of consecutive second image frames to obtain encoded image data.
  • Encoding improves the transmission efficiency of the multiple consecutive second image frames and reduces the storage space they occupy.
  • In a second aspect, an image processing method is provided. The method includes: acquiring multiple consecutive second image frames, which are obtained by respectively downsampling multiple consecutive first image frames and correspond to them one-to-one; among the multiple consecutive second image frames, at least two adjacent second image frames have different sampling points in the same pixel module, and the sampling points are pixels or sub-pixels; and performing super-resolution processing on the multiple consecutive second image frames to obtain multiple consecutive third image frames, which correspond one-to-one to the multiple consecutive second image frames.
  • Because at least two adjacent second image frames have different sampling points in the same pixel module, the multiple consecutive second image frames contain pixel information of the same object at different phases. When super-resolution processing is performed on these frames, the correlation of pixel information across them can be used effectively to compensate for the pixel information each second image frame lost during downsampling, making the error between the restored image frames and the original frames smaller, improving the authenticity of the restoration, and thus producing a better restoration effect.
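  • To see why complementary phases help precisely the stationary regions that motion-based super-resolution cannot recover, consider a toy merge of two adjacent downsampled frames of a static scene: placing each frame's samples back at their original grid positions exactly recovers twice as many high-resolution pixels as either frame alone. This is an illustrative sketch, not the patent's reconstruction network:

```python
import numpy as np

def merge_static(small_a, small_b, phase_a, phase_b, factor=2):
    """For a stationary region, put each frame's samples back at their
    original high-resolution grid positions; together the two phases
    fill twice as many pixels as either frame alone."""
    h, w = small_a.shape
    hi = np.full((h * factor, w * factor), np.nan)   # NaN = still unknown
    hi[phase_a[0]::factor, phase_a[1]::factor] = small_a
    hi[phase_b[0]::factor, phase_b[1]::factor] = small_b
    return hi

x = np.arange(16, dtype=float).reshape(4, 4)   # a static high-res region
a = x[0::2, 0::2]                              # frame t,   phase (0, 0)
b = x[1::2, 1::2]                              # frame t+1, phase (1, 1)
hi = merge_static(a, b, (0, 0), (1, 1))
```

A learned super-resolution network would additionally interpolate the remaining positions; the point here is only that the known positions come from distinct phases and are exact even without motion.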
  • In a possible implementation, obtaining the multiple consecutive second image frames includes: obtaining encoded image data and decoding it to obtain the multiple consecutive second image frames.
  • In a possible implementation, performing super-resolution processing on the multiple consecutive second image frames to obtain the multiple consecutive third image frames includes: using a super-resolution network to perform super-resolution processing on the multiple consecutive second image frames to obtain the multiple consecutive third image frames.
  • In a possible implementation, the method further includes: performing super-resolution training on multiple sampled-degraded image frames to obtain the super-resolution network, where the multiple sampled-degraded image frames are obtained by encoding and decoding multiple sampled image frames, and the multiple sampled image frames are obtained by downsampling multiple training image frames using a downsampling network.
  • That is, when training the super-resolution network, downsampling training and super-resolution training are first performed on the multiple training image frames to obtain the downsampling network; the downsampling network is then fixed, its output sampled image frames are encoded and decoded, and the resulting degraded frames are used for super-resolution training to obtain the super-resolution network. This achieves complete end-to-end training from downsampling to super-resolution, ensuring that both the trained downsampling network and the trained super-resolution network perform well.
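  • The stage-2 data preparation can be sketched as follows, with plain stride sampling standing in for the frozen downsampling network and uniform quantisation standing in for the encode/decode step; both stand-ins, the function names, and the quantisation step are assumptions made for this example, since the patent does not name a specific codec:

```python
import numpy as np

def down(frame, factor=2):
    # Stand-in for the frozen stage-1 downsampling network
    return frame[::factor, ::factor]

def codec(frame, step=0.125):
    # Stand-in lossy encode/decode: uniform quantisation of pixel values
    return np.round(frame / step) * step

rng = np.random.default_rng(1)
training_frames = [rng.random((8, 8)) for _ in range(4)]

# Stage-2 training pairs: (sampled-degraded low-res input, high-res target).
# The super-resolution network is trained on these pairs while the
# downsampling network stays fixed, so it learns to undo both losses.
pairs = [(codec(down(f)), f) for f in training_frames]
```

Training on degraded rather than clean sampled frames is what lets the super-resolution network compensate for codec artifacts as well as downsampling loss.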
  • In a third aspect, an image processing device is provided, including: an acquisition unit for acquiring image data, the image data including a plurality of consecutive first image frames; and a downsampling unit for downsampling the plurality of consecutive first image frames to obtain a plurality of consecutive second image frames that correspond one-to-one to the plurality of consecutive first image frames.
  • Among the plurality of consecutive second image frames, at least two adjacent second image frames have different sampling points in the same pixel module, and the sampling points are pixels or sub-pixels.
  • In a possible implementation, the downsampling unit is further configured to downsample the plurality of consecutive first image frames based on at least two preset phases, where each first image frame corresponds to one of the at least two preset phases, the preset phases corresponding to two adjacent first image frames are different, and the plurality of consecutive first image frames correspond to the plurality of consecutive second image frames.
  • In a possible implementation, the downsampling unit is also configured to: downsample each of the plurality of consecutive first image frames based on at least two preset phases to obtain at least two candidate image frames; and select one candidate image frame from the at least two candidates corresponding to each first image frame, such that for any two adjacent first image frames the selected candidates were downsampled with different preset phases, thereby obtaining the plurality of consecutive second image frames.
  • In a possible implementation, the downsampling unit is further configured to use a downsampling network to downsample the plurality of consecutive first image frames to obtain the plurality of consecutive second image frames, where the downsampling network is obtained by training.
  • In a possible implementation, the device further includes a training unit configured to: perform downsampling training on multiple training image frames to obtain multiple sampled image frames, among which at least two adjacent sampled image frames have different sampling points in the same pixel module; perform super-resolution training on the multiple sampled image frames to obtain multiple training-restored image frames; and determine the downsampling network based on the multiple training-restored image frames and the multiple training image frames.
  • the device further includes: a coding unit, configured to code the plurality of consecutive second image frames to obtain coded image data.
  • In a fourth aspect, an image processing device is provided, including: an acquisition unit configured to acquire a plurality of consecutive second image frames, which are obtained by downsampling a plurality of consecutive first image frames, where at least two adjacent second image frames among the plurality of second image frames have different sampling points in the same pixel module, and the sampling points are pixels or sub-pixels; and a super-resolution unit configured to perform super-resolution processing on the plurality of consecutive second image frames to obtain a plurality of consecutive third image frames, which correspond one-to-one to the plurality of consecutive second image frames.
  • In a possible implementation, the device further includes a decoding unit; the acquisition unit is also configured to obtain encoded image data, and the decoding unit is configured to decode the encoded image data to obtain the multiple consecutive second image frames.
  • the super-resolution unit is also used to perform super-resolution processing on the plurality of consecutive second image frames using a super-resolution network to obtain the plurality of consecutive third image frames.
  • In a possible implementation, the device further includes a training unit for performing super-resolution training on multiple sampled-degraded image frames to obtain the super-resolution network, where the multiple sampled-degraded image frames are obtained by encoding and decoding multiple sampled image frames, and the multiple sampled image frames are obtained by downsampling multiple training image frames using a downsampling network.
  • Another aspect of the present application provides an image processing system, which includes: any image processing device provided in the third aspect or any possible implementation of the third aspect, and any image processing device provided in the fourth aspect or any possible implementation of the fourth aspect.
  • In another aspect, the image processing system includes a processor and a memory storing instructions; when the instructions are executed, the image processing system performs the image processing method provided by the first aspect or any possible implementation of the first aspect.
  • In another aspect, a computer-readable storage medium stores a computer program or instructions; when the computer program or instructions are run, the image processing method provided by the first aspect or any possible implementation of the first aspect is implemented.
  • Another aspect of the present application provides a computer program product; when the computer program product runs on a computer, it causes the computer to execute the image processing method provided by the first aspect or any possible implementation of the first aspect.
  • Any device, system, computer storage medium, or computer program product provided above is used to execute the corresponding method provided above; for the beneficial effects it can achieve, refer to the beneficial effects of the corresponding method, which are not repeated here.
  • Figure 1 is a schematic structural diagram of an image processing system provided by an embodiment of the present application.
  • Figure 2 is a schematic structural diagram of another image processing system provided by an embodiment of the present application.
  • Figure 3 is a schematic flowchart of an image processing method provided by an embodiment of the present application.
  • Figure 4 is a schematic diagram of downsampling provided by an embodiment of the present application.
  • Figure 5 is a schematic diagram of another downsampling provided by an embodiment of the present application.
  • Figure 6 is a schematic flowchart of another image processing method provided by an embodiment of the present application.
  • Figure 7 is a schematic diagram of an image processing system processing multiple image frames provided by an embodiment of the present application.
  • Figure 8 is a schematic diagram of another image processing system processing multiple image frames provided by an embodiment of the present application.
  • Figure 9 is a schematic diagram of training a downsampling network provided by an embodiment of the present application.
  • Figure 10 is a schematic diagram of training a super-resolution network provided by an embodiment of the present application.
  • Figure 11 is a schematic structural diagram of an image processing device provided by an embodiment of the present application.
  • Figure 12 is a schematic structural diagram of another image processing device provided by an embodiment of the present application.
  • At least one refers to one or more, and “plurality” refers to two or more.
  • "And/or" describes an association between associated objects and indicates three possible relationships; for example, A and/or B can mean: A exists alone, both A and B exist, or B exists alone, where A and B can be singular or plural.
  • "At least one of the following" and similar expressions refer to any combination of the listed items, including any combination of single items or plural items.
  • For example, at least one of a, b, or c can represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c can each be single or multiple.
  • The character "/" generally indicates that the related objects are in an "or" relationship.
  • words such as “first” and “second” do not limit the number and execution order.
  • the technical solutions provided in this application can be applied to a variety of different image processing systems, which may be image encoding and decoding systems, image storage systems, or video shooting systems (such as security systems).
  • the image processing system may be one electronic device or may include multiple electronic devices.
  • the electronic devices include but are not limited to: mobile phones, tablets, computers, laptops, camcorders, cameras, wearable devices, vehicle-mounted devices or terminal devices, etc.
  • The image processing system can be used to downsample high-resolution image frames, and can also be used to perform at least one of the following processes: encoding, noise-reduction, or deblurring of the downsampled image frames; storing the processed image data; decoding the image data; performing super-resolution processing on low-resolution image frames; and so on.
  • the specific structure of the image processing system is illustrated below with an example.
  • FIG 1 is a schematic structural diagram of an image processing system provided by an embodiment of the present application.
  • the image processing system is explained by taking a mobile phone as an example.
  • The mobile phone, or a chip system built into the mobile phone, includes: a memory 101, a processor 102, a sensor component 103, a multimedia component 104, and an input/output interface 105.
  • the following is a detailed introduction to each component of a mobile phone or a chip system built into a mobile phone with reference to Figure 1.
  • The memory 101 can be used to store data, software programs, and modules; it mainly includes a program storage area and a data storage area.
  • The program storage area can store software programs, including instructions formed in code, including but not limited to an operating system and applications required by at least one function, such as a sound playback function or an image playback function; the data storage area can store data created through use of the mobile phone, such as audio data, image data, and a phone book.
  • the memory 101 can be used to store face images, illumination information databases, images to be evaluated, etc.
  • The memory may include floppy disks; hard disks, such as built-in hard disks and removable hard disks; magnetic disks; optical disks; magneto-optical disks, such as CD-ROM and DVD-ROM; non-volatile storage devices, such as RAM, ROM, PROM, EPROM, EEPROM, and flash memory; or any other form of storage medium known in the technical field.
  • The processor 102 is the control center of the mobile phone; it uses various interfaces and lines to connect the parts of the entire device, and it performs the various functions of the phone and processes data by running or executing the software programs and/or modules stored in the memory 101 and calling the data stored in the memory 101, thereby monitoring the phone as a whole.
  • the processor 102 may be used to perform one or more steps in the method embodiment of the present application.
  • For example, the processor 102 may be used to perform one or more of steps S202 to S204 in the following method embodiments.
  • The processor 102 may be a single-processor structure, a multi-processor structure, a single-threaded processor, a multi-threaded processor, etc.; in some feasible embodiments, the processor 102 may include at least one of a central processing unit, a general-purpose processor, a digital signal processor, a neural network processor, an image processing unit, an image signal processor, a microcontroller, or a microprocessor. In addition, the processor 102 may further include other hardware circuits or accelerators, such as application-specific integrated circuits, field-programmable gate arrays or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof.
  • the processor 102 may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with this disclosure.
  • the processor 102 may also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, and so on.
  • the sensor component 103 includes one or more sensors for providing various aspects of status assessment for the mobile phone.
  • For example, the sensor component 103 may include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications, that is, as an integral part of a camera.
  • The sensor component 103 can be used to support the camera in the multimedia component 104 in obtaining facial images, etc.
  • The sensor component 103 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor, through which the acceleration/deceleration, orientation, and open/closed status of the mobile phone, the relative positioning of components, or temperature changes of the mobile phone can be detected.
  • the multimedia component 104 provides a screen of an output interface between the mobile phone and the user.
  • the screen may be a touch panel, and when the screen is a touch panel, the screen may be implemented as a touch screen to receive input signals from the user.
  • the touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide action.
  • the multimedia component 104 also includes at least one camera.
  • the multimedia component 104 includes a front camera and/or a rear camera. When the mobile phone is in an operating mode, such as shooting mode or video mode, the front camera and/or the rear camera can sense external multimedia signals, which are used to form image frames.
  • Each front-facing camera and rear-facing camera can be a fixed optical lens system or have focal-length and optical-zoom capability.
  • the input/output interface 105 provides an interface between the processor 102 and peripheral interface modules.
  • the peripheral interface modules may include a keyboard, a mouse, or a USB (Universal Serial Bus) device.
  • There may be only one input/output interface 105, or there may be multiple input/output interfaces.
  • the mobile phone may also include audio components and communication components.
  • For example, the audio component includes a microphone, and the communication component includes a wireless fidelity (WiFi) module, a Bluetooth module, and the like.
  • FIG. 2 is a schematic structural diagram of another image processing system provided by an embodiment of the present application.
  • the image processing system is explained by taking a video shooting system as an example.
  • the video shooting system includes multiple security devices (also called edge devices) 201 and a server 202.
  • the multiple security devices 201 and the server 202 can be connected in a wired or wireless manner.
  • the multiple security devices 201 may be multiple video camera devices, used for shooting and processing video data, and transmitting the video data to the server 202 .
  • For example, the multiple security devices 201 can downsample the captured video data, and can also be used to perform one or more of the following processes on the video data: encoding, noise reduction, deblurring, feature extraction, feature comparison, etc.
  • the plurality of security devices 201 may include various cameras such as pinhole cameras, dome cameras, infrared cameras, mobile phones, tablets, or other devices with video shooting functions.
  • the server 202 may be used to receive and store video data transmitted by the plurality of security devices 201, and to process the video data, among other functions.
  • For example, the server 202 can be used to downsample the video data and to perform super-resolution processing on it; it can also be used to perform one or more of encoding and decoding, noise reduction, deblurring, feature extraction, feature comparison, image retrieval, and other processes on the video data.
  • the server 202 can also be used to uniformly manage and configure the multiple security devices 201 .
  • the server 202 can be used to perform identity authentication on the multiple security devices 201, transmit partial processing results of video data to the multiple security devices 201, etc.
  • the server 202 may be a cloud server in a cloud data center, and the cloud data center may include one or more cloud servers. Cloud data centers can be used to provide users with services such as video sharing, video analysis, and big data applications.
  • the video shooting system may also include a storage device, which may be connected to the server 202 through a bus, and the storage device may be used to store images or video-related data.
  • the server 202 may store the downsampled image data in the storage device.
  • the server 202 can also obtain the image data from the storage device through a bus, and perform super-resolution processing on the image data.
  • the structure shown in FIG. 1 and FIG. 2 does not constitute a limitation on the image processing system, which may include more or fewer components than shown in the figures, combine certain components, or adopt a different arrangement of components.
  • FIG. 3 is a schematic flowchart of an image processing method provided by an embodiment of the present application. The method can be executed by the image processing system shown in FIG. 1 or FIG. 2. Referring to FIG. 3, the method can include the following steps.
  • the image data may be high-resolution video data, and the multiple consecutive first image frames in the image data may be multiple consecutive image frames in the video data, or may be referred to as continuous multi-frame images.
  • the resolution of each first image frame in the plurality of first image frames may be the same.
  • the resolution of each first image frame in the image data may be 1280×720 or 1920×1080.
  • the image processing system can be a terminal device such as a mobile phone, a camera, or a vehicle-mounted device.
  • the terminal device can photograph objects in the surrounding environment through a camera or a device with an image capturing function such as a camera, and obtain the image data.
  • the image processing system may include a server and a security device.
  • the security device may capture objects in the surrounding environment through a camera to obtain the image data and then perform the following step S202; alternatively, after obtaining the image data, the security device sends the image data to the server in a wired or wireless manner.
  • the server receives the image data and executes the following step S202.
  • S202 Perform downsampling processing on the plurality of consecutive first image frames respectively to obtain a plurality of consecutive second image frames.
  • the plurality of second image frames correspond to the plurality of first image frames in a one-to-one correspondence.
  • among the plurality of consecutive second image frames, at least two adjacent second image frames have different sampling points in the same pixel module.
  • downsampling each first image frame yields one second image frame; thus, downsampling the plurality of consecutive first image frames respectively yields the plurality of consecutive second image frames.
  • the number of consecutive first image frames is equal to the number of the plurality of consecutive second image frames.
  • the plurality of consecutive second image frames are obtained by down-sampling the plurality of consecutive first image frames, so that the resolution of the plurality of consecutive second image frames is smaller than that of the plurality of consecutive first image frames.
  • the resolution of the plurality of first image frames may be 1920×1080, and
  • the resolution of the plurality of second image frames may be 640×480.
  • the sampling points of a second image frame may be pixels or sub-pixels in the corresponding first image frame; that is, the pixels or sub-pixels in the first image frame are sampled to obtain the corresponding second image frame.
  • the phase of the sampling point may refer to the phase of the sampled pixel or sub-pixel, and the phase may also be understood as the position of the sampled pixel or sub-pixel in the first image frame.
  • the at least two adjacent second image frames that have different sampling points in the same pixel module may be some of the plurality of second image frames, or may be all of the plurality of second image frames.
  • the same pixel module may refer to pixel modules located at the same position in different image frames.
  • the image processing system may perform one down-sampling on the first image frame, or may perform multiple down-sampling on the first image frame.
  • the image frame obtained by one downsampling may be selected as the second image frame.
  • the image processing system may divide the first image frame into multiple pixel modules (which may also be referred to as image blocks), and sample the pixels or sub-pixels located at the same position within each of the multiple pixel modules.
  • when the image processing system performs downsampling once for each first image frame among the plurality of consecutive first image frames, the specific process of obtaining the plurality of consecutive second image frames may include: performing downsampling processing on each of the plurality of consecutive first image frames based on at least two preset phases, where each first image frame corresponds to one preset phase among the at least two preset phases and adjacent first image frames correspond to different preset phases, so that downsampling the plurality of consecutive first image frames yields the plurality of consecutive second image frames.
  • for example, suppose each divided pixel module includes 2×2 pixels, and the 2×2 pixels respectively correspond to four preset phases P1 to P4; the down-sampling process may then include: downsampling the pixels corresponding to preset phase P1 in the first image frame F11 to obtain the second image frame F21; downsampling the pixels corresponding to preset phase P2 in F12 to obtain F22; downsampling the pixels corresponding to preset phase P3 in F13 to obtain F23; and downsampling the pixels corresponding to preset phase P4 in F14 to obtain F24.
  • the above four preset phases P1 to P4 are described by taking as an example a pixel module composed of 2×2 pixels, in which each pixel position in the module corresponds to one phase.
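This phase-cycled variant can be sketched in a few lines of numpy. The sketch below assumes 2×2 pixel modules with one preset phase per frame, cycled frame by frame; the function names are illustrative, not identifiers from this application.

```python
import numpy as np

# P1..P4 as (row, col) offsets inside each 2x2 pixel module:
# top-left, top-right, bottom-left, bottom-right.
PHASES = [(0, 0), (0, 1), (1, 0), (1, 1)]

def downsample_at_phase(frame, phase):
    """Keep one pixel per 2x2 module, at the (row, col) offset `phase`."""
    r, c = phase
    return frame[r::2, c::2]

def phase_cycled_downsample(frames):
    """Assign phase P(k mod 4) to the k-th frame and downsample it."""
    return [downsample_at_phase(f, PHASES[k % len(PHASES)])
            for k, f in enumerate(frames)]

# Four consecutive 4x4 "frames" F11..F14: each downsampled frame F21..F24
# is 2x2, and adjacent frames sample different positions of the modules.
frames = [np.arange(16).reshape(4, 4) + 100 * k for k in range(4)]
low_res = phase_cycled_downsample(frames)
```

Because the sampled offset advances with the frame index, adjacent low-resolution frames carry pixels from different positions of the same pixel module, which is exactly the property the method relies on.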
  • the correspondence between each first image frame in the plurality of consecutive first image frames and one of the at least two preset phases may be random or set in advance, which is not specifically limited in the embodiments of the present application.
  • the image processing system can perform multiple downsampling on each of the multiple consecutive first image frames to obtain the multiple consecutive second image frames.
  • the specific process may include: performing downsampling processing on each of the plurality of consecutive first image frames based on at least two preset phases to obtain at least two candidate image frames per first image frame; and selecting one candidate image frame from the at least two candidate image frames corresponding to each first image frame, such that for at least two adjacent first image frames the preset phases used when downsampling the selected candidate image frames are different, thereby obtaining the plurality of consecutive second image frames.
  • for example, suppose each divided pixel module includes 2×2 pixels, and the 2×2 pixels respectively correspond to the four preset phases P1 to P4; the downsampling process may then include: downsampling the pixels corresponding to preset phases P1 to P4 in the first image frame F11 to obtain four candidate image frames 4×F21'; downsampling the pixels corresponding to preset phases P1 to P4 in the first image frame F12 to obtain four candidate image frames 4×F22'; downsampling the pixels corresponding to preset phases P1 to P4 in the first image frame F13 to obtain four candidate image frames 4×F23'; downsampling the pixels corresponding to preset phases P1 to P4 in the first image frame F14 to obtain four candidate image frames 4×F24'; and selecting one candidate image frame from the four candidates corresponding to each of F11 to F14 to obtain four consecutive second image frames F21 to F24.
  • the selection can be made randomly or according to preset rules.
  • the embodiments of this application do not specifically limit this.
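The candidate-based variant above can be sketched as follows. For illustration, a stricter selection rule than the text requires is assumed (adjacent frames always use different phases, which trivially satisfies "at least two adjacent frames differ"); names like `select_candidates` are illustrative.

```python
import random
import numpy as np

PHASES = [(0, 0), (0, 1), (1, 0), (1, 1)]  # P1..P4 in a 2x2 pixel module

def all_candidates(frame):
    """One candidate image frame per preset phase (the 4 x F2k' frames)."""
    return {p: frame[p[0]::2, p[1]::2] for p in PHASES}

def select_candidates(frames, seed=0):
    """Pick one candidate per frame; adjacent frames get different phases."""
    rng = random.Random(seed)
    chosen, prev_phase = [], None
    for frame in frames:
        phase = rng.choice([p for p in PHASES if p != prev_phase])
        chosen.append((phase, all_candidates(frame)[phase]))
        prev_phase = phase
    return chosen

frames = [np.zeros((4, 4)) for _ in range(4)]   # F11..F14 (toy content)
second_frames = select_candidates(frames)       # [(phase, F2k), ...]
```

Swapping the random choice for a fixed schedule gives the preset-rule variant mentioned above; both satisfy the phase-difference condition.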
  • the image processing system can use a down-sampling network to perform down-sampling processing on the plurality of consecutive first image frames to obtain the plurality of consecutive second image frames.
  • the down-sampling network can perform down-sampling according to any of the above possible implementation methods, and the embodiments of the present application do not specifically limit this.
  • the downsampling network can be obtained by training multiple training images using deep learning.
  • the downsampling network may include a spatial to depth (S2D) layer, a fusion layer, a convolution layer and a dimensionality reduction layer.
  • the S2D layer can be used to downsample the pixels or sub-pixels of each first image frame based on at least two preset phases to obtain at least two candidate image frames corresponding to each first image frame;
  • the fusion layer can be used to overlap and fuse all candidate image frames corresponding to the multiple consecutive first image frames to obtain fused image data;
  • the convolution layer can be used to perform a convolution operation on the fused image data;
  • the dimensionality reduction layer can be used to perform dimensionality reduction processing on the fused image data after the convolution operation, to output the multiple second image frames.
  • the S2D layer can be implemented by the pixel shuffle operator in a deep learning algorithm, the fusion layer by the concatenation operator (concat), the convolution layer by the convolution operator, and the dimensionality reduction layer by a dimensionality reduction operator.
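A rough numpy sketch of the data flow through these four layers is given below. The 1×1 channel-mixing weights stand in for a trained convolution and are random here, so only the tensor shapes (not the values) are meaningful; all names are assumptions for illustration.

```python
import numpy as np

def space_to_depth(frame, block=2):
    """S2D layer: rearrange each block x block module into block**2 channels,
    one channel per phase (channel 0 = top-left, 1 = top-right, ...)."""
    h, w = frame.shape
    out = frame.reshape(h // block, block, w // block, block)
    return out.transpose(1, 3, 0, 2).reshape(block * block, h // block, w // block)

def fuse(frames):
    """Fusion layer: concatenate the candidate channels of all frames."""
    return np.concatenate([space_to_depth(f) for f in frames], axis=0)

def mix_and_reduce(fused, n_frames, rng=np.random.default_rng(0)):
    """Convolution + dimensionality reduction, sketched as a 1x1 channel mix
    that keeps one output map per frame (random weights, shapes only)."""
    weights = rng.standard_normal((n_frames, fused.shape[0]))
    return np.einsum('oc,chw->ohw', weights, fused)

frames = [np.arange(16.).reshape(4, 4) for _ in range(3)]
fused = fuse(frames)                              # (3 frames x 4 phases, 2, 2)
reduced = mix_and_reduce(fused, n_frames=len(frames))
```

In a real network the 1×1 mix would be a learned convolution and the reduction a learned operator, but the shape bookkeeping, full resolution in, one quarter-resolution map per frame out, is the same.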
  • in this way, sampling points at different phases can be sampled, so that the phases of the sampling points differ among the plurality of consecutive second image frames obtained by downsampling. For both stationary objects and dynamic objects in the multiple consecutive first image frames, the sampled second image frames therefore contain pixel information from different positions of those objects, and this pixel information of the same object at different positions across the consecutive second image frames can be effectively used to restore the object, ensuring a better restoration effect.
  • S203 Encode the plurality of consecutive second image frames to obtain image encoding data.
  • encoding the plurality of consecutive second image frames may also be referred to as compressing the plurality of consecutive second image frames.
  • the image processing system can perform compression encoding on the multiple continuous second image frames according to a certain coding standard to obtain image coded data.
  • the image processing system can compress and encode the multiple consecutive second image frames according to the coding standard H.265 or H.264.
  • the image processing system may not encode the multiple consecutive second image frames but instead perform other processing such as denoising or deblurring on them, or may both encode the multiple consecutive second image frames and perform denoising or deblurring, which is not specifically limited in the embodiments of the present application.
  • when the image processing system is a terminal device, it may include a memory; when the image processing system is a video shooting system, it may include a storage device. After the image processing system encodes the plurality of consecutive second image frames to obtain the image coded data, it can store the image coded data in the memory or storage device.
  • the method may also include: S204-S205.
  • S204 Decode the image coded data to obtain the plurality of consecutive second image frames.
  • the image processing system can obtain the image coded data from the memory or storage device, and decode it according to the encoding method used, to obtain the plurality of consecutive second image frames.
  • S205 Perform super-resolution processing on the multiple continuous second image frames to obtain multiple continuous third image frames.
  • when the image processing system obtains the multiple continuous second image frames, it can perform super-resolution processing on them to utilize the pixel or sub-pixel correlation between the frames, compensating for the pixel information lost by each second image frame during downsampling, to obtain multiple consecutive high-resolution third image frames.
  • the image processing system can use a super-resolution network to perform super-resolution processing on the plurality of consecutive second image frames to obtain the plurality of consecutive third image frames.
  • the super-resolution network can be obtained by using a deep learning algorithm to perform super-resolution training on multiple sampled images, which are obtained by downsampling multiple training images through the above-mentioned downsampling network.
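To see why phase-varied sampling leaves recoverable detail for the super-resolution stage, consider the static special case: four low-resolution frames sampled at the four phases of a 2×2 module jointly contain every original pixel, so simply interleaving them recovers the frame exactly. This is only a toy illustration of the underlying information argument, not the learned network, which generalizes the idea to moving content.

```python
import numpy as np

PHASES = [(0, 0), (0, 1), (1, 0), (1, 1)]

def downsample(frame, phase):
    r, c = phase
    return frame[r::2, c::2]

def reassemble(low_res_frames):
    """Interleave four phase-shifted low-res frames into one high-res frame."""
    h, w = low_res_frames[0].shape
    hi = np.zeros((2 * h, 2 * w), dtype=low_res_frames[0].dtype)
    for (r, c), f in zip(PHASES, low_res_frames):
        hi[r::2, c::2] = f
    return hi

static = np.arange(64).reshape(8, 8)            # the same static scene in 4 frames
low_res = [downsample(static, p) for p in PHASES]
recovered = reassemble(low_res)                 # lossless for static content
```

If all frames were sampled at one fixed phase instead, three quarters of the pixels of a static object would be permanently lost, which is exactly the weakness the phase-varied scheme avoids.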
  • the image processing method may not include steps S203 and S204; that is, during image processing, the plurality of consecutive second image frames obtained by downsampling need not be encoded, and the operation of decoding image coded data need not be performed before the super-resolution processing.
  • the following uses the image processing system shown in Figures 7 and 8 as an example to illustrate the technical solution provided by the embodiment of the present application.
  • the image processing system may include a downsampling network, a coding and decoding module, and a super-resolution network.
  • the corresponding image processing method may include: the image processing system acquires multiple consecutive first image frames F11 to F1i (i is an integer greater than 1); uses a downsampling network to downsample the multiple consecutive first image frames F11 to F1i to obtain multiple consecutive second image frames F21 to F2i, among which two adjacent second image frames have different sampling points in the same pixel module; uses the encoding and decoding module to first encode the multiple consecutive second image frames F21 to F2i and then decode the resulting image coded data to recover the multiple consecutive second image frames F21 to F2i; and uses a super-resolution network to perform super-resolution processing on the multiple consecutive second image frames F21 to F2i, obtaining multiple consecutive third image frames F31 to F3i.
  • the image processing system may include a downsampling network, a storage device, and a super-resolution network.
  • the corresponding image processing method may include: the image processing system acquires multiple consecutive first image frames F11 to F1i (i is an integer greater than 1); uses the downsampling network to downsample F11 to F1i to obtain multiple consecutive second image frames F21 to F2i, among which two adjacent second image frames have different sampling points in the same pixel module; stores the multiple consecutive second image frames F21 to F2i in the storage device; and obtains F21 to F2i from the storage device and uses a super-resolution network to perform super-resolution processing on them, obtaining multiple consecutive third image frames F31 to F3i.
  • in this way, sampling points at different phases can be downsampled, so that among the plurality of consecutive second image frames obtained, at least two adjacent second image frames have different sampling points in the same pixel module; that is, the downsampling processing captures pixel information at different phases of the plurality of consecutive first image frames. When super-resolution processing is performed on the multiple consecutive second image frames, the correlation of pixel information between them can be effectively utilized to compensate for the pixel information of static objects (such as leaves, houses, or warning signs) and/or dynamic objects (such as the license plates of moving vehicles) lost during downsampling of each image frame. The error between the multiple consecutive third image frames obtained after restoration and the original multiple consecutive first image frames is therefore small, which improves the authenticity of the restoration and yields a better restoration effect.
  • the process of using deep learning to train to obtain the down-sampling network and the super-resolution network in the embodiment of the present application will be introduced and explained below.
  • the process of training the downsampling network and the super-resolution network may include two steps: in the first step, the encoding and decoding operations (or denoising, deblurring, and other processing) are not considered; instead, a deep learning algorithm is used to train the downsampling network directly. In the second step, the downsampling network obtained in the first step is fixed, the encoding and decoding (or denoising, deblurring, and other processing) operations are applied, and the degraded image frames are used for super-resolution training to obtain the super-resolution network.
  • the process of training the downsampling network may include: S11, training the initial downsampling network based on multiple training image frames Y11 to Y1i (i is an integer greater than 1) to obtain multiple sampled image frames Y21 to Y2i, among which there are at least two sampled image frames whose sampling points have different phases; S12, training the initial super-resolution network based on the multiple sampled image frames Y21 to Y2i to obtain multiple training restored image frames Y31 to Y3i; adjusting the initial downsampling network and the initial super-resolution network based on the errors between the training restored image frames Y31 to Y3i and the training image frames Y11 to Y1i, and re-executing S11 and S12; and, when the above error is within the acceptable error range, determining the currently obtained downsampling network as the final trained downsampling network.
  • the process of training the super-resolution network may include: S21, performing encoding and decoding degradation processing on the multiple sampled image frames Y21 to Y2i output by the downsampling network obtained by the above training, that is, encoding Y21 to Y2i and then decoding the resulting image coded data to obtain multiple sampled degraded image frames Y21' to Y2i'; S22, training the initial super-resolution network based on the sampled degraded image frames Y21' to Y2i' to obtain multiple training restored image frames Y31 to Y3i; and S23, adjusting the initial super-resolution network based on the errors between the training restored image frames Y31 to Y3i and the training image frames Y11 to Y1i, and determining the currently obtained super-resolution network as the final trained super-resolution network when the error is within the acceptable error range.
  • the downsampling network including the S2D layer, the fusion layer, the convolution layer, and the dimensionality reduction layer is used as an example for explanation; for the specific description of these layers, refer to the description above, which is not repeated here.
  • in this way, the encoding and decoding operations are not considered at first; down-sampling training and super-resolution training are performed directly on multiple training image frames to obtain the down-sampling network. The down-sampling network is then fixed, the sampled image frames it outputs are encoded and decoded, and the resulting degraded image frames are used for super-resolution training to obtain the super-resolution network. This realizes complete end-to-end training from down-sampling through encoding and decoding to super-resolution, ensuring that the trained down-sampling network and super-resolution network have good performance.
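The two-stage flow above can be sketched structurally as follows. The networks, codec degradation, and loss are stand-in callables chosen only so the control flow runs; names such as `codec_degrade` and `StubNet` are assumptions, not identifiers from this application.

```python
class StubNet:
    """Placeholder for a trainable network: identity map, no-op update."""
    def __call__(self, x):
        return x
    def update(self, err):
        pass  # a real implementation would apply a gradient step here

def train_stage1(train_frames, down_net, sr_net, loss, steps=3):
    """S11/S12: jointly train downsampler and SR net, codec not in the loop."""
    for _ in range(steps):
        sampled = [down_net(f) for f in train_frames]      # Y21..Y2i
        restored = [sr_net(s) for s in sampled]            # Y31..Y3i
        err = sum(loss(r, f) for r, f in zip(restored, train_frames))
        down_net.update(err)
        sr_net.update(err)                                 # adjust both nets
    return down_net                                        # final downsampler

def train_stage2(train_frames, down_net, sr_net, codec_degrade, loss, steps=3):
    """S21-S23: freeze the downsampler, train SR on codec-degraded samples."""
    for _ in range(steps):
        degraded = [codec_degrade(down_net(f)) for f in train_frames]  # Y2i'
        restored = [sr_net(d) for d in degraded]
        err = sum(loss(r, f) for r, f in zip(restored, train_frames))
        sr_net.update(err)                                 # downsampler fixed
    return sr_net

frames = [1.0, 2.0, 3.0]                   # toy stand-ins for Y11..Y1i
mse = lambda a, b: (a - b) ** 2
down = train_stage1(frames, StubNet(), StubNet(), mse)
sr = train_stage2(frames, down, StubNet(), lambda x: x, mse)
```

The key structural point the sketch preserves is that the codec appears only in stage 2, and only the super-resolution network is updated there.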
  • the image processing system includes corresponding hardware structures and/or software modules for executing each function.
  • the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is performed by hardware or computer software driving the hardware depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each specific application, but such implementations should not be considered beyond the scope of this application.
  • Embodiments of the present application can divide functional modules according to the image processing device corresponding to the above method example.
  • each functional module can be divided corresponding to each function, or two or more functions can be integrated into one processing module.
  • the above integrated modules can be implemented in the form of hardware or software function modules. It should be noted that the division of modules in the embodiment of the present application is schematic and is only a logical function division. In actual implementation, there may be other division methods.
  • FIG. 11 shows a possible structural diagram of the image processing device involved in the above embodiment.
  • the image processing device includes: an acquisition unit 301 and a down-sampling unit 302; wherein, the acquisition unit 301 is used to support the device to perform S201 in the method embodiment; the down-sampling unit 302 is used to support the device to perform S202 in the method embodiment.
  • the image processing device may also include an encoding unit 303 and/or a training unit 304; the encoding unit 303 is used to support the device in executing S203 in the method embodiment, and the training unit 304 is used to support the device in executing the steps of training the downsampling network in the method embodiment.
  • FIG. 12 shows another possible structural schematic diagram of the image processing device involved in the above embodiment.
  • the image processing device includes: an acquisition unit 401 and a super-resolution unit 402; wherein the acquisition unit 401 is used to support the device in performing the steps of acquiring multiple consecutive second image frames in the method embodiment; the super-resolution unit 402 is used to support the The device executes S205 in the method embodiment.
  • the image processing device may also include a decoding unit 403 and/or a training unit 404; the decoding unit 403 is used to support the device in executing S204 in the method embodiment, and the training unit 404 is used to support the device in executing the steps of training the super-resolution network in the method embodiment.
  • the image processing device in the embodiment of the present application is described above from the perspective of modular functional entities, and the image processing device in the embodiment of the present application is described below from the perspective of hardware processing.
  • An embodiment of the present application also provides an image processing device.
  • the structure of the image processing device can be as shown in Figure 1 .
  • the processor 102 is configured to perform one or more of steps S202, S203, S204, and S205 of the above image processing method.
  • the processor 102 is configured to: perform downsampling processing on the plurality of consecutive first image frames to obtain multiple consecutive second image frames; encode the multiple consecutive second image frames to obtain image coded data; decode the image coded data to obtain the multiple consecutive second image frames; and perform super-resolution processing on the multiple consecutive second image frames to obtain multiple consecutive third image frames.
  • the above information output by the input/output interface 105 can be sent to the memory 101 for storage, sent to another processing flow for further processing, or output externally, for example, the current frame image and the next frame image can be sent to a display device for display or to a player terminal for playback.
  • This memory can store the above-mentioned multiple continuous first image frames, multiple continuous second image frames, image encoding data, multiple continuous third image frames, and related instructions for configuring the processor.
  • the multimedia component 104 may include a camera, and the processor 102 may control the camera to capture the surrounding environment to obtain the plurality of first image frames, so that after obtaining the plurality of consecutive first image frames, the processor 102 may The plurality of consecutive first image frames are subjected to downsampling processing to obtain the plurality of consecutive second image frames, and the plurality of consecutive second image frames are sequentially subjected to encoding, decoding, super-resolution processing, etc.
  • the multimedia component 104 may also include a display panel, and the processor 102 may also send the plurality of third image frames to the display panel to display the plurality of third image frames through the display panel.
  • an image processing system may include: the image processing device shown in FIG. 11 and the image processing device provided in FIG. 12 .
  • the image processing device shown in Figure 11 can be used to support the system to perform one or more steps of S201-S203 in the above method embodiment; the image processing device provided in Figure 12 can be used to support the system to perform the above One or more steps in S204-S205 in the method embodiment.
  • Embodiments of the present application also provide a computer-readable storage medium.
  • the computer-readable storage medium stores instructions; when the instructions are run on a device (for example, a microcontroller, a chip, a computer, or a processor), the device is caused to perform one or more of steps S201-S205 of the above image processing method. If each component module of the above image processing device is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in the computer-readable storage medium.
  • embodiments of the present application also provide a computer program product containing instructions.
  • the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, can be implemented as a software product.
  • the computer software product is stored in a storage medium and includes a number of instructions to cause a computer device (which can be a personal computer, server, or network device, etc.) or a processor thereof to execute various embodiments of the present application. All or part of the steps of the method.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Processing (AREA)

Abstract

The present application relates to an image processing method, apparatus, and system, in the field of image processing technology, for improving the restoration effect when downsampled images are reconstructed based on super-resolution technology. The method includes: acquiring image data including multiple consecutive first image frames; and, for different first image frames among them, sampling points at different phases, so that among the multiple consecutive second image frames obtained by downsampling there are at least two adjacent second image frames with different sampling points in the same pixel module. That is, the multiple consecutive second image frames contain pixel information of the same object at different positions, so that during subsequent super-resolution processing this pixel information of the same object at different positions across the consecutive second image frames can be effectively used to restore the object, thereby improving the restoration effect.

Description

An image processing method, apparatus, and system
This application claims priority to Chinese patent application No. 202211008622.X, entitled "An image processing method, apparatus, and system", filed with the China National Intellectual Property Administration on August 22, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of image processing technology, and in particular to an image processing method, apparatus, and system.
Background
In communication applications, image processing typically involves a series of processes such as compression, transmission, and decompression. Compressing images reduces the bandwidth required for transmission. In some bandwidth-constrained scenarios, the sending end usually first downsamples the image and then compresses and transmits the downsampled sub-images; after decompression, the receiving end reconstructs the decompressed images through super-resolution technology.
In the prior art, to recover the image information lost during downsampling and compression at the sending end, the receiving end, when reconstructing images through super-resolution technology, usually exploits the correlation between pixels across frames and the sub-pixel displacement of moving objects between frames. However, this approach can only recover, to a certain extent, the information of moving regions lost during downsampling and compression; the information of static regions lost during downsampling and compression cannot be recovered, so the restoration effect is poor.
Summary
The present application provides an image processing method, apparatus, and system, which solve the prior-art problem of poor restoration when downsampled images are reconstructed based on super-resolution technology.
To achieve the above objective, the present application adopts the following technical solutions:
In a first aspect, an image processing method is provided, including: acquiring image data, which may be high-resolution video data and includes multiple consecutive first image frames (also called consecutive multi-frame images); and downsampling the multiple consecutive first image frames respectively to obtain multiple consecutive second image frames, where the multiple consecutive second image frames correspond one-to-one to the multiple consecutive first image frames, and among the multiple consecutive second image frames there are at least two adjacent second image frames whose sampling points in the same pixel module are different, a sampling point being a pixel or a sub-pixel.
In the above technical solution, when multiple consecutive first image frames are acquired, sampling points at different phases can be downsampled for different first image frames among them, so that among the multiple consecutive second image frames obtained by downsampling there are at least two adjacent second image frames with different sampling points in the same pixel module; that is, the downsampling captures pixel information at different phases of the multiple consecutive first image frames. When super-resolution processing is subsequently performed on the multiple consecutive second image frames, the correlation of pixel information between them can be effectively utilized to compensate for the pixel information lost by each second image frame during downsampling, so that the error between the restored consecutive image frames and the original multiple consecutive first image frames is small, improving the authenticity of the restoration and thus the restoration effect.
In a possible implementation of the first aspect, downsampling the multiple consecutive first image frames respectively to obtain multiple consecutive second image frames includes: downsampling the multiple consecutive first image frames respectively based on at least two preset phases, where each first image frame corresponds to one of the at least two preset phases and two adjacent first image frames correspond to different preset phases; downsampling each first image frame yields one second image frame, so the multiple consecutive first image frames yield the multiple consecutive second image frames. In this implementation, the phases of the sampling points of at least two adjacent second image frames obtained by downsampling differ, so that pixel information at different phases of the multiple consecutive first image frames is sampled.
In a possible implementation of the first aspect, downsampling the multiple consecutive first image frames respectively to obtain multiple consecutive second image frames includes: downsampling each of the multiple consecutive first image frames based on at least two preset phases to obtain at least two candidate image frames; and selecting one candidate image frame from the at least two candidate image frames corresponding to each first image frame, such that there exist two adjacent first image frames whose selected candidate image frames were downsampled with different preset phases, thereby obtaining the multiple consecutive second image frames. In this implementation, among the multiple consecutive second image frames obtained by downsampling, at least two adjacent second image frames have different sampling points in the same pixel module, so that pixel information at different phases of the multiple consecutive first image frames is sampled.
In a possible implementation of the first aspect, downsampling the multiple consecutive first image frames respectively to obtain multiple consecutive second image frames includes: using a downsampling network, obtained by training, to downsample the multiple consecutive first image frames to obtain the multiple consecutive second image frames. Using a downsampling network can improve the accuracy and efficiency of the downsampling, thereby improving the authenticity of the image frames restored in subsequent super-resolution processing and ensuring the restoration effect.
In a possible implementation of the first aspect, the method further includes: performing downsampling training on multiple training image frames to obtain multiple sampled image frames, among which there are at least two adjacent sampled image frames with different sampling points in the same pixel module; performing super-resolution training on the multiple sampled image frames to obtain multiple training restored image frames; and determining the downsampling network according to the multiple training restored image frames and the multiple training image frames. In this implementation, when training the downsampling network, the encoding and decoding operations are not considered at first; instead, downsampling training and super-resolution training are performed directly on the multiple training image frames, ensuring that the trained downsampling network has good performance.
In a possible implementation of the first aspect, the method further includes: encoding the multiple consecutive second image frames to obtain image coded data. Encoding the multiple consecutive second image frames can improve their transmission efficiency and reduce the storage space they occupy.
In a second aspect, an image processing method is provided, including: acquiring multiple consecutive second image frames obtained by respectively downsampling multiple consecutive first image frames, where the multiple consecutive second image frames correspond one-to-one to the multiple consecutive first image frames and include at least two adjacent second image frames with different sampling points in the same pixel module, a sampling point being a pixel or a sub-pixel; and performing super-resolution processing on the multiple consecutive second image frames to obtain multiple consecutive third image frames, which correspond one-to-one to the multiple consecutive second image frames.
In the above technical solution, multiple consecutive second image frames are acquired in which at least two adjacent second image frames have different sampling points in the same pixel module; that is, the frames contain pixel information of the same object at different phases. When super-resolution processing is performed on them, the correlation of pixel information between the frames can be effectively utilized to compensate for the pixel information lost by each second image frame during downsampling, so that the error between the restored image frames and the original image frames is small, improving the authenticity of the restoration and thus the restoration effect.
In a possible implementation of the second aspect, acquiring multiple consecutive second image frames includes: acquiring image coded data and decoding it to obtain the multiple consecutive second image frames. Encoding the multiple consecutive second image frames can improve their transmission efficiency and reduce the storage space they occupy.
In a possible implementation of the second aspect, performing super-resolution processing on the multiple consecutive second image frames to obtain multiple consecutive third image frames includes: using a super-resolution network to perform super-resolution processing on the multiple consecutive second image frames to obtain the multiple consecutive third image frames. Using a super-resolution network can improve the accuracy and efficiency of the super-resolution processing and ensure the authenticity of the restored image frames, thus ensuring the restoration effect.
In a possible implementation of the second aspect, the method further includes: performing super-resolution training on multiple sampled degraded image frames to obtain the super-resolution network, where the multiple sampled degraded image frames are obtained by encoding and decoding multiple sampled image frames, and the multiple sampled image frames are obtained by downsampling multiple training image frames with a downsampling network. In this implementation, when training the super-resolution network, downsampling training and super-resolution training are first performed on multiple training image frames to obtain the downsampling network; the downsampling network is then fixed, the sampled image frames it outputs are degraded by encoding and decoding, and the degraded sampled degraded image frames are used for super-resolution training to obtain the super-resolution network, thereby realizing complete end-to-end training from downsampling to super-resolution and ensuring that the trained downsampling network and super-resolution network have good performance.
According to a third aspect, an image processing apparatus is provided. The apparatus includes: an acquisition unit configured to acquire image data, the image data including multiple consecutive first image frames; and a downsampling unit configured to downsample the multiple consecutive first image frames to obtain multiple consecutive second image frames, where the multiple consecutive second image frames correspond one-to-one to the multiple consecutive first image frames, among the multiple consecutive second image frames there exist at least two adjacent second image frames whose sampling points within the same pixel module differ, and a sampling point is a pixel or a sub-pixel.
In a possible implementation of the third aspect, the downsampling unit is further configured to: downsample the multiple consecutive first image frames based on at least two preset phases, where each of the multiple consecutive first image frames corresponds to one of the at least two preset phases and two adjacent first image frames correspond to different preset phases, so that the multiple consecutive first image frames yield the multiple consecutive second image frames.
In a possible implementation of the third aspect, the downsampling unit is further configured to: downsample each of the multiple consecutive first image frames based on at least two preset phases to obtain at least two candidate image frames; and select one candidate image frame from the at least two candidate image frames corresponding to each first image frame, such that there exist two adjacent first image frames whose selected candidate image frames were downsampled with different preset phases, to obtain the multiple consecutive second image frames.
In a possible implementation of the third aspect, the downsampling unit is further configured to: downsample the multiple consecutive first image frames with a downsampling network to obtain the multiple consecutive second image frames, where the downsampling network is obtained through training.
In a possible implementation of the third aspect, the apparatus further includes a training unit configured to: perform downsampling training on multiple training image frames to obtain multiple sampled image frames, where among the multiple sampled image frames there exist at least two adjacent sampled image frames whose sampling points within the same pixel module differ; perform super-resolution training on the multiple sampled image frames to obtain multiple training-recovered image frames; and determine the downsampling network from the multiple training-recovered image frames and the multiple training image frames.
In a possible implementation of the third aspect, the apparatus further includes an encoding unit configured to encode the multiple consecutive second image frames to obtain encoded image data.
According to a fourth aspect, an image processing apparatus is provided. The apparatus includes: an acquisition unit configured to acquire multiple consecutive second image frames, where the multiple consecutive second image frames are obtained by downsampling multiple consecutive first image frames, among the multiple second image frames there exist at least two adjacent second image frames whose sampling points within the same pixel module differ, and a sampling point is a pixel or a sub-pixel; and a super-resolution unit configured to perform super-resolution processing on the multiple consecutive second image frames to obtain multiple consecutive third image frames, where the multiple consecutive second image frames correspond one-to-one to the multiple consecutive third image frames.
In a possible implementation of the fourth aspect, the apparatus further includes a decoding unit; the acquisition unit is further configured to acquire encoded image data, and the decoding unit is configured to decode the encoded image data to obtain the multiple consecutive second image frames.
In a possible implementation of the fourth aspect, the super-resolution unit is further configured to: perform super-resolution processing on the multiple consecutive second image frames with a super-resolution network to obtain the multiple consecutive third image frames.
In a possible implementation of the fourth aspect, the apparatus further includes a training unit configured to perform super-resolution training on multiple degraded sampled image frames to obtain the super-resolution network, where the multiple degraded sampled image frames are obtained by encoding and decoding multiple sampled image frames, and the multiple sampled image frames are obtained by downsampling multiple training image frames with a downsampling network.
According to another aspect of this application, an image processing system is provided. The image processing system includes any image processing apparatus provided by the third aspect or any possible implementation thereof, and any image processing apparatus provided by the fourth aspect or any possible implementation thereof.
According to another aspect of this application, an image processing system is provided. The image processing system includes a processor and a memory; the memory stores instructions that, when executed, cause the image processing system to perform the image processing method provided by the first aspect or any possible implementation thereof.
According to another aspect of this application, a computer-readable storage medium is provided. The computer-readable storage medium stores a computer program or instructions that, when run, implement the image processing method provided by the first aspect or any possible implementation thereof.
According to another aspect of this application, a computer program product is provided. When the computer program product runs on a computer, the computer is caused to perform the image processing method provided by the first aspect or any possible implementation thereof.
It can be understood that the apparatuses, systems, computer storage media, and computer program products provided above are all used to perform the corresponding methods provided above. For the beneficial effects they can achieve, refer to the beneficial effects of the corresponding methods; details are not repeated here.
Brief Description of Drawings
Figure 1 is a schematic structural diagram of an image processing system according to an embodiment of this application;
Figure 2 is a schematic structural diagram of another image processing system according to an embodiment of this application;
Figure 3 is a schematic flowchart of an image processing method according to an embodiment of this application;
Figure 4 is a schematic diagram of downsampling according to an embodiment of this application;
Figure 5 is a schematic diagram of another downsampling according to an embodiment of this application;
Figure 6 is a schematic flowchart of another image processing method according to an embodiment of this application;
Figure 7 is a schematic diagram of an image processing system processing multiple image frames according to an embodiment of this application;
Figure 8 is a schematic diagram of another image processing system processing multiple image frames according to an embodiment of this application;
Figure 9 is a schematic diagram of training a downsampling network according to an embodiment of this application;
Figure 10 is a schematic diagram of training a super-resolution network according to an embodiment of this application;
Figure 11 is a schematic structural diagram of an image processing apparatus according to an embodiment of this application;
Figure 12 is a schematic structural diagram of another image processing apparatus according to an embodiment of this application.
Description of Embodiments
In this application, "at least one" means one or more, and "multiple" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate: A alone, both A and B, or B alone, where A and B may be singular or plural. "At least one of the following items" or a similar expression means any combination of the listed items, including any combination of single items or plural items. For example, at least one of a, b, or c may indicate: a, b, c, a-b, a-c, b-c, or a-b-c, where each of a, b, and c may be single or multiple. The character "/" generally indicates an "or" relationship between the associated objects. In addition, in the embodiments of this application, terms such as "first" and "second" do not limit quantity or execution order.
It should be noted that in this application, words such as "exemplary" or "for example" are used to indicate an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" should not be construed as preferable to or more advantageous than other embodiments or designs. Rather, such words are intended to present related concepts in a concrete manner.
The technical solutions provided in this application can be applied to a variety of image processing systems, such as an image encoding/decoding system, an image storage system, or a video capture system (for example, a security surveillance system). In practice, the image processing system may be one electronic device or may include multiple electronic devices, including but not limited to: a mobile phone, tablet computer, computer, laptop, video camera, still camera, wearable device, in-vehicle device, or terminal device. In the embodiments of this application, the image processing system can be used to downsample high-resolution image frames, and can further be used to perform at least one of the following: encoding, denoising, or deblurring the downsampled image frames; storing the processed image data; decoding that image data; and super-resolving low-resolution image frames. The specific structure of the image processing system is illustrated below by example.
Figure 1 is a schematic structural diagram of an image processing system according to an embodiment of this application, illustrated with a mobile phone as an example. The phone, or a chip system built into the phone, includes: a memory 101, a processor 102, a sensor component 103, a multimedia component 104, and an input/output interface 105. The components of the phone or of the chip system built into the phone are described below with reference to Figure 1.
The memory 101 can be used to store data, software programs, and modules. It mainly includes a program storage area and a data storage area: the program storage area can store software programs, including instructions in the form of code, including but not limited to an operating system and the application programs required by at least one function, such as a sound playback function or an image playback function; the data storage area can store data created during use of the phone, such as audio data, image data, and a phone book. In the embodiments of this application, the memory 101 can be used to store face images, an illumination-information database, images to be evaluated, and the like. In some feasible embodiments, there may be one memory or multiple memories; the memory may include a floppy disk; a hard disk such as an internal hard disk or a removable hard disk; a magnetic disk; an optical disc or magneto-optical disc such as a CD-ROM or DVD-ROM; a non-volatile storage device such as RAM, ROM, PROM, EPROM, EEPROM, or flash memory; or any other form of storage medium known in the art.
The processor 102 is the control center of the phone. It connects all parts of the device through various interfaces and lines, and performs the phone's functions and processes data by running or executing the software programs and/or software modules stored in the memory 101 and by invoking the data stored in the memory 101, thereby monitoring the phone as a whole. In the embodiments of this application, the processor 102 can be used to perform one or more steps of the method embodiments of this application; for example, the processor 102 can be used to perform one or more of steps S202 to S204 in the method embodiments below. In some feasible embodiments, the processor 102 may be a single-processor or multi-processor structure, single-threaded or multi-threaded; in some feasible embodiments, the processor 102 may include at least one of a central processing unit, a general-purpose processor, a digital signal processor, a neural network processor, a graphics processing unit, an image signal processor, a microcontroller, or a microprocessor. In addition, the processor 102 may further include other hardware circuits or accelerators, such as an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, transistor logic devices, hardware components, or any combination thereof, which can implement or execute the various exemplary logical blocks, modules, and circuits described in connection with this disclosure. The processor 102 may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, or a combination of a digital signal processor and a microprocessor.
The sensor component 103 includes one or more sensors that provide status assessments of various aspects of the phone. The sensor component 103 may include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications, that is, as a component of a camera. In the embodiments of this application, the sensor component 103 can support the camera in the multimedia component 104 in acquiring face images and the like. In addition, the sensor component 103 may further include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor, through which the acceleration/deceleration, orientation, and open/closed state of the phone, the relative positioning of components, or temperature changes of the phone can be detected.
The multimedia component 104 provides a screen serving as an output interface between the phone and the user. The screen may be a touch panel, in which case it may be implemented as a touchscreen to receive input signals from the user. A touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the panel; the touch sensors may sense not only the boundary of a touch or swipe action but also the duration and pressure associated with the touch or swipe operation. In addition, the multimedia component 104 includes at least one camera; for example, the multimedia component 104 includes a front camera and/or a rear camera. When the phone is in an operating mode such as shooting mode or video mode, the front camera and/or rear camera can sense an external multimedia signal, which is used to form image frames. Each front or rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The input/output interface 105 provides an interface between the processor 102 and peripheral interface modules; for example, the peripheral interface modules may include a keyboard, a mouse, or a USB (universal serial bus) device. In a possible implementation, there may be a single input/output interface 105 or multiple input/output interfaces.
Although not shown, the phone may further include an audio component, a communication component, and the like; for example, the audio component includes a microphone, and the communication component includes a wireless fidelity (WiFi) module, a Bluetooth module, and so on. Details are not repeated here.
Figure 2 is a schematic structural diagram of another image processing system according to an embodiment of this application, illustrated with a video capture system as an example. As shown in Figure 2, the video capture system includes multiple security devices (which may also be called edge devices) 201 and a server 202; the multiple security devices 201 and the server 202 can be connected by wire or wirelessly.
The multiple security devices 201 may be multiple video capture devices used to shoot and process video data and transmit the video data to the server 202. For example, the security devices 201 can downsample the captured video data, and can further perform one or more of encoding, denoising, deblurring, feature extraction, and feature comparison on the video data. In practice, the security devices 201 may include all kinds of cameras such as pinhole cameras, dome cameras, and infrared cameras, as well as mobile phones, tablet computers, or other devices with video capture capability.
The server 202 can receive and store the video data transmitted by the multiple security devices 201 and process that video data, among other functions. For example, the server 202 can downsample the video data, super-resolve it, and perform one or more of encoding/decoding, denoising, deblurring, feature extraction, feature comparison, and image retrieval on it. Optionally, the server 202 can also manage and configure the multiple security devices 201 in a unified manner; for example, the server 202 can authenticate the security devices 201 and transmit partial processing results of the video data to them. In a possible embodiment, the server 202 may be a cloud server in a cloud data center, and the cloud data center may include one or more cloud servers. The cloud data center can provide users with services such as video sharing, video analysis, and big-data applications.
Further, the video capture system may include a storage device connected to the server 202 via a bus, and the storage device can store image- or video-related data. In a possible embodiment, after the server 202 downsamples the received video data, the server 202 may store the downsampled image data in the storage device. In another possible embodiment, the server 202 may also retrieve the image data from the storage device via the bus and super-resolve the image data.
Those skilled in the art will understand that the structures of the image processing systems shown in Figures 1 and 2 do not limit the image processing system; it may include more or fewer components than shown, combine certain components, or arrange the components differently.
Figure 3 is a schematic flowchart of an image processing method according to an embodiment of this application. The method can be performed by the image processing system shown in Figure 1 or Figure 2. Referring to Figure 3, the method may include the following steps.
S201: Acquire image data, the image data including multiple consecutive first image frames.
The image data may be high-resolution video data, and the multiple consecutive first image frames may be multiple consecutive image frames in that video data, also called consecutive multi-frame images. Each of the multiple first image frames may have the same resolution; for example, the resolution of each first image frame in the image data may be 1280×720 or 1920×1080.
In a possible embodiment, the image processing system may be a terminal device such as a mobile phone, video camera, or in-vehicle device, and the terminal device can obtain the image data by shooting objects in the surrounding environment with a device capable of image capture, such as a camera.
In another possible embodiment, the image processing system may include a server and a security device. The security device obtains the image data by shooting objects in the surrounding environment with a camera, and then the security device performs step S202 below; alternatively, after obtaining the image data, the security device sends it to the server by wire or wirelessly, and the server receives the image data and performs step S202 below.
S202: Downsample each of the multiple consecutive first image frames to obtain multiple consecutive second image frames, where the multiple second image frames correspond one-to-one to the multiple first image frames, and among the multiple consecutive second image frames there exist at least two adjacent second image frames whose sampling points within the same pixel module differ.
Downsampling each first image frame yields one second image frame, so downsampling each of the multiple consecutive first image frames yields multiple consecutive second image frames; the number of first image frames equals the number of second image frames. Because the multiple consecutive second image frames are produced by downsampling the multiple consecutive first image frames, their resolution is lower than that of the first image frames; for example, the first image frames may have a resolution of 1920×1080 and the second image frames a resolution of 640×480.
A sampling point of a second image frame may be a pixel or a sub-pixel of the corresponding first image frame; that is, pixels or sub-pixels of the first image frame are sampled to obtain the corresponding second image frame. The phase of a sampling point refers to the phase of the sampled pixel or sub-pixel, which can also be understood as the position of the sampled pixel or sub-pixel within the first image frame.
Furthermore, among the multiple second image frames there exist at least two adjacent second image frames whose sampling points within the same pixel module differ; these at least two adjacent second image frames may be some or all of the multiple second image frames. The "same pixel module" refers to pixel modules located at the same position in different image frames.
For each of the multiple consecutive first image frames, the image processing system may downsample the frame once or multiple times. When the image processing system downsamples the first image frame multiple times, the image frame produced by one of those downsamplings can be selected as the second image frame. Optionally, for each downsampling, the image processing system can divide the first image frame into multiple pixel modules (which may also be called image blocks) and sample the pixel or sub-pixel located in the same region of every one of those pixel modules. For example, each pixel module may include M×N pixels, where M and N are integers greater than 1 (for example, M=N=2), and each downsampling samples one pixel out of the M×N pixels.
In a possible implementation, the image processing system downsamples each of the multiple consecutive first image frames once, and the specific process of obtaining the multiple consecutive second image frames may include: downsampling each of the multiple consecutive first image frames once based on at least two preset phases, where each of the multiple consecutive first image frames corresponds to one of the at least two preset phases, so that the multiple consecutive first image frames are downsampled into multiple consecutive second image frames.
For example, as shown in Figure 4, suppose the multiple consecutive first image frames are four consecutive first image frames denoted F11 to F14, each pixel module comprises 2×2 pixels, and the 2×2 pixels correspond to four preset phases P1 to P4. The downsampling may then proceed as follows: sample the pixels at preset phase P1 in first image frame F11 to obtain second image frame F21; sample the pixels at preset phase P2 in first image frame F12 to obtain second image frame F22; sample the pixels at preset phase P3 in first image frame F13 to obtain second image frame F23; and sample the pixels at preset phase P4 in first image frame F14 to obtain second image frame F24. The four preset phases P1 to P4 are described here with a pixel module of 2×2 pixels, where each pixel position within the module corresponds to one phase.
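The per-frame phase cycling described for Figure 4 can be sketched as follows. This is a minimal, hypothetical NumPy illustration rather than the patented implementation; the mapping of phases P1 to P4 onto the four pixel offsets inside a 2×2 pixel module is an assumption made only for the example.

```python
import numpy as np

# Assumed (row, col) offsets for phases P1..P4 inside a 2x2 pixel module.
PHASES = [(0, 0), (0, 1), (1, 0), (1, 1)]

def downsample_with_phase(frame: np.ndarray, phase: tuple) -> np.ndarray:
    """Keep one pixel per 2x2 pixel module, at the given phase offset."""
    dy, dx = phase
    return frame[dy::2, dx::2]

def downsample_sequence(frames: list) -> list:
    """Give adjacent frames different phases by cycling through P1..P4."""
    return [downsample_with_phase(f, PHASES[i % len(PHASES)])
            for i, f in enumerate(frames)]
```

For four inputs, each output frame samples a different position of every 2×2 module, so adjacent downsampled frames retain complementary pixel information that the later super-resolution step can combine.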
In the above example, when each of the multiple consecutive first image frames corresponds to one of the at least two preset phases, the correspondence between the multiple consecutive first image frames and the at least two preset phases may be random or preconfigured; the embodiments of this application place no specific restriction on this.
In another possible implementation, the image processing system may downsample each of the multiple consecutive first image frames multiple times, and the specific process of obtaining the multiple consecutive second image frames may include: downsampling each of the multiple consecutive first image frames based on at least two preset phases to obtain at least two candidate image frames; and selecting one candidate image frame from the at least two candidate image frames corresponding to each first image frame, such that there exist two adjacent first image frames whose selected candidate image frames were downsampled with different preset phases, to obtain the multiple consecutive second image frames.
For example, as shown in Figure 5, suppose the multiple consecutive first image frames are four consecutive first image frames denoted F11 to F14, each pixel module comprises 2×2 pixels, and the 2×2 pixels correspond to four preset phases P1 to P4. The downsampling may then proceed as follows: sample the pixels at each of the preset phases P1 to P4 in first image frame F11 to obtain its four candidate image frames 4×F21'; likewise sample first image frame F12 at P1 to P4 to obtain its four candidate image frames 4×F22'; sample first image frame F13 at P1 to P4 to obtain its four candidate image frames 4×F23'; and sample first image frame F14 at P1 to P4 to obtain its four candidate image frames 4×F24'. Then select one candidate image frame from the four candidates of each of F11 to F14 to obtain the four consecutive second image frames F21 to F24.
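A minimal sketch of this candidate-then-select variant, again assuming 2×2 pixel modules with phases P1 to P4 mapped to the four in-module offsets (an assumption for illustration, not part of the claims):

```python
import numpy as np

PHASES = [(0, 0), (0, 1), (1, 0), (1, 1)]  # assumed offsets for P1..P4

def candidates(frame: np.ndarray) -> list:
    """All four candidate downsamples of one frame, one per preset phase."""
    return [frame[dy::2, dx::2] for dy, dx in PHASES]

def select_seconds(frames: list, choice: list) -> list:
    """Pick one candidate per frame. The description requires at least one
    pair of adjacent frames to use different phases; this toy selector
    enforces it for every adjacent pair."""
    assert all(choice[i] != choice[i + 1] for i in range(len(choice) - 1))
    return [candidates(f)[p] for f, p in zip(frames, choice)]
```

With `choice=[0, 1, 2, 3]` this reproduces the Figure 4 behavior; any other valid `choice` illustrates that the selection rule, not a fixed cycle, is what matters.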
In the above example, when one candidate image frame is selected from the at least two candidate image frames of each first image frame as the corresponding second image frame, the selection may be random or follow a preconfigured rule; the embodiments of this application place no specific restriction on this.
Further, the image processing system can downsample the multiple consecutive first image frames with a downsampling network to obtain the multiple consecutive second image frames. The downsampling network may downsample according to either of the possible implementations above; the embodiments of this application place no specific restriction on this. The downsampling network may be obtained by training on multiple training images using deep learning.
In a possible embodiment, the downsampling network may include a space-to-depth (S2D) layer, a fusion layer, a convolution layer, and a dimension-reduction layer. The S2D layer can downsample the pixels or sub-pixels of each first image frame based on at least two preset phases to obtain the at least two candidate image frames corresponding to each first image frame; the fusion layer can stack and fuse all candidate image frames of the multiple consecutive first image frames into fused image data; the convolution layer can convolve the fused image data; and the dimension-reduction layer can reduce the dimensionality of the convolved fused image data to output the multiple second image frames.
In practice, the S2D layer can be implemented by the pixel-shuffle operator (pixel shuffler) of a deep learning framework, the fusion layer by the concatenation operator (concat), the convolution layer by the convolution operator, and the dimension-reduction layer by a dimension-reduction operator.
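As a hedged illustration of what the S2D layer computes, the block-to-channel rearrangement can be written with plain array reshapes. This is a sketch of the standard space-to-depth operation under the 2×2 module assumption; the actual layer is a framework operator as described above.

```python
import numpy as np

def space_to_depth(frame: np.ndarray, block: int = 2) -> np.ndarray:
    """Move each block x block pixel module into the channel axis, so
    channel c holds the candidate (downsampled) image for phase c."""
    h, w = frame.shape
    assert h % block == 0 and w % block == 0
    x = frame.reshape(h // block, block, w // block, block)   # (H', bh, W', bw)
    return x.transpose(1, 3, 0, 2).reshape(block * block, h // block, w // block)
```

Channel c of the result is exactly the candidate image for one phase, so stacking the S2D outputs of consecutive frames hands the fusion layer all phase candidates at once.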
In this embodiment of this application, for different first image frames among the multiple consecutive first image frames, sampling points at different phases can be sampled, so that the phases of the sampling points differ across the resulting multiple consecutive second image frames. In this way, for both static and moving objects in the multiple consecutive first image frames, the sampled consecutive second image frames contain pixel information from different positions of those objects, so the subsequent super-resolution processing can effectively use the pixel information of the same object at different positions across the consecutive second image frames to restore the object, ensuring good restoration quality.
S203: Encode the multiple consecutive second image frames to obtain encoded image data.
Encoding the multiple consecutive second image frames may also be called compressing or compression-encoding them. Specifically, when the multiple consecutive second image frames are obtained, the image processing system can compression-encode them according to a coding standard to obtain the encoded image data; for example, the image processing system can compression-encode the multiple consecutive second image frames according to the H.265 or H.264 coding standard.
In a possible embodiment, instead of encoding the multiple consecutive second image frames, the image processing system may apply other processing such as denoising or deblurring to them, or it may both encode the frames and also denoise or deblur them; the embodiments of this application place no specific limitation on this. For detailed descriptions of encoding, denoising, and deblurring the multiple consecutive second image frames, refer to the related art; details are not repeated here.
Optionally, when the image processing system is a terminal device, the image processing system may include a memory; when the image processing system is a video capture system, it may include a storage device. After the image processing system encodes the multiple consecutive second image frames into the encoded image data, it can store the encoded image data in the memory or the storage device.
Further, as shown in Figure 6, after S203 the method may further include S204 and S205.
S204: Decode the encoded image data to obtain the multiple consecutive second image frames.
When the image processing system needs to play the high-resolution image data corresponding to the encoded image data, the image processing system can retrieve the encoded image data from the memory or storage device and decode it with the decoding scheme matching its encoding scheme, to obtain the multiple consecutive second image frames.
S205: Perform super-resolution processing on the multiple consecutive second image frames to obtain multiple consecutive third image frames.
When the image processing system obtains the multiple consecutive second image frames, it can super-resolve them, using the correlation of pixels or sub-pixels across the multiple consecutive second image frames to compensate for the pixel information each second image frame lost during downsampling, and thereby obtain the multiple consecutive high-resolution third image frames.
In a possible embodiment, the image processing system can super-resolve the multiple consecutive second image frames with a super-resolution network to obtain the multiple consecutive third image frames. The super-resolution network may be obtained by performing super-resolution training, using a deep learning algorithm, on multiple sampled images obtained by downsampling; the multiple sampled images may be obtained by downsampling multiple training images with the downsampling network described above.
It should be noted that the method embodiments of Figures 3 and 6 are merely exemplary. In practice, the image processing method may omit steps S203 and S204; that is, the multiple consecutive second image frames obtained by downsampling need not be encoded during image processing, and accordingly no decoding of encoded image data need be performed before super-resolution. For ease of understanding, the technical solution provided by the embodiments of this application is illustrated below with the image processing systems shown in Figures 7 and 8 as examples.
For example, as shown in Figure 7, the image processing system may include a downsampling network, an encoding/decoding module, and a super-resolution network. The corresponding image processing method may include: when the image processing system acquires multiple consecutive first image frames F11 to F1i (i is an integer greater than 1), it downsamples them with the downsampling network to obtain multiple consecutive second image frames F21 to F2i, in which adjacent second image frames have different sampling points within the same pixel module; the encoding/decoding module first encodes the frames F21 to F2i and then decodes the resulting encoded image data to recover the frames F21 to F2i; the super-resolution network then super-resolves the frames F21 to F2i to obtain multiple consecutive third image frames F31 to F3i.
For example, as shown in Figure 8, the image processing system may include a downsampling network, a storage device, and a super-resolution network. The corresponding image processing method may include: when the image processing system acquires multiple consecutive first image frames F11 to F1i (i is an integer greater than 1), it downsamples them with the downsampling network to obtain multiple consecutive second image frames F21 to F2i, in which adjacent second image frames have different sampling points within the same pixel module; the frames F21 to F2i are stored in the storage device; the frames F21 to F2i are then retrieved from the storage device and super-resolved with the super-resolution network to obtain multiple consecutive third image frames F31 to F3i.
In this embodiment of this application, when multiple consecutive first image frames are acquired, sampling points at different phases can be downsampled for different first image frames among them, so that among the resulting multiple consecutive second image frames there exist at least two adjacent second image frames whose sampling points within the same pixel module differ; that is, the downsampling captures pixel information at different phases of the multiple consecutive first image frames. When the multiple consecutive second image frames are then super-resolved, the correlation of pixel information across them can be exploited effectively to compensate for the pixel information of static objects (for example, leaves, houses, or warning signs) and/or moving objects (for example, the license plate of a moving vehicle) that each second image frame lost during downsampling. The error between the restored multiple consecutive third image frames and the original multiple consecutive first image frames is therefore small, the fidelity of the restoration is improved, and the restoration quality is better.
Further, taking the image processing system of Figure 7 as an example, the process of obtaining the downsampling network and the super-resolution network through deep-learning training in the embodiments of this application is described below. The training of the downsampling network and the super-resolution network consists of two steps. In the first step, the encoding/decoding operations (or other processing such as denoising or deblurring) can be ignored: a deep learning algorithm runs downsampling training and super-resolution training on multiple training image frames to obtain the downsampling network. In the second step, the downsampling network obtained in the first step is fixed, the degradation caused by operations such as encoding/decoding (or other processing such as denoising or deblurring) is introduced, and super-resolution training is run on the degraded image frames to obtain the super-resolution network. The two steps are detailed below with reference to Figures 9 and 10.
As shown in Figure 9, training the downsampling network may include: S11. Train an initial downsampling network on multiple training image frames Y11 to Y1i (i is an integer greater than 1) to obtain multiple sampled image frames Y21 to Y2i, among which there exist at least two sampled image frames whose sampling points have different phases. S12. Train an initial super-resolution network on the sampled image frames Y21 to Y2i to obtain multiple training-recovered image frames Y31 to Y3i; adjust the initial downsampling network and the initial super-resolution network according to the error between the training-recovered image frames Y31 to Y3i and the training image frames Y11 to Y1i, and re-execute S11 and S12. When the error falls within an acceptable range, the currently obtained downsampling network is determined to be the finally trained downsampling network.
As shown in Figure 10, training the super-resolution network may include: S21. Apply encoding/decoding degradation to the sampled image frames Y21 to Y2i output by the trained downsampling network, that is, encode the sampled image frames Y21 to Y2i and then decode the resulting encoded image data, to obtain multiple degraded sampled image frames Y21' to Y2i'. S22. Train an initial super-resolution network on the degraded sampled image frames Y21' to Y2i' to obtain multiple training-recovered image frames Y31 to Y3i. S23. Adjust the initial super-resolution network according to the error between the training-recovered image frames Y31 to Y3i and the training image frames Y11 to Y1i, and re-execute S22 (or S21 and S22). When the error falls within an acceptable range, the currently obtained super-resolution network is determined to be the finally trained super-resolution network.
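The two-stage control flow above can be sketched schematically. In this toy illustration the "networks" are single scalar gains and the codec is stood in for by rounding; these stand-ins are assumptions chosen only to make the stage-1 joint training and the stage-2 frozen-downsampler retraining concrete, and bear no relation to the real network architectures or losses.

```python
import numpy as np

def train_two_stage(xs, lr=0.01, steps=500):
    """Toy stand-in for the two-stage training: gain `a` plays the
    downsampling network, gain `b` the super-resolution network, and
    rounding plays the codec degradation."""
    a, b = 0.5, 0.5
    for _ in range(steps):              # stage 1: joint training, no codec
        for x in xs:
            err = a * b * x - x         # reconstruction error
            a -= lr * err * b * x       # gradient of err**2 w.r.t. a (up to 2x)
            b -= lr * err * a * x
    degrade = np.round                  # stage 2: codec degradation stand-in
    for _ in range(steps):              # `a` is frozen; only `b` is refit
        for x in xs:
            y = degrade(a * x)
            b -= lr * (b * y - x) * y
    return a, b
```

Stage 1 drives the end-to-end product toward identity without any degradation in the loop; stage 2 then re-fits only the "super-resolution" side against the degraded signal, mirroring S21 to S23.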
In Figures 7 to 9 above, the downsampling network is illustrated as including the S2D layer, the fusion layer, the convolution layer, and the dimension-reduction layer; for their specific descriptions, refer to the descriptions above, which are not repeated here.
In this embodiment of this application, when training the downsampling network and the super-resolution network, the encoding/decoding operations are first ignored and the downsampling training and super-resolution training are run directly on multiple training image frames to obtain the downsampling network; the downsampling network is then fixed, the sampled image frames it outputs are degraded by encoding and decoding, and super-resolution training is run on the degraded sampled image frames to obtain the super-resolution network. This realizes complete end-to-end training from downsampling through encoding/decoding to super-resolution and ensures that the trained downsampling network and super-resolution network perform well.
The image processing method provided by the embodiments of this application has been described above mainly from the perspective of the image processing system. It can be understood that, to implement the above functions, the image processing system contains hardware structures and/or software modules corresponding to each function. Those skilled in the art will readily appreciate that, in combination with the structures and algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is executed in hardware or in computer software driving hardware depends on the particular application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of this application.
In the embodiments of this application, the image processing apparatus can be divided into functional modules according to the above method examples; for example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module can be implemented in hardware or as a software functional module. It should be noted that the division of modules in the embodiments of this application is illustrative and is merely a logical functional division; other divisions are possible in actual implementation.
When functional modules are divided according to functions, Figure 11 shows a possible structure of the image processing apparatus involved in the above embodiments. The image processing apparatus includes: an acquisition unit 301 and a downsampling unit 302, where the acquisition unit 301 supports the apparatus in performing S201 of the method embodiments and the downsampling unit 302 supports the apparatus in performing S202 of the method embodiments. Further, the image processing apparatus may include an encoding unit 303 and/or a training unit 304, where the encoding unit 303 supports the apparatus in performing S203 of the method embodiments and the training unit 304 supports the apparatus in performing the step of training the downsampling network in the method embodiments.
When functional modules are divided according to functions, Figure 12 shows another possible structure of the image processing apparatus involved in the above embodiments. The image processing apparatus includes: an acquisition unit 401 and a super-resolution unit 402, where the acquisition unit 401 supports the apparatus in performing the step of acquiring multiple consecutive second image frames in the method embodiments and the super-resolution unit 402 supports the apparatus in performing S205 of the method embodiments. Further, the image processing apparatus may include a decoding unit 403 and/or a training unit 404, where the decoding unit 403 supports the apparatus in performing S204 of the method embodiments and the training unit 404 supports the apparatus in performing the step of training the super-resolution network in the method embodiments.
The image processing apparatus in the embodiments of this application has been described above from the perspective of modular functional entities; it is described below from the perspective of hardware processing.
An embodiment of this application further provides an image processing apparatus whose structure may be as shown in Figure 1. In this embodiment, the processor 102 is configured to process one or more of steps S202, S203, S204, and S205 of the above image processing method; for example, the processor 102 is configured to: downsample the multiple consecutive first image frames to obtain multiple consecutive second image frames; encode the multiple consecutive second image frames to obtain encoded image data; decode the encoded image data to obtain the multiple consecutive second image frames; and super-resolve the multiple consecutive second image frames to obtain multiple consecutive third image frames.
In some feasible embodiments, the above information output by the input/output interface 105 can be sent to the memory 101 for storage, sent to another processing flow for further processing, or the output current-frame and next-frame images can be sent to a display device for display or to a player terminal for playback.
Memory 101: the memory can store the multiple consecutive first image frames, the multiple consecutive second image frames, the encoded image data, the multiple consecutive third image frames, the instructions for configuring the processor, and the like.
The multimedia component 104 may include a camera, and the processor 102 can control the camera to shoot the surrounding environment to acquire the multiple first image frames; after acquiring the multiple consecutive first image frames, the processor 102 can downsample them to obtain the multiple consecutive second image frames and then apply encoding/decoding and super-resolution processing to the multiple consecutive second image frames in sequence. Optionally, the multimedia component 104 may further include a display panel, and the processor 102 can send the multiple third image frames to the display panel so that the multiple third image frames are displayed on it.
The components of the above image processing apparatus provided by the embodiments of this application are respectively used to implement the functions of the corresponding steps of the foregoing image processing method. Since the steps have already been described in detail in the foregoing method embodiments, they are not repeated here.
In another aspect of this application, an image processing system is further provided. The image processing system may include the image processing apparatus shown in Figure 11 and the image processing apparatus shown in Figure 12, where the apparatus of Figure 11 can support the system in performing one or more of steps S201 to S203 of the above method embodiments and the apparatus of Figure 12 can support the system in performing one or more of steps S204 to S205 of the above method embodiments.
An embodiment of this application further provides a computer-readable storage medium storing instructions that, when run on a device (for example, a single-chip microcomputer, chip, computer, or processor), cause the device to perform one or more of steps S201 to S205 of the above image processing method. If the component modules of the above image processing apparatus are implemented as software functional units and sold or used as an independent product, they can be stored in the computer-readable storage medium.
Based on this understanding, an embodiment of this application further provides a computer program product containing instructions. The technical solution of this application, in essence, or the part contributing to the prior art, or all or part of the technical solution, can be embodied in the form of a software product stored in a storage medium, including several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor therein to perform all or part of the steps of the methods described in the embodiments of this application.
Finally, it should be noted that the foregoing descriptions are merely specific implementations of this application, and the protection scope of this application is not limited thereto; any variation or replacement within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (27)

  1. An image processing method, characterized in that the method comprises:
    acquiring image data, the image data comprising multiple consecutive first image frames;
    downsampling the multiple consecutive first image frames respectively to obtain multiple consecutive second image frames, wherein the multiple consecutive second image frames correspond one-to-one to the multiple consecutive first image frames, and among the multiple consecutive second image frames there exist at least two adjacent second image frames whose sampling points within the same pixel module differ.
  2. The method according to claim 1, characterized in that downsampling the multiple consecutive first image frames respectively to obtain multiple consecutive second image frames comprises:
    downsampling the multiple consecutive first image frames based on at least two preset phases, wherein each of the multiple consecutive first image frames corresponds to one of the at least two preset phases and two adjacent first image frames correspond to different preset phases, so that the multiple consecutive first image frames yield multiple consecutive second image frames.
  3. The method according to claim 1, characterized in that downsampling the multiple consecutive first image frames respectively to obtain multiple consecutive second image frames comprises:
    downsampling each of the multiple consecutive first image frames based on at least two preset phases to obtain at least two candidate image frames;
    selecting one candidate image frame from the at least two candidate image frames corresponding to each of the multiple consecutive first image frames, wherein there exist two adjacent first image frames whose selected candidate image frames were downsampled with different preset phases, to obtain the multiple consecutive second image frames.
  4. The method according to any one of claims 1 to 3, characterized in that the sampling points are pixels or sub-pixels.
  5. The method according to any one of claims 1 to 4, characterized in that downsampling the multiple consecutive first image frames respectively to obtain multiple consecutive second image frames comprises:
    downsampling the multiple consecutive first image frames with a downsampling network to obtain the multiple consecutive second image frames, the downsampling network being obtained through training.
  6. The method according to claim 5, characterized in that the method further comprises:
    performing downsampling training on multiple training image frames to obtain multiple sampled image frames, wherein among the multiple sampled image frames there exist at least two adjacent sampled image frames whose sampling points within the same pixel module differ;
    performing super-resolution training on the multiple sampled image frames to obtain multiple training-recovered image frames;
    determining the downsampling network from the multiple training-recovered image frames and the multiple training image frames.
  7. The method according to any one of claims 1 to 6, characterized in that the method further comprises:
    encoding the multiple consecutive second image frames to obtain encoded image data.
  8. An image processing method, characterized in that the method comprises:
    acquiring multiple consecutive second image frames, wherein the multiple consecutive second image frames are obtained by respectively downsampling multiple consecutive first image frames, the multiple consecutive second image frames correspond one-to-one to the multiple consecutive first image frames, and among the multiple consecutive second image frames there exist at least two adjacent second image frames whose sampling points within the same pixel module differ;
    performing super-resolution processing on the multiple consecutive second image frames to obtain multiple consecutive third image frames, wherein the multiple consecutive second image frames correspond one-to-one to the multiple consecutive third image frames.
  9. The method according to claim 8, characterized in that acquiring multiple consecutive second image frames comprises:
    acquiring encoded image data, and decoding the encoded image data to obtain multiple consecutive second image frames.
  10. The method according to claim 9, characterized in that performing super-resolution processing on the multiple consecutive second image frames to obtain multiple consecutive third image frames comprises:
    performing super-resolution processing on the multiple consecutive second image frames with a super-resolution network to obtain the multiple consecutive third image frames.
  11. The method according to claim 10, characterized in that the method further comprises:
    performing super-resolution training on multiple degraded sampled image frames to obtain the super-resolution network;
    wherein the multiple degraded sampled image frames are obtained by encoding and decoding multiple sampled image frames, and the multiple sampled image frames are obtained by downsampling multiple training image frames with a downsampling network.
  12. The method according to any one of claims 8 to 11, characterized in that the sampling points are pixels or sub-pixels.
  13. An image processing apparatus, characterized in that the apparatus comprises:
    an acquisition unit, configured to acquire image data, the image data comprising multiple consecutive first image frames;
    a downsampling unit, configured to downsample the multiple consecutive first image frames respectively to obtain multiple consecutive second image frames, wherein the multiple consecutive second image frames correspond one-to-one to the multiple consecutive first image frames, and among the multiple consecutive second image frames there exist at least two adjacent second image frames whose sampling points within the same pixel module differ.
  14. The apparatus according to claim 13, characterized in that the downsampling unit is further configured to:
    downsample the multiple consecutive first image frames based on at least two preset phases, wherein each of the multiple consecutive first image frames corresponds to one of the at least two preset phases and two adjacent first image frames correspond to different preset phases, so that the multiple consecutive first image frames yield multiple consecutive second image frames.
  15. The apparatus according to claim 13, characterized in that the downsampling unit is further configured to:
    downsample each of the multiple consecutive first image frames based on at least two preset phases to obtain at least two candidate image frames;
    select one candidate image frame from the at least two candidate image frames corresponding to each of the multiple consecutive first image frames, wherein there exist two adjacent first image frames whose selected candidate image frames were downsampled with different preset phases, to obtain the multiple consecutive second image frames.
  16. The apparatus according to any one of claims 13 to 15, characterized in that the sampling points are pixels or sub-pixels.
  17. The apparatus according to any one of claims 13 to 16, characterized in that the downsampling unit is further configured to:
    downsample the multiple consecutive first image frames with a downsampling network to obtain the multiple consecutive second image frames, the downsampling network being obtained through training.
  18. The apparatus according to claim 17, characterized in that the apparatus further comprises:
    a training unit, configured to: perform downsampling training on multiple training image frames to obtain multiple sampled image frames, wherein among the multiple sampled image frames there exist at least two adjacent sampled image frames whose sampling points within the same pixel module differ; perform super-resolution training on the multiple sampled image frames to obtain multiple training-recovered image frames; and determine the downsampling network from the multiple training-recovered image frames and the multiple training image frames.
  19. The apparatus according to any one of claims 13 to 18, characterized in that the apparatus further comprises:
    an encoding unit, configured to encode the multiple consecutive second image frames to obtain encoded image data.
  20. An image processing apparatus, characterized in that the apparatus comprises:
    an acquisition unit, configured to acquire multiple consecutive second image frames, wherein the multiple consecutive second image frames are obtained by respectively downsampling multiple consecutive first image frames, the multiple consecutive second image frames correspond one-to-one to the multiple consecutive first image frames, and among the multiple consecutive second image frames there exist at least two adjacent second image frames whose sampling points within the same pixel module differ;
    a super-resolution unit, configured to perform super-resolution processing on the multiple consecutive second image frames to obtain multiple consecutive third image frames, wherein the multiple consecutive second image frames correspond one-to-one to the multiple consecutive third image frames.
  21. The apparatus according to claim 20, characterized in that the apparatus further comprises a decoding unit;
    the acquisition unit is further configured to acquire encoded image data;
    the decoding unit is configured to decode the encoded image data to obtain multiple consecutive second image frames.
  22. The apparatus according to claim 21, characterized in that the super-resolution unit is further configured to:
    perform super-resolution processing on the multiple consecutive second image frames with a super-resolution network to obtain the multiple consecutive third image frames.
  23. The apparatus according to claim 22, characterized in that the apparatus further comprises:
    a training unit, configured to perform super-resolution training on multiple degraded sampled image frames to obtain the super-resolution network;
    wherein the multiple degraded sampled image frames are obtained by encoding and decoding multiple sampled image frames, and the multiple sampled image frames are obtained by downsampling multiple training image frames with a downsampling network.
  24. The apparatus according to any one of claims 20 to 23, characterized in that the sampling points are pixels or sub-pixels.
  25. An image processing system, characterized in that the image processing system comprises: the image processing apparatus according to any one of claims 13 to 19, and the image processing apparatus according to any one of claims 20 to 24.
  26. An image processing system, characterized in that the image processing system comprises a processor and a memory, the memory storing instructions that, when executed, cause the image processing system to perform the image processing method according to any one of claims 1 to 12.
  27. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program or instructions that, when run, implement the image processing method according to any one of claims 1 to 12.
PCT/CN2023/114021 2022-08-22 2023-08-21 Image processing method, apparatus, and system WO2024041482A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211008622.XA CN117676154A (zh) 2022-08-22 2022-08-22 Image processing method, apparatus, and system
CN202211008622.X 2022-08-22

Publications (1)

Publication Number Publication Date
WO2024041482A1 true WO2024041482A1 (zh) 2024-02-29

Family

ID=90012498

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/114021 WO2024041482A1 (zh) 2022-08-22 2023-08-21 Image processing method, apparatus, and system

Country Status (2)

Country Link
CN (1) CN117676154A (zh)
WO (1) WO2024041482A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110228848A1 (en) * 2010-03-21 2011-09-22 Human Monitoring Ltd. Intra video image compression and decompression
JP2014131136A (ja) 2012-12-28 2014-07-10 Nikon Corp Moving image compression device, moving image decoding device, and program
WO2018176494A1 (en) * 2017-04-01 2018-10-04 SZ DJI Technology Co., Ltd. Method and system for video transmission
CN113596442A (zh) * 2021-07-07 2021-11-02 Beijing Baidu Netcom Science and Technology Co., Ltd. Video processing method and apparatus, electronic device, and storage medium
CN114786007A (zh) * 2022-03-21 2022-07-22 Peng Cheng Laboratory Intelligent video transmission method and system combining coding and image super-resolution


Also Published As

Publication number Publication date
CN117676154A (zh) 2024-03-08


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23856572

Country of ref document: EP

Kind code of ref document: A1