WO2022000266A1 - Method for creating depth map for stereo moving image and electronic device

Method for creating depth map for stereo moving image and electronic device

Info

Publication number
WO2022000266A1
Authority
WO
WIPO (PCT)
Application number
PCT/CN2020/099275
Other languages
French (fr)
Inventor
Ahmed BOUDISSA
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Application filed by Guangdong Oppo Mobile Telecommunications Corp., Ltd. filed Critical Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Priority to PCT/CN2020/099275
Publication of WO2022000266A1

Classifications

    • G06T 7/593: Image analysis; depth or shape recovery from multiple images, from stereo images
    • G06T 3/4053: Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06T 5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T 5/73: Deblurring; Sharpening
    • H04N 13/128: Adjusting depth or disparity
    • H04N 13/239: Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
    • H04N 13/25: Image signal generators using two or more image sensors with different characteristics other than in their location or field of view, e.g. having different resolutions or colour pickup characteristics
    • H04N 13/271: Image signal generators wherein the generated image signals comprise depth maps or disparity maps
    • H04N 13/122: Improving the 3D impression of stereoscopic images by modifying image signal contents, e.g. by filtering or adding monoscopic depth cues
    • H04N 13/254: Image signal generators using stereoscopic image cameras in combination with electromagnetic radiation sources for illuminating objects
    • G06T 2207/10021: Stereoscopic video; Stereoscopic image sequence
    • H04N 2013/0081: Depth or disparity estimation from stereoscopic image signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)

Abstract

Disclosed is a method for creating a depth map for a stereo moving image. The method includes creating a sparse depth map based on a left image and a right image of the stereo moving image for each frame, and superimposing a plurality of sparse depth maps created in a predetermined period to obtain a pseudo dense depth map. The plurality of sparse depth maps are created based on a plurality of depth extraction patterns which are different from each other. Each of the depth extraction patterns indicates locations of target pixels for which to extract depth values.

Description

METHOD FOR CREATING DEPTH MAP FOR STEREO MOVING IMAGE AND ELECTRONIC DEVICE TECHNICAL FIELD
The present disclosure relates to a method for creating a depth map for a stereo moving image, and an electronic device performing the method.
BACKGROUND
In recent years, electronic devices such as smartphones having a master camera and a slave camera for capturing stereo moving images (stereo video) have become very popular. Real-time moving image processing applications installed on mobile terminals such as smartphones, which have limited computing power, need to be supplied with a depth map indicating the distance from the camera to the subject.
To create a depth map, the parallax amount for each pair of corresponding pixels in a stereo image (i.e., a left image and a right image) is calculated. However, calculating the parallax amount for all pixels in the stereo image requires a large amount of computation. It is therefore difficult to create a depth map for each frame in real time on electronic devices that do not have high computing power.
SUMMARY
The present disclosure aims to solve at least one of the technical problems mentioned above. Accordingly, the present disclosure provides a method for creating a depth map for a stereo moving image, and an electronic device therefor.
In accordance with the present disclosure, a method for creating a depth map for a stereo moving image may include:
creating a sparse depth map based on a left image and a right image of the stereo moving image for each frame; and
superimposing a plurality of sparse depth maps created in a predetermined period to obtain a pseudo dense depth map,
wherein the plurality of sparse depth maps are created based on a plurality of depth extraction patterns which are different from each other, each of the depth extraction patterns indicating locations of target pixels for which to extract depth values.
In some embodiments, the locations of target pixels may be shifted along the X-axis direction and/or the Y-axis direction among the plurality of depth extraction patterns.
In some embodiments, a number of the target pixels in the depth extraction pattern may be determined based on a frame rate and/or a resolution of the stereo moving image.
In some embodiments, the number of the target pixels in the depth extraction pattern may decrease as the frame rate increases, so as to reduce the amount of calculation.
In some embodiments, the number of the target pixels in the depth extraction pattern may increase as the resolution increases.
In some embodiments, a length of the predetermined period may be set based on a characteristic of the stereo moving image.
In some embodiments, the length of the period may decrease as a motion of a subject increases.
In some embodiments, the method may further include interpolating depth holes in the pseudo dense depth map based on a depth value of an adjacent pixel thereof.
In some embodiments, the method may further include performing noise reduction processing on the sparse depth map before the superimposing the plurality of sparse depth maps.
In some embodiments, the creating the sparse depth map may be performed by using a template matching method.
In some embodiments, the creating the sparse depth map may be performed by calculating depth values for only a part of a stereo image.
In accordance with the present disclosure, an electronic device for processing a stereo moving image may include a processor and a memory for storing instructions; wherein the instructions, when executed by the processor, cause the processor to perform the method according to the present disclosure.
In accordance with the present disclosure, a computer-readable storage medium has a computer program stored thereon, and the computer program, when executed by a computer, implements the method according to the present disclosure.
BRIEF DESCRIPTION OF THE DRAWINGS
These and/or other aspects and advantages of embodiments of the present disclosure will become apparent and more readily appreciated from the following descriptions made with reference to the drawings, in which:
FIG. 1 is a circuit diagram illustrating a configuration of an electronic device according to an embodiment of the present disclosure;
FIG. 2 is a functional block diagram of an image signal processor of the electronic device according to an embodiment of the present disclosure;
FIG. 3 is a diagram for explaining how to calculate a depth value based on a stereo image;
FIG. 4 is a diagram illustrating an example of depth extraction patterns (upper figure) and the superimposed patterns for each frame (lower figure) ;
FIG. 5 is a flowchart illustrating a method for creating a depth map for a stereo moving image according to an embodiment of the present disclosure;
FIG. 6 is a diagram for explaining the method of the flow chart in FIG. 5.
DETAILED DESCRIPTION
Embodiments of the present disclosure will be described in detail and examples of the embodiments will be illustrated in the accompanying drawings. The same or similar elements and the elements having same or similar functions are denoted by like reference numerals throughout the descriptions. The embodiments described herein with reference to the drawings are explanatory and aim to illustrate the present disclosure, but shall not be construed to limit the present disclosure.
<Electronic device 100>
FIG. 1 is a circuit diagram illustrating a schematic configuration of an electronic device 100 according to an embodiment of the present disclosure.
The electronic device 100 is a mobile apparatus such as a smartphone or a tablet terminal. However, the electronic device 100 may be any other type of electronic device configured to capture stereo moving images.
As shown in FIG. 1, the electronic device 100 includes a stereo camera module 10, a range sensor module 20, and an image signal processor 30 that controls the stereo camera module 10 and the range sensor module 20. The image signal processor 30 processes image data acquired from the stereo camera module 10.
The stereo camera module 10 includes a master camera module 11 and a slave camera module 12 for binocular stereo viewing.
The master camera module 11 includes a first lens 11a that is capable of focusing on a subject (e.g., a person, an object) , a first image sensor 11b that detects an image inputted via the first lens 11a, and a first image sensor driver 11c that drives the first image sensor 11b, as shown in FIG. 1.
The slave camera module 12 includes a second lens 12a that is capable of focusing on the subject, a second image sensor 12b that detects an image inputted via the second lens 12a, and a second image sensor driver 12c that drives the second image sensor 12b, as shown in FIG. 1.
The master camera module 11 acquires a master camera image. Similarly, the slave camera module 12 acquires a slave camera image. In the present embodiment, the master camera image is a left image, and the slave camera image is a right image.
The range sensor module 20 includes a lens 20a, a range sensor 20b, a range sensor driver 20c and a projector 20d, as shown in FIG. 1. The projector 20d emits pulsed light toward a subject and the range sensor 20b detects reflection light from the subject through the lens 20a. The range sensor module 20 acquires a time of flight depth value (i.e., ToF depth value) based on the time from emitting the pulsed light to receiving the reflection light. The ToF depth value indicates an actual distance between the electronic device 100 and the subject.
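As a brief aside (not part of the claimed method), the ToF depth value follows the standard time-of-flight relation, where c is the speed of light and Δt_round is the measured interval between emitting the pulse and receiving its reflection:

\[ d_{\mathrm{ToF}} = \frac{c \cdot \Delta t_{\mathrm{round}}}{2} \]

The factor 1/2 accounts for the light travelling to the subject and back.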
The resolution of a ToF depth map created based on depth values detected by the range sensor module 20 is lower than the resolution of a depth map of a stereo image that is acquired based on the master camera image and the slave camera image.
The image signal processor 30 controls the master camera module 11 and the slave camera module 12 to acquire stereo moving images. The stereo moving image includes stereo images (frames) taken at a predetermined time interval. Each of the stereo images has a left image taken by the master camera module 11 and a right image taken by the slave camera module 12.
The electronic device 100 further includes a global navigation satellite system (GNSS) module 40, a wireless communication module 41, a CODEC 42, a speaker 43, a microphone 44, a display module 45, an input module 46, an inertial measurement unit (IMU) 47, a main processor 48, and a memory 49, as shown in FIG. 1.
The GNSS module 40 measures the current position of the electronic device 100. The wireless communication module 41 performs wireless communications with the Internet. The CODEC 42 bi-directionally performs encoding and decoding, using a predetermined encoding/decoding method. The speaker 43 outputs a sound in accordance with sound data decoded by the CODEC 42. The microphone 44 outputs sound data to the CODEC 42 based on inputted sound. The display module 45 displays various information such as real time stereo moving images. The input module 46 inputs information via a user’s operation. The IMU 47 detects the angular velocity and the acceleration of the electronic device 100.
The main processor 48 controls the global navigation satellite system (GNSS) module 40, the wireless communication module 41, the CODEC 42, the speaker 43, the microphone 44, the display module 45, the input module 46 and the IMU 47.
The memory 49 stores a program and data required for the image signal processor 30, acquired image data, and a program and data required for the main processor 48.
<The image signal processor 30>
Next, referring to FIG. 2, the image signal processor 30 according to the embodiment will be described in detail.
The image signal processor 30 includes a sparse depth map creating unit 31, a noise reduction unit 32, a superimposing unit 33 and an interpolating unit 34. Please note that at least one of the units 31, 32, 33 and 34 may be implemented as software (a program) executed by a processor such as the main processor 48. Each of the units 31, 32, 33 and 34 will be described in detail below.
The sparse depth map creating unit 31 is configured to create a sparse depth map based on a left image and a right image of the stereo moving image acquired by the stereo camera module 10. The sparse depth map is created for each frame. The sparse depth map is a depth map with depth values only for target pixels defined by a depth extraction pattern described later.
Here, the calculation method of depth values is described with reference to FIG. 3. Depth values are calculated based on a left image and a right image of a stereo image. Specifically, the similarity between the left image and the right image is calculated. The calculation is performed in block units (see FIG. 3) . A block may have an arbitrary size centered on a specific pixel.
The calculation is performed by a method called template matching or block matching, in  which a template image (e.g., left image) is scanned with an input image (e.g., right image) in a search range, and a block with the highest degree of similarity is detected. Please note that there are many methods for calculating the degree of similarity such as SAD, NCC, SNCC and ZNCC, but any method may be used in the present disclosure.
The calculation of similarity gives the corresponding pixels between the left image and the right image. As shown in FIG. 3, the difference in the horizontal pixel coordinates between the corresponding pixels is a parallax amount. The parallax amount indicates a depth value (i.e., a distance) of the specific pixel of the reference image (e.g., the left image) .
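For readers who want a concrete picture of the computation above, the following Python sketch estimates the disparity of a single reference pixel by SAD block matching along the same row and converts it to a depth value with the standard pinhole stereo relation Z = f·B/d. It is a minimal illustration only; the function names, the SAD similarity measure and the fixed focal length f and baseline B are assumptions of this example, not details taken from the disclosure.

```python
import numpy as np

def sad_disparity(left, right, y, x, block=7, max_disp=64):
    """Estimate the disparity of pixel (y, x) of the left (reference) image
    by scanning the right image along the same row and keeping the block
    with the smallest sum of absolute differences (SAD)."""
    r = block // 2
    h, w = left.shape
    if y - r < 0 or y + r >= h or x - r < 0 or x + r >= w:
        return None  # the block would fall outside the image
    template = left[y - r:y + r + 1, x - r:x + r + 1].astype(np.float32)
    best_d, best_cost = None, np.inf
    for d in range(max_disp + 1):
        if x - d - r < 0:
            break
        candidate = right[y - r:y + r + 1, x - d - r:x - d + r + 1].astype(np.float32)
        cost = np.abs(template - candidate).sum()
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d

def disparity_to_depth(d, focal_px, baseline_m):
    """Pinhole stereo relation: depth Z = f * B / disparity."""
    return None if not d else focal_px * baseline_m / d
```

Any other similarity measure mentioned above (NCC, SNCC, ZNCC) could be substituted for the SAD cost without changing the structure of the search.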
The sparse depth map creating unit 31 intermittently thins out pixels in each frame image to create a sparse depth map. Specifically, the sparse depth map creating unit 31 creates the sparse depth map by calculating depth values only for target pixels. The target pixels are defined by a depth extraction pattern which indicates locations of target pixels for which to extract depth values. The locations are also referred to as “sparse locations” .
Depth values are calculated sparsely by using the depth extraction pattern. FIG. 4 shows an example of depth extraction patterns P1 to P4 (upper figure) and the superimposed patterns at each time (lower figure). In each of the depth extraction patterns P1 to P4, four target pixels (gray pixels) are defined in a unit of 8×8 pixels. In this case, the amount of calculation is only 1/16 of that in the case where depth values are calculated for all pixels. The pixel thinning rate is not limited to 1/16; it may be set arbitrarily, and it may be changed for each frame.
The target pixels or sparse locations are shifted for each frame. As shown in FIG. 4, the locations of target pixels are shifted by a predetermined number of pixels (2 pixels in this example) along the X-axis direction and/or the Y-axis direction among the depth extraction patterns P1 to P4.
Specifically, the depth extraction pattern P2 is obtained by shifting the depth extraction pattern P1 by 2 pixels in the Y-axis direction (i.e., vertical direction) . The depth extraction pattern P3 is obtained by shifting the depth extraction pattern P1 by 2 pixels in the X-axis direction (i.e., horizontal direction) . The depth extraction pattern P4 is obtained by shifting the depth extraction pattern P1 by 2 pixels in each of the X-axis direction and the Y-axis direction.
Sparse locations may not be regular among the depth extraction patterns. That is to say, each of the depth extraction patterns may include sparse locations determined randomly.
The depth extraction pattern P1 is used at time t, the depth extraction pattern P2 is used at time t+Δt, the depth extraction pattern P3 is used at time t+2Δt, and the depth extraction pattern P4 is used at time t+3Δt. Here, Δt is the time interval between frames, i.e., the inverse of the frame rate of the stereo moving image. At time t+4Δt, the depth extraction pattern P1 is used again. At time t+5Δt, the depth extraction pattern P2 is used again. Thus, the depth extraction patterns P1, P2, P3 and P4 are used periodically for creating a sparse depth map for each frame of the stereo moving image.
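One possible way to realize the shifted depth extraction patterns of FIG. 4 and their periodic, per-frame selection is sketched below. The exact layout of the four target pixels inside the 8×8 unit is an assumption of this example (the figure itself defines the actual layout); the 2-pixel shifts and the cyclic use of P1 to P4 follow the description above.

```python
import numpy as np

def base_pattern(h, w, unit=8):
    """Boolean mask marking the target pixels of pattern P1:
    four pixels per 8x8 unit, i.e. a thinning rate of 4/64 = 1/16."""
    mask = np.zeros((h, w), dtype=bool)
    for dy in (0, 4):
        for dx in (0, 4):
            mask[dy::unit, dx::unit] = True
    return mask

def extraction_pattern(h, w, frame_index, shift=2):
    """Return the pattern for a frame: P1..P4 are P1 shifted by 0 or 2 pixels
    along the Y and/or X axis, and they are used cyclically."""
    shifts = [(0, 0), (shift, 0), (0, shift), (shift, shift)]  # P1, P2, P3, P4
    dy, dx = shifts[frame_index % 4]
    return np.roll(base_pattern(h, w), (dy, dx), axis=(0, 1))
```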
The sparse depth map creating unit 31 may create a sparse depth map by calculating depth values for only a part of a stereo image. For example, a sparse depth map may be created by calculating depth values only for a person and/or an object in a stereo image.
The noise reduction unit 32 is configured to perform noise reduction processing on the sparse depth map. The noise reduction processing is performed by using, for example, a filter such as an averaging filter or a moving average filter.
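For instance, the noise reduction could average each valid (target-pixel) depth value with the valid values in a small neighbourhood while leaving the empty locations untouched. The masked averaging filter below is one such illustration; it is not the specific filter mandated by the disclosure.

```python
import numpy as np

def masked_average_filter(depth, valid, radius=1):
    """Average every valid depth value with the valid depth values inside a
    (2*radius+1)^2 window; non-target pixels stay empty (0) in the output."""
    h, w = depth.shape
    out = np.zeros_like(depth, dtype=np.float32)
    for y, x in zip(*np.nonzero(valid)):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        x0, x1 = max(0, x - radius), min(w, x + radius + 1)
        window = depth[y0:y1, x0:x1][valid[y0:y1, x0:x1]]
        out[y, x] = window.mean()
    return out
```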
The superimposing unit 33 is configured to superimpose a plurality of sparse depth maps created in a predetermined period. A pseudo dense depth map is obtained by the superposition of the plurality of sparse depth maps. In the example shown in FIG. 4, a length of the period is 4×Δt. Δt is an inverse of a frame rate of the stereo moving image.
In this embodiment, a first sparse depth map created by the depth extraction pattern P1, a second sparse depth map created by the depth extraction pattern P2, a third sparse depth map created by the depth extraction pattern P3, and a fourth sparse depth map created by the depth  extraction pattern P4 are superimposed by the superimposing unit 33, to obtain a pseudo dense depth map (see a depth map at the rightmost in the lower figure of FIG. 4) .
Please note that the superimposing unit 33 superimposes the latest four sparse depth maps at times t, t+Δt, t+2Δt and t+3Δt to create a pseudo dense depth map at t+3Δt. For example, at time t+4Δt, the superimposing unit 33 superimposes the sparse depth maps at time t+Δt, the sparse depth maps at time t+2Δt, the sparse depth maps at time t+3Δt and the sparse depth maps at time t+4Δt to create a pseudo dense depth map at t+4Δt.
As shown in the lower figure of FIG. 4, the superimposed depth extraction pattern at time t+3Δt is not sparse, but rather dense. According to the embodiment, it is possible to obtain a depth map of 1/4 resolution while reducing the amount of calculation to 1/16. Thus, it can be said that the superimposing unit 33 performs temporal interpolation on a sparse depth map.
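The sliding-window superposition over the latest four sparse depth maps could be organized as in the sketch below. The ring buffer and the rule that a newer frame's value overwrites an older value at the same location are assumptions of this example; the disclosure only states that the latest sparse depth maps are superimposed.

```python
from collections import deque
import numpy as np

class Superimposer:
    """Keep the latest `period` sparse depth maps and merge them into a
    pseudo dense depth map (temporal interpolation)."""
    def __init__(self, period=4):
        self.buffer = deque(maxlen=period)

    def push(self, depth, valid):
        self.buffer.append((depth, valid))
        merged_depth = np.zeros_like(depth, dtype=np.float32)
        merged_valid = np.zeros_like(valid)
        for d, v in self.buffer:      # oldest first, newest last
            merged_depth[v] = d[v]    # newer values overwrite older ones
            merged_valid |= v
        return merged_depth, merged_valid
```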
The length of the predetermined period is not limited to 4Δt; it may be set arbitrarily. The length may be set based on a characteristic of the stereo moving image (e.g., frame rate, resolution, motion in the stereo moving image, etc.). For example, the length of the period may decrease as the motion of a subject increases, which makes it possible to maintain the quality of the pseudo dense depth map.
The interpolating unit 34 is configured to interpolate depth holes in the pseudo dense depth map created by the superimposing unit 33. The depth hole corresponds to a pixel which is not a target pixel and thus does not have a depth value. In other words, the depth hole is not a target pixel of the depth extraction pattern.
The interpolating unit 34 interpolates the depth holes in the pseudo dense depth map based on a depth value of an adjacent pixel thereof. For example, the interpolation is performed by referring to surrounding pixels of the depth hole (i.e., high correlation pixels) . In this sense, the interpolating unit 34 performs spatial interpolation. It can be said that a dense depth map is created by a temporal interpolation by the superimposing unit 33 and a spatial interpolation by the interpolating unit 34. Please note that the interpolation by the interpolating unit 34 may be omitted, for example, when a number of depth holes is low enough or a density of the pseudo dense depth map is already satisfactory.
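A simple form of this spatial interpolation is to fill each depth hole with the mean of the valid depth values in a small surrounding window, as in the sketch below. The window size and the choice to leave a hole empty when no valid neighbour is found are assumptions of this example.

```python
import numpy as np

def fill_depth_holes(depth, valid, radius=2):
    """Fill each depth hole (invalid pixel) with the mean of the valid depth
    values found in its (2*radius+1)^2 neighbourhood."""
    h, w = depth.shape
    filled = depth.copy()
    filled_valid = valid.copy()
    for y, x in zip(*np.nonzero(~valid)):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        x0, x1 = max(0, x - radius), min(w, x + radius + 1)
        neighbours = depth[y0:y1, x0:x1][valid[y0:y1, x0:x1]]
        if neighbours.size:
            filled[y, x] = neighbours.mean()
            filled_valid[y, x] = True
    return filled, filled_valid
```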
As described above, the sparse depth map creating unit 31 intermittently thins out pixels in the stereo image. Pixel thinning rate corresponds to a number of the target pixels in the depth extraction pattern. The number may be determined based on a resolution of the stereo moving image. For example, the number of the target pixels in the depth extraction pattern increases as the resolution increases. As a result, it is possible to maintain the quality of the pseudo dense depth map.
The number of the target pixels in the depth extraction pattern may be determined based on a frame rate of the stereo moving image. For example, the number of the target pixels in the depth extraction pattern decreases as the frame rate increases. This effectively suppresses the amount of calculation for creating a pseudo dense depth map.
The number of the target pixels in the depth extraction pattern may also be determined based on the performance of the electronic device. For example, the number of the target pixels in the depth extraction pattern increases as the performance of the electronic device 100 (e.g., the processing power of the image signal processor 30) increases.
Further, the number of the target pixels in the depth extraction pattern may decrease as the load of the image signal processor 30 increases. Still further, the number may change according to a requirement of an application installed in the electronic device 100.
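A heuristic that derives the number of target pixels per unit block from the frame rate, the resolution and the processor load might look like the sketch below. Every constant in it (the 1080p and 30 fps reference points, the load back-off) is invented purely for illustration; the disclosure does not specify concrete values.

```python
def target_pixels_per_unit(frame_rate_fps, width, height,
                           processor_load=0.5, unit=8):
    """Choose how many target pixels to place in each unit x unit block:
    more for higher resolution, fewer for higher frame rate or load."""
    budget = 4.0                                   # baseline: 4 pixels per 8x8 unit (1/16)
    budget *= (width * height) / (1920 * 1080)     # scale up with resolution
    budget *= 30.0 / max(frame_rate_fps, 1.0)      # scale down at high frame rates
    budget *= 1.0 - 0.5 * processor_load           # back off when the processor is busy
    return int(min(max(round(budget), 1), unit * unit))
```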
<Method for creating depth map>
Next, a method for creating a depth map for a real-time stereo moving image according to an example of the present disclosure for the electronic device 100 will be described with reference to the flowchart shown in FIG. 5. In this example, the method includes the following steps.
In the step S1, the sparse depth map creating unit 31 creates a sparse depth map based on a left image and a right image of a stereo moving image frame. As mentioned, the sparse depth map is created by using a depth extraction pattern. The depth extraction pattern used to create a sparse depth map is associated with a time position of a period. The period may be an inverse of a frame rate of the stereo moving image. In the example shown in FIG. 4, the depth extraction pattern P1 is used at the first time position t, the depth extraction pattern P2 is used at the second time position t+Δt, the depth extraction pattern P3 is used at the third time position t+2Δt, and the depth extraction pattern P4 is used at the fourth time position t+3Δt.
In the step S2, the noise reduction unit 32 performs noise reduction processing on the sparse depth map created in the S1. As a result, noise in the sparse depth map decreases.
In the step S3, the superimposing unit 33 superimposes a plurality of sparse depth maps created in a predetermined period to obtain a pseudo dense depth map. As shown in FIG. 6, four sparse depth maps M1, M2, M3 and M4 are superimposed, thereby a pseudo dense depth map M5 is obtained. The sparse depth map M1 is created by calculating depth values of target pixels defined by the depth extraction pattern P1. Similarly, the sparse depth maps M2, M3 and M4 are created by calculating depth values of target pixels defined by the depth extraction patterns P2, P3 and P4 respectively.
In the step S4, the interpolating unit 34 interpolates depth holes in the pseudo dense depth map created in the S3. As a result, a dense depth map M6 is obtained as shown in FIG. 6. Please note that the S4 may be omitted, for example, when a number of depth holes is sufficiently low or when a density of the dense depth map meets a requirement of an application.
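Putting S1 to S4 together, one frame of the pipeline could be driven as follows. The sketch reuses the illustrative helpers defined earlier in this document (extraction_pattern, sad_disparity, masked_average_filter, Superimposer, fill_depth_holes), which are examples invented here and not names taken from the disclosure.

```python
import numpy as np

superimposer = Superimposer(period=4)

def process_frame(left, right, frame_index, focal_px, baseline_m):
    h, w = left.shape
    # S1: sparse depth map using the pattern associated with this time position
    pattern = extraction_pattern(h, w, frame_index)
    sparse = np.zeros((h, w), dtype=np.float32)
    for y, x in zip(*np.nonzero(pattern)):
        d = sad_disparity(left, right, y, x)
        if d:  # keep only target pixels with a usable disparity
            sparse[y, x] = focal_px * baseline_m / d
    valid = sparse > 0
    # S2: noise reduction on the sparse depth map
    sparse = masked_average_filter(sparse, valid)
    # S3: temporal interpolation (superimpose the latest sparse depth maps)
    pseudo_dense, pseudo_valid = superimposer.push(sparse, valid)
    # S4: spatial interpolation of the remaining depth holes (may be skipped)
    dense, _ = fill_depth_holes(pseudo_dense, pseudo_valid)
    return dense
```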
As described above, in the present disclosure, when generating a depth map of a stereo image, a sparse depth map is created by using a depth extraction pattern associated with a time position of a predetermined period. The sparse depth map is superimposed with the latest sparse depth maps to obtain a pseudo dense depth map. As such, in the present disclosure, the sparse depth maps are superimposed on the time axis, i.e., temporal interpolation is performed. After that, depth holes in the pseudo dense depth map are filled by an interpolation process.
According to the present disclosure, the amount of calculation required to create a dense depth map can be greatly reduced. Therefore, even mobile terminals such as smartphones, whose computing performance is limited, can perform high-resolution, real-time processing of stereo moving images. The present disclosure can also be applied to applications that require real-time processing, such as stereo movies.
Finally, it should be noted that the stereo moving image of the present disclosure is not only a stereo moving image taken by a user, but it may also be a stereo movie.
In the description of embodiments of the present disclosure, it is to be understood that terms such as "central" , "longitudinal" , "transverse" , "length" , "width" , "thickness" , "upper" , "lower" , "front" , "rear" , "back" , "left" , "right" , "vertical" , "horizontal" , "top" , "bottom" , "inner" , "outer" , "clockwise" and "counterclockwise" should be construed to refer to the orientation or the position as described or as shown in the drawings in discussion. These relative terms are only used to simplify the description of the present disclosure, and do not indicate or imply that the device or element referred to must have a particular orientation, or must be constructed or operated in a particular orientation. Thus, these terms cannot be constructed to limit the present disclosure.
In addition, terms such as "first" and "second" are used herein for purposes of description and are not intended to indicate or imply relative importance or significance or to imply the number of indicated technical features. Thus, a feature defined as "first" and "second" may comprise one or more of this feature. In the description of the present disclosure, "a plurality of" means "two or more than two", unless otherwise specified.
In the description of embodiments of the present disclosure, unless specified or limited otherwise, the terms "mounted" , "connected" , "coupled" and the like are used broadly, and may be, for example, fixed connections, detachable connections, or integral connections; may also be  mechanical or electrical connections; may also be direct connections or indirect connections via intervening structures; may also be inner communications of two elements which can be understood by those skilled in the art according to specific situations.
In the embodiments of the present disclosure, unless specified or limited otherwise, a structure in which a first feature is "on" or "below" a second feature may include an embodiment in which the first feature is in direct contact with the second feature, and may also include an embodiment in which the first feature and the second feature are not in direct contact with each other, but are in contact via an additional feature formed therebetween. Furthermore, a first feature "on" , "above" or "on top of" a second feature may include an embodiment in which the first feature is orthogonally or obliquely "on" , "above" or "on top of" the second feature, or just means that the first feature is at a height higher than that of the second feature; while a first feature "below" , "under" or "on bottom of" a second feature may include an embodiment in which the first feature is orthogonally or obliquely "below" , "under" or "on bottom of" the second feature, or just means that the first feature is at a height lower than that of the second feature.
Various embodiments and examples are provided in the above description to implement different structures of the present disclosure. In order to simplify the present disclosure, certain elements and settings are described above. However, these elements and settings are only examples and are not intended to limit the present disclosure. In addition, reference numbers and/or reference letters may be repeated in different examples in the present disclosure. This repetition is for the purpose of simplification and clarity and does not indicate relations between the different embodiments and/or settings. Furthermore, examples of different processes and materials are provided in the present disclosure. However, it would be appreciated by those skilled in the art that other processes and/or materials may also be applied.
Reference throughout this specification to "an embodiment", "some embodiments", "an exemplary embodiment", "an example", "a specific example" or "some examples" means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present disclosure. Thus, the appearances of the above phrases throughout this specification are not necessarily referring to the same embodiment or example of the present disclosure. Furthermore, the particular features, structures, materials, or characteristics may be combined in any suitable manner in one or more embodiments or examples.
Any process or method described in a flow chart or described herein in other ways may be understood to include one or more modules, segments or portions of code of executable instructions for achieving specific logical functions or steps in the process, and the scope of a preferred embodiment of the present disclosure includes other implementations in which, as should be understood by those skilled in the art, functions may be implemented in a sequence other than the sequences shown or discussed, including in a substantially identical sequence or in an opposite sequence.
The logic and/or steps described in other manners herein or shown in the flow chart, for example, a particular sequence table of executable instructions for realizing the logical function, may be specifically achieved in any computer readable medium to be used by an instruction execution system, device or equipment (such as a system based on computers, a system comprising processors or other systems capable of obtaining instructions from the instruction execution system, device and equipment and executing the instructions), or to be used in combination with the instruction execution system, device and equipment. As to this specification, "the computer readable medium" may be any device adapted to include, store, communicate, propagate or transfer programs to be used by or in combination with the instruction execution system, device or equipment. More specific examples of the computer readable medium comprise, but are not limited to: an electronic connection (an electronic device) with one or more wires, a portable computer enclosure (a magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or a flash memory), an optical fiber device and a portable compact disk read-only memory (CD-ROM). In addition, the computer readable medium may even be paper or another appropriate medium capable of having the programs printed thereon, because, for example, the paper or other appropriate medium may be optically scanned and then edited, decrypted or processed with other appropriate methods when necessary to obtain the programs in an electronic manner, and then the programs may be stored in computer memories.
It should be understood that each part of the present disclosure may be realized by hardware, software, firmware or a combination thereof. In the above embodiments, a plurality of steps or methods may be realized by software or firmware stored in the memory and executed by an appropriate instruction execution system. For example, if realized by hardware, as in another embodiment, the steps or methods may be realized by one or a combination of the following techniques known in the art: a discrete logic circuit having a logic gate circuit for realizing a logic function of a data signal, an application-specific integrated circuit having an appropriate combination of logic gate circuits, a programmable gate array (PGA), a field programmable gate array (FPGA), etc.
Those skilled in the art shall understand that all or part of the steps in the above exemplifying methods of the present disclosure may be achieved by commanding the related hardware with programs. The programs may be stored in a computer readable storage medium, and the programs comprise one or a combination of the steps in the method embodiments of the present disclosure when run on a computer.
In addition, each function cell of the embodiments of the present disclosure may be integrated in a processing module, or these cells may exist as separate physical entities, or two or more cells may be integrated in a processing module. The integrated module may be realized in a form of hardware or in a form of software function modules. When the integrated module is realized in a form of a software function module and is sold or used as a standalone product, the integrated module may be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, a CD, etc.
Although embodiments of the present disclosure have been shown and described, it would be appreciated by those skilled in the art that the embodiments are explanatory and cannot be construed to limit the present disclosure, and changes, modifications, alternatives and variations can be made in the embodiments without departing from the scope of the present disclosure.

Claims (13)

  1. A method for creating a depth map for a stereo moving image, the method comprising:
    creating a sparse depth map based on a left image and a right image of the stereo moving image for each frame; and
    superimposing a plurality of sparse depth maps created in a predetermined period to obtain a pseudo dense depth map,
    wherein the plurality of sparse depth maps are created based on a plurality of depth extraction patterns which are different from each other, each of the depth extraction patterns indicating locations of target pixels for which to extract depth values.
  2. The method according to claim 1, wherein the locations of target pixels are shifted along the X-axis direction and/or the Y-axis direction among the plurality of depth extraction patterns.
  3. The method according to claim 1 or 2, wherein a number of the target pixels in the depth extraction pattern is determined based on at least one of a frame rate, a resolution of the stereo moving image and a performance of an electronic device.
  4. The method according to claim 3, wherein the number of the target pixels in the depth extraction pattern decreases as the frame rate increases.
  5. The method according to claim 3, wherein the number of the target pixels in the depth extraction pattern increases as the resolution increases.
  6. The method according to any one of claims 1 to 5, wherein a length of the predetermined period is set based on a characteristic of the stereo moving image.
  7. The method according to claim 6, wherein the length of the period decreases as a motion of a subject increases.
  8. The method according to any one of claims 1 to 7, further comprising interpolating depth holes in the pseudo dense depth map based on a depth value of an adjacent pixel thereof.
  9. The method according to any one of claims 1 to 8, further comprising performing noise reduction processing on the sparse depth map before the superimposing the plurality of sparse depth maps.
  10. The method according to any one of claims 1 to 9, wherein the creating the sparse depth map is performed by using a template matching method.
  11. The method according to any one of claims 1 to 10, wherein the creating the sparse depth map is performed by calculating depth values for only a part of a stereo image.
  12. An electronic device for processing stereo moving image, comprising a processor and a memory for storing instructions, wherein the instructions, when executed by the processor, cause the processor to perform the method according to any one of claims 1 to 11.
  13. A computer-readable storage medium, on which a computer program is stored, wherein the computer program is executed by a computer to implement the method according to any one of claims 1 to 11.
PCT/CN2020/099275 2020-06-30 2020-06-30 Method for creating depth map for stereo moving image and electronic device WO2022000266A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/099275 WO2022000266A1 (en) 2020-06-30 2020-06-30 Method for creating depth map for stereo moving image and electronic device

Publications (1)

Publication Number Publication Date
WO2022000266A1 true WO2022000266A1 (en) 2022-01-06

Family

ID=79317854

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/099275 WO2022000266A1 (en) 2020-06-30 2020-06-30 Method for creating depth map for stereo moving image and electronic device

Country Status (1)

Country Link
WO (1) WO2022000266A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230083014A1 (en) * 2021-09-14 2023-03-16 Black Sesame Technologies Inc. Depth estimation based on data fusion of image sensor and depth sensor frames
WO2024092028A2 (en) 2022-10-25 2024-05-02 Vaccitech North America, Inc. Combination treatment regimes for treating cancer

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107085343A (en) * 2017-03-10 2017-08-22 深圳奥比中光科技有限公司 Structured light projecting device and depth camera
US20170272724A1 (en) * 2016-03-17 2017-09-21 Electronics And Telecommunications Research Institute Apparatus and method for multi-view stereo
CN108564614A (en) * 2018-04-03 2018-09-21 Oppo广东移动通信有限公司 Depth acquisition methods and device, computer readable storage medium and computer equipment
US20190178635A1 (en) * 2017-12-08 2019-06-13 Ningbo Yingxin Information Technology Co., Ltd. Time-space coding method and apparatus for generating a structured light coded pattern
WO2020072905A1 (en) * 2018-10-04 2020-04-09 Google Llc Depth from motion for augmented reality for handheld user devices
CN111292369A (en) * 2020-03-10 2020-06-16 中车青岛四方车辆研究所有限公司 Pseudo-point cloud data generation method for laser radar

Similar Documents

Publication Publication Date Title
US11663733B2 (en) Depth determination for images captured with a moving camera and representing moving features
CN107274338B (en) Systems, methods, and apparatus for low-latency warping of depth maps
JP6158929B2 (en) Image processing apparatus, method, and computer program
US10909394B2 (en) Real-time multiple vehicle detection and tracking
US20170374256A1 (en) Method and apparatus for rolling shutter compensation
US20170186243A1 (en) Video Image Processing Method and Electronic Device Based on the Virtual Reality
WO2022000266A1 (en) Method for creating depth map for stereo moving image and electronic device
CN110543849B (en) Detector configuration method and device, electronic equipment and storage medium
CN112927271A (en) Image processing method, image processing apparatus, storage medium, and electronic device
CN109495733B (en) Three-dimensional image reconstruction method, device and non-transitory computer readable storage medium thereof
CN114531553B (en) Method, device, electronic equipment and storage medium for generating special effect video
CN103578077A (en) Image zooming method and related device
CN111325786B (en) Image processing method and device, electronic equipment and storage medium
KR20170073937A (en) Method and apparatus for transmitting image data, and method and apparatus for generating 3dimension image
CN115937291B (en) Binocular image generation method and device, electronic equipment and storage medium
CN109816791B (en) Method and apparatus for generating information
WO2022178782A1 (en) Electric device, method of controlling electric device, and computer readable storage medium
CN115457200B (en) Method, device, equipment and storage medium for automatic true stereo display of 2.5-dimensional image
WO2022016331A1 (en) Method of compensating tof depth map and electronic device
WO2022178781A1 (en) Electric device, method of controlling electric device, and computer readable storage medium
CN116249018B (en) Dynamic range compression method and device for image, electronic equipment and storage medium
CN111489428B (en) Image generation method, device, electronic equipment and computer readable storage medium
WO2022198525A1 (en) Method of improving stability of bokeh processing and electronic device
WO2022241728A1 (en) Image processing method, electronic device and non–transitory computer–readable media
CN111696144B (en) Depth information determining method, depth information determining device and electronic equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20942886

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20942886

Country of ref document: EP

Kind code of ref document: A1