WO2021057474A1 - Subject focusing method and apparatus, electronic device and storage medium - Google Patents

Subject focusing method and apparatus, electronic device and storage medium

Info

Publication number
WO2021057474A1
Authority
WO
WIPO (PCT)
Prior art keywords
subject
image
target subject
tof
preview
Prior art date
Application number
PCT/CN2020/114124
Other languages
English (en)
French (fr)
Inventor
贾玉虎
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Priority to EP20868949.7A priority Critical patent/EP4013033A4/en
Publication of WO2021057474A1 publication Critical patent/WO2021057474A1/zh
Priority to US17/671,303 priority patent/US20220166930A1/en

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/95: Computational photography systems, e.g. light-field imaging systems
    • H04N 23/958: Computational photography systems, e.g. light-field imaging systems for extended depth of field imaging
    • H04N 23/959: Computational photography systems, e.g. light-field imaging systems for extended depth of field imaging by adjusting depth of field during image capture, e.g. maximising or setting range based on scene characteristics
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/70: Determining position or orientation of objects or cameras
    • G06T 7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01B: MEASURING LENGTH, THICKNESS OR SIMILAR LINEAR DIMENSIONS; MEASURING ANGLES; MEASURING AREAS; MEASURING IRREGULARITIES OF SURFACES OR CONTOURS
    • G01B 11/00: Measuring arrangements characterised by the use of optical techniques
    • G01B 11/22: Measuring arrangements characterised by the use of optical techniques for measuring depth
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01S: RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S 17/00: Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S 17/88: Lidar systems specially adapted for specific applications
    • G01S 17/89: Lidar systems specially adapted for specific applications for mapping or imaging
    • G01S 17/894: 3D imaging with simultaneous measurement of time-of-flight at a 2D array of receiver pixels, e.g. time-of-flight cameras or flash lidar
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/50: Depth or shape recovery
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/70: Determining position or orientation of objects or cameras
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/25: Determination of region of interest [ROI] or a volume of interest [VOI]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/26: Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267: Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region by performing operations on regions, e.g. growing, shrinking or watersheds
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/7715: Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V 10/7747: Organisation of the process, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/10: Terrestrial scenes
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/50: Context or environment of the image
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60: Control of cameras or camera modules
    • H04N 23/61: Control of cameras or camera modules based on recognised objects
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60: Control of cameras or camera modules
    • H04N 23/63: Control of cameras or camera modules by using electronic viewfinders
    • H04N 23/631: Graphical user interfaces [GUI] specially adapted for controlling image capture or setting capture parameters
    • H04N 23/632: Graphical user interfaces [GUI] specially adapted for displaying or modifying preview images prior to image capturing, e.g. variety of image resolutions or capturing parameters
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60: Control of cameras or camera modules
    • H04N 23/64: Computer-aided capture of images, e.g. transfer from script file into camera, check of taken image quality, advice or proposal for image composition or decision on when to take image
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60: Control of cameras or camera modules
    • H04N 23/67: Focus control based on electronic image sensor signals
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60: Control of cameras or camera modules
    • H04N 23/67: Focus control based on electronic image sensor signals
    • H04N 23/671: Focus control based on electronic image sensor signals in combination with active ranging signals, e.g. using light or sound signals emitted toward objects
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60: Control of cameras or camera modules
    • H04N 23/67: Focus control based on electronic image sensor signals
    • H04N 23/675: Focus control based on electronic image sensor signals comprising setting of focusing regions
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01S: RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S 17/00: Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S 17/88: Lidar systems specially adapted for specific applications
    • G01S 17/89: Lidar systems specially adapted for specific applications for mapping or imaging
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/047: Probabilistic or stochastic networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10024: Color image
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10028: Range image; Depth image; 3D point clouds
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20084: Artificial neural networks [ANN]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20092: Interactive image processing based on input by user
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00: Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07: Target detection
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00: Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/12: Acquisition of 3D measurements of objects

Definitions

  • the present application relates to the field of image processing, and in particular to a subject focusing method and apparatus, an electronic device, and a storage medium.
  • an embodiment of the present application provides a subject focusing method, and the method includes:
  • the position information of the target subject in the preview image is acquired, and the preview lens focuses on the target subject according to the position information.
  • the performing subject recognition on the TOF image to determine the target subject includes:
  • the target subject is determined from the at least two candidate subjects.
  • the determining the target subject from the at least two candidate subjects includes:
  • the candidate subject with the largest weight is determined as the target subject.
  • the weighting rule includes at least one of the following rules:
  • the weight of a person is greater than that of an animal, and the weight of an animal is greater than that of a plant;
  • the determining the target subject from the at least two candidate subjects includes:
  • the user selection instruction is an instruction triggered by the user selecting the subject identifier of one of the at least two candidate subjects;
  • the candidate subject corresponding to the user selection instruction is determined as the target subject.
  • the inputting the TOF image into a preset subject detection network to obtain at least two candidate subjects includes:
  • the TOF image and the center weight map are input into the subject detection model to obtain a confidence map of the subject area, wherein the subject detection model is a model obtained in advance by training on TOF images, center weight maps and corresponding labeled subject mask maps of the same scene;
  • the determining the position information of the target subject in the preview image, and the preview lens focusing on the target subject according to the position information includes:
  • the preview lens focuses on the target subject according to the position coordinates of the target subject in the preview image.
  • the determining the position information of the target subject in the preview image, and the preview lens focusing on the target subject according to the position information includes:
  • the preview lens focuses on the target subject according to the focus position information of the target subject in the preview image.
  • the method further includes:
  • the performing subject recognition on the TOF image to determine the target subject includes:
  • Subject recognition is performed on the TOF image and the RGB image, and the target subject is determined.
  • an embodiment of the present application provides a subject focusing device, which is characterized in that it includes:
  • the acquisition module is used to acquire TOF images
  • the recognition module is used to perform subject recognition on the TOF image and determine the target subject
  • the focusing module is used to obtain position information of the target subject in the preview image, and the preview lens focuses on the target subject according to the position information.
  • an embodiment of the present application provides a computer device, including a memory and a processor, the memory stores a computer program, and the processor implements the steps of the method in any one of the first aspect when executing the computer program.
  • an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the method according to any one of the first aspect are implemented.
  • the subject focusing method, device, electronic device, and storage medium provided by the embodiments of the present application acquire a TOF image, perform subject recognition on the TOF image to determine the target subject, and obtain the position information of the target subject in the preview image, so that the preview lens focuses on the target subject according to the position information.
  • by combining the TOF image captured by the TOF lens, identifying the target subject from the TOF image, and assisting the preview lens to focus on the basis of the identified target subject, that is, having the preview lens focus according to the position of the target subject, the accuracy of focusing is improved, which further improves the shooting quality.
  • Fig. 1 is a schematic diagram of an image processing circuit in an embodiment;
  • Fig. 2 is a flowchart of a subject focusing method in an embodiment;
  • Fig. 2.1 is a schematic diagram of a TOF image in an embodiment;
  • Fig. 2.2 is a schematic diagram of an image after subject recognition in an embodiment;
  • Fig. 2.3 is a schematic diagram of subject focusing in an embodiment;
  • Fig. 3 is a flowchart of a subject focusing method provided by another embodiment;
  • Fig. 4 is a flowchart of a subject focusing method provided by another embodiment;
  • Fig. 5 is a flowchart of a subject focusing method provided by another embodiment;
  • Fig. 6 is a flowchart of a subject focusing method provided by another embodiment;
  • Fig. 7 is a schematic diagram of a network structure of a subject detection model in an embodiment;
  • Fig. 8 is a schematic diagram of an image processing effect in an embodiment;
  • Fig. 9 is a flowchart of a subject focusing method provided by another embodiment;
  • Fig. 10 is a flowchart of a subject focusing method provided by another embodiment;
  • Fig. 11 is a flowchart of a subject focusing method provided by an embodiment;
  • Fig. 12 is a block diagram of a subject focusing device provided by an embodiment;
  • Fig. 13 is a block diagram of a subject focusing device provided by an embodiment;
  • Fig. 14 is a block diagram of an electronic device provided by an embodiment.
  • the subject focusing method in the embodiment of the present application can be applied to electronic equipment.
  • the electronic device may be a computer device with a camera, a personal digital assistant, a tablet computer, a smart phone, a wearable device, etc.
  • when the camera in the electronic device captures an image, it automatically focuses to ensure that the captured image is clear.
  • the above electronic device may include an image processing circuit, which may be implemented by hardware and/or software components, and may include various processing units that define an ISP (Image Signal Processing, image signal processing) pipeline.
  • Fig. 1 is a schematic diagram of an image processing circuit in an embodiment. As shown in FIG. 1, for ease of description, only various aspects of the image processing technology related to the embodiments of the present application are shown.
  • the image processing circuit includes a first ISP processor 130, a second ISP processor 140, and a control logic 150.
  • the first camera 110 includes one or more first lenses 112 and a first image sensor 114.
  • the first image sensor 114 may include a color filter array (such as a Bayer filter).
  • the first image sensor 114 may acquire the light intensity and wavelength information captured by each imaging pixel of the first image sensor 114, and provide a set of image data that can be processed by the first ISP processor 130.
  • the second camera 120 includes one or more second lenses 122 and a second image sensor 124.
  • the second image sensor 124 may include a color filter array (such as a Bayer filter).
  • the second image sensor 124 may acquire the light intensity and wavelength information captured by each imaging pixel of the second image sensor 124, and provide a set of image data that can be processed by the second ISP processor 140.
  • the first image collected by the first camera 110 is transmitted to the first ISP processor 130 for processing.
  • the statistical data of the first image (such as image brightness, image contrast, image color, etc.) are sent to the control logic 150.
  • the control logic 150 can determine the control parameters of the first camera 110 according to the statistical data, so that the first camera 110 can perform operations such as auto-focusing and auto-exposure according to the control parameters.
  • the first image can be stored in the image memory 160 after being processed by the first ISP processor 130, and the first ISP processor 130 can also read the image stored in the image memory 160 for processing.
  • the first image can be directly sent to the display 170 for display after being processed by the first ISP processor 130, and the display 170 can also read the image in the image memory 160 for display.
  • the first ISP processor 130 processes image data pixel by pixel in multiple formats.
  • each image pixel may have a bit depth of 8, 10, 12, or 14 bits, and the first ISP processor 130 may perform one or more image processing operations on the image data and collect statistical information about the image data.
  • the image processing operations can be performed with the same or different bit depth accuracy.
  • the image memory 160 may be a part of a memory device, a storage device, or an independent dedicated memory in an electronic device, and may include DMA (Direct Memory Access) features.
  • the first ISP processor 130 may perform one or more image processing operations, such as temporal filtering.
  • the processed image data can be sent to the image memory 160 for additional processing before being displayed.
  • the first ISP processor 130 receives the processed data from the image memory 160, and performs image data processing in the RGB and YCbCr color spaces on the processed data.
  • the image data processed by the first ISP processor 130 may be output to the display 170 for viewing by the user and/or further processed by a graphics engine or a GPU (Graphics Processing Unit, graphics processor).
  • the output of the first ISP processor 130 can also be sent to the image memory 160, and the display 170 can read image data from the image memory 160.
  • the image memory 160 may be configured to implement one or more frame buffers.
  • the statistical data determined by the first ISP processor 130 may be sent to the control logic 150.
  • the statistical data may include statistical information of the first image sensor 114, such as automatic exposure, automatic white balance, automatic focus, flicker detection, black level compensation, and shading correction of the first lens 112.
  • the control logic 150 may include a processor and/or a microcontroller that executes one or more routines (such as firmware). The one or more routines can determine the control parameters of the first camera 110 and the control parameters of the first ISP processor 130 based on the received statistical data.
  • the control parameters of the first camera 110 may include gain, integration time of exposure control, anti-shake parameters, flash control parameters, control parameters of the first lens 112 (for example, focal length for focusing or zooming), or a combination of these parameters.
  • the ISP control parameters may include gain levels and color correction matrices for automatic white balance and color adjustment (for example, during RGB processing), and shading correction parameters for the first lens 112.
  • the second image captured by the second camera 120 is transmitted to the second ISP processor 140 for processing.
  • the statistical data of the second image (such as image brightness, image contrast, image color, etc.) are sent to the control logic 150, and the control logic 150 can determine the control parameters of the second camera 120 according to the statistical data, so that the second camera 120 can perform operations such as auto focus and auto exposure according to the control parameters.
  • the second image can be stored in the image memory 160 after being processed by the second ISP processor 140, and the second ISP processor 140 can also read the image stored in the image memory 160 for processing.
  • the second image can be directly sent to the display 170 for display after being processed by the second ISP processor 140, and the display 170 can also read the image in the image memory 160 for display.
  • the second camera 120 and the second ISP processor 140 can also implement the processing procedures described above for the first camera 110 and the first ISP processor 130.
  • the first camera 110 may be a color camera
  • the second camera 120 may be a TOF (Time Of Flight) camera or a structured light camera.
  • the TOF camera can obtain the TOF depth map
  • the structured light camera can obtain the structured light depth map.
  • the first camera 110 and the second camera 120 may both be color cameras, and a binocular depth map can be obtained through the two color cameras.
  • the first ISP processor 130 and the second ISP processor 140 may be the same ISP processor.
  • the first camera 110 and the second camera 120 capture the same scene to obtain a visible light map and a TOF image, respectively, and send the visible light map and the TOF image to the ISP processor.
  • the ISP processor can perform subject recognition on the TOF image captured by the second camera 120, determine the target subject, determine the position information of the target subject in the preview image of the first camera 110, and focus on the target subject in the preview lens according to the position information.
  • the TOF image is used to identify the target subject, and the target subject is focused on according to its position in the preview lens, which improves the accuracy of focusing, thereby improving the shooting quality.
  • Fig. 2 is a flowchart of a subject focusing method in an embodiment.
  • a subject focusing method, which can be applied to the electronic device in Fig. 1, includes:
  • the TOF image may be an image captured by a TOF camera, or an image captured by an RGB lens.
  • the TOF image may be obtained through the TOF lens in real time after the TOF camera is turned on by the electronic device, or it may be obtained through the TOF lens when the user triggers the shooting or focusing function.
  • the manner of acquiring the TOF image is not limited in this embodiment.
  • the picture obtained through the TOF lens is the TOF image, which includes the foreground and background.
  • S202 Perform subject recognition on the TOF image, and determine the target subject.
  • ordinary image recognition technology can be used to recognize the subject in the TOF image.
  • for example, face recognition technology can be used to recognize the face in the TOF image, or a pre-trained detection model can be used to identify the subject in the TOF image.
  • the TOF image is subject to subject recognition, and the target subject is determined.
  • the target subject is an airplane; it can be seen that the foreground is extracted from the picture and the background is removed.
  • a TOF image may include one subject or multiple subjects.
  • a target subject can be selected from the multiple subjects. For example, different weights are set for different types of subjects in advance, and the subject with the highest weight among multiple subjects is selected as the target subject, or, when multiple subjects are detected, the user is reminded to select one of the subjects as the target subject.
  • the position conversion relationship between the TOF image and the preview image can be obtained in a pre-calibrated manner, and after the target subject is determined, the position of the target subject in the TOF image can be obtained.
  • the position information can be the coordinate information of the target subject in the preview image. For example, first determine the coordinates of each pixel of the target subject in the TOF image, and then convert the coordinates of each pixel from the TOF image to the preview image to obtain the position of the target subject in the preview image.
  • the preview lens determines the focus point according to the position of the target subject in the preview image, and adjusts the lens to the focus point position.
  • the depth information of the target subject can also be calculated based on the TOF image: the focus position of the target subject is estimated based on the depth information, the focus point is determined according to the focus position, and the lens is adjusted to the focus point position. As shown in Figure 2.3, the corresponding position of the subject is found in the preview lens and focused on.
  • the subject focusing method acquires a TOF image, performs subject recognition on the TOF image to determine the target subject, determines the position information of the target subject in the preview image, and has the preview lens focus on the target subject according to the position information. By combining the TOF image captured by the TOF lens, identifying the target subject from it, and assisting the preview lens to focus on that basis, that is, focusing the preview lens according to the position of the target subject, the accuracy of focusing is improved, which further improves the shooting quality. The overall flow is sketched below.
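  • For illustration, a minimal Python sketch of this flow follows. All types and helper callables (detect_subjects, tof_to_preview, focus_at) are hypothetical placeholders, not APIs defined by this application.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Subject:
    label: str                      # e.g. "person", "airplane"
    pixels: List[Tuple[int, int]]   # (x, y) pixel coordinates in the TOF image
    weight: float = 0.0             # selection weight (see the weighting rules below)

def focus_on_subject(tof_image,
                     detect_subjects: Callable,   # S202: subject recognition
                     tof_to_preview: Callable,    # S203: coordinate conversion
                     focus_at: Callable) -> Subject:
    """S201-S203 as one flow: recognize the target subject in the TOF image,
    map its position into the preview image, then drive the preview focus."""
    candidates = detect_subjects(tof_image)            # one or more subjects
    target = max(candidates, key=lambda s: s.weight)   # pick the target subject
    preview_pixels = [tof_to_preview(x, y) for x, y in target.pixels]
    focus_at(preview_pixels)                           # preview lens focuses
    return target
```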
  • the TOF image can include one subject or multiple subjects.
  • the subject can be directly determined as the target subject.
  • Fig. 3 is a flowchart of a subject focusing method provided by another embodiment. As shown in Fig. 3, a possible implementation of the step “S202, perform subject recognition on the TOF image to determine the target subject” in Fig. 2 includes:
  • the subject detection network can be pre-trained to perform subject recognition, and the TOF image can be input into the preset subject detection network to output candidate subjects.
  • the subject detection network can be obtained by training with a large number of TOF images, which can identify the foreground of the TOF image and identify and detect various subjects, such as people, flowers, cats, dogs, backgrounds, etc.
  • focusing is performed on a single subject; therefore, when multiple candidate subjects are identified in the TOF image, a target subject needs to be determined from them.
  • the target subject can be determined according to the weight of the subject of each category, or the target subject can be selected by the user. In the following, two methods for determining the target subject are introduced respectively through Figure 4 and Figure 5.
  • Determine the target subject from at least two candidate subjects may include:
  • the foregoing preset weight rule includes at least one of the following rules:
  • the weight of a person is greater than that of an animal, and the weight of an animal is greater than that of a plant;
  • different weights can be set for different types of subjects in advance.
  • the system may preset a set of default weighting rules based on test results before release, such as: person > bird > cat > dog > flower; or, the closer a subject is to the lens, the higher its weight, and the farther it is from the lens, the lower its weight; or, the closer a subject is to the intersection of the diagonals of the TOF image, the higher its weight, and the farther it is from that intersection, the smaller its weight; and so on, as determined by actual scenario requirements.
  • multiple optional weighting rules can be set in the system, and the user can select at least one of them according to actual needs.
  • S402 Determine the candidate subject with the largest weight as the target subject.
  • a larger weight indicates that the candidate subject is more likely to be the object to be photographed; therefore, the candidate subject with the largest weight can be determined as the target subject.
  • the weight of each candidate subject is determined according to the preset weight rule, and the candidate subject with the largest weight is determined as the target subject.
  • different weighting rules can be flexibly set according to the actual scene, so that the determined target subject better suits the scene, meets the needs of users, and adapts flexibly to various scenarios, giving the method strong universality; a selection sketch is shown below.
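  • A hedged sketch of the weight-rule selection: the category weights and the distance bonus below are illustrative values, not taken from this application.

```python
# Illustrative category weights: person > bird > cat > dog > flower.
CATEGORY_WEIGHT = {"person": 5.0, "bird": 4.0, "cat": 3.0, "dog": 2.0, "flower": 1.0}

def pick_target(candidates):
    """candidates: list of (category, mean_depth_in_meters) tuples.
    Closer subjects gain extra weight; the heaviest candidate wins."""
    def weight(candidate):
        category, depth = candidate
        return CATEGORY_WEIGHT.get(category, 0.0) + 1.0 / max(depth, 0.1)
    return max(candidates, key=weight)

# With these rules a person at 4 m still outweighs a dog at 1 m:
print(pick_target([("person", 4.0), ("dog", 1.0)]))   # -> ('person', 4.0)
```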
  • another possible implementation of "S302, determine the target subject from at least two candidate subjects" may include:
  • the user selection instruction is an instruction triggered by the user selecting the subject identifier of one of the at least two candidate subjects.
  • the user can send a user selection instruction to the device in a variety of ways.
  • the electronic device can display the candidate box corresponding to each candidate subject on the display screen, and the user can click a candidate box to select a subject identifier and generate a user selection instruction.
  • the user can also input the user selection instruction by voice.
  • the user can input the subject identification by voice to generate the user selection instruction.
  • for example, the user needs to take a picture of a person; when the identified subjects include a person, an animal, and a plant, the user can input "person" by voice to generate the user selection instruction.
  • the user selection instruction may also be obtained in other ways, and this embodiment is not limited to this.
  • S502 Determine the candidate subject corresponding to the user selection instruction as the target subject.
  • after obtaining the user selection instruction, the electronic device can determine the subject identifier selected by the user according to the user selection instruction, and determine the corresponding target subject according to the subject identifier.
  • the subject focusing method provided in this embodiment acquires an instruction triggered by the user selecting the subject identifier of one of the at least two candidate subjects, and determines the candidate subject corresponding to the selected subject identifier as the target subject, so that the user can select the desired shooting subject according to actual needs, which improves the accuracy of focusing and also increases the intelligence of human-computer interaction; a small resolution sketch follows.
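  • A sketch of resolving a tap-based user selection instruction; the candidate-box data structure is an assumption for illustration.

```python
def resolve_selection(tap_x, tap_y, candidates):
    """candidates: list of dicts with 'label' and 'box' = (x0, y0, x1, y1),
    the candidate box displayed on screen for each candidate subject.
    Returns the candidate whose box contains the tapped point, else None."""
    for candidate in candidates:
        x0, y0, x1, y1 = candidate["box"]
        if x0 <= tap_x <= x1 and y0 <= tap_y <= y1:
            return candidate
    return None   # tap landed outside every candidate box

subjects = [{"label": "person", "box": (100, 80, 300, 400)},
            {"label": "dog", "box": (350, 250, 520, 420)}]
print(resolve_selection(400, 300, subjects)["label"])   # -> dog
```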
  • a possible implementation manner of step "S301, input the TOF image into the preset subject detection network to obtain at least two candidate subjects" includes:
  • the central weight map refers to a map used to record the weight value of each pixel in the TOF image.
  • the weight value recorded in the center weight map gradually decreases from the center to the four sides, that is, the center weight is the largest, and the weight gradually decreases toward the four sides.
  • the center weight map indicates that the weight value gradually decreases from the center pixel of the TOF image to the edge pixels of the image.
  • the ISP processor or the central processing unit can generate a corresponding center weight map according to the size of the TOF image.
  • the weight value represented by the center weight map gradually decreases from the center to the four sides.
  • the center weight map can be generated using a Gaussian function, a first-order equation, or a second-order equation.
  • the Gaussian function may be a two-dimensional Gaussian function.
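  • A minimal sketch of generating such a center weight map with a two-dimensional Gaussian function (the sigma_scale parameter is an assumption; the application does not specify it):

```python
import numpy as np

def center_weight_map(height, width, sigma_scale=0.5):
    """Weight is largest at the image center and gradually decreases
    toward the four sides, matching the description above."""
    ys = np.arange(height) - (height - 1) / 2.0
    xs = np.arange(width) - (width - 1) / 2.0
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    sigma_y, sigma_x = sigma_scale * height, sigma_scale * width
    weights = np.exp(-((xx / sigma_x) ** 2 + (yy / sigma_y) ** 2) / 2.0)
    return weights / weights.max()   # normalize the center weight to 1

w = center_weight_map(240, 320)
assert w[120, 160] == w.max()        # the center weight is the largest
```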
  • subject detection refers to automatically processing the region of interest while selectively ignoring the regions that are not of interest when facing a scene.
  • the area of interest is called the subject area.
  • the subject detection model is obtained by pre-collecting a large amount of training data, and inputting the training data into the subject detection model containing the initial network weight for training.
  • Each set of training data includes the TOF image corresponding to the same scene, the center weight map and the labeled subject mask map.
  • the TOF image and the center weight map are used as the input of the subject detection model to be trained, and the labeled subject mask map is used as the ground truth that the subject detection model is expected to output.
  • the subject mask map is an image filter template used to identify the subject in the image, which can block other parts of the image and filter out the subject in the image.
  • the subject detection model can be trained to recognize and detect various subjects, such as people, flowers, cats, dogs, and backgrounds.
  • Fig. 7 is a schematic diagram of a network structure of a subject detection model in an embodiment.
  • the network structure of the subject detection model includes a convolutional layer 402, a pooling layer 404, a convolutional layer 406, a pooling layer 408, a convolutional layer 410, a pooling layer 412, a convolutional layer 414, a pooling layer 416, and so on.
  • the network structure of the subject detection model in this embodiment is only an example, and is not intended to limit the application. It is understandable that the convolutional layer, pooling layer, bilinear interpolation layer, convolution feature connection layer, etc. in the network structure of the subject detection model can be set in multiples as needed.
  • the coding part of the subject detection model includes convolutional layer 402, pooling layer 404, convolutional layer 406, pooling layer 408, convolutional layer 410, pooling layer 412, convolutional layer 414, pooling layer 416, and convolutional layer 418; the decoding part includes convolutional layer 420, bilinear interpolation layer 422, convolutional layer 424, bilinear interpolation layer 426, convolutional layer 428, convolution feature connection layer 430, bilinear interpolation layer 432, convolutional layer 434, convolution feature connection layer 436, bilinear interpolation layer 438, convolutional layer 440, and convolution feature connection layer 442.
  • the convolutional layer 406 and the convolutional layer 434 are cascaded (Concatenation), the convolutional layer 410 and the convolutional layer 428 are cascaded, and the convolutional layer 414 and the convolutional layer 424 are cascaded.
  • the bilinear interpolation layer 422 and the convolution feature connection layer 430 are bridged by deconvolution feature stacking (Deconvolution+add).
  • the bilinear interpolation layer 432 and the convolution feature connection layer 436 adopt deconvolution feature overlay bridges.
  • the bilinear interpolation layer 438 and the convolution feature connection layer 442 are bridged by deconvolution feature overlay.
  • the original image 450 (such as a TOF image) is input to the convolutional layer 402 of the subject detection model, the depth map 460 acts on the convolution feature connection layer 442 of the subject detection model, and the center weight map 470 acts on the convolution feature connection layer 442 of the subject detection model.
  • the depth map 460 and the center weight map 470 are respectively input to the convolution feature connection layer 442 as product factors.
  • the original image 450, the depth map 460, and the center weight map 470 are input to the subject detection model, which outputs a confidence map 480 containing the subject; a structural sketch follows.
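  • A loose PyTorch sketch of this structure is given below. Channel counts and depth are illustrative (far shallower than Figure 7), and the skip connection uses addition where the application describes concatenation and deconvolution bridges; the point is how the depth map and center weight map enter the last convolution feature connection layer as product factors.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SubjectDetectionSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Conv2d(3, 16, 3, padding=1)    # encoder conv + pooling
        self.enc2 = nn.Conv2d(16, 32, 3, padding=1)
        self.dec1 = nn.Conv2d(32, 16, 3, padding=1)   # decoder conv
        self.head = nn.Conv2d(16, 1, 3, padding=1)    # subject confidence head

    def forward(self, image, depth, center_weight):
        e1 = F.relu(self.enc1(image))
        e2 = F.relu(self.enc2(F.max_pool2d(e1, 2)))
        d1 = F.interpolate(e2, scale_factor=2,        # bilinear interpolation layer
                           mode="bilinear", align_corners=False)
        d1 = F.relu(self.dec1(d1)) + e1               # simplified skip connection
        d1 = d1 * depth * center_weight               # product-factor connection layer
        return torch.sigmoid(self.head(d1))           # confidence map in [0, 1]

net = SubjectDetectionSketch()
conf = net(torch.rand(1, 3, 64, 64),                  # original image
           torch.rand(1, 1, 64, 64),                  # depth map
           torch.rand(1, 1, 64, 64))                  # center weight map
```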
  • a preset dropout rate is used for the depth map input.
  • the preset value can be 50%.
  • dropout with a preset probability is introduced for the depth map in the training process, so that the subject detection model can fully mine the information of the depth map; even when the subject detection model cannot obtain a depth map, it can still output accurate results.
  • the dropout method is adopted for the depth map input, so that the subject detection model is more robust to the depth map, and the subject area can be accurately segmented even if there is no depth map.
  • the depth map is given a dropout probability of 50% during training, which ensures that the subject detection model can still detect normally when there is no depth information; a minimal sketch follows.
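  • A minimal sketch of that training-time dropout, assuming the whole depth map is dropped at once; replacing it with ones (a neutral product factor) rather than zeros is an assumption, since the text only specifies the 50% rate.

```python
import torch

def depth_dropout(depth_map, p=0.5, training=True):
    """With probability p during training, simulate a missing depth map so
    the subject detection model learns to work without depth information."""
    if training and torch.rand(1).item() < p:
        return torch.ones_like(depth_map)   # neutral factor for the product layer
    return depth_map
```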
  • the highlight detection is performed on the original image 450 using the highlight detection layer 444 to identify the highlight area in the original image.
  • the subject mask image and the original image containing the highlight area are subjected to differential processing, and the highlight area is deleted from the subject mask image to obtain a subject with the highlights removed.
  • the confidence map of the subject area is a confidence map distributed from 0 to 1.
  • the confidence map of the subject area usually contains a lot of noise: many noise points with low confidence, or small scattered clusters of high confidence.
  • filtering with an adaptive confidence threshold yields a binarized mask map; morphological processing of the binarized mask image can further reduce noise, and guided filtering can make the edges smoother. It is understandable that the subject region confidence map may be a subject mask map containing noise. A post-processing sketch follows.
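  • A post-processing sketch with OpenCV: the fixed 0.5 threshold stands in for the adaptive confidence threshold, and a Gaussian blur stands in for guided filtering (available as cv2.ximgproc.guidedFilter in opencv-contrib).

```python
import cv2
import numpy as np

def postprocess_confidence(conf, thresh=0.5):
    """conf: float confidence map distributed from 0 to 1.
    Threshold -> binarized mask -> morphology to reduce noise -> edge smoothing."""
    mask = (conf > thresh).astype(np.uint8) * 255            # binarized mask map
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)    # drop small specks
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)   # fill small holes
    return cv2.GaussianBlur(mask, (5, 5), 0)                 # smooth the edges

mask = postprocess_confidence(np.random.rand(240, 320))
```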
  • the training method of the subject detection model includes: acquiring TOF images and labeled subject mask maps of the same scene; generating a center weight map corresponding to the TOF image, where the weight value represented by the center weight map gradually decreases from the center to the edge; applying the TOF image to the input layer of the subject detection model containing the initial network weights, applying the center weight map to the output layer of the initial subject detection model, using the labeled subject mask map as the ground truth output by the subject detection model, and training the subject detection model containing the initial network weights to obtain the target network weights of the subject detection model.
  • the training in this embodiment uses TOF images and center weight maps; that is, in the network structure of the subject detection model in Figure 7, no depth map is introduced.
  • the TOF image acts on the convolutional layer 402, and the center weight map 470 acts on the convolution feature connection layer 442 of the subject detection model.
  • the ISP processor or the central processing unit can input the TOF image and the center weight map into the subject detection model, and the subject area confidence map can be obtained by performing detection.
  • the subject area confidence map is used to record the probability that each pixel belongs to a recognizable subject; for example, the probability that a certain pixel belongs to a person is 0.8, the probability that it belongs to a flower is 0.1, and the probability that it belongs to the background is 0.1.
  • Fig. 8 is a schematic diagram of an image processing effect in an embodiment. As shown in Figure 8, there is a butterfly in the TOF image 602. After the TOF image is input into the subject detection model, the subject area confidence map 604 is obtained; the subject area confidence map 604 is then filtered and binarized to obtain the binarized mask image 606; morphological processing and guided filtering are performed on the binarized mask image 606 to realize edge enhancement, and the subject mask image 608 is obtained.
  • S603 Determine at least two candidate subjects in the TOF image according to the subject region confidence map.
  • the subject refers to various objects, such as people, flowers, cats, dogs, cows, blue sky, white clouds, backgrounds, etc.
  • the target subject refers to the desired subject, which can be selected as needed.
  • the ISP processor or the central processing unit can, according to the confidence map of the subject area, select the subject with the highest or second highest confidence in the visible light map; if there is one subject, that subject is regarded as the target subject; if there are multiple subjects, one or more subjects can be selected as the target subject as needed, as in the sketch below.
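  • A sketch of that selection step, assuming a per-class confidence volume (the class list and shapes are illustrative):

```python
import numpy as np

CLASSES = ["background", "person", "flower"]

def pick_subject(class_conf):
    """class_conf: (num_classes, H, W) per-pixel confidences. The subject
    class with the highest total confidence wins; the background is excluded."""
    scores = class_conf.reshape(len(CLASSES), -1).sum(axis=1)
    scores[CLASSES.index("background")] = -np.inf   # never pick the background
    return CLASSES[int(np.argmax(scores))]

print(pick_subject(np.random.rand(3, 240, 320)))   # e.g. "person"
```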
  • a TOF image is acquired, and a center weight map corresponding to the TOF image is generated, and then the TOF image and the center weight map are input into the corresponding subject detection model for detection, and the subject area confidence map can be obtained.
  • the target subject in the TOF image can be determined.
  • the center weight map can make the object in the center of the image easier to detect.
  • the subject detection model trained by using the TOF image, the center weight map, and the subject mask map can more accurately identify the target subject in the visible light image, thereby making the focusing more accurate.
  • in the above embodiments, subject recognition is performed based on TOF images. Further, when performing subject recognition, TOF images and preview RGB (Red, Green, Blue) images can also be used in combination.
  • the subject focusing method described above may further include: acquiring the RGB image of the preview lens; then step "S202, performing subject recognition on the TOF image and determining the target subject" includes: performing subject recognition on the TOF image and the RGB image, and determining the target subject.
  • the RGB image of the preview lens can also be obtained, and the TOF image and the RGB image can be combined for subject recognition, and the target subject can be determined, which can make the subject recognition more accurate.
  • both the TOF image and the RGB image can be input into the above-mentioned subject detection model to identify the subject; for the method of determining the target subject based on the identified subject, refer to the above-mentioned embodiments, which will not be repeated here.
  • the preview lens can focus according to the position coordinates of the target subject in the preview image, or it can calculate the focus position of the target subject according to the depth information of the target subject to focus.
  • the following describes specific implementations of obtaining the position information of the target subject in the preview image.
  • a possible implementation of the step "S203. Acquire the position information of the target subject in the preview image, and the preview lens focuses on the target subject according to the position information" includes:
  • the TOF camera coordinate system can be established for the TOF camera, and the position coordinates of each pixel of the target subject in the TOF camera coordinate system are determined, that is, the position coordinates of the target subject in the TOF image are determined.
  • S902 Obtain the position coordinates of the target subject in the preview image according to a preset correspondence table between the coordinate system of the TOF lens and the coordinate system of the preview lens.
  • similarly, the preview camera coordinate system can be established for the preview camera. By a pre-calibration method, the coordinates of a pixel A in the TOF image can be matched with the coordinates of the corresponding pixel A1 in the preview image, and from a large number of such pixel pairs, the correspondence table between the coordinate system of the TOF lens and the coordinate system of the preview lens can be calculated. Therefore, after the position coordinates of the target subject in the TOF image are determined, the position coordinates of the target subject in the preview image can be determined according to the correspondence table between the coordinate system of the TOF lens and the coordinate system of the preview lens.
  • the preview lens focuses on the target subject according to the position coordinates of the target subject in the preview image.
  • the preview lens can determine the focus point according to the position coordinates of the target subject in the preview image, and adjust the position and angle of the preview lens so that the preview lens is adjusted to the focus point position.
  • the subject focusing method provided in this embodiment obtains the position coordinates of the target subject in the TOF image, obtains the position coordinates of the target subject in the preview image according to a preset correspondence table between the coordinate system of the TOF lens and the coordinate system of the preview lens, and has the preview lens focus on the target subject according to those position coordinates. Using the pre-calibrated correspondence table between the two coordinate systems, the position coordinates of the target subject in the preview image can be determined quickly and accurately, which improves the focusing accuracy and efficiency; a mapping sketch follows.
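  • A hedged sketch of the coordinate mapping: the application stores a pre-calibrated correspondence table, which is approximated here by a homography fitted to calibrated pixel pairs (the point values are made up for illustration).

```python
import numpy as np
import cv2

# Calibrated pairs: tof_points[i] in the TOF image corresponds to
# preview_points[i] in the preview image (illustrative values).
tof_points = np.array([[10, 10], [300, 12], [295, 230], [8, 228]], dtype=np.float32)
preview_points = np.array([[40, 60], [1180, 66], [1160, 900], [35, 895]], dtype=np.float32)
H, _ = cv2.findHomography(tof_points, preview_points)

def tof_to_preview(x, y):
    """Map a pixel from TOF-image coordinates into preview-image coordinates."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]

print(tof_to_preview(150, 120))   # position of a subject pixel in the preview image
```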
  • step "S203. Obtain the position information of the target subject in the preview image, and the preview lens focuses on the target subject according to the position information" includes:
  • the region containing the target subject is cropped from the TOF image, and the depth information of the region containing the target subject is calculated.
  • S1002 Determine the focus position information of the target subject in the preview image according to the depth information of the target subject.
  • the focus position of the target subject can be estimated according to the depth information of the target subject, and the focus position can be further fine-tuned.
  • the depth information can include the depth values of the pixels in the TOF image; that is, after the depth value of each pixel in the area where the target subject is located is obtained, if the area is a single pixel, the depth value of that pixel can be directly used for auto focus; if the area contains multiple pixels, the depth values of the multiple pixels need to be merged into a single depth value.
  • for example, the average of the depth values of the pixels in the area is taken as the single depth information of the area; further, in order to prevent depth values of individual pixels that are too large or too small from distorting the depth of the focused object in the area, the pixels in the middle of the depth-value distribution can be selected and averaged to obtain the single depth information of the area. The single depth can also be obtained by other methods, which is not limited here.
  • the focal length of the zoom camera lens is then adjusted to focus at that depth; the adjustment can be done through a preset program. Specifically, there is a certain relationship between the focus adjustment amount and the depth value; this relationship is stored in the camera system's memory in the form of a program. When a single depth value is obtained, the adjustment amount is calculated according to the program, and auto focus is realized. Both steps are sketched below.
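  • In the sketch, a trimmed mean merges the region's depth values (trim=0.25 is an assumption), and a hypothetical depth-to-lens-position table stands in for the relationship stored in the camera system's memory.

```python
import numpy as np

def subject_depth(depth_values, trim=0.25):
    """Average only the middle of the sorted depth distribution so that a few
    overly large or small pixel depths do not distort the region's depth."""
    d = np.sort(np.asarray(depth_values, dtype=float))
    k = int(len(d) * trim)
    middle = d[k:len(d) - k] if len(d) > 2 * k else d
    return middle.mean()

# Hypothetical calibration: subject depth in meters -> lens position steps.
DEPTH_TO_LENS_STEPS = [(0.5, 800), (1.0, 600), (2.0, 450), (5.0, 300)]

def lens_position(depth_m):
    """Pick the lens position whose calibrated depth is nearest."""
    return min(DEPTH_TO_LENS_STEPS, key=lambda t: abs(t[0] - depth_m))[1]

d = subject_depth([1.9, 2.0, 2.1, 2.0, 9.0])   # 9.0 m is an outlier pixel
print(lens_position(d))                         # -> 450
```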
  • the preview lens focuses on the target subject according to the focus position information of the target subject in the preview image.
  • the subject focusing method obtains the depth information of the target subject and determines the focus position information of the target subject in the preview image according to that depth information; that is, after identifying the target subject, its depth information is calculated to estimate the focus position of the subject, and the preview lens focuses on the target subject according to the focus position information of the target subject in the preview image, so that the target subject can be focused on more quickly.
  • FIG. 11 is a flowchart of a subject focusing method provided by an embodiment, and the method may include:
  • S1103 Determine the target subject according to the weight of each candidate subject
  • S1105 Determine the position information of the target subject in the preview image
  • the preview lens focuses on the target subject according to the position information.
  • the subject focusing method acquires a TOF image, inputs the TOF image into the subject detection model for subject recognition to obtain multiple candidate subjects, determines the target subject according to the weight of each candidate subject, determines the position information of the target subject in the preview image, and has the preview lens focus on the target subject according to the position information. By combining the TOF image captured by the TOF lens, recognizing the target subject from the TOF image, and assisting the preview lens to focus on the basis of the identified target subject, the accuracy of focusing is improved, which further improves the shooting quality.
  • a subject focusing device including:
  • the obtaining module 121 is used to obtain TOF images
  • the recognition module 122 is configured to perform subject recognition on the TOF image and determine the target subject;
  • the focusing module 123 is configured to obtain position information of the target subject in the preview image, and the preview lens focuses on the target subject according to the position information.
  • the identification module 122 includes:
  • the detection unit 1221 is configured to input the TOF image into a preset subject detection network to obtain at least two candidate subjects;
  • the determining unit 1222 is configured to determine the target subject from the at least two candidate subjects.
  • the determining unit 1222 is configured to determine the weight of each candidate subject according to a preset weight rule; and determine the candidate subject with the largest weight as the target subject.
  • the weighting rule includes at least one of the following rules:
  • the weight of a person is greater than that of an animal, and the weight of an animal is greater than that of a plant;
  • the determining unit 1222 is configured to obtain a user selection instruction, where the user selection instruction is an instruction triggered by the user selecting the subject identifier of one of the at least two candidate subjects, and to determine the candidate subject corresponding to the user selection instruction as the target subject.
  • the recognition module 122 is configured to generate a center weight map corresponding to the TOF image, wherein the weight value represented by the center weight map gradually decreases from center to edge; input the TOF image and the center weight map into the subject detection model to obtain a confidence map of the subject area, wherein the subject detection model is a model obtained in advance by training on TOF images, center weight maps and corresponding labeled subject mask maps of the same scene; and determine at least two candidate subjects in the TOF image according to the subject region confidence map.
  • the focusing module 123 is configured to obtain the position coordinates of the target subject in the TOF image; obtain the position coordinates of the target subject in the preview image according to a preset correspondence table between the coordinate system of the TOF lens and the coordinate system of the preview lens; and have the preview lens focus on the target subject according to the position coordinates of the target subject in the preview image.
  • the focusing module 123 is configured to obtain the depth information of the target subject; determine the focus position information of the target subject in the preview image according to the depth information of the target subject; and have the preview lens focus on the target subject according to the focus position information of the target subject in the preview image.
  • the acquisition module 121 is also used to acquire the RGB image of the preview lens; the recognition module 122 is used to perform subject recognition on the TOF image and the RGB image, and determine the target subject.
  • Each module in the above-mentioned main body focusing device can be implemented in whole or in part by software, hardware, and a combination thereof.
  • the above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
  • an electronic device is provided.
  • the electronic device may be a terminal, and its internal structure diagram may be as shown in FIG. 14.
  • the electronic device includes a processor, a memory, a network interface, a display screen and an input device connected through a system bus.
  • the processor of the electronic device is used to provide calculation and control capabilities.
  • the memory of the electronic device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system and a computer program.
  • the internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
  • the network interface of the electronic device is used to communicate with an external terminal through a network connection.
  • the computer program, when executed by the processor, implements a subject focusing method.
  • the display screen of the electronic device can be a liquid crystal display screen or an electronic ink display screen.
  • the input device of the electronic device can be a touch layer covering the display screen, a button, trackball or touchpad set on the housing of the electronic device, or an external keyboard, touchpad or mouse.
  • FIG. 14 is only a block diagram of part of the structure related to the solution of the present application, and does not constitute a limitation on the electronic device to which the solution of the present application is applied.
  • the specific electronic device may include more or fewer parts than shown in the figure, combine some parts, or have a different arrangement of parts.
  • a computer device is provided, including a memory and a processor, the memory storing a computer program, and the processor implementing the following steps when executing the computer program:
  • a TOF image is acquired; subject recognition is performed on the TOF image to determine a target subject; the position information of the target subject in the preview image is acquired, and the preview lens focuses on the target subject according to the position information.
  • a computer-readable storage medium is provided, on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:
  • a TOF image is acquired; subject recognition is performed on the TOF image to determine a target subject; the position information of the target subject in the preview image is acquired, and the preview lens focuses on the target subject according to the position information.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.

Abstract

The present application relates to a subject focusing method and apparatus, an electronic device, and a storage medium. The method includes: acquiring a TOF image; performing subject recognition on the TOF image to determine a target subject; determining position information of the target subject in a preview image; and focusing, by a preview lens, on the target subject according to the position information. By combining the TOF image captured by a TOF lens and recognizing the target subject from the TOF image, the preview lens is assisted in focusing on the basis of the recognized target subject, which improves focusing accuracy and further improves shooting quality.

Description

Subject focusing method and apparatus, electronic device, and storage medium
Technical Field
The present invention relates to the field of image processing, and in particular to a subject focusing method and apparatus, an electronic device, and a storage medium.
Background
With the development of imaging technology, people are increasingly accustomed to capturing images or videos with image acquisition devices such as the cameras of electronic devices in order to record all kinds of information. Generally, focusing is required during shooting to improve shooting quality; however, current focusing techniques suffer from inaccurate focusing, resulting in poor quality of the captured images or videos.
Summary
On this basis, in view of the above technical problem, it is necessary to provide a subject focusing method and apparatus, an electronic device, and a storage medium capable of improving focusing accuracy.
In a first aspect, an embodiment of the present application provides a subject focusing method, the method including:
acquiring a TOF image;
performing subject recognition on the TOF image to determine a target subject;
acquiring position information of the target subject in a preview image, a preview lens focusing on the target subject according to the position information.
In one embodiment, the performing subject recognition on the TOF image to determine a target subject includes:
inputting the TOF image into a preset subject detection network to obtain at least two candidate subjects;
determining the target subject from the at least two candidate subjects.
In one embodiment, the determining the target subject from the at least two candidate subjects includes:
determining a weight of each candidate subject according to a preset weight rule;
determining the candidate subject with the largest weight as the target subject.
In one embodiment, the weight rule includes at least one of the following rules:
the smaller the distance to the lens, the higher the weight of a subject;
the smaller the distance to the intersection of the diagonals of the TOF image, the higher the weight of a subject;
the weight of a person is greater than that of an animal, and the weight of an animal is greater than that of a plant;
the weights of various types of subjects are determined according to a user instruction.
In one embodiment, the determining the target subject from the at least two candidate subjects includes:
acquiring a user selection instruction, the user selection instruction being an instruction triggered by the user selecting a subject identifier of the at least two candidate subjects;
determining the candidate subject corresponding to the user selection instruction as the target subject.
In one embodiment, the inputting the TOF image into a preset subject detection network to obtain at least two candidate subjects includes:
generating a center weight map corresponding to the TOF image, wherein the weight values represented by the center weight map gradually decrease from the center to the edges;
inputting the TOF image and the center weight map into the subject detection model to obtain a subject region confidence map, wherein the subject detection model is a model trained in advance on TOF images, center weight maps and corresponding labeled subject mask maps of the same scene;
determining at least two candidate subjects in the TOF image according to the subject region confidence map.
In one embodiment, the determining position information of the target subject in a preview image, the preview lens focusing on the target subject according to the position information, includes:
acquiring position coordinates of the target subject in the TOF image;
obtaining position coordinates of the target subject in the preview image according to a preset correspondence table between the coordinate system of the TOF lens and the coordinate system of the preview lens;
the preview lens focusing on the target subject according to the position coordinates of the target subject in the preview image.
In one embodiment, the determining position information of the target subject in a preview image, the preview lens focusing on the target subject according to the position information, includes:
acquiring depth information of the target subject;
determining focus position information of the target subject in the preview image according to the depth information of the target subject;
the preview lens focusing on the target subject according to the focus position information of the target subject in the preview image.
In one embodiment, the method further includes:
acquiring an RGB image of the preview lens;
the performing subject recognition on the TOF image to determine a target subject then including:
performing subject recognition on the TOF image and the RGB image to determine the target subject.
In a second aspect, an embodiment of the present application provides a subject focusing apparatus, including:
an acquisition module configured to acquire a TOF image;
a recognition module configured to perform subject recognition on the TOF image and determine a target subject;
a focusing module configured to acquire position information of the target subject in a preview image, a preview lens focusing on the target subject according to the position information.
In a third aspect, an embodiment of the present application provides a computer device including a memory and a processor, the memory storing a computer program, the processor implementing the steps of the method according to any one of the first aspect when executing the computer program.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, implementing the steps of the method according to any one of the first aspect.
With the subject focusing method and apparatus, electronic device, and storage medium provided by the embodiments of the present application, a TOF image is acquired; subject recognition is performed on the TOF image to determine a target subject; position information of the target subject in a preview image is acquired; and the preview lens focuses on the target subject according to the position information. By combining the TOF image captured by the TOF lens and recognizing the target subject from the TOF image, the preview lens is assisted in focusing on the basis of the recognized target subject; that is, the preview lens focuses according to the position of the target subject, which improves focusing accuracy and further improves shooting quality.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of an image processing circuit in one embodiment;
FIG. 2 is a flowchart of a subject focusing method in one embodiment;
FIG. 2.1 is a schematic diagram of a TOF image in one embodiment;
FIG. 2.2 is a schematic diagram of an image after subject recognition in one embodiment;
FIG. 2.3 is a schematic diagram of subject focusing in one embodiment;
FIG. 3 is a flowchart of a subject focusing method provided by another embodiment;
FIG. 4 is a flowchart of a subject focusing method provided by another embodiment;
FIG. 5 is a flowchart of a subject focusing method provided by another embodiment;
FIG. 6 is a flowchart of a subject focusing method provided by another embodiment;
FIG. 7 is a schematic diagram of the network structure of a subject detection model in one embodiment;
FIG. 8 is a schematic diagram of an image processing effect in one embodiment;
FIG. 9 is a flowchart of a subject focusing method provided by yet another embodiment;
FIG. 10 is a flowchart of a subject focusing method provided by yet another embodiment;
FIG. 11 is a flowchart of a subject focusing method provided by one embodiment;
FIG. 12 is a block diagram of a subject focusing apparatus provided by one embodiment;
FIG. 13 is a block diagram of a subject focusing apparatus provided by one embodiment;
FIG. 14 is a block diagram of an electronic device provided by one embodiment.
Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present application and are not intended to limit it.
The subject focusing method in the embodiments of the present application can be applied to an electronic device. The electronic device may be a computer device with a camera, a personal digital assistant, a tablet computer, a smartphone, a wearable device, etc. When capturing an image, the camera in the electronic device performs autofocus to ensure that the captured image is sharp.
In one embodiment, the electronic device may include an image processing circuit, which may be implemented using hardware and/or software components and may include various processing units defining an ISP (Image Signal Processing) pipeline. FIG. 1 is a schematic diagram of an image processing circuit in one embodiment. As shown in FIG. 1, for ease of description, only the aspects of the image processing technology related to the embodiments of the present application are shown.
As shown in FIG. 1, the image processing circuit includes a first ISP processor 130, a second ISP processor 140 and a control logic 150. A first camera 110 includes one or more first lenses 112 and a first image sensor 114. The first image sensor 114 may include a color filter array (such as a Bayer filter); the first image sensor 114 can acquire the light intensity and wavelength information captured by each of its imaging pixels and provide a set of image data that can be processed by the first ISP processor 130. A second camera 120 includes one or more second lenses 122 and a second image sensor 124. The second image sensor 124 may include a color filter array (such as a Bayer filter); the second image sensor 124 can acquire the light intensity and wavelength information captured by each of its imaging pixels and provide a set of image data that can be processed by the second ISP processor 140.
The first image captured by the first camera 110 is transmitted to the first ISP processor 130 for processing. After processing the first image, the first ISP processor 130 can send statistical data of the first image (such as image brightness, image contrast, image color, etc.) to the control logic 150, and the control logic 150 can determine control parameters of the first camera 110 according to the statistical data, so that the first camera 110 can perform operations such as autofocus and auto-exposure according to the control parameters. After being processed by the first ISP processor 130, the first image can be stored in an image memory 160, and the first ISP processor 130 can also read the image stored in the image memory 160 for processing. In addition, after being processed by the ISP processor 130, the first image can be sent directly to a display 170 for display, and the display 170 can also read the image in the image memory 160 for display.
The first ISP processor 130 processes the image data pixel by pixel in a variety of formats. For example, each image pixel may have a bit depth of 8, 10, 12 or 14 bits; the first ISP processor 130 may perform one or more image processing operations on the image data and collect statistical information about the image data. The image processing operations may be performed at the same or different bit-depth precisions.
The image memory 160 may be part of a memory device, a storage device, or an independent dedicated memory within the electronic device, and may include a DMA (Direct Memory Access) feature.
Upon receiving data from the interface of the first image sensor 114, the first ISP processor 130 may perform one or more image processing operations, such as temporal filtering. The processed image data may be sent to the image memory 160 for additional processing before being displayed. The first ISP processor 130 receives the processed data from the image memory 160 and performs image data processing on the processed data in the RGB and YCbCr color spaces. The image data processed by the first ISP processor 130 may be output to the display 170 for viewing by the user and/or for further processing by a graphics engine or GPU (Graphics Processing Unit). In addition, the output of the first ISP processor 130 may also be sent to the image memory 160, and the display 170 may read the image data from the image memory 160. In one embodiment, the image memory 160 may be configured to implement one or more frame buffers.
The statistical data determined by the first ISP processor 130 may be sent to the control logic 150. For example, the statistical data may include statistical information of the first image sensor 114 such as auto-exposure, auto white balance, autofocus, flicker detection, black level compensation and shading correction of the first lens 112. The control logic 150 may include a processor and/or microcontroller executing one or more routines (such as firmware), and the one or more routines may determine the control parameters of the first camera 110 and the control parameters of the first ISP processor 130 according to the received statistical data. For example, the control parameters of the first camera 110 may include gain, integration time of exposure control, anti-shake parameters, flash control parameters, control parameters of the first lens 112 (such as the focal length for focusing or zooming), or a combination of these parameters. The ISP control parameters may include gain levels and color correction matrices for auto white balance and color adjustment (for example, during RGB processing), as well as shading correction parameters for the first lens 112.
Similarly, the second image captured by the second camera 120 is transmitted to the second ISP processor 140 for processing. After processing the second image, the second ISP processor 140 can send statistical data of the second image (such as image brightness, image contrast, image color, etc.) to the control logic 150, and the control logic 150 can determine control parameters of the second camera 120 according to the statistical data, so that the second camera 120 can perform operations such as autofocus and auto-exposure according to the control parameters. After being processed by the second ISP processor 140, the second image can be stored in the image memory 160, and the second ISP processor 140 can also read the image stored in the image memory 160 for processing. In addition, after being processed by the ISP processor 140, the second image can be sent directly to the display 170 for display, and the display 170 can also read the image in the image memory 160 for display. The second camera 120 and the second ISP processor 140 can also implement the processing described for the first camera 110 and the first ISP processor 130.
In one embodiment, the first camera 110 may be a color camera, and the second camera 120 may be a TOF (Time of Flight) camera or a structured light camera. The TOF camera can acquire a TOF depth map, and the structured light camera can acquire a structured light depth map. The first camera 110 and the second camera 120 may both be color cameras, in which case a binocular depth map is acquired through the two color cameras. The first ISP processor 130 and the second ISP processor 140 may be the same ISP processor.
The first camera 110 and the second camera 120 capture the same scene to obtain a visible light image and a TOF image respectively, and send the visible light image and the depth map to the ISP processor. The ISP processor can perform subject recognition on the TOF image captured by the second camera 120 to determine the target subject, determine the position information of the target subject in the preview lens of the first camera 110, and focus on the target subject in the preview lens according to the position information. Recognizing the target subject from the TOF image and focusing on it according to its position in the preview lens improves the focusing accuracy and thus the shooting quality.
FIG. 2 is a flowchart of a subject focusing method in one embodiment. As shown in FIG. 2, a subject focusing method, applicable to the electronic device in FIG. 1, includes:
S201: Acquire a TOF image.
The TOF image may be an image captured by a TOF camera, or an image captured with an RGB lens.
In this embodiment, taking a TOF image captured by a TOF camera as an example, the TOF image may be acquired through the TOF lens in real time after the electronic device turns on the TOF camera, or may be acquired through the TOF lens when the user triggers the shooting or focusing function; this is not limited in this embodiment. As shown in FIG. 2.1, the picture acquired through the TOF lens is the TOF image, which includes a foreground and a background.
S202: Perform subject recognition on the TOF image to determine a target subject.
In this embodiment, ordinary image recognition technology may be used to recognize the subject in the TOF image. For example, when the target subject is a person, face recognition technology may be used to recognize the face in the TOF image; alternatively, a pre-trained detection model may be used to recognize the subject in the TOF image. As shown in FIG. 2.2, subject recognition is performed on the TOF image to determine the target subject, here an airplane; it can be seen that the foreground has been extracted from the picture and the background discarded.
Optionally, a TOF image may include one subject or multiple subjects. When it is detected that the TOF image includes multiple subjects, one target subject may be selected from the multiple subjects. For example, different weights may be set in advance for different categories of subjects, and the subject with the highest weight among the multiple subjects is taken as the target subject; alternatively, when multiple subjects are detected, the user may be prompted to select one of them as the target subject.
S203: Acquire position information of the target subject in a preview image, the preview lens focusing on the target subject according to the position information.
In this embodiment, the position conversion relationship between the TOF image and the preview image may be obtained by pre-calibration. Once the target subject is determined, its position in the TOF image can be acquired, and its position in the preview image determined according to the position conversion relationship between the TOF image and the preview image. The position information may be coordinate information of the target subject in the preview image: for example, the coordinates of each pixel of the target subject in the TOF image are first determined and then converted into the preview image, yielding the position of the target subject in the preview image; the preview lens determines the focus point according to that position, and the lens is adjusted to the focus point. Alternatively, the depth information of the target subject may be calculated from the TOF image, the focus position of the target subject estimated from the depth information, the focus point determined from the focus position, and the lens adjusted to the focus point. As shown in FIG. 2.3, the corresponding position of the subject in the preview image of the preview lens is found and focused on.
With the subject focusing method provided by this embodiment of the present application, a TOF image is acquired; subject recognition is performed on the TOF image to determine a target subject; position information of the target subject in a preview image is determined; and the preview lens focuses on the target subject according to the position information. By combining the TOF image captured by the TOF lens and recognizing the target subject from the TOF image, the preview lens is assisted in focusing on the basis of the recognized target subject; that is, the preview lens focuses according to the position of the target subject, which improves focusing accuracy and further improves shooting quality.
Generally, a TOF image may include one subject or multiple subjects. When the TOF image includes one subject, that subject can be directly determined as the target subject; however, when the TOF image includes multiple subjects, one target subject needs to be selected as the shooting object. FIG. 3 is a flowchart of a subject focusing method provided by another embodiment. As shown in FIG. 3, one possible implementation of step S202 in FIG. 2, 'Perform subject recognition on the TOF image to determine a target subject', includes:
S301: Input the TOF image into a preset subject detection network to obtain at least two candidate subjects.
In this embodiment, a subject detection network may be trained in advance for subject recognition, and inputting the TOF image into the preset subject detection network outputs the candidate subjects. The subject detection network may be trained on a large number of TOF images; it can recognize the foreground of a TOF image and detect various subjects such as people, flowers, cats, dogs, backgrounds, etc.
S302: Determine the target subject from the at least two candidate subjects.
In this embodiment, focusing is usually performed on a single subject; therefore, when multiple candidate subjects are recognized in the TOF image, one target subject needs to be determined from them. The target subject may be determined according to the weights of subjects of each category, or may be selected by the user. Two methods of determining the target subject are introduced below with reference to FIG. 4 and FIG. 5.
As shown in FIG. 4, one possible implementation of 'S302: Determine the target subject from the at least two candidate subjects' may include:
S401: Determine the weight of each candidate subject according to a preset weight rule.
Optionally, the preset weight rule includes at least one of the following rules:
the smaller the distance to the lens, the higher the weight of a subject;
the smaller the distance to the intersection of the diagonals of the TOF image, the higher the weight of a subject;
the weight of a person is greater than that of an animal, and the weight of an animal is greater than that of a plant;
the weights of various types of subjects are determined according to a user instruction.
In this embodiment, different weights may be set in advance for different categories of subjects. For example, the system may preset a default set of weight rules before release according to test results, e.g., person > bird > cat > dog > flower; or the closer to the lens, the higher the weight, and the farther from the lens, the lower the weight; or the closer to the intersection of the diagonals of the TOF image, the higher the weight, and the farther from that intersection, the lower the weight; and so on, as determined by the requirements of the actual scene. Alternatively, the system may offer multiple optional weight rules, and the user may select at least one of them according to actual needs. A sketch of combining such rules is given below.
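By way of illustration, a minimal Python sketch of combining such weight rules is given below; the candidate representation, the category ranking, and the blending coefficients are all assumptions made for exposition, not values prescribed by the method.

```python
import math

# Assumed category ranking, e.g. person > bird > cat > dog > flower.
CATEGORY_WEIGHT = {"person": 5, "bird": 4, "cat": 3, "dog": 2, "flower": 1}

def candidate_weight(candidate, image_w, image_h):
    """Combine the preset weight rules into one score for a candidate subject.

    candidate is assumed to be a dict with 'category', 'distance_m' (distance
    to the lens) and 'center' (pixel position); coefficients are illustrative."""
    w = CATEGORY_WEIGHT.get(candidate["category"], 0)
    # Rule: the smaller the distance to the lens, the higher the weight.
    w += 1.0 / (1.0 + candidate["distance_m"])
    # Rule: the closer to the intersection of the image diagonals, the higher
    # the weight (the intersection of the diagonals is the image center).
    cx, cy = image_w / 2.0, image_h / 2.0
    d = math.hypot(candidate["center"][0] - cx, candidate["center"][1] - cy)
    w += 1.0 / (1.0 + d / max(image_w, image_h))
    return w

def pick_target(candidates, image_w, image_h):
    # S402: the candidate subject with the largest weight is the target subject.
    return max(candidates, key=lambda c: candidate_weight(c, image_w, image_h))
```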
S402: Determine the candidate subject with the largest weight as the target subject.
In this embodiment, the larger the weight, the closer the subject is to the object to be photographed; therefore, the candidate subject with the largest weight can be determined as the target subject.
In this embodiment, the weight of each candidate subject is determined according to the preset weight rule, and the candidate subject with the largest weight is determined as the target subject. Different weight rules can be flexibly set according to the actual scene, so that the determined target subject better matches the scene and the user's needs; the method thus adapts flexibly to various scenes and has strong universality.
As shown in FIG. 5, another possible implementation of 'S302: Determine the target subject from the at least two candidate subjects' may include:
S501: Acquire a user selection instruction; the user selection instruction is an instruction triggered by the user selecting a subject identifier of the at least two candidate subjects.
In this embodiment, the user may send the user selection instruction to the device in multiple ways. For example, when multiple candidate subjects are determined, the electronic device may display a candidate box corresponding to each candidate subject on the display screen, and the user may tap a candidate box to select a subject identifier and generate the user selection instruction. Alternatively, the user may input the user selection instruction by voice: when multiple candidate subjects are determined, the user may speak a subject identifier to generate the user selection instruction. For example, if the user wants to photograph a person and the recognized subjects include a person, an animal and a plant, the user may say 'person' to generate the user selection instruction. The user selection instruction may also be acquired in other ways, which is not limited in this embodiment.
S502: Determine the candidate subject corresponding to the user selection instruction as the target subject.
After acquiring the user selection instruction, the electronic device can determine the subject identifier selected by the user according to the user selection instruction, and determine the corresponding target subject according to the subject identifier.
With the subject focusing method provided by this embodiment, an instruction triggered by the user selecting a subject identifier of the at least two candidate subjects is acquired, and the candidate subject corresponding to the subject identifier selected by the user is determined as the target subject, so that the user can select the shooting subject according to actual needs, improving focusing accuracy and increasing the intelligence of human-computer interaction.
The specific implementation of subject recognition is described in detail below. As shown in FIG. 6, one possible implementation of step 'S301: Input the TOF image into a preset subject detection network to obtain at least two candidate subjects' includes:
S601: Generate a center weight map corresponding to the TOF image, wherein the weight values represented by the center weight map gradually decrease from the center to the edges.
The center weight map is a map used to record the weight value of each pixel in the TOF image. The weight values recorded in the center weight map gradually decrease from the center to the four sides, i.e., the center weight is the largest and the weights gradually decrease toward the four sides. The center weight map represents weight values that gradually decrease from the center pixel of the TOF image to its edge pixels.
The ISP processor or central processor can generate the corresponding center weight map according to the size of the TOF image. The weight values represented by the center weight map gradually decrease from the center to the four sides. The center weight map may be generated using a Gaussian function, a first-order equation, or a second-order equation, and the Gaussian function may be a two-dimensional Gaussian function. A sketch of such a map is given below.
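By way of illustration, a minimal sketch of generating such a center weight map with a two-dimensional Gaussian function is given below; the sigma choice is an assumption, since the method only requires that the weights decrease from the center to the edges.

```python
import numpy as np

def center_weight_map(height, width, sigma_scale=0.5):
    """Return a (height, width) map whose weights peak at the image center
    and decay toward the edges following a 2D Gaussian; sigma_scale is an
    assumed parameter controlling how fast the weights fall off."""
    ys = np.arange(height) - (height - 1) / 2.0
    xs = np.arange(width) - (width - 1) / 2.0
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    sigma_y, sigma_x = sigma_scale * height, sigma_scale * width
    return np.exp(-(xx**2 / (2 * sigma_x**2) + yy**2 / (2 * sigma_y**2)))

# Example: a weight map matching a 240x180 TOF image; the weight is largest
# at the center and gradually decreases toward the four sides.
weights = center_weight_map(180, 240)
```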
S602: Input the TOF image and the center weight map into the subject detection model to obtain a subject region confidence map, wherein the subject detection model is a model trained in advance on TOF images, center weight maps and corresponding labeled subject mask maps of the same scene.
Salient object detection refers to automatically processing the regions of interest in a scene while selectively ignoring the regions of no interest. The region of interest is called the subject region.
The subject detection model is obtained by collecting a large amount of training data in advance and inputting the training data into a subject detection model containing initial network weights for training. Each set of training data includes a TOF image, a center weight map and a labeled subject mask map corresponding to the same scene. The TOF image and the center weight map serve as the input of the subject detection model being trained, and the labeled subject mask map serves as the ground truth that the trained subject detection model is expected to output. The subject mask map is an image filter template used to identify the subject in an image; it can block the other parts of the image and filter out the subject. The subject detection model can be trained to recognize and detect various subjects, such as people, flowers, cats, dogs, backgrounds, etc.
FIG. 7 is a schematic diagram of the network structure of the subject detection model in one embodiment. As shown in FIG. 7, the network structure of the subject detection model includes a convolutional layer 402, a pooling layer 404, a convolutional layer 406, a pooling layer 408, a convolutional layer 410, a pooling layer 412, a convolutional layer 414, a pooling layer 416, a convolutional layer 418, a convolutional layer 420, a bilinear interpolation layer 422, a convolutional layer 424, a bilinear interpolation layer 426, a convolutional layer 428, a convolutional feature concatenation layer 430, a bilinear interpolation layer 432, a convolutional layer 434, a convolutional feature concatenation layer 436, a bilinear interpolation layer 438, a convolutional layer 440 and a convolutional feature concatenation layer 442; the convolutional layer 402 serves as the input layer of the subject detection model, and the convolutional feature concatenation layer 442 serves as its output layer. The network structure of the subject detection model in this embodiment is only an example and does not limit the present application. It can be understood that multiple convolutional layers, pooling layers, bilinear interpolation layers, convolutional feature concatenation layers and so on may be provided in the network structure of the subject detection model as needed.
The encoding part of the subject detection model includes the convolutional layer 402, pooling layer 404, convolutional layer 406, pooling layer 408, convolutional layer 410, pooling layer 412, convolutional layer 414, pooling layer 416 and convolutional layer 418; the decoding part includes the convolutional layer 420, bilinear interpolation layer 422, convolutional layer 424, bilinear interpolation layer 426, convolutional layer 428, convolutional feature concatenation layer 430, bilinear interpolation layer 432, convolutional layer 434, convolutional feature concatenation layer 436, bilinear interpolation layer 438, convolutional layer 440 and convolutional feature concatenation layer 442. The convolutional layer 406 and the convolutional layer 434 are concatenated, the convolutional layer 410 and the convolutional layer 428 are concatenated, and the convolutional layer 414 and the convolutional layer 424 are concatenated. The bilinear interpolation layer 422 and the convolutional feature concatenation layer 430 are bridged by deconvolution feature addition (deconvolution + add), as are the bilinear interpolation layer 432 and the convolutional feature concatenation layer 436, and the bilinear interpolation layer 438 and the convolutional feature concatenation layer 442.
The original image 450 (e.g., a TOF image) is input to the convolutional layer 402 of the subject detection model, a depth map 460 is applied to the convolutional feature concatenation layer 442 of the subject detection model, and a center weight map 470 is applied to the convolutional feature concatenation layer 442 of the subject detection model. The depth map 460 and the center weight map 470 are each input to the convolutional feature concatenation layer 442 as a multiplication factor. After the original image 450, the depth map 460 and the center weight map 470 are input to the subject detection model, it outputs a confidence map 480 containing the subject.
During the training of the subject detection model, a preset dropout rate is applied to the depth map; the preset value may be 50%. Introducing probabilistic dropout of the depth map during training allows the subject detection model to fully mine the information in the depth map while still producing accurate results when no depth map is available. Applying dropout to the depth map input makes the subject detection model more robust with respect to depth maps, so that the subject region can be accurately segmented even without a depth map.
In addition, since capturing and computing a depth map are quite time- and labor-consuming during normal shooting on an electronic device and depth maps are therefore difficult to obtain, designing a 50% dropout probability for the depth map during training ensures that the subject detection model can still detect normally when depth information is unavailable. A sketch of this input dropout is given below.
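By way of illustration, a minimal sketch of this input dropout is given below: during training, the entire depth map is replaced with zeros with probability 0.5, so the model learns to produce accurate results even when no depth map is available. The zero-filling convention is an assumption; any fixed placeholder would serve the same purpose.

```python
import numpy as np

def maybe_drop_depth(depth_map, drop_prob=0.5, training=True):
    """With probability drop_prob, drop the depth input during training so the
    subject detection model stays robust when depth information is missing."""
    if training and np.random.rand() < drop_prob:
        return np.zeros_like(depth_map)  # stand-in for "no depth map"
    return depth_map
```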
Highlight detection is performed on the original image 450 using a highlight detection layer 444 to identify the highlight regions in the original image. Adaptive threshold filtering is applied to the subject region confidence map output by the subject detection model to obtain a binarized mask map; morphological processing and guided filtering are applied to the binarized mask map to obtain the subject mask map; the subject mask map is then differenced with the original image containing the highlight regions, and the highlight regions are removed from the subject mask map to obtain a subject with highlights removed. The subject region confidence map is a confidence map distributed over 0 to 1; it contains many noise points, such as low-confidence noise points or small clustered regions of high confidence. Filtering with a region-adaptive confidence threshold yields the binarized mask map. Morphological processing of the binarized mask map further reduces noise, and guided filtering makes the edges smoother. It can be understood that the subject region confidence map may be regarded as a subject mask map containing noise points. A sketch of this post-processing chain is given below.
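By way of illustration, a minimal OpenCV-based sketch of this post-processing chain is given below; the block size, kernel size and guided-filter parameters are assumptions, and cv2.ximgproc.guidedFilter requires the opencv-contrib package.

```python
import cv2
import numpy as np

def confidence_to_mask(confidence, guide):
    """confidence: float32 subject region confidence map in [0, 1];
    guide: original image used to steer the guided filter.
    Parameter values below are illustrative only."""
    conf8 = (confidence * 255).astype(np.uint8)
    # Adaptive threshold filtering to binarize the confidence map.
    mask = cv2.adaptiveThreshold(conf8, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                 cv2.THRESH_BINARY, blockSize=51, C=-5)
    # Morphological opening and closing to suppress isolated noise points.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # Guided filtering against the original image to smooth the mask edges.
    mask = cv2.ximgproc.guidedFilter(guide, mask, radius=8, eps=1e-2)
    return mask
```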
In one embodiment, the training method of the subject detection model includes: acquiring a TOF image and a labeled subject mask map of the same scene; generating a center weight map corresponding to the TOF image, wherein the weight values represented by the center weight map gradually decrease from the center to the edges; applying the TOF image to the input layer of a subject detection model containing initial network weights, applying the center weight map to the output layer of the initial subject detection model, using the labeled subject mask map as the ground truth output by the subject detection model, and training the subject detection model containing the initial network weights to obtain the target network weights of the subject detection model.
The training in this embodiment uses the TOF image and the center weight map; that is, no depth map is introduced at the output layer of the network structure of the subject detection model in FIG. 7: the TOF image is applied to the convolutional layer 402, and the center weight map 470 is applied to the convolutional feature concatenation layer 442 of the subject detection model.
Specifically, the ISP processor or central processor can input the TOF image and the center weight map into the subject detection model, and detection yields the subject region confidence map. The subject region confidence map records the probability of the subject belonging to each recognizable category; for example, the probability of a certain pixel belonging to a person is 0.8, to a flower 0.1, and to the background 0.1.
FIG. 8 is a schematic diagram of an image processing effect in one embodiment. As shown in FIG. 8, there is a butterfly in the TOF image 602. The TOF image is input into the subject detection model to obtain a subject region confidence map 604; the subject region confidence map 604 is then filtered and binarized to obtain a binarized mask map 606; and morphological processing and guided filtering are applied to the binarized mask map 606 for edge enhancement, yielding a subject mask map 608.
S603: Determine at least two candidate subjects in the TOF image according to the subject region confidence map.
A subject refers to any of various objects, such as a person, flower, cat, dog, cow, blue sky, white cloud, background, etc. The target subject refers to the desired subject, which can be selected as needed.
Specifically, the ISP processor or central processor can select, according to the subject region confidence map, the subject with the highest or second-highest confidence as the subject in the visible light image. If there is one subject, that subject is taken as the target subject; if there are multiple subjects, one or more of them can be selected as the target subject as needed.
With the subject focusing method in this embodiment, after the TOF image is acquired and the corresponding center weight map is generated, the TOF image and the center weight map are input into the corresponding subject detection model for detection, yielding the subject region confidence map, from which the target subject in the TOF image can be determined. The center weight map makes objects at the center of the image easier to detect, and the subject detection model trained on TOF images, center weight maps, subject mask maps and so on can more accurately recognize the target subject in the visible light image, making focusing more accurate.
In the above embodiments, subject recognition is performed on the TOF image alone. Further, subject recognition may also be performed by combining the TOF image with a preview RGB (Red, Green, Blue) image. Optionally, the subject focusing method may further include: acquiring an RGB image of the preview lens; step 'S202: Perform subject recognition on the TOF image to determine a target subject' then includes: performing subject recognition on the TOF image and the RGB image to determine the target subject.
In this embodiment, an RGB image of the preview lens may also be acquired, and subject recognition performed by combining the TOF image and the RGB image to determine the target subject, which can make subject recognition more accurate. Both the TOF image and the RGB image may be input into the above subject detection model to recognize the subjects therein; for the method of recognizing subjects with the subject detection model and determining the target subject from the recognized subjects, reference may be made to the above embodiments, which are not repeated here.
In the above embodiments, after the target subject is determined, the preview lens may focus according to the position coordinates of the target subject in the preview image, or the focus position of the target subject may be calculated from its depth information for focusing. The specific implementations of determining the position information of the target subject in the preview lens are introduced below.
As shown in FIG. 9, one possible implementation of step 'S203: Acquire position information of the target subject in a preview image, the preview lens focusing on the target subject according to the position information' includes:
S901: Acquire the position coordinates of the target subject in the TOF image.
In this embodiment, a TOF camera coordinate system may be established for the TOF camera, and the position coordinates of each pixel of the target subject in the TOF camera coordinate system determined, i.e., the position coordinates of the target subject in the TOF image.
S902: Obtain the position coordinates of the target subject in the preview image according to a preset correspondence table between the coordinate system of the TOF lens and the coordinate system of the preview lens.
In this embodiment, a preview camera coordinate system may also be established for the preview camera. Through pre-calibration, the coordinates of a pixel A in the TOF image and the coordinates of the corresponding pixel A1 in the preview image can be determined; after acquiring the coordinates of a large number of pixels A and pixels A1, the correspondence table between the coordinate system of the TOF lens and the coordinate system of the preview lens can be computed. Therefore, once the position coordinates of the target subject in the TOF image are determined, the position coordinates of the target subject in the preview image can be determined according to this correspondence table. A sketch of estimating such a mapping from calibrated point pairs is given below.
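By way of illustration, a minimal sketch of estimating such a mapping from calibrated point pairs is given below. Fitting the correspondence as a single homography with OpenCV is an assumption made for exposition; the method itself only requires a pre-calibrated correspondence table, and the point coordinates here are placeholders.

```python
import cv2
import numpy as np

# Calibrated pixel pairs: pts_tof[i] in the TOF image corresponds to
# pts_preview[i] in the preview image (coordinates are placeholders).
pts_tof = np.array([[10, 12], [200, 15], [198, 150], [12, 148]], np.float32)
pts_preview = np.array([[40, 50], [820, 55], [815, 600], [45, 595]], np.float32)

H, _ = cv2.findHomography(pts_tof, pts_preview, cv2.RANSAC)

def tof_to_preview(x, y):
    """Map a pixel of the target subject from TOF-image coordinates to
    preview-image coordinates through the fitted homography H."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]
```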
S903: The preview lens focuses on the target subject according to the position coordinates of the target subject in the preview image.
In this embodiment, the preview lens can determine the focus point according to the position coordinates of the target subject in the preview image, and the position and angle of the preview lens are adjusted so that the preview lens moves to the focus point.
With the subject focusing method provided by this embodiment, the position coordinates of the target subject in the TOF image are acquired; the position coordinates of the target subject in the preview image are obtained according to the preset correspondence table between the coordinate system of the TOF lens and the coordinate system of the preview lens; and the preview lens focuses on the target subject according to the position coordinates of the target subject in the preview image. With the correspondence table between the two coordinate systems obtained in advance, the position coordinates of the target subject in the preview image can be determined quickly and accurately, improving both focusing precision and focusing efficiency.
As shown in FIG. 10, another possible implementation of step 'S203: Acquire position information of the target subject in a preview image, the preview lens focusing on the target subject according to the position information' includes:
S1001: Acquire the depth information of the target subject.
In this embodiment, after the target subject is determined, the region containing the target subject is cropped from the TOF image, and the depth information of that region is calculated. Alternatively, the depth information of the entire TOF image may be calculated first, and the depth information of the target subject then obtained from the depth information of the entire TOF image.
S1002: Determine the focus position information of the target subject in the preview image according to the depth information of the target subject.
In this embodiment, the focus position of the target subject can be estimated from the depth information of the target subject, and the focus position can be further fine-tuned. The depth information may include the depth values of pixels in the TOF image. That is, after the depth value of each pixel in the region where the target subject is located is acquired: if the region is a single pixel, the depth value of that pixel can be used directly for autofocus; if the region contains multiple pixels, the depth values of the multiple pixels need to be fused into a single depth value. Preferably, the average of the depth values of the pixels in the region is taken as the single depth information of the region. Further, to prevent individual pixels with excessively large or small depth values from distorting the true depth of the focused object in the region, the depth values of the pixels in the middle of the distribution are selected and averaged according to the distribution of the depth values, yielding the single depth information of the region. Other methods may also be used, which is not limited here. After the single depth information of the region where the target subject is located is acquired, the focal length of the zoom camera lens is adjusted to focus at that depth. The adjustment can be completed by a preset program: there is a certain relationship between the focal length and the depth value, and this relationship is stored in the camera system's memory in the form of a program; after the single depth value is acquired, the adjustment amount is calculated by the program, and autofocus is then achieved. A sketch of this depth fusion is given below.
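By way of illustration, a minimal sketch of fusing the region's per-pixel depth values into a single depth is given below; the 25%-75% trimming band used to select the middle of the depth distribution is an assumption.

```python
import numpy as np

def region_depth(depth_values, low_q=0.25, high_q=0.75):
    """Fuse per-pixel depths of the target subject's region into one value:
    average only the pixels in the middle of the depth distribution so that
    individual overly large or small depth values do not skew the result."""
    d = np.asarray(depth_values, dtype=np.float64).ravel()
    if d.size == 1:
        return float(d[0])  # a single pixel's depth is used directly
    lo, hi = np.quantile(d, [low_q, high_q])
    middle = d[(d >= lo) & (d <= hi)]
    return float(middle.mean()) if middle.size else float(d.mean())
```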
S1003: The preview lens focuses on the target subject according to the focus position information of the target subject in the preview image.
With the subject focusing method provided by this embodiment, the depth information of the target subject is acquired, and the focus position information of the target subject in the preview lens is determined according to that depth information; after the target subject is recognized, its depth information is calculated to estimate its focus position, and the preview lens focuses on the target subject according to the focus position information of the target subject in the preview lens, allowing the target subject to be focused on more quickly.
FIG. 11 is a flowchart of a subject focusing method provided by one embodiment. The method may include:
S1101: After the TOF lens is turned on, acquire a TOF image;
S1102: Input the TOF image into the subject detection model for subject recognition to obtain multiple candidate subjects;
S1103: Determine the target subject according to the weight of each candidate subject;
S1104: Display the target subject;
S1105: Determine the position information of the target subject in the preview image;
S1106: The preview lens focuses on the target subject according to the position information.
With the subject focusing method provided by this embodiment of the present application, a TOF image is acquired; the TOF image is input into the subject detection model for subject recognition to obtain multiple candidate subjects; the target subject is determined according to the weight of each candidate subject; the position information of the target subject in the preview image is determined; and the preview lens focuses on the target subject according to the position information. By combining the TOF image captured by the TOF lens and recognizing the target subject from it, the preview lens is assisted in focusing on the basis of the recognized target subject, which improves focusing accuracy and further improves shooting quality.
It should be understood that although the steps in the flowcharts of FIGS. 2-11 are shown sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they may be executed in other orders. Moreover, at least some of the steps in FIGS. 2-11 may include multiple sub-steps or stages, which are not necessarily completed at the same time but may be executed at different times; the execution order of these sub-steps or stages is not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 12, a subject focusing apparatus is provided, including:
an acquisition module 121 configured to acquire a TOF image;
a recognition module 122 configured to perform subject recognition on the TOF image and determine a target subject;
a focusing module 123 configured to acquire position information of the target subject in a preview image, a preview lens focusing on the target subject according to the position information.
In one embodiment, as shown in FIG. 13, the recognition module 122 includes:
a detection unit 1221 configured to input the TOF image into a preset subject detection network to obtain at least two candidate subjects;
a determining unit 1222 configured to determine the target subject from the at least two candidate subjects.
In one embodiment, the determining unit 1222 is configured to determine the weight of each candidate subject according to a preset weight rule, and to determine the candidate subject with the largest weight as the target subject.
In one embodiment, the weight rule includes at least one of the following rules:
the smaller the distance to the lens, the higher the weight of a subject;
the smaller the distance to the intersection of the diagonals of the TOF image, the higher the weight of a subject;
the weight of a person is greater than that of an animal, and the weight of an animal is greater than that of a plant;
the weights of various types of subjects are determined according to a user instruction.
In one embodiment, the determining unit 1222 is configured to acquire a user selection instruction, the user selection instruction being an instruction triggered by the user selecting a subject identifier of the at least two candidate subjects, and to determine the candidate subject corresponding to the user selection instruction as the target subject.
In one embodiment, the recognition module 122 is configured to generate a center weight map corresponding to the TOF image, wherein the weight values represented by the center weight map gradually decrease from the center to the edges; input the TOF image and the center weight map into the subject detection model to obtain a subject region confidence map, wherein the subject detection model is a model trained in advance on TOF images, center weight maps and corresponding labeled subject mask maps of the same scene; and determine at least two candidate subjects in the TOF image according to the subject region confidence map.
In one embodiment, the focusing module 123 is configured to acquire the position coordinates of the target subject in the TOF image; obtain the position coordinates of the target subject in the preview image according to the preset correspondence table between the coordinate system of the TOF lens and the coordinate system of the preview lens; and cause the preview lens to focus on the target subject according to the position coordinates of the target subject in the preview image.
In one embodiment, the focusing module 123 is configured to acquire the depth information of the target subject; determine the focus position information of the target subject in the preview image according to the depth information of the target subject; and cause the preview lens to focus on the target subject according to the focus position information of the target subject in the preview image.
In one embodiment, the acquisition module 121 is further configured to acquire an RGB image of the preview lens, and the recognition module 122 is configured to perform subject recognition on the TOF image and the RGB image to determine the target subject.
For the specific limitations of the subject focusing apparatus, reference may be made to the limitations of the subject focusing method above, which are not repeated here. Each module in the above subject focusing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in a computer device in the form of hardware, or stored in a memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to each module.
In one embodiment, an electronic device is provided. The electronic device may be a terminal, and its internal structure diagram may be as shown in FIG. 14. The electronic device includes a processor, a memory, a network interface, a display screen and an input device connected through a system bus. The processor of the electronic device is used to provide computing and control capabilities. The memory of the electronic device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The network interface of the electronic device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements a subject focusing method. The display screen of the electronic device may be a liquid crystal display screen or an electronic ink display screen, and the input device of the electronic device may be a touch layer covering the display screen, a button, trackball or touchpad set on the housing of the electronic device, or an external keyboard, touchpad or mouse.
Those skilled in the art can understand that the structure shown in FIG. 14 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the electronic device to which the solution of the present application is applied; the specific electronic device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program, and the processor implementing the following steps when executing the computer program:
acquiring a TOF image;
performing subject recognition on the TOF image to determine a target subject;
acquiring position information of the target subject in a preview image, a preview lens focusing on the target subject according to the position information.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:
acquiring a TOF image;
performing subject recognition on the TOF image to determine a target subject;
acquiring position information of the target subject in a preview image, a preview lens focusing on the target subject according to the position information.
Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the above methods. Any reference to memory, storage, database or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM), etc.
The technical features of the above embodiments can be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as the combinations of these technical features involve no contradiction, they should all be considered within the scope of this specification.
The above embodiments express only several implementations of the present invention, and their descriptions are relatively specific and detailed, but they should not therefore be understood as limiting the scope of the invention patent. It should be noted that those of ordinary skill in the art can make several variations and improvements without departing from the concept of the present invention, all of which fall within the protection scope of the present invention. Therefore, the protection scope of the present invention patent shall be subject to the appended claims.

Claims (12)

  1. A subject focusing method, characterized in that the method comprises:
    acquiring a time-of-flight (TOF) image;
    performing subject recognition on the TOF image to determine a target subject;
    acquiring position information of the target subject in a preview image, a preview lens focusing on the target subject according to the position information.
  2. The method according to claim 1, characterized in that the performing subject recognition on the TOF image to determine a target subject comprises:
    inputting the TOF image into a preset subject detection network to obtain at least two candidate subjects;
    determining the target subject from the at least two candidate subjects.
  3. The method according to claim 2, characterized in that the determining the target subject from the at least two candidate subjects comprises:
    determining a weight of each candidate subject according to a preset weight rule;
    determining the candidate subject with the largest weight as the target subject.
  4. The method according to claim 3, characterized in that the weight rule comprises at least one of the following rules:
    the smaller the distance to the lens, the higher the weight of a subject;
    the smaller the distance to the intersection of the diagonals of the TOF image, the higher the weight of a subject;
    the weight of a person is greater than that of an animal, and the weight of an animal is greater than that of a plant;
    the weights of various types of subjects are determined according to a user instruction.
  5. The method according to claim 2, characterized in that the determining the target subject from the at least two candidate subjects comprises:
    acquiring a user selection instruction, the user selection instruction being an instruction triggered by the user selecting a subject identifier of the at least two candidate subjects;
    determining the candidate subject corresponding to the user selection instruction as the target subject.
  6. The method according to any one of claims 2-5, characterized in that the inputting the TOF image into a preset subject detection network to obtain at least two candidate subjects comprises:
    generating a center weight map corresponding to the TOF image, wherein the weight values represented by the center weight map gradually decrease from the center to the edges;
    inputting the TOF image and the center weight map into the subject detection model to obtain a subject region confidence map, wherein the subject detection model is a model trained in advance on TOF images, center weight maps and corresponding labeled subject mask maps of the same scene;
    determining at least two candidate subjects in the TOF image according to the subject region confidence map.
  7. The method according to any one of claims 1-5, characterized in that the acquiring position information of the target subject in a preview image, a preview lens focusing on the target subject according to the position information, comprises:
    acquiring position coordinates of the target subject in the TOF image;
    obtaining position coordinates of the target subject in the preview image according to a preset correspondence table between the coordinate system of the TOF lens and the coordinate system of the preview lens;
    the preview lens focusing on the target subject according to the position coordinates of the target subject in the preview image.
  8. The method according to any one of claims 1-5, characterized in that the acquiring position information of the target subject in a preview image, a preview lens focusing on the target subject according to the position information, comprises:
    acquiring depth information of the target subject;
    determining focus position information of the target subject in the preview image according to the depth information of the target subject;
    the preview lens focusing on the target subject according to the focus position information of the target subject in the preview image.
  9. The method according to any one of claims 1-5, characterized in that the method further comprises:
    acquiring an RGB image of the preview lens;
    the performing subject recognition on the TOF image to determine a target subject comprising:
    performing subject recognition on the TOF image and the RGB image to determine the target subject.
  10. A subject focusing apparatus, characterized by comprising:
    an acquisition module configured to acquire a TOF image;
    a recognition module configured to perform subject recognition on the TOF image and determine a target subject;
    a focusing module configured to acquire position information of the target subject in a preview image, a preview lens focusing on the target subject according to the position information.
  11. A computer device, comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method according to any one of claims 1 to 9 when executing the computer program.
  12. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 9.
PCT/CN2020/114124 2019-09-24 2020-09-09 Subject focusing method and apparatus, electronic device, and storage medium WO2021057474A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP20868949.7A EP4013033A4 (en) 2019-09-24 2020-09-09 SUBJECT FOCUSING METHOD AND APPARATUS, ELECTRONIC DEVICE AND STORAGE MEDIA
US17/671,303 US20220166930A1 (en) 2019-09-24 2022-02-14 Method and device for focusing on target subject, and electronic device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910906011.9A 2019-09-24 Subject focusing method and apparatus, electronic device, and storage medium
CN201910906011.9 2019-09-24

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/671,303 Continuation US20220166930A1 (en) 2019-09-24 2022-02-14 Method and device for focusing on target subject, and electronic device

Publications (1)

Publication Number Publication Date
WO2021057474A1 true WO2021057474A1 (zh) 2021-04-01

Family

ID=68559162

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/114124 WO2021057474A1 (zh) 2020-09-09 Subject focusing method and apparatus, electronic device, and storage medium

Country Status (4)

Country Link
US (1) US20220166930A1 (zh)
EP (1) EP4013033A4 (zh)
CN (1) CN110493527B (zh)
WO (1) WO2021057474A1 (zh)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110493527B (zh) * 2019-09-24 2022-11-15 Oppo广东移动通信有限公司 Subject focusing method and apparatus, electronic device, and storage medium
WO2021184341A1 (en) * 2020-03-20 2021-09-23 SZ DJI Technology Co., Ltd. Autofocus method and camera system thereof
WO2021184338A1 (zh) * 2020-03-20 2021-09-23 深圳市大疆创新科技有限公司 Autofocus method and apparatus, gimbal, device, and storage medium
CN112770100B (zh) * 2020-12-31 2023-03-21 南昌欧菲光电技术有限公司 Image acquisition method, photographing apparatus, and computer-readable storage medium
CN112969023A (zh) * 2021-01-29 2021-06-15 北京骑胜科技有限公司 Image capturing method, device, storage medium, and computer program product
CN117544851A (zh) * 2021-07-31 2024-02-09 华为技术有限公司 Photographing method and related apparatus
WO2023098743A1 (zh) * 2021-11-30 2023-06-08 上海闻泰信息技术有限公司 Automatic exposure method, apparatus, device, and storage medium
CN115103107B (zh) * 2022-06-01 2023-11-07 上海传英信息技术有限公司 Focus control method, intelligent terminal, and storage medium
CN116723264B (zh) * 2022-10-31 2024-05-24 荣耀终端有限公司 Method, device, and storage medium for determining target position information

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101888956B1 (ko) * 2012-05-31 2018-08-17 엘지이노텍 주식회사 Camera module and autofocusing method thereof
CN103514429B (zh) * 2012-06-21 2018-06-22 夏普株式会社 Method for detecting a specific part of an object and image processing device
US9773155B2 (en) * 2014-10-14 2017-09-26 Microsoft Technology Licensing, Llc Depth from time of flight camera
CN104363378B (zh) * 2014-11-28 2018-01-16 广东欧珀移动通信有限公司 Camera focusing method, apparatus, and terminal
US10091409B2 (en) * 2014-12-30 2018-10-02 Nokia Technologies Oy Improving focus in image and video capture using depth maps
CN105956586B (zh) * 2016-07-15 2019-06-11 瑞胜科信息(深圳)有限公司 Intelligent tracking system based on a TOF 3D camera
US11682107B2 (en) * 2018-12-14 2023-06-20 Sony Corporation Depth of field adjustment in images based on time of flight depth maps

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104660904A (zh) * 2015-03-04 2015-05-27 深圳市欧珀通信软件有限公司 Method and apparatus for recognizing a photographed subject
US20170366737A1 (en) * 2016-06-15 2017-12-21 Stmicroelectronics, Inc. Glass detection with time of flight sensor
CN106231189A (zh) * 2016-08-02 2016-12-14 乐视控股(北京)有限公司 Photographing processing method and apparatus
CN110099217A (zh) * 2019-05-31 2019-08-06 努比亚技术有限公司 TOF-based image capturing method, mobile terminal, and computer-readable storage medium
CN110149482A (zh) * 2019-06-28 2019-08-20 Oppo广东移动通信有限公司 Focusing method and apparatus, electronic device, and computer-readable storage medium
CN110248096A (zh) * 2019-06-28 2019-09-17 Oppo广东移动通信有限公司 Focusing method and apparatus, electronic device, and computer-readable storage medium
CN110493527A (zh) * 2019-09-24 2019-11-22 Oppo广东移动通信有限公司 Subject focusing method and apparatus, electronic device, and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4013033A4 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113438466A (zh) * 2021-06-30 2021-09-24 东风汽车集团股份有限公司 Method, system, device, and computer-readable storage medium for widening the field of view outside a vehicle
CN113438466B (zh) * 2021-06-30 2022-10-14 东风汽车集团股份有限公司 Method, system, device, and computer-readable storage medium for widening the field of view outside a vehicle

Also Published As

Publication number Publication date
US20220166930A1 (en) 2022-05-26
EP4013033A4 (en) 2022-10-19
CN110493527B (zh) 2022-11-15
EP4013033A1 (en) 2022-06-15
CN110493527A (zh) 2019-11-22

Similar Documents

Publication Publication Date Title
WO2021057474A1 (zh) Subject focusing method and apparatus, electronic device, and storage medium
US11457138B2 (en) Method and device for image processing, method for training object detection model
WO2020259179A1 (zh) Focusing method, electronic device, and computer-readable storage medium
US10997696B2 (en) Image processing method, apparatus and device
CN110248096B (zh) Focusing method and apparatus, electronic device, and computer-readable storage medium
EP3477931B1 (en) Image processing method and device, readable storage medium and electronic device
WO2021022983A1 (zh) Image processing method and apparatus, electronic device, and computer-readable storage medium
CN113766125B (zh) Focusing method and apparatus, electronic device, and computer-readable storage medium
WO2019085951A1 (en) Image processing method, and device
CN110349163B (zh) Image processing method and apparatus, electronic device, and computer-readable storage medium
CN107622497B (zh) Image cropping method and apparatus, computer-readable storage medium, and computer device
CN110881103B (zh) Focus control method and apparatus, electronic device, and computer-readable storage medium
CN110490196B (zh) Subject detection method and apparatus, electronic device, and computer-readable storage medium
CN110650288B (zh) Focus control method and apparatus, electronic device, and computer-readable storage medium
CN110365897B (zh) Image correction method and apparatus, electronic device, and computer-readable storage medium
CN110392211B (zh) Image processing method and apparatus, electronic device, and computer-readable storage medium
CN110689007B (zh) Subject recognition method and apparatus, electronic device, and computer-readable storage medium
CN110688926B (zh) Subject detection method and apparatus, electronic device, and computer-readable storage medium
CN110399823B (zh) Subject tracking method and apparatus, electronic device, and computer-readable storage medium
CN110460773B (zh) Image processing method and apparatus, electronic device, and computer-readable storage medium
CN110610171A (zh) Image processing method and apparatus, electronic device, and computer-readable storage medium
CN112866552B (zh) Focusing method and apparatus, electronic device, and computer-readable storage medium
CN110545384B (zh) Focusing method and apparatus, electronic device, and computer-readable storage medium
JP2017182668A (ja) Data processing device, imaging device, and data processing method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20868949

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2020868949

Country of ref document: EP

Effective date: 20220307

NENP Non-entry into the national phase

Ref country code: DE