WO2020042000A1 - Camera device and focusing method (相机设备及对焦方法) - Google Patents

Camera device and focusing method (相机设备及对焦方法)

Info

Publication number
WO2020042000A1
WO2020042000A1 (application PCT/CN2018/102912, also referenced as CN2018102912W)
Authority
WO
WIPO (PCT)
Prior art keywords
camera
binocular
focused
image
area
Prior art date
Application number
PCT/CN2018/102912
Other languages
English (en)
French (fr)
Inventor
王铭钰
Original Assignee
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司 filed Critical 深圳市大疆创新科技有限公司
Priority to CN201880038715.2A priority Critical patent/CN111345025A/zh
Priority to PCT/CN2018/102912 priority patent/WO2020042000A1/zh
Publication of WO2020042000A1 publication Critical patent/WO2020042000A1/zh
Priority to US17/084,409 priority patent/US20210051262A1/en

Classifications

    • G06T7/73 Image analysis: determining position or orientation of objects or cameras using feature-based methods
    • G06F18/21 Pattern recognition: design or setup of recognition systems or techniques; extraction of features in feature space; blind source separation
    • G06F18/22 Pattern recognition: matching criteria, e.g. proximity measures
    • G06T7/30 Image analysis: determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/593 Image analysis: depth or shape recovery from stereo images
    • G06T7/74 Image analysis: determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • H04N13/239 Stereoscopic video systems: image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
    • H04N13/271 Stereoscopic video systems: image signal generators wherein the generated image signals comprise depth maps or disparity maps
    • H04N13/296 Stereoscopic video systems: synchronisation or control of image signal generators
    • H04N23/45 Cameras or camera modules comprising electronic image sensors: generating image signals from two or more image sensors of different type or operating in different modes, e.g. a CMOS sensor for moving images combined with a charge-coupled device [CCD] for still images
    • H04N23/61 Cameras or camera modules: control based on recognised objects
    • H04N23/67 Cameras or camera modules: focus control based on electronic image sensor signals
    • H04N23/675 Cameras or camera modules: focus control comprising setting of focusing regions
    • H04N23/90 Cameras or camera modules: arrangement of cameras or camera modules, e.g. multiple cameras in TV studios or sports stadiums
    • G06F18/24 Pattern recognition: classification techniques
    • G06T2207/10012 Image acquisition modality: stereo images
    • G06T2207/10032 Image acquisition modality: satellite or aerial image; remote sensing
    • G06T2207/20021 Special algorithmic details: dividing image into blocks, subimages or windows
    • G06T2207/20084 Special algorithmic details: artificial neural networks [ANN]
    • G06T2207/20101 Interactive image processing based on input by user: interactive definition of point of interest, landmark or seed
    • G06T2207/20104 Interactive image processing based on input by user: interactive definition of region of interest [ROI]
    • H04N2013/0081 Stereoscopic image analysis: depth or disparity estimation from stereoscopic image signals
    • H04N2213/001 Details of stereoscopic systems: constructional or mechanical details

Definitions

  • the present application relates to the field of cameras, and more particularly, to a camera device and a focusing method.
  • Camera equipment is widely used in various fields, such as the mobile terminal field and the drone field.
  • When a camera device is used to shoot a target scene (that is, a scene to be shot), focusing is usually required.
  • At present, camera equipment mainly adopts automatic focusing.
  • However, some traditional autofocus methods have low focusing efficiency and others have poor focusing performance, and improvement is needed.
  • the present application provides a camera device and a focusing method to improve the focusing method of the camera device.
  • A focusing method is provided for a camera device that includes a main camera and a binocular camera for assisting the main camera in focusing.
  • The method includes: during the process of using the main camera to collect an image of a target scene, controlling the binocular camera to capture the target scene so as to generate a binocular image; determining the position, in the binocular image, of an area to be focused in the target scene; determining distance information of the area to be focused according to its position in the binocular image; and controlling the main camera to perform a focusing operation according to the distance information of the area to be focused.
  • A camera device is also provided, including: a main camera; a binocular camera for assisting the main camera in focusing; and a control device configured to perform the following operations: during the process of using the main camera to collect an image of a target scene, controlling the binocular camera to capture the target scene so as to generate a binocular image; determining the position, in the binocular image, of an area to be focused in the target scene; determining distance information of the area to be focused according to its position in the binocular image; and controlling the main camera to perform a focusing operation according to the distance information of the area to be focused.
  • the binocular camera is used to provide the distance information of the area to be focused to the main camera, so that the main camera can focus when the distance of the area to be focused is known, which improves the focusing method of the camera device.
  • FIG. 1 is a structural example diagram of a camera device provided by an embodiment of the present application.
  • FIG. 2 is a flowchart of a focusing method according to an embodiment of the present application.
  • FIG. 3 is an example diagram of an image captured by a camera device according to an embodiment of the present application.
  • FIG. 4 is a flowchart of a possible implementation manner of step S24 in FIG. 2.
  • FIG. 5 is a flowchart of another possible implementation manner of step S24 in FIG. 2.
  • Focusing may also be referred to as focus adjustment.
  • The focusing process refers to adjusting the focus when using a camera device so that the image of the subject gradually becomes clear.
  • Autofocus technology mainly includes two kinds: active focusing and passive focusing.
  • Active focus can also be called rangefinder focus.
  • When the active focusing method is used, the camera device transmits a ranging signal to the subject (the ranging signal may be, for example, infrared, ultrasonic, or laser); the camera device then receives the echo of the ranging signal reflected by the subject. In this way, the camera device can calculate the distance of the subject from the reflected echo and intervene in the focusing process according to that distance.
  • The distance measured by the active focusing method is usually that of the foreground object closest to the camera device in the scene being shot. Traditional active focusing therefore cannot focus on distant objects, which limits the scenarios in which this focusing method can be used. For this reason, current camera equipment mainly uses passive focusing.
  • Passive focusing may also be called behind-the-lens focusing. It usually includes two methods: contrast detection autofocus (CDAF) and phase detection autofocus (PDAF).
  • Contrast detection autofocus, or contrast focusing for short, searches for the lens position at which the contrast of the image around the focus point is maximal, that is, the position of accurate focus, based on how the contrast changes as the lens moves. Contrast focusing requires moving the lens back and forth repeatedly; it cannot reach the focus position in one step, so the focusing process is slow.
  • Phase detection autofocus, or phase focusing for short, reserves some shielded pixels on the image sensor as phase focusing points dedicated to phase detection. During focusing, the camera device can determine the focus offset from the distance between the phase focusing points and its change, and thereby achieve focus. Phase focusing is limited by the strength of the signal at the phase focusing points on the image sensor, and its focusing performance is poor in low-light conditions.
  • FIG. 1 is a structural example diagram of a camera device provided by an embodiment of the present application.
  • the camera device 10 includes a main camera 12, a binocular camera 14, 16, and a control device 18.
  • the main camera 12 may be used to collect an image (for example, a planar image) of a target scene (also called a captured scene).
  • the main camera 12 may include a lens, a focusing system (not shown in the figure), and the like.
  • In some embodiments, the main camera 12 may further include a display screen (such as a liquid crystal display), through which the user can view the target scene at which the lens is aimed.
  • the display screen may be a touch screen, and a user may select a region to be focused from an image displayed on the liquid crystal display screen by a touch operation (such as clicking or sliding).
  • Binocular cameras 14, 16 may also be referred to as binocular vision modules or binocular modules.
  • the binocular cameras 14, 16 may include a first camera 14 and a second camera 16.
  • the binocular cameras 14, 16 can be used to assist the main camera 12 for fast focusing.
  • For example, under the control of the control device 18, the binocular cameras 14, 16 can take a binocular image of the target scene (including a left-eye image and a right-eye image) and perform depth analysis of the target scene based on the binocular image (or on the parallax between the two images) to obtain the distance distribution of the objects in the target scene. The binocular cameras 14, 16 can then provide the distance information of the area to be focused to the main camera 12, so that the main camera 12 can focus when the distance of the area to be focused is known.
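  • As an illustration only, the following Python sketch shows how a control device might chain these steps together; the object and method names (capture_pair, depth_map, drive_focus, roi_selector) are hypothetical and are not interfaces defined by this application.

```python
# Hypothetical orchestration of binocular-assisted focusing.
# The classes and method names below are illustrative assumptions,
# not APIs defined by this application.

def assisted_focus(main_camera, binocular, roi_selector):
    left, right = binocular.capture_pair()              # binocular image (step S22)
    x, y, w, h = roi_selector(left)                      # area to be focused in the binocular image (step S24)
    depth = binocular.depth_map(left, right)             # depth analysis of the target scene
    distance_m = float(depth[y:y + h, x:x + w].mean())   # distance information of the area to be focused (step S26)
    main_camera.drive_focus(distance_m)                  # main camera performs the focusing operation (step S28)
    return distance_m
```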
  • the binocular cameras 14, 16 and the main camera 12 can be integrated on the camera device 10 (such as the housing of the camera device 10), that is, the two can be fixedly connected to form a non-detachable whole.
  • the binocular cameras 14, 16 may be detachably connected to the main camera 12.
  • the binocular cameras 14, 16 can be understood as the peripherals of the main camera 12.
  • When the binocular cameras 14, 16 are needed to assist focusing, they can be assembled together with the main camera 12; when they are not needed, they can be removed from the main camera 12, and the main camera 12 can be used as an ordinary camera.
  • The embodiment of the present application does not specifically limit the positional relationship between the main camera 12 and the binocular cameras 14, 16, as long as it is set so that the two can shoot scenes within approximately the same viewing angle range.
  • Because the binocular cameras 14, 16 and the main camera 12 are located in the same camera device 10, the parallax between them is usually small. It can therefore be assumed that objects in the images collected by the main camera 12 will also appear in the binocular images collected by the binocular cameras 14, 16, and the images can be used directly without registration or correction.
  • Optionally, in other embodiments, in order to improve focusing accuracy, the images collected by the main camera 12 and the binocular cameras 14, 16 may also be corrected or registered to ensure that the image content collected by the main camera 12 all appears in the binocular image.
  • As an example, the image captured by the main camera and the binocular image can be registered according to the difference between the fields of view of the main camera 12 and the binocular cameras 14, 16, and subsequent focusing operations can be performed using the registered images.
  • The image content of the registered images may be the common part of the image captured by the main camera and the binocular image.
  • Compared with the main camera 12, the combined field of view of the binocular cameras 14, 16 is usually larger. Therefore, according to the difference between the fields of view of the main camera 12 and of the binocular cameras 14, 16, the binocular image can be cropped so that the cropped image and the image captured by the main camera describe the scene within the same field of view.
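  • The cropping could, for example, look like the following sketch, which assumes aligned optical axes, a pinhole camera model, and fields of view given as (horizontal, vertical) angles in degrees; none of these assumptions come from the application itself.

```python
import math
import numpy as np

def crop_to_fov(binocular_img: np.ndarray,
                binocular_fov_deg: tuple, main_fov_deg: tuple) -> np.ndarray:
    """Crop a (wider) binocular image so it covers roughly the same field of
    view as the main camera. Assumes aligned optical axes and a pinhole model."""
    h, w = binocular_img.shape[:2]
    # Under a pinhole model, the ratio of image extents equals the ratio of tan(half-FOV).
    kx = math.tan(math.radians(main_fov_deg[0] / 2)) / math.tan(math.radians(binocular_fov_deg[0] / 2))
    ky = math.tan(math.radians(main_fov_deg[1] / 2)) / math.tan(math.radians(binocular_fov_deg[1] / 2))
    new_w, new_h = int(w * kx), int(h * ky)
    x0, y0 = (w - new_w) // 2, (h - new_h) // 2
    return binocular_img[y0:y0 + new_h, x0:x0 + new_w]
```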
  • As another example, the zoom factor of the main camera and/or the binocular cameras for the target scene may be determined first; the image collected by the main camera and the binocular image can then be registered according to the difference between the zoom factors of the main camera and the binocular cameras.
  • For example, if the main camera 12 has zoomed in on the target scene while the binocular cameras 14, 16 do not use (or do not support) zoom, the image captured by the main camera 12 will cover less of the scene than the binocular image. In this case, the binocular image may be cropped so that the cropped image matches the image content of the image captured by the main camera.
  • the registration method of the images collected by the main camera 12 and the binocular cameras 14, 16 may also be a combination of the above-mentioned methods, which will not be repeated here.
  • the control device 18 can be used to implement the control or information processing functions of the binocular cameras 14, 16 to assist the main camera 12 during focusing.
  • the control device 18 may belong to a part of the main camera 12, and may be implemented by a processor or a focusing system of the main camera 12, for example.
  • the control device 18 may belong to a part of the binocular cameras 14,16, for example, may be implemented by a chip part of the binocular cameras 14,16.
  • the control device 18 may be an independent component, such as an independent processor or controller, located outside the main camera 12 and the binocular cameras 14 and 16.
  • the control device 18 may also be a distributed control device, and its functions may be implemented jointly by multiple processors, which is not limited in this embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a focusing method for a camera device according to an embodiment of the present application.
  • FIG. 2 may be executed by the control device 18 in FIG. 1.
  • the method of FIG. 2 may include steps S22 to S28. These steps are described in detail below.
  • step S22 during the process of acquiring the image of the target scene using the main camera 12, the binocular cameras 14, 16 are controlled to acquire the binocular image of the target scene.
  • the process of acquiring the image of the target scene by the main camera 12 may refer to the process of aiming the camera lens of the main camera 12 at the target scene and preparing to shoot the target scene.
  • the binocular image may include a left-eye image and a right-eye image. There is a parallax between the binocular images, and a disparity map of the target scene can be obtained based on the parallax between the binocular images, and then a depth map of the target scene can be obtained.
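  • For context (this is standard stereo geometry rather than a formula stated in the application): for a rectified binocular pair, a point whose disparity between the left-eye and right-eye images is d lies at depth Z = f · B / d, where f is the focal length in pixels and B is the baseline between the two cameras. For example, with f = 1000 pixels, B = 0.05 m, and d = 20 pixels, the point is roughly 1000 × 0.05 / 20 = 2.5 m away; this is the kind of distance information used in the steps below.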
  • Taking the natural scenery shown in FIG. 3 as an example of the target scene, the image collected by the main camera 12 may be, for example, image 32 in FIG. 3, and the binocular images collected by the binocular cameras 14, 16 may include the left-eye image 34 and the right-eye image 36.
  • As can be seen from FIG. 3, because the main camera 12 and the binocular cameras 14, 16 are at different positions, there is some parallax between image 32 and the binocular images 34, 36. The parallax between the three images is nonetheless very small, and the objects appearing in image 32 essentially all appear in the binocular images 34, 36; this tiny parallax does not materially affect the subsequent auxiliary focusing function.
  • step S24 the position of the region to be focused in the target scene in the binocular image is determined.
  • the area to be focused may refer to an area that the main camera 12 (or a user) wishes to focus on. There are many ways to determine the area to be focused.
  • input information of a user may be received, and the input information may be used to select an area to be focused from an image collected by the main camera 12.
  • the main camera 12 may include a liquid crystal display screen for displaying the image.
  • the user can select the area to be focused from the image displayed on the LCD screen by touching or pressing the keys.
  • the user may select an area to be focused from the image collected by the main camera 12 by clicking; for another example, the user may designate a region as the area to be focused from the image collected by the main camera 12 by a sliding operation.
  • the area to be focused may sometimes be referred to as a focus point or a focus position.
  • the area to be focused generally includes an object that the user wants to focus on. Therefore, in some embodiments, the area to be focused may also be replaced with the object to be focused.
  • the area to be focused may be pre-configured for the camera device 10 without the user having to make a selection.
  • For example, the camera device 10 may be a camera device mounted on a drone, and the drone may be used to survey an unknown scene to discover whether a target object (such as a person or another object) is present in the scene.
  • In that case, the area to be focused of the camera device 10 can be set in advance to the area where the target object is located; once the target object is found, the camera automatically focuses on that area so that the target object can be photographed.
  • The position of the area to be focused in the binocular image may refer to its position in the left-eye image, or its position in the right-eye image, or may include both its position in the left-eye image and its position in the right-eye image.
  • step S24 There may be multiple implementations of step S24.
  • For example, the position of the region to be focused in the binocular image can be determined from the relative positional relationship between the main camera 12 and the binocular cameras 14, 16; alternatively, the position of the region to be focused in the binocular image can be identified by means of semantic recognition.
  • the implementation manner of step S24 will be described in detail below in combination with specific embodiments, and will not be described in detail here.
  • step S26 distance information of the area to be focused is determined according to the position of the area to be focused in the binocular image.
  • the distance information of the area to be focused may also be referred to as the depth information of the area to be focused.
  • the distance information of the area to be focused can be used to indicate the distance between the area to be focused and the camera device 10 (or the main camera 12, or the binocular cameras 14, 16).
  • a depth map of a target scene may be generated from a binocular image first. Then, according to the position of the region to be focused in the binocular image, a corresponding position of the region to be focused in the depth map may be determined. Then, the depth information of the corresponding position can be read from the depth map as the distance information of the area to be focused.
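  • A minimal sketch of this depth-map route is given below, using OpenCV's semi-global block matcher on an already rectified grayscale pair; the calibration values focal_px and baseline_m are assumed to be known from calibration and are not values given in this application.

```python
import cv2
import numpy as np

def roi_distance(left_gray: np.ndarray, right_gray: np.ndarray,
                 roi: tuple, focal_px: float, baseline_m: float) -> float:
    """Estimate the distance (in metres) of the area to be focused.
    roi = (x, y, w, h) is the region's position in the left-eye image."""
    matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
    # StereoSGBM returns fixed-point disparities scaled by 16.
    disparity = matcher.compute(left_gray, right_gray).astype(np.float32) / 16.0
    x, y, w, h = roi
    patch = disparity[y:y + h, x:x + w]
    d = np.median(patch[patch > 0])       # robust disparity of the region (ignores invalid pixels)
    return focal_px * baseline_m / d       # depth corresponding to that position
```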
  • As another example, the positions of the area to be focused in the left-eye image and in the right-eye image may be determined first, and the two positions may then be registered according to the disparity between the left-eye image and the right-eye image to determine the distance information corresponding to that position.
  • As shown in FIG. 3, if the main camera 12 is to focus on the small tower 38 in image 32, the area where the small tower 38 is located is the area to be focused. Since the small tower 38 also appears in the left-eye image 34 and the right-eye image 36, the position of that area in the left-eye image 34 and in the right-eye image 36 can be calculated first, and the distance information of the area where the small tower 38 is located can then be determined from the parallax between the left-eye image 34 and the right-eye image 36.
  • step S28 the main camera 12 is controlled to perform a focusing operation according to the distance information of the area to be focused.
  • step S28 is not specifically limited in the embodiment of the present application.
  • the focus position may be determined according to the distance information of the area to be focused, and then the camera lens of the main camera 12 is controlled to directly move to the focus position.
  • Alternatively, the focus position determined from the distance information of the area to be focused can be taken as the approximate focus position, and contrast focusing can then be performed near that position to find the exact focus position.
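  • The following sketch illustrates this second variant (coarse focus from the measured distance, followed by a contrast search nearby). It assumes a thin-lens model and a hypothetical camera object with set_lens_position() and grab_frame() methods; neither assumption comes from the application.

```python
import cv2
import numpy as np

def coarse_lens_position(distance_m: float, focal_len_m: float) -> float:
    # Thin-lens equation 1/f = 1/u + 1/v: image distance v for subject distance u.
    return 1.0 / (1.0 / focal_len_m - 1.0 / distance_m)

def sharpness(frame: np.ndarray) -> float:
    # Variance of the Laplacian is a common contrast/sharpness measure.
    return cv2.Laplacian(frame, cv2.CV_64F).var()

def focus_with_refinement(camera, distance_m: float, focal_len_m: float,
                          span: float = 1e-4, steps: int = 9) -> float:
    v0 = coarse_lens_position(distance_m, focal_len_m)   # coarse focus from the binocular distance
    best_v, best_s = v0, -1.0
    for v in np.linspace(v0 - span, v0 + span, steps):    # contrast search only near v0
        camera.set_lens_position(v)
        s = sharpness(camera.grab_frame())
        if s > best_s:
            best_v, best_s = v, s
    camera.set_lens_position(best_v)
    return best_v
```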
  • In the embodiment of the present application, the binocular cameras 14 and 16 can provide the main camera 12 with the distance information of the area to be focused, so that the main camera 12 can focus when the distance of the area to be focused is known; compared with the contrast focusing method, this increases the focusing speed.
  • In addition, the focusing method provided in the embodiment of the present application does not depend heavily, as phase focusing does, on the intensity of the optical signal received at the phase focusing points, so a good focusing effect can also be achieved in low-light conditions.
  • In summary, the focusing method provided in the embodiment of the present application can make up for the deficiencies of traditional focusing methods in certain respects, balancing focusing quality and focusing speed.
  • Moreover, because the focusing method provided in the embodiment of the present application can continuously track the distance of the area to be focused, it is well suited to follow focus.
  • The following describes in detail, with examples, the implementation of step S24, that is, how the position of the area to be focused in the binocular image is determined.
  • FIG. 4 is a flowchart of a possible implementation manner of step S24.
  • The implementation in FIG. 4 determines the position of the area to be focused in the binocular image mainly on the basis of the relative positional relationship between the main camera 12 and the binocular cameras 14, 16.
  • step S24 may include steps S42 to S46.
  • step S42 the user's input information is acquired.
  • the input information can be used to select an area to be focused from the image collected by the main camera 12.
  • the main camera 12 may include a liquid crystal display screen for displaying images. The user can select the area to be focused from the image displayed on the LCD screen by touching or pressing the keys.
  • step S44 the position of the area to be focused in the image collected by the main camera is determined according to the input information.
  • the position in the image collected by the main camera 12 corresponding to the user's touch position on the touch screen of the main camera 12 may be determined as the position of the area to be focused in the image collected by the main camera 12.
  • step S46 the position of the region to be focused in the binocular image is determined according to the position of the region to be focused in the image collected by the main camera and the relative position relationship between the main camera 12 and the binocular cameras 14,16.
  • the relative positional relationship between the main camera 12 and the binocular cameras 14, 16 can be obtained in advance.
  • the camera coordinate systems of the main camera 12 and the binocular cameras 14, 16 may be calibrated in advance, and the transformation matrix of the camera coordinate systems of the main camera 12 and the binocular cameras 14, 16 may be calculated.
  • the transformation matrix can be used to represent the relative positional relationship between the main camera 12 and the binocular cameras 14, 16.
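  • One conventional way to obtain such a transformation matrix offline is OpenCV's stereo calibration, treating the main camera and one of the binocular cameras as a pair; the sketch below assumes that checkerboard object points, image points, and intrinsics have already been collected, which is an implementation choice rather than something specified by the application.

```python
import cv2
import numpy as np

def calibrate_main_to_binocular(obj_pts, img_pts_main, img_pts_bino,
                                K_main, dist_main, K_bino, dist_bino, image_size):
    """Return a 4x4 homogeneous transform mapping main-camera coordinates to
    binocular-camera coordinates (the 'relative positional relationship')."""
    flags = cv2.CALIB_FIX_INTRINSIC    # intrinsics are assumed to be calibrated already
    _, _, _, _, _, R, t, _, _ = cv2.stereoCalibrate(
        obj_pts, img_pts_main, img_pts_bino,
        K_main, dist_main, K_bino, dist_bino, image_size, flags=flags)
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t.ravel()  # rotation and translation between camera coordinate systems
    return T
```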
  • FIG. 5 is a flowchart of another possible implementation manner of step S24. Different from the implementation shown in FIG. 4, FIG. 5 is mainly based on the image processing algorithm to identify and locate the area to be focused. As shown in FIG. 5, step S24 may include steps S52 to S56.
  • step S52 the user's input information is acquired.
  • the input information can be used to select an area to be focused from the image collected by the main camera 12.
  • the main camera 12 may include a liquid crystal display screen for displaying the image. The user can select the area to be focused from the image displayed on the LCD screen by touching or pressing the keys.
  • step S54 the semantics (or category) of the object in the area to be focused is identified.
  • the embodiment of the present application does not specifically limit the manner of semantic recognition of objects in the region to be focused, and may perform semantic recognition based on a traditional image classification algorithm; and may also perform semantic recognition based on a neural network model.
  • The process of semantic recognition based on traditional image classification algorithms can be implemented, for example, as follows: first, features of the image in the area to be focused are extracted using methods such as the scale-invariant feature transform (SIFT) or the histogram of oriented gradients (HOG); the extracted image features are then input to a classification model such as a support vector machine (SVM) or a K-nearest-neighbour classifier, so as to determine the semantics of the objects in the area to be focused.
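  • A compact sketch of this classical route is given below, using HOG features from scikit-image and an SVM from scikit-learn; the specific libraries, patch size, and kernel are illustrative choices, since the application only names SIFT/HOG and SVM/K-nearest-neighbour as examples. Grayscale patches are assumed.

```python
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.svm import SVC

def hog_features(patch: np.ndarray) -> np.ndarray:
    # Resize the (grayscale) focus-area patch to a fixed size so HOG vectors are comparable.
    patch = resize(patch, (64, 64), anti_aliasing=True)
    return hog(patch, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

def train_semantic_classifier(train_patches, train_labels) -> SVC:
    # Train on labelled example patches (labels such as "tower", "person", ...).
    X = np.stack([hog_features(p) for p in train_patches])
    clf = SVC(kernel="rbf", probability=True)
    clf.fit(X, train_labels)
    return clf

def focus_area_semantics(clf: SVC, focus_patch: np.ndarray) -> str:
    return clf.predict(hog_features(focus_patch)[None, :])[0]
```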
  • The process of semantic recognition based on a neural network model can be implemented, for example, as follows: first, a first neural network model is used to extract features of the image in the area to be focused (features may be extracted with several convolution layers, or with a combination of convolution layers and pooling layers); the image features are then passed to a classification module (such as an SVM module) to obtain the semantics of the objects in the area to be focused.
  • Alternatively, the feature extraction layer of the first neural network model (which may be, for example, a convolution layer, or a convolution layer plus a pooling layer) may first extract features of the image in the area to be focused, and these features may then be input to the fully connected layer of the network. The fully connected layer can compute, from the image features, the probability of each preset candidate semantic (or candidate category) and take the semantic with the highest probability as the semantics of the object in the area to be focused.
  • the embodiment of the present application does not specifically limit the type of the first neural network model.
  • the first neural network model may be a convolutional neural network (CNN), GoogleNet, or VGG.
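  • For the neural-network route, the following minimal sketch uses a pretrained torchvision backbone (ResNet-18, torchvision 0.13 or later) standing in for the "first neural network model"; the application does not prescribe any particular architecture or framework.

```python
import torch
from torchvision import models
from PIL import Image

# Pretrained ImageNet classifier standing in for the "first neural network model".
weights = models.ResNet18_Weights.DEFAULT
model = models.resnet18(weights=weights).eval()
preprocess = weights.transforms()

def focus_area_semantics(focus_patch: Image.Image) -> int:
    """Return the index of the most probable class for the focus-area patch."""
    with torch.no_grad():
        logits = model(preprocess(focus_patch).unsqueeze(0))
        probs = torch.softmax(logits, dim=1)   # probability of each candidate category
    return int(probs.argmax(dim=1))
```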
  • step S56 an object matching the semantics is searched from the binocular image, and the position of the object matching the semantics in the binocular image is used as the position of the region to be focused in the binocular image.
  • The embodiment of the present application uses semantic recognition to determine the position of the area to be focused in the binocular image.
  • This implementation does not require accurate calibration of the main camera 12 and the binocular cameras 14, 16 (rough calibration is sufficient), which simplifies the implementation of the camera device.
  • step S56 is not specifically limited in the embodiment of the present application.
  • it can be implemented using a traditional feature matching algorithm.
  • features of an object corresponding to the semantics may be stored in advance.
  • During actual matching, the binocular image may first be divided into multiple image blocks; the features of each image block are then extracted, the object in the image block whose features match best is taken as the object matching the semantics, and the position of that image block is taken as the position of the area to be focused in the binocular image.
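  • A simple stand-in for this block-wise matching is normalized cross-correlation template matching with OpenCV, using the focus-area patch from the main camera's image as the template; this is one possible matching criterion, not the method mandated by the application.

```python
import cv2
import numpy as np

def locate_by_matching(binocular_img: np.ndarray, focus_patch: np.ndarray) -> tuple:
    """Return (x, y, w, h): the best-matching position of the focus-area patch
    in one of the binocular images (grayscale images assumed)."""
    result = cv2.matchTemplate(binocular_img, focus_patch, cv2.TM_CCOEFF_NORMED)
    _, _, _, max_loc = cv2.minMaxLoc(result)       # location of the highest correlation score
    h, w = focus_patch.shape[:2]
    return (max_loc[0], max_loc[1], w, h)
```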
  • an object that matches the semantics may be searched from a binocular image according to a pre-trained second neural network model.
  • the second neural network model may be trained using an image containing the region to be focused, so that the neural network model can recognize the region to be focused from the image, and can output the position of the region to be focused in the image. Then, in actual use, the binocular image can be input to the second neural network model to determine the position of the region to be focused in the binocular image.
  • a second neural network model can be pre-trained so that the second neural network model can identify a small tower.
  • the binocular image can be input to a second neural network model to determine the position of the small tower 38 in the binocular image.
  • the second neural network model may include a feature extraction layer and a fully connected layer.
  • the feature extraction layer may be, for example, a convolution layer, or a convolution layer and a pooling layer.
  • the input of the fully connected layer may be the feature extracted by the feature extraction layer, and the output may be the position of the object matching the semantics in the binocular image.
  • The specific implementation of the second neural network model may follow conventional designs of neural network models with image recognition and localization capability; for example, the design of sliding-window-based CNN models can be used as a reference.
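  • Staying with the sliding-window idea, the sketch below scores windows of the binocular image with any patch classifier (for example, one of the classifiers sketched earlier) and keeps the best-scoring window as the position of the area to be focused; the window size and stride are arbitrary illustrative values.

```python
import numpy as np

def sliding_window_locate(binocular_img: np.ndarray, score_patch,
                          win: int = 96, stride: int = 32) -> tuple:
    """score_patch(patch) -> float gives the probability that a patch matches
    the recognised semantics. Returns the best window as (x, y, win, win)."""
    h, w = binocular_img.shape[:2]
    best, best_score = (0, 0, win, win), -1.0
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            s = score_patch(binocular_img[y:y + win, x:x + win])
            if s > best_score:
                best, best_score = (x, y, win, win), s
    return best
```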
  • In some cases, the images captured by the main camera 12 and the binocular cameras 14, 16 may contain multiple objects with the same semantics.
  • To improve the robustness of the algorithm, in the embodiment shown in FIG. 5 the approximate position range, in the image collected by the main camera 12, of the area to be focused selected by the user may also be obtained, and the object matching the semantics may be searched for only within the corresponding range of the binocular image, which reduces the probability of errors.
  • The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When software is used, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions; when the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wire (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wirelessly (such as by infrared, radio, or microwave).
  • The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media.
  • The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)).
  • the disclosed systems, devices, and methods may be implemented in other ways.
  • The device embodiments described above are merely illustrative.
  • The division into units is only a division by logical function; in actual implementation there may be other ways of dividing them, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, which may be electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each of the units may exist separately physically, or two or more units may be integrated into one unit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Studio Devices (AREA)

Abstract

A camera device (10) and a focusing method are provided. The camera device (10) includes a main camera (12) and binocular cameras (14, 16) for assisting the main camera (12) in focusing. The focusing method includes: during the process of using the main camera (12) to collect an image (32) of a target scene, controlling the binocular cameras (14, 16) to capture the target scene so as to generate a binocular image (34, 36) (S22); determining the position, in the binocular image (34, 36), of an area to be focused in the target scene (S24); determining distance information of the area to be focused according to its position in the binocular image (34, 36) (S26); and controlling the main camera (12) to perform a focusing operation according to the distance information of the area to be focused (S28). The binocular cameras (14, 16) provide the main camera (12) with the distance information of the area to be focused, so that the main camera (12) can focus when the distance of the area to be focused is known, which improves the focusing of the camera device (10).

Description

相机设备及对焦方法
版权申明
本专利文件披露的内容包含受版权保护的材料。该版权为版权所有人所有。版权所有人不反对任何人复制专利与商标局的官方记录和档案中所存在的该专利文件或者该专利披露。
技术领域
本申请涉及相机领域,更为具体地,涉及一种相机设备及对焦方法。
背景技术
随着相机技术的发展,相机设备被广泛应用于各个领域,如移动终端领域,无人机领域。
使用相机设备对目标场景(即待拍摄场景)进行拍摄的过程中,通常需要对焦(focus)。目前相机设备主要采用自动对焦的方式进行对焦。但传统的自动对焦方式有些对焦效率低,有些对焦效果差,亟待改善。
发明内容
本申请提供一种相机设备及对焦方法,以改善相机设备的对焦方式。
第一方面,提供一种相机设备的对焦方法,所述相机设备包括主相机以及用于辅助所述主相机对焦的双目摄像头,所述方法包括:在使用所述主相机采集目标场景的图像的过程中,控制所述双目摄像头采集所述目标场景,以生成双目图像;确定所述目标场景中的待对焦区域在所述双目图像中的位置;根据所述待对焦区域在所述双目图像中的位置,确定所述待对焦区域的距离信息;根据所述待对焦区域的距离信息,控制所述主相机执行对焦操作。
第二方面,提供一种相机设备,包括:主相机;用于辅助所述主相机对焦的双目摄像头;控制装置,用于执行以下操作:在使用所述主相机采集目标场景的图像的过程中,控制所述双目摄像头采集所述目标场景,以生成双目图像;确定所述目标场景中的待对焦区域在所述双目图像中的位置;根据所述待对焦区域在所述双目图像中的位置,确定所述待对焦区域的距离信息;根据所述待对焦区域的距离信息,控制所述主相机执行对焦操作。
利用双目摄像头向主相机提供待对焦区域的距离信息,使得主相机可以在待对焦区域距离已知的情况下进行对焦,改善了相机设备的对焦方式。
附图说明
图1是本申请实施例提供的相机设备的结构示例图。
图2是本申请实施例提供的对焦方法的流程图。
图3是本申请实施例提供的相机设备采集到的图像的示例图。
图4是图2中的步骤S24的一种可能的实现方式的流程图。
图5是图2中的步骤S24的另一种可能的实现方式的流程图。
具体实施方式
为了便于理解,先对对焦技术进行简单介绍。
对焦也可称为对光或聚焦。对焦过程指的是使用相机设备时对焦点进行调整,以使被拍摄物体的影像逐渐清晰的过程。
为了方便用户的使用,传统相机设备大多采用自动对焦(auto focus,AF)技术进行对焦。自动对焦技术主要包括主动式对焦和被动式对焦两种。
主动式对焦也可称为测距式对焦。采用主动式对焦方式进行对焦时,相机设备会向被拍摄物体发射测距信号(测距信号例如可以是红外线、超声波或激光);然后,相机设备接收被拍摄物体对测距信号的反射回波;这样,相机设备就可以根据该反射回波计算被拍摄物体的距离,并根据被拍摄物体的距离对相机设备的对焦过程进行干预。
主动式对焦方式测得的距离通常是被拍摄场景中的距离相机设备最近的前景物体。因此,传统的主动式对焦方式不能聚焦于远景物体,使得这种对焦方式的使用场景有限。因此,目前的相机设备主要采用被动式对焦方式进行对焦。
被动式对焦也可称为镜后式对焦。被动式对焦通常包括对比度检测自动对焦(contrast detection auto focus,CDAF)和相位检测自动对焦(phase detection auto focus,PDAF)两种对焦方式。
对比度检测自动对焦可以简称为对比度对焦,主要是根据焦点处画面的对比度变化,寻找对比度最大时的镜头位置,也就是准确对焦的位置。对比度对焦需要反复移动镜头,不能一次到位,对焦过程很慢。
相位检测自动对焦可以简称为相位对焦,主要是在感光元件上预留出一些遮蔽像素点作为相位对焦点,专门用来进行相位检测。具体对焦过程中,相机设备可以通过相位对焦点之间的距离及其变化等来决定对焦的偏移值,从而实现对焦。相位对焦受限于感光元件上的相位对焦点处信号的强弱,在暗光条件下对焦效果差。
由上可知,相机设备的传统对焦方式有些对焦效率低,有些对焦效果差,亟待改善。下面结合附图,对本申请实施例提的相机设备及对焦方法进行详细描述。
图1是本申请实施例提供的相机设备的结构示例图。如图1所示,相机设备10包括主相机12、双目摄像头14,16以及控制装置18。
主相机12可用于采集目标场景(或称被拍摄场景)的图像(例如可以是平面图像)。主相机12可以包括镜头、对焦***(图中未示出)等。在一些实施例中,主相机12还可以包括显示屏(如液晶显示屏),用户可以通过显示屏采集呈现镜头对准的目标场景。进一步地,该显示屏可以是触摸屏,用户可以通过触摸操作(如点选或滑动)从液晶显示屏所显示的图像中选择待对焦区域。
双目摄像头14,16也可称为双目视觉模块或双目模块。双目摄像头14,16可以包括第一摄像头14和第二摄像头16。双目摄像头14,16可用于辅助主相机12进行快速对焦。例如,双目摄像头14,16可以在控制装置18的控制下拍摄目标场景的双目图像(包括左眼图像和右眼图像),根据双目图像(或根据双目图像之间的视差)对目标场景进行深度分析,以获取目标场景中的物体的距离分布状况。然后,双目摄像头14,16可以向主相机12提供待对焦区域的距离信息,使得主相机12可以在待对焦区域的距离已知的情况下进行对焦。
双目摄像头14,16与主相机12可以集成在相机设备10(如相机设备10的壳体)上,即二者可以固定连接,形成不可拆卸的一个整体。
或者,双目摄像头14,16也可以以可拆卸的方式与主相机12连接。在这种情况下,可以将双目摄像头14,16理解为主相机12的外设。当需要双目摄像头14,16辅助对焦时,可以将双目摄像头14,16与主相机12装配在一起使用;当无需双目摄像头14,16辅助对焦时,可以将双目摄像头14,16从主相机12拆卸下来,将主相机12作为普通相机使用。
本申请实施例对主相机12和双目摄像头14,16之间的位置关系不做具体限定,只要主相机12和双目摄像头14,16的位置关系的设置使得二者能够对视角范围大致相同的场景进行拍摄即可。
由于双目摄像头14,16与主相机12位于同一相机设备10,它们之间的视差通常很小。因此,可以认为主相机12采集到的图像中的物体均会出现在双目摄像头14,16采集到的双目图像中,无需对二者采集到的图像进行配准或校正即可直接使用。
可选地,在其他实施例中,为了提高对焦准确性,也可以对主相机12和双目摄像头14,16采集到的图像进行校正或配准,以确保主相机12采集到的图像中图像内容均会在双目图像中出现。
作为一个示例,可以根据主相机12和双目摄像头14,16的视场角的差异,对主相机采集到的图像和双目图像进行配准,并利用配准之后的图像执行后续的对焦操作。配准后的图像中的图像内容可以是主相机采集到的图像和双目图像的公共部分。
例如,与主相机12相比,双目摄像头14,16的视场角之和通常会大一些,因此,可以根据主相机12和双目摄像头14,16的视场角的差异,对双目图像进行裁剪,使得裁剪后的图像和主相机采集到的图像描述同一视场角内的场景。
作为另一示例,可以先确定主相机和/或双目摄像头对目标场景的缩放倍数;再根据主相机和双目摄像头对目标场景的缩放倍数的差异,对主相机采集到的图像和双目图像进行配准。
例如,假设主相机12通过使用变焦功能将目标场景放大了两倍,而双目摄像头14,16未使用或不支持变焦功能,则主相机12采集到的图像的图像内容会少于双目摄像头14,16采集到的双目图像的图像内容。在这种情况下,可以对双目图像进行裁剪,使得裁剪后的图像和主相机采集到的图像的图像内容相匹配。
此外,主相机12和双目摄像头14,16采集到的图像的配准方式还可以是上述方式的组合,此处不再赘述。
控制装置18可用于实现双目摄像头14,16辅助主相机12对焦过程中的控制或信息处理功能。控制装置18可以属于主相机12的一部分,例如,可以由主相机12的处理器或对焦***实现。或者,控制装置18也可以属于双 目摄像头14,16的一部分,例如,可以由双目摄像头14,16的芯片部分实现。或者,控制装置18也可以是位于主相机12和双目摄像头14,16之外的独立部件,如独立的处理器或控制器。当然,控制装置18也可以是分布式的控制装置,其功能可以由多个处理器共同实现,本申请实施例对此并不限定。
图2是本申请实施例提供的相机设备的对焦方法的示意性流程图。图2可以由图1中的控制装置18执行。图2的方法可以包括步骤S22至步骤S28。下面分别这些步骤进行详细描述。
在步骤S22,在使用主相机12采集目标场景的图像的过程中,控制双目摄像头14,16采集目标场景的双目图像。
主相机12采集目标场景的图像的过程可以指主相机12的相机镜头对准目标场景,准备对目标场景进行拍摄的过程。
双目图像可以包括左眼图像和右眼图像。双目图像之间存在视差,可以基于双目图像之间的视差得到目标场景的视差图,进而得到目标场景的深度图。
以目标场景为如图3所示的自然风光场景为例,主相机12采集到的图像例如可以是图3中的图像32,双目摄像头14,16采集到的双目图像可以包括左眼图像34和右眼图像36。从图3可以看出,由于主相机12和双目镜摄像头14,16的位置不同,图像32和双目图像34,36存在一定的视差。但三张图像之间的视差很小,图像32中出现的物体基本均出现在双目图像34,36中,该微小视差的存在实际上并不会对后续辅助对焦功能的实现造成很大影响。
在步骤S24,确定目标场景中的待对焦区域在双目图像中的位置。
待对焦区域可以指主相机12(或用户)希望对焦到的区域。待对焦区域的确定方式可以有多种。作为一种实现方式,可以接收用户的输入信息,该输入信息可用于从主相机12采集到的图像中选取待对焦区域。例如,主相机12可以包含用于显示该图像的液晶显示屏。用户可以通过触摸或按键的方式从液晶显示屏显示的图像中选取待对焦区域。比如,用户可以通过点选的方式从主相机12采集到的图像中选取待对焦区域;又如,用户可以通过滑动操作从主相机12采集到的图像中划定一块区域作为待对焦区域。
待对焦区域有时也可称为待对焦点或待对焦位置。待对焦区域通常包含用户希望对焦到的物体,因此,在有些实施例中,待对焦区域也可以替换成 待对焦物体。
作为另一种实现方式,可以为相机设备10预先配置待对焦区域,而无需用户进行选择。例如,相机设备10可以是搭载在无人机上的相机设备,该无人机可用于对未知场景进行探测,以发现该场景中是否存在目标物(如人或其他物体),则可以将相机设备10的待对焦区域预先设置为该目标物所在的区域,一旦发现该目标物,则自动对焦至该目标物所在的区域,从而对该目标物进行拍摄。
待对焦区域在双目图像中的位置可以指待对焦区域在左眼图像中的位置,也可以指待对焦区域在右眼图像中的位置,也可以同时包括待对焦区域在左眼图像的位置和右眼图像的位置。
步骤S24的实现方式可以有多种。例如,可以根据主相机12和双目摄像头14,16之间的相对位置关系确定待对焦区域在双目图像中的位置;也可以利用语义识别的方式识别待对焦区域在双目图像中的位置。下文会结合具体的实施例对步骤S24的实现方式进行详细说明,此处暂不详述。
在步骤S26,根据待对焦区域在双目图像中的位置,确定待对焦区域的距离信息。
待对焦区域的距离信息也可称为待对焦区域的深度信息。待对焦区域的距离信息可用于指示待对焦区域与相机设备10(或主相机12,或双目摄像头14,16)之间的距离。
步骤S26的实现方式可以有多种。作为一个示例,可以先根据双目图像生成目标场景的深度图。然后,可以根据待对焦区域在双目图像中的位置,确定待对焦区域在深度图中的对应位置。接着,可以从深度图中读取该对应位置的深度信息,作为待对焦区域的距离信息。
作为另一个示例,可以先确定待对焦区域在左眼图像和右眼图像中的位置,然后可以根据左眼图像和右眼图像之间的视差,对待对焦区域在左眼图像中的位置和右眼图像中的位置进行配准,以确定该位置对应的距离信息。
如图3所示,假设主相机12希望对焦至图像32中的小塔38,则该小塔38所在区域即为待对焦区域。由于小塔38也会出现在左眼图像34和右眼图像36中,因此,可以先计算小塔38所在区域在左眼图像34和右眼图像36中的位置,然后根据左眼图像34和右眼图像36的视差确定小塔388所在区域的距离信息。
在步骤S28,根据待对焦区域的距离信息,控制主相机12执行对焦操作。
本申请实施例对步骤S28的实现方式不做具体限定。例如,可以根据待对焦区域的距离信息,确定焦点位置,然后控制主相机12的相机镜头直接移动至焦点位置。或者,可以将基于待对焦区域的距离信息确定出的焦点位置作为焦点的大概位置,然后再利用对比度对焦的方式在该位置附近进行对比度对焦,从而找到焦点的准确位置。
本申请实施例中,双目摄像头14,16能够为主相机12提供待对焦区域的距离信息,使得主相机12可以在待对焦区域的距离已知的情况下进行对焦,相比于对比度对焦方式,能够提高对焦速度。此外,本申请实施例提供的对焦方式也不会像相位对焦方式那样在很大程度上依赖相位对焦点接收到的光信号的强度,因此,在暗光条件下也能够达到不错的对焦效果。综上,本申请实施例提供的对焦方式能够弥补传统对焦方式在某些方面的不足,以兼顾对焦效果和对焦速度。此外,本申请实施例提供的对焦方式由于能够不断跟踪待对焦区域的距离,因此,很适于实现跟焦。
下面对步骤S24的实现方式(即待对焦区域在双目图像中的位置的确定方式)进行详细地举例说明。
图4是步骤S24的一种可能的实现方式的流程图。图4主要是基于主相机12和双目摄像头14,16之间的相对位置关系确定待对焦区域在双目图像中的位置。如图4所示,步骤S24可以包括步骤S42至步骤S46。
在步骤S42,获取用户的输入信息。
该输入信息可用于从主相机12采集到的图像中选取待对焦区域。例如,主相机12可以包含用于显示图像的液晶显示屏。用户可以通过触摸或按键的方式从液晶显示屏显示的图像中选取待对焦区域。
在步骤S44,根据输入信息,确定待对焦区域在主相机采集到的图像中的位置。
例如,可以将主相机12采集到的图像中的与用户在主相机12的触摸屏上的触摸位置所对应的位置确定为待对焦区域在主相机12采集到的图像中的位置。
在步骤S46,根据待对焦区域在主相机采集到的图像中的位置,以及主相机12和双目摄像头14,16之间的相对位置关系,确定待对焦区域在双目图 像中的位置。
主相机12和双目摄像头14,16之间的相对位置关系可以预先获取。例如,可以预先对主相机12和双目摄像头14,16的相机坐标系进行标定,计算出主相机12和双目摄像头14,16的相机坐标系的变换矩阵。该变换矩阵即可用于表示主相机12和双目摄像头14,16之间的相对位置关系。
图5是步骤S24的另一种可能的实现方式的流程图。与图4所示的实现方式不同,图5主要是基于图像处理算法对待对焦区域进行识别和定位。如图5所示,步骤S24可以包括步骤S52至步骤S56。
在步骤S52,获取用户的输入信息。
该输入信息可用于从主相机12采集到的图像中选取待对焦区域。例如,主相机12可以包含用于显示该图像的液晶显示屏。用户可以通过触摸或按键的方式从液晶显示屏显示的图像中选取待对焦区域。
在步骤S54,识别待对焦区域中的物体的语义(或类别)。
本申请实施例对待对焦区域中的物体的语义识别方式不做具体限定,可以基于传统的图像分类算法进行语义识别;也可以基于神经网络模型进行语义识别。
基于传统的图像分类算法进行语义识别的过程例如可以采用如下方式实现:先采用尺度不变特征转换(Scale-invariant feature transform,SIFT)、方向梯度直方图(histogram of oriented gradient,HOG)等方式提取待对焦区域中的图像的特征,然后将提取到的图像特征输入至支持向量机(support vector machine,SVM)、K邻近等分类模型,从而确定待对焦区域中的物体的语义。
基于神经网络模型进行语义识别的过程例如可以采用如下方式实现:先利用第一神经网络模型提取待对焦区域中的图像的特征(可以采用多个卷积层提取特征,也可以采用卷积层和池化层相结合的方式提取图像特征),然后将该图像特征输出至分类模块(如SVM模块),得到待对焦区域中的物体的语义。或者,可以先利用第一神经网络模型的特征提取层(特征提取层例如可以是卷积层,也可以是卷积层和池化层)提取待对焦区域中的图像的特征,然后将图像特征输入至该神经网络的全连接层。该全连接层可以根据图像特征,计算预先设定的各个候选语义(或候选类别)的概率,并将概率最大的语义作为待对焦区域中的物体的语义。
本申请实施例对上述第一神经网络模型的类型不做具体限定,例如可以是卷积神经网络(convolutional neural network,CNN),GoogleNet或VGG。
在步骤S56,从双目图像中搜索与所述语义相匹配的物体,并将与所述语义相匹配的物体在双目图像中的位置作为待对焦区域在双目图像中的位置。
本申请实施例采用语义识别的方式确定待对焦区域在双目图像中的位置,这种实现方式不需要对主相机12和双目摄像头14,16进行准确标定(粗略标定即可),简化了相机设备的实现。
本申请实施例对步骤S56的实现方式不做具体限定。作为一个示例,可以采用传统的特征匹配算法实现。例如,可以预先存储与所述语义对应的物体的特征。在实际匹配时,可以先将双目图像划分成多个图像块,然后提取每个图像块的特征,并将特征最为匹配的图像块中的物体作为与所述语义相匹配的物体,并将该图像块所在的位置作为待对焦区域在双目图像中的位置。
作为另一个示例,可以根据预先训练出的第二神经网络模型,从双目图像中搜索与所述语义相匹配的物体。
例如,可以使用包含待对焦区域的图像训练该第二神经网络模型,使得神经网络模型能够从图像中识别出该待对焦区域,并能够输出该待对焦区域在图像中的位置。然后,在实际使用时,可以将双目图像输入至该第二神经网络模型,以确定待对焦区域在双目图像中的位置。
以图3为例,可以预先训练第二神经网络模型,使得第二神经网络模型能够识别小塔。实际使用时,可以将双目图像输入至第二神经网络模型,以确定小塔38在双目图像中的位置。
该第二神经网络模型可以包括特征提取层和全连接层。特征提取层例如可以是卷积层,也可以是卷积层和池化层。全连接层的输入可以是特征提取层提取到的特征,输出可以是与所述语义相匹配的物体在双目图像中的位置。该第二神经网络模型的具体实现方式可以参照传统的具有图像识别和定位功能的神经网络模型的设计方式进行设计。例如,可以参见基于滑动窗口的CNN模型的设计方式进行设计。
在某些情况下,主相机12和双目摄像头14,16采集到的图像中可能会包含具有相同语义的多个物体。为了提高算法的鲁棒性,在图5所示的实施例 中,也可以获取用户选择的待对焦区域在主相机12采集到的图像中的大***置范围,并在双目图像中的与该大***置范围对应的范围内搜索与所述语义相匹配的物体,以减少出错的概率。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其他任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本发明实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质(例如,软盘、硬盘、磁带)、光介质(例如数字视频光盘(digital video disc,DVD))、或者半导体介质(例如固态硬盘(solid state disk,SSD))等。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
在本申请所提供的几个实施例中,应该理解到,所揭露的***、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个***,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作 为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (18)

  1. A focusing method for a camera device, wherein the camera device comprises a main camera and a binocular camera for assisting the main camera in focusing,
    the method comprising:
    during the process of collecting an image of a target scene with the main camera, controlling the binocular camera to capture the target scene so as to generate a binocular image;
    determining the position, in the binocular image, of an area to be focused in the target scene;
    determining distance information of the area to be focused according to the position of the area to be focused in the binocular image; and
    controlling the main camera to perform a focusing operation according to the distance information of the area to be focused.
  2. The method according to claim 1, wherein the determining the position, in the binocular image, of the area to be focused in the target scene comprises:
    acquiring input information of a user, the input information being used to select the area to be focused from the image collected by the main camera;
    determining, according to the input information, the position of the area to be focused in the image collected by the main camera; and
    determining the position of the area to be focused in the binocular image according to the position of the area to be focused in the image collected by the main camera and the relative positional relationship between the main camera and the binocular camera.
  3. The method according to claim 2, wherein the relative positional relationship between the main camera and the binocular camera is represented by a pre-calibrated transformation matrix between the camera coordinate systems of the main camera and the binocular camera.
  4. The method according to claim 1, wherein the determining the position, in the binocular image, of the area to be focused in the target scene comprises:
    acquiring input information of a user, the input information being used to select the area to be focused from the image collected by the main camera;
    identifying the semantics of an object in the area to be focused; and
    searching the binocular image for an object matching the semantics, and taking the position, in the binocular image, of the object matching the semantics as the position of the area to be focused in the binocular image.
  5. The method according to claim 4, wherein the identifying the semantics of the object in the area to be focused comprises:
    identifying the semantics of the object in the area to be focused according to a pre-trained first neural network model.
  6. The method according to claim 4 or 5, wherein the searching the binocular image for an object matching the semantics comprises:
    searching the binocular image for an object matching the semantics according to a pre-trained second neural network model.
  7. The method according to any one of claims 1-6, wherein before the determining the position, in the binocular image, of the area to be focused in the target scene, the method further comprises:
    registering the image collected by the main camera and the binocular image according to the difference between the fields of view of the main camera and the binocular camera.
  8. The method according to any one of claims 1-7, wherein before the determining the position, in the binocular image, of the area to be focused in the target scene, the method further comprises:
    determining a zoom factor of the main camera for the target scene; and
    registering the image collected by the main camera and the binocular image according to the difference between the zoom factors of the main camera and the binocular camera for the target scene.
  9. The method according to any one of claims 1-8, wherein the binocular camera and the main camera are both integrated on the camera device, or the binocular camera is detachably connected to the main camera.
  10. A camera device, comprising:
    a main camera;
    a binocular camera for assisting the main camera in focusing; and
    a control device configured to perform the following operations:
    during the process of collecting an image of a target scene with the main camera, controlling the binocular camera to capture the target scene so as to generate a binocular image;
    determining the position, in the binocular image, of an area to be focused in the target scene;
    determining distance information of the area to be focused according to the position of the area to be focused in the binocular image; and
    controlling the main camera to perform a focusing operation according to the distance information of the area to be focused.
  11. The camera device according to claim 10, wherein the determining the position, in the binocular image, of the area to be focused in the target scene comprises:
    acquiring input information of a user, the input information being used to select the area to be focused from the image collected by the main camera;
    determining, according to the input information, the position of the area to be focused in the image collected by the main camera; and
    determining the position of the area to be focused in the binocular image according to the position of the area to be focused in the image collected by the main camera and the relative positional relationship between the main camera and the binocular camera.
  12. The camera device according to claim 11, wherein the relative positional relationship between the main camera and the binocular camera is represented by a pre-calibrated transformation matrix between the camera coordinate systems of the main camera and the binocular camera.
  13. The camera device according to claim 10, wherein the determining the position, in the binocular image, of the area to be focused in the target scene comprises:
    acquiring input information of a user, the input information being used to select the area to be focused from the image collected by the main camera;
    identifying the semantics of an object in the area to be focused; and
    searching the binocular image for an object matching the semantics, and taking the position, in the binocular image, of the object matching the semantics as the position of the area to be focused in the binocular image.
  14. The camera device according to claim 13, wherein the identifying the semantics of the object in the area to be focused comprises:
    identifying the semantics of the object in the area to be focused according to a pre-trained first neural network model.
  15. The camera device according to claim 13 or 14, wherein the searching the binocular image for an object matching the semantics comprises:
    searching the binocular image for an object matching the semantics according to a pre-trained second neural network model.
  16. The camera device according to any one of claims 10-15, wherein before the determining the position, in the binocular image, of the area to be focused in the target scene, the focusing control device is further configured to perform the following operation:
    registering the image collected by the main camera and the binocular image according to the difference between the fields of view of the main camera and the binocular camera.
  17. The camera device according to any one of claims 10-16, wherein before the determining the position, in the binocular image, of the area to be focused in the target scene, the focusing control device is further configured to perform the following operations:
    determining a zoom factor of the main camera for the target scene; and
    registering the image collected by the main camera and the binocular image according to the difference between the zoom factors of the main camera and the binocular camera for the target scene.
  18. The camera device according to any one of claims 10-17, wherein the binocular camera and the main camera are both integrated on the camera device, or the binocular camera is detachably connected to the main camera.
PCT/CN2018/102912 2018-08-29 2018-08-29 相机设备及对焦方法 WO2020042000A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201880038715.2A CN111345025A (zh) 2018-08-29 2018-08-29 相机设备及对焦方法
PCT/CN2018/102912 WO2020042000A1 (zh) 2018-08-29 2018-08-29 相机设备及对焦方法
US17/084,409 US20210051262A1 (en) 2018-08-29 2020-10-29 Camera device and focus method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/102912 WO2020042000A1 (zh) 2018-08-29 2018-08-29 相机设备及对焦方法

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/084,409 Continuation US20210051262A1 (en) 2018-08-29 2020-10-29 Camera device and focus method

Publications (1)

Publication Number Publication Date
WO2020042000A1 true WO2020042000A1 (zh) 2020-03-05

Family

ID=69642821

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/102912 WO2020042000A1 (zh) 2018-08-29 2018-08-29 相机设备及对焦方法

Country Status (3)

Country Link
US (1) US20210051262A1 (zh)
CN (1) CN111345025A (zh)
WO (1) WO2020042000A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7508350B2 (ja) * 2020-12-04 2024-07-01 株式会社日立製作所 キャリブレーション装置およびキャリブレーション方法
CN112433339B (zh) * 2020-12-10 2022-04-15 济南国科医工科技发展有限公司 基于随机森林的显微镜精细对焦方法
US20240121511A1 (en) * 2021-04-28 2024-04-11 Qualcomm Incorporated Lens positioning for secondary camera in multi-camera system
CN113676719B (zh) * 2021-07-21 2023-11-14 北京中科慧眼科技有限公司 双目立体相机的对焦参数计算方法、***和智能终端

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103297696A (zh) * 2013-05-24 2013-09-11 北京小米科技有限责任公司 拍摄方法、装置和终端
CN103986876A (zh) * 2014-05-29 2014-08-13 宇龙计算机通信科技(深圳)有限公司 一种图像获取终端和图像获取方法
KR20160038409A (ko) * 2014-09-30 2016-04-07 엘지전자 주식회사 이동 단말기 및 그 제어 방법
US20160261844A1 (en) * 2015-03-06 2016-09-08 Massachusetts Institute Of Technology Methods and Apparatus for Enhancing Depth Maps with Polarization Cues
CN106454090A (zh) * 2016-10-09 2017-02-22 深圳奥比中光科技有限公司 基于深度相机的自动对焦方法及***
CN107122770A (zh) * 2017-06-13 2017-09-01 驭势(上海)汽车科技有限公司 多目相机***、智能驾驶***、汽车、方法和存储介质

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105744138B (zh) * 2014-12-09 2020-02-21 联想(北京)有限公司 快速对焦方法和电子设备
CN106973227A (zh) * 2017-03-31 2017-07-21 努比亚技术有限公司 基于双摄像头的智能拍照方法及装置
CN107920209A (zh) * 2017-12-27 2018-04-17 国网通用航空有限公司 一种高速相机自动对焦***、方法及处理器、计算机设备
CN108322726A (zh) * 2018-05-04 2018-07-24 浙江大学 一种基于双摄像头的自动对焦方法

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103297696A (zh) * 2013-05-24 2013-09-11 北京小米科技有限责任公司 拍摄方法、装置和终端
CN103986876A (zh) * 2014-05-29 2014-08-13 宇龙计算机通信科技(深圳)有限公司 一种图像获取终端和图像获取方法
KR20160038409A (ko) * 2014-09-30 2016-04-07 엘지전자 주식회사 이동 단말기 및 그 제어 방법
US20160261844A1 (en) * 2015-03-06 2016-09-08 Massachusetts Institute Of Technology Methods and Apparatus for Enhancing Depth Maps with Polarization Cues
CN106454090A (zh) * 2016-10-09 2017-02-22 深圳奥比中光科技有限公司 基于深度相机的自动对焦方法及***
CN107122770A (zh) * 2017-06-13 2017-09-01 驭势(上海)汽车科技有限公司 多目相机***、智能驾驶***、汽车、方法和存储介质

Also Published As

Publication number Publication date
CN111345025A (zh) 2020-06-26
US20210051262A1 (en) 2021-02-18

Similar Documents

Publication Publication Date Title
WO2020042000A1 (zh) 相机设备及对焦方法
CN108496350B (zh) 一种对焦处理方法及设备
CN108076278B (zh) 一种自动对焦方法、装置及电子设备
CN106462766B (zh) 在预览模式中进行图像捕捉参数调整
US9313419B2 (en) Image processing apparatus and image pickup apparatus where image processing is applied using an acquired depth map
WO2019105214A1 (zh) 图像虚化方法、装置、移动终端和存储介质
CN104363378B (zh) 相机对焦方法、装置及终端
CN105659580A (zh) 一种自动对焦方法、装置及电子设备
US9300858B2 (en) Control device and storage medium for controlling capture of images
US20120127276A1 (en) Image retrieval system and method and computer product thereof
CN101799621B (zh) 一种拍摄方法和拍摄设备
CN111932636B (zh) 双目摄像头的标定及图像矫正方法、装置、存储介质、终端、智能设备
CN110213491B (zh) 一种对焦方法、装置及存储介质
US20140307054A1 (en) Auto focus method and auto focus apparatus
CN106842178B (zh) 一种光场距离估计方法与光场成像***
CN113129241B (zh) 图像处理方法及装置、计算机可读介质、电子设备
WO2017117749A1 (zh) 基于多种测距方式的跟焦***、方法及拍摄***
WO2020124517A1 (zh) 拍摄设备的控制方法、拍摄设备的控制装置及拍摄设备
WO2021081909A1 (zh) 拍摄设备的对焦方法、拍摄设备、***及存储介质
WO2023142352A1 (zh) 一种深度图像的获取方法、装置、终端、成像***和介质
CN105335959B (zh) 成像装置快速对焦方法及其设备
CN114363522A (zh) 拍照方法及相关装置
US20140327743A1 (en) Auto focus method and auto focus apparatus
CN112749610A (zh) 深度图像、参考结构光图像生成方法、装置及电子设备
WO2022011657A1 (zh) 图像处理方法及装置、电子设备及计算机可读存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18931706

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18931706

Country of ref document: EP

Kind code of ref document: A1