WO2022257120A1 - Pupil position determination method, device and system - Google Patents

Pupil position determination method, device and system

Publication number
WO2022257120A1
Authority
WIPO (PCT)
Prior art keywords
image, pupil, area, point, heat map
Application number
PCT/CN2021/099759
Other languages
French (fr)
Chinese (zh)
Inventor
郑爽
张国华
张代齐
袁麓
黄为
李腾
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to CN202180001856.9A (published as CN113597616A)
Priority to PCT/CN2021/099759 (published as WO2022257120A1)
Publication of WO2022257120A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Definitions

  • the present application relates to the technical field of artificial intelligence, in particular to a method, device and system for determining the pupil position.
  • Pupil key point positioning technology has a wide range of applications in the fields of gaze tracking, eye movement recognition, and iris detection.
  • pupil positioning technology has received more and more attention.
  • in the past, two-dimensional (2 dimension, 2D) pupil key point positioning mainly used traditional image processing algorithms to perform binarization, erosion, dilation and other operations on the image, then used a detection algorithm to fit a circle to the iris, and finally solved for the center point of the iris from the center of that circle.
  • in eye tracking technology, it is necessary to be able to accurately locate the three-dimensional (3 dimension, 3D) spatial position of the pupil point in the camera coordinate system.
  • in the gaze tracking task, it is mainly necessary to predict the direction that the human eye is looking at, and then further calculate the actual target position that the human eye is looking at.
  • to do this, it is first necessary to locate the starting point of the line of sight, and locating the starting point of the line of sight requires pupil key point positioning technology.
  • determining the 3D spatial position of the pupil point in the camera coordinate system is therefore the basis of good line of sight estimation.
  • the present application provides a solution for determining the position of the pupil, which includes a method for determining the position of the pupil, a device, a system, a computer-readable storage medium, and a computer program product, which can realize the positioning of the three-dimensional position of the pupil.
  • the first aspect of the present application provides a method for determining the position of the pupil, including: acquiring an image including the pupil and a corresponding heat map, wherein the heat map is used to represent the probability distribution of pupil points in the image, and the pupil point is the center point of the pupil.
  • the position of the pupil point in the image is determined according to the first area of the heat map, wherein the probability value corresponding to the pixel in the first area of the heat map is greater than a first threshold.
  • a second area in the image is determined, wherein the probability value corresponding to the pixel in the heat map of the second area is greater than a second threshold, and the second threshold is less than or equal to the first threshold.
  • the depth value of the pupil point is determined from the depth value of the pixels of the second region in the image.
  • the three-dimensional position of the pupil point is determined according to the two-dimensional position of the pupil point and the depth value of the pupil point, wherein the two-dimensional position of the pupil point refers to the position of the pupil point in the image.
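The steps above can be sketched as follows, assuming a per-pixel heat map and depth map; the function name, thresholds, and the simple pinhole back-projection are illustrative stand-ins, not the patent's implementation:

```python
import numpy as np

def pupil_3d_from_heatmap(heatmap, depth_map, K, t1, t2):
    """Hypothetical sketch of the claimed method.

    heatmap   : (H, W) probability of each pixel being the pupil point
    depth_map : (H, W) per-pixel depth values for the same view
    K         : 3x3 camera intrinsic matrix
    t1, t2    : first and second thresholds, with t2 <= t1
    """
    # First area: pixels whose probability exceeds the first threshold.
    ys, xs = np.where(heatmap > t1)
    u, v = xs.mean(), ys.mean()        # 2D pupil point = center of the first area

    # Second area: pixels above the (lower) second threshold.
    ys2, xs2 = np.where(heatmap > t2)
    z = depth_map[ys2, xs2].mean()     # pupil depth = mean depth over the second area

    # Back-project the 2D point with a pinhole model to get the 3D position.
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    return np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])
```

Because t2 <= t1, the second area always contains the first area, so the depth is averaged over a region that covers the estimated 2D pupil point.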
  • the positioning of 3D pupil points can be realized, which provides technical support for technical directions such as gaze tracking, eye movement recognition, and human-computer interaction, provides an accurate and stable starting point for line of sight technology, and guarantees the stability of line of sight estimation.
  • the driver's sitting height can be measured according to the estimated 3D coordinate points of the pupils, so as to automatically adjust the seat so that the seat is at the most comfortable height.
  • the driver's distraction can also be judged according to the three-dimensional position of the pupil.
  • the deep neural network can be used to predict the heat map of the pupil point, and the probability distribution of the pupil point can be expressed through the heat map.
  • it is easier for the deep neural network to regress a heat map from the image than to directly regress the coordinates of the pupil point, and the heat map is also more robust to scenes such as occlusion, illumination changes, and large eyeball poses, which solves the problems of inaccurate pupil position positioning and insufficient robustness to a certain extent.
  • existing pupil point positioning either uses traditional image processing algorithms for detection, or directly uses a deep neural network to extract features and directly regress the key point coordinates; when the predicted pupil heat map is instead applied to pupil coordinate positioning, what is regressed is the probability distribution of the pupil coordinates, not a single coordinate position.
  • the pupil coordinates predicted by the present invention are more accurate and more robust.
  • the position of the pupil point in the image is determined according to the first area of the heat map, and the method that can be adopted includes: determining the pupil point according to the center position of the first area of the heat map position in the image.
  • the center position of the first region of the heat map can be used as the position of the pupil point in the image, which has better robustness.
  • the argmax function can be used on the heat map of the human eye area to obtain the point with the highest probability value, the second-highest point, points lower than the second-highest point, and so on; as needed, the set of highest points, or the set of highest and second-highest points, can be selected as the first area, and the mean position of these points used as the position of the pupil point.
  • what the heat map reflects is an area of pixels (for example, the point with the highest probability value corresponds to a collection of multiple pixels), not a single pixel, and the argmax solution likewise yields sets of points (for example, the set of highest-probability points and the set of second-highest points), so the mean value is used to solve for the position of the pupil point.
  • the mean value can be used for calculation, which can have better robustness.
  • weights may also be introduced for calculation.
  • weighting may also be performed according to a probability value, and the higher the probability value, the greater the weight.
  • the calculation of the mean value can also be weighted according to the position of each point; for example, when calculating the position mean, the position of each second-highest point can be weighted according to its distance from the highest point, with farther points receiving lower weight.
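As a sketch of this step, the following hypothetical helper computes the 2D pupil point as the center of the first area, either as a plain mean or weighted by each pixel's probability value (the function name and threshold are illustrative, not taken from the patent):

```python
import numpy as np

def pupil_point_2d(heatmap, t1, weighted=True):
    """Center of the first area of the heat map, optionally probability-weighted."""
    ys, xs = np.where(heatmap > t1)     # pixels above the first threshold
    if not weighted:
        return xs.mean(), ys.mean()     # plain mean position
    w = heatmap[ys, xs]                 # higher probability -> larger weight
    w = w / w.sum()
    # A distance-based weight (lower for points far from the peak) could be
    # folded into w here as well, as the text above suggests.
    return (w * xs).sum(), (w * ys).sum()
```

With weighting enabled, the estimate is pulled toward the brightest part of the heat map rather than the geometric center of the thresholded region.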
  • the first threshold is the second highest value of the probability values in the heat map.
  • the first threshold range can be selected as required, and the next highest value can be used as the first threshold, which can ensure the accuracy of pupil point calculation under a small amount of data.
  • a larger or smaller first threshold may also be selected, so that the first threshold corresponds to fewer or more points in the first region.
  • a possible implementation manner is to set a lower first threshold, so that points lower than the second-highest value are also included in the first region corresponding to the first threshold.
  • determining the depth value of the pupil point from the depth values of the pixels in the second area of the image includes: determining the depth value of the pupil point according to the mean value of the depth values of the pixels in the second area of the image.
  • the mean value of the depth in the second region is used as the depth value of the pupil point, which increases the robustness.
  • existing solutions often solve depth for the entire image, and the solution is not accurate enough in places where the texture features are not obvious.
  • using the heat map to guide the depth solution allows the solution to focus only on the heat map region.
  • the guidance of the heat map can promote the matching of the pupil positions of the two images, alleviate the impact of the pupil position's texture features not being obvious enough, and, by using the regional mean value instead of the value at the pupil center alone, improve the accuracy and stability of depth estimation.
  • the image includes a first image and a second image, where the first image and the second image are two images taken from different perspectives, and correspondingly, the heat map includes the first image The corresponding first heatmap and the second heatmap corresponding to the second image.
  • Determining the position of the pupil point in the image according to the first area of the heat map specifically includes: determining the position of the pupil point in the first image according to the first area of the first heat map.
  • Determining the second area in the image specifically includes: determining the second area in the first image, and determining the second area in the second image.
  • determining the depth value of the pupil point from the pixel depth values of the second area in the image specifically includes: the depth value of the pupil point is determined from the depth values of the pixels in the second area of the first image, and the depth values of those pixels are determined by the disparity between the image of the second area in the first image and the image of the second area in the second image.
  • the heat map can thus achieve two goals: on the one hand, it can accurately predict the coordinates of the pupil point; on the other hand, as a guide for binocular depth estimation, it can restrict the search to only the location area matching the heat map, that is, the second area, which greatly reduces the amount of calculation. In addition, it can to a certain extent alleviate the defects of inconspicuous texture features in the pupil area and difficult image similarity matching, which reduces the difficulty of estimation, requires less computing power, and improves estimation accuracy.
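A toy illustration of heat-map-guided binocular depth estimation, assuming rectified images and a single-pixel matching cost purely for brevity; the patent does not specify a matching cost, and `region_depth_from_stereo` and its parameters are hypothetical:

```python
import numpy as np

def region_depth_from_stereo(left, right, region_mask, f, baseline, max_disp=16):
    """Mean depth over the heat-map-guided second area.

    left, right : rectified grayscale views (rows are epipolar lines)
    region_mask : boolean mask of the second area in the left image
    f, baseline : focal length (pixels) and camera baseline
    """
    ys, xs = np.where(region_mask)          # search ONLY inside the second area
    depths = []
    for y, x in zip(ys, xs):
        best_d, best_cost = 0, np.inf
        for d in range(1, max_disp + 1):    # scan along the epipolar line
            if x - d < 0:
                break
            cost = abs(float(left[y, x]) - float(right[y, x - d]))
            if cost < best_cost:
                best_cost, best_d = cost, d
        if best_d > 0:
            depths.append(f * baseline / best_d)   # depth from disparity
    return float(np.mean(depths))           # regional mean, per the text above
```

Restricting the search to the second area is what cuts the computation compared with dense whole-image disparity estimation.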
  • the two cameras used can be two cameras installed on the two A-pillars of the vehicle, or a binocular camera in front of the driver's cab facing the interior of the vehicle. Wherein, the binocular camera may be two cameras integrated into one image acquisition device, or may be formed by installing two independent cameras at a fixed position.
  • a possible implementation manner of the first aspect further includes: performing image correction on the first image and the second image, where the image correction includes at least one of the following: image de-distortion, image position adjustment, and image cropping.
  • the image correction here includes image de-distortion, position adjustment, and cropping, etc., to obtain binocular images in which the left and right corresponding epipolar lines are parallel, so as to reduce the difficulty of subsequent image parallax calculations.
  • acquiring the heat map corresponding to the image includes: acquiring a human eye image from the image, and obtaining a heat map based on the human eye image.
  • the pupil heat map can be generated only for the human eye portion of the image, that is, the image is first cropped, which reduces the amount of data processing for heat map generation.
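A minimal sketch of cropping first and generating the heat map only for the eye region; `eye_box` (from a hypothetical eye detector) and `heatmap_net` (a stand-in for the deep network) are assumptions:

```python
import numpy as np

def pupil_heatmap_from_face(image, eye_box, heatmap_net):
    """Run the heat-map network only on the eye crop, then map the resulting
    peak back to full-image coordinates."""
    x0, y0, x1, y1 = eye_box
    crop = image[y0:y1, x0:x1]                  # human eye image only
    hm = heatmap_net(crop)                      # heat map over the crop
    cy, cx = np.unravel_index(np.argmax(hm), hm.shape)
    return hm, (x0 + cx, y0 + cy)               # pupil point in full-image coords
```

Because the network only sees the crop, the heat-map computation scales with the eye region rather than the whole frame.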
  • the second aspect of the present application provides an apparatus for determining a pupil position, including: an acquisition module configured to acquire an image including the pupil.
  • the processing module is configured to obtain a heat map corresponding to the image, and the heat map is used to represent the probability distribution of the pupil point in the image, and the pupil point is the center point of the pupil.
  • the processing module is further configured to determine the position of the pupil point in the image according to the first area of the heat map, where the probability value corresponding to the pixel in the first area of the heat map is greater than a first threshold.
  • the processing module is further configured to determine a second area in the image, where the probability value corresponding to the pixel in the heat map of the second area is greater than a second threshold, and the second threshold is less than or equal to the first threshold.
  • the processing module is also used to determine the three-dimensional position of the pupil point according to the two-dimensional position of the pupil point and the depth value of the pupil point.
  • the two-dimensional position of the pupil point refers to the position of the pupil point in the image, and the depth value of the pupil point is determined by the depth values of the pixels in the second area of the image.
  • when the processing module is used to determine the position of the pupil point in the image according to the first area of the heat map, it is specifically used to: determine the location of the pupil point in the image according to the center position of the first area of the heat map.
  • the first threshold is the second highest value of the probability values in the heat map.
  • determining the depth value of the pupil point from the depth values of the pixels in the second area of the image includes: determining the depth value of the pupil point according to the mean value of the depth values of the pixels in the second area of the image.
  • the image includes a first image and a second image
  • the first image and the second image are two images taken from different angles of view
  • the heat map includes the first heat map corresponding to the first image and the second heat map corresponding to the second image.
  • when the processing module is used to determine the position of the pupil point in the image according to the first area of the heat map, it is specifically used to: determine the position of the pupil point in the first image according to the first area of the first heat map.
  • when the processing module is used to determine the second area in the image, it is specifically used to: determine the second area in the first image, and determine the second area in the second image.
  • the depth value of the pupil point is determined from the pixel depth values of the second area in the image, including: the depth value of the pupil point is determined from the depth values of the pixels in the second area of the first image, and the depth values of those pixels are determined by the disparity between the image of the second area in the first image and the image of the second area in the second image.
  • the processing module is further configured to: perform image correction on the first image and the second image, and the image correction includes at least one of the following: image de-distortion, image position adjustment, and image cropping.
  • the processing module when used to obtain the heat map corresponding to the image, it is specifically used to: obtain the human eye image from the image. Obtain a heat map based on the human eye image.
  • the third aspect of the present application provides an electronic device, including: a processor, and a memory on which program instructions are stored; when the program instructions are executed by the processor, the processor performs the method for determining the pupil position of any one of the above-mentioned first aspects.
  • the fourth aspect of the present application provides an electronic device, including: a processor, and an interface circuit, wherein the processor accesses a memory through the interface circuit, the memory stores program instructions, and when the program instructions are executed by the processor, the processor executes the method for determining the pupil position of any one of the above-mentioned first aspects.
  • the fifth aspect of the present application provides a system for determining the pupil position, which includes an image acquisition device, and the electronic device provided in the third aspect or the fourth aspect coupled with the image acquisition device.
  • the system may be a vehicle-mounted device or a vehicle.
  • the image acquisition device may be a binocular camera installed in the vehicle, wherein the binocular camera may be realized by cooperation of two independent cameras, or may be realized by a camera device integrated with dual cameras.
  • the image acquisition device may also be a monocular camera capable of collecting depth information and image information.
  • the sixth aspect of the present application provides a computer-readable storage medium, on which program instructions are stored.
  • when the program instructions are executed by a computer, the computer executes the method for determining the pupil position of any one of the above-mentioned first aspects.
  • the seventh aspect of the present application provides a computer program product, which includes program instructions, and when the program instructions are executed by a computer, the computer executes the method for determining the pupil position of any one of the above first aspects.
  • the above-mentioned pupil position determination scheme of the present application uses the first area of the heat map, which represents the probability of the pupil point, to determine the two-dimensional position of the pupil point, determines the depth information of the pupil point based on the second area of the heat map, and can then determine the three-dimensional position of the pupil point. Because the above heat map method is adopted, compared with the conventional method of directly obtaining the position of the pupil point from the image, a more accurate pupil point position can be determined when the texture features of the iris and the pupil are not obvious. The implementation of the method has good robustness to scenes that affect the texture of the iris and pupil, such as occlusion, lighting, and large eye posture, and the pupil position is more accurate.
  • by using the average depth value of the second area, which includes the first area, to calculate the depth value of the pupil point, the scheme also solves the problem that the binocularly estimated depth of the pupil position is not accurate enough, and the mean-based calculation has better robustness.
  • the depth value is calculated based on the second area of the heat map, and the depth of the entire image does not need to be calculated, thereby reducing the amount of calculation and improving the calculation speed.
  • Fig. 1a is a schematic structural diagram of an application scenario of the method for determining the pupil position provided by the embodiment of the present application;
  • Fig. 1b is a first schematic diagram of an application scenario of the method for determining the pupil position provided by the embodiment of the present application;
  • Fig. 1c is a second schematic diagram of an application scenario of the method for determining the pupil position provided by the embodiment of the present application;
  • FIG. 2 is a schematic flowchart of a method for determining a pupil position provided in an embodiment of the present application
  • FIG. 3 is a schematic flowchart of a specific implementation method of a method for determining a pupil position provided in an embodiment of the present application
  • Fig. 4 is a schematic diagram of the heat map of the human eye pupil in the specific embodiment of the present application.
  • Fig. 5 is a schematic diagram of a device for determining a pupil position provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of an electronic device provided in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of another electronic device provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a system for determining a pupil position provided by an embodiment of the present application.
  • the solution for determining the pupil position includes a pupil position determination method, a device, a system, a computer-readable storage medium, and a computer program product. Since the principles by which these technical solutions solve problems are the same or similar, some repeated content may not be described again in the following specific embodiments, but it should be understood that these specific embodiments refer to each other and can be combined with each other.
  • the method that can be used is: first obtain the 2D coordinates of the pupil point in the 2D image, then obtain the depth information of the entire image through traditional binocular estimation, then look up the depth information corresponding to the 2D coordinates of the pupil point, and finally obtain the 3D coordinates of the pupil point in the camera coordinate system through the camera imaging algorithm. Due to the influence of camera imaging quality, illumination, large-angle eyeball rotation, squinting and other factors, the features of the iris part of the image may not be obvious enough, and traditional image processing algorithms are not robust to these scenes (they easily misdetect), so the detection of the 2D coordinates of the pupil point can easily fail in such cases.
  • in addition, the depth information of the entire image is often obtained directly through binocular estimation, which has problems such as redundant calculation, slow calculation speed, and low efficiency.
  • a method based on deep learning can be used to locate the 2D pupil point, for example, first perform face detection, then perform human eye area detection, and finally perform human eye pupil feature detection.
  • the hourglass network is first used to extract the features of the human eye pupil, then the four edge points of the iris are further located, a circle is fitted through these four points, and finally the coordinates of the center of this circle are taken as the pupil point coordinates.
  • when the four-point circle-fitting method is used, the fitting effect is poor when individual points deviate significantly, so the 2D coordinate positioning of the pupil will not be accurate enough.
  • the embodiment of the present application provides an improved solution for determining the pupil position.
  • the heat map of the pupil point of the human eye is predicted through a deep neural network, and then, guided by the heat map, depth information is searched and matched only within the range of the heat map.
  • the depth information of the pupil area is thus obtained, the average depth of the pupil area is used as the depth information of the pupil point, the 2D coordinates of the pupil point are obtained through the heat map of the pupil point, and finally the 3D coordinates of the pupil point in the camera coordinate system are obtained through the principle of pinhole imaging.
  • the technical solution of this application can still locate the 2D pupil position accurately under conditions of occlusion, illumination, large eyeball posture, etc., so its robustness is better; it solves the influence of inconspicuous texture features of the iris and pupil, which makes the binocularly estimated depth of the pupil position insufficiently accurate, and at the same time solves the problem of low efficiency and slow speed caused by searching and matching over the entire image in the binocular matching process.
  • the solution for determining the pupil position is applied to application fields such as gaze tracking, eye movement recognition, iris detection, and human-computer interaction.
  • when eye tracking or eye movement recognition is applied to the smart cockpit of a vehicle, it can be used to determine the driver's pupil position to predict sitting height, so as to automatically adjust the seat to the most comfortable height, and it can also be used to monitor whether the driver is distracted based on the 3D position of the pupil.
  • it can also be applied to monitoring whether the participants in a video conference are distracted, monitoring whether online-class students in front of the screen are listening carefully, or judging the user's attention, as well as gaze tracking or eye movement recognition of specific users, where the collected big data can provide support for research in psychology and other fields.
  • an introduction will be made to the scenario where the solution for determining the pupil position provided by the embodiment of the present application is applied to a vehicle.
  • an embodiment of the device for determining the pupil position may include an image acquisition device 11 and a processor 12.
  • the image acquisition device 11 is used to obtain the image of the user including the pupil.
  • the image acquisition device 11 is a camera, where the camera can be a binocular camera, a camera that can collect three-primary-color images and depth (RedGreenBlue-Depth, RGB-D), etc., and the camera can be installed on the vehicle as required.
  • a binocular camera composed of two independent cameras is used, and these two cameras are the first camera 111 and the second camera 112 arranged on the left and right A-pillars of the vehicle cockpit.
  • it can also be installed on the user-facing side of the rearview mirror in the vehicle cockpit, it can also be installed on the steering wheel, the area near the center console, and it can also be installed on the position above the display screen behind the seat. It is used to collect facial images of drivers or passengers in the vehicle cockpit.
  • the acquisition module 11 can also be an electronic device that receives the user image data transmitted by the camera, such as a data transmission chip, for example a bus data transceiver chip or a network interface chip; the data transmission chip can also be a wireless transmission chip, such as a Bluetooth chip or a WIFI chip.
  • the acquisition module 11 may also be integrated into the processor, and become an interface circuit or a data transmission module integrated into the processor.
  • the processor 12 is used to generate a heat map representing the probability distribution of pupil points according to the image, determine the position of the pupil point in the image according to the first area of the heat map, determine the second area in the image, and determine the three-dimensional position of the pupil point according to the two-dimensional position of the pupil point and the depth value of the pupil point, where the two-dimensional position of the pupil point refers to the position of the pupil point in the image, and the depth value of the pupil point is determined by the depth values of the pixels in the second area of the image.
  • the processor 12 can be an electronic device, specifically the processor of a vehicle-mounted processing device such as a car machine or a vehicle-mounted computer, or a conventional processor such as a central processing unit (Central Processing Unit, CPU) or a microcontroller unit (MCU).
  • the processor can also be the chip processor of terminal hardware such as mobile phones and tablets.
  • the three-dimensional position of the pupil point can be obtained based on the above-mentioned heat map, and then further applied to application fields such as gaze tracking, eye movement recognition, iris detection, and human-computer interaction. Since the positioning is not based directly on the image itself, it is basically not affected by the inconspicuous texture features of the iris and pupil, and it can also solve the problem of inaccurate binocular estimation of the depth of the pupil position.
  • Figure 2 shows the flow chart of the method for determining the pupil position provided by the embodiment of the present application
  • the method of this embodiment can be executed by the pupil position determination device or by some components in the device, for example, by a vehicle, an on-vehicle device, or a processor. Taking the processor as an example, the method for determining the pupil position provided in the embodiment of the present application is introduced below, including the following steps:
  • S10 The processor on the vehicle acquires the image including the pupil collected by the image acquisition device through the interface circuit.
  • the image acquisition device may be a binocular camera, and the binocular camera may be realized by cooperation of two independent cameras, or may be realized by a camera device integrated with two cameras.
  • the image acquisition device can be a binocular camera composed of two cameras installed on the two A-pillars of the vehicle, a binocular camera installed in front of the cab facing the interior of the vehicle, or the camera of a smart device (such as a mobile phone or tablet computer) that integrates dual cameras and can take pictures; two images with parallax can be collected by the two cameras.
  • the camera can be an RGB-D camera, and the RGB-D camera can be a monocular depth camera.
  • in this case, depth information corresponding to each pixel of the RGB image is also collected together with the image.
  • the collected image may be the image of the driver or the image of the passenger.
  • the pupil needs to be included in the image, so as to identify the pupil point in the image later.
  • the image recognition model can be realized by a deep neural network, which can be an hourglass network (hourglass), HRNet, U-Net, FCN, segmentation network provided by Deeplab, EspNet network, etc.
• images of poor quality that cannot be processed can also be filtered out, keeping only images of good quality.
• noise reduction processing may also be performed on the image, such as removing random, discrete, and isolated pixels, so as to meet the requirements of subsequent image processing.
• the image including the pupil may also be acquired by receiving it through data transmission, which can be realized through an independent communication interface or through a communication interface integrated in the processor.
  • the independent communication interface can be an interface chip for wired transmission, such as a serial data bus interface chip, a parallel data bus interface chip, a network cable interface chip, etc., or a wireless transmission chip, such as a Bluetooth chip or a WIFI chip.
  • the communication interface is integrated in the processor, it may be an interface circuit integrated in the processor, or a wireless transmission module integrated in the processor.
• S20 The processor on the vehicle acquires a heat map corresponding to the image including the pupil, where the heat map is used to represent the probability distribution of pupil points in the image, the pupil point being the center point of the pupil.
  • the heat map represents the probability distribution of pupil points
  • the heat map can use brightness to indicate the level of the probability value. The brighter the position in the figure, the higher the probability that the position is a pupil point. Since the heat map represents the probability distribution of pupil points, when used to identify pupil points, it is easier to regress the heat map than to directly regress the pupil point coordinates, and it is more robust to different scenarios.
• a deep neural network can be used to generate the heat map according to the image.
• the deep neural network can be, for example, an hourglass network, a high-resolution network (High-Resolution Net, HRNet), a U-shaped network (U-Net), an FCN, the segmentation network provided by Deeplab, the EspNet network (a lightweight convolutional neural network), or other networks that process image data.
  • a segmentation network can be used to generate a heat map corresponding to an image.
  • each deep neural network mentioned in the embodiment of the present application refers to the network after training, which will not be repeated in the following.
• during training, human eye images with heat-map label data are used as samples.
• the network can be trained using the mean square error (MSE) loss function or other loss functions.
• the trained network thus has the function of generating a heat map of the pupil according to the image.
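As an illustrative sketch of this training setup (the Gaussian label width sigma, the image size, and all function names here are assumptions, not specified by the embodiment), a heat-map label can be built as a 2-D Gaussian centered on the annotated pupil point and compared with the network output under an MSE loss:

```python
import numpy as np

def gaussian_heatmap(h, w, cx, cy, sigma=2.0):
    """Heat-map label: a 2-D Gaussian centered on the annotated pupil point."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

def mse_loss(pred, label):
    """Mean square error between predicted and labeled heat maps."""
    return float(np.mean((pred - label) ** 2))

label = gaussian_heatmap(32, 32, cx=15, cy=16)
pred = np.zeros_like(label)      # stand-in for an untrained network's output
print(mse_loss(pred, label))     # the quantity minimized during training
```

A real training loop would backpropagate this loss through one of the segmentation networks named above; the sketch only shows the label construction and the loss term.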
  • S30 The processor on the vehicle determines the position of the pupil point in the image according to the first area of the heat map, wherein the probability value corresponding to the pixel in the first area of the heat map is greater than the first threshold.
  • the position of the pupil point in the image may be determined according to the center position of the first area of the heat map.
• the argmax function may be used on the heat map of the human eye area to locate the points with the highest and second-highest probability values in the human eye image.
  • the first threshold is the second highest probability value in the heat map
  • the first area is an area formed by points with the highest probability value and the second highest probability value.
• the mean of the positions of the highest and second-highest points in the calculated area can be used as the predicted pupil point position; for example, the predicted pupil point position can be: (ΣPa_i + ΣPb_j)/(I + J), where Pa_i represents the position of each highest point Pa, Pb_j represents the position of each second-highest point Pb, I represents the number of Pa points, J represents the number of Pb points, i ≤ I, j ≤ J.
• in an ideal situation, the value of I is relatively small, even 1 (indicating that there is only one highest point Pa); this will not be repeated hereafter.
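A minimal sketch of this mean-value prediction (the toy heat map and the function name are our own, for illustration only):

```python
import numpy as np

def pupil_from_heatmap(heatmap):
    """Predicted pupil position: mean position of all points whose probability
    equals the highest or second-highest value, i.e. (ΣPa_i + ΣPb_j)/(I + J)."""
    levels = np.unique(heatmap)           # distinct probability values, ascending
    top_two = levels[-2:]                 # highest and second-highest values
    ys, xs = np.where(np.isin(heatmap, top_two))
    return float(xs.mean()), float(ys.mean())

hm = np.array([[0.0, 0.1, 0.0],
               [0.1, 0.9, 0.8],
               [0.0, 0.1, 0.0]])
print(pupil_from_heatmap(hm))   # → (1.5, 1.0): mean of the 0.9 and 0.8 positions
```

Because the prediction averages over a set of pixels rather than reading a single argmax pixel, small perturbations of the heat map shift the result only slightly.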
• a larger or smaller first threshold may also be selected, so that there are fewer or more points in the first region; for example, a smaller first threshold can be selected so that points at the level below the second-highest value also enter the calculation.
• when the position of the pupil point is calculated by means of the mean value, the points can also be weighted according to their probability values: the higher the probability value, the greater the weight.
• for example, the predicted position of the pupil point can be: (e1·ΣPa_i + e2·ΣPb_j + e3·ΣPc_k)/(e1·I + e2·J + e3·K), where Pa_i represents the position of each highest point Pa, Pb_j represents the position of each second-highest point Pb, Pc_k represents the position of each next-highest point Pc, I represents the number of Pa points, J represents the number of Pb points, K represents the number of Pc points, i ≤ I, j ≤ J, k ≤ K, and e1, e2, e3 are weights with e1 > e2 > e3.
• weighting can also be performed according to the position of each point: for example, when calculating the mean position, each second-highest point can be weighted according to its distance from the highest point, with farther points receiving lower weights.
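The probability-weighted variant can be sketched as follows (the weight values e1 > e2 > e3 and the toy data are illustrative assumptions, not values fixed by the embodiment):

```python
import numpy as np

def weighted_pupil(heatmap, weights=(3.0, 2.0, 1.0)):
    """Weighted mean over the top probability levels:
    (e1·ΣPa_i + e2·ΣPb_j + e3·ΣPc_k) / (e1·I + e2·J + e3·K)."""
    levels = np.unique(heatmap)[::-1][:len(weights)]   # top levels, descending
    num = np.zeros(2)
    den = 0.0
    for level, e in zip(levels, weights):
        ys, xs = np.where(heatmap == level)            # all points at this level
        num += e * np.array([xs.sum(), ys.sum()])
        den += e * len(xs)
    return float(num[0] / den), float(num[1] / den)

hm = np.array([[0.0, 0.1, 0.0],
               [0.1, 0.9, 0.8],
               [0.0, 0.1, 0.0]])
print(weighted_pupil(hm))   # higher-probability points pull the estimate harder
```

Distance-based weighting mentioned above would replace the per-level weight `e` with a function of each point's distance to the highest point; the structure of the computation is the same.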
• the reason for using the mean value to calculate the pupil point is that the heat map reflects areas of pixels (for example, the point with the highest probability value corresponds to a set of multiple pixels) rather than a single pixel, and the argmax solution likewise yields sets of points (such as the set of highest-value points and the set of second-highest points); therefore the mean value is used to solve for the position of the pupil point, which improves robustness.
• S40 The processor on the vehicle determines the second area in the image, wherein the probability value corresponding to the pixels of the second area in the heat map is greater than a second threshold, and the second threshold is less than or equal to the first threshold.
• the second area of the image is determined for the calculation of the depth value in the step described later. In this way, only the local second area is used to calculate the depth value, which reduces the amount of calculation compared to calculating over the entire image.
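Determining the second area amounts to binarizing the heat map against the second threshold; a minimal sketch (the threshold value and toy heat map are illustrative):

```python
import numpy as np

def second_area_mask(heatmap, second_threshold):
    """Mask of the second area: pixels whose probability exceeds the second
    threshold. Depth is later computed only inside this mask."""
    return heatmap > second_threshold

hm = np.array([[0.0, 0.1, 0.0],
               [0.1, 0.9, 0.8],
               [0.0, 0.1, 0.0]])
mask = second_area_mask(hm, 0.5)
print(int(mask.sum()))   # → 2 pixels in the second area
```

Because the second threshold is at most the first threshold, the second area always contains the first area used for the 2-D position estimate.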
• S50 The processor on the vehicle determines the three-dimensional position of the pupil point according to the two-dimensional position of the pupil point and the depth value of the pupil point.
• the two-dimensional position of the pupil point refers to the position of the pupil point in the image, and the depth value of the pupil point is determined by the depth values of the pixels in the second area of the image.
• the internal parameter information of the camera is used, through the pinhole imaging principle, to obtain the three-dimensional position (i.e., 3D coordinates) of the pupil point in the camera coordinate system from its two-dimensional position and depth value.
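Under the pinhole model, this back-projection can be sketched as follows (fx, fy, cx, cy denote assumed camera intrinsics: focal lengths in pixels and the principal point; the numeric values are illustrative):

```python
def backproject(u, v, z, fx, fy, cx, cy):
    """Pinhole model: pixel (u, v) with depth z -> 3-D point in the
    camera coordinate system."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return x, y, z

# a pupil point imaged at the principal point with depth 0.6 m
# lies on the optical axis
print(backproject(640.0, 360.0, 0.6, fx=800.0, fy=800.0, cx=640.0, cy=360.0))
# → (0.0, 0.0, 0.6)
```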
  • the mean value of the depth values of the pixels in the second region of the image is used as the depth value of the pupil point.
  • Using the average depth value of each pixel in the second region of the heat map as the pupil point depth value can have better robustness.
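A minimal sketch of this averaging (the depth map, mask, and function name are illustrative assumptions):

```python
import numpy as np

def pupil_depth(depth_map, area_mask):
    """Pupil depth: mean of the per-pixel depth values inside the second area,
    which is more robust than reading a single pixel's depth."""
    return float(depth_map[area_mask].mean())

depth = np.array([[0.50, 0.62],
                  [0.58, 0.90]])       # per-pixel depth values in meters
mask = np.array([[False, True],
                 [True, False]])       # second area covers two pixels
print(pupil_depth(depth, mask))        # mean of 0.62 and 0.58
```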
  • the camera is an RGB-D camera.
  • the RGB-D camera is a monocular depth camera.
• the depth information corresponding to each pixel of the image is collected. Therefore, after the second area is determined in the above step S40, the depth values of the pixels in this area can be directly averaged to obtain the pupil point depth value.
  • the depth may be calculated based on the parallax.
• the two images taken from different angles of view by the two cameras are the first image and the second image respectively; correspondingly, the heat map includes the first heat map corresponding to the first image and the second heat map corresponding to the second image;
• the disparity corresponding to the second area determines the depth values of the pixels of the second area in the first image
  • the depth value of the pupil point is determined from the depth value of the pixels in the second region in the first image.
• for a point in the captured space with three-dimensional coordinates P(x, y, z), its depth Z can be obtained as: Z = f·T_x / d
• d is used to represent the parallax (disparity)
• d = X_L − X_R
• X_L and X_R are used to represent the imaging coordinates of the object on the image planes of the cameras at the two different positions
• f is used to represent the focal length of the camera
• T_x is used to represent the baseline (the distance between the optical axes of the two cameras).
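The relation above can be sketched directly (the numeric values are illustrative; the images are assumed rectified so that the disparity is purely horizontal):

```python
def depth_from_disparity(x_left, x_right, f, t_x):
    """Binocular depth: Z = f * T_x / d, with disparity d = X_L - X_R.
    f: focal length in pixels; t_x: baseline between the two cameras."""
    d = x_left - x_right
    if d <= 0:
        raise ValueError("non-positive disparity: no valid depth")
    return f * t_x / d

# f = 800 px, baseline T_x = 0.125 m, disparity 16 px
print(depth_from_disparity(400.0, 384.0, f=800.0, t_x=0.125))   # → 6.25 m
```

Note that depth is inversely proportional to disparity: a one-pixel disparity error matters far more for distant points than for near ones, which is one reason averaging over the second area helps.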
• the heat map is used as a guide for binocular depth estimation, and the search is performed only in the location area matching the heat map, that is, the second area, which greatly reduces the amount of calculation; in addition, it can also alleviate, to a certain extent, the difficulty of image similarity matching caused by the inconspicuous texture features of the pupil area.
  • the above image correction step is also included.
• generating the heat map of the pupil for the image in step S10 includes: extracting a human face image from the image including the human face and pupil; identifying a human eye image from the human face image; generating the heat map of the pupil according to the human eye image.
• the heat map of the pupil is generated from the cropped human eye image; that is, the image is first cropped, which can reduce the amount of data processing for heat map generation.
• when extracting the human face image or recognizing the human eye image, a deep neural network can be used for each step respectively, or a deep neural network can be used to directly recognize the human eye image from the collected images.
  • the deep neural network used is such as hourglass network (hourglass), HRNet, U-Net, FCN, segmentation network provided by Deeplab, EspNet network and so on.
• the two cameras can both be RGB cameras or both be IR cameras; in this specific embodiment they are two infrared cameras, and the two cameras are synchronized for photographing so that they collect the driver's image at the same time, avoiding image matching errors caused by timing errors.
  • These two cameras can be used to execute the following step S210 in this specific embodiment.
• the processing of the collected images can be performed by a car, a vehicle-mounted device, or a processing device (such as a processor or processing chip).
  • it can be executed by the electronic control unit (Electronic Control Unit, ECU) of the vehicle.
  • it can also be executed by a smart device (such as a mobile phone, a PAD, etc.) or a cloud server communicating with the ECU.
  • the ECU can transmit the image data collected by the camera to the smart device or the cloud server.
• in this specific embodiment, steps S215-S220 and steps S240-S265 are executed by the ECU as an example; it is not difficult to understand that the distribution of steps is not limited to the above manner.
  • the first specific implementation of the method for determining the pupil position includes the following steps:
• S210 The two cameras take pictures synchronously and obtain a left image and a right image with different perspectives, that is, an image pair.
  • S215-S220 Perform image correction on the left and right images in the obtained image pair.
• Image correction uses the pre-calibrated internal parameters of the two cameras and the external parameters between them to correct the collected left and right image pairs, including image de-distortion, position adjustment, and cropping, so as to obtain a binocular image pair whose horizontal epipolar lines are parallel.
  • the internal parameters include the principal points of the left and right cameras, the distortion vectors of the left and right cameras, and the external parameters include the rotation matrix and translation matrix between the cameras. Internal and external parameters are determined during camera calibration.
  • the rotation matrix and translation matrix can also be called the homography matrix (Homography)
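As an illustrative sketch of the de-distortion part of this correction (a simple radial model with assumed coefficients k1, k2; full rectification would additionally apply the rotation and translation between the cameras, for example with a library routine such as OpenCV's stereoRectify):

```python
import numpy as np

def undistort_points(pts, fx, fy, cx, cy, k1, k2):
    """Approximately remove radial distortion from pixel points.
    Model: distorted = undistorted * (1 + k1*r^2 + k2*r^4) in normalized
    coordinates; one fixed-point step is used, adequate for mild lenses."""
    pts = np.asarray(pts, dtype=float)
    x = (pts[:, 0] - cx) / fx          # to normalized camera coordinates
    y = (pts[:, 1] - cy) / fy
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    xu, yu = x / scale, y / scale      # approximate inverse of the model
    return np.stack([xu * fx + cx, yu * fy + cy], axis=1)

# with zero distortion coefficients the points pass through unchanged
print(undistort_points([[700.0, 400.0]], 800.0, 800.0, 640.0, 360.0, 0.0, 0.0))
```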
• face detection and face image extraction can be realized through a deep neural network, for example using a convolutional neural network (CNN), a region proposal network (Region Proposal Network, RPN), a fully convolutional network (Fully Convolutional Networks, FCN), regions with CNN features (RCNN), etc.
• S230 For the extracted face image, use a human eye detection algorithm to detect the human eye; if the human eye can be detected, proceed to the next step; otherwise, end this process, return to step S210, and execute the next round.
• human eye detection can be performed through image algorithms based on the geometric features and grayscale features of the human eye, or realized through a deep neural network.
  • the deep neural network is a CNN network, RPN network, FCN network, etc.
  • S235 When the human eye is detected, use the segmentation network to predict the heat map of the human eye pupil point for the human eye area, and obtain the human eye pupil heat map.
  • a schematic diagram of the heat map is shown in FIG. 4 .
• the heat map represents the probability distribution of pupil points, so it is more robust to situations such as occlusion, lighting changes, and large eye postures; this can, to a certain extent, solve the problem of inaccurate recognition caused by the insufficient robustness of 2D pupil position localization.
• use the argmax function on the heat map of the human eye area to solve for the probability values, take the value corresponding to the second-highest point as the first threshold, determine the positions of the points with the highest and second-highest probability values in the human eye image, and then use the mean of those positions as the predicted pupil point position.
  • S245 Deduce the original 2D coordinates of the pupil in the original image according to the position of the pupil point.
• the 2D coordinates of the pupil point in the original image are derived in reverse according to the correspondence between the original image and the extracted face area, and the correspondence of the human eye area identified within the face area.
• the second threshold may be the value corresponding to the second-highest probability point in the heat map
• binarization can obtain the area A and area A' of the left and right heat maps respectively.
  • the area A and the area A' correspond to the pupil point and the surrounding area.
  • S260 Calculate the depth information of each pixel in the area A (that is, area A') according to the parallax and the internal reference information of the camera, and take the average value of each depth information in the area as the depth of the pupil point.
• because the heat map is used, this specific embodiment of the present application can directly obtain the approximate area of the pupil, and solves the parallax only for the images within the approximate pupil areas A and A' determined by the heat map to obtain the depth information of the pupil area. That is to say, similarity search and matching over the entire image is reduced to similarity search and matching over the pupil area, which greatly reduces the amount of calculation on the one hand, and on the other hand eases, to a certain extent, the influence of the insufficiently distinct texture features of the iris and pupil, providing a guide for searching and matching this area.
  • the embodiment of the present application also provides a corresponding device for determining the pupil position.
  • the device for determining the pupil position in this embodiment can be used to implement various optional embodiments of the above-mentioned method for determining the pupil position.
• the device 100 for determining the pupil position can be used to implement the above method for determining the pupil position; the device 100 has an acquisition module 110 and a processing module 120, wherein:
  • the acquiring module 110 is used for acquiring images including pupils. Specifically, the acquisition module 110 may be used to execute step S10 and examples thereof in the method for determining the pupil position above.
• the processing module 120 is used to obtain the heat map corresponding to the image, which represents the probability distribution of the pupil point in the image; it is also used to determine the position of the pupil point in the image, to determine the above-mentioned second area in the image, and to determine the three-dimensional position of the pupil point according to the two-dimensional position of the pupil point in the image and the depth value of the pupil point determined from the depth values of the pixels in the second area.
  • the processing module 120 may be used to execute any one of steps S20-S50 in the method for determining the pupil position and any optional example thereof. For details, refer to the detailed description in the method embodiments, and details are not repeated here.
• in some embodiments, when the processing module 120 is used to determine the position of the pupil point in the image according to the first region of the heat map, it is specifically used to determine the position of the pupil point in the image according to the center position of the first area of the heat map. Specifically, in this case, the processing module 120 is configured to execute any step in step S30 of the method for determining the pupil position and any optional example thereof. In some other embodiments, when the position of the pupil point is calculated by means of the mean value, the points can also be weighted according to their probability values or according to their positions.
  • the first threshold is the second highest value of the probability values in the heat map. In some other embodiments, a larger or smaller first threshold may also be selected, so that there are fewer or more points in the first region.
• the determining of the depth value of the pupil point from the depth values of the pixels in the second area of the image includes: determining the depth value of the pupil point according to the mean of the depth values of the pixels in the second area of the image.
  • the acquired image including the pupil includes a first image and a second image, and the first image and the second image are two images taken from different viewing angles.
• the heat map includes a first heat map corresponding to the first image and a second heat map corresponding to the second image. In this case, when the processing module is used to determine the position of the pupil point in the image according to the first region of the heat map, it is specifically used to determine the position of the pupil point in the first image according to the first region of the first heat map; when the processing module is used to determine the second area in the image, it is specifically used to determine the second area in the first image and the second area in the second image. In this case, the depth value of the pupil point is determined by the depth values of the pixels in the second area of the first image, and those depth values are determined by the disparity between the second area in the first image and the second area in the second image.
  • the processing module is further configured to: perform image correction on the first image and the second image, wherein the image correction includes at least one of the following: image de-distortion, image position adjustment, and image cropping.
  • the processing module when used to obtain the heat map corresponding to the image, it is specifically used to: obtain a human eye image from the image; obtain the heat map according to the human eye image.
• the apparatus 100 for determining the pupil position in the embodiment of the present application can be implemented by software; for example, it can be implemented by a computer program or instructions having the above functions, with the corresponding computer program or instructions stored in the internal memory of the terminal, and the processor reading them from the memory to realize the above functions.
• the device 100 for determining the pupil position in the embodiment of the present application can also be implemented by hardware; for example, the acquisition module 110 of the determination device 100 can be implemented by a camera on the vehicle, or by the interface circuit between the processor and the camera on the vehicle.
• the processing module 120 of the determination device 100 can be realized by a processing device on the vehicle, such as the processor of an in-vehicle processing device like the head unit or an on-board computer, or the processing module 120 can also be realized by a terminal such as a mobile phone or a tablet.
  • the apparatus 100 for determining the pupil position in the embodiment of the present application may also be implemented by a combination of a processor and a software module.
  • processing details of the devices or modules in the embodiments of the present application can refer to the related expressions of the embodiments shown in FIG. 1a-FIG. 3 and related extended embodiments, and the embodiments of the present application will not be repeated.
  • the embodiment of the present application also provides a vehicle with the above-mentioned device for determining the pupil position.
• the vehicle can be a family car or a truck, etc., or a special vehicle such as an ambulance, a fire engine, a police car, or an engineering emergency vehicle.
• each module of the above-mentioned device for determining the pupil position can be arranged in the vehicle system in pre-installed or after-market form; the modules can rely on the bus or interface circuit of the vehicle for data interaction, or, with the development of wireless technology, the modules can also use wireless communication for data interaction to eliminate the inconvenience caused by wiring.
  • FIG. 6 is a schematic structural diagram of an electronic device 600 provided by an embodiment of the present application.
  • the electronic device 600 includes: a processor 610 and a memory 620 .
  • the electronic device 600 shown in FIG. 6 may further include a communication interface 630, which may be used for communication with other devices.
  • the processor 610 may be connected to the memory 620 .
• the memory 620 can be used to store program codes and data. The memory 620 may be a storage unit inside the processor 610, an external storage unit independent of the processor 610, or a combination of a storage unit inside the processor 610 and an external storage unit independent of it.
  • the electronic device 600 may also include a bus.
  • the memory 620 and the communication interface 630 may be connected to the processor 610 through a bus.
• the bus may be a peripheral component interconnect (Peripheral Component Interconnect, PCI) bus, an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, or the like.
  • the bus can be divided into address bus, data bus, control bus and so on.
  • the processor 610 may be a central processing unit (central processing unit, CPU).
• the processor can also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the processor 610 adopts one or more integrated circuits for executing related programs, so as to implement the technical solutions provided by the embodiments of the present application.
  • the memory 620 may include read-only memory and random-access memory, and provides instructions and data to the processor 610 .
  • a portion of processor 610 may also include non-volatile random access memory.
  • processor 610 may also store device type information.
• the processor 610 executes the computer-executable instructions in the memory 620 to perform the operation steps of the above method for determining the pupil position, for example the method in the embodiment corresponding to FIG. 2, or each optional embodiment thereof, or the method in the specific implementation corresponding to FIG. 3, or each optional embodiment therein.
• the electronic device 600 may correspond to a corresponding subject performing the methods according to the various embodiments of the present application, and the above-mentioned and other operations and/or functions of the modules in the electronic device 600 are intended to realize the corresponding processes of the methods in those embodiments; for the sake of brevity, they are not repeated here.
• FIG. 7 is a schematic structural diagram of another electronic device 700 provided in this embodiment, including a processor 710 and an interface circuit 720, wherein the processor 710 accesses the memory through the interface circuit 720, and the memory stores program instructions.
  • the processor executes the method of the embodiment corresponding to FIG. 2, or each optional embodiment thereof, or the method corresponding to FIG. 3
  • the electronic device may further include a communication interface, a bus, etc.
• the embodiment of the present application also provides a system 800 for determining the position of the pupil, which includes an image acquisition device 810 and an electronic device.
• the electronic device may be the electronic device 600 shown in FIG. 6 or the electronic device 700 shown in FIG. 7.
• the image acquisition device 810 may be an RGB-D camera or a binocular camera, which is used to collect images including pupils and provide them to the electronic device, so that the electronic device executes the method of the embodiment corresponding to FIG. 2, or each optional embodiment thereof, or the method in the specific implementation corresponding to FIG. 3, or each optional embodiment therein.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
• multiple units or components can be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
• if the functions described above are realized in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • the technical solution of the present application is essentially or the part that contributes to the prior art or the part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium, including Several instructions are used to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present application.
• the aforementioned storage media include: a USB flash drive, a mobile hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disc, or other media that can store program codes.
  • the embodiment of the present application also provides a computer-readable storage medium, on which a computer program is stored, and when the program is executed by a processor, it is used to execute the method for determining the position of the pupil.
• the method includes at least one of the solutions described in the above embodiments.
  • the computer storage medium in the embodiments of the present application may use any combination of one or more computer-readable media.
  • the computer readable medium may be a computer readable signal medium or a computer readable storage medium.
  • a computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, device, or device, or any combination thereof. More specific examples (non-exhaustive list) of computer readable storage media include: electrical connections with one or more leads, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), Erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • A computer-readable signal medium may include a propagated data signal carrying computer-readable program code in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing.
  • A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Computer program code for performing the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as "C" or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider such as AT&T, MCI, Sprint, EarthLink, MSN, or GTE).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

A pupil position determination solution, comprising: acquiring an image comprising a pupil (S10) and a corresponding heat map (S20), the heat map being used for representing a probability distribution of the pupil point in the image, wherein the pupil point is a center point of the pupil. A two-dimensional position of the pupil point in the image and a depth value of the pupil point are determined by using a probability value corresponding to a pixel in the heat map, a first threshold value and a second threshold value. A three-dimensional position of the pupil point is then determined according to the two-dimensional position of the pupil point in the image and the depth value of the pupil point. Since the three-dimensional position of the pupil point is determined by using the probability distribution-based heat map, the position of the pupil point is more accurately predicted, and the robustness of pupil point prediction is better in different settings such as occlusion, illumination and large eyeball pose.

Description

Method, device and system for determining pupil position
Technical Field
The present application relates to the technical field of artificial intelligence, and in particular to a method, device and system for determining the pupil position.
Background
Pupil key point positioning technology is widely used in fields such as gaze tracking, eye movement recognition, and iris detection. Meanwhile, as virtual reality technology and human-computer interaction grow in popularity, pupil positioning technology has received more and more attention. Early two-dimensional (2-dimension, 2D) pupil key point positioning mainly used traditional image processing algorithms, performing binarization, erosion, dilation and other operations on the image, then using a detection algorithm to circle the iris, and further solving for the center point of the iris from the center of that circle.
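As a rough illustration of the traditional pipeline described above (background only, not the claimed solution), the binarization and erosion steps can be sketched in plain Python on a toy grayscale grid; a real implementation would use an image-processing library and follow with a circle-detection step such as the Hough transform, and the threshold and image values here are made up for the example:

```python
def binarize(img, thresh):
    # Pixels darker than the threshold are candidate iris/pupil pixels.
    return [[1 if v < thresh else 0 for v in row] for row in img]

def erode(mask):
    # 3x3 erosion: a pixel survives only if it and all 8 neighbours are set.
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if all(mask[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)):
                out[y][x] = 1
    return out

# Toy 5x5 "eye" image: a dark 3x3 blob in the centre.
img = [
    [200, 200, 200, 200, 200],
    [200,  30,  30,  30, 200],
    [200,  30,  10,  30, 200],
    [200,  30,  30,  30, 200],
    [200, 200, 200, 200, 200],
]
mask = binarize(img, 100)
core = erode(mask)  # only the centre of the dark blob survives erosion
```

After erosion, only the interior of the dark region remains, which is why such pipelines then fit a circle to recover the iris boundary and centre.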
In recent years, with the development of deep learning and the continuous improvement of hardware performance, deep learning methods are increasingly used to locate pupil key points. Deep learning has made great progress in the fields of face and human-body key point positioning, with ever higher accuracy and ever better results. However, because pupil key point positioning places very high requirements on positioning accuracy, there are still relatively few techniques that use deep learning alone and target only the eye pupil.
For example, gaze tracking technology needs to accurately locate the three-dimensional (3-dimension, 3D) spatial position of the pupil point in the camera coordinate system. Specifically, in a gaze tracking task, it is mainly necessary to predict the direction in which the human eye is looking, and then further calculate the actual target position being looked at. To achieve this, the starting point of the line of sight must first be located, which requires pupil key point positioning technology; determining the 3D spatial position of the pupil point in the camera coordinate system is the basis of good gaze estimation.
Summary of the Invention
The present application provides a solution for determining the pupil position, including a method, device, system, computer-readable storage medium and computer program product for determining the pupil position, which can realize the positioning of the three-dimensional position of the pupil.
To achieve the above purpose, the first aspect of the present application provides a method for determining the pupil position, including: acquiring an image including the pupil and a corresponding heat map, where the heat map is used to represent the probability distribution of the pupil point in the image, and the pupil point is the center point of the pupil; determining the position of the pupil point in the image according to a first area of the heat map, where the probability values corresponding to the pixels in the first area of the heat map are greater than a first threshold; determining a second area in the image, where the probability values corresponding to the pixels of the second area in the heat map are greater than a second threshold, and the second threshold is less than or equal to the first threshold; determining the depth value of the pupil point from the depth values of the pixels of the second area in the image; and determining the three-dimensional position of the pupil point according to the two-dimensional position of the pupil point and the depth value of the pupil point, where the two-dimensional position of the pupil point refers to the position of the pupil point in the image.
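The final step of the first aspect, lifting the 2D pupil position and its depth to a 3D point in the camera coordinate system, follows the standard pinhole camera model; the sketch below is illustrative only, and the intrinsic parameters (fx, fy, cx, cy) and the pixel/depth values are assumptions for the example, not values from the application:

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    # Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy, Z = depth.
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

# Pupil at pixel (640, 360), depth 600 mm, with illustrative intrinsics.
point = backproject(640, 360, 600.0, fx=1000.0, fy=1000.0, cx=620.0, cy=350.0)
# point is the pupil's 3D position in the camera coordinate system (mm).
```

The intrinsics would in practice come from camera calibration; any consistent length unit for the depth carries through to the 3D point.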
Accordingly, 3D pupil point positioning can be realized, providing technical support for directions such as gaze tracking, eye movement and human-computer interaction, providing an accurate and stable starting point for gaze technology, and guaranteeing the stability of gaze estimation. For example, in the smart cockpit of a vehicle, the driver's sitting height can be measured from the estimated 3D coordinates of the pupils so as to automatically adjust the seat to the most comfortable height. In driver monitoring, the driver's distraction can also be judged from the three-dimensional position of the pupil.
A deep neural network can be used to predict the heat map of the pupil point, with the heat map representing the probability distribution of the pupil point. For a deep neural network, regressing a heat map from an image is easier than directly regressing the pupil point coordinates, and it is also more robust to scenes with occlusion, illumination changes, large eyeball poses and the like, which to a certain extent solves the problem that pupil positioning is not accurate or robust enough. Specifically, existing pupil point positioning either relies on traditional image processing algorithms for detection, or directly uses a deep neural network to extract features and directly regress the key point coordinates. By applying heat map prediction to pupil coordinate positioning, what is regressed is the probability distribution of the pupil coordinates rather than a simple coordinate position; through the heat map regression method, the predicted pupil coordinates are more accurate and also more robust.
As a possible implementation of the first aspect, determining the position of the pupil point in the image according to the first area of the heat map may include: determining the position of the pupil point in the image according to the center position of the first area of the heat map.
Optionally, the center position of the first area of the heat map can be used as the position of the pupil point in the image, which offers better robustness. For example, the argmax function can be applied to the heat map of the human eye area to solve for the point with the highest probability value, the second highest point, the next highest point below that, and so on. As needed, the set of highest points, or the set of highest and second highest points, can be selected as the first area, and the mean of these point positions taken as the position of the pupil point. Since the heat map reflects regions of pixels (for example, the point with the highest probability value corresponds to a set of multiple pixels) rather than a single pixel, and what argmax solves for also involves sets of points (for example, the set of highest points and the set of second highest points), the mean is used to solve for the position of the pupil point; moreover, calculating by means of the mean offers better robustness.
Optionally, weights can also be introduced when calculating the mean. For example, in some possible implementations, when the pupil point position is calculated as a mean, weighting can also be performed according to the probability values: the higher the probability value, the greater the weight.
In some other possible implementations, the mean calculation can also be weighted according to point position: for example, when calculating the position mean, each second highest point is weighted according to its distance from the highest point, with farther points receiving lower weights.
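The probability-weighted mean over the first area described above can be sketched as follows; the representation of the heat map as a nested list and the particular threshold are illustrative assumptions, with the "first region" taken to be all heat-map pixels whose probability exceeds the first threshold:

```python
def pupil_2d(heatmap, first_threshold):
    # Probability-weighted mean position over pixels above the first threshold.
    pts = [(x, y, p)
           for y, row in enumerate(heatmap)
           for x, p in enumerate(row)
           if p > first_threshold]
    w = sum(p for _, _, p in pts)
    u = sum(x * p for x, _, p in pts) / w
    v = sum(y * p for _, y, p in pts) / w
    return (u, v)

heatmap = [
    [0.0, 0.1, 0.0],
    [0.1, 0.9, 0.8],
    [0.0, 0.1, 0.0],
]
u, v = pupil_2d(heatmap, first_threshold=0.5)  # only 0.9 and 0.8 contribute
```

With uniform weights instead of `p`, this reduces to the plain mean of the first-area positions; the distance-based weighting mentioned above would scale each term by a decreasing function of the distance to the highest point.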
As a possible implementation of the first aspect, the first threshold is the second highest probability value in the heat map.
Accordingly, the first threshold range can be selected as needed; using the second highest value as the first threshold ensures the accuracy of the pupil point calculation with a small amount of data. In other possible implementations, a larger or smaller first threshold can also be chosen, so that the first area corresponding to the first threshold contains fewer or more points. For example, one possible implementation sets a lower first threshold, so that the next highest point below the second highest also falls within the first area.
As a possible implementation of the first aspect, determining the depth value of the pupil point from the depth values of the pixels of the second area of the image includes: determining the depth value of the pupil point according to the mean of the depth values of the pixels of the second area in the image.
Accordingly, when calculating the depth value of the pupil point, using the mean depth within the second area as the pupil point depth increases robustness. Specifically, existing depth solving often operates on the whole image, and is not accurate enough where texture features are not obvious. Using the heat map to guide the depth solving makes it possible to focus only on the heat map region. Moreover, when the depth values come from a binocular image acquisition device, the guidance of the heat map can facilitate the matching of the pupil positions in the two images, mitigating the impact of the weak texture features at the pupil position; using the regional mean instead of the pupil center value improves the accuracy and stability of the depth estimation.
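A minimal sketch of the depth step described above, assuming a per-pixel depth map is already available (for example, from a depth camera or a stereo pipeline); the heat map, depth values and threshold are made up for the example, with the second region being every pixel whose heat-map probability exceeds the second (lower) threshold:

```python
def pupil_depth(heatmap, depth_map, second_threshold):
    # Average the depth over all pixels in the second region.
    depths = [depth_map[y][x]
              for y, row in enumerate(heatmap)
              for x, p in enumerate(row)
              if p > second_threshold]
    return sum(depths) / len(depths)

heatmap = [
    [0.0, 0.3, 0.0],
    [0.3, 0.9, 0.8],
    [0.0, 0.2, 0.0],
]
depth_map = [  # illustrative per-pixel depths in millimetres
    [655.0, 602.0, 640.0],
    [601.0, 598.0, 599.0],
    [660.0, 603.0, 650.0],
]
z = pupil_depth(heatmap, depth_map, second_threshold=0.25)
```

Averaging over the wider second region, rather than reading a single pixel at the pupil centre, is what gives the step its robustness to per-pixel depth noise.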
As a possible implementation of the first aspect, the image includes a first image and a second image, where the first image and the second image are two images taken from different viewpoints; correspondingly, the heat map includes a first heat map corresponding to the first image and a second heat map corresponding to the second image. Determining the position of the pupil point in the image according to the first area of the heat map specifically includes: determining the position of the pupil point in the first image according to the first area of the first heat map. Determining the second area in the image specifically includes: determining the second area in the first image, and determining the second area in the second image. The depth value of the pupil point being determined from the pixel depth values of the second area in the image specifically includes: the depth value of the pupil point is determined from the depth values of the pixels of the second area in the first image, and the depth values of the pixels of the second area in the first image are determined from the disparity between the image of the second area in the first image and the image of the second area in the second image.
When a binocular camera is used, the heat map serves two purposes at once: on the one hand, it enables accurate prediction of the pupil point coordinates; on the other hand, it guides the binocular depth estimation, so that only the area matching the heat map, i.e. the second area, needs to be searched, greatly reducing the amount of computation. In addition, it can to a certain extent mitigate the drawbacks that the texture features of the pupil area are not obvious and image similarity matching is difficult, reducing the estimation difficulty, requiring little computing power, and improving the estimation accuracy. When applied to a vehicle, the two cameras used may be two cameras installed on the two A-pillars of the vehicle, or a binocular camera in front of the cab facing the interior of the vehicle. The binocular camera may be two cameras integrated into one image acquisition device, or may be formed by two independent cameras installed at fixed positions.
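For the binocular case, the disparity-to-depth conversion for rectified images follows the standard relation Z = f·B/d. The sketch below assumes the matcher has already found the corresponding pupil column in the second image by searching only within the second area; the focal length, baseline and pixel columns are illustrative values, not taken from the application:

```python
def disparity_to_depth(x_left, x_right, focal_px, baseline_mm):
    # Rectified stereo: depth Z = f * B / d, with disparity d = x_left - x_right.
    d = x_left - x_right
    return focal_px * baseline_mm / d

# Illustrative values: 1000 px focal length, 60 mm baseline, 10 px disparity.
z = disparity_to_depth(x_left=500.0, x_right=490.0, focal_px=1000.0, baseline_mm=60.0)
```

Because depth is inversely proportional to disparity, small matching errors at weakly textured pupil pixels cause large depth errors, which is why restricting the match search to the heat-map-guided second area and averaging over it helps.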
As a possible implementation of the first aspect, the method further includes: performing image correction on the first image and the second image, the image correction including at least one of: image de-distortion, image position adjustment, and image cropping.
The image correction here includes image de-distortion, position adjustment, cropping and the like, yielding binocular images in which the corresponding left and right epipolar lines are parallel, so as to reduce the difficulty of the subsequent image disparity calculation.
As a possible implementation of the first aspect, acquiring the heat map corresponding to the image includes: acquiring a human eye image from the image, and acquiring the heat map from the human eye image.
In this way, the pupil heat map can be generated only for the human eye portion of the image, i.e. the image is first cropped, which reduces the amount of data processing for heat map generation.
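When training a network to output such heat maps for cropped eye images, the ground-truth heat map is commonly rendered as a 2D Gaussian centred on the labelled pupil point; this is a common convention in keypoint regression, assumed here for illustration rather than stated in the application, and the crop size and sigma are example values:

```python
import math

def gaussian_heatmap(w, h, cx, cy, sigma):
    # Probability peaks at the labelled pupil point and decays with distance.
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
             for x in range(w)]
            for y in range(h)]

# Target heat map for a 5x5 eye crop with the pupil labelled at (2, 2).
hm = gaussian_heatmap(w=5, h=5, cx=2, cy=2, sigma=1.0)
```

The network is then trained to regress this map, and at inference the thresholding described above recovers the first and second areas from the predicted map.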
The second aspect of the present application provides an apparatus for determining the pupil position, including: an acquisition module, configured to acquire an image including the pupil; and a processing module, configured to acquire a heat map corresponding to the image, where the heat map is used to represent the probability distribution of the pupil point in the image, and the pupil point is the center point of the pupil. The processing module is further configured to determine the position of the pupil point in the image according to a first area of the heat map, where the probability values corresponding to the pixels in the first area of the heat map are greater than a first threshold. The processing module is further configured to determine a second area in the image, where the probability values corresponding to the pixels of the second area in the heat map are greater than a second threshold, and the second threshold is less than or equal to the first threshold. The processing module is further configured to determine the three-dimensional position of the pupil point according to the two-dimensional position of the pupil point and the depth value of the pupil point, where the two-dimensional position of the pupil point refers to the position of the pupil point in the image, and the depth value of the pupil point is determined from the depth values of the pixels of the second area in the image.
As a possible implementation of the second aspect, when determining the position of the pupil point in the image according to the first area of the heat map, the processing module is specifically configured to: determine the position of the pupil point in the image according to the center position of the first area of the heat map.
As a possible implementation of the second aspect, the first threshold is the second highest probability value in the heat map.
As a possible implementation of the second aspect, determining the depth value of the pupil point from the depth values of the pixels of the second area of the image includes: determining the depth value of the pupil point according to the mean of the depth values of the pixels of the second area in the image.
As a possible implementation of the second aspect, the image includes a first image and a second image taken from different viewpoints, and the heat map includes a first heat map corresponding to the first image and a second heat map corresponding to the second image. When determining the position of the pupil point in the image according to the first area of the heat map, the processing module is specifically configured to: determine the position of the pupil point in the first image according to the first area of the first heat map. When determining the second area in the image, the processing module is specifically configured to: determine the second area in the first image, and determine the second area in the second image. The depth value of the pupil point is determined from the pixel depth values of the second area in the image, including: the depth value of the pupil point is determined from the depth values of the pixels of the second area in the first image, and the depth values of the pixels of the second area in the first image are determined from the disparity between the image of the second area in the first image and the image of the second area in the second image.
As a possible implementation of the second aspect, the processing module is further configured to: perform image correction on the first image and the second image, the image correction including at least one of: image de-distortion, image position adjustment, and image cropping.
As a possible implementation of the second aspect, when acquiring the heat map corresponding to the image, the processing module is specifically configured to: acquire a human eye image from the image, and acquire the heat map from the human eye image.
The third aspect of the present application provides an electronic device, including: a processor, and a memory storing program instructions which, when executed by the processor, cause the processor to execute the method for determining the pupil position of any one of the first aspect above.
The fourth aspect of the present application provides an electronic device, including: a processor, and an interface circuit, where the processor accesses a memory through the interface circuit, the memory storing program instructions which, when executed by the processor, cause the processor to execute the method for determining the pupil position of any one of the first aspect above.
The fifth aspect of the present application provides a system for determining the pupil position, which includes an image acquisition device and the electronic device provided in the third or fourth aspect above, coupled to the image acquisition device.
As a possible implementation of the fifth aspect, the system may be a vehicle-mounted device or a vehicle. The image acquisition device may be a binocular camera installed in the vehicle, where the binocular camera may be realized by two independent cameras working together, or by one camera device integrating two cameras. The image acquisition device may also be a monocular camera capable of collecting both depth information and image information.
The sixth aspect of the present application provides a computer-readable storage medium on which program instructions are stored; when executed by a computer, the program instructions cause the computer to execute the method for determining the pupil position of any one of the first aspect above.
The seventh aspect of the present application provides a computer program product, which includes program instructions; when executed by a computer, the program instructions cause the computer to execute the method for determining the pupil position of any one of the first aspect above.
In summary, the above solution for determining the pupil position determines the two-dimensional position of the pupil point by using the first area of a heat map representing the pupil point probability, determines the depth information of the pupil point based on the second area of the heat map, and can thereby determine the three-dimensional position of the pupil point. Because the heat map is used, compared with conventionally obtaining the pupil point position directly from the image, a more accurate pupil point position can be determined even when the texture features of the iris and pupil are not obvious. Thus, the possible implementations of the present application are robust to scenes that affect the iris and pupil texture, such as occlusion, illumination and large eyeball poses, and the pupil positioning is more accurate. Furthermore, calculating the pupil point depth value using the average depth value of the second area, which includes the first area, also solves the problem that the binocularly estimated depth of the pupil position is not accurate enough, and the mean-based calculation offers better robustness. In addition, when two images are collected with a binocular camera, the depth value is calculated based on the second area of the heat map, so the depth of the whole image does not need to be calculated, which reduces the amount of computation and increases the calculation speed.
Brief Description of the Drawings
Fig. 1a is a schematic architectural diagram of an application scenario of the method for determining the pupil position provided by an embodiment of the present application;
Fig. 1b is a first schematic diagram of an application scenario of the method for determining the pupil position provided by an embodiment of the present application;
Fig. 1c is a second schematic diagram of an application scenario of the method for determining the pupil position provided by an embodiment of the present application;
Fig. 2 is a schematic flowchart of the method for determining the pupil position provided by an embodiment of the present application;
Fig. 3 is a schematic flowchart of a specific implementation of the method for determining the pupil position provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of a heat map of the pupil of a human eye in a specific implementation of the present application;
Fig. 5 is a schematic diagram of the device for determining the pupil position provided by an embodiment of the present application;
Fig. 6 is a schematic diagram of an electronic device provided by an embodiment of the present application;
Fig. 7 is a schematic diagram of another electronic device provided by an embodiment of the present application;
Fig. 8 is a schematic diagram of the system for determining the pupil position provided by an embodiment of the present application.
It should be understood that in the above structural diagrams, the size and shape of each block are for reference only and should not be read as an exclusive interpretation of the embodiments of the present application. The relative positions and containment relationships among the blocks only schematically represent the structural associations among them, rather than limiting the physical connection of the embodiments of the present application.
具体实施方式Detailed ways
下面结合附图并举实施例,对本申请提供的技术方案作进一步说明。应理解,本申请实施例中提供的***结构和业务场景主要是为了说明本申请的技术方案的可能的实施方式,不应被解读为对本申请的技术方案的唯一限定。本领域普通技术人员可知,随着***结构的演进和新业务场景的出现,本申请提供的技术方案对类似技术问题同样适用。The technical solutions provided by the present application will be further described below in conjunction with the accompanying drawings and examples. It should be understood that the system structure and business scenarios provided in the embodiments of the present application are mainly for illustrating possible implementations of the technical solution of the present application, and should not be interpreted as the only limitation on the technical solution of the present application. Those skilled in the art know that, with the evolution of the system structure and the emergence of new business scenarios, the technical solutions provided in this application are also applicable to similar technical problems.
It should be understood that the pupil position determination solutions provided in the embodiments of the present application include a pupil position determination method, apparatus, and system, a computer-readable storage medium, and a computer program product. Since these technical solutions solve problems on the same or similar principles, some repeated content may be omitted from the description of the following specific embodiments; these embodiments should be regarded as cross-referencing one another and may be combined with one another.
When determining a pupil point position from a 2D image, one possible approach is as follows: first obtain the 2D coordinates of the pupil point in the 2D image, then obtain the depth information of the entire image through conventional binocular estimation, then look up the computed depth corresponding to the pupil point's 2D coordinates, and finally obtain the 3D coordinates of the pupil point in the camera coordinate system through a camera imaging algorithm. With this technique, due to factors such as camera imaging quality, illumination, large-angle eyeball rotation, and squinting, the iris features in the image may not be distinct enough; conventional image processing algorithms are not robust to these scenarios (they are prone to false detections), so detection of the pupil point's 2D coordinates easily fails in such cases. When solving image depth information with a binocular setup, the human iris and pupil have no obvious boundary and mostly appear as a dark mass, so the texture features of the pupil region are weak, making it difficult for a binocular matching algorithm to accurately compute the depth of the pupil region. Moreover, depth is often solved binocularly for the entire image at once, which suffers from computational redundancy, slow computation, and low efficiency.
For detection of the 2D coordinates of the pupil point, a deep-learning-based method can be used to locate the 2D pupil point: for example, first perform face detection, then eye-region detection, and finally pupil feature detection. In the pupil detection stage, an hourglass network is first used to extract pupil features, four edge points of the iris are then located, a circle is fitted through these four points, and finally the center of this circle is taken as the pupil point coordinates. However, under occlusion, varying illumination, large eyeball poses, and similar conditions, the four iris edge points are difficult to detect accurately; although a four-point circle fit is used, the fit is poor when individual points deviate significantly, so the resulting 2D pupil coordinates are not accurate enough.
An embodiment of the present application provides an improved pupil position determination solution. First, a heat map of the pupil point of the human eye is predicted by a deep neural network; then, guided by the heat map, depth information is searched and matched within the heat map region to solve for the depth of the pupil area; the average depth of the pupil area is then taken as the depth of the pupil point; the 2D coordinates of the pupil point are solved from the pupil-point heat map; and finally, the 3D coordinates of the pupil point in the camera coordinate system are solved through the pinhole imaging principle. With the technical solution of the present application, 2D pupil localization remains relatively accurate under occlusion, varying illumination, large eyeball poses, and the like, so robustness is better. The solution also mitigates the effect of weak texture features in the iris and pupil regions, addresses the insufficient accuracy of binocular depth estimation at the pupil position, and solves the low efficiency and slow speed of whole-image search and matching in the binocular matching process.
The pupil position determination solution provided in the embodiments of the present application can be applied to fields such as gaze tracking, eye movement recognition, iris detection, and human-computer interaction. For example, when gaze tracking or eye movement recognition is applied to the smart cockpit of a vehicle, the driver's pupil position can be used to predict sitting height so that the seat is automatically adjusted to the most comfortable height, or the driver can be monitored for distraction based on the 3D pupil position. The solution can also be applied to monitoring whether participants in a video conference are distracted, monitoring whether students attending online classes in front of a screen are paying attention, assessing user attention, and supporting psychological research with big data obtained from gaze tracking or eye movement recognition of specific users.
Referring to Fig. 1a, Fig. 1b, and Fig. 1c, the following describes a scenario in which the pupil position determination solution provided by an embodiment of the present application is applied to a vehicle. When applied to a vehicle 10, an embodiment of the pupil position determination apparatus may include an image acquisition device 11 and a processor 12.
The image acquisition device 11 is configured to acquire an image of a user that includes the pupil. In this embodiment, the image acquisition device 11 is a camera, which may be a binocular camera or a camera that captures three-primary-color images and depth (Red-Green-Blue-Depth, RGB-D), and the camera may be mounted on the vehicle as required. In this embodiment, as shown in Fig. 1b and Fig. 1c, a binocular camera composed of two independent cameras is used: a first camera 111 and a second camera 112 arranged on the left and right A-pillars of the vehicle cockpit. In other examples, the camera may also be mounted on the user-facing side of the rearview mirror in the vehicle cockpit, on the steering wheel or in the area near the center console, or above the display screen behind a seat, mainly to capture facial images of the driver or passengers in the vehicle cockpit.
In some other embodiments, the acquisition module 11 may also be an electronic device that receives user image data transmitted by a camera, such as a data transmission chip, for example a bus data transceiver chip or a network interface chip; the data transmission chip may also be a wireless transmission chip, such as a Bluetooth chip or a Wi-Fi chip. In other embodiments, the acquisition module 11 may also be integrated into the processor, becoming an interface circuit or a data transmission module integrated in the processor.
The processor 12 is configured to: generate, from the image, a heat map representing the distribution probability of the pupil point; determine the position of the pupil point in the image according to a first area of the heat map; determine a second area in the image; and determine the three-dimensional position of the pupil point according to the two-dimensional position of the pupil point and the depth value of the pupil point, where the two-dimensional position of the pupil point refers to the position of the pupil point in the image, and the depth value of the pupil point is determined from the depth values of the pixels in the second area of the image. The processor 12 may be an electronic device, specifically a processor of an in-vehicle processing apparatus such as a head unit or an on-board computer, a conventional chip processor such as a central processing unit (CPU) or a microcontroller (micro control unit, MCU), or terminal hardware such as a mobile phone or a tablet.
Through the above structure, the three-dimensional position of the pupil point can be obtained based on the heat map and then further applied to fields such as gaze tracking, eye movement recognition, iris detection, and human-computer interaction. Since the solution does not rely directly on the image itself, it is largely unaffected by the weak texture features of the iris and pupil regions, and it can also address the insufficient accuracy of binocular estimation of the depth of the pupil position.
Based on the application scenarios shown in Fig. 1a to Fig. 1c, Fig. 2 shows a flowchart of the pupil position determination method provided by an embodiment of the present application. The method of this embodiment may be executed by the pupil position determination apparatus or by some components of the apparatus, for example by a vehicle, an in-vehicle device, or a processor. Taking the processor as an example, the pupil position determination method provided by this embodiment of the present application is described below, and includes the following steps:
S10: The processor on the vehicle acquires, through an interface circuit, an image including the pupil captured by the image acquisition device.
In some embodiments, the image acquisition device may be a binocular camera, which may be implemented by two independent cameras working together or by a single camera device integrating two cameras. For example, in the vehicle application scenarios shown in Fig. 1a to Fig. 1c, the image acquisition device may be a binocular camera composed of two cameras mounted on the two A-pillars of the vehicle, a binocular camera in front of the cab facing the interior, or the camera of a smart device integrating dual cameras and capable of taking pictures (such as a mobile phone or a tablet). Two images with parallax can be captured by the two cameras.
In some other embodiments, the camera may be an RGB-D camera, such as a monocular depth camera. In addition to the RGB three-primary-color image, it also captures the depth information corresponding to each pixel of the RGB image.
In this embodiment, the captured image may be an image of the driver or of a passenger. The image needs to include the pupil so that the pupil point in the image can subsequently be identified. It should be understood that, when an image captured by the camera does not include the pupil, or does not include a complete pupil, it can be identified and filtered out by an image recognition model. The image recognition model can be used to recognize the human eye in the captured image and may be implemented by a deep neural network, such as an hourglass network, HRNet, U-Net, FCN, the segmentation networks provided by Deeplab, or an EspNet network. When the confidence of the human eye image recognized by the image recognition model is lower than a set threshold, the human eye in the image cannot be recognized, and the image can be filtered out.
In addition, when the quality of an image captured by the camera is poor and the image therefore cannot be processed, for example when the image is blurred due to head movement of the driver or passenger or due to insufficient light, images of poor quality can also be filtered out and only images of good quality retained.
In addition, noise reduction can be performed on the image, for example removing random, discrete, isolated pixels, to meet the requirements of subsequent image processing.
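As a minimal, hedged sketch of the isolated-pixel removal mentioned above (the function name and the tiny intensity grid are invented for illustration; the embodiment does not prescribe a specific filter), a pixel can be zeroed when none of its 8-neighbours is above a noise threshold:

```python
def remove_isolated_pixels(img, threshold=0):
    """Zero out pixels with no above-threshold 8-neighbour (random,
    discrete, isolated noise points); keep everything else unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if img[y][x] <= threshold:
                continue
            has_neighbour = any(
                img[ny][nx] > threshold
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x))
            if not has_neighbour:
                out[y][x] = 0
    return out

img = [[0, 0, 0, 0],
       [0, 9, 0, 0],   # the 9 is isolated noise
       [0, 0, 0, 7],
       [0, 0, 0, 7]]   # the two 7s support each other and survive
print(remove_isolated_pixels(img))
```

Real pipelines would typically use a median or morphological filter instead; this sketch only shows the idea of discarding isolated points.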
In some other embodiments, the image including the pupil may also be received through data transmission, which may be implemented through an independent communication interface or through a communication interface integrated in the processor. For example, the independent communication interface may be a wired interface chip, such as a serial data bus interface chip, a parallel data bus interface chip, or a network cable interface chip, or a wireless transmission chip, such as a Bluetooth chip or a Wi-Fi chip. When the communication interface is integrated in the processor, it may be an interface circuit integrated in the processor, or a wireless transmission module integrated in the processor.
S20: The processor on the vehicle acquires a heat map corresponding to the image including the pupil, where the heat map is used to represent the probability distribution of the pupil point in the image, and the pupil point is the center point of the pupil.
In this embodiment of the present application, the heat map represents the probability distribution of the pupil point, and brightness may be used in the heat map to indicate the probability value: the brighter a position in the map, the higher the probability that it is the pupil point. Since the heat map represents a probability distribution, when identifying the pupil point, regressing the heat map is easier than directly regressing the pupil point coordinates and is more robust across different scenarios.
A deep neural network may be used to generate the heat map from the image, for example an hourglass network, a high-resolution network (High-Resolution Net, HRNet), a U-shaped network (U-Net), FCN, the segmentation networks provided by Deeplab, or an EspNet network (a lightweight convolutional neural network). In this example, a segmentation network can be used to generate the heat map corresponding to the image. It should be noted that each deep neural network mentioned in the embodiments of the present application refers to a trained network; this will not be repeated below. During training, human eye images with pupil-point heat map label data can be used as samples, and the loss function can be a mean square error (MSE) loss function or another loss function. The trained network has the function of generating a heat map of the pupil from an image.
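To make the heat-map label data concrete, the following is a small illustrative sketch (the Gaussian shape, grid size, and sigma are assumptions for the example; the embodiment does not specify how labels are built) of a heat map centered on a labeled pupil point, of the kind a network could be trained to regress with an MSE loss:

```python
import math

def gaussian_heatmap(width, height, cx, cy, sigma):
    """Build a heat map whose value at each pixel is the (unnormalized)
    Gaussian probability that the pixel is the pupil point (cx, cy)."""
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
             for x in range(width)]
            for y in range(height)]

hm = gaussian_heatmap(8, 6, 5, 2, 1.5)
# The peak of the map sits exactly at the labeled pupil point (5, 2).
peak = max((v, x, y) for y, row in enumerate(hm) for x, v in enumerate(row))
print(peak[1], peak[2])  # 5 2
```

The probability decays smoothly away from the pupil point, which is what makes heat-map regression more forgiving than regressing a single coordinate pair.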
S30: The processor on the vehicle determines the position of the pupil point in the image according to the first area of the heat map, where the probability values corresponding to the pixels in the first area of the heat map are greater than a first threshold.
In some embodiments, when determining the position of the pupil point in the image, the position may be determined according to the center position of the first area of the heat map.
In some embodiments, an argmax function can be applied to the heat map of the eye region to solve for the positions, in the eye-region image, of the points with the highest and second-highest probability values. When the first threshold is the second-highest probability value in the heat map, the first area is the area formed by the points with the highest and second-highest probability values. When calculating the pupil point position, the mean of the positions of the highest and second-highest points within the area can be taken as the predicted pupil point position; for example, the predicted pupil point position may be: (∑Pai+∑Pbj)/(I+J), where Pai denotes the position of each highest point Pa, Pbj denotes the position of each second-highest point Pb, I denotes the number of Pa points, J denotes the number of Pb points, i∈I, and j∈J. In general, I is small, and may even be 1 (meaning there is only one highest point Pa); this will not be repeated below.
It is easy to understand that, in some embodiments, a larger or smaller first threshold may also be selected so that the first area contains fewer or more points; for example, a smaller first threshold may be selected so that the third-highest points, below the second-highest points, are also included in the calculation.
In addition, in some embodiments, when the pupil point position is calculated by the above mean, the points can also be weighted by probability value, with higher probability values receiving larger weights. For example, the predicted pupil point position may be: (e1∑Pai+e2∑Pbj+e3∑Pck)/(e1*I+e2*J+e3*K), where Pai denotes the position of each highest point Pa, Pbj denotes the position of each second-highest point Pb, Pck denotes the position of each third-highest point Pc, I denotes the number of Pa points, J denotes the number of Pb points, K denotes the number of Pc points, i∈I, j∈J, k∈K, and e1, e2, e3 are weights with e1>e2>e3.
In some other embodiments, the points can also be weighted by position: for example, when calculating the position mean, the position of each second-highest point is obtained and weighted according to its distance from the highest point, with farther points receiving lower weights.
It is explained here that the reason for calculating the pupil point with a mean is that the heat map reflects regions of pixels (for example, the highest probability value corresponds to a set of multiple pixels rather than a single pixel), and what argmax solves for likewise involves sets of points (for example, the set of highest-probability points and the set of second-highest points); therefore the mean is used to solve for the pupil point position. In addition, calculating with a mean provides better robustness.
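The unweighted mean described above can be sketched as follows (a hedged illustration: the function name and the tiny heat map are invented for the example, and the weighted variants would simply scale each position by its weight before averaging):

```python
def pupil_point_from_heatmap(hm):
    """Predict the pupil point as the mean position of the pixels whose
    probability value is the highest or second-highest in the heat map,
    i.e. the first threshold is the second-highest probability value."""
    values = sorted({v for row in hm for v in row}, reverse=True)
    threshold = values[1] if len(values) > 1 else values[0]
    pts = [(x, y) for y, row in enumerate(hm) for x, v in enumerate(row)
           if v >= threshold]
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

hm = [[0.0, 0.1, 0.0],
      [0.1, 0.9, 0.8],
      [0.0, 0.8, 0.0]]
# Pixels kept: (1, 1) with 0.9 and (2, 1), (1, 2) with 0.8;
# their mean position is (4/3, 4/3).
print(pupil_point_from_heatmap(hm))
```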
S40: The processor on the vehicle determines the second area in the image, where the probability values corresponding to the pixels of the second area in the heat map are greater than a second threshold, and the second threshold is less than or equal to the first threshold.
The second area of the image is determined for the depth-value calculation in the later steps. In this way, only the local second area is used for the depth calculation, which reduces the amount of computation compared with computing over the entire image.
S50: The processor on the vehicle determines the three-dimensional position of the pupil point according to the two-dimensional position of the pupil point and the depth value of the pupil point, where the two-dimensional position of the pupil point refers to the position of the pupil point in the image, and the depth value of the pupil point is determined from the depth values of the pixels in the second area of the image.
In this step, using the intrinsic parameters of the camera and the pinhole imaging principle, the three-dimensional position of the pupil point in the camera coordinate system (that is, its 3D coordinates) can be solved from the two-dimensional position and depth value of the pupil point.
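A minimal sketch of the pinhole back-projection this step refers to (the intrinsic values fx, fy, cx, cy below are assumed purely for illustration): given the pupil point's pixel position (u, v) and its depth Z, the camera-frame coordinates follow from the standard pinhole model X = (u − cx)·Z/fx, Y = (v − cy)·Z/fy:

```python
def backproject(u, v, z, fx, fy, cx, cy):
    """Pinhole model: recover camera-frame 3D coordinates from a pixel
    position (u, v) and its depth z, given intrinsics fx, fy, cx, cy."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)

# Assumed intrinsics (illustrative only): focal lengths and principal point.
fx = fy = 800.0
cx, cy = 320.0, 240.0
print(backproject(400.0, 260.0, 0.5, fx, fy, cx, cy))  # (0.05, 0.0125, 0.5)
```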
In some embodiments, the mean of the depth values of the pixels in the second area of the image is taken as the depth value of the pupil point. Using the mean depth of the pixels within the second area of the heat map as the pupil point depth value provides better robustness.
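This averaging can be sketched as follows (a hedged example: the second threshold, the small heat map, and the depth values in metres are all invented for illustration):

```python
def pupil_depth(heatmap, depth_map, second_threshold):
    """Average the depth values of the pixels whose heat-map probability
    exceeds the second threshold, i.e. the pixels of the second area."""
    depths = [depth_map[y][x]
              for y, row in enumerate(heatmap)
              for x, p in enumerate(row)
              if p > second_threshold]
    return sum(depths) / len(depths)

heatmap = [[0.1, 0.6, 0.2],
           [0.7, 0.9, 0.3]]
depth_map = [[0.80, 0.52, 0.81],   # metres; values are illustrative
             [0.50, 0.48, 0.79]]
# Three pixels exceed the threshold 0.5; their depth mean is about 0.5 m.
print(pupil_depth(heatmap, depth_map, 0.5))
```

Averaging over the area rather than reading a single pixel makes the depth estimate less sensitive to per-pixel matching noise.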
In some embodiments, the camera is an RGB-D camera, for example a monocular depth camera, which captures the depth information corresponding to each pixel of the image. In that case, after the second area is determined in step S40, the depth values of the pixels in this area are directly averaged to obtain the pupil point depth value.
In some embodiments, when the two images captured by two cameras have parallax, the depth can be calculated based on the parallax. Specifically, the two images captured from different viewpoints by the two cameras are a first image and a second image, respectively; correspondingly, the heat map includes a first heat map corresponding to the first image and a second heat map corresponding to the second image.
On the one hand, the position of the pupil point in the first image is determined according to the first area of the first heat map.
On the other hand, the second area in the first image and the second area in the second image are determined; the depth values of the pixels of the second area in the first image can then be determined from the parallax between the image of the second area in the first image and the image of the second area in the second image.
Then, the depth value of the pupil point is determined from the depth values of the pixels of the second area in the first image.
According to the parallax-depth relationship between the two images, that is, according to the following formula (1), the depth Z of a point in the captured space with three-dimensional coordinates P(x, y, z) can be obtained:
Z = f·Tx/d      (1)
where d denotes the parallax, d = (XL − XR), with XL and XR denoting the imaging coordinates of the point on the image planes of the cameras at the two different positions, f denotes the camera focal length, and Tx denotes the baseline (the distance between the optical axes of the two cameras).
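A small numeric sketch of formula (1) (the focal length, baseline, and pixel coordinates below are assumed for illustration; disparity and focal length are in pixels and the baseline in metres, so the depth comes out in metres):

```python
def depth_from_disparity(f_pixels, baseline_m, xl, xr):
    """Formula (1): Z = f * Tx / d, where d = XL - XR is the disparity
    between the matched pixel columns in the left and right images."""
    d = xl - xr
    return f_pixels * baseline_m / d

# Assumed: 800 px focal length, 0.6 m baseline, matched columns 900 and 100.
z = depth_from_disparity(800.0, 0.6, 900.0, 100.0)
print(z)  # 0.6
```

Note the inverse relationship: halving the disparity doubles the recovered depth, which is why small matching errors matter most for distant points.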
When a binocular camera is used, the heat map serves as a guide for binocular depth estimation: search and matching are performed only in the location area matching the heat map, that is, the second area, which greatly reduces the amount of computation. In addition, this also alleviates, to a certain extent, the defect that the texture features of the pupil area are weak and image similarity matching is difficult.
In addition, when two cameras are used, the method may further include performing image correction on the captured first and second images, including one or more of the following operations: image de-distortion, image position adjustment, image cropping, and other image processing operations. Through image correction, a binocular image pair whose left and right corresponding epipolar lines are parallel is obtained. In some other embodiments, when one camera is used, the above image correction step is also included.
In some embodiments, generating the heat map of the pupil from the image in step S20 includes: extracting a face image from the image including the face and pupil; identifying a human eye image in the face image; and generating the heat map of the pupil from the human eye image. In this way, heat map generation focuses on the human eye image, that is, the image is cropped first, which reduces the amount of data processed when generating the heat map.
When extracting the face image or identifying the human eye image, separate deep neural networks may be used for each task, or a single deep neural network may be used to identify the human eye image directly from the captured image. The deep neural networks used may be, for example, an hourglass network, HRNet, U-Net, FCN, the segmentation networks provided by Deeplab, or an EspNet network.
The following describes a first specific implementation of the pupil position determination method provided by an embodiment of the present application. This implementation is still described taking an in-vehicle scenario as an example. In this scenario, two cameras of the same model are placed on the A-pillars of the vehicle and installed as horizontally parallel as possible, so that the horizontal epipolar lines of the two images captured by the two cameras are as parallel as possible.
The two cameras may both be RGB cameras or both be IR cameras; in this specific implementation they are two infrared cameras. Photographing by the two cameras is synchronized so that they capture images of the driver simultaneously, avoiding image matching errors caused by time differences. The two cameras may be used to perform the following step S210 of this implementation.
The processing of the captured images (that is, the process of estimating the 3D coordinates of the pupil from the captured images), for example the following steps S215-S265, may be executed by the vehicle, an in-vehicle device, or a processing device (such as a processor or a processing chip). For example, it may be executed by an electronic control unit (ECU) of the vehicle. In some other specific implementations, it may also be executed by a smart device communicating with the ECU (such as a mobile phone or PAD) or by a cloud server; in this case, the ECU may transmit the image data captured by the cameras to the smart device or cloud server. In yet other specific implementations, it may be completed by the ECU in cooperation with the smart device or cloud server; in this case, some steps may be executed by the ECU and some by the smart device or cloud server. For example, steps S225-S235 below involve the use of deep neural networks and may be executed by a smart device or cloud server with higher computing power, while steps S215-S220 and S240-S265 below are, by way of example, executed by the ECU. It is easy to understand that the allocation of steps is not limited to the above manner.
Referring to the flowchart shown in Figure 3, the first specific implementation of the pupil position determination method provided by the embodiments of this application includes the following steps.
S210: The two cameras capture images synchronously, yielding a left image and a right image taken from different viewing angles, i.e., an image pair.
S215-S220: Perform image rectification on the left and right images of the obtained image pair.
Because the optical axes of the two cameras cannot be guaranteed to be strictly parallel, and mounting position, angle, and other factors introduce some deviation, the captured images must be rectified. Rectification uses the pre-calibrated intrinsic parameters of each camera and the extrinsic parameters between the two cameras to correct the captured left-right image pair, including de-distortion, position adjustment, and cropping, producing a binocular image pair whose corresponding horizontal epipolar lines are parallel. The intrinsics include the principal point and distortion vector of each camera; the extrinsics include the rotation matrix and translation matrix between the cameras. Intrinsics and extrinsics are determined during camera calibration. The rotation and translation matrices may also be referred to as a homography.
S225: After the rectified left and right images are obtained, perform face detection on each of the two images. If a face is detected, extract the face image and proceed to the next step; otherwise, end the current pass and return to step S210 for the next pass.
Face detection and face image extraction may be implemented with a deep neural network, for example a convolutional neural network (CNN), a Region Proposal Network (RPN), a Fully Convolutional Network (FCN), or Regions with CNN features (R-CNN).
S230: Run a human-eye detection algorithm on the extracted face image. If eyes are detected, proceed to the next step; otherwise, end the current pass and return to step S210 for the next pass.
Eye detection may be performed by classical image algorithms based on geometric and grayscale features of the eye, or by a deep neural network such as a CNN, an RPN, or an FCN.
S235: When eyes are detected, use a segmentation network to predict a heat map (heatmap) of the pupil point over the eye region, obtaining a pupil heat map. Figure 4 shows a schematic of this heat map.
Because the eye region of the image is small and carries little data, the segmentation network can predict the heat map efficiently. The heat map represents the probability distribution of the pupil point, which makes it more robust to occlusion, illumination changes, and large eyeball poses; this mitigates, to some extent, the inaccurate recognition caused by insufficiently robust 2D pupil localization.
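For illustration only (an assumption for exposition; the patent does not specify how the heat map is produced or supervised), a pupil-point heat map of this kind is commonly rendered as a 2D Gaussian centered on the pupil location, which a segmentation network then learns to regress:

```python
import numpy as np

def gaussian_heatmap(h, w, cx, cy, sigma=2.0):
    """Render an (h, w) heat map as a 2D Gaussian centered on the
    pupil point (cx, cy); the peak value is 1.0 at the pupil."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

hm = gaussian_heatmap(32, 48, cx=20, cy=15)
peak_y, peak_x = np.unravel_index(np.argmax(hm), hm.shape)
# The heat map peak coincides with the pupil point: (peak_y, peak_x) == (15, 20)
```

Because the target is a smooth distribution rather than a single hard coordinate, partial occlusion or blur degrades the prediction gracefully instead of causing a discrete miss.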
S240: After the pupil heat map of the eye region is obtained, on the one hand, for one of the face images (the image rectified in step S220), apply the argmax function to the eye-region heat map to obtain the probability values, take the value of the second-highest-probability point as a first threshold, determine the positions in the eye-region image of the points with the highest and second-highest probability values, and use the mean of those positions as the predicted pupil point position.
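Step S240 can be sketched as follows (function and variable names are illustrative, not from the patent): locate the highest- and second-highest-probability points of the heat map and average their coordinates.

```python
import numpy as np

def pupil_from_heatmap(hm):
    """Predict the pupil point as the mean position of the two
    highest-probability points of the heat map (step S240)."""
    flat = hm.ravel()
    top2 = np.argsort(flat)[-2:]       # indices of second-highest, highest values
    first_threshold = flat[top2[0]]    # second-highest value = first threshold
    ys, xs = np.unravel_index(top2, hm.shape)
    return float(xs.mean()), float(ys.mean()), float(first_threshold)

hm = np.zeros((5, 7))
hm[2, 3] = 0.9   # highest-probability point
hm[2, 4] = 0.8   # second-highest point
x, y, thr = pupil_from_heatmap(hm)
# Predicted pupil point: x = 3.5, y = 2.0; first threshold = 0.8
```

Averaging the top two points gives sub-pixel precision and slightly dampens noise at the peak compared with a bare argmax.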
S245: From the pupil point position, derive the original 2D coordinates of the pupil in the original image.
That is, the 2D coordinates of the pupil point in the original image are obtained from the correspondence between the original image and the position of the extracted face region, together with the correspondence of the eye region identified within the face region.
S250: On the other hand, binarize each of the two eye-region heat maps with a second threshold; for example, the second threshold may be the value of the second-highest-probability point in the heat map. The binarization yields region A from the left heat map and region A' from the right heat map, where regions A and A' correspond to the pupil point and its surrounding area.
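A minimal sketch of the binarization in step S250, using the second-highest probability value as the second threshold (names are illustrative):

```python
import numpy as np

def pupil_region(hm):
    """Binarize the heat map at the second threshold (the second-highest
    probability value) to obtain region A: the pupil point and its
    immediate surroundings (step S250)."""
    second_highest = np.sort(hm.ravel())[-2]
    return hm >= second_highest        # boolean mask of region A

hm = np.array([[0.1, 0.2, 0.1],
               [0.3, 0.9, 0.8],
               [0.2, 0.7, 0.3]])
mask = pupil_region(hm)
# Region A contains the two top-probability pixels: (1, 1) and (1, 2)
```

The same function applied to the left and right heat maps yields region A and region A' respectively; only pixels inside these masks are passed to the disparity matching of step S255.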
S255: From regions A and A' of the left and right heat maps, derive the corresponding regions in the left and right face images (the images rectified in step S220). Then compute the disparity between the images inside region A and region A' of the two face images using a stereo disparity matching algorithm.
S260: From the disparity and the camera intrinsics, compute the depth of every pixel inside region A (equivalently, region A'), and take the mean of those depths as the depth of the pupil point.
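For a rectified pair, per-pixel depth follows the standard relation Z = f·B/d (focal length f in pixels, baseline B, disparity d in pixels). A sketch of step S260 under that assumption, with illustrative values:

```python
import numpy as np

def region_depth(disparity, mask, focal_px, baseline_m):
    """Per-pixel depth Z = f * B / d over region A, averaged to give
    the pupil depth (step S260)."""
    d = disparity[mask]
    depth = focal_px * baseline_m / d   # metres, one value per region pixel
    return float(depth.mean())

disparity = np.full((4, 4), 80.0)       # illustrative disparities (pixels)
disparity[1, 1] = 100.0
mask = np.zeros((4, 4), dtype=bool)
mask[1, 1] = True
mask[1, 2] = True                        # region A: two pixels
f, B = 800.0, 0.06                       # 800 px focal length, 60 mm baseline
z = region_depth(disparity, mask, f, B)
# depths: 800*0.06/100 = 0.48 m and 800*0.06/80 = 0.6 m, mean 0.54 m
```

Averaging over the region smooths out per-pixel matching noise, which is the rationale the patent gives for using the mean rather than a single pixel's depth.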
As steps S250-S260 show, because a heat map is used, this embodiment obtains the approximate pupil region directly and solves for disparity only within regions A and A' determined by the heat map, from which the depth of the pupil region follows. In other words, similarity search and matching over the whole image is reduced to search and matching over the pupil region alone. This greatly reduces the amount of computation and, to some extent, alleviates the effect of the weak texture of the iris and pupil by guiding the search and matching to this region.
S265: From the 2D pupil coordinates obtained in step S245 and the pupil depth obtained in step S260, together with the camera intrinsics, solve for the 3D position of the pupil in the camera coordinate system using the pinhole imaging model.
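Under the pinhole model, a pixel (u, v) with depth Z back-projects to camera coordinates as X = (u - cx)·Z/fx, Y = (v - cy)·Z/fy. A sketch of step S265 with illustrative intrinsics (the patent does not give numeric values):

```python
def backproject(u, v, z, fx, fy, cx, cy):
    """Pinhole back-projection (step S265): recover the 3D point in the
    camera coordinate system from pixel coordinates and depth."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return x, y, z

# Illustrative intrinsics; pupil at pixel (400, 260) with depth 0.54 m
X, Y, Z = backproject(400, 260, 0.54, fx=800.0, fy=800.0, cx=320.0, cy=240.0)
# X = (400-320)*0.54/800 = 0.054 m; Y = (260-240)*0.54/800 = 0.0135 m
```

The (u, v) here are the 2D coordinates from step S245 and z is the mean region depth from step S260, giving the final 3D pupil position.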
As shown in Figure 5, an embodiment of this application further provides a corresponding pupil position determination apparatus. For its beneficial effects and the technical problems it solves, refer to the descriptions of the corresponding methods or to the summary of the invention; only a brief description is given here. The pupil position determination apparatus of this embodiment may be used to implement the optional embodiments of the pupil position determination method described above.
As shown in Figure 5, the pupil position determination apparatus 100 may be used to perform the pupil position determination method described above, and comprises an acquisition module 110 and a processing module 120, wherein:
The acquisition module 110 is configured to acquire an image including a pupil. Specifically, the acquisition module 110 may perform step S10 of the pupil position determination method described above, including its examples.
The processing module 120 is configured to obtain a heat map, corresponding to the image, that represents the probability distribution of the pupil point in the image; to determine the position of the pupil point in the image from the first region of the heat map; to determine the second region in the image; and to determine the three-dimensional position of the pupil point from its two-dimensional position in the image and a depth value derived from the depth values of the pixels in the second region. Specifically, the processing module 120 may perform any of steps S20-S50 of the pupil position determination method described above, including any optional example thereof. See the method embodiments for details, which are not repeated here.
In some embodiments, when determining the position of the pupil point in the image from the first region of the heat map, the processing module 120 is specifically configured to determine that position from the center of the first region of the heat map. In this case, the processing module 120 performs any step of step S30 of the method described above, including any optional example thereof. In other embodiments, the pupil point position may be computed as a mean of positions, weighted by probability value, or weighted by the positions of the points.
In some embodiments, the first threshold is the second-highest probability value in the heat map. In other embodiments, a larger or smaller first threshold may be chosen so that the first region contains fewer or more points.
In some embodiments, deriving the depth value of the pupil point from the depth values of the pixels in the second region of the image includes determining the depth value of the pupil point from the mean of the depth values of the pixels in the second region of the image.
In some embodiments, the acquired image including the pupil comprises a first image and a second image taken from different viewing angles. Correspondingly, the heat map comprises a first heat map corresponding to the first image and a second heat map corresponding to the second image. In this case, when determining the position of the pupil point in the image from the first region of the heat map, the processing module is specifically configured to determine the position of the pupil point in the first image from the first region of the first heat map. When determining the second region in the image, the processing module is specifically configured to determine the second region in the first image and the second region in the second image. The depth value of the pupil point is then determined from the depth values of the pixels of the second region in the first image, and those pixel depth values are in turn determined from the disparity between the image of the second region in the first image and the image of the second region in the second image.
In some embodiments, the processing module is further configured to perform image correction on the first image and the second image, where the image correction includes at least one of image de-distortion, image position adjustment, and image cropping.
In some embodiments, when obtaining the heat map corresponding to the image, the processing module is specifically configured to extract a human-eye image from the image and obtain the heat map from the human-eye image.
It should be understood that the pupil position determination apparatus 100 of the embodiments of this application may be implemented in software, for example by a computer program or instructions having the functions described above; the corresponding computer program or instructions may be stored in a memory inside a terminal, and a processor reads them from that memory to implement those functions. Alternatively, the apparatus 100 may be implemented in hardware: for example, the acquisition module 110 may be implemented by a camera on the vehicle, or by an interface circuit between a processor and the vehicle camera, while the processing module 120 may be implemented by a processing device on the vehicle, such as the processor of an in-vehicle processing device (a head unit or on-board computer), or by a terminal such as a mobile phone or tablet. Alternatively, the apparatus 100 may be implemented by a combination of a processor and software modules.
It should be understood that, for the processing details of the apparatuses or modules in the embodiments of this application, reference may be made to the embodiments shown in Figures 1a-3 and the related extended embodiments; they are not repeated here.
In addition, embodiments of this application further provide a vehicle equipped with the pupil position determination apparatus described above. The vehicle may be a passenger car or a truck, or a special-purpose vehicle such as an ambulance, a fire engine, a police car, or an engineering rescue vehicle. The modules of the apparatus may be arranged in the vehicle system as factory-installed or aftermarket equipment, exchanging data over the vehicle's bus or interface circuits; alternatively, with the development of wireless technology, the modules may exchange data wirelessly to eliminate the inconvenience of cabling.
Embodiments of this application further provide an electronic device comprising a processor and a memory storing program instructions which, when executed by the processor, cause the processor to perform the method of the embodiment corresponding to Figure 2 or any optional embodiment thereof, or the method of the specific implementation corresponding to Figure 3 or any optional embodiment thereof. Figure 6 is a structural schematic diagram of an electronic device 600 provided by an embodiment of this application. The electronic device 600 comprises a processor 610 and a memory 620.
It should be understood that the electronic device 600 shown in Figure 6 may further comprise a communication interface 630 for communicating with other devices.
The processor 610 may be connected to the memory 620, which may be used to store program code and data. Accordingly, the memory 620 may be a storage unit inside the processor 610, an external storage unit independent of the processor 610, or a component comprising both an internal storage unit and an independent external storage unit.
Optionally, the electronic device 600 may further comprise a bus, through which the memory 620 and the communication interface 630 are connected to the processor 610. The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on.
It should be understood that, in the embodiments of this application, the processor 610 may be a central processing unit (CPU). It may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. A general-purpose processor may be a microprocessor or any conventional processor. Alternatively, the processor 610 may comprise one or more integrated circuits for executing the relevant programs to implement the technical solutions provided by the embodiments of this application.
The memory 620 may include read-only memory and random-access memory, and provides instructions and data to the processor 610. A portion of the processor 610 may also include non-volatile random-access memory; for example, the processor 610 may also store device-type information.
When the electronic device 600 is running, the processor 610 executes the computer-executable instructions in the memory 620 to perform the steps of the pupil position determination method described above, for example the method of the embodiment corresponding to Figure 2 or any optional embodiment thereof, or the method of the specific implementation corresponding to Figure 3 or any optional embodiment thereof.
It should be understood that the electronic device 600 according to the embodiments of this application may correspond to the entity performing the methods of the various embodiments of this application, and that the above and other operations and/or functions of the modules in the electronic device 600 implement the corresponding flows of the methods of those embodiments; for brevity, they are not repeated here.
Embodiments of this application further provide another electronic device. Figure 7 is a structural schematic diagram of an electronic device 700 provided by this embodiment, comprising a processor 710 and an interface circuit 720, where the processor 710 accesses a memory through the interface circuit 720, and the memory stores program instructions which, when executed by the processor, cause the processor to perform the method of the embodiment corresponding to Figure 2 or any optional embodiment thereof, or the method of the specific implementation corresponding to Figure 3 or any optional embodiment thereof. The electronic device may further comprise a communication interface, a bus, and so on; see the description of the embodiment shown in Figure 6 for details, which are not repeated here.
Embodiments of this application further provide a system 800 for determining a pupil position. As shown in Figure 8, the system 800 comprises an image acquisition device 810 and an electronic device coupled to it, which may be the electronic device 600 of Figure 6 or the electronic device 700 of Figure 7. The image acquisition device 810 may be an RGB-D camera or a binocular camera, and is configured to capture images including a pupil and provide them to the electronic device, so that the electronic device performs the method of the embodiment corresponding to Figure 2 or any optional embodiment thereof, or the method of the specific implementation corresponding to Figure 3 or any optional embodiment thereof.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are executed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functions differently for each particular application, but such implementations should not be regarded as exceeding the scope of this application.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of a given embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, may exist physically as separate units, or two or more units may be integrated into one unit.
If the functions are implemented as software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. On this understanding, the technical solution of this application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied as a software product stored in a storage medium and including several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random-access memory (RAM), a magnetic disk, or an optical disc.
本申请实施例还提供了一种计算机可读存储介质,其上存储有计算机程序,该程序被处理器执行时用于执行上述瞳孔位置的确定方法,该方法包括上述各个实施例所描述的方案中的至少之一。The embodiment of the present application also provides a computer-readable storage medium, on which a computer program is stored, and when the program is executed by a processor, it is used to execute the method for determining the position of the pupil. The method includes the solutions described in the above-mentioned embodiments at least one of the .
本申请实施例的计算机存储介质,可以采用一个或多个计算机可读的介质的任意组合。计算机可读介质可以是计算机可读信号介质或者计算机可读存储介质。计算机可读存储介质例如可以是,但不限于,电、磁、光、电磁、红外线、或半导体的***、装置或器件,或者任意以上的组合。计算机可读存储介质的更具体的例子(非穷举的列表)包括:具有一个或多个导线的电连接、便携式计算机磁盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、光纤、便携式紧凑磁盘只读存储器(CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。在本文件中,计算机可读存储介质可以是任何包含或存储程序的有形介质, 该程序可以被指令执行***、装置或者器件使用或者与其结合使用。The computer storage medium in the embodiments of the present application may use any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, device, or device, or any combination thereof. More specific examples (non-exhaustive list) of computer readable storage media include: electrical connections with one or more leads, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), Erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
计算机可读的信号介质可以包括在基带中或者作为载波一部分传播的数据信号,其中承载了计算机可读的程序代码。这种传播的数据信号可以采用多种形式,包括但不限于电磁信号、光信号或上述的任意合适的组合。计算机可读的信号介质还可以是计算机可读存储介质以外的任何计算机可读介质,该计算机可读介质可以发送、传播或者传输用于由指令执行***、装置或者器件使用或者与其结合使用的程序。A computer readable signal medium may include a data signal carrying computer readable program code in baseband or as part of a carrier wave. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device. .
计算机可读介质上包含的程序代码可以用任何适当的介质传输,包括、但不限于无线、电线、光缆、RF等等,或者上述的任意合适的组合。Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
The terms "first", "second", "third", etc. in the description and claims, and similar terms such as module A, module B, and module C, are used only to distinguish similar objects and do not imply a particular ordering of those objects. It is understood that, where permitted, the specific order or sequence may be interchanged so that the embodiments of the application described herein can be practiced in orders other than those illustrated or described herein.
In the above description, the reference numbers denoting steps, such as S110, S120, etc., do not imply that the steps must be executed in that order; where permitted, the order of preceding and following steps may be interchanged, or the steps may be executed simultaneously.
The term "comprising" used in the description and claims should not be interpreted as being restricted to what is listed thereafter; it does not exclude other elements or steps. It should therefore be interpreted as specifying the presence of the stated features, integers, steps, or components, without excluding the presence or addition of one or more other features, integers, steps, or components, or groups thereof. Accordingly, the expression "a device comprising means A and B" should not be limited to a device consisting only of components A and B.
Reference in this specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, appearances of the phrases "in one embodiment" or "in an embodiment" in various places in this specification do not necessarily all refer to the same embodiment, although they may. Furthermore, in one or more embodiments, the particular features, structures, or characteristics may be combined in any suitable manner, as would be apparent to one of ordinary skill in the art from this disclosure.
Note that the above are only preferred embodiments of the present application and the technical principles applied. Those skilled in the art will understand that the present application is not limited to the specific embodiments described herein, and that various obvious changes, readjustments, and substitutions can be made without departing from the protection scope of the present application. Therefore, although the present application has been described in some detail through the above embodiments, it is not limited to them, and may include further equivalent embodiments without departing from the concept of the present application, all of which fall within the protection scope of this application.

Claims (19)

  1. A method for determining a pupil position, comprising:
    obtaining an image including a pupil;
    obtaining a heat map corresponding to the image, wherein the heat map represents the probability distribution of a pupil point in the image, and the pupil point is the center point of the pupil;
    determining the position of the pupil point in the image according to a first area of the heat map, wherein the probability values corresponding to the pixels in the first area of the heat map are greater than a first threshold;
    determining a second area in the image, wherein the probability values corresponding to the pixels of the second area in the heat map are greater than a second threshold, the second threshold being less than or equal to the first threshold;
    determining the three-dimensional position of the pupil point according to the two-dimensional position of the pupil point and the depth value of the pupil point, wherein the two-dimensional position of the pupil point refers to the position of the pupil point in the image, and the depth value of the pupil point is determined from the depth values of the pixels of the second area in the image.
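The thresholding and back-projection steps recited in claim 1 can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the claimed implementation: the threshold values, the pinhole intrinsics `fx, fy, cx, cy`, and the use of the region mean as the center are assumptions (the center rule and depth-averaging rule are drawn from claims 2 and 4).

```python
import numpy as np

def pupil_point_2d(heatmap, first_threshold):
    # First area: pixels whose heat-map probability exceeds the first threshold.
    ys, xs = np.nonzero(heatmap > first_threshold)
    # Center of the first area taken as the pupil point (cf. claim 2).
    return float(xs.mean()), float(ys.mean())

def pupil_depth(depth_map, heatmap, second_threshold):
    # Second area: probability above the (lower) second threshold, so it is
    # at least as large as the first area.
    mask = heatmap > second_threshold
    # Pupil depth = mean depth over the second area (cf. claim 4).
    return float(depth_map[mask].mean())

def pupil_point_3d(heatmap, depth_map, fx, fy, cx, cy,
                   first_threshold=0.8, second_threshold=0.5):
    u, v = pupil_point_2d(heatmap, first_threshold)
    z = pupil_depth(depth_map, heatmap, second_threshold)
    # Back-project the 2D pupil point with an assumed pinhole camera model.
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)
```

The averaging over a region, rather than reading the depth at a single pixel, makes the recovered depth less sensitive to sensor noise at the pupil point itself.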
  2. The method according to claim 1, wherein determining the position of the pupil point in the image according to the first area of the heat map comprises:
    determining the position of the pupil point in the image according to the center position of the first area of the heat map.
  3. The method according to claim 1 or 2, wherein the first threshold is the second-highest probability value in the heat map.
  4. The method according to claim 1, wherein determining the depth value of the pupil point from the depth values of the pixels of the second area in the image comprises: determining the depth value of the pupil point from the mean of the depth values of the pixels of the second area in the image.
  5. The method according to claim 1 or 4, wherein the image comprises a first image and a second image, the first image and the second image being two images captured from different viewing angles, and the heat map comprises a first heat map corresponding to the first image and a second heat map corresponding to the second image;
    determining the position of the pupil point in the image according to the first area of the heat map comprises: determining the position of the pupil point in the first image according to the first area of the first heat map;
    determining the second area in the image comprises: determining the second area in the first image, and determining the second area in the second image;
    the depth value of the pupil point being determined from the depth values of the pixels of the second area in the image comprises: the depth value of the pupil point being determined from the depth values of the pixels of the second area in the first image, wherein the depth values of the pixels of the second area in the first image are determined from the disparity between the image of the second area in the first image and the image of the second area in the second image.
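For the stereo variant of claim 5, once the pair is rectified, the standard triangulation relation Z = f·B/d converts the per-pixel disparity over the second area into depths, whose mean gives the pupil depth. The following is a sketch under that assumption; the focal length and baseline are illustrative parameters, and the claims do not fix how the disparity itself is computed (block matching, a learned model, etc.):

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    # Rectified stereo: depth Z = f * B / d per pixel; zero-disparity
    # pixels (no match) map to infinity rather than raising an error.
    d = np.asarray(disparity_px, dtype=float)
    with np.errstate(divide="ignore"):
        return focal_length_px * baseline_m / d

def pupil_depth_stereo(disparity_map, heatmap, second_threshold,
                       focal_length_px, baseline_m):
    # Second area of the first image: heat-map probability above the
    # second threshold; pupil depth = mean depth recovered over that area.
    mask = heatmap > second_threshold
    depths = depth_from_disparity(disparity_map[mask],
                                  focal_length_px, baseline_m)
    return float(depths.mean())
```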
  6. The method according to claim 5, further comprising:
    performing image correction on the first image and the second image, the image correction comprising at least one of the following:
    image undistortion, image position adjustment, and image cropping.
  7. The method according to claim 1, wherein obtaining the heat map corresponding to the image comprises:
    obtaining a human eye image from the image;
    obtaining the heat map according to the human eye image.
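Claim 7's two steps, cropping the eye region and then producing the heat map from the crop, can be sketched as below. Both the `eye_box` source (a face-landmark or eye detector) and the `heatmap_model` callable are hypothetical placeholders; the claims do not specify which detector or network is used.

```python
import numpy as np

def crop_eye_region(image, eye_box):
    # eye_box = (x, y, w, h), e.g. from a face-landmark or eye detector
    # (the detector is not specified by the claims).
    x, y, w, h = eye_box
    return image[y:y + h, x:x + w]

def heatmap_for_image(image, eye_box, heatmap_model):
    # heatmap_model stands in for the unspecified model that maps an eye
    # crop to a per-pixel pupil-point probability map of the same size.
    eye = crop_eye_region(image, eye_box)
    return heatmap_model(eye)
```

Working on the crop rather than the full frame keeps the heat map's resolution concentrated on the region where the pupil point can actually lie.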
  8. A device for determining a pupil position, comprising:
    an acquisition module, configured to acquire an image including a pupil;
    a processing module, configured to obtain a heat map corresponding to the image, wherein the heat map represents the probability distribution of a pupil point in the image, and the pupil point is the center point of the pupil;
    the processing module being further configured to determine the position of the pupil point in the image according to a first area of the heat map, wherein the probability values corresponding to the pixels in the first area of the heat map are greater than a first threshold;
    the processing module being further configured to determine a second area in the image, wherein the probability values corresponding to the pixels of the second area in the heat map are greater than a second threshold, the second threshold being less than or equal to the first threshold;
    the processing module being further configured to determine the three-dimensional position of the pupil point according to the two-dimensional position of the pupil point and the depth value of the pupil point, wherein the two-dimensional position of the pupil point refers to the position of the pupil point in the image, and the depth value of the pupil point is determined from the depth values of the pixels of the second area in the image.
  9. The device according to claim 8, wherein, when determining the position of the pupil point in the image according to the first area of the heat map, the processing module is specifically configured to:
    determine the position of the pupil point in the image according to the center position of the first area of the heat map.
  10. The device according to claim 8 or 9, wherein the first threshold is the second-highest probability value in the heat map.
  11. The device according to claim 8, wherein determining the depth value of the pupil point from the depth values of the pixels of the second area in the image comprises: determining the depth value of the pupil point from the mean of the depth values of the pixels of the second area in the image.
  12. The device according to claim 8 or 11, wherein the image comprises a first image and a second image, the first image and the second image being two images captured from different viewing angles, and the heat map comprises a first heat map corresponding to the first image and a second heat map corresponding to the second image;
    when determining the position of the pupil point in the image according to the first area of the heat map, the processing module is specifically configured to: determine the position of the pupil point in the first image according to the first area of the first heat map;
    when determining the second area in the image, the processing module is specifically configured to: determine the second area in the first image, and determine the second area in the second image;
    the depth value of the pupil point being determined from the depth values of the pixels of the second area in the image comprises: the depth value of the pupil point being determined from the depth values of the pixels of the second area in the first image, wherein the depth values of the pixels of the second area in the first image are determined from the disparity between the image of the second area in the first image and the image of the second area in the second image.
  13. The device according to claim 12, wherein the processing module is further configured to:
    perform image correction on the first image and the second image, the image correction comprising at least one of the following:
    image undistortion, image position adjustment, and image cropping.
  14. The device according to claim 8, wherein, when obtaining the heat map corresponding to the image, the processing module is specifically configured to:
    obtain a human eye image from the image;
    obtain the heat map according to the human eye image.
  15. An electronic device, comprising:
    a processor, and
    a memory storing program instructions that, when executed by the processor, cause the processor to perform the method for determining a pupil position according to any one of claims 1 to 7.
  16. An electronic device, comprising:
    a processor, and an interface circuit,
    wherein the processor accesses a memory through the interface circuit, the memory storing program instructions that, when executed by the processor, cause the processor to perform the method for determining a pupil position according to any one of claims 1 to 7.
  17. A system for determining a pupil position, comprising:
    an image acquisition device, and the electronic device of claim 15 coupled with the image acquisition device, or the electronic device of claim 16 coupled with the image acquisition device.
  18. A computer-readable storage medium, on which program instructions are stored, wherein the program instructions, when executed by a computer, cause the computer to perform the method for determining a pupil position according to any one of claims 1 to 7.
  19. A computer program product, comprising program instructions, wherein the program instructions, when executed by a computer, cause the computer to perform the method for determining a pupil position according to any one of claims 1 to 7.
PCT/CN2021/099759 2021-06-11 2021-06-11 Pupil position determination method, device and system WO2022257120A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202180001856.9A CN113597616A (en) 2021-06-11 2021-06-11 Pupil position determination method, device and system
PCT/CN2021/099759 WO2022257120A1 (en) 2021-06-11 2021-06-11 Pupil position determination method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/099759 WO2022257120A1 (en) 2021-06-11 2021-06-11 Pupil position determination method, device and system

Publications (1)

Publication Number Publication Date
WO2022257120A1 true WO2022257120A1 (en) 2022-12-15

Family

ID=78242915

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/099759 WO2022257120A1 (en) 2021-06-11 2021-06-11 Pupil position determination method, device and system

Country Status (2)

Country Link
CN (1) CN113597616A (en)
WO (1) WO2022257120A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024004789A1 (en) * 2022-06-28 2024-01-04 日本電気株式会社 Information processing device, information processing method, information processing system, and recording medium
CN116704572B (en) * 2022-12-30 2024-05-28 荣耀终端有限公司 Eye movement tracking method and device based on depth camera

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6095989A (en) * 1993-07-20 2000-08-01 Hay; Sam H. Optical recognition methods for locating eyes
CN103136512A (en) * 2013-02-04 2013-06-05 重庆市科学技术研究院 Pupil positioning method and system
CN107516093A (en) * 2017-09-25 2017-12-26 联想(北京)有限公司 The determination method and electronic equipment of a kind of eye pupil central point
CN109522887A (en) * 2019-01-24 2019-03-26 北京七鑫易维信息技术有限公司 A kind of Eye-controlling focus method, apparatus, equipment and storage medium
CN109857254A (en) * 2019-01-31 2019-06-07 京东方科技集团股份有限公司 Pupil positioning method and device, VR/AR equipment and computer-readable medium
US20190385325A1 (en) * 2017-02-22 2019-12-19 Korea Advanced Institute Of Science And Technology Apparatus and method for depth estimation based on thermal image, and neural network learning method therefof
KR102074519B1 (en) * 2018-10-05 2020-02-06 엔컴주식회사 METHOD AND DEVICE OF DETECTING DROWSINESS USING INFRARED AND DEPTH IMAGE, and Non-Transitory COMPUTER READABLE RECORDING MEDIUM
JP2020048971A (en) * 2018-09-27 2020-04-02 アイシン精機株式会社 Eyeball information estimation device, eyeball information estimation method, and eyeball information estimation program
CN111223143A (en) * 2019-12-31 2020-06-02 广州市百果园信息技术有限公司 Key point detection method and device and computer readable storage medium
CN111428680A (en) * 2020-04-07 2020-07-17 深圳市华付信息技术有限公司 Pupil positioning method based on deep learning

Also Published As

Publication number Publication date
CN113597616A (en) 2021-11-02

Similar Documents

Publication Publication Date Title
Itoh et al. Interaction-free calibration for optical see-through head-mounted displays based on 3d eye localization
US9406137B2 (en) Robust tracking using point and line features
US20180081434A1 (en) Eye and Head Tracking
US11398044B2 (en) Method for face modeling and related products
JP5529660B2 (en) Pupil detection device and pupil detection method
EP2509070B1 (en) Apparatus and method for determining relevance of input speech
EP2941736B1 (en) Mobile device based text detection and tracking
WO2020015468A1 (en) Image transmission method and apparatus, terminal device, and storage medium
EP3382510A1 (en) Visibility improvement method based on eye tracking, machine-readable storage medium and electronic device
JP6184271B2 (en) Imaging management apparatus, imaging management system control method, and program
TWI332453B (en) The asynchronous photography automobile-detecting apparatus and method thereof
WO2022257120A1 (en) Pupil position determination method, device and system
WO2020237611A1 (en) Image processing method and apparatus, control terminal and mobile device
WO2020063000A1 (en) Neural network training and line of sight detection methods and apparatuses, and electronic device
WO2023272453A1 (en) Gaze calibration method and apparatus, device, computer-readable storage medium, system, and vehicle
WO2020062960A1 (en) Neural network training method and apparatus, gaze tracking method and apparatus, and electronic device
WO2023231663A1 (en) Eye type detection method and apparatus, computer device, storage medium and computer program product
JP2018081402A (en) Image processing system, image processing method, and program
CN111325107A (en) Detection model training method and device, electronic equipment and readable storage medium
CN113902932A (en) Feature extraction method, visual positioning method and device, medium and electronic equipment
US9741171B2 (en) Image processing device and image processing method
CN111385481A (en) Image processing method and device, electronic device and storage medium
JP2004157778A (en) Nose position extraction method, program for operating it on computer, and nose position extraction device
CN110781712B (en) Human head space positioning method based on human face detection and recognition
CN109309827B (en) Multi-user real-time tracking device and method for 360-degree suspended light field three-dimensional display system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21944627

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21944627

Country of ref document: EP

Kind code of ref document: A1