WO2022095440A1 - Self-driving-oriented human-machine collaborative perception method and system - Google Patents

Self-driving-oriented human-machine collaborative perception method and system

Info

Publication number
WO2022095440A1
Authority
WO
WIPO (PCT)
Prior art keywords
driver
image
feature
perception
face
Prior art date
Application number
PCT/CN2021/098223
Other languages
French (fr)
Chinese (zh)
Inventor
池成
徐刚
沈剑豪
邓远志
林国勇
周阳
李文杰
Original Assignee
深圳技术大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳技术大学 filed Critical 深圳技术大学
Publication of WO2022095440A1 publication Critical patent/WO2022095440A1/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/59 Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 Feature extraction; Face representation

Definitions

  • the present invention relates to the technical field of intelligent driving, in particular to a human-machine collaborative perception method and system for automatic driving.
  • In contrast to single-sensor solutions, which suffer from limited information, high uncertainty, and low reliability in complex environments, the multi-sensor information fusion scheme can effectively complement uncertain information and has become the mainstream perception solution.
  • Combinations of lidar, millimeter-wave radar, cameras, ultrasonic radar, or multiple sensors of the same type are used to obtain relatively comprehensive information about the vehicle's surroundings, but the unavoidable drawback of such multi-sensor schemes is the large amount of data to be processed per unit time and the high demand on hardware resources; it is therefore often difficult to meet the system's real-time requirements, and the system is uneconomical.
  • the present invention aims to solve at least one of the technical problems existing in the prior art. To this end, the present invention proposes a human-machine collaborative perception method for automatic driving, which can effectively reduce the amount of data processed per unit time and improve the processing speed of the system.
  • The present invention also proposes a human-machine collaborative perception system for automatic driving that implements the above human-machine collaborative perception method for automatic driving.
  • The human-machine collaborative perception method for automatic driving includes the following steps. S100: capture the driver's head image with binocular infrared CCDs arranged at different positions in the vehicle, obtain a composite image of the driver's face, extract facial features, obtain feature corner points, and establish an eyeball coordinate system by locating the feature corner points.
  • S200: obtain the driver's three-dimensional line of sight based on the eyeball coordinate system; through a coordinate matrix transformation, place the driver's three-dimensional line of sight and the pixel information of the environment perception camera's image in the same world coordinate system; establish the mapping relationship between the driver's three-dimensional line of sight and the pixels of the environment perception camera's image; obtain the driver's visual landing point and save it to a driver gaze-point cache database.
  • S300: based on the driver gaze-point cache database, perform eye-movement analysis of gaze-target frequency and gaze duration, obtain and mark the eye-movement state, mark the pixel region at the intersection of the driver's visual landing point and the camera image, and construct an extended-dimension environment perception image database.
  • S400: according to the extended-dimension environment perception image database, adjust the distribution weights of the image-processing neural network used during automatic driving, adaptively adjusting the fineness and region of image pixel traversal.
  • The human-machine collaborative perception method for automatic driving has at least the following beneficial effects: the region of interest in the intelligent camera's image is quickly located through visual tracking and the human-eye attention mechanism, and perception fusion technology is used to accelerate the information processing speed of the environment perception system; this can significantly reduce the hardware computing requirements of the perception system, improve its real-time performance, and is economical.
  • The step S100 includes: S110, capturing the driver's head image with the binocular infrared CCDs, obtaining face images of the driver from different angles at the same moment, and performing panoramic stitching, grayscale processing, and binarization to obtain the composite image of the driver's face; S120, distinguishing the composite image of the driver's face with a face skin-color model to obtain a region to be inspected, matching the region to be inspected against a face model to obtain a face detection image, and storing it in a historical face database; S130, extracting the feature corner points from the face detection image, identifying the inner and outer corners of the two eyes, the two mouth corners, and the two eye center points, and establishing a face plane from the two inner eye corners and the two mouth corners; S140, based on the position and orientation of the binocular infrared CCDs in the world coordinate system, solving the three-dimensional spatial coordinates of the feature corner points, performing a coordinate transformation to obtain their world coordinates, and establishing the eyeball coordinate system from those world coordinates.
  • The step S140 further includes: S141, reading the face detection image from the historical face database, analyzing changes in the driver's facial features and visual attention through the video stream, and detecting the driver's mental state to obtain a mental-state score; S142, if the mental-state score is below a set threshold, continuing to capture the driver's head image with the binocular infrared CCDs, obtaining new face detection images, and recalculating the mental-state score; S143, otherwise, solving the three-dimensional spatial coordinates of the feature corner points and establishing the eyeball coordinate system.
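A minimal sketch of the S110-style preprocessing (grayscale plus binarization of an already-stitched infrared frame) is given below; the function name, blur kernel, and use of Otsu thresholding are illustrative assumptions rather than details taken from the patent.

```python
import cv2
import numpy as np

def preprocess_face_image(stitched_ir_frame: np.ndarray) -> np.ndarray:
    """Grayscale and binarize a stitched infrared face image (illustrative sketch)."""
    # Collapse the stitched frame to a single grayscale channel.
    gray = cv2.cvtColor(stitched_ir_frame, cv2.COLOR_BGR2GRAY)
    # Light denoising before thresholding; the kernel size is an assumption.
    gray = cv2.GaussianBlur(gray, (5, 5), 0)
    # Otsu's method chooses the binarization threshold automatically.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary
```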
  • The step S200 includes: S210, performing eyeball-region recognition on the composite image of the driver's face, cropping the eyeball-region image, performing threshold analysis on it to obtain a pupil threshold image and a Purkinje-spot threshold image, identifying the pupil and the Purkinje spot, calculating the center coordinates of the pupil and of the Purkinje spot, and establishing a pupil-Purkinje-spot positional-relationship mapping function; S220, obtaining the driver's three-dimensional line of sight from the pupil-Purkinje-spot positional-relationship mapping function; S230, placing the driver's three-dimensional line of sight and the pixel information of the environment perception camera's image in the same world coordinate system through a coordinate matrix transformation, and establishing the mapping relationship between the driver's three-dimensional line of sight and the pixels of the environment perception camera's image; S240, intersecting the driver's three-dimensional line of sight with the environment perception camera's image to obtain the driver's visual landing point, and saving it to the driver gaze-point cache database.
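A hedged sketch of the S210 idea: in an infrared eye crop the pupil is the darkest region and the Purkinje spot (corneal reflection) the brightest, so two thresholds give two masks whose centroids serve as the center coordinates. The threshold values and the polynomial mapping below are assumptions for illustration, not the patent's actual mapping function.

```python
import cv2
import numpy as np

def pupil_and_purkinje_centers(eye_roi_gray, pupil_thresh=40, glint_thresh=220):
    """Return (pupil_center, purkinje_center) as (x, y) tuples, or None when not found."""
    # Pupil: darkest region, so an inverse threshold keeps pixels below pupil_thresh.
    _, pupil_mask = cv2.threshold(eye_roi_gray, pupil_thresh, 255, cv2.THRESH_BINARY_INV)
    # Purkinje spot: brightest region, so keep pixels above glint_thresh.
    _, glint_mask = cv2.threshold(eye_roi_gray, glint_thresh, 255, cv2.THRESH_BINARY)

    def centroid(mask):
        m = cv2.moments(mask, binaryImage=True)
        if m["m00"] == 0:
            return None
        return (m["m10"] / m["m00"], m["m01"] / m["m00"])

    return centroid(pupil_mask), centroid(glint_mask)

def gaze_from_pupil_glint(pupil, glint, coeffs):
    """Map the pupil-minus-glint vector to gaze angles with a fitted polynomial.

    A second-order polynomial is a common calibration choice; the coefficient
    matrix `coeffs` is hypothetical and would come from a calibration session.
    """
    dx, dy = pupil[0] - glint[0], pupil[1] - glint[1]
    feats = np.array([1.0, dx, dy, dx * dy, dx * dx, dy * dy])
    return coeffs @ feats
```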
  • The step S220 further includes: S221, obtaining the spatial position of the driver's head feature points relative to the camera coordinate system through image recognition, establishing a driver head coordinate system, and recording the head pitch angle, yaw angle, roll angle, and three-axis translation data to obtain head motion data; S222, performing data fusion based on the environment model, compensating the gaze-tracking data with the head motion data, and calculating and outputting the driver's three-dimensional line of sight.
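One way to realize the S221/S222 compensation is to express the eye-in-head gaze ray in the vehicle (world) frame using the estimated head rotation and translation; the Euler-angle convention and SciPy usage below are assumptions, not the patent's stated formulation.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def compensate_gaze_with_head_pose(gaze_in_head, eye_pos_in_head, head_rpy_deg, head_translation):
    """Rotate and translate a head-frame gaze ray into the world frame.

    gaze_in_head: unit gaze direction in the head coordinate system.
    head_rpy_deg: (roll, pitch, yaw) of the head in degrees (convention assumed).
    head_translation: head origin expressed in world coordinates.
    """
    rot = R.from_euler("xyz", head_rpy_deg, degrees=True).as_matrix()
    gaze_world = rot @ np.asarray(gaze_in_head)              # directions only rotate
    eye_origin_world = rot @ np.asarray(eye_pos_in_head) + np.asarray(head_translation)
    return eye_origin_world, gaze_world / np.linalg.norm(gaze_world)
```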
  • The eye-movement states include fixation, saccade, and smooth pursuit (trailing) tracking.
  • The step S400 further includes: S410, obtaining the driver's current eye-movement features and comparing them with the eye-movement features stored in a preset driver eye-movement feature database to obtain a pixel-feature classification of the perception image acquired by the environment perception camera; S420, processing the perception image according to the pixel-feature classification.
  • The human-machine collaborative perception system for automatic driving includes: a facial image acquisition module, configured to capture the driver's head image with binocular infrared CCDs arranged at different positions in the vehicle and obtain a composite image of the driver's face; a corner-point positioning module, configured to extract facial features from the composite image of the driver's face, obtain feature corner points, and establish an eyeball coordinate system by locating the feature corner points; a gaze-point acquisition module, configured to obtain the driver's three-dimensional line of sight based on the eyeball coordinate system, place the driver's three-dimensional line of sight and the pixel information of the environment perception camera's image in the same world coordinate system through a coordinate matrix transformation, establish the mapping relationship between the driver's three-dimensional line of sight and the pixels of the environment perception camera's image, and obtain the driver's visual landing point and save it to a driver gaze-point cache database; an extended-dimension perception marking module, configured to perform eye-movement analysis of gaze-target frequency and gaze duration based on the driver gaze-point cache database, obtain and mark the eye-movement state, mark the pixel region at the intersection of the driver's visual landing point and the camera image, and construct an extended-dimension environment perception image database; and a perception fusion module, which adjusts, according to the extended-dimension environment perception image database, the distribution weights of the image-processing neural network used during automatic driving, adaptively adjusting the fineness and region of image pixel traversal.
  • The human-machine collaborative perception system for automatic driving has at least the following beneficial effects: the region of interest in the intelligent camera's image is quickly located through visual tracking and the human-eye attention mechanism, and perception fusion technology is used to accelerate the information processing speed of the environment perception system; this can significantly reduce the hardware computing requirements of the perception system, improve its real-time performance, and is economical.
  • The system further includes a head motion compensation module, configured to obtain the spatial position of the driver's head feature points relative to the camera coordinate system through image recognition, establish a driver head coordinate system, record the head pitch angle, yaw angle, roll angle, and three-axis translation data to obtain head motion data, perform data fusion based on the environment model, compensate the gaze-tracking data with the head motion data, and calculate and output the driver's three-dimensional line of sight.
  • The system further includes a perception image classification processing module, configured to obtain the driver's current eye-movement features, compare them with the eye-movement features stored in the preset driver eye-movement feature database, obtain the pixel-feature classification of the perception image acquired by the environment perception camera, and process the perception image according to the pixel-feature classification.
  • FIG. 1 is a schematic flowchart of a method according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a data processing process in a method according to an embodiment of the present invention.
  • FIG. 3 is a schematic block diagram of modules of a system according to an embodiment of the present invention.
  • Facial image acquisition module 100, corner-point positioning module 200, gaze-point acquisition module 300, extended-dimension perception marking module 400, perception fusion module 500, head motion compensation module 600, perception image classification processing module 700.
  • In the description of the present invention, "several" means one or more and "multiple" means two or more; "greater than", "less than", "exceeding", and the like are understood as excluding the stated number, while "above", "below", "within", and the like are understood as including it. Where "first" and "second" are used, they serve only to distinguish technical features and are not to be understood as indicating or implying relative importance, the number of the indicated technical features, or their order.
  • Purkinje image: a bright spot on the cornea of the eye, produced by reflection of light entering the pupil on the outer surface of the cornea (corneal reflection, CR).
  • The method according to the embodiment of the present invention includes: S100, capturing the driver's head image with binocular infrared CCDs arranged at different positions in the vehicle, obtaining a composite image of the driver's face, extracting facial features, obtaining feature corner points, and establishing an eyeball coordinate system by locating the feature corner points; S200, obtaining the driver's three-dimensional line of sight based on the eyeball coordinate system, placing the driver's three-dimensional line of sight and the pixel information of the environment perception camera's image in the same world coordinate system through a coordinate matrix transformation, establishing the mapping relationship between them, obtaining the driver's visual landing point, and saving it to the driver gaze-point cache database; S300, performing eye-movement analysis of gaze-target frequency and gaze duration based on the driver gaze-point cache database, obtaining and marking the eye-movement state, marking the pixel region at the intersection of the driver's visual landing point and the camera image, and constructing the extended-dimension environment perception image database; S400, adjusting, according to the extended-dimension environment perception image database, the distribution weights of the image-processing neural network used during automatic driving, adaptively adjusting the fineness and region of image pixel traversal.
  • In the embodiment of the present invention the data processing is roughly divided into four steps: binocular infrared imaging to collect facial images, GPU processing of the facial images, CPU calculation and caching of the data, and storage with fused output. Specifically this comprises image acquisition, facial feature detection, feature corner detection, extraction of the feature corners' three-dimensional coordinates, calculation of the driving gaze direction, calculation of the driving gaze landing point, and fused output; refer to FIG. 2.
  • The binocular infrared CCDs photograph the driver's face under all operating conditions of the driving process, collecting and caching video images.
  • The GPU's facial image processing includes detection of the driver's facial features and extraction of the feature corner coordinates; the CPU calculates and locates the ROI (region of interest) of the driver's line of sight in the environment camera's image; the storage and fusion output reconstructs and dimension-extends the perception data based on the cached history of the driver's visual ROI, and fuses and outputs the extended data based on the visual attention mechanism.
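The capture, GPU face processing, CPU gaze/ROI computation, and fusion output stages described above could be strung together roughly as follows; every stage is passed in as a callable because the patent does not specify concrete implementations, so all names here are placeholders.

```python
def collaborative_perception_step(ir_frames, env_frame, gaze_cache,
                                  stitch, detect_corners, estimate_gaze,
                                  project_gaze, eye_movement_weights, fuse):
    """One capture -> face -> gaze -> fusion iteration; the stages are injected callables."""
    face_img = stitch(ir_frames)                  # binocular IR capture and stitching
    corners = detect_corners(face_img)            # GPU-side facial feature / corner extraction
    gaze_ray = estimate_gaze(corners, face_img)   # eyeball coordinate system -> 3D line of sight
    gaze_px = project_gaze(gaze_ray)              # CPU: locate the gaze ROI in the env image
    gaze_cache.append(gaze_px)                    # driver gaze-point cache database
    weights = eye_movement_weights(gaze_cache)    # fixation/saccade analysis -> attention weights
    return fuse(env_frame, weights)               # attention-weighted perception output
```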
  • The image acquisition at the input end mainly collects and buffers the video stream of the driver's face.
  • Video image acquisition must not interfere with the driver's own behavior, so non-wearable acquisition methods that interfere little with the driver are required; in addition, the lighting environment of the cockpit is usually complex and harsh, which limits the use of conventional cameras, so an infrared camera that is insensitive to the lighting environment is adopted.
  • Because large head deflections occur as the driver moves during driving, and to avoid a single camera failing to capture the driver's eyes, binocular infrared CCDs placed at different positions in the vehicle photograph the driver's head and obtain face images from different angles at the same moment.
  • Each camera may capture only part of the face; in some embodiments of the present invention, the images captured by the binocular cameras undergo panoramic stitching, grayscale processing, and binarization to form a complete composite image of the driver's face, which is then passed to the next processing stage.
  • Facial feature detection is used to extract the position of the driver's face in the composite image, serving as preprocessing for the subsequent extraction of facial feature points and providing basic data for the gaze-direction calculation; it should be noted that facial feature detection must keep continuously tracking the driver's facial features, which increases the running speed of the system and reduces the false detection rate.
  • In the method of the embodiment of the present invention, the face region in the composite image is separated from the background region by a face skin-color model to obtain a region to be inspected that may contain a face; the region to be inspected is then matched against a face model, and the degree of match is obtained by comparison, so that regions likely to contain a face are extracted according to the matching degree. If face detection succeeds, feature corner detection starts; if it fails, face detection continues cyclically.
  • During facial feature detection, the captured images are also stored cyclically to build a historical face database, providing time-series information for the subsequent monitoring of the driver's mental state.
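A minimal sketch of the skin-color stage, assuming a Cr/Cb range model and a normalized cross-correlation match of each candidate region against a mean-face template; the color bounds, size filter, and match threshold are illustrative values, not figures from the patent.

```python
import cv2
import numpy as np

def candidate_face_regions(bgr_img, face_template_gray, match_thresh=0.6):
    """Skin-color segmentation followed by template matching on each candidate box."""
    ycrcb = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2YCrCb)
    # Commonly used Cr/Cb skin range; the patent's actual skin-color model is unspecified.
    skin_mask = cv2.inRange(ycrcb, (0, 133, 77), (255, 173, 127))
    contours, _ = cv2.findContours(skin_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    gray = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2GRAY)
    faces = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if w < 40 or h < 40:                      # discard tiny blobs (size is an assumption)
            continue
        patch = cv2.resize(gray[y:y + h, x:x + w], face_template_gray.shape[::-1])
        score = cv2.matchTemplate(patch, face_template_gray, cv2.TM_CCOEFF_NORMED)[0, 0]
        if score > match_thresh:                  # keep regions that match the face model
            faces.append((x, y, w, h, float(score)))
    return faces
```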
  • Feature corner extraction is based on facial feature detection.
  • In the embodiment of the present invention, the face detection image is obtained by cropping the driver's face from the composite image using facial feature detection, and the inner and outer corners of the two eyes, the two mouth corners, and the two eye center points are extracted from it; these feature corner points are located and a face coordinate system is established.
  • Specifically, after the face detection image is obtained, the eye range is first roughly located according to the classical facial-proportion rule (the "three sections and five eyes" of the face region), which narrows the detection range for the eyes and improves detection accuracy and speed; then dynamic threshold segmentation, gradient transformation, and other means are used to complete the extraction of feature corner points such as the eye corners, and the face plane is established from the two inner eye corners and the two mouth corners.
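A rough illustration of narrowing the eye search band with the classical facial-proportion rule before fine corner extraction; the fractions below are conventional proportions chosen for the sketch, not values stated in the patent.

```python
def coarse_eye_band(face_box):
    """Return a rough eye-region box (x, y, w, h) inside a detected face box.

    Under the "three sections and five eyes" proportion rule the eyes sit roughly
    in the second horizontal third of the face; the margins here are illustrative.
    """
    x, y, w, h = face_box
    eye_y = y + int(0.25 * h)    # start slightly above the one-third line
    eye_h = int(0.25 * h)        # band height of about a quarter of the face height
    eye_x = x + int(0.10 * w)    # trim roughly 10% from each side
    eye_w = int(0.80 * w)
    return eye_x, eye_y, eye_w, eye_h
```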
  • In the embodiment of the present invention, the driver's face can be observed continuously through the infrared CCD cameras, and changes in the driver's facial feature points and visual attention can be analyzed from the video stream to assess the driver's mental state while driving (including fatigue and concentration), yielding a mental-state score. The next step proceeds only when the driver is in a good mental state; otherwise face and eye images continue to be captured and observed cyclically. This prevents visual-tracking data collected in a fatigued state from causing misjudgements after fusion perception.
  • Feature-corner three-dimensional coordinate extraction recovers the three-dimensional coordinates of the feature corner points described above.
  • Because the position and orientation of the binocular infrared CCDs in the world coordinate system are known, the position and orientation of the feature corner points in the binocular imaging system can be obtained, and the relative coordinates of the face corner points then follow from the relationship between the camera coordinate system and the ideal (world) coordinate system.
  • A face-plane coordinate system is established from the coordinates of the eye corners and mouth corners, with the face orientation perpendicular to the face plane.
  • Once the three-dimensional spatial coordinates of the face corner points are obtained, the world coordinates of each corner point can be derived through a series of coordinate transformations.
  • The coordinates of the eyeball center are fixed relative to the face coordinate system, so the eyeball-center coordinates can be determined from the eye-corner and mouth-corner coordinates; accordingly, the eyeball coordinate system is established from the acquired corner coordinates.
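A hedged sketch of recovering 3D corner coordinates from the calibrated binocular pair and building an eye-centered frame from them; `cv2.triangulatePoints` needs the two projection matrices from stereo calibration, and the face-plane construction is one reasonable reading of the text rather than the patent's exact procedure.

```python
import cv2
import numpy as np

def corners_3d(P_left, P_right, pts_left, pts_right):
    """Triangulate matched 2D corner points (2xN float arrays) into 3D camera coordinates."""
    homog = cv2.triangulatePoints(P_left, P_right, pts_left, pts_right)  # 4xN homogeneous
    return (homog[:3] / homog[3]).T                                      # Nx3 Euclidean

def eyeball_frame(inner_eye_l, inner_eye_r, mouth_l, mouth_r):
    """Build a face-plane frame from the two inner eye corners and the two mouth corners."""
    origin = (inner_eye_l + inner_eye_r) / 2.0
    x_axis = inner_eye_r - inner_eye_l                 # axis running across the eyes
    x_axis /= np.linalg.norm(x_axis)
    in_plane = (mouth_l + mouth_r) / 2.0 - origin      # second in-plane direction
    z_axis = np.cross(x_axis, in_plane)                # normal to the face plane (face orientation)
    z_axis /= np.linalg.norm(z_axis)
    y_axis = np.cross(z_axis, x_axis)
    # Rows of the returned matrix map camera/world coordinates into the face (eyeball) frame.
    return origin, np.vstack([x_axis, y_axis, z_axis])
```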
  • Driving gaze-direction calculation: this step solves for the driver's gaze direction and maintains continuous tracking of it.
  • Strictly, the visual axis of the human eye is the line joining the fovea in the middle of the retina and the center of the lens. Because eyeball motion is a complex dynamic process involving translation and forward-backward movement rather than the rotation of a pure sphere, detecting this strictly defined line of sight is very difficult, so the gaze direction is generally defined as the line from the center of the eyeball to the center of the eye's surface.
  • The Purkinje spot does not change greatly with eye movement, so the driver's line of sight can be solved by extracting the coordinates of the pupil and of the Purkinje spot.
  • In the embodiment of the present invention, eyeball-region recognition is performed on the face detection image extracted from the face composite image, and the eyeball-region image is cropped out.
  • Threshold analysis of the eye-region image yields a pupil threshold image and a Purkinje-spot threshold image; the pupil and the Purkinje spot are identified, their center coordinates are calculated, and the pupil-Purkinje-spot positional-relationship mapping function is established.
  • The calculation of the driver's gaze direction in the embodiment of the present invention further includes image-data compensation using head tracking: the spatial position of the head feature points relative to the camera coordinate system is obtained through image recognition, a driver head coordinate system is established, the head pitch angle, yaw angle, roll angle, and three-axis translation data are cached and recorded, data fusion is performed based on the environment model, the head motion data are used to compensate the gaze-tracking data, and finally the three-dimensional gaze direction in space is calculated and output.
  • The calculation of the driving gaze landing point is mainly used to complete the extraction and tracking of the driver's attention and to establish the mapping relationship between the driver's three-dimensional line of sight and the pixels of the environment perception camera's image.
  • The driver's three-dimensional line of sight is expressed in the eyeball coordinate system, while pixel positions in the environment perception camera's image are expressed in its imaging coordinate system, so establishing the mapping between the two involves coordinate matrix transformations.
  • The position of the eyeball coordinate system relative to the imaging coordinate system of the binocular infrared CCDs is determined, and the position of the binocular infrared CCDs' imaging coordinate system relative to the vehicle body is determined; likewise, the position of the environment perception camera's imaging coordinate system relative to that camera is fixed, and the position of the environment perception camera relative to the vehicle body is determined. Therefore, through coordinate matrix transformations the driver's three-dimensional line of sight and the pixel information of the environment perception camera's image can be placed in the same world coordinate system and the mapping relationship between them established; by intersecting the line of sight with the image, the driver's visual landing point is solved. The driver's visual landing point is tracked continuously and saved to the driver gaze-point cache database.
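A sketch of the mapping step: express the gaze ray in the environment camera's frame via the vehicle-fixed extrinsics, then project it with the intrinsic matrix to find the gazed pixel. The matrix names are assumptions, and the fixed look-ahead distance is a stand-in for a proper ray-scene intersection, which would need depth information.

```python
import numpy as np

def gaze_pixel_in_env_camera(eye_origin_w, gaze_dir_w, R_w2c, t_w2c, K, lookahead_m=50.0):
    """Project the driver's 3D gaze ray into the environment camera image.

    R_w2c, t_w2c: world-to-camera extrinsics of the environment camera.
    K: 3x3 intrinsic matrix. Returns (u, v) pixels, or None if the ray
    points away from the image plane.
    """
    origin_c = R_w2c @ eye_origin_w + t_w2c
    dir_c = R_w2c @ gaze_dir_w
    if dir_c[2] <= 1e-6:                 # gaze does not cross the camera's image plane
        return None
    # Sample a point a nominal distance along the ray; this approximates the gazed
    # pixel when the observed object is roughly that far away.
    p = origin_c + dir_c * lookahead_m
    uvw = K @ p
    return uvw[0] / uvw[2], uvw[1] / uvw[2]
```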
  • Fixation is a momentary stationary period of the eyeball accompanied by tiny eye movements, in which the retina is held stably on the target of interest; a fixation lasts at least 100 ms to 200 ms.
  • A saccade is a rapid movement of the eyeball used to shift the fovea of the visual center quickly to a new region of interest, with a duration in the range of 10 ms to 100 ms.
  • Smooth pursuit (trailing) tracking is the eye-movement behavior when the eye tracks a target of interest, with a stable relative motion between the eye and the object.
  • In the embodiment of the present invention, eye-movement analysis is performed on the gaze-target frequency and gaze duration, and the driver's eye-movement state at each moment (fixation, saccade, or smooth pursuit) is marked; based on this marking information, an extended-dimension environment perception image database containing the driver's visual information is constructed.
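A minimal dispersion-and-duration classifier for the three eye-movement states, using the duration figures quoted above; the dispersion threshold and the assumption of pixel-space gaze samples are illustrative.

```python
import numpy as np

def classify_eye_movement(gaze_points, timestamps_ms, disp_thresh_px=30.0):
    """Label a short window of cached gaze samples as fixation, saccade, or smooth pursuit.

    gaze_points: Nx2 pixel coordinates from the gaze-point cache database.
    timestamps_ms: matching timestamps in milliseconds.
    """
    pts = np.asarray(gaze_points, dtype=float)
    duration = timestamps_ms[-1] - timestamps_ms[0]
    dispersion = np.linalg.norm(pts.max(axis=0) - pts.min(axis=0))

    if dispersion < disp_thresh_px and duration >= 100:   # stable for at least ~100-200 ms
        return "fixation"
    if dispersion >= disp_thresh_px and duration <= 100:  # fast jump lasting ~10-100 ms
        return "saccade"
    return "smooth pursuit"                                # steady drift following a moving target
```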
  • In the embodiment of the present invention, an attention neural network module is trained.
  • The neural network module is used to automatically adjust the distribution weights of part of the conventional image-processing neural network so as to adapt the fineness and region of image pixel traversal; it can quickly locate the region of interest in the image and reduce the algorithm's pixel-traversal time.
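One plausible reading of "adjusting the fineness and region of pixel traversal" is to scan densely only inside the gaze-derived region of interest and with a coarser stride elsewhere; the stride values and the binary attention mask are assumptions.

```python
def adaptive_traversal(image, roi_mask, fine_stride=1, coarse_stride=4):
    """Yield (y, x) pixel positions: dense inside the attention ROI, sparse outside."""
    h, w = image.shape[:2]
    for y in range(0, h, fine_stride):
        for x in range(0, w, fine_stride):
            inside = roi_mask[y, x] > 0
            # Outside the ROI, keep only positions on the coarse grid.
            if not inside and (y % coarse_stride or x % coarse_stride):
                continue
            yield y, x
```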
  • In the embodiment of the present invention, pixel features are also classified with the aid of a preset driver eye-movement feature database: the driver's current eye-movement features are obtained and matched against the eye-movement features in the database to determine the image pixel-feature classification of the current perception image, and a specific image-processing method is then selected accordingly. This can improve perception accuracy in harsh environments (strong light, rain, night, and the like).
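A hedged sketch of the look-up described here: match the driver's current eye-movement feature vector against the preset database by nearest neighbor and return the image-processing profile stored for that condition; the feature layout and the profile names are invented for illustration.

```python
import numpy as np

def select_processing_profile(current_features, feature_db, profile_labels):
    """Nearest-neighbor match of eye-movement features against a preset database.

    feature_db: one row per stored condition (e.g. daylight, night, rain).
    profile_labels: image-processing profile name for each row of feature_db.
    """
    dists = np.linalg.norm(np.asarray(feature_db) - np.asarray(current_features), axis=1)
    return profile_labels[int(np.argmin(dists))]

# Hypothetical usage: pick a denoising-heavy pipeline when the match says "night".
# profile = select_processing_profile(feats, db, ["daylight", "night", "rain"])
```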
  • The system of the embodiment of the present invention includes: a facial image acquisition module 100, used to capture the driver's head image with binocular infrared CCDs arranged at different positions in the vehicle and obtain a composite image of the driver's face; a corner-point positioning module 200, used to extract facial features from the composite image of the driver's face, obtain feature corner points, and establish the eyeball coordinate system by locating the feature corner points; a gaze-point acquisition module 300, used to obtain the driver's three-dimensional line of sight based on the eyeball coordinate system, place the driver's three-dimensional line of sight and the pixel information of the environment perception camera's image in the same world coordinate system through a coordinate matrix transformation, establish the mapping relationship between the driver's three-dimensional line of sight and the pixels of the environment perception camera's image, and obtain the driver's visual landing point and save it to the driver gaze-point cache database; an extended-dimension perception marking module 400, used to perform eye-movement analysis of gaze-target frequency and gaze duration based on the driver gaze-point cache database, obtain and mark the eye-movement state, mark the pixel region at the intersection of the driver's visual landing point and the camera image, and construct the extended-dimension environment perception image database; and a perception fusion module 500, which adjusts the distribution weights of the image-processing neural network used during automatic driving according to the extended-dimension environment perception image database, adaptively adjusting the fineness and region of image pixel traversal.
  • The head motion compensation module 600 is used to obtain the spatial position of the driver's head feature points relative to the camera coordinate system through image recognition, establish the driver head coordinate system, record the head pitch angle, yaw angle, roll angle, and three-axis translation data to obtain head motion data, perform data fusion based on the environment model, compensate the gaze-tracking data with the head motion data, and calculate and output the driver's three-dimensional line of sight.
  • The perception image classification processing module 700 is used to obtain the driver's current eye-movement features, compare them with the eye-movement features in the preset driver eye-movement feature database, obtain the pixel-feature classification of the perception image acquired by the environment perception camera, and process the perception image according to the pixel-feature classification.
  • Blocks in the block diagrams and flowchart illustrations support combinations of means for performing the specified functions, combinations of elements or steps for performing the specified functions, and program instruction means for performing the specified functions. It will also be understood that each block of the block diagrams and flowchart illustrations, and combinations of blocks in the block diagrams and flowchart illustrations, can be implemented by special-purpose hardware-based computer systems, or combinations of special-purpose hardware and computer instructions, that perform the specified functions, elements, or steps.
  • Program modules, applications, and the like described herein may include one or more software components, including, for example, software objects, methods, data structures, and the like. Each such software component may include computer-executable instructions that, in response to execution, cause at least a portion of the functions described herein (e.g., one or more operations of the exemplary methods described herein) to be performed.
  • Software components can be coded in any of a variety of programming languages.
  • An exemplary programming language may be a low-level programming language, such as assembly language associated with a particular hardware architecture and/or operating system platform.
  • Software components that include assembly language instructions may need to be converted into executable machine code by an assembler prior to execution by a hardware architecture and/or platform.
  • Another exemplary programming language may be a higher level programming language that is portable across multiple architectures.
  • Software components including higher level programming languages may need to be converted into an intermediate representation by an interpreter or compiler before execution.
  • Other examples of programming languages include, but are not limited to, macro languages, shell or command languages, job control languages, scripting languages, database query or search languages, or report writing languages.
  • a software component containing instructions from one of the above-described programming language examples can be directly executed by an operating system or other software component without first being converted to another form.
  • Software components may be stored as files or other data storage constructs. Software components with similar types or related functions may be stored together, for example, in a particular directory, folder, or library. Software components may be static (eg, preset or fixed) or dynamic (eg, created or modified at execution time).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed are a self-driving-oriented human-machine collaborative perception method and system. The method comprises: capturing head images of a driver using binocular infrared CCDs arranged at different positions in a vehicle, extracting facial features and obtaining feature corner points; establishing a mapping relationship between the three-dimensional line of sight of the driver and pixels of the environment perception camera's image, obtaining the driver's gaze point, and saving it to a fixation-point cache database; on the basis of the fixation-point cache database, performing eye-movement analysis of fixation-target frequency and fixation duration to obtain eye-movement states, constructing an extended-dimension environment perception image database, and adjusting the distribution weights of image-processing neural networks during self-driving. The present invention rapidly locates the region of interest in the image of an intelligent camera by means of visual tracking and a human-eye attention mechanism, and accelerates the information processing speed of the environment perception system by using perception fusion technologies, which can significantly reduce the hardware computing requirements of the perception system and improve the real-time performance of the system, and is thus economical.

Description

Human-machine collaborative perception method and system for autonomous driving
Technical Field
The present invention relates to the technical field of intelligent driving, and in particular to a human-machine collaborative perception method and system for automatic driving.
Background Art
Intelligent driver assistance and even autonomous driving technology are an inevitable trend in the development of future automobiles. In this technical field, the perception system has currently become a bottleneck hindering the development of autonomous driving technology.
At present, a single-sensor solution suffers from inherent deficiencies such as limited information, high uncertainty, and low reliability when dealing with complex environments, making it difficult to meet the stringent robustness, accuracy, and stability requirements of a perception system. In contrast, multi-sensor information fusion can effectively complement uncertain information and has become the mainstream perception solution. Combinations of lidar, millimeter-wave radar, cameras, ultrasonic radar, or multiple sensors of the same type are used to obtain relatively comprehensive information about the vehicle's surroundings, but the unavoidable drawback of such multi-sensor schemes is the large amount of data to be processed per unit time and the high demand on hardware resources, which often makes it difficult to meet the system's real-time requirements and renders the system uneconomical.
Summary of the Invention
The present invention aims to solve at least one of the technical problems existing in the prior art. To this end, the present invention proposes a human-machine collaborative perception method for automatic driving, which can effectively reduce the amount of data processed per unit time and increase the processing speed of the system.
The present invention also proposes a human-machine collaborative perception system for automatic driving that implements the above human-machine collaborative perception method for automatic driving.
The human-machine collaborative perception method for automatic driving according to the embodiment of the first aspect of the present invention includes the following steps: S100, capturing the driver's head image with binocular infrared CCDs arranged at different positions in the vehicle, obtaining a composite image of the driver's face, extracting facial features, obtaining feature corner points, and establishing an eyeball coordinate system by locating the feature corner points; S200, obtaining the driver's three-dimensional line of sight based on the eyeball coordinate system, placing the driver's three-dimensional line of sight and the pixel information of the environment perception camera's image in the same world coordinate system through a coordinate matrix transformation, establishing the mapping relationship between the driver's three-dimensional line of sight and the pixels of the environment perception camera's image, obtaining the driver's visual landing point, and saving it to a driver gaze-point cache database; S300, based on the driver gaze-point cache database, performing eye-movement analysis of gaze-target frequency and gaze duration, obtaining and marking the eye-movement state, marking the pixel region at the intersection of the driver's visual landing point and the camera image, and constructing an extended-dimension environment perception image database; S400, according to the extended-dimension environment perception image database, adjusting the distribution weights of the image-processing neural network used during automatic driving, adaptively adjusting the fineness and region of image pixel traversal.
The human-machine collaborative perception method for automatic driving according to the embodiment of the present invention has at least the following beneficial effects: the region of interest in the intelligent camera's image is quickly located through visual tracking and the human-eye attention mechanism, and perception fusion technology is used to accelerate the information processing speed of the environment perception system, which can significantly reduce the hardware computing requirements of the perception system, improve its real-time performance, and is economical.
According to some embodiments of the present invention, the step S100 includes: S110, capturing the driver's head image with the binocular infrared CCDs, obtaining face images of the driver from different angles at the same moment, and performing panoramic stitching, grayscale processing, and binarization to obtain the composite image of the driver's face; S120, distinguishing the composite image of the driver's face with a face skin-color model to obtain a region to be inspected, matching the region to be inspected against a face model to obtain a face detection image, and storing it in a historical face database; S130, extracting the feature corner points from the face detection image, identifying the inner and outer corners of the two eyes, the two mouth corners, and the two eye center points, and establishing a face plane from the two inner eye corners and the two mouth corners; S140, based on the position and orientation of the binocular infrared CCDs in the world coordinate system, solving the three-dimensional spatial coordinates of the feature corner points, performing a coordinate transformation to obtain their world coordinates, and establishing the eyeball coordinate system from those world coordinates.
According to some embodiments of the present invention, the step S140 further includes: S141, reading the face detection image from the historical face database, analyzing changes in the driver's facial features and visual attention through the video stream, and detecting the driver's mental state to obtain a mental-state score; S142, if the mental-state score is below a set threshold, continuing to capture the driver's head image with the binocular infrared CCDs, obtaining new face detection images, and recalculating the mental-state score; S143, otherwise, solving the three-dimensional spatial coordinates of the feature corner points and establishing the eyeball coordinate system.
According to some embodiments of the present invention, the step S200 includes: S210, performing eyeball-region recognition on the composite image of the driver's face, cropping the eyeball-region image, performing threshold analysis on it to obtain a pupil threshold image and a Purkinje-spot threshold image, identifying the pupil and the Purkinje spot, calculating the center coordinates of the pupil and of the Purkinje spot, and establishing a pupil-Purkinje-spot positional-relationship mapping function; S220, obtaining the driver's three-dimensional line of sight from the pupil-Purkinje-spot positional-relationship mapping function; S230, placing the driver's three-dimensional line of sight and the pixel information of the environment perception camera's image in the same world coordinate system through a coordinate matrix transformation, and establishing the mapping relationship between the driver's three-dimensional line of sight and the pixels of the environment perception camera's image; S240, intersecting the driver's three-dimensional line of sight with the environment perception camera's image to obtain the driver's visual landing point, and saving it to the driver gaze-point cache database.
According to some embodiments of the present invention, the step S220 further includes: S221, obtaining the spatial position of the driver's head feature points relative to the camera coordinate system through image recognition, establishing a driver head coordinate system, and recording the head pitch angle, yaw angle, roll angle, and three-axis translation data to obtain head motion data; S222, performing data fusion based on the environment model, compensating the gaze-tracking data with the head motion data, and calculating and outputting the driver's three-dimensional line of sight.
According to some embodiments of the present invention, the eye-movement states include fixation, saccade, and smooth pursuit (trailing) tracking.
According to some embodiments of the present invention, the step S400 further includes: S410, obtaining the driver's current eye-movement features and comparing them with the eye-movement features stored in a preset driver eye-movement feature database to obtain a pixel-feature classification of the perception image acquired by the environment perception camera; S420, processing the perception image according to the pixel-feature classification.
The human-machine collaborative perception system for automatic driving according to the embodiment of the second aspect of the present invention includes: a facial image acquisition module, configured to capture the driver's head image with binocular infrared CCDs arranged at different positions in the vehicle and obtain a composite image of the driver's face; a corner-point positioning module, configured to extract facial features from the composite image of the driver's face, obtain feature corner points, and establish an eyeball coordinate system by locating the feature corner points; a gaze-point acquisition module, configured to obtain the driver's three-dimensional line of sight based on the eyeball coordinate system, place the driver's three-dimensional line of sight and the pixel information of the environment perception camera's image in the same world coordinate system through a coordinate matrix transformation, establish the mapping relationship between the driver's three-dimensional line of sight and the pixels of the environment perception camera's image, and obtain the driver's visual landing point and save it to a driver gaze-point cache database; an extended-dimension perception marking module, configured to perform eye-movement analysis of gaze-target frequency and gaze duration based on the driver gaze-point cache database, obtain and mark the eye-movement state, mark the pixel region at the intersection of the driver's visual landing point and the camera image, and construct an extended-dimension environment perception image database; and a perception fusion module, which adjusts, according to the extended-dimension environment perception image database, the distribution weights of the image-processing neural network used during automatic driving, adaptively adjusting the fineness and region of image pixel traversal.
The human-machine collaborative perception system for automatic driving according to the embodiment of the present invention has at least the following beneficial effects: the region of interest in the intelligent camera's image is quickly located through visual tracking and the human-eye attention mechanism, and perception fusion technology is used to accelerate the information processing speed of the environment perception system, which can significantly reduce the hardware computing requirements of the perception system, improve its real-time performance, and is economical.
According to some embodiments of the present invention, the system further includes a head motion compensation module, configured to obtain the spatial position of the driver's head feature points relative to the camera coordinate system through image recognition, establish a driver head coordinate system, record the head pitch angle, yaw angle, roll angle, and three-axis translation data to obtain head motion data, perform data fusion based on the environment model, compensate the gaze-tracking data with the head motion data, and calculate and output the driver's three-dimensional line of sight.
According to some embodiments of the present invention, the system further includes a perception image classification processing module, configured to obtain the driver's current eye-movement features, compare them with the eye-movement features stored in the preset driver eye-movement feature database, obtain the pixel-feature classification of the perception image acquired by the environment perception camera, and process the perception image according to the pixel-feature classification.
Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from the following description or be learned through practice of the invention.
Description of the Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of embodiments taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic flowchart of a method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the data processing process in a method according to an embodiment of the present invention;
FIG. 3 is a schematic block diagram of the modules of a system according to an embodiment of the present invention.
Reference numerals:
Facial image acquisition module 100, corner-point positioning module 200, gaze-point acquisition module 300, extended-dimension perception marking module 400, perception fusion module 500, head motion compensation module 600, perception image classification processing module 700.
具体实施方式Detailed ways
下面详细描述本发明的实施例,所述实施例的示例在附图中示出,其中自始至终相同或类似的标号表示相同或类似的元件或具有相同或类似功能的元件。下面通过参考附图描述的实施例是示例性的,仅用于解释本发明,而不能理解为对本发明的限制。The following describes in detail the embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary, only used to explain the present invention, and should not be construed as a limitation of the present invention.
在本发明的描述中,若干的含义是一个或者多个,多个的含义是两个及两个以上,大于、小于、超过等理解为不包括本数,以上、以下、以内等理解为包括本数。如果有描述到第一、第二只是用于区分技术特征为目的,而不能理解为指示或暗示相对重要性或者隐含指明所指示的技术特征的数量或者隐含指明所指示的技术特征的先后关系。In the description of the present invention, the meaning of several is one or more, the meaning of multiple is two or more, greater than, less than, exceeding, etc. are understood as not including this number, above, below, within, etc. are understood as including this number . If it is described that the first and the second are only for the purpose of distinguishing technical features, it cannot be understood as indicating or implying relative importance, or indicating the number of the indicated technical features or the order of the indicated technical features. relation.
名词解释:Glossary:
RIO,region of interest,感兴趣区域。RIO, region of interest, region of interest.
普尔钦斑(Purkinje image),眼球角膜上的一个亮光点,由进入瞳孔的光线在角膜外表面上反射(corneal reflection,CR)而产生。Purkinje image, a bright spot on the cornea of the eye, is produced by the reflection of light entering the pupil on the outer surface of the cornea (corneal reflection, CR).
参照图1,本发明的实施例的方法包括:S100,通过设置在车内的不同位置的双目红外CCD拍摄驾驶员头部影像,获取驾驶员面部合成图像,提取脸部特征,获取特征角点,并通过特征角点的定位建立眼球坐标系;S200,基于眼球坐标系获取驾驶员三维视线,并通过坐标矩阵变换将驾驶员三维视线与环境感知摄像头成像像素信息置于同一世界坐标下,建立驾驶员三维视线与环境感知摄像头 成像上像素点间的映射关系,获得驾驶员视觉落点,保存至驾驶员注视点缓存数据库;S300,基于驾驶员注视点缓存数据库,对注视目标频率与注视时长进行眼动分析,获得眼动状态并标记,并对驾驶员视觉落点与成像的交点的像素区域进行标记,构建拓维环境感知图像数据库;S400,根据拓维环境感知图像数据库,对自动驾驶过程中的图像处理神经网络的分配权重进行调整,自适应调整图像像素遍历精细程度和区域。Referring to FIG. 1 , the method according to the embodiment of the present invention includes: S100 , photographing a driver's head image through binocular infrared CCDs arranged at different positions in the vehicle, obtaining a synthetic image of the driver's face, extracting facial features, and obtaining characteristic angles The eyeball coordinate system is established by locating the feature corners; S200, based on the eyeball coordinate system, the driver's three-dimensional sight line is obtained, and the driver's three-dimensional sight line and the imaging pixel information of the environment perception camera are placed in the same world coordinates through coordinate matrix transformation. Establish the mapping relationship between the driver's 3D line of sight and the pixels on the image of the environment perception camera, obtain the driver's visual landing point, and save it to the driver's gaze point cache database; S300, based on the driver's gaze point cache database, compares the frequency of the gaze target and the gaze point The eye movement analysis is carried out for the duration, and the eye movement status is obtained and marked, and the pixel area of the intersection of the driver's vision and the imaging is marked, and the Tuowei environment perception image database is constructed; S400, according to the Tuowei environment perception image database, the automatic The distribution weights of the image processing neural network in the driving process are adjusted, and the fineness and area of image pixel traversal are adaptively adjusted.
In embodiments of the present invention, data processing is divided roughly into four steps: facial image acquisition by binocular infrared imaging, facial image processing on the GPU, data computation and caching on the CPU, and storage and fusion output. Specifically, the steps comprise image acquisition, facial feature detection, feature corner detection, extraction of feature three-dimensional coordinates, driving gaze direction calculation, driving gaze landing point calculation, and fusion output; see FIG. 2. The binocular infrared CCD cameras photograph the driver's face under all driving conditions, and the video images are collected and buffered. The GPU-side facial image processing comprises detecting the driver's facial features and extracting the coordinates of the feature corner points; the CPU computes and locates the ROI of the driver's line of sight within the environment camera image; and the storage and fusion output stage reconstructs and dimension-expands the perception data from the cached history of the driver's visual ROIs, and fuses and outputs the expanded data based on a visual attention mechanism.
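By way of illustration only, the four stages above can be pictured as a simple software pipeline. The sketch below is a structural outline, not the patented implementation; every stage function in it (acquire_face_image, extract_corners, locate_gaze_roi, fuse_and_output) is a hypothetical placeholder for the operations described in the following paragraphs.

```python
# Minimal pipeline sketch of the four processing stages described above.
# All stage functions are hypothetical placeholders; in a real system they
# would run on the camera front end, the GPU, and the CPU respectively.

def acquire_face_image(left_frame, right_frame):
    """Stitch, grayscale and binarize the two infrared views (stage 1)."""
    raise NotImplementedError

def extract_corners(face_image):
    """Detect the face and return eye/mouth feature corners (stage 2, GPU)."""
    raise NotImplementedError

def locate_gaze_roi(corners, env_frame):
    """Solve the 3D gaze ray and intersect it with the perception image (stage 3, CPU)."""
    raise NotImplementedError

def fuse_and_output(roi_history, env_frame):
    """Re-weight the perception network around the cached gaze ROIs (stage 4)."""
    raise NotImplementedError

def run_once(left_frame, right_frame, env_frame, roi_history):
    face = acquire_face_image(left_frame, right_frame)
    corners = extract_corners(face)
    roi = locate_gaze_roi(corners, env_frame)
    if roi is not None:          # the gaze ray intersects the perception image
        roi_history.append(roi)
    return fuse_and_output(roi_history, env_frame)
```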
Image acquisition at the input end mainly collects and buffers the video stream of the driver's face. Acquisition must not interfere with the driver's own behaviour, so non-wearable acquisition methods that intrude as little as possible on the driver are required. In addition, the lighting environment in the cockpit is usually complex and harsh, and ordinary cameras are usable only in limited scenes, so infrared cameras that are insensitive to the lighting environment must be used. Moreover, the driver's own movements produce large head deflections during driving; to avoid a single camera failing to capture the driver's eyes, binocular infrared CCD cameras installed at different positions in the vehicle photograph the driver's head, yielding facial images of the driver from different angles at the same moment. Each camera may capture only part of the face. In some embodiments of the present invention, the images captured by the binocular cameras are panoramically stitched, grayscale-processed, and binarized into one complete composite image of the driver's face, which is then passed to the next processing stage.
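A minimal sketch of this front end is given below, assuming the two infrared views arrive as numpy arrays. Real panoramic stitching requires calibration; here it is simplified to a resize-and-concatenate step, and Otsu thresholding stands in for the binarization, neither of which is prescribed by the text above.

```python
import cv2
import numpy as np

def synthesize_face_image(left_ir: np.ndarray, right_ir: np.ndarray) -> np.ndarray:
    """Toy stand-in for the stitch / grayscale / binarize front end.

    A production system would use calibrated panoramic stitching; here the two
    infrared views are simply resized to the same height and concatenated so
    that later steps have a single image to work on (an assumption, not the
    patented method).
    """
    h = min(left_ir.shape[0], right_ir.shape[0])
    left = cv2.resize(left_ir, (left_ir.shape[1], h))
    right = cv2.resize(right_ir, (right_ir.shape[1], h))
    stitched = np.hstack([left, right])

    gray = cv2.cvtColor(stitched, cv2.COLOR_BGR2GRAY) if stitched.ndim == 3 else stitched
    # Otsu's threshold adapts the binarization to the infrared illumination level.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary
```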
Facial feature detection extracts the position of the driver's face in the composite image, serves as preprocessing for the subsequent extraction of facial feature points, and provides basic data for gaze direction calculation. Note that face detection must keep continuously tracking the driver's facial features, which increases system speed and reduces the false detection rate. In the method of an embodiment of the present invention, a face skin-colour model separates the facial region of the composite image from the background, yielding candidate regions that may contain a face; the candidate regions are then matched against a face model, the degree of match with the face model is obtained by comparison, and regions likely to contain a face are extracted according to that degree of match. Once face detection succeeds, feature corner detection begins; if face detection fails, the face detection loop continues. During facial feature detection, the method of an embodiment of the present invention also stores the captured images cyclically to build a historical face database, so as to provide time-series information for the subsequent monitoring of the driver's mental state.
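As a hedged illustration of the skin-colour pre-filter, the sketch below segments candidate face regions in YCrCb space. It assumes a colour frame and generic skin bounds; the text above does not specify the colour space, the threshold values, or the face-model matching step that would follow.

```python
import cv2
import numpy as np

def candidate_face_regions(bgr_frame: np.ndarray, min_area: int = 2000):
    """Rough skin-colour segmentation as a pre-filter for face detection.

    The YCrCb bounds below are illustrative values for a generic skin model,
    not parameters taken from the patent text.
    """
    ycrcb = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2YCrCb)
    mask = cv2.inRange(ycrcb, (0, 133, 77), (255, 173, 127))
    # Remove small speckles before extracting candidate regions.
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))

    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    boxes = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]
    return boxes  # (x, y, w, h) regions to be checked against the face model
```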
Feature corner extraction builds on facial feature detection. The face detection image is cropped from the driver's facial image; the inner and outer corners of both eyes, the two mouth corners, and the two eye centre points are extracted from it; these feature corners are located; and a face coordinate system is established. The specific procedure is as follows. First, after the face detection image is obtained, the eye region is coarsely located according to the "three sections and five eyes" rule of facial proportions, so as to narrow the eye detection range and improve detection accuracy and speed. Then dynamic threshold segmentation, gradient transformation, and similar techniques are used to extract the feature corners such as the eye corners, and the face plane is established from the two inner eye corners and the two mouth corners. Once the face region and the eyes have been detected, the driver's face can be observed continuously by the infrared CCD cameras, and changes in the driver's facial feature points and visual attention are analysed from the video stream to assess the driver's mental state during driving (including fatigue and driving concentration) and obtain a mental state score. The next step is carried out only if the driver's mental state is good; otherwise face and eye images continue to be captured and observed in a loop. This prevents visual tracking data collected while the driver is fatigued from being fused into the perception and causing misjudgement.
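The coarse eye localisation and corner-candidate extraction might look like the sketch below. The facial-proportion fractions and the use of cv2.goodFeaturesToTrack are illustrative assumptions standing in for the dynamic-threshold and gradient-based extraction described above.

```python
import cv2
import numpy as np

def coarse_eye_band(face_box):
    """Coarsely locate the eye band inside a detected face box.

    The "three sections and five eyes" proportion rule is used only as a rough
    prior: the eyes are assumed to lie in roughly the second vertical section
    of the face and between about 15% and 85% of its width.  The exact
    fractions are illustrative, not values specified in the patent.
    """
    x, y, w, h = face_box
    top = y + int(0.20 * h)       # skip the forehead (first "section")
    bottom = y + int(0.50 * h)    # the eye band ends well above the nose tip
    left = x + int(0.15 * w)
    right = x + int(0.85 * w)
    return left, top, right - left, bottom - top

def eye_corner_candidates(gray_face: np.ndarray, face_box, max_corners: int = 8):
    """Corner candidates inside the coarse eye band (hypothetical helper)."""
    ex, ey, ew, eh = coarse_eye_band(face_box)
    band = gray_face[ey:ey + eh, ex:ex + ew]
    corners = cv2.goodFeaturesToTrack(band, maxCorners=max_corners,
                                      qualityLevel=0.01, minDistance=5)
    if corners is None:
        return []
    # Shift band-local coordinates back into the full-face image.
    return [(int(cx) + ex, int(cy) + ey) for cx, cy in corners.reshape(-1, 2)]
```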
Extraction of the three-dimensional coordinates of the feature corners means extracting the 3D coordinates of the corner points described above. With a calibrated binocular infrared CCD camera system, the positions and orientations of these feature corners in the binocular imaging system can be obtained, and the relative coordinates of the facial corner points are then obtained from the relationship between the camera coordinate system and the ideal coordinate system. A face plane coordinate system is established from the coordinates of the eye corners and the mouth corners, with the face orientation perpendicular to the face plane. Next, from the position and orientation of the cameras in the world coordinate system, the three-dimensional spatial coordinates of the facial corner points are solved, and the world coordinates of each corner are obtained through a series of coordinate transformations. While the eye rotates, the coordinates of the eyeball centre relative to the face coordinate system can essentially be assumed to remain unchanged, so the coordinates of the eyeball centre can be determined from the coordinates of the eye corners and the mouth corners. On this basis, the eyeball coordinate system is established from the acquired corner coordinates.
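A compact sketch of the stereo geometry is shown below: matched corner pixels are triangulated with the calibrated projection matrices, and the face-plane normal (taken as the face orientation) is derived from the two inner eye corners and the two mouth corners. The helper names and the exact chain of transforms are assumptions, not the patented sequence of coordinate conversions.

```python
import numpy as np
import cv2

def triangulate_corners(P_left, P_right, pts_left, pts_right):
    """Recover 3D feature-corner positions from a calibrated stereo pair.

    P_left / P_right are the 3x4 projection matrices of the two infrared
    cameras; pts_left / pts_right are 2xN float arrays of matched corner
    pixels.  Returns an Nx3 array of points in the stereo rig's frame.
    """
    homog = cv2.triangulatePoints(P_left, P_right, pts_left, pts_right)  # 4xN
    return (homog[:3] / homog[3]).T

def face_plane_normal(inner_eye_l, inner_eye_r, mouth_l, mouth_r):
    """Face plane from the two inner eye corners and the two mouth corners;
    the face orientation is taken as the unit normal of that plane."""
    eye_mid = (np.asarray(inner_eye_l) + np.asarray(inner_eye_r)) / 2.0
    mouth_mid = (np.asarray(mouth_l) + np.asarray(mouth_r)) / 2.0
    u = np.asarray(inner_eye_r) - np.asarray(inner_eye_l)  # across the face
    v = mouth_mid - eye_mid                                # down the face
    n = np.cross(u, v)
    return n / np.linalg.norm(n)
```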
Driving gaze direction calculation is the process of solving for the driver's gaze direction and keeping it continuously tracked. The visual axis of the human eye is the direction of the line connecting the fovea in the middle of the retina with the centre of the lens. Because eye movement is a complex dynamic process that mixes translation with forward and backward motion rather than the rotation of a pure sphere, detecting the strictly defined gaze direction is very difficult; the gaze direction is therefore generally defined as the direction of the line from the eyeball centre to the centre of the eye surface. According to the principle of the pupil-corneal reflection method, the Purkinje image does not change greatly as the eye moves, so the driver's gaze direction can be solved by extracting the coordinates of the pupil and the Purkinje image. Specifically, eyeball region recognition is performed on the face detection image extracted from the facial composite image, and the eyeball region image is cropped. Threshold analysis is then applied to the eyeball region image to obtain a pupil threshold image and a Purkinje image threshold image, the pupil and the Purkinje image are identified, the coordinates of the pupil centre and the Purkinje image centre are computed, and a pupil-Purkinje image position mapping function is established. In embodiments of the present invention, the calculation of the driving gaze direction further includes image data compensation using head tracking: the spatial positions of head feature points relative to the camera coordinate system are obtained by image recognition, a driver head coordinate system is established, the head pitch angle, yaw angle, roll angle, and three-axis translation data of the three-dimensional coordinate system are buffered and recorded, data fusion is performed based on the environment model, the head motion data are used to compensate the gaze tracking data, and the three-dimensional gaze direction is finally computed and output.
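For the pupil-corneal-reflection step, a minimal thresholding sketch is given below. It assumes an infrared eye crop in which the pupil appears dark and the glint appears bright; the fixed threshold values are illustrative, since the text above only calls for threshold analysis without fixing numbers.

```python
import cv2
import numpy as np

def pupil_and_glint_centers(eye_gray: np.ndarray,
                            pupil_thresh: int = 40,
                            glint_thresh: int = 220):
    """Estimate pupil and Purkinje-image (glint) centres from an IR eye crop.

    Under infrared illumination the pupil is dark and the corneal glint is
    bright, so two fixed thresholds followed by centroid extraction give a
    rough estimate.  The threshold values are illustrative assumptions.
    """
    def centroid(mask):
        m = cv2.moments(mask, binaryImage=True)
        if m["m00"] == 0:
            return None
        return (m["m10"] / m["m00"], m["m01"] / m["m00"])

    _, pupil_mask = cv2.threshold(eye_gray, pupil_thresh, 255, cv2.THRESH_BINARY_INV)
    _, glint_mask = cv2.threshold(eye_gray, glint_thresh, 255, cv2.THRESH_BINARY)
    return centroid(pupil_mask), centroid(glint_mask)

# The pupil-glint offset vector is then fed into the calibrated mapping
# function that yields the gaze direction.
```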
Driving gaze landing point calculation mainly serves to extract and track the driver's attention and to establish the mapping between the driver's three-dimensional line of sight and the pixels of the environment-perception camera image. The driver's three-dimensional line of sight is defined in the eyeball coordinate system, while pixel positions in the environment-perception camera image are defined in the imaging coordinate system, so establishing the mapping between the two involves coordinate matrix transformation. The position of the eyeball coordinate system relative to the imaging coordinate system of the binocular infrared CCD cameras is known, and the coordinates of that imaging coordinate system relative to the vehicle body are known; likewise, the position of the environment-perception camera's imaging coordinate system relative to the camera is known, and the position of the environment-perception camera relative to the vehicle body is known. Therefore, by placing the driver's three-dimensional line of sight and the pixel information of the environment-perception camera image under the same world coordinates through coordinate matrix transformation, the mapping between the three-dimensional line of sight and the camera's pixel information can be established; intersecting the three-dimensional line of sight with the environment-perception camera image then yields the driver's visual landing point. The driver's visual landing point is tracked continuously and saved to the driver gaze point cache database. It will be appreciated that, in the method of embodiments of the present invention, if an intersection between the driver's three-dimensional line of sight and the environment-perception camera image is detected, i.e. the visual landing point is solved successfully, the next step of data fusion proceeds; otherwise the gaze point continues to be solved in a loop.
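The coordinate chain can be illustrated with the sketch below, which maps the gaze ray into the environment camera and projects it to a pixel. The assumed_range_m parameter, which places the viewed target at a fixed distance along the ray, is a simplifying assumption introduced here; the patent instead solves the intersection through its own chain of coordinate-matrix transforms.

```python
import numpy as np

def gaze_pixel_in_env_camera(eye_center_w, gaze_dir_w, T_w_to_cam, K,
                             assumed_range_m: float = 30.0):
    """Map the driver's 3D gaze ray to a pixel of the environment camera.

    eye_center_w    : eyeball centre in world coordinates, shape (3,)
    gaze_dir_w      : unit gaze direction in world coordinates, shape (3,)
    T_w_to_cam      : 4x4 rigid transform from world to the env-camera frame
    K               : 3x3 intrinsic matrix of the environment camera
    assumed_range_m : distance along the ray at which the viewed target is
                      assumed to lie (illustrative simplification)

    Returns (u, v) pixel coordinates, or None if the point falls behind the camera.
    """
    target_w = np.asarray(eye_center_w) + assumed_range_m * np.asarray(gaze_dir_w)
    target_h = np.append(target_w, 1.0)
    target_cam = T_w_to_cam @ target_h          # into the camera frame
    if target_cam[2] <= 0:                      # behind the image plane: no intersection
        return None
    uvw = K @ target_cam[:3]
    return float(uvw[0] / uvw[2]), float(uvw[1] / uvw[2])
```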
Information fusion extracts and combines useful information on the basis of the registration between the driver's gaze point and the image pixels. In vision theory, eye movement is divided roughly into three forms: fixation, saccade, and smooth pursuit, and the three forms represent different subjective intentions of the driver. Fixation is a momentary standstill of the eyeball accompanied by tiny eye movements; it is a stable retinal eye movement on a target of interest, with a dwell time of at least about 100 ms to 200 ms. A saccade is a rapid eye movement used to shift the fovea of the visual centre quickly to a new region of interest, with a duration in the range of 10 ms to 100 ms. Smooth pursuit is the eye-movement behaviour when the eye tracks a moving target of interest; a stable relationship is then maintained between the eye and the object. Based on the driver gaze point cache database, eye-movement analysis of gaze target frequency and gaze duration is performed and the driver's current eye-movement state (fixation, saccade, or pursuit) is labelled; at the same time, the pixel region at the intersection of the driver's visual landing point and the camera image is labelled. Based on this label information, an extended-dimension environment-perception image database containing the driver's visual information is constructed.
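A hedged sketch of the event labelling is shown below. The duration bounds follow the figures quoted in this paragraph; the dispersion threshold that separates fixation from smooth pursuit is an illustrative assumption.

```python
from dataclasses import dataclass

@dataclass
class GazeEvent:
    duration_ms: float     # how long the gaze stayed within the event
    dispersion_px: float   # spread of the gaze samples, in image pixels

def classify_eye_movement(event: GazeEvent,
                          fixation_min_ms: float = 100.0,
                          saccade_max_ms: float = 100.0,
                          fixation_max_dispersion_px: float = 25.0) -> str:
    """Label a gaze event as fixation, saccade, or smooth pursuit.

    Duration bounds follow the figures quoted above (fixations of roughly
    100-200 ms or more, saccades of roughly 10-100 ms); the dispersion
    threshold is an illustrative assumption, not a value from the patent.
    """
    if event.duration_ms >= fixation_min_ms and event.dispersion_px <= fixation_max_dispersion_px:
        return "fixation"
    if event.duration_ms <= saccade_max_ms:
        return "saccade"
    return "smooth_pursuit"

# Each classified event, together with the labelled pixel region around the
# gaze point, is what gets written into the extended-dimension database.
```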
From the extended-dimension environment-perception image data, an attention neural network module is trained. This module automatically adjusts the weight allocation of part of a conventional image-processing neural network, so as to adapt the fineness and the regions of image pixel traversal, quickly locating the region of interest in the image under the pixel labels and reducing the algorithm's pixel traversal time. In the method of embodiments of the present invention, the perception images obtained from the environment-perception camera are additionally given an auxiliary pixel-feature classification using a preset driver eye-movement feature database: the current driver eye-movement features are obtained, matching eye-movement features are looked up in the preset database, and the image pixel feature class of the current perception image is determined, so that a specific image-processing method can be selected for it. This improves the accuracy of environment perception in harsh conditions (difficult lighting, rain, night, and so on).
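The effect of such an attention module can be approximated by a simple gaze-centred weighting map, as sketched below. The Gaussian form, its parameters, and the idea of multiplying the map into a feature map are illustrative assumptions; the patent trains a neural module to produce the re-weighting rather than using a fixed function.

```python
import numpy as np

def gaze_attention_mask(height: int, width: int, gaze_uv,
                        sigma_px: float = 40.0, floor: float = 0.2) -> np.ndarray:
    """Soft spatial weighting map centred on the driver's gaze pixel.

    Pixels near the gaze point get weight 1.0, decaying to `floor` elsewhere,
    so a downstream detector can spend finer traversal on the attended region.
    The Gaussian shape and its parameters are illustrative choices only.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    d2 = (xs - gaze_uv[0]) ** 2 + (ys - gaze_uv[1]) ** 2
    gauss = np.exp(-d2 / (2.0 * sigma_px ** 2))
    return floor + (1.0 - floor) * gauss

# Example: scale an HxWxC feature map before the detection head.
# feat = feat * gaze_attention_mask(feat.shape[0], feat.shape[1], (u, v))[..., None]
```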
本发明的实施例的***,参照图3,包括:面部图像采集模块100,用于通过设置在车内的不同位置的双目红外CCD拍摄驾驶员头部影像,获取驾驶员面部合成图像;角点定位模块200,用于从驾驶员面部合成图像中提取脸部特征, 获取特征角点,并通过特征角点的定位建立眼球坐标系;注视点获取模块300,用于基于眼球坐标系获取驾驶员三维视线,并通过坐标矩阵变换将驾驶员三维视线与环境感知摄像头成像像素信息置于同一世界坐标下,建立驾驶员三维视线与环境感知摄像头成像上像素点间的映射关系,获得驾驶员视觉落点,保存至驾驶员注视点缓存数据库;拓维感知标记模块400,用于基于驾驶员注视点缓存数据库,对注视目标频率与注视时长进行眼动分析,获得眼动状态并标记,并对驾驶员视觉落点与成像的交点的像素区域进行标记,构建拓维环境感知图像数据库;感知融合模块500,根据拓维环境感知图像数据库,对自动驾驶过程中的图像处理神经网络的分配权重进行调整,自适应调整图像像素遍历精细程度和区域。头部运动补偿模块600,用于通过图像识别获取驾驶员头部特征点相对于相机坐标系的空间位置,建立驾驶员头部坐标系,对头部俯仰角、偏转角、侧倾角、三轴平动数据进行记录,得到头部运动数据,以及,基于环境模型进行数据融合,通过头部运动数据补偿视线追踪数据,推算并输出驾驶员三维视线。感知图像分类处理模块700,用于获取当前驾驶员眼动特征,并与预设驾驶员眼动特征数据库的数据眼动特征进行比对,获取通过环境感知摄像头获取的感知图像的像素特征分类,并根据像素特征分类,对感知图像进行处理。The system of the embodiment of the present invention, with reference to FIG. 3 , includes: a facial image acquisition module 100, which is used to capture the driver's head image through binocular infrared CCDs arranged at different positions in the vehicle to obtain a composite image of the driver's face; The point positioning module 200 is used to extract facial features from the synthetic image of the driver's face, obtain characteristic corner points, and establish an eyeball coordinate system through the positioning of the characteristic corner points; the gaze point acquisition module 300 is used to obtain driving based on the eyeball coordinate system. The driver's three-dimensional sight line and the image pixel information of the environment perception camera are placed in the same world coordinates through coordinate matrix transformation, and the mapping relationship between the driver's three-dimensional sight line and the pixels on the environment perception camera imaging is established to obtain the driver's vision. The landing point is saved to the driver's gaze point cache database; the Tuowei perception marking module 400 is used to analyze the gaze target frequency and gaze duration based on the driver's gaze point cache database, obtain the eye movement state and mark it, and perform eye movement analysis. The pixel area of the intersection of the driver's vision and the imaging is marked, and a Tuowei environment perception image database is constructed; the perception fusion module 500, according to the Tuowei environment perception image database, assigns weights to the image processing neural network in the process of automatic driving. Adjust, adaptively adjust the fineness and area of image pixel traversal. The head motion compensation module 600 is used to obtain the spatial position of the driver's head feature point relative to the camera coordinate system through image recognition, establish the driver's head coordinate system, and measure the head pitch angle, yaw angle, roll angle, three-axis The translation data is recorded to obtain the head motion data, and the data fusion is performed based on the environment model, the gaze tracking data is compensated by the head motion data, and the driver's three-dimensional sight line is calculated and output. The perceptual image classification processing module 700 is used to obtain the current driver's eye movement feature, and compare it with the data eye movement feature of the preset driver's eye movement feature database to obtain the pixel feature classification of the perceptual image obtained by the environment perception camera, And according to the pixel feature classification, the perceptual image is processed.
Although specific embodiments are described herein, those of ordinary skill in the art will recognise that many other modifications or alternative embodiments are likewise within the scope of the present disclosure. For example, any of the functions and/or processing capabilities described in connection with a particular device or component may be performed by any other device or component. In addition, although various illustrative implementations and architectures have been described in accordance with embodiments of the present disclosure, those of ordinary skill in the art will recognise that many other modifications to the illustrative implementations and architectures described herein are also within the scope of the present disclosure.
Certain aspects of the present disclosure are described above with reference to block diagrams and flowcharts of systems, methods, and/or computer program products according to exemplary embodiments. It will be understood that one or more blocks of the block diagrams and flowcharts, and combinations of blocks in the block diagrams and flowcharts, can each be implemented by executing computer-executable program instructions. Likewise, according to some embodiments, some blocks of the block diagrams and flowcharts may not need to be performed in the order shown, or may not all need to be performed. In addition, components and/or operations beyond those shown in the blocks of the block diagrams and flowcharts may be present in certain embodiments.
Accordingly, blocks of the block diagrams and flowcharts support combinations of means for performing the specified functions, combinations of elements or steps for performing the specified functions, and program instruction means for performing the specified functions. It will also be understood that each block of the block diagrams and flowcharts, and combinations of blocks in the block diagrams and flowcharts, can be implemented by special-purpose hardware computer systems that perform the specified functions, elements, or steps, or by combinations of special-purpose hardware and computer instructions.
The program modules, applications, and the like described herein may include one or more software components, including, for example, software objects, methods, and data structures. Each such software component may include computer-executable instructions that, in response to execution, cause at least a portion of the functions described herein (for example, one or more operations of the illustrative methods described herein) to be performed.
A software component may be coded in any of a variety of programming languages. One illustrative programming language may be a low-level programming language, such as an assembly language associated with a particular hardware architecture and/or operating-system platform. A software component comprising assembly-language instructions may need to be converted into executable machine code by an assembler prior to execution by the hardware architecture and/or platform. Another illustrative programming language may be a higher-level programming language that is portable across multiple architectures. A software component comprising a higher-level programming language may need to be converted into an intermediate representation by an interpreter or a compiler before execution. Other examples of programming languages include, but are not limited to, a macro language, a shell or command language, a job control language, a scripting language, a database query or search language, and a report-writing language. In one or more illustrative embodiments, a software component containing instructions in one of the above programming language examples may be executed directly by an operating system or another software component without first being converted into another form.
A software component may be stored as a file or other data storage construct. Software components of a similar type or with related functionality may be stored together, for example in a particular directory, folder, or library. Software components may be static (for example, preset or fixed) or dynamic (for example, created or modified at execution time).
The embodiments of the present invention have been described in detail above with reference to the accompanying drawings, but the present invention is not limited to the above embodiments; various changes may also be made within the knowledge of those of ordinary skill in the art without departing from the spirit of the present invention.

Claims (10)

  1. A human-machine collaborative perception method for autonomous driving, characterized in that it comprises the following steps:
    S100, capturing images of the driver's head with binocular infrared CCD cameras mounted at different positions in the vehicle, obtaining a composite image of the driver's face, extracting facial features, obtaining feature corner points, and establishing an eyeball coordinate system from the located feature corner points;
    S200, obtaining the driver's three-dimensional line of sight based on the eyeball coordinate system, placing the driver's three-dimensional line of sight and the pixel information of the environment-perception camera image under the same world coordinates through coordinate matrix transformation, establishing the mapping between the driver's three-dimensional line of sight and the pixels of the environment-perception camera image, obtaining the driver's visual landing point, and saving it to a driver gaze point cache database;
    S300, based on the driver gaze point cache database, performing eye-movement analysis of gaze target frequency and gaze duration, obtaining and labelling the eye-movement state, labelling the pixel region at the intersection of the driver's visual landing point and the camera image, and constructing an extended-dimension environment-perception image database;
    S400, according to the extended-dimension environment-perception image database, adjusting the weight allocation of the image-processing neural network used during autonomous driving, so as to adaptively adjust the fineness and the regions of image pixel traversal.
  2. The human-machine collaborative perception method for autonomous driving according to claim 1, characterized in that step S100 comprises:
    S110, capturing images of the driver's head with the binocular infrared CCD cameras to obtain images of the driver's face from different angles at the same moment, and performing panoramic stitching, grayscale processing, and binarization to obtain the composite image of the driver's face;
    S120, segmenting the composite image of the driver's face by means of a face skin-colour model to obtain a region to be inspected, matching the region to be inspected against a face model to obtain a face detection image, and storing it in a historical face database;
    S130, extracting the feature corner points from the face detection image, identifying the inner and outer corners of both eyes, the two mouth corners, and the two eye centre points, and establishing a face plane from the two inner eye corners and the two mouth corners;
    S140, based on the position and orientation of the binocular infrared CCD cameras in the world coordinate system, solving the three-dimensional spatial coordinates of the feature corner points, performing coordinate transformation to obtain the world coordinates of the feature corner points, and establishing the eyeball coordinate system from the world coordinates of the feature corner points.
  3. The human-machine collaborative perception method for autonomous driving according to claim 2, characterized in that step S140 further comprises:
    S141, reading the face detection image from the historical face database, analysing changes in the driver's facial features and visual attention from the video stream, detecting the driver's mental state, and obtaining a mental state score;
    S142, if the mental state score is less than a set threshold, continuing to capture images of the driver's head with the binocular infrared CCD cameras to obtain the face detection image, and recalculating the mental state score;
    S143, otherwise, solving the three-dimensional spatial coordinates of the feature corner points and establishing the eyeball coordinate system.
  4. The human-machine collaborative perception method for autonomous driving according to claim 1, characterized in that step S200 comprises:
    S210, performing eyeball region recognition on the composite image of the driver's face, cropping the eyeball region image, performing threshold analysis on the eyeball region image to obtain a pupil threshold image and a Purkinje image threshold image respectively, identifying the pupil and the Purkinje image, computing the coordinates of the pupil centre and the Purkinje image centre, and establishing a pupil-Purkinje image position mapping function;
    S220, obtaining the driver's three-dimensional line of sight from the pupil-Purkinje image position mapping function;
    S230, placing the driver's three-dimensional line of sight and the pixel information of the environment-perception camera image under the same world coordinates through coordinate matrix transformation, and establishing the mapping between the driver's three-dimensional line of sight and the pixels of the environment-perception camera image;
    S240, intersecting the driver's three-dimensional line of sight with the environment-perception camera image to obtain the driver's visual landing point, and saving it to the driver gaze point cache database.
  5. The human-machine collaborative perception method for autonomous driving according to claim 4, characterized in that step S220 further comprises:
    S221, obtaining, by image recognition, the spatial positions of the driver's head feature points relative to the camera coordinate system, establishing a driver head coordinate system, and recording the head pitch angle, yaw angle, roll angle, and three-axis translation data to obtain head motion data;
    S222, performing data fusion based on the environment model, compensating the gaze tracking data with the head motion data, and computing and outputting the driver's three-dimensional line of sight.
  6. The human-machine collaborative perception method for autonomous driving according to claim 1, characterized in that the eye-movement state comprises: fixation, saccade, and smooth pursuit.
  7. The human-machine collaborative perception method for autonomous driving according to claim 1, characterized in that step S400 further comprises:
    S410, obtaining the current driver eye-movement features, comparing them with the eye-movement features stored in a preset driver eye-movement feature database, and obtaining the pixel feature class of the perception image acquired by the environment-perception camera;
    S420, processing the perception image according to the pixel feature class.
  8. A human-machine collaborative perception system for autonomous driving, characterized in that it comprises:
    a facial image acquisition module, configured to capture images of the driver's head with binocular infrared CCD cameras mounted at different positions in the vehicle and obtain a composite image of the driver's face;
    a corner point locating module, configured to extract facial features from the composite image of the driver's face, obtain feature corner points, and establish an eyeball coordinate system from the located feature corner points;
    a gaze point acquisition module, configured to obtain the driver's three-dimensional line of sight based on the eyeball coordinate system, place the driver's three-dimensional line of sight and the pixel information of the environment-perception camera image under the same world coordinates through coordinate matrix transformation, establish the mapping between the driver's three-dimensional line of sight and the pixels of the environment-perception camera image, obtain the driver's visual landing point, and save it to a driver gaze point cache database;
    an extended-dimension perception labelling module, configured to perform, based on the driver gaze point cache database, eye-movement analysis of gaze target frequency and gaze duration, obtain and label the eye-movement state, label the pixel region at the intersection of the driver's visual landing point and the camera image, and construct an extended-dimension environment-perception image database;
    a perception fusion module, configured to adjust, according to the extended-dimension environment-perception image database, the weight allocation of the image-processing neural network used during autonomous driving, and adaptively adjust the fineness and the regions of image pixel traversal.
  9. The human-machine collaborative perception system for autonomous driving according to claim 8, characterized in that it comprises:
    a head motion compensation module, configured to obtain, by image recognition, the spatial positions of the driver's head feature points relative to the camera coordinate system, establish a driver head coordinate system, record the head pitch angle, yaw angle, roll angle, and three-axis translation data to obtain head motion data, perform data fusion based on the environment model, compensate the gaze tracking data with the head motion data, and compute and output the driver's three-dimensional line of sight.
  10. The human-machine collaborative perception system for autonomous driving according to claim 8, characterized in that it comprises:
    a perception image classification processing module, configured to obtain the current driver eye-movement features, compare them with the eye-movement features stored in a preset driver eye-movement feature database, obtain the pixel feature class of the perception image acquired by the environment-perception camera, and process the perception image according to the pixel feature class.
PCT/CN2021/098223 2020-11-03 2021-06-04 Self-driving-oriented human-machine collaborative perception method and system WO2022095440A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011206771.8 2020-11-03
CN202011206771.8A CN112380935B (en) 2020-11-03 2020-11-03 Man-machine collaborative sensing method and system for automatic driving

Publications (1)

Publication Number Publication Date
WO2022095440A1 true WO2022095440A1 (en) 2022-05-12

Family

ID=74577593

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/098223 WO2022095440A1 (en) 2020-11-03 2021-06-04 Self-driving-oriented human-machine collaborative perception method and system

Country Status (2)

Country Link
CN (1) CN112380935B (en)
WO (1) WO2022095440A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112380935B (en) * 2020-11-03 2023-05-26 深圳技术大学 Man-machine collaborative sensing method and system for automatic driving
CN113139443B (en) * 2021-04-08 2023-12-22 武汉理工大学 Automatic identification and matching method for traffic targets facing forward video of eye tracker
CN113837027A (en) * 2021-09-03 2021-12-24 东风柳州汽车有限公司 Driving assistance sensing method, device, equipment and storage medium
CN114022946B (en) * 2022-01-06 2022-05-03 深圳佑驾创新科技有限公司 Sight line measuring method and device based on binocular camera
CN115909254B (en) * 2022-12-27 2024-05-10 钧捷智能(深圳)有限公司 DMS system based on camera original image and image processing method thereof
CN116524581B (en) * 2023-07-05 2023-09-12 南昌虚拟现实研究院股份有限公司 Human eye image facula classification method, system, equipment and storage medium

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101628394B1 (en) * 2010-11-22 2016-06-08 현대자동차주식회사 Method for tracking distance of eyes of driver
US20150339589A1 (en) * 2014-05-21 2015-11-26 Brain Corporation Apparatus and methods for training robots utilizing gaze-based saliency maps
CN114666499A (en) * 2016-05-11 2022-06-24 索尼公司 Image processing apparatus, image processing method, and movable body
CN108229284B (en) * 2017-05-26 2021-04-09 北京市商汤科技开发有限公司 Sight tracking and training method and device, system, electronic equipment and storage medium
CN109492514A (en) * 2018-08-28 2019-03-19 初速度(苏州)科技有限公司 A kind of method and system in one camera acquisition human eye sight direction
CN109493305A (en) * 2018-08-28 2019-03-19 初速度(苏州)科技有限公司 A kind of method and system that human eye sight is superimposed with foreground image
CN110969060A (en) * 2018-09-29 2020-04-07 北京市商汤科技开发有限公司 Neural network training method, neural network training device, neural network tracking method, neural network training device, visual line tracking device and electronic equipment
CN109725714B (en) * 2018-11-14 2022-06-14 北京七鑫易维信息技术有限公司 Sight line determining method, device and system and head-mounted eye movement equipment
CN110243381B (en) * 2019-07-11 2020-10-30 北京理工大学 Cooperative sensing monitoring method for air-ground robot
CN111007939B (en) * 2019-11-25 2021-09-21 华南理工大学 Virtual reality system space positioning method based on depth perception
CN110962746B (en) * 2019-12-12 2022-07-19 上海擎感智能科技有限公司 Driving assisting method, system and medium based on sight line detection
CN110991559B (en) * 2019-12-19 2023-05-12 中国矿业大学 Indoor personnel behavior non-contact cooperative sensing method
CN111857329B (en) * 2020-05-26 2022-04-26 北京航空航天大学 Method, device and equipment for calculating fixation point
CN111626221A (en) * 2020-05-28 2020-09-04 四川大学 Driver gazing area estimation method based on human eye information enhancement

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2500890A2 (en) * 2011-03-18 2012-09-19 Any Co., Ltd. Image processing device, method thereof, and moving body anti-collision device
CN106650708A (en) * 2017-01-19 2017-05-10 南京航空航天大学 Visual detection method and system for automatic driving obstacles
CN107392189A (en) * 2017-09-05 2017-11-24 百度在线网络技术(北京)有限公司 For the method and apparatus for the driving behavior for determining unmanned vehicle
CN108549880A (en) * 2018-04-28 2018-09-18 深圳市商汤科技有限公司 Collision control method and device, electronic equipment and storage medium
CN111731187A (en) * 2020-06-19 2020-10-02 杭州视为科技有限公司 Automobile A-pillar blind area image display system and method
CN112380935A (en) * 2020-11-03 2021-02-19 深圳技术大学 Man-machine cooperative perception method and system for automatic driving

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115562490A (en) * 2022-10-12 2023-01-03 西北工业大学太仓长三角研究院 Cross-screen eye movement interaction method and system for aircraft cockpit based on deep learning
CN115562490B (en) * 2022-10-12 2024-01-09 西北工业大学太仓长三角研究院 Deep learning-based aircraft cockpit cross-screen-eye movement interaction method and system

Also Published As

Publication number Publication date
CN112380935B (en) 2023-05-26
CN112380935A (en) 2021-02-19

Similar Documents

Publication Publication Date Title
WO2022095440A1 (en) Self-driving-oriented human-machine collaborative perception method and system
JP6695503B2 (en) Method and system for monitoring the condition of a vehicle driver
JP4898026B2 (en) Face / Gaze Recognition Device Using Stereo Camera
Singh et al. Monitoring driver fatigue using facial analysis techniques
US7043056B2 (en) Facial image processing system
JP6973258B2 (en) Image analyzers, methods and programs
JP4692526B2 (en) Gaze direction estimation apparatus, gaze direction estimation method, and program for causing computer to execute gaze direction estimation method
CN106529409A (en) Eye ocular fixation visual angle measuring method based on head posture
JP5578603B2 (en) Gaze control device, gaze control method, and program thereof
García et al. Driver monitoring based on low-cost 3-D sensors
CN104200192A (en) Driver gaze detection system
JP5001930B2 (en) Motion recognition apparatus and method
US20220001544A1 (en) Auxiliary photographing device for dyskinesia analysis, and control method and apparatus for auxiliary photographing device for dyskinesia analysis
CN110889355A (en) Face recognition verification method, system and storage medium
JP2003150942A (en) Eye position tracing method
Kajiwara Driver-condition detection using a thermal imaging camera and neural networks
CN114022514A (en) Real-time sight line inference method integrating head posture and eyeball tracking
Cai et al. Gaze estimation driven solution for interacting children with ASD
CN115171189A (en) Fatigue detection method, device, equipment and storage medium
CN110674751A (en) Device and method for detecting head posture based on monocular camera
NL2030131B1 (en) Human-machine cooperative sensing method and system for automatic driving
CN113128320B (en) Human face living body detection method and device based on TOF camera and electronic equipment
Batista Locating facial features using an anthropometric face model for determining the gaze of faces in image sequences
KR100338805B1 (en) Method for detecting drowsiness level
Diaz et al. 3D Head Pose Estimation enhanced through SURF-based Key-Frames

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21888143

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21888143

Country of ref document: EP

Kind code of ref document: A1