CN111796681A - Self-adaptive sight estimation method and medium based on differential convolution in man-machine interaction - Google Patents
Self-adaptive sight estimation method and medium based on differential convolution in man-machine interaction
- Publication number
- CN111796681A CN202010647088.1A
- Authority
- CN
- China
- Prior art keywords
- convolution
- sight line
- differential
- gaze
- human
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/013—Eye tracking input arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/18—Eye characteristics, e.g. of the iris
- G06V40/19—Sensors therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/18—Eye characteristics, e.g. of the iris
- G06V40/193—Preprocessing; Feature extraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/18—Eye characteristics, e.g. of the iris
- G06V40/197—Matching; Classification
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Human Computer Interaction (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Ophthalmology & Optometry (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Biophysics (AREA)
- Multimedia (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Image Analysis (AREA)
- Eye Examination Apparatus (AREA)
Abstract
The invention claims an adaptive sight line estimation method and medium based on differential convolution in human-computer interaction, wherein the method comprises the following steps: S1, preprocessing the face image, detecting the face and locating the human eye region by using the MTCNN algorithm, and extracting human eye feature information; S2, directly estimating the head pose from the face image; S3, automatically fusing the head pose and the human eye feature map in the fully connected layer of a convolutional neural network and performing a preliminary sight line estimation; S4, training a differential convolution network to predict the gaze difference between eye images; and S5, calibrating the preliminary sight line estimation result with the gaze difference and outputting the final sight line estimation result. Verification on the public data set Eyediap and comparison with well-performing sight line estimation models of recent years show that the proposed model estimates the gaze direction more accurately under free head movement.
Description
Technical Field
The invention belongs to the field of image processing and pattern recognition, and particularly relates to an adaptive sight line estimation method based on differential convolution.
Background
With the rapid development of fields such as computer vision and artificial intelligence, research on sight line estimation technology has attracted extensive attention. The sight line is a very important non-verbal cue for analyzing human behavior and psychological state, and is one expression of human attention and interest; sight line information helps to infer a person's internal state or intention, and thus to better understand interaction between individuals. Therefore, sight line estimation plays an important role in many research areas, such as human-computer interaction, virtual reality, social interaction analysis, and medical treatment.
Sight line estimation in a broad sense refers to research related to eyeballs, eye movement, sight lines, and so on. Generally speaking, sight line estimation methods can be divided into two broad categories: model-based methods and appearance-based methods. The basic idea of a model-based method is to estimate the gaze direction from features such as corneal reflection, combined with prior knowledge of the 3D eyeball. An appearance-based method directly extracts visual features of the eyes and trains a regression model that maps appearance to gaze direction. Comparative analysis shows that model-based methods achieve higher accuracy but place higher demands on image quality and resolution; special hardware is generally required to meet them, which greatly limits the mobility of the user's head pose. Appearance-based methods perform better on low-resolution and noisy images, but training the model requires a large amount of data and is prone to over-fitting. With the rise of deep learning and the release of large public data sets, appearance-based approaches are receiving increasing attention.
At present, although sight line estimation technology has improved greatly, the accuracy attainable by a universal model is limited by individual differences in eye shape and intraocular structure; meanwhile, large head movements strongly affect experimental results and reduce recognition accuracy.
Disclosure of Invention
The present invention is directed to solving the above problems of the prior art. A self-adaptive sight line estimation method and medium based on differential convolution in man-machine interaction are provided. The technical scheme of the invention is as follows:
a self-adaptive sight line estimation method based on differential convolution in human-computer interaction comprises the following steps:
s1, preprocessing the face image by multi-scale scaling with bilinear interpolation, detecting the face with an optimized multi-task cascaded convolutional neural network (MTCNN) algorithm while locating the pupil centers, and extracting human eye feature information;
s2, directly estimating the head pose by using the face image;
s3, automatically fusing the head pose from step S2 and the human eye feature map from step S1 in the fully connected layer of the convolutional neural network to carry out a preliminary sight line estimation;
s4, training a differential convolution network to predict the gaze difference between eye images;
and S5, calibrating the preliminary sight line estimation result by using the obtained gaze difference, and outputting a final sight line estimation result.
In the step S1, 5 human face feature points are output by using the optimized multi-task cascade convolution neural network algorithm, so that pupil center positioning is completed while human face detection is performed.
The output of the multitasking cascade convolution neural network algorithm includes the pupil center position.
Further, the step S2 of directly performing head pose estimation from the face image specifically includes: locating the head position and orientation using a real-time head pose estimation system based on random regression forests. Using T_t = [T_x, T_y, T_z] to denote the position information of the head at time t and R_t = [R_y, R_p, R_r] to denote the rotation angle information of the head at time t, the head deflection parameter at time t can be recorded as h_t = (T_t, R_t).
Further, the step S3 automatically fuses the head pose and the eye feature map by using the full connection layer of the convolutional neural network to perform the preliminary gaze estimation, which specifically includes:
Using a convolutional neural network based approach, an eye image I of size 3@48×72 is taken as input, where 3 is the number of channels of the eye image and 48×72 is its size. The image is preprocessed and passed through the convolutional layers; the resulting feature map is fed into the fully connected layers, which are trained as a linear regression to finally produce a preliminary gaze direction g_p(I). The loss function is:

L = (1/|D|) Σ_{I∈D} ‖ g_p(I) − g_gt(I) ‖²

wherein g_gt(I) is the true gaze direction, D is the training data set, and |·| denotes set cardinality.
Further, the step S4 predicts the gaze difference amount of the eye through training by using a differential convolution network, and specifically includes:
the differential convolution analyzes the pattern of a sample relative to its neighboring samples; the differential calculation reflects the change between consecutive samples by computing the difference between their activations;

the differential convolution network adopts a parallel structure; each branch of the parallel structure consists of three convolutional layers, each followed by batch normalization and a ReLU unit, and max pooling is applied after the first and second layers to reduce the image size; after the third layer, the feature maps of the two input images are normalized and concatenated into a new tensor, and then two fully connected layers are applied on the tensor to predict the gaze difference between the two input images.
Further, the differential convolutional network selects a ReLU function as an activation function of the convolutional layer and the fully-connected layer, and the formula is as follows:
f(x)=max(0,x) (10)
where x is the input, f (x) is the output after the ReLU unit;
training the gaze difference model using a loss function: with d_p(I, J) denoting the gaze difference predicted by the differential network, the loss function L_d is:

L_d = Σ_k Σ_{I,J∈D_k} ‖ d_p(I, J) − (g_gt(I) − g_gt(J)) ‖²

wherein I and J are two eye images of the same subject, and D_k is the subset of the training set D containing only images of one eye of the k-th individual.
Further, the step S5 calibrates the preliminary sight line estimation result by using the obtained gaze difference and outputs a final sight line estimation result, specifically: the differential convolution network predicts the gaze difference d_p(F, I) between a reference image F and the test image I, which is combined with the true gaze value g_gt(F) of the reference image to predict the final gaze direction g_gt(F) + d_p(F, I); over the calibration set this gives:

g_p(I) = Σ_{F∈D_c} w(F, I) · (g_gt(F) + d_p(F, I))

wherein D_c is the calibration set of reference images and w(·) weights the importance of each prediction.
A storage medium, the storage medium being a computer readable storage medium storing one or more programs that, when executed by an electronic device comprising a plurality of application programs, cause the electronic device to perform the method of any of the above.
The invention has the following advantages and beneficial effects:
Currently, most appearance-based gaze estimation methods regress the gaze direction directly from a single face or eye image. However, due to differences in eye shape and intraocular structure between individuals, a universal model has limited accuracy, and its output often exhibits high variability and subject-dependent bias. Meanwhile, when the head deflection angle is too large, the sight line estimation result is also greatly affected. Therefore, the present invention provides an adaptive sight line estimation method based on differential convolution to solve the above problems. Differential convolution is introduced: a differential convolutional neural network is trained directly to predict the gaze difference between two eye images of the same subject, and the gaze difference is then used to calibrate the initial sight line estimation result. In addition, head pose information is fused into the model to improve the robustness of the gaze estimation system.
Tests on the public data set Eyediap show that the sight line estimation error is smallest when head pose information is fused and a differential network is used for calibration. It can be seen that introducing differential convolution effectively calibrates the sight line estimation result and reduces its error, while fusing head pose information makes the system more robust to changes in head pose. To compare the estimation effect of different models more clearly, the proposed algorithm model is compared with other sight line estimation methods based on convolutional neural networks; the proposed model yields a smaller sight line estimation error and achieves excellent performance.
Drawings
FIG. 1 is a diagram of a preferred embodiment of a line-of-sight estimation framework based on a differential convolutional network (DNet) according to the present invention;
fig. 2 is a diagram of a differential convolutional network structure.
Detailed Description
The technical solutions in the embodiments of the present invention will be described in detail and clearly with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention.
The technical scheme for solving the technical problems is as follows:
s1, preprocessing the face image by multi-scale scaling with bilinear interpolation, detecting the face with an optimized multi-task cascaded convolutional neural network (MTCNN) algorithm (an existing algorithm, so not described in detail here) while locating the pupil centers, and extracting human eye feature information;
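The multi-scale scaling by bilinear interpolation in step S1 can be sketched as follows. This is a minimal NumPy illustration, not the patent's implementation; the function names and the pyramid parameters (`min_size`, `factor`) are assumptions chosen for clarity, and a real pipeline would typically call an optimized resize routine instead.

```python
import numpy as np

def bilinear_resize(img, out_h, out_w):
    """Resize an H x W (x C) image with bilinear interpolation."""
    in_h, in_w = img.shape[:2]
    # Map each output pixel back to a (fractional) source coordinate.
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, in_h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]          # vertical interpolation weights
    wx = (xs - x0)[None, :]          # horizontal interpolation weights
    if img.ndim == 3:                # broadcast over the channel axis
        wy = wy[..., None]; wx = wx[..., None]
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bot = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

def image_pyramid(img, min_size=12, factor=0.7):
    """Multi-scale pyramid as used to feed a cascaded face detector."""
    scales, s = [], 1.0
    h, w = img.shape[:2]
    while min(h * s, w * s) >= min_size:
        scales.append(bilinear_resize(img, int(h * s), int(w * s)))
        s *= factor
    return scales
```

Each pyramid level is then scanned by the face detector, so faces at different distances from the camera fall within the detector's receptive field at some scale.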
and S2, directly estimating the head pose from the face image. A real-time head pose estimation system based on random regression forests is employed to locate the head position and orientation. Using T_t = [T_x, T_y, T_z] to denote the position information of the head at time t and R_t = [R_y, R_p, R_r] to denote the rotation angle information of the head at time t, the head deflection parameter at time t can be recorded as h_t = (T_t, R_t).
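The head pose parameterization above amounts to a 6-D vector per frame. The sketch below shows that composition; the numeric values and the units (millimetres, degrees) are illustrative assumptions, not values from the patent.

```python
import numpy as np

# Head pose at time t: position T_t = [T_x, T_y, T_z] (e.g. millimetres
# relative to the camera) and rotation R_t = [R_y, R_p, R_r] (yaw, pitch,
# roll, e.g. in degrees). Concrete values here are illustrative only.
T_t = np.array([12.0, -35.0, 540.0])
R_t = np.array([8.5, -3.0, 1.2])

# The head deflection parameter h_t = (T_t, R_t) is a 6-D vector that can
# later be fed to the fully connected fusion layer alongside eye features.
h_t = np.concatenate([T_t, R_t])
assert h_t.shape == (6,)
```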
S3, automatically fusing the head pose and the human eye feature map in the fully connected layer of the convolutional neural network to perform a preliminary sight line estimation. Using a convolutional neural network based approach, an eye image I of size 3@48×72 is taken as input, where 3 is the number of channels of the eye image and 48×72 is its size. The image is preprocessed and passed through the convolutional layers; the resulting feature map is fed into the fully connected layers, which are trained as a linear regression to finally produce a preliminary gaze direction g_p(I). The loss function is:

L = (1/|D|) Σ_{I∈D} ‖ g_p(I) − g_gt(I) ‖²

wherein g_gt(I) is the true gaze direction, D is the training data set, and |·| denotes set cardinality.
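The training objective for the preliminary estimator can be sketched numerically. Because the equation image itself is not reproduced in the text, the mean squared L2 form and the 2-D yaw/pitch gaze representation below are assumptions consistent with the surrounding symbol definitions.

```python
import numpy as np

def preliminary_gaze_loss(g_pred, g_true):
    """Mean squared Euclidean error over the training set D:
    L = (1/|D|) * sum_I || g_p(I) - g_gt(I) ||^2  (assumed form)."""
    diff = g_pred - g_true                       # (|D|, 2) yaw/pitch pairs
    return float(np.mean(np.sum(diff ** 2, axis=1)))

# Two samples: the first is perfect, the second is off by (0.3, 0.4) rad.
g_pred = np.array([[0.10, -0.20], [0.50, 0.00]])
g_true = np.array([[0.10, -0.20], [0.20, -0.40]])
loss = preliminary_gaze_loss(g_pred, g_true)    # (0 + 0.25) / 2 = 0.125
```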
And S4, training a differential convolution network to predict the gaze difference between eye images. Differential convolution analyzes the pattern of a sample relative to its neighboring samples; the differential calculation reflects the change between consecutive samples by computing the difference between their activations.
The differential convolutional network adopts a parallel structure; each branch consists of three convolutional layers, each followed by batch normalization and a ReLU unit. Max pooling is applied after the first and second layers to reduce the image size. After the third layer, the feature maps of the two input images are normalized and concatenated into a new tensor. Two fully connected layers are then applied on the tensor to predict the gaze difference between the two input images.
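The data flow of this parallel (Siamese) structure can be sketched as follows. This is a structural sketch only: the three convolutional layers are stood in for by a single shared linear map, the weights are random, and all function names and layer widths are hypothetical, not the patent's architecture or trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def branch_features(eye_img, w):
    """Stand-in for one branch: a shared linear map on the flattened
    image followed by ReLU. A real branch uses three conv layers with
    batch norm and max pooling, as described above."""
    return np.maximum(0.0, eye_img.reshape(-1) @ w)

def predict_gaze_difference(img_a, img_b, w_branch, w_fc1, w_fc2):
    fa = branch_features(img_a, w_branch)
    fb = branch_features(img_b, w_branch)        # shared (Siamese) weights
    # Normalize each branch's features, then concatenate into one tensor.
    fa = fa / (np.linalg.norm(fa) + 1e-8)
    fb = fb / (np.linalg.norm(fb) + 1e-8)
    x = np.concatenate([fa, fb])
    # Two fully connected layers predict the 2-D gaze difference.
    h = np.maximum(0.0, x @ w_fc1)
    return h @ w_fc2

d = 3 * 48 * 72                                  # 3@48x72 eye image
w_branch = rng.normal(0, 0.01, (d, 32))
w_fc1 = rng.normal(0, 0.1, (64, 16))
w_fc2 = rng.normal(0, 0.1, (16, 2))
a = rng.random((3, 48, 72))
b = rng.random((3, 48, 72))
diff = predict_gaze_difference(a, b, w_branch, w_fc1, w_fc2)  # shape (2,)
```

The key design point the sketch preserves is weight sharing between the two branches: both images are embedded by the same feature extractor before their features are fused.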
The differential convolutional network selects a ReLU function as an activation function of the convolutional layer and the fully-connected layer, and the formula is as follows:
f(x)=max(0,x) (10)
where x is the input and f (x) is the output after the ReLU unit.
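The activation in formula (10) is elementwise and can be verified directly:

```python
import numpy as np

def relu(x):
    """f(x) = max(0, x), applied elementwise."""
    return np.maximum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
y = relu(x)   # negative inputs are clipped to 0; positives pass through
```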
Training the gaze difference model using a loss function: with d_p(I, J) denoting the gaze difference predicted by the differential network, the loss function L_d is:

L_d = Σ_k Σ_{I,J∈D_k} ‖ d_p(I, J) − (g_gt(I) − g_gt(J)) ‖²

wherein I and J are two eye images of the same subject, and D_k is the subset of the training set D containing only images of one eye of the k-th individual.
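The differential objective can be sketched as a sum over within-subject image pairs. Since the patent's equation image is absent from the text, the squared-error form below is an assumption derived from the definitions of d_p and g_gt; the `subjects` data layout is likewise hypothetical.

```python
import numpy as np
from itertools import permutations

def differential_loss(subjects):
    """Assumed L_d: sum over ordered within-subject pairs (I, J) of
    || d_p(I, J) - (g_gt(I) - g_gt(J)) ||^2.
    `subjects` maps subject id -> (g_gt array, d_p lookup function)."""
    total = 0.0
    for g_gt, d_p in subjects.values():
        for i, j in permutations(range(len(g_gt)), 2):
            err = d_p(i, j) - (g_gt[i] - g_gt[j])
            total += float(np.sum(err ** 2))
    return total

# One subject, two images; a degenerate "network" that predicts every
# gaze difference as zero, so the loss is just the true differences.
g_gt = np.array([[0.1, 0.0], [0.4, 0.0]])
zero_net = lambda i, j: np.zeros(2)
loss = differential_loss({0: (g_gt, zero_net)})
# pairs (0,1) and (1,0): each has squared error 0.3^2 = 0.09 -> total 0.18
```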
And S5, calibrating the preliminary sight line estimation result by using the obtained gaze difference, and outputting the final sight line estimation result. The differential convolution network predicts the gaze difference d_p(F, I) between a reference image F and the test image I, which is combined with the true gaze value g_gt(F) of the reference image to predict the final gaze direction g_gt(F) + d_p(F, I); over the calibration set this gives:

g_p(I) = Σ_{F∈D_c} w(F, I) · (g_gt(F) + d_p(F, I))

wherein D_c is the calibration set of reference images and w(·) weights the importance of each prediction.
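The calibration step combines one prediction per reference image into a weighted estimate. The sketch below assumes the weights are normalized to sum to 1; the exact weighting function w(·) is not reproduced in the text, so uniform weights are used for illustration.

```python
import numpy as np

def calibrated_gaze(d_p, g_gt_refs, weights):
    """Final direction as a weighted combination over the calibration set:
    g_p(I) = sum_F w(F) * (g_gt(F) + d_p(F, I)), weights normalized."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalize importance weights
    preds = g_gt_refs + d_p              # one prediction per reference F
    return (w[:, None] * preds).sum(axis=0)

# Two reference images with known gaze, plus predicted differences to I.
g_gt_refs = np.array([[0.10, 0.00], [0.30, 0.10]])
d_p = np.array([[0.05, 0.02], [-0.15, -0.08]])   # d_p(F, I) per reference
g_final = calibrated_gaze(d_p, g_gt_refs, weights=[1.0, 1.0])
# Each reference predicts (0.15, 0.02), so their weighted mean is too.
```

Because every reference image contributes an independent estimate of the same gaze direction, averaging suppresses per-reference noise in d_p.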
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Computer-readable media, including permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal or a carrier wave.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above examples are to be construed as merely illustrative and not limitative of the remainder of the disclosure. After reading the description of the invention, the skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications also fall into the scope of the invention defined by the claims.
Claims (8)
1. A self-adaptive sight line estimation method based on differential convolution in human-computer interaction is characterized by comprising the following steps:
s1, preprocessing the face image by multi-scale scaling with bilinear interpolation, detecting the face with an optimized multi-task cascaded convolutional neural network (MTCNN) algorithm while locating the pupil centers, and extracting human eye feature information;
s2, directly estimating the head pose by using the face image;
s3, automatically fusing the head pose from step S2 and the human eye feature map from step S1 in the fully connected layer of the convolutional neural network to carry out a preliminary sight line estimation;
s4, training a differential convolution network to predict the gaze difference between eye images;
and S5, calibrating the preliminary sight line estimation result by using the obtained gaze difference, and outputting a final sight line estimation result.
2. The adaptive sight line estimation method based on differential convolution in human-computer interaction according to claim 1, wherein in the step S1, 5 human face feature points are output by using an optimized multitask cascade convolution neural network algorithm, so that pupil center positioning is completed while human face detection is performed.
The output of the multitasking cascade convolution neural network algorithm includes the pupil center position.
3. The adaptive sight line estimation method based on differential convolution in human-computer interaction according to claim 2, wherein the step S2 directly performs head pose estimation from the face image, specifically including: locating the head position and orientation using a real-time head pose estimation system based on random regression forests, using T_t = [T_x, T_y, T_z] to denote the position information of the head at time t and R_t = [R_y, R_p, R_r] to denote the rotation angle information of the head at time t, so that the head deflection parameter at time t can be recorded as h_t = (T_t, R_t).
4. The adaptive sight line estimation method based on differential convolution in human-computer interaction according to claim 3, wherein the step S3 is to perform preliminary sight line estimation by automatically fusing a head pose and a human eye feature map by using a full connection layer of a convolutional neural network, and specifically includes:
adopting a convolutional neural network based approach, taking an eye image I of size 3@48×72 as input, where 3 is the number of channels of the eye image and 48×72 is its size; preprocessing the image and passing it through the convolutional layers; feeding the resulting feature map into the fully connected layers, which are trained as a linear regression to finally produce a preliminary gaze direction g_p(I); the loss function is:

L = (1/|D|) Σ_{I∈D} ‖ g_p(I) − g_gt(I) ‖²

wherein g_gt(I) is the true gaze direction, D is the training data set, and |·| denotes set cardinality.
5. The adaptive sight line estimation method based on differential convolution in human-computer interaction according to claim 4, wherein the step S4 predicts the gaze difference amount of the eye through training by using a differential convolution network, and specifically comprises:
the differential convolution analyzes the pattern of a sample relative to its neighboring samples; the differential calculation reflects the change between consecutive samples by computing the difference between their activations;

the differential convolution network adopts a parallel structure; each branch of the parallel structure consists of three convolutional layers, each followed by batch normalization and a ReLU unit, and max pooling is applied after the first and second layers to reduce the image size; after the third layer, the feature maps of the two input images are normalized and concatenated into a new tensor, and then two fully connected layers are applied on the tensor to predict the gaze difference between the two input images.
6. The adaptive sight line estimation method based on differential convolution in human-computer interaction according to claim 5, characterized in that the differential convolution network selects a ReLU function as an activation function of a convolution layer and a full link layer, and the formula is as follows:
f(x)=max(0,x) (10)
where x is the input, f (x) is the output after the ReLU unit;
training the gaze difference model using a loss function: with d_p(I, J) denoting the gaze difference predicted by the differential network, the loss function L_d is:

L_d = Σ_k Σ_{I,J∈D_k} ‖ d_p(I, J) − (g_gt(I) − g_gt(J)) ‖²

wherein I and J are two eye images of the same subject, and D_k is the subset of the training set D containing only images of one eye of the k-th individual.
7. The adaptive gaze estimation method based on differential convolution in human-computer interaction according to claim 6, wherein the step S5 calibrates the preliminary sight line estimation result by using the obtained gaze difference and outputs a final sight line estimation result, specifically: the differential convolution network predicts the gaze difference d_p(F, I) between a reference image F and the test image I, which is combined with the true gaze value g_gt(F) of the reference image to predict the final gaze direction g_gt(F) + d_p(F, I); over the calibration set this gives:

g_p(I) = Σ_{F∈D_c} w(F, I) · (g_gt(F) + d_p(F, I))

wherein D_c is the calibration set of reference images and w(·) weights the importance of each prediction.
8. A storage medium being a computer readable storage medium storing one or more programs which, when executed by an electronic device comprising a plurality of application programs, cause the electronic device to perform the method of any of claims 1-7 above.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010647088.1A CN111796681A (en) | 2020-07-07 | 2020-07-07 | Self-adaptive sight estimation method and medium based on differential convolution in man-machine interaction |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010647088.1A CN111796681A (en) | 2020-07-07 | 2020-07-07 | Self-adaptive sight estimation method and medium based on differential convolution in man-machine interaction |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111796681A true CN111796681A (en) | 2020-10-20 |
Family
ID=72809704
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010647088.1A Pending CN111796681A (en) | 2020-07-07 | 2020-07-07 | Self-adaptive sight estimation method and medium based on differential convolution in man-machine interaction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111796681A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112418074A (en) * | 2020-11-20 | 2021-02-26 | 重庆邮电大学 | Coupled posture face recognition method based on self-attention |
CN112597823A (en) * | 2020-12-07 | 2021-04-02 | 深延科技(北京)有限公司 | Attention recognition method and device, electronic equipment and storage medium |
CN112711984A (en) * | 2020-12-09 | 2021-04-27 | 北京航空航天大学 | Fixation point positioning method and device and electronic equipment |
CN113642393A (en) * | 2021-07-07 | 2021-11-12 | 重庆邮电大学 | Attention mechanism-based multi-feature fusion sight line estimation method |
CN113705349A (en) * | 2021-07-26 | 2021-11-26 | 电子科技大学 | Attention power analysis method and system based on sight estimation neural network |
CN113807330A (en) * | 2021-11-19 | 2021-12-17 | 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) | Three-dimensional sight estimation method and device for resource-constrained scene |
CN113838135A (en) * | 2021-10-11 | 2021-12-24 | 重庆邮电大学 | Pose estimation method, system and medium based on LSTM double-current convolution neural network |
CN114898453A (en) * | 2022-05-23 | 2022-08-12 | 重庆邮电大学 | Cooperative network based sight line estimation method |
CN116226712A (en) * | 2023-03-03 | 2023-06-06 | 湖北商贸学院 | Online learner concentration monitoring method, system and readable storage medium |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111046734A (en) * | 2019-11-12 | 2020-04-21 | 重庆邮电大学 | Multi-modal fusion sight line estimation method based on expansion convolution |
-
2020
- 2020-07-07 CN CN202010647088.1A patent/CN111796681A/en active Pending
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111046734A (en) * | 2019-11-12 | 2020-04-21 | 重庆邮电大学 | Multi-modal fusion sight line estimation method based on expansion convolution |
Non-Patent Citations (3)
Title |
---|
GANG LIU 等: ""A Differential Approach for Gaze Estimation"", 《HTTPS://ARXIV.ORG/ABS/1904.09459V3》 * |
GANG LIU 等: ""A Differential Approach for Gaze Estimation"", 《IEEE》 * |
陈雪峰: ""行为特征融合的视觉注意力检测技术研究"", 《中国优秀硕士学位论文全文数据库(信息科技辑)》 * |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112418074A (en) * | 2020-11-20 | 2021-02-26 | 重庆邮电大学 | Coupled posture face recognition method based on self-attention |
CN112418074B (en) * | 2020-11-20 | 2022-08-23 | 重庆邮电大学 | Coupled posture face recognition method based on self-attention |
CN112597823A (en) * | 2020-12-07 | 2021-04-02 | 深延科技(北京)有限公司 | Attention recognition method and device, electronic equipment and storage medium |
CN112711984B (en) * | 2020-12-09 | 2022-04-12 | 北京航空航天大学 | Fixation point positioning method and device and electronic equipment |
CN112711984A (en) * | 2020-12-09 | 2021-04-27 | 北京航空航天大学 | Fixation point positioning method and device and electronic equipment |
CN113642393A (en) * | 2021-07-07 | 2021-11-12 | 重庆邮电大学 | Attention mechanism-based multi-feature fusion sight line estimation method |
CN113642393B (en) * | 2021-07-07 | 2024-03-22 | 重庆邮电大学 | Attention mechanism-based multi-feature fusion sight estimation method |
CN113705349A (en) * | 2021-07-26 | 2021-11-26 | 电子科技大学 | Attention power analysis method and system based on sight estimation neural network |
CN113705349B (en) * | 2021-07-26 | 2023-06-06 | 电子科技大学 | Attention quantitative analysis method and system based on line-of-sight estimation neural network |
CN113838135A (en) * | 2021-10-11 | 2021-12-24 | 重庆邮电大学 | Pose estimation method, system and medium based on LSTM double-current convolution neural network |
CN113838135B (en) * | 2021-10-11 | 2024-03-19 | 重庆邮电大学 | Pose estimation method, system and medium based on LSTM double-flow convolutional neural network |
CN113807330A (en) * | 2021-11-19 | 2021-12-17 | 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) | Three-dimensional sight estimation method and device for resource-constrained scene |
CN114898453A (en) * | 2022-05-23 | 2022-08-12 | 重庆邮电大学 | Cooperative network based sight line estimation method |
CN116226712A (en) * | 2023-03-03 | 2023-06-06 | 湖北商贸学院 | Online learner concentration monitoring method, system and readable storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111796681A (en) | Self-adaptive sight estimation method and medium based on differential convolution in man-machine interaction | |
CN109117731B (en) | Classroom teaching cognitive load measurement system | |
JP7273157B2 (en) | Model training method, device, terminal and program | |
US10614289B2 (en) | Facial tracking with classifiers | |
US20230377190A1 (en) | Method and device for training models, method and device for detecting body postures, and storage medium | |
CN111325726A (en) | Model training method, image processing method, device, equipment and storage medium | |
CN109359512A (en) | Eyeball position method for tracing, device, terminal and computer readable storage medium | |
CN111046734B (en) | Multi-modal fusion sight line estimation method based on expansion convolution | |
CN111598168B (en) | Image classification method, device, computer equipment and medium | |
CN112330684B (en) | Object segmentation method and device, computer equipment and storage medium | |
CN110555426A (en) | Sight line detection method, device, equipment and storage medium | |
CN114120432A (en) | Online learning attention tracking method based on sight estimation and application thereof | |
Aslan et al. | Multimodal video-based apparent personality recognition using long short-term memory and convolutional neural networks | |
CN110222780A (en) | Object detecting method, device, equipment and storage medium | |
CN111754546A (en) | Target tracking method, system and storage medium based on multi-feature map fusion | |
Sumer et al. | Attention flow: End-to-end joint attention estimation | |
CN113177559A (en) | Image recognition method, system, device and medium combining breadth and dense convolutional neural network | |
Sun et al. | Personality assessment based on multimodal attention network learning with category-based mean square error | |
Duraisamy et al. | Classroom engagement evaluation using computer vision techniques | |
CN111339878B (en) | Correction type real-time emotion recognition method and system based on eye movement data | |
Lee et al. | Automatic facial recognition system assisted-facial asymmetry scale using facial landmarks | |
CN110889393A (en) | Human body posture estimation method and device | |
CN115116117A (en) | Learning input data acquisition method based on multi-mode fusion network | |
CN113805695A (en) | Reading understanding level prediction method and device, electronic equipment and storage medium | |
Elahi et al. | Webcam-based accurate eye-central localization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2020-10-20 |