CN114970608A - Human-computer interaction method and system based on electro-oculogram signals

Info

Publication number
CN114970608A
Authority
CN
China
Prior art keywords
electro-ocular signals, target object, human
Legal status
Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Application number
CN202210489333.XA
Other languages
Chinese (zh)
Other versions
CN114970608B (en)
Inventor
尹志刚
陈惠宇
Current Assignee
Institute of Automation, Chinese Academy of Sciences
Original Assignee
Institute of Automation, Chinese Academy of Sciences
Application filed by Institute of Automation, Chinese Academy of Sciences
Priority to CN202210489333.XA
Publication of CN114970608A
Application granted; publication of CN114970608B
Legal status: Active


Classifications

    • G06F3/013 Eye tracking input arrangements (G06F3/01: input arrangements for interaction between user and computer; G06F3/011: arrangements for interaction with the human body)
    • G06F3/015 Input arrangements based on nervous system activity detection, e.g. brain waves [EEG] detection, electromyograms [EMG] detection, electrodermal response detection
    • G06F18/253 Fusion techniques of extracted features (G06F18/25: fusion techniques; G06F18/20: analysing; G06F18/00: pattern recognition)
    • G06F2218/08 Feature extraction (G06F2218/00: aspects of pattern recognition specially adapted for signal processing)
    • G06F2218/12 Classification; Matching
    • G06N3/045 Combinations of networks (G06N3/04: architecture, e.g. interconnection topology; G06N3/02: neural networks; G06N3/00: computing arrangements based on biological models)
    • G06N3/08 Learning methods (neural networks)


Abstract

The invention provides a human-computer interaction method and system based on electro-oculogram signals, wherein the method comprises the following steps: acquiring electro-ocular signals of a target object from a plurality of orientations, and determining input information for a classification model according to the electro-ocular signals of the target object; inputting the input information into a feature extraction layer of the classification model to obtain fused features of the electro-ocular signals across the plurality of orientations; inputting the fused features into a classification layer of the classification model to obtain the sight direction category of the target object; and moving an input device connected to the human-computer interaction device according to a control strategy corresponding to the sight direction category of the target object, so as to control the human-computer interaction device. The classification model is trained on the electro-ocular signals from the plurality of orientations of a sample object and the sight direction category of the sample object. The invention realizes simple and effective human-computer interaction while reducing its software and hardware costs.

Description

Human-computer interaction method and system based on electro-oculogram signals
Technical Field
The invention relates to the technical field of human-computer interaction, and in particular to a human-computer interaction method and system based on electro-oculogram signals.
Background
In recent years, the introduction of the graphical user interface (GUI) has greatly increased the bandwidth with which information is presented to the user. The GUI brought a major revolution in human-computer interaction; however, while output bandwidth has multiplied, input devices have remained essentially unchanged, dominated by the keyboard and pointing devices (such as the mouse, trackball, or touch pad). Although handwriting devices (such as stylus or graphics-pen input) have been introduced in recent years, a great asymmetry remains between input bandwidth and output bandwidth. In addition, owing to the limitations of these input devices, some disabled people whose hand nerves have degenerated cannot operate an input device manually and therefore cannot achieve effective human-computer interaction.
Currently, to reduce this bandwidth asymmetry and to improve and facilitate human-computer interaction, a brain-computer interface (BCI) is usually used to measure the electrical activity of the user's brain, serving as a communication medium for patients with neurodegenerative diseases and thereby realizing effective human-computer interaction. Although neurodegenerative patients have limited contact with the surrounding environment, such highly specialized equipment creates the possibility of human-computer interaction for them. However, electroencephalogram signals are complex and differ greatly between individuals: before each user can interact effectively through a BCI, the system must be trained on that user's own electroencephalogram signals. In this scenario, effective human-computer interaction therefore depends on high software and hardware costs.
Disclosure of Invention
The invention provides a human-computer interaction method and system based on electro-oculogram signals, which overcome the defect of the prior art that effective human-computer interaction is possible only after training on each user's electroencephalogram signals, entailing high software and hardware costs; the invention thereby reduces the cost of human-computer interaction.
The invention provides a human-computer interaction method based on an eye electrical signal, which comprises the following steps:
acquiring ocular electrical signals of a target object from a plurality of orientations, and determining input information of a classification model according to the ocular electrical signals of the target object;
inputting the input information into a feature extraction layer of the classification model to obtain fusion features of the electro-ocular signals in the plurality of directions;
inputting the fusion characteristics into a classification layer of the classification model to obtain the sight direction category of the target object;
performing a movement operation on an input device connected to the human-computer interaction device according to the control strategy corresponding to the sight direction category of the target object, so as to control the human-computer interaction device;
and the classification model is trained and acquired according to the electro-ocular signals in the plurality of directions of the sample object and the sight direction category of the sample object.
According to the human-computer interaction method based on the electro-oculogram signals, the feature extraction layer comprises a plurality of feature extraction modules and a fusion module; each feature extraction module corresponds to each direction one by one; each feature extraction module is constructed and generated based on a one-dimensional convolutional neural network;
correspondingly, the inputting the input information into a feature extraction layer of the classification model to obtain the fusion features of the electro-ocular signals in the plurality of orientations includes:
inputting input information determined according to the electro-ocular signals of the target object in each direction into the feature extraction module corresponding to each direction to obtain essential features of the electro-ocular signals of the target object in each direction;
and inputting the essential characteristics of the electro-ocular signals of the target object in a plurality of directions into the fusion module to obtain the fusion characteristics of the electro-ocular signals of the target object.
According to the human-computer interaction method based on the electro-oculogram signal, the plurality of orientations comprise a horizontal orientation and a vertical orientation;
the acquiring of the electro-ocular signals of the target object from a plurality of orientations and determining the input information of the classification model according to the electro-ocular signals of the target object comprises:
acquiring a plurality of ocular electrical signals of the target object from a horizontal orientation and a vertical orientation, respectively;
acquiring a differential signal of the electro-ocular signals in the horizontal direction according to the plurality of electro-ocular signals in the horizontal direction;
acquiring a differential signal of the electro-ocular signals in the vertical direction according to the plurality of electro-ocular signals in the vertical direction;
and determining the input information of the classification model by combining the differential signal of the electro-ocular signal in the horizontal direction and the differential signal of the electro-ocular signal in the vertical direction.
According to the human-computer interaction method based on the electro-ocular signals, the method for acquiring the plurality of electro-ocular signals of the target object from the horizontal direction and the vertical direction respectively comprises the following steps:
acquiring a plurality of eye electrical signals of the target object from a horizontal orientation and a vertical orientation, respectively, based on a biosensor fixed to a face of the target object;
the biosensor comprises a first electrode, a second electrode and a third electrode;
the first electrodes are fixed on two sides of the head of the target object, are in the same horizontal direction with the eyes of the target object, and are used for acquiring a plurality of eye electric signals of the target object from the horizontal direction;
the second electrodes are fixed above and below the eyes of the target object and are in the same vertical direction with the eyes of the target object, and are used for acquiring a plurality of eye electric signals of the target object from the vertical direction;
the third electrode is fixed at the ear root of the target object and is used for providing a ground reference voltage for the isolation side of the biosensor.
According to the human-computer interaction method based on the electro-ocular signal, the input information of the classification model is determined according to the electro-ocular signal of the target object, and the method comprises the following steps:
preprocessing the electro-ocular signal of the target object;
wherein the preprocessing comprises filtering processing and normalization processing;
the filtering processing comprises direct current component removal, noise signal processing and Butterworth low-pass filtering processing;
and taking the preprocessed electro-ocular signal of the target object as the input information of the classification model.
According to the human-computer interaction method based on the electro-ocular signal, the input information of the classification model is determined according to the electro-ocular signal of the target object, and the method comprises the following steps:
based on a feature extraction algorithm, carrying out feature extraction on the electro-oculogram signal of the target object to obtain input information of the classification model;
the feature extraction algorithm comprises a wavelet transform and a time domain analysis method.
According to the human-computer interaction method based on electro-oculogram signals, the sight direction categories comprise looking straight ahead, looking left, looking right, looking down, and looking up;
the control strategies include hold, move left, move right, move down, and move up.
The invention also provides a human-computer interaction system based on the electro-oculogram signal, which comprises:
the system comprises an acquisition module, a classification module and a classification module, wherein the acquisition module is used for acquiring the electro-ocular signals of a target object from a plurality of directions and determining the input information of the classification model according to the electro-ocular signals of the target object;
the fusion feature extraction module is used for inputting the input information into a feature extraction layer of the classification model to obtain fusion features of the electro-ocular signals in the plurality of directions;
the classification module is used for inputting the fusion characteristics into a classification layer of the classification model to obtain the sight direction category of the target object;
the interaction module is used for carrying out moving operation on input equipment connected with the human-computer interaction equipment according to a control strategy corresponding to the type of the sight direction of the target object so as to control the human-computer interaction equipment;
and the classification model is trained and acquired according to the electro-ocular signals in the plurality of directions of the sample object and the sight direction category of the sample object.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to realize the human-computer interaction method based on the electro-oculogram signal.
The present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of human-computer interaction based on an electro-oculogram signal as described in any one of the above.
The invention also provides a computer program product comprising a computer program which, when executed by a processor, implements a method of human-computer interaction based on an electro-ocular signal as described in any of the above.
According to the human-computer interaction method and system based on electro-oculogram signals, on one hand, collecting the electro-ocular signals of the target object from a plurality of orientations provides the classification model with rich input information that effectively distinguishes sight direction categories, so a more accurate sight direction category of the target object can be obtained and the human-computer interaction device can be controlled more accurately and effectively. On the other hand, electro-oculogram signals are relatively uniform across users and the sight direction categories of different users are essentially consistent, so only the electro-oculogram signals of some users need to be used for training; the method can therefore be applied quickly and conveniently to human-computer interaction for a wide range of users, effectively reducing the software and hardware costs of human-computer interaction.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a human-computer interaction method based on an eye electrical signal according to the present invention;
FIG. 2 is a schematic structural diagram of a one-dimensional convolution network in the human-computer interaction method based on the electro-ocular signal provided by the present invention;
FIG. 3 is a schematic structural diagram of a biosensor in the human-computer interaction method based on an eye electrical signal provided by the present invention;
FIG. 4 is a schematic diagram illustrating the distribution of electro-ocular signals in the human-computer interaction method based on electro-ocular signals according to the present invention;
FIG. 5 is a second schematic diagram illustrating the distribution of electro-ocular signals in the human-computer interaction method based on electro-ocular signals according to the present invention;
FIG. 6 is a second schematic flowchart of a human-computer interaction method based on electro-ocular signals according to the present invention;
FIG. 7 is a schematic diagram illustrating the distribution of visual direction categories in the human-computer interaction method based on electro-oculogram signals;
FIG. 8 is a third schematic flowchart of a human-computer interaction method based on electro-ocular signals according to the present invention;
FIG. 9 is a schematic structural diagram of a human-computer interaction system based on an eye electrical signal provided by the present invention;
fig. 10 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the prior art, the electrical activity of the brain is measured through a brain-computer interface to realize human-computer interaction. However, the difference of the electroencephalogram signals among individuals is large, and large-scale cross-test application without training is difficult to achieve when the electroencephalogram signals are used. Furthermore, brain-computer interface interaction requires special design and user training. Therefore, not all people can use them easily. More importantly, the acquisition and analysis of the electroencephalogram signals depend on high software and hardware cost, so that the popularization difficulty is further increased.
In view of the above problems, the present application develops a method of human-computer interaction using eye movement, which in particular helps a user control an input device when interacting with a graphical user interface by combining input based on eye-rotation signals and blink signals. By moving the eyes, the user induces an EOG (electro-oculogram) signal that controls an input device of the graphical user interface, achieving friendly and intuitive interaction between the user and the graphical user interface at low cost.
The human-computer interaction method based on the electro-oculogram signal according to the present invention is described below with reference to fig. 1. The execution body of the method may be a controller, such as a PC (personal computer), a Raspberry Pi, a server, or another terminal electronic device; this embodiment does not specifically limit this.
The method comprises the following steps: step 101, collecting the electro-ocular signals of the target object from a plurality of directions, and determining the input information of the classification model according to the electro-ocular signals of the target object.
Wherein the plurality of orientations include, but are not limited to, a horizontal orientation and a vertical orientation relative to the face; or subdivided orientations such as the horizontal orientation and the vertical orientation with respect to the left eye portion, and the horizontal orientation and the vertical orientation of the right eye portion, which is not particularly limited in the present embodiment.
One or more ocular electrical signal acquisition systems may be provided at each location to acquire one or more ocular electrical signals.
The target object is a user to be subjected to human-computer interaction, and may be a healthy user or a neurodegenerative disease patient, which is not specifically limited in this embodiment.
Optionally, when the target object needs to perform human-computer interaction, a stimulus-inducing paradigm may prompt the target object to rotate the eyeballs; the electro-ocular signals generated as the eyeballs rotate can then be collected in real time by the electro-ocular signal acquisition system.
After the electro-ocular signal is acquired, the electro-ocular signal can be directly used as input information and directly input into a classification model; the ocular signal may be processed and then input as input information into the classification model, which is not specifically limited in this embodiment.
The processing of the electrical eye signal includes, but is not limited to, data preprocessing, such as noise reduction processing, and feature extraction, such as one or more of time domain feature extraction, frequency domain feature extraction, and time-frequency domain feature extraction, which is not specifically limited in this embodiment.
Step 102, inputting the input information into a feature extraction layer of the classification model to obtain fusion features of the electro-ocular signals in the plurality of directions;
103, inputting the fusion characteristics into a classification layer of the classification model to obtain the sight direction category of the target object; the classification model is trained and acquired according to the eye electric signals in the multiple directions of the sample object and the sight line direction category of the sample object.
The feature extraction layer is used for performing fusion feature extraction on the electro-ocular signals in a plurality of directions in the input information so as to improve the classification precision. The feature extraction layer may be generated based on one or more machine learning model constructions, including but not limited to convolutional neural networks, cyclic networks, residual networks, and the like.
The specific extraction can proceed in either of two ways: in the first, the electro-oculogram signals from the plurality of orientations are fused and features are then extracted from the fused information; in the second, features are extracted from the electro-oculogram signals of each orientation separately and the extracted features are then fused; and so on.
The classification layer is used for learning the fusion characteristics to obtain the sight direction category of the target object.
The classification model may be constructed and generated based on a machine learning model, including but not limited to a convolutional neural network, a cyclic network, a residual error network, and the like, which is not specifically limited in this embodiment.
The classification model is used for learning the corresponding relation between the eye electrical signal and the sight line direction category of the target object. The gaze direction category is used to represent the eyeball rotation direction of the target object, and includes but is not limited to head-up, left-looking, right-looking, down-looking, and up-looking, which is not specifically limited in this embodiment.
The classification model is obtained by training based on the eye electrical signals of the sample object and the real sight direction category corresponding to the eye electrical signals of the sample object.
Optionally, before step 102 is executed, the classification model needs to be trained, specifically, the eye electrical signals in multiple directions of the sample object and the real sight direction category corresponding to the eye electrical signals of the sample object are obtained.
The samples are acquired by using visual stimulation to induce signals of the corresponding sight direction category from the sample object, and by synchronously recording the electro-oculogram signals and the corresponding sight direction categories according to the prompts. Using a multi-process program, the electro-oculogram signals and the corresponding sight direction categories are written to a file for storage while the stimulus is displayed. The file may be in CSV (Comma-Separated Values) or another format.
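A minimal sketch of this recording step follows, assuming for illustration a 250 Hz sampling rate and 2 s epochs (the figures used later in this description); show_cue and read_eog_window are hypothetical placeholders for the stimulus display and acquisition code, which the patent does not detail, and the random cue order is likewise an assumption.

```python
import csv
import random

GAZE_CLASSES = ["center", "left", "right", "down", "up"]  # sight direction categories

def record_session(path="eog_samples.csv", trials=100, fs=250, epoch_s=2):
    n = fs * epoch_s  # samples per epoch: 250 Hz x 2 s = 500
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        header = [f"h{i}" for i in range(n)] + [f"v{i}" for i in range(n)] + ["label"]
        writer.writerow(header)
        for _ in range(trials):
            label = random.choice(GAZE_CLASSES)
            show_cue(label)            # hypothetical: display the visual stimulus cue
            h, v = read_eog_window(n)  # hypothetical: horizontal/vertical EOG epochs
            writer.writerow(list(h) + list(v) + [label])
```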
Then, the model is trained according to the electro-ocular signals in the plurality of directions of the sample object and the real sight line direction category so as to obtain a classification model capable of distinguishing the sight line direction category according to the electro-ocular signals. When the method is used for a scene needing to classify the electro-oculogram signals of the target object, input information determined according to the electro-oculogram signals in multiple directions of the target object can be input into the feature extraction layer to obtain fusion features of the electro-oculogram signals in the multiple directions, then the fusion features are input into the classification layer, and the electro-oculogram signals of the target object collected in the multiple directions are subjected to sight line direction classification so as to quickly and accurately obtain the sight line direction classification of the target object.
And 104, performing moving operation on input equipment connected with the human-computer interaction equipment according to the control strategy corresponding to the type of the sight direction of the target object so as to control the human-computer interaction equipment.
Different sight direction categories correspond to different control strategies; for example, if the sight direction category is looking up, the corresponding control strategy is moving up, and if the sight direction category is looking down, the corresponding control strategy is moving down.
The input device is used for providing input signals for the human-computer interaction device, and further control over the human-computer interaction device is achieved. The input device may be other input devices that can control the human-computer interaction device through a mobile operation, such as a wired mouse or a wireless mouse, which is not specifically limited in this embodiment.
The human-computer interaction device may be an intelligent terminal device such as a computer or a Raspberry Pi, which is not specifically limited in this embodiment.
Wherein the correspondence between the category of the gaze direction and the control strategy is stored in advance in a database.
Optionally, after the category of the gaze direction of the target object is obtained, the category of the gaze direction of the target object may be searched in the database to obtain a control strategy corresponding to the category of the gaze direction of the target object;
and then, controlling the input equipment to perform corresponding movement operation in real time according to a control strategy corresponding to the category of the sight direction of the target object, thereby realizing the control of the human-computer interaction equipment.
For example, if the sight direction category of the target object is looking up and the corresponding control strategy is moving up, the input device can be controlled to move up in real time, thereby controlling the human-computer interaction device. The control includes, but is not limited to, selecting and activating objects presented on a display of the human-computer interaction device.
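A minimal sketch of this step is given below. The patent does not name an input-injection library; pynput is one common Python choice, and the step size in pixels per classified epoch is an assumed tuning parameter. Note that screen coordinates grow downward, so looking down maps to a positive vertical move.

```python
from pynput.mouse import Controller

STEP = 20  # assumed: pixels to move per classified epoch
CONTROL_STRATEGY = {      # sight direction category -> (dx, dy); "center" means hold
    "center": (0, 0),
    "left": (-STEP, 0),
    "right": (STEP, 0),
    "down": (0, STEP),
    "up": (0, -STEP),
}

mouse = Controller()

def apply_strategy(gaze_class):
    dx, dy = CONTROL_STRATEGY[gaze_class]
    if (dx, dy) != (0, 0):
        mouse.move(dx, dy)  # relative move of the system pointer
```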
On one hand, this embodiment collects the electro-oculogram signals of the target object from a plurality of orientations, so the classification model receives rich input information that effectively distinguishes sight direction categories, yielding a more accurate sight direction category of the target object and thus more accurate and effective control of the human-computer interaction device. On the other hand, electro-oculogram signals are relatively uniform across users and the sight direction categories of different users are essentially consistent, so only the electro-oculogram signals of some users need to be used for training; the method can therefore be applied quickly and conveniently to human-computer interaction for a wide range of users, effectively reducing the software and hardware costs of human-computer interaction.
On the basis of the above embodiment, in this embodiment, the feature extraction layer includes a plurality of feature extraction modules and a fusion module; each feature extraction module corresponds to each direction one by one; each feature extraction module is constructed and generated based on a one-dimensional convolutional neural network; correspondingly, the inputting the input information into a feature extraction layer of the classification model to obtain the fusion features of the electro-ocular signals in the plurality of orientations includes: inputting input information determined according to the electro-ocular signals of the target object in each direction into the feature extraction module corresponding to each direction to obtain essential features of the electro-ocular signals of the target object in each direction; and inputting the essential characteristics of the electro-ocular signals of the target object in a plurality of directions into the fusion module to obtain the fusion characteristics of the electro-ocular signals of the target object.
The feature extraction layer comprises a plurality of feature extraction modules, wherein the number of the feature extraction modules is consistent with the number of the orientations for acquiring the electro-oculogram signals of the target object, namely, each feature extraction module corresponds to each orientation one by one.
The feature extraction module is configured and generated based on the one-dimensional convolutional neural network, and the specific number of layers and model parameters, such as the size of the convolutional kernel and the number of the convolutional kernels, may be set according to actual requirements, which is not specifically limited in this embodiment.
Each feature extraction module is used for carrying out feature extraction on the electro-ocular signals of the target object collected in each direction so as to further obtain essential features capable of distinguishing the categories of the sight line directions.
The feature fusion module can be constructed and generated based on a fully-connected network and the like, and is used for fusing essential features of the electro-ocular signals in multiple orientations to obtain fused features of the electro-ocular signals in multiple orientations.
The classification layer is used for learning fusion characteristics of the electro-oculogram signals in multiple directions so as to accurately distinguish visual line direction categories corresponding to the electro-oculogram signals.
Optionally, the specific steps of acquiring the category of the gaze direction of the target object in step 102 and step 103 include:
firstly, inputting input information determined according to the electro-oculogram signals of the target object in each direction into a feature extraction module corresponding to each direction to obtain essential features of the electro-oculogram signals of the target object in each direction;
then, based on a fusion module, fusing the essential characteristics of the electro-ocular signals of the target object in a plurality of directions to obtain the fusion characteristics of the electro-ocular signals of the target object;
and finally, learning the fusion characteristics based on the classification layer so as to distinguish the sight direction category of the target object.
Correspondingly, in the training process, inputting the input information determined by the electro-oculogram signals of the sample object in each direction into the corresponding feature extraction module, the fusion module and the classification layer in sequence to finally obtain the sight direction category of the sample object; and then, performing iterative training on the classification model according to the deviation between the sight direction class and the real sight direction class output by the classification model to obtain the optimal classification model.
As shown in fig. 2, sample data of the classification model, setting of model parameters, and the like are specifically described below according to specific examples.
The number of sample points can be expressed as the length of the input information of the classification network model; it is usually sample_N = f_s × epoch_length, where f_s denotes the sampling rate and epoch_length denotes the epoch duration.
For example, with a sampling rate f_s of 250 Hz, i.e., 250 samples collected per second, and an epoch duration of 2 s, the number of sample points is 500.
The number of leads of the classification model, Channels_N, indicates from how many electrodes the current ocular signal is taken; it is the width of the input information of the classification model, i.e., the number of orientations.
For example, in this embodiment the electro-ocular signals in the horizontal orientation and the vertical orientation are selected to determine the input information, i.e., 2-channel electro-ocular signals serve as the data source, so Channels_N = 2.
The convolution kernel size represents the size (height) of the convolution kernel sliding over the data. Unlike two-dimensional convolution, one-dimensional convolution slides only along the length dimension, so the width of the convolution kernel defaults to the data width, i.e., the number of leads, and the height of the kernel alone determines its size.
The number of convolution kernels represents how many kernels slide over the data. For example, with 3 convolution kernels, as in fig. 2, three different essential features can be extracted.
The classification model can be constructed with a framework such as TensorFlow in Python, or in the C language.
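The following is a minimal TensorFlow/Keras sketch of such a model under the parameters above: one 1-D convolutional feature extraction branch per orientation, a fully-connected fusion module, and a softmax classification layer over the five sight direction categories. The kernel height of 25, the second convolution stage, the pooling sizes, and the dense width are illustrative assumptions beyond what the text specifies.

```python
from tensorflow.keras import layers, Model

SAMPLE_N = 250 * 2   # sample_N = f_s * epoch_length = 250 Hz * 2 s = 500
N_CLASSES = 5        # looking straight ahead, left, right, down, up

def build_branch(name):
    # one feature extraction module (1-D CNN) per orientation
    inp = layers.Input(shape=(SAMPLE_N, 1), name=name)
    x = layers.Conv1D(filters=3, kernel_size=25, activation="relu")(inp)  # 3 kernels as in fig. 2
    x = layers.MaxPooling1D(pool_size=4)(x)
    x = layers.Conv1D(filters=8, kernel_size=9, activation="relu")(x)     # assumed second stage
    x = layers.GlobalAveragePooling1D()(x)                                # essential features
    return inp, x

h_in, h_feat = build_branch("horizontal")
v_in, v_feat = build_branch("vertical")
fused = layers.Concatenate()([h_feat, v_feat])              # fusion module
fused = layers.Dense(32, activation="relu")(fused)
out = layers.Dense(N_CLASSES, activation="softmax")(fused)  # classification layer
model = Model(inputs=[h_in, v_in], outputs=out)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Training then amounts to calling model.fit on the recorded epochs and their sight direction labels, iterating until the deviation between predicted and true categories is acceptable, as described above.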
In the embodiment, after the characteristics of the electro-oculogram signals in different directions are extracted by the plurality of characteristic extraction modules, the essential characteristics of the electro-oculogram signals in the plurality of different directions are fused based on the fusion module, so that the extracted fusion characteristics comprise fusion characteristics capable of distinguishing the visual line direction categories in the plurality of different directions, and further more accurate visual line direction categories of the target object can be acquired.
On the basis of the above embodiment, the plurality of orientations in the present embodiment include a horizontal orientation and a vertical orientation; the acquiring of the electro-ocular signals of the target object from a plurality of orientations and determining the input information of the classification model according to the electro-ocular signals of the target object comprises: acquiring a plurality of ocular electrical signals of the target object from a horizontal orientation and a vertical orientation, respectively; acquiring a differential signal of the electro-ocular signals in the horizontal direction according to the plurality of electro-ocular signals in the horizontal direction; acquiring a differential signal of the electro-ocular signals in the vertical direction according to the plurality of electro-ocular signals in the vertical direction; and determining the input information of the classification model by combining the differential signal of the electro-ocular signal in the horizontal direction and the differential signal of the electro-ocular signal in the vertical direction.
Wherein the horizontal orientation and the vertical orientation are horizontal orientations and vertical orientations corresponding to eye positions in the face image of the target object;
a plurality of eye electric signals can be collected in each direction, for example, for the horizontal direction, one or more eye electric signals can be respectively collected on the left side of a left eye and the right side of a right eye in a face image; for the vertical orientation, one or more ocular electrical signals may be acquired respectively at the upper and lower portions of the left eye in the face image, or one or more ocular electrical signals may be acquired respectively at the upper and lower portions of the right eye in the face image, and so on.
Optionally, acquiring a plurality of ocular electrical signals of the target object from a horizontal position and acquiring a plurality of ocular electrical signals of the target object from a vertical position, respectively;
after acquiring the plurality of electrical ocular signals in the horizontal direction, a differential signal of the electrical ocular signals in the horizontal direction may be acquired according to the plurality of electrical ocular signals in the horizontal direction to measure a change in the horizontal direction of the eye of the target object.
For example, the electrical eye signals collected on the left side of the left eye and the right side of the right eye are used as a set of data, and are subtracted to obtain a differential signal of the electrical eye signals in the horizontal direction.
After the plurality of electrical eye signals in the vertical direction are acquired, a differential signal of the electrical eye signals in the vertical direction may be acquired according to the plurality of electrical eye signals in the vertical direction to measure a change in the vertical direction of the eye of the target object.
For example, the electro-ocular signals collected at the upper and lower parts of the left eye in the face image are used as a group of data, and subtraction is performed to obtain the differential signals of the electro-ocular signals in the vertical direction;
or the electro-oculogram signals collected at the upper part and the lower part of the right eye in the face image are used as a group of data, and subtraction is carried out to obtain the difference signal of the electro-oculogram signals in the vertical direction.
Finally, the differential signal of the electro-ocular signal in the horizontal direction and the differential signal of the electro-ocular signal in the vertical direction are combined to be directly used as input information of the classification model; after being processed, the processed data may be used as input information of the classification model, which is not particularly limited in this embodiment.
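A minimal sketch of the differential-signal computation, assuming four raw electrode recordings of equal length as NumPy arrays; the sign convention (left minus right, upper minus lower) is an assumption, since only the subtraction itself is specified.

```python
import numpy as np

def differential_inputs(left, right, upper, lower):
    h_diff = np.asarray(left) - np.asarray(right)   # horizontal-orientation differential signal
    v_diff = np.asarray(upper) - np.asarray(lower)  # vertical-orientation differential signal
    return h_diff, v_diff
```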
In the embodiment, the plurality of electro-ocular signals are collected in the horizontal direction and the vertical direction respectively, the differential signal of the electro-ocular signals in the horizontal direction and the differential signal of the electro-ocular signals in the vertical direction are calculated, and input information capable of effectively distinguishing the categories of the sight line direction can be further mined, so that the classification result is more reliable and accurate.
On the basis of the above embodiment, the acquiring a plurality of ocular electrical signals of the target object from the horizontal direction and the vertical direction respectively in the embodiment includes: acquiring a plurality of eye electrical signals of the target object from a horizontal orientation and a vertical orientation, respectively, based on a biosensor fixed to a face of the target object; the biosensor comprises a first electrode, a second electrode and a third electrode; the first electrodes are fixed on two sides of the head of the target object, are in the same horizontal direction with the eyes of the target object, and are used for acquiring a plurality of eye electric signals of the target object from the horizontal direction; the second electrodes are fixed above and below the eyes of the target object and are in the same vertical direction with the eyes of the target object, and are used for acquiring a plurality of eye electric signals of the target object from the vertical direction; the third electrode is fixed at the root of the ear of the target object and is used for providing a ground reference voltage for the isolation side of the biosensor.
The biosensor can be an OpenBCI Cyton Board, a low-cost open-source software and hardware biosensing device. It provides 8 EEG (electroencephalography) channels built around the Texas Instruments ADS1299 amplifier and an 8-bit ATmega328P microcontroller, and is portable, autonomous, and flexible. The biosensor can therefore quickly and conveniently collect electro-ocular signals from 8 different orientations.
The differential signal of the eye electrical signal is an action potential difference value generated by upper and lower eye muscles or left and right eye muscles when the eyeball moves towards the corresponding direction. For this purpose, the biological signals of the eyes of the target object can be recorded by means of two channels, one for acquiring the electrical eye signals in the horizontal direction and the other for the electrical eye signals in the vertical direction.
As shown in fig. 3, a gold cup electrode in the biosensor is used together with an electrode paste to improve conductivity; the biosensor includes three types of electrodes, a first electrode, a second electrode, and a third electrode. Each electrode is fixed in place and fits tightly against the face of the target user.
The first electrodes are used for acquiring horizontal eye electrical signals, the number of the first electrodes is at least two, the first electrodes are respectively fixed on two sides of the head of the face of the target object and are in the same horizontal direction with the eyes, for example, the first electrodes are fixed on the left side of the left eye of the target object and are in the same horizontal direction with the left eye, and the first electrodes are fixed on the right side of the right eye of the target object and are in the same horizontal direction with the right eye.
The second electrodes are used for acquiring vertical eye electric signals, the number of the second electrodes is at least two, and the second electrodes are respectively fixed above and below the center of the right eye or above and below the center of the left eye of the target object.
A third electrode is fixed at the root of the ear of the target object to provide a ground reference voltage for the isolated side of the biosensor, the battery voltage being split symmetrically around this ground.
The collected electro-oculogram signals can be transmitted to the execution body via Bluetooth, a serial port, or similar means, so that the execution body can analyze the signals and then control the human-computer interaction device.
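For illustration, the sketch below reads the Cyton board through BrainFlow, an open driver commonly used with OpenBCI hardware; the patent does not name an acquisition library, and the serial port is machine-specific, so both are assumptions.

```python
import time
from brainflow.board_shim import BoardShim, BrainFlowInputParams, BoardIds

params = BrainFlowInputParams()
params.serial_port = "/dev/ttyUSB0"   # assumed: serial port of the Cyton USB dongle
board = BoardShim(BoardIds.CYTON_BOARD.value, params)
board.prepare_session()
board.start_stream()
time.sleep(2)                          # one 2 s epoch at 250 Hz -> ~500 samples
data = board.get_board_data()          # 2-D array: rows are channels, columns are samples
eog_rows = BoardShim.get_eeg_channels(BoardIds.CYTON_BOARD.value)  # signal-channel indices
board.stop_stream()
board.release_session()
```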
According to this embodiment, a plurality of electro-oculogram signals of the target object can be acquired quickly and conveniently from different orientations simply by fixing several electrodes to the face of the target user, so that the user can control the interactive graphical user system using eye movement alone. This technique improves and facilitates the user's interaction with the computer in a more intuitive, natural viewing manner, and is a way of controlling a computer that is very friendly to people with physical disabilities.
On the basis of the foregoing embodiments, in this embodiment, the determining input information of a classification model according to an eye electrical signal of the target object includes: preprocessing the electro-ocular signal of the target object; wherein the preprocessing comprises filtering processing and normalization processing; the filtering processing comprises direct current component removal, noise signal processing and Butterworth low-pass filtering processing; and taking the preprocessed electro-ocular signal of the target object as the input information of the classification model.
The electro-ocular signal of the target object is acquired through an electro-ocular signal collector; owing to collector errors and faults, human factors, and the like, the acquired signal contains a great deal of noise.
As shown in fig. 4 and 5, the electro-ocular signals for looking left and for looking right each contain a large amount of noise. Therefore, so that the sight direction category of the target object can be distinguished quickly and accurately, the electro-ocular signal of the target object is preprocessed before the input information of the classification model is determined in step 101.
The preprocessing includes a filtering process and a normalization process.
As shown in fig. 6, the filtering process includes removal of the direct-current component, noise signal processing, and Butterworth low-pass filtering, which make the electro-ocular signal smoother.
The noise signal processing may remove power-frequency noise at a preset frequency (e.g., 50 Hz) or apply other noise-reduction methods.
The Butterworth low-pass filter has a maximally flat frequency response in the pass band, with no ripple, and falls gradually to zero in the stop band. The magnitude of its transfer function is:

|H(jω)| = G₀ / √(1 + (ω/ω_c)^(2n))

where n is the order of the Butterworth low-pass filter, ω_c is the cut-off frequency, G₀ is the direct-current gain, and ω is the current frequency.
After the filtering, the electro-ocular signal of the target object can be normalized to eliminate the influence of scale on the classification result.
Compared with the electro-oculogram signals before preprocessing, the electro-oculogram signals after preprocessing are smoother, noise data are effectively eliminated, and therefore more accurate and effective human-computer interaction can be carried out.
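A minimal SciPy sketch of this preprocessing chain follows; the low-pass cutoff, filter order, notch Q factor, and z-score normalization scheme are assumed values, since the patent names the steps but not these parameters.

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

def preprocess(x, fs=250, cutoff=10.0, order=4):
    x = x - np.mean(x)                        # remove the direct-current component
    b, a = iirnotch(w0=50.0, Q=30.0, fs=fs)   # suppress 50 Hz power-frequency noise
    x = filtfilt(b, a, x)
    b, a = butter(order, cutoff, btype="low", fs=fs)  # Butterworth low-pass filter
    x = filtfilt(b, a, x)
    return (x - np.mean(x)) / np.std(x)       # assumed z-score normalization
```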
On the basis of the foregoing embodiments, in this embodiment, the determining input information of a classification model according to an eye electrical signal of the target object includes: based on a feature extraction algorithm, carrying out feature extraction on the electro-oculogram signal of the target object to obtain input information of the classification model; the feature extraction algorithm comprises a wavelet transform and a time domain analysis method.
Optionally, before performing the input information of determining the classification model in step 101, the ocular electrical signals of the target object are characterized to extract effective features that can distinguish the categories of the gaze direction from the ocular electrical signals; and then, according to the effective characteristics extracted from the electro-ocular signals, determining the input information of the classification model, so that the classification model can rapidly and accurately distinguish the sight direction categories corresponding to the electro-ocular signals of the target object.
The feature extraction algorithm comprises a wavelet transform and time domain analysis method;
the characteristics of certain aspects of the electro-oculogram signals can be fully highlighted through wavelet transformation, the local analysis of time (space) frequency can be realized, the signals (functions) are gradually subjected to multi-scale refinement through telescopic translation operation, finally, the time subdivision at high frequency and the frequency subdivision at low frequency are achieved, the requirements of time-frequency signal analysis can be automatically adapted, therefore, any details of the signals can be focused, and the time-frequency domain detail characteristics capable of distinguishing the category of the sight line direction are extracted.
Time-domain analysis includes, but is not limited to, statistics such as the mean, maximum, and minimum values, which this example does not limit; it can improve the signal-to-noise ratio, after which time-domain features that distinguish the sight direction categories are extracted.
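As a sketch of this feature-extraction option, the following combines a multi-level wavelet decomposition (via the PyWavelets package) with simple time-domain statistics; the wavelet family, decomposition level, and energy-based summary are illustrative assumptions.

```python
import numpy as np
import pywt  # PyWavelets

def extract_features(x, wavelet="db4", level=4):
    coeffs = pywt.wavedec(x, wavelet, level=level)        # time-frequency detail coefficients
    wavelet_energy = [float(np.sum(c ** 2)) for c in coeffs]
    time_domain = [float(np.mean(x)), float(np.max(x)), float(np.min(x))]
    return np.array(wavelet_energy + time_domain)         # classification-model input
```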
On the basis of the above embodiments, the line-of-sight direction categories in the present embodiment include looking straight ahead, looking left, looking right, looking down, and looking up; the control strategies include hold, move left, move right, move down, and move up.
The different types of the sight directions correspond to different control strategies and are used for controlling the input equipment of the movable control human-computer interaction equipment to carry out different movement operations, and therefore human-computer interaction is achieved.
As shown in fig. 7, when designing the stimulus-inducing paradigm, the target object is prompted to look in five directions; that is, the sight direction category includes five categories: looking straight ahead, looking left, looking right, looking down, and looking up.
Accordingly, there are also five control strategies: hold, move left, move right, move down, and move up.
The correspondence between sight direction categories and control strategies can be set according to actual requirements: for example, looking straight ahead corresponds to keeping the input device still, looking left to moving it left, looking right to moving it right, looking down to moving it down, and looking up to moving it up.
It should be noted that, in this embodiment, the gaze direction category and the control policy may be divided into more fine-grained categories, instead of being limited to the above several gaze direction categories and control policies.
In this embodiment, the user can generate different types of electro-ocular signals simply by rotating the eyeballs, thereby controlling the human-computer interaction device simply and effectively. In addition, electro-ocular signals differ little between users, so training on each individual user's signals is unnecessary; the method suits multiple users and keeps the cost of human-computer interaction low.
As shown in fig. 8, the overall flowchart of the human-computer interaction method based on the electro-oculogram signal in this embodiment is shown, which includes a training step and an application step;
the training step comprises model training, precision testing and acquisition of an optimal classification model;
the application steps comprise electro-oculogram signal acquisition, electro-oculogram signal preprocessing and feature extraction, sight direction category acquisition and movement operation on the input equipment according to the control strategy corresponding to the sight direction category so as to realize man-machine interaction.
The human-computer interaction method provided by this embodiment enables a wide range of users to interact effectively with terminal devices such as a computer or a Raspberry Pi, allowing them to control, select, and activate objects presented on the display using electro-ocular signals. Eye-gaze tracking replaces, or supplements, the conventional input of the human-computer interaction device, effectively promoting human-computer interaction and eliminating the bandwidth asymmetry.
The following describes the human-computer interaction system based on the electro-oculogram signal provided by the present invention, and the human-computer interaction system based on the electro-oculogram signal described below and the human-computer interaction method based on the electro-oculogram signal described above can be referred to correspondingly.
As shown in fig. 9, the present embodiment provides a human-computer interaction system based on an eye electrical signal, which includes an acquisition module 901, a fusion feature extraction module 902, a classification module 903, and an interaction module 904, where:
the collecting module 901 is configured to collect the eye electrical signals of the target object from multiple orientations, and determine the input information of the classification model according to the eye electrical signals of the target object.
The plurality of orientations include, but are not limited to, a horizontal orientation and a vertical orientation relative to the face, or finer subdivisions such as horizontal and vertical orientations relative to the left eye and to the right eye separately; this embodiment does not specifically limit them.
One or more ocular electrical signal acquisition systems may be positioned at each location to acquire one or more ocular electrical signals.
The target object is a user to be subjected to human-computer interaction, and may be a healthy user or a neurodegenerative disease patient, which is not specifically limited in this embodiment.
Optionally, when the target object needs to perform human-computer interaction, the stimulus-inducing paradigm may prompt the target object to rotate the eyeballs; the electro-oculogram signals generated as the eyeballs rotate can then be collected in real time by the acquisition system.
After acquisition, the electro-oculogram signal can be used directly as the input information of the classification model, or it can first be processed and the result used as the input information; this embodiment does not specifically limit this.
Processing of the electro-oculogram signal includes, but is not limited to, data preprocessing, such as noise reduction, and feature extraction, such as one or more of time-domain, frequency-domain, and time-frequency-domain feature extraction; this embodiment does not specifically limit it.
The fusion feature extraction module 902 is configured to input the input information into a feature extraction layer of the classification model to obtain fusion features of the electro-oculogram signals in the plurality of orientations;
the classification module 903 is configured to input the fusion features into a classification layer of the classification model to obtain a gaze direction category of the target object; and the classification model is trained and acquired according to the electro-ocular signals in the plurality of directions of the sample object and the sight direction category of the sample object.
The feature extraction layer is used to extract fusion features from the electro-oculogram signals of the plurality of orientations in the input information; it may be constructed from one or more machine learning models, including but not limited to convolutional neural networks, recurrent networks, residual networks, and the like.
The extraction can proceed in either of two ways: in the first, the electro-oculogram signals from the plurality of orientations are fused first, and features are then extracted from the fused information; in the second, features are extracted from the signal of each orientation separately and then fused. This embodiment does not specifically limit the choice; both orders are sketched below.
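The two orders can be sketched schematically as follows, with `extract` standing in for any feature extractor; the function names are illustrative assumptions:

```python
import numpy as np

def early_fusion(signals, extract):
    """Mode 1: fuse the multi-orientation signals first, then extract features."""
    fused = np.concatenate(signals, axis=-1)   # e.g. stack horizontal + vertical windows
    return extract(fused)

def late_fusion(signals, extract):
    """Mode 2: extract features per orientation, then fuse the feature vectors."""
    features = [extract(s) for s in signals]
    return np.concatenate(features, axis=-1)
```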
The classification layer is used to map the fusion features to the gaze direction category of the target object.
The classification model may be constructed based on a machine learning model, including but not limited to a convolutional neural network, a recurrent network, a residual network, and the like; this embodiment does not specifically limit it.
The classification model learns the correspondence between the electro-oculogram signals and the gaze direction category of the target object. The gaze direction category represents the eyeball rotation direction of the target object and includes, but is not limited to, looking straight ahead, looking left, looking right, looking down, and looking up; this embodiment does not specifically limit it.
The classification model is obtained by training based on the eye electrical signals of the sample object and the real sight direction category corresponding to the eye electrical signals of the sample object.
Optionally, before step 102 is executed, the classification model needs to be trained, specifically, the eye electrical signals in multiple directions of the sample object and the real sight direction category corresponding to the eye electrical signals of the sample object are obtained.
Samples are acquired by using visual stimulation to induce the sample object to produce signals of the corresponding gaze direction categories, and by synchronously recording the electro-oculogram signals together with the gaze direction categories indicated by the prompts. Using a multi-process program, the electro-oculogram signals and their corresponding gaze direction categories are written to a file for storage while the stimulus is being displayed; the file may be stored in CSV or another format.
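A minimal sketch of such a multi-process recording setup is given below, assuming a writer process fed through a queue; the column names and the two-channel layout are illustrative assumptions:

```python
import csv
import multiprocessing as mp

def recorder(queue, path: str) -> None:
    """Writer process: drain (horizontal, vertical, label) samples into a CSV file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["eog_horizontal", "eog_vertical", "gaze_label"])
        while True:
            item = queue.get()
            if item is None:          # sentinel: stimulus presentation finished
                break
            writer.writerow(item)

# The stimulus-presentation process pushes samples while the cue is on screen:
#   queue.put((h_sample, v_sample, current_cue_label))
# and sends None when the session ends.
```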
The model is then trained on the electro-oculogram signals from the plurality of orientations of the sample object and the true gaze direction categories, yielding a classification model that can distinguish gaze direction categories from electro-oculogram signals. When the electro-oculogram signals of a target object need to be classified, the input information determined from the signals collected in the plurality of orientations is fed into the feature extraction layer to obtain the fusion features, and the fusion features are fed into the classification layer, so that the gaze direction category of the target object is obtained quickly and accurately.
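As one possible realization of this training stage, the sketch below assumes a PyTorch model that takes the horizontal and vertical electro-oculogram windows as two inputs (such as the two-branch network sketched later in this description); the epoch count and learning rate are arbitrary illustrative choices:

```python
import torch
from torch import nn

def train_classifier(model: nn.Module, loader, epochs: int = 20, lr: float = 1e-3):
    """Generic supervised training loop for the gaze-direction classifier."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x_h, x_v, y in loader:    # horizontal/vertical EOG windows + true labels
            optimizer.zero_grad()
            loss = loss_fn(model(x_h, x_v), y)
            loss.backward()
            optimizer.step()
    return model
```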
The interaction module 904 is configured to move an input device connected to the human-computer interaction device according to the control strategy corresponding to the gaze direction category of the target object, so as to control the human-computer interaction device.
Different gaze direction categories correspond to different control strategies; for example, if the gaze direction category is looking up, the corresponding control strategy is moving up, and if it is looking down, the corresponding strategy is moving down.
The input device provides input signals to the human-computer interaction device and thereby controls it. The input device may be a wired or wireless mouse, or any other input device that can control the human-computer interaction device through movement operations; this embodiment does not specifically limit it.
The human-computer interaction device may be an intelligent terminal device such as a computer or a Raspberry Pi, which this embodiment does not specifically limit.
Wherein the correspondence between the category of the gaze direction and the control strategy is stored in advance in a database.
Optionally, after the gaze direction category of the target object is obtained, it may be looked up in the database to obtain the corresponding control strategy; the input device is then controlled in real time to perform the corresponding movement operation according to that control strategy, thereby realizing control of the human-computer interaction device.
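For concreteness, one way (among several) to perform the movement operation on a mouse-like input device from Python is the pynput library; the step sizes repeat the illustrative table above and are not fixed by this embodiment:

```python
from pynput.mouse import Controller   # assumption: pynput as one possible pointer driver

# Illustrative strategy table, repeated here for self-containment.
CONTROL_STRATEGIES = {"ahead": (0, 0), "left": (-10, 0), "right": (10, 0),
                      "down": (0, 10), "up": (0, -10)}

mouse = Controller()

def apply_strategy(gaze_category: str) -> None:
    """Move the pointer according to the control strategy for a gaze category."""
    dx, dy = CONTROL_STRATEGIES[gaze_category]
    mouse.move(dx, dy)                # relative move; (0, 0) realizes "hold"
```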
On one hand, this embodiment collects the electro-oculogram signals of the target object from a plurality of orientations, so that the classification model receives rich and effective input information for distinguishing gaze direction categories, which yields a more accurate gaze direction category for the target object and therefore more accurate and effective control of the human-computer interaction device. On the other hand, electro-oculogram signals are relatively uniform and gaze direction categories are largely consistent across users, so the model only needs to be trained on the signals of a subset of users; the human-computer interaction system can thus be quickly and conveniently applied to many users, effectively reducing the software and hardware cost of human-computer interaction.
On the basis of the above embodiment, in this embodiment the feature extraction layer includes a plurality of feature extraction modules and a fusion module; each feature extraction module corresponds one-to-one to an orientation and is constructed based on a one-dimensional convolutional neural network. Accordingly, the fusion feature extraction module is configured to: input the input information determined from the electro-oculogram signals of the target object in each orientation into the feature extraction module corresponding to that orientation, obtaining the essential features of the target object's electro-oculogram signals in each orientation; and input the essential features of the electro-oculogram signals in the plurality of orientations into the fusion module, obtaining the fusion features of the target object's electro-oculogram signals.
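A minimal PyTorch sketch of such a two-branch architecture is shown below; the channel counts, kernel sizes, and feature dimensions are illustrative assumptions, not values prescribed by this embodiment:

```python
import torch
from torch import nn

class OrientationEncoder(nn.Module):
    """One-dimensional CNN branch for a single acquisition orientation."""
    def __init__(self, out_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(32, out_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)            # x: (batch, 1, n_samples)

class GazeClassifier(nn.Module):
    """Horizontal and vertical branches fused by concatenation, then classified."""
    def __init__(self, n_classes: int = 5):
        super().__init__()
        self.h_branch = OrientationEncoder()
        self.v_branch = OrientationEncoder()
        self.classifier = nn.Linear(64, n_classes)   # fusion of 32 + 32 features

    def forward(self, x_h: torch.Tensor, x_v: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.h_branch(x_h), self.v_branch(x_v)], dim=1)
        return self.classifier(fused)
```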
On the basis of the above embodiment, the plurality of orientations in the present embodiment include a horizontal orientation and a vertical orientation; the acquisition module is further specifically configured to: acquiring a plurality of ocular signals of the target object from a horizontal orientation and a vertical orientation, respectively; acquiring a differential signal of the electro-ocular signals in the horizontal direction according to the plurality of electro-ocular signals in the horizontal direction; acquiring a differential signal of the electro-ocular signals in the vertical direction according to the plurality of electro-ocular signals in the vertical direction; and determining the input information of the classification model by combining the differential signal of the electro-ocular signal in the horizontal direction and the differential signal of the electro-ocular signal in the vertical direction.
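The differential signals can be formed directly from the paired channels; the sketch below assumes each orientation yields a (2, n_samples) array and that the subtraction order (e.g. left minus right, above minus below) follows the electrode montage — an illustrative assumption:

```python
import numpy as np

def differential_input(h_signals: np.ndarray, v_signals: np.ndarray) -> np.ndarray:
    """Build classifier input from the horizontal and vertical electrode pairs.

    h_signals, v_signals: (2, n_samples) arrays; the subtraction order is an
    illustrative assumption about the electrode montage.
    """
    h_diff = h_signals[0] - h_signals[1]    # horizontal differential channel
    v_diff = v_signals[0] - v_signals[1]    # vertical differential channel
    return np.stack([h_diff, v_diff])       # shape: (2, n_samples)
```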
On the basis of the foregoing embodiment, the acquisition module in this embodiment is further specifically configured to: acquire the plurality of electro-oculogram signals of the target object from the horizontal orientation and the vertical orientation based on a biosensor fixed to the face of the target object. The biosensor comprises a first electrode, a second electrode, and a third electrode: the first electrodes are fixed on both sides of the target object's head, at the same horizontal level as the eyes, and acquire the electro-oculogram signals from the horizontal orientation; the second electrodes are fixed above and below the eyes, on the same vertical line as the eyes, and acquire the electro-oculogram signals from the vertical orientation; the third electrode is fixed at the base of the target object's ear and provides a ground reference voltage for the isolated side of the biosensor.
On the basis of the above embodiments, this embodiment further includes a preprocessing module specifically configured to: preprocess the electro-oculogram signal of the target object, where the preprocessing comprises filtering and normalization, and the filtering comprises DC component removal, noise-signal processing, and Butterworth low-pass filtering; the preprocessed electro-oculogram signal of the target object is then used as the input information of the classification model.
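A minimal sketch of this preprocessing chain using SciPy is given below; the sampling rate, cutoff frequency, and filter order are illustrative assumptions (electro-oculogram energy is concentrated well below roughly 30 Hz, so a low cutoff is common):

```python
import numpy as np
from scipy.signal import butter, filtfilt

def preprocess(eog: np.ndarray, fs: float = 250.0, cutoff: float = 30.0) -> np.ndarray:
    """DC removal, Butterworth low-pass filtering, and normalization."""
    x = eog - np.mean(eog)                        # remove the DC component
    b, a = butter(4, cutoff / (fs / 2), btype="low")
    x = filtfilt(b, a, x)                         # zero-phase low-pass, attenuates noise
    return (x - x.mean()) / (x.std() + 1e-8)      # z-score normalization
```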
On the basis of the foregoing embodiments, this embodiment further includes a feature extraction module specifically configured to: extract features from the electro-oculogram signal of the target object based on a feature extraction algorithm to obtain the input information of the classification model, where the feature extraction algorithm comprises wavelet transform and time-domain analysis.
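One hedged sketch of such a feature extractor, combining wavelet sub-band energies with simple time-domain statistics via the PyWavelets package, is shown below; the wavelet family ("db4"), the decomposition level, and the chosen statistics are illustrative assumptions:

```python
import numpy as np
import pywt  # PyWavelets

def extract_features(eog: np.ndarray) -> np.ndarray:
    """Wavelet sub-band energies plus simple time-domain statistics."""
    coeffs = pywt.wavedec(eog, "db4", level=4)              # wavelet decomposition
    band_energy = [float(np.sum(c ** 2)) for c in coeffs]   # energy per sub-band
    time_stats = [eog.mean(), eog.std(), eog.max(), eog.min()]  # time-domain features
    return np.asarray(band_energy + time_stats)
```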
On the basis of the above embodiments, the line-of-sight direction categories in the present embodiment include looking straight ahead, looking left, looking right, looking down, and looking up; the control strategies include hold, move left, move right, move down, and move up.
Fig. 10 illustrates a physical structure diagram of an electronic device. As shown in fig. 10, the electronic device may include: a processor 1001, a communications interface 1002, a memory 1003, and a communication bus 1004, where the processor 1001, the communications interface 1002, and the memory 1003 communicate with each other via the communication bus 1004. The processor 1001 may invoke logic instructions in the memory 1003 to perform the human-computer interaction method based on electro-oculogram signals, the method comprising: acquiring electro-oculogram signals of a target object from a plurality of orientations, and determining input information of a classification model according to the electro-oculogram signals of the target object; inputting the input information into a feature extraction layer of the classification model to obtain fusion features of the electro-oculogram signals in the plurality of orientations; inputting the fusion features into a classification layer of the classification model to obtain the gaze direction category of the target object; and moving an input device connected with the human-computer interaction device according to a control strategy corresponding to the gaze direction category of the target object, so as to control the human-computer interaction device; wherein the classification model is trained according to the electro-oculogram signals in the plurality of orientations of a sample object and the gaze direction category of the sample object.
In addition, the logic instructions in the memory 1003 may be implemented in the form of software functional units and, when sold or used as an independent product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product that is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
In another aspect, the present invention also provides a computer program product comprising a computer program that may be stored on a non-transitory computer-readable storage medium; when the computer program is executed by a processor, a computer can execute the human-computer interaction method based on electro-oculogram signals provided by the above methods, the method comprising: acquiring electro-oculogram signals of a target object from a plurality of orientations, and determining input information of a classification model according to the electro-oculogram signals of the target object; inputting the input information into a feature extraction layer of the classification model to obtain fusion features of the electro-oculogram signals in the plurality of orientations; inputting the fusion features into a classification layer of the classification model to obtain the gaze direction category of the target object; moving an input device connected with the human-computer interaction device according to a control strategy corresponding to the gaze direction category of the target object, so as to control the human-computer interaction device; wherein the classification model is trained according to the electro-oculogram signals in the plurality of orientations of a sample object and the gaze direction category of the sample object.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the human-computer interaction method based on electro-oculogram signals provided by the above methods, the method comprising: acquiring electro-oculogram signals of a target object from a plurality of orientations, and determining input information of a classification model according to the electro-oculogram signals of the target object; inputting the input information into a feature extraction layer of the classification model to obtain fusion features of the electro-oculogram signals in the plurality of orientations; inputting the fusion features into a classification layer of the classification model to obtain the gaze direction category of the target object; moving an input device connected with the human-computer interaction device according to a control strategy corresponding to the gaze direction category of the target object, so as to control the human-computer interaction device; wherein the classification model is trained according to the electro-oculogram signals in the plurality of orientations of a sample object and the gaze direction category of the sample object.
The above-described system embodiments are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment may be implemented by software plus a necessary general hardware platform, or by hardware. Based on this understanding, the above technical solutions, in essence or in the part contributing to the prior art, may be embodied in the form of a software product, which may be stored in a computer-readable storage medium such as ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the various embodiments or parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A human-computer interaction method based on electro-oculogram signals, which is characterized by comprising the following steps:
acquiring ocular electrical signals of a target object from a plurality of orientations, and determining input information of a classification model according to the ocular electrical signals of the target object;
inputting the input information into a feature extraction layer of the classification model to obtain fusion features of the electro-ocular signals in the plurality of directions;
inputting the fusion characteristics into a classification layer of the classification model to obtain the sight direction category of the target object;
according to the control strategy corresponding to the category of the sight direction of the target object, carrying out mobile operation on input equipment connected with the human-computer interaction equipment so as to control the human-computer interaction equipment;
and the classification model is trained and acquired according to the electro-ocular signals in the plurality of directions of the sample object and the sight direction category of the sample object.
2. The electro-ocular signal based human-computer interaction method of claim 1, wherein the feature extraction layer comprises a plurality of feature extraction modules and a fusion module; each feature extraction module corresponds to each direction one by one; each feature extraction module is constructed and generated based on a one-dimensional convolutional neural network;
correspondingly, the inputting the input information into a feature extraction layer of the classification model to obtain the fusion features of the electro-ocular signals in the plurality of orientations includes:
inputting input information determined according to the electro-oculogram signals of the target object in each direction into the feature extraction module corresponding to each direction to obtain essential features of the electro-oculogram signals of the target object in each direction;
and inputting the essential characteristics of the electro-ocular signals of the target object in a plurality of directions into the fusion module to obtain the fusion characteristics of the electro-ocular signals of the target object.
3. The electro-ocular signal based human-computer interaction method of claim 1, wherein the plurality of orientations include a horizontal orientation and a vertical orientation;
the acquiring of the electro-ocular signals of the target object from a plurality of orientations and determining the input information of the classification model according to the electro-ocular signals of the target object comprises:
acquiring a plurality of ocular electrical signals of the target object from a horizontal orientation and a vertical orientation, respectively;
acquiring a differential signal of the electro-ocular signals in the horizontal direction according to the plurality of electro-ocular signals in the horizontal direction;
acquiring a differential signal of the electro-ocular signals in the vertical direction according to the plurality of electro-ocular signals in the vertical direction;
and determining the input information of the classification model by combining the differential signal of the electro-ocular signal in the horizontal direction and the differential signal of the electro-ocular signal in the vertical direction.
4. The electro-ocular signal based human-computer interaction method of claim 3, wherein the acquiring the plurality of electro-ocular signals of the target object from the horizontal orientation and the vertical orientation, respectively, comprises:
acquiring a plurality of eye electrical signals of the target object from a horizontal orientation and a vertical orientation, respectively, based on a biosensor fixed to a face of the target object;
the biosensor comprises a first electrode, a second electrode and a third electrode;
the first electrodes are fixed on two sides of the head of the target object, are in the same horizontal direction with the eyes of the target object, and are used for acquiring a plurality of eye electric signals of the target object from the horizontal direction;
the second electrodes are fixed above and below the eyes of the target object and are in the same vertical direction with the eyes of the target object, and are used for acquiring a plurality of eye electric signals of the target object from the vertical direction;
the third electrode is fixed at the root of the ear of the target object and is used for providing a ground reference voltage for the isolation side of the biosensor.
5. The method for human-computer interaction based on the electro-ocular signal of any one of claims 1-4, wherein the determining the input information of the classification model according to the electro-ocular signal of the target object comprises:
preprocessing the electro-ocular signal of the target object;
wherein the preprocessing comprises filtering processing and normalization processing;
the filtering processing comprises direct current component removal, noise signal processing and Butterworth low-pass filtering processing;
and taking the preprocessed electro-ocular signal of the target object as the input information of the classification model.
6. The method for human-computer interaction based on the electro-ocular signal of any one of claims 1-4, wherein the determining the input information of the classification model according to the electro-ocular signal of the target object comprises:
based on a feature extraction algorithm, carrying out feature extraction on the electro-oculogram signal of the target object to obtain input information of the classification model;
the feature extraction algorithm comprises a wavelet transform and a time domain analysis method.
7. The electro-oculogram signal based human-computer interaction method according to any one of claims 1-4, wherein said sight direction categories include looking straight ahead, looking left, looking right, looking down, and looking up;
the control strategies include hold, move left, move right, move down, and move up.
8. A human-computer interaction system based on electro-oculogram signals, comprising:
the system comprises an acquisition module, a classification module and a classification module, wherein the acquisition module is used for acquiring the electro-ocular signals of a target object from a plurality of directions and determining the input information of the classification model according to the electro-ocular signals of the target object;
the fusion feature extraction module is used for inputting the input information into a feature extraction layer of the classification model to obtain fusion features of the electro-ocular signals in the plurality of directions;
the classification module is used for inputting the fusion characteristics into a classification layer of the classification model to obtain the sight direction category of the target object;
the interaction module is used for carrying out moving operation on input equipment connected with the human-computer interaction equipment according to a control strategy corresponding to the type of the sight direction of the target object so as to control the human-computer interaction equipment;
and the classification model is trained and acquired according to the electro-ocular signals in the plurality of directions of the sample object and the sight direction category of the sample object.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method for human-computer interaction based on an electro-ocular signal as claimed in any one of claims 1 to 7 when executing the program.
10. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the method for human-computer interaction based on an electro-ocular signal of any of claims 1 to 7.
CN202210489333.XA 2022-05-06 2022-05-06 Man-machine interaction method and system based on electro-oculogram signals Active CN114970608B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210489333.XA CN114970608B (en) 2022-05-06 2022-05-06 Man-machine interaction method and system based on electro-oculogram signals

Publications (2)

Publication Number Publication Date
CN114970608A true CN114970608A (en) 2022-08-30
CN114970608B CN114970608B (en) 2023-06-02

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101598973A (en) * 2009-06-26 2009-12-09 安徽大学 Man-machine interactive system based on electro-ocular signal
CN102968072A (en) * 2012-11-09 2013-03-13 上海大学 Electro-oculogram control system and method based on correction/training
CN103070682A (en) * 2013-01-09 2013-05-01 清华大学 Extraction method and extraction device for direction of eye movement based on electro-oculogram signals
CN105739444A (en) * 2016-04-06 2016-07-06 济南大学 Manipulator multiparameter controlling brain-computer interface
CN106775023A (en) * 2017-01-09 2017-05-31 成都信息工程大学 Electro-ocular signal acquisition method and the bluetooth mouse system based on electro-ocular signal control
CN108491792A (en) * 2018-03-21 2018-09-04 安徽大学 Office scene human-computer interaction Activity recognition method based on electro-ocular signal
CN109032347A (en) * 2018-07-06 2018-12-18 昆明理工大学 One kind controlling mouse calibration method based on electro-ocular signal
CN109063552A (en) * 2018-06-22 2018-12-21 深圳大学 A kind of multi-lead electrocardiosignal classification method and system
CN109144238A (en) * 2018-05-14 2019-01-04 孙佳楠 A kind of man-machine interactive system and its exchange method based on eye electricity coding
CN110584654A (en) * 2019-10-09 2019-12-20 中山大学 Multi-mode convolutional neural network-based electrocardiosignal classification method
WO2020151468A1 (en) * 2019-01-22 2020-07-30 岳秀兰 Vehicle remote driving system established by primary and secondary wireless devices by means of internet of things connection
CN112043473A (en) * 2020-09-01 2020-12-08 西安交通大学 Parallel nested and autonomous preferred classifier for brain-myoelectricity fusion perception of intelligent artificial limb
CN112842368A (en) * 2021-02-01 2021-05-28 上海龙旗科技股份有限公司 System and method for identifying surface electromyographic signals
CN112932509A (en) * 2021-01-28 2021-06-11 福建脉行人生医疗科技有限公司 Method and device for picking up and optimizing analysis of eye electrical signals
CN113208593A (en) * 2021-04-08 2021-08-06 杭州电子科技大学 Multi-modal physiological signal emotion classification method based on correlation dynamic fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant