CN111631688B - Algorithm for automatic sleep staging - Google Patents

Algorithm for automatic sleep staging

Info

Publication number
CN111631688B
Authority
CN
China
Prior art keywords
data
sleep
sequence
segment
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010591697.XA
Other languages
Chinese (zh)
Other versions
CN111631688A (en)
Inventor
刘铁军
王林
吕彬
范宇熊
宋晓宇
郜东瑞
尧德中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202010591697.XA
Publication of CN111631688A
Application granted
Publication of CN111631688B
Legal status: Active
Anticipated expiration

Classifications

    • A61B5/4809 Sleep detection, i.e. determining whether a subject is asleep or not
    • A61B5/4812 Detecting sleep stages or cycles
    • A61B5/4815 Sleep quality
    • A61B5/7203 Signal processing specially adapted for physiological signals or for diagnostic purposes for noise prevention, reduction or removal
    • A61B5/725 Details of waveform analysis using specific filters therefor, e.g. Kalman or adaptive filters
    • A61B5/7267 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N3/044 Recurrent networks, e.g. Hopfield networks

Abstract

The invention discloses an algorithm for automatic sleep staging, comprising the following steps: a. in the feature layer, abstract features are extracted with a multilayer perceptron and combined with traditional hand-crafted features based on expert experience to form the information representation of sleep; b. in the machine model layer, a bidirectional gated recurrent unit (Bi-GRU) is used as the network model; c. in the correction layer, a conditional random field (CRF) method is used to correct for temporal continuity. The method addresses the poor stability, generality and practicality of existing automatic sleep staging algorithms.

Description

Algorithm for automatic sleep staging
Technical Field
The invention relates to the field of sleep algorithms, in particular to an algorithm for automatically staging sleep.
Background
According to international sleep medicine standards, the periods of a person's sleep can be classified, using physiological information recorded during sleep, into five normal stages: WAKE, REM, N1, N2 and N3.
In recent years, methods that monitor a person's sleep state and classify it automatically with computer equipment and software have come into wide use. Although modern electronic information technology, machine learning theory and biomedical engineering have all developed rapidly, automatic sleep staging based on machine learning still has no internationally recognized standard in scientific research, medical applications or consumer electronics. The main reasons include the lack of a thorough and complete theory of the underlying mechanisms of human sleep, insufficient understanding of and trust in the international standards for manual sleep staging among clinicians and researchers, the low consistency of sleep state classifications, the lack of expert experience among developers of sleep monitoring systems, and the intermingling of abnormal sleep patterns with normal ones.
All automatic sleep staging algorithms involve feature engineering, yet developers without expert experience find it difficult to understand the physiological process of sleep in depth. Expert knowledge comes mainly from the sleep scoring manual of the American Academy of Sleep Medicine and from manual staging criteria worked out in practice.
Where automatic sleep staging algorithms use neural networks, the model often has difficulty fitting real data. The result is an underfitted model whose decisions are far less accurate than expected.
In general, existing automatic sleep staging systems are not accurate enough to distinguish between normal sleep states. Existing machine learning algorithms, whether based on feature engineering or on neural networks, reach accuracies of 60% to 90% only on small data sets. Existing algorithms also treat the physiological signal segments of a whole night's sleep as independent and ignore temporal continuity. The stability, generality and practicality of automatic staging technology as a whole therefore urgently need improvement.
Disclosure of Invention
The invention aims to provide an automatic sleep staging algorithm, which solves the problems of poor stability, universality and practicability of the conventional automatic sleep staging algorithm.
In order to solve the technical problems, the invention adopts the following technical scheme:
an algorithm for automatic sleep staging, comprising the following steps: a. in the feature layer, abstract features are extracted with a multilayer perceptron and combined with traditional hand-crafted features based on expert experience to form the information representation of sleep; b. in the machine model layer, a bidirectional gated recurrent unit (Bi-GRU) is used as the network model; c. in the correction layer, a conditional random field (CRF) method is used to correct for temporal continuity.
As a further preferred aspect of the present invention, step a, in which abstract features are extracted with a multilayer perceptron and combined with traditional hand-crafted features based on expert experience as the information representation of sleep, comprises the following steps:
S1, making a data set from physiological data collected during sleep, including electroencephalogram (EEG), electrooculogram (EOG) and mandibular electromyogram (EMG) signals: $C = \{C_1, C_2, \ldots, C_N\}$, where C represents the data set of all people and N represents the number of people (or the number of completed overnight sleep recordings);
S2, dividing each person's continuous overnight data into S segments in time order, each segment representing different sleep stage information; because each segment needs to be fitted by a different algorithm model as the night progresses, a separate data set is made for each segment, giving S segment data sets; each segment data set contains N × M samples, where M is the number of sample points in each segment of data; one sample point represents one sleep epoch, whose time span is usually 30 seconds, and each sample point contains L signal sampling points, so the sample size of data set C is $S \times N \times M \times L$;
S3, performing feature engineering on each of the S segment data sets: abstract features are extracted with an autoencoder, and traditional features are extracted as time-domain, frequency-domain and nonlinear dynamics features; all features are spliced into vectors, and the M vectors generated from each segment of data are arranged in order to form a sequence, namely a segment sequence.
Step b, in which the bidirectional gated recurrent unit (Bi-GRU) is used as the network model in the machine model layer, comprises the following steps:
S4, in the machine model layer, training the Bi-GRU network model using the segment sequences as samples; the S segment data sets each contain N × M segment sequences, each segment data set is divided into a training set and a test set in a certain proportion, and the model is saved after training on each segment data set is finished;
S5, inputting the sample points used for training into the trained Bi-GRU network model to obtain the classification labels of the sleep stages;
S6, splicing the label sequences of the S segments of one night's sleep into one long sequence, so that each overnight recording finally corresponds to one complete overnight label sequence; the overnight label sequence is a one-dimensional vector of dimension $T = S \times M$; the set of N overnight label sequences is divided into three sets following the earlier split: the training set contains $C_t$ sequences, the validation set contains $C_v$ sequences, and the test set contains C sequences.
Step c, in which a conditional random field (CRF) method is used for temporal continuity correction in the correction layer, comprises step S7: inputting the training set of label sequences obtained in step S6 into the correction layer and exploiting the strength of the CRF in extracting contextual transition information; specifically, the sequence is modeled with a linear-chain CRF, the optimal label sequence path is decoded with the Viterbi algorithm, and the overnight label sequence is corrected so that it stays consistent with the label sequence of the expert's manual sleep staging.
As a further preferred aspect of the present invention, step S1 further comprises the following sub-steps:
S11, during sleep, monitoring, recording and storing human physiological signal data with a polysomnography device with physiological signal acquisition, according to the American Academy of Sleep Medicine (AASM) standard;
S12, sampling and digitizing the raw data of each type of physiological signal, then applying zero-phase digital filtering to each signal so that the non-stationary physiological signals do not suffer phase distortion, and removing the very-low-frequency baseline drift, power-line noise and high-frequency noise from the signals, which completes signal preprocessing;
S13, extracting the electrooculogram (EOG), mandibular electromyogram (EMG) and electroencephalogram (EEG) signals from the physiological signals and building the original data set $C = \{C_1, C_2, \ldots, C_N\}$ from these three-lead signals; it holds N overnight sleep recordings from N individuals, with one subset per person.
As a further preferred aspect of the present invention, step S2 further comprises the following sub-steps:
S21, setting the segmentation scheme sensibly: a single night of sleep data is continuous in time; the algorithm divides this continuous process into S stages, and the physiological signals of each stage carry different information about the sleep process; in doing so the algorithm imitates the way an expert, when staging sleep manually, first forms a judgment of several trend stages across the whole night's signal data; the length of each stage then determines the number M of sample points per stage;
S22, preparing the specific data sets ahead of model building by dividing the original data set into S segment data sets; the data dimensions must be exact: each data set contains N × M sample points from N individuals, each sample point is the minimum unit of sleep staging with a time span of 30 seconds, and each sample point corresponds to one category label; the five category labels are WAKE, REM, N1, N2 and N3;
S23, aligning the labels with the data, checking the segment data sets, and saving them to files.
This data set production is a key step of the invention and strongly influences the results and performance of the whole algorithm.
As a further preferred aspect of the present invention, step S3 further comprises the following sub-steps:
S31, extracting traditional features: a feature vector $x_i$ of dimension k is computed from the signals. The m features of $x_i$ include time-domain, frequency-domain and nonlinear dynamics feature quantities. The time-domain feature quantities comprise statistical and geometric feature quantities; the frequency-domain feature quantities comprise power spectral density and time-frequency feature quantities; the nonlinear dynamics feature quantities comprise fractal dimension and complexity feature quantities. Each feature quantity is determined by its own parameters and calculation method;
S32, extracting abstract features: abstract feature extraction is performed by an autoencoder from the field of artificial neural networks. Through unsupervised learning, a neural network is fitted that can represent the input data efficiently. No additional manual work is required, and the input signal data are represented efficiently by a fixed low-dimensional vector, i.e. a self-encoding. The output dimension of the autoencoder is generally smaller than the input signal dimension, which is the dimensionality-reduction property of the autoencoder. The invention adopts an autoencoder with multiple encoding layers; the complexity of the encoding depends on the number of stacked layers, and a suitable increase in the number of layers lets the input data be compressed and represented effectively;
S33, performing feature engineering on each sample point and splicing the features into a feature vector $\xi_i$ ($\xi_i \in \mathbb{R}^k$). The sample points in each segment must be arranged in time order into a segment sequence, and the corresponding feature vectors are likewise arranged into a segment sequence seq. The whole feature space constructed from the segment sequences is thus a three-dimensional tensor $X \in \mathbb{R}^{N \times M \times k}$.
As a further preferred aspect of the present invention, step S4 further comprises the following sub-steps:
S41, dividing the feature-space tensor X generated from seq into the final training, validation and test data sets;
S42, inputting the time-ordered segment sequence seq into the bidirectional gated recurrent unit (Bi-GRU) network model, where the feature-space tensor $X \in \mathbb{R}^{N \times M \times k}$ generated from seq should meet the requirements of the Bi-GRU input layer; setting the structure, training mode and initial parameters of the network model appropriately, and then loading the data set to start training;
S43, saving the Bi-GRU network model once it reaches a preset termination condition.
As a further preferred aspect of the present invention, step S5 further comprises the following sub-steps:
S51, propagating all data forward through the network model to obtain the sleep stage label for each data sample, where the data samples are the training and validation sets generated from the data sets;
S52, comparing the classification labels produced by the machine model with the experts' manual classification labels and recording evaluation indexes such as accuracy, recall and F1 score for each data set, which completes the construction of the network model.
As a further preferred aspect of the present invention, step S7 further comprises the following sub-steps:
S71, constructing a conditional random field (CRF) model: the training set of label sequences from step S6 and the corresponding expert manual staging label sequences are input into the CRF model, the number K of feature functions is set, and the optimal parameters are trained iteratively to obtain the conditional probability $P(y \mid x)$ of the conditional random field. In this way the model learns the contextual information of time-dependent transitions between sleep stages. Stage transitions are closely related to sleep time, and using the transition probabilities of the CRF model to capture this characteristic of sleep is also a key step of the invention;
S72, testing the correction result of the CRF model: the conditional probability $P(y \mid x)$ and the label sequences $x_s$ in the validation set are used to compute the optimal label sequence $y^*$. Finally, evaluation indexes such as accuracy are calculated, and the results of the CRF correction model and the network model are tallied and compared.
Compared with the prior art, the invention can at least achieve one of the following beneficial effects:
1. Selecting suitable data segments for segmentation makes the algorithm more adaptable to each sleep stage;
2. Fusing traditional features with abstract features strengthens the expressive power of the features and raises the accuracy of the algorithm;
3. The time-aware network and the probabilistic correction effectively strengthen the algorithm's handling of temporal continuity.
Drawings
FIG. 1 is an overall block diagram of the algorithm of the present invention.
FIG. 2 is a schematic view of the data set generation process of the present invention.
FIG. 3 is a schematic diagram of data set generation according to the present invention.
FIG. 4 is a sample structure of the data according to the present invention.
FIG. 5 is a block diagram of the stacked autoencoder of the present invention.
FIG. 6 is a schematic diagram of the first layer of the stacked autoencoder of the present invention.
FIG. 7 is a diagram of the second layer of the stacked autoencoder of the present invention.
FIG. 8 is a diagram of the output layer of the stacked autoencoder of the present invention.
FIG. 9 is an overall schematic diagram of the stacked autoencoder of the present invention.
FIG. 10 is a diagram of the Bi-GRU network model of the present invention.
FIG. 11 shows the internal structure of a GRU node of the present invention.
FIG. 12 is a schematic view of the overall machine model layer incorporating CRF correction in accordance with the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Specific example 1:
FIGS. 1 to 12 show an algorithm for automatic sleep staging. As shown in FIG. 1, the overall idea of the invention is to build the automatic sleep staging model step by step through three processes: data set production, feature engineering, and model layering.
As shown in FIG. 2, the data set production flow is divided into three parts: data acquisition, data preprocessing, and overnight data segmentation. First, a data set is created from physiological data collected during human sleep, including conventional physiological signals such as the electroencephalogram (EEG), electrooculogram (EOG) and mandibular electromyogram (EMG): $C = \{C_1, C_2, \ldots, C_N\}$, where C represents the data set of all people and N represents the number of people (or the number of completed overnight sleep recordings).
During sleep, a polysomnography device with physiological signal acquisition monitors, records and stores the human physiological signal data according to the American Academy of Sleep Medicine (AASM) standard.
The raw data of each type of physiological signal are sampled and digitized, and zero-phase digital filtering is then applied to each signal so that the non-stationary physiological signals do not suffer phase distortion. The very-low-frequency baseline drift, power-line noise and high-frequency noise are removed from the signals, which completes signal preprocessing.
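The zero-phase idea in the preprocessing step can be illustrated with forward-backward filtering: the signal is filtered once, reversed, filtered again, and reversed back, so the phase delays of the two passes cancel. The sketch below is only an illustration under stated assumptions: the patent does not name a filter design, so the first-order low-pass, the smoothing factor `alpha` and the function names are all hypothetical. A practical pipeline would more likely use a designed band-pass and notch filter, for example through a routine such as `scipy.signal.filtfilt`.

```python
def one_pole_lowpass(x, alpha):
    """Causal first-order IIR low-pass: y[n] = alpha*x[n] + (1-alpha)*y[n-1]."""
    y = []
    prev = x[0]  # start from the first sample to limit the start-up transient
    for sample in x:
        prev = alpha * sample + (1.0 - alpha) * prev
        y.append(prev)
    return y


def zero_phase_lowpass(x, alpha):
    """Forward pass, then a pass over the reversed output, then reverse again."""
    forward = one_pole_lowpass(x, alpha)
    backward = one_pole_lowpass(forward[::-1], alpha)
    return backward[::-1]


def peak_index(x):
    """Index of the largest sample."""
    return max(range(len(x)), key=lambda i: x[i])


# Hypothetical test signal: a symmetric triangular pulse peaking at sample 50.
pulse = [max(0.0, 1.0 - abs(n - 50) / 10.0) for n in range(101)]
causal = one_pole_lowpass(pulse, alpha=0.2)        # single pass: peak is delayed
zero_phase = zero_phase_lowpass(pulse, alpha=0.2)  # two passes: peak stays in place

print("peak of input      :", peak_index(pulse))
print("peak of causal     :", peak_index(causal))
print("peak of zero-phase :", peak_index(zero_phase))
```

The single causal pass shifts the peak to a later sample, while the forward-backward result keeps it at the original position, which is the property that matters when epoch boundaries must stay aligned with expert labels.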
The EOG, mandibular EMG and EEG signals are extracted from the physiological signals, and the original data set $C = \{C_1, C_2, \ldots, C_N\}$ is built from these three-lead signals. It holds N overnight sleep recordings from N individuals, with one subset per person.
The data structure is visualized in FIGS. 3 and 4. Each person's continuous overnight data are divided into S segments in time order. Each segment represents different sleep stage information, and as the night progresses each segment needs to be fitted by a different algorithm model. A separate data set is therefore made for each segment, giving S segment data sets, each containing N × M samples, where M is the number of sample points in each segment of data. One sample point represents one sleep epoch, whose time span is typically 30 seconds, and each sample point contains L signal sampling points. The sample size of data set C is therefore $S \times N \times M \times L$.
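The S × N × M × L layout can be made concrete with a small sketch. It assumes each overnight recording is a flat list of samples from one channel and that trailing samples which do not fill the grid are dropped; both are assumptions of this illustration, and the function names are hypothetical.

```python
def segment_overnight(recording, S, M, L):
    """Split one overnight recording into S segments of M epochs of L samples.

    Samples beyond S*M*L are dropped (an assumption of this sketch).
    """
    needed = S * M * L
    if len(recording) < needed:
        raise ValueError("recording too short for the requested S, M, L")
    segments = []
    for s in range(S):
        epochs = []
        for m in range(M):
            start = (s * M + m) * L
            epochs.append(recording[start:start + L])
        segments.append(epochs)
    return segments


def build_segment_datasets(recordings, S, M, L):
    """Return S segment data sets; data set s holds, for each of the N people,
    the M epochs of segment s, i.e. N x M sample points in total."""
    per_person = [segment_overnight(r, S, M, L) for r in recordings]
    return [[person[s] for person in per_person] for s in range(S)]


# Toy sizes: N=2 people, S=3 segments, M=4 epochs per segment, L=5 samples per epoch.
N, S, M, L = 2, 3, 4, 5
recordings = [list(range(S * M * L)) for _ in range(N)]
datasets = build_segment_datasets(recordings, S, M, L)
total_samples = sum(len(epoch) for ds in datasets for person in ds for epoch in person)
print(len(datasets), len(datasets[0]), len(datasets[0][0]), len(datasets[0][0][0]))
print(total_samples)  # equals S * N * M * L
```

With real data, L would be 30 seconds times the sampling rate, and the values of S and M would follow from the chosen segmentation scheme.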
The segmentation scheme must be set sensibly. A single night of sleep data is continuous in time; the algorithm divides this continuous process into S stages, and the physiological signals of each stage carry different information about the sleep process. In doing so the algorithm imitates the way an expert, when staging sleep manually, first forms a judgment of several trend stages across the whole night's signal data. The length of each stage then determines the number M of sample points per stage.
The specific data sets are prepared ahead of model building by dividing the original data set into S segment data sets. The data dimensions must be exact: each data set contains N × M sample points from N people, each sample point is the minimum unit of sleep staging with a time span of 30 seconds, and each sample point corresponds to one category label. There are five category labels: WAKE, REM, N1, N2 and N3.
The labels are aligned with the data, and the segment data sets are checked and saved to files. This data set production is a key step of the invention and strongly influences the results and performance of the whole algorithm.
Feature engineering is performed on each of the S segment data sets. Abstract features are extracted with an autoencoder; traditional features are time-domain, frequency-domain and nonlinear dynamics features. All features are spliced into vectors; each segment of data yields M vectors, which are arranged in order to form a sequence, namely a segment sequence.
Traditional features are extracted as follows. A feature vector $x_i$ of dimension k is computed from the signals. The m features of $x_i$ include time-domain, frequency-domain and nonlinear dynamics feature quantities. The time-domain feature quantities comprise statistical and geometric feature quantities. The frequency-domain feature quantities comprise power spectral density and time-frequency feature quantities. The nonlinear dynamics feature quantities comprise fractal dimension and complexity feature quantities. Each feature quantity is determined by its own parameters and calculation method.
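As a rough illustration of the three feature families, the sketch below computes a few representative quantities for one epoch. The patent does not list the exact features, so the specific choices here (mean, variance, zero crossings, delta and alpha band power, Petrosian fractal dimension), the band edges and all function names are assumptions made for illustration. The naive DFT is only suitable for short demo signals.

```python
import math


def time_domain_features(epoch):
    """Statistical features: mean, variance, number of zero crossings."""
    n = len(epoch)
    mean = sum(epoch) / n
    variance = sum((v - mean) ** 2 for v in epoch) / n
    zero_crossings = sum(1 for a, b in zip(epoch, epoch[1:]) if a * b < 0)
    return [mean, variance, float(zero_crossings)]


def band_power(epoch, fs, f_lo, f_hi):
    """Sum of squared DFT magnitudes (scaled by 1/n^2) for bins in [f_lo, f_hi) Hz."""
    n = len(epoch)
    power = 0.0
    for k in range(1, n // 2):
        freq = k * fs / n
        if f_lo <= freq < f_hi:
            re = sum(v * math.cos(2 * math.pi * k * t / n) for t, v in enumerate(epoch))
            im = sum(v * math.sin(2 * math.pi * k * t / n) for t, v in enumerate(epoch))
            power += (re * re + im * im) / (n * n)
    return power


def petrosian_fd(epoch):
    """Petrosian fractal dimension, based on sign changes of the first difference."""
    diffs = [b - a for a, b in zip(epoch, epoch[1:])]
    sign_changes = sum(1 for a, b in zip(diffs, diffs[1:]) if a * b < 0)
    n = len(epoch)
    return math.log10(n) / (math.log10(n) + math.log10(n / (n + 0.4 * sign_changes)))


def traditional_features(epoch, fs):
    """Splice time-domain, frequency-domain and nonlinear features into one vector."""
    delta = band_power(epoch, fs, 0.5, 4.0)   # assumed delta band
    alpha = band_power(epoch, fs, 8.0, 13.0)  # assumed alpha band
    return time_domain_features(epoch) + [delta, alpha, petrosian_fd(epoch)]


# Hypothetical 1-second demo epoch: a 10 Hz sine sampled at 100 Hz.
fs = 100
epoch = [math.sin(2 * math.pi * 10 * t / fs + 0.3) for t in range(fs)]
x_i = traditional_features(epoch, fs)
print(len(x_i), x_i)
```

For the 10 Hz demo signal almost all of the band power lands in the alpha band, which is the kind of contrast such features are meant to expose between sleep stages.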
Abstract features are extracted as follows. Abstract feature extraction is performed by an autoencoder from the field of artificial neural networks. Through unsupervised learning, a neural network is fitted that can represent the input data efficiently. No additional manual work is required, and the input signal data are represented efficiently by a fixed low-dimensional vector, i.e. a self-encoding. The output dimension is generally smaller than the input signal dimension, which is the dimensionality-reduction property of the autoencoder. The invention adopts a stacked autoencoder (SA) with multiple encoding layers; the complexity of the encoding depends on the number of stacked layers, and a suitable increase in the number of layers lets the input data be compressed and represented effectively.
The network structure of the stacked autoencoder used to extract the abstract features is shown in FIGS. 5, 6, 7, 8 and 9.
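The encode and decode principle of one autoencoder layer can be written compactly. This is the generic textbook formulation rather than anything stated in the patent; the symbols $W$, $b$, $W'$, $b'$, the activations $f$ and $g$, and the code size $d$ are assumptions introduced for illustration.

```latex
h = f(W x + b), \qquad \hat{x} = g(W' h + b'), \qquad h \in \mathbb{R}^{d},\; d < \dim(x)

\min_{W,\, b,\, W',\, b'} \; \sum_{i} \bigl\lVert x^{(i)} - \hat{x}^{(i)} \bigr\rVert_2^2
```

In a stacked arrangement, the code $h$ of one layer serves as the input of the next, and the code of the last encoding layer would be used as the abstract feature vector.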
Feature engineering is performed on each sample point, and the features are spliced into a feature vector $\xi_i$ ($\xi_i \in \mathbb{R}^k$). The sample points in each segment must be arranged in time order into a segment sequence, and the corresponding feature vectors are likewise arranged into a segment sequence seq. The whole feature space constructed from the segment sequences is thus a three-dimensional tensor $X \in \mathbb{R}^{N \times M \times k}$.
Through data set production and feature engineering, the data set structure realizes the design idea of the invention and fully reflects both feature fusion and the time dependence of the data.
As shown in FIG. 10, in the machine model layer the bidirectional gated recurrent unit (Bi-GRU) network model is trained using the above segment sequences as samples; the S segment data sets each contain N × M segment sequences.
The feature-space tensor X generated from seq is divided into the final training, validation and test data sets.
After cross-validation, training on each segment data set is finished and the model is saved.
The time-ordered segment sequence seq is input into the Bi-GRU network model. The feature-space tensor $X \in \mathbb{R}^{N \times M \times k}$ generated from seq should meet the requirements of the Bi-GRU input layer. The structure, training mode and initial parameters of the network model are set appropriately, and the data set is then loaded to start training.
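For reference, a GRU cell and its bidirectional combination are commonly written as follows. The patent text does not give these equations, so this is the widely used formulation, stated here as an assumption; the weight names, the softmax output layer and the sign convention of the update gate $z_t$ (which differs between references) are illustrative choices.

```latex
\begin{aligned}
z_t &= \sigma(W_z \xi_t + U_z h_{t-1} + b_z) \\
r_t &= \sigma(W_r \xi_t + U_r h_{t-1} + b_r) \\
\tilde{h}_t &= \tanh\bigl(W_h \xi_t + U_h (r_t \odot h_{t-1}) + b_h\bigr) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t \\
h_t^{\mathrm{bi}} &= \bigl[\overrightarrow{h}_t \,;\, \overleftarrow{h}_t\bigr], \qquad
\hat{y}_t = \operatorname{softmax}\bigl(W_o h_t^{\mathrm{bi}} + b_o\bigr)
\end{aligned}
```

Here $\xi_t$ is the feature vector of the t-th sample point in a segment sequence, the forward state $\overrightarrow{h}_t$ summarizes earlier epochs, the backward state $\overleftarrow{h}_t$ summarizes later ones, and $\hat{y}_t$ is a distribution over the five stage labels.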
The bidirectional gated recurrent unit (Bi-GRU) network model is saved once it reaches a preset termination condition.
All data are propagated forward through the network model to obtain the sleep stage label for each data sample, where the data samples are the training and validation sets generated from the data sets.
The classification labels produced by the machine model are compared with the experts' manual classification labels, and evaluation indexes such as accuracy, recall and F1 score are recorded for each data set, which completes the construction of the network model.
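The comparison against expert labels can be sketched with plain per-class counts. The label sequences below are made up for illustration, and the function name is hypothetical; only the standard definitions of accuracy, precision, recall and F1 are used.

```python
def staging_metrics(y_true, y_pred, labels):
    """Overall accuracy plus per-class precision, recall and F1."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    report = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return accuracy, report


STAGES = ["WAKE", "REM", "N1", "N2", "N3"]
# Made-up example: expert labels versus model output for six epochs.
expert = ["WAKE", "N1", "N2", "N2", "N3", "REM"]
model = ["WAKE", "N2", "N2", "N2", "N3", "REM"]
accuracy, report = staging_metrics(expert, model, STAGES)
print(round(accuracy, 3))
for stage in STAGES:
    print(stage, report[stage])
```

Per-class figures matter here because stages such as N1 are typically rare, so overall accuracy alone can hide poor performance on them.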
The sample points used for training are input into the trained bidirectional gated recurrent unit (Bi-GRU) network model to obtain the classification labels of the sleep stages.
The label sequences of the S segments of one night's sleep are spliced into one long sequence, so that each overnight recording finally corresponds to one complete overnight label sequence. The overnight label sequence is a one-dimensional vector of dimension $T = S \times M$. The set of N overnight label sequences is divided into three sets following the earlier split: the training set contains $C_t$ sequences, the validation set contains $C_v$ sequences, and the test set contains C sequences.
The training set of label sequences is input into the correction layer, which exploits the strength of the conditional random field (CRF) in extracting contextual transition information: the sequence is modeled with a linear-chain CRF, and the optimal label sequence path is decoded with the Viterbi algorithm. The overnight label sequence is thereby corrected so that it stays consistent with the label sequence of the expert's manual sleep staging.
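The decoding step can be sketched as a standard Viterbi search over additive scores. In a trained CRF the emission and transition scores would come from the learned feature weights; the hand-set values below (a bonus of 2 for agreeing with the network's label, a penalty of 3 for changing stage) are purely hypothetical and chosen only to show how an isolated outlier is smoothed away while a sustained stage change is kept.

```python
STAGES = ["WAKE", "REM", "N1", "N2", "N3"]


def viterbi(emission, transition):
    """Best-scoring state path.

    emission[t][j]   : score of state j at position t
    transition[i][j] : score of moving from state i to state j
    """
    n_states = len(transition)
    best = [list(emission[0])]  # best[t][j]: best score of any path ending in j at t
    back = []                   # back[t-1][j]: predecessor of j on that best path
    for t in range(1, len(emission)):
        row, ptr = [], []
        for j in range(n_states):
            i_best = max(range(n_states), key=lambda i: best[-1][i] + transition[i][j])
            row.append(best[-1][i_best] + transition[i_best][j] + emission[t][j])
            ptr.append(i_best)
        best.append(row)
        back.append(ptr)
    last = max(range(n_states), key=lambda j: best[-1][j])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


def correct_labels(network_labels, agree_bonus=2.0, switch_penalty=3.0):
    """Smooth a label sequence with hypothetical, hand-set scores."""
    k = len(STAGES)
    emission = [[agree_bonus if STAGES[j] == lab else 0.0 for j in range(k)]
                for lab in network_labels]
    transition = [[0.0 if i == j else -switch_penalty for j in range(k)]
                  for i in range(k)]
    return [STAGES[j] for j in viterbi(emission, transition)]


outlier = correct_labels(["N2", "N2", "WAKE", "N2", "N2"])
sustained = correct_labels(["N2", "N2", "N3", "N3", "N3"])
print(outlier)
print(sustained)
```

The single WAKE epoch inside a run of N2 is replaced because two stage changes cost more than one disagreement, whereas the lasting move from N2 to N3 survives because it needs only one change.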
A conditional random field (CRF) model is constructed. The training set of label sequences from step S6 and the label sequences corresponding to the experts' manual staging are input into the CRF model, the number K of feature functions is set, and the optimal parameters are trained iteratively to obtain the conditional probability P(y | x) of the conditional random field. In this way the model learns the context information of time-dependent sleep-stage transitions, since stage transitions and sleep time are closely related. This is also the key step in which the invention applies the probabilistic transitions of the CRF model to sleep characteristics.
The correction result of the CRF model is tested. The conditional probability P(y | x) and the label sequence x_s in the validation set are used to compute the optimal label sequence y*. Finally, evaluation indexes such as accuracy are calculated, and the results of the CRF correction model and the network model are tallied and compared.
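For reference, the standard form of a linear-chain CRF with K feature functions is given below. The text does not spell out this formula, so the weights λ_k, the feature functions f_k, and the partition function Z(x) are the usual textbook symbols, introduced here as assumptions; x is the network's overnight label sequence, y a candidate corrected sequence, and T = S × M its length.

```latex
P(y \mid x) = \frac{1}{Z(x)} \exp\!\left( \sum_{t=1}^{T} \sum_{k=1}^{K} \lambda_k \, f_k(y_{t-1}, y_t, x, t) \right),
\qquad
Z(x) = \sum_{y'} \exp\!\left( \sum_{t=1}^{T} \sum_{k=1}^{K} \lambda_k \, f_k(y'_{t-1}, y'_t, x, t) \right)

y^{*} = \arg\max_{y} P(y \mid x)
```

Training fits the weights λ_k, and the maximization over y is the problem the Viterbi algorithm solves.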
Although the invention has been described herein with reference to a number of illustrative embodiments thereof, it should be understood that numerous other modifications and embodiments can be devised by those skilled in the art that will fall within the spirit and scope of the principles of this disclosure. More specifically, various variations and modifications are possible in the component parts and/or arrangements of the subject combination arrangement within the scope of the disclosure, the drawings and the appended claims. In addition to variations and modifications in the component parts and/or arrangements, other uses will also be apparent to those skilled in the art.

Claims (7)

1. An algorithm for automatic sleep staging, characterized by comprising the following steps: a. in the feature layer, extracting abstract features with a multilayer perceptron and combining them with traditional hand-crafted features based on expert experience, as the information representation of sleep; b. then, in the machine model layer, using a bidirectional gated recurrent unit (Bi-GRU) as the network model; c. finally, in the correction layer, using the conditional random field (CRF) method for temporal-continuity correction;
step a, extracting abstract features with a multilayer perceptron and combining them with traditional hand-crafted features based on expert experience as the information representation of sleep, comprises the following steps:
S1, making a data set C = {C_1, C_2, ..., C_N} from physiological data acquired during human sleep, the data comprising electroencephalogram, electro-oculogram, and mandibular electromyogram signals, where C denotes the data set of all persons and N the number of persons;
S2, dividing each person's continuous data into S segments in time order, each segment representing different sleep-stage information; because each segment must be fitted by a different algorithm model as time passes, a separate data set is made for each segment, yielding S segment data sets; each segment data set contains N × M samples, where M is the number of sample points in each segment of data; one sample point represents one sleep epoch, whose time span is usually 30 seconds, and contains L signal sampling points, so the sample size of the data set C is S × N × M × L;
S3, performing feature engineering on each of the S segment data sets, which comprises extracting abstract features with an auto-encoder and extracting traditional features, namely time-domain, frequency-domain, and nonlinear-dynamics features; the features are spliced into vectors, and the M vectors generated for each segment of data are arranged in order to form a sequence, namely the segment sequence;
step b, using the Bi-GRU as the network model in the machine model layer, comprises the following steps:
S4, in the machine model layer, training the Bi-GRU network model with the segment sequences as samples, the S segment data sets each comprising N × M segment sequences; each segment data set is divided into a training set and a test set in a certain proportion, and the model is saved after training on each segment data set is finished;
S5, inputting the sample points for training into the trained Bi-GRU network model, finally obtaining the sleep-stage classification labels;
S6, splicing the label sequences of the S segments of one night's sleep into one long sequence, so that one night of data finally corresponds to one complete overnight label sequence, the overnight label sequence being a one-dimensional vector of dimension T = S × M; the set of N overnight label sequences is again divided into three sets following the earlier split, the training set containing C_t sequences, the validation set C_v sequences, and the test set C_c sequences;
step c, using the CRF method for temporal-continuity correction in the correction layer, comprises step S7: the training set of label sequences obtained in step S6 is input into the correction layer; to exploit the strength of the CRF method in extracting context-transfer information, the sequence is modeled with a linear-chain CRF, the optimal label sequence path is decoded with the Viterbi algorithm, and the overnight label sequence is corrected so that it stays consistent with the label sequence from the experts' manual sleep staging.
2. The algorithm for automatic sleep staging according to claim 1, characterized in that step S1 further comprises the following sub-steps: S11, during human sleep, monitoring, recording, and storing human physiological data according to the American Academy of Sleep Medicine (AASM) standard, using a polysomnography device with a physiological signal acquisition function; S12, after the raw data of the different types of physiological signals have been sampled and digitized, applying zero-phase digital filtering to each so that the non-stationary physiological signals suffer no phase distortion, and removing the very-low-frequency baseline, power-line noise, and high-frequency noise from the signals, which completes signal preprocessing; S13, extracting the electro-oculogram, mandibular electromyogram, and electroencephalogram signals from the physiological signals, making the original data set C = {C_1, C_2, ..., C_N} from these three-lead signals, and dividing the N overnight sleep recordings from the N individuals into subsets by individual.
3. The algorithm for automatic sleep staging according to claim 1, characterized in that step S2 further comprises the following sub-steps: S21, reasonably setting the segmentation scheme: a single night of sleep data is continuous in time, and the algorithm divides this continuous process into S stages, the physiological signals of each stage representing different information about the sleep process; in doing so the algorithm imitates the preliminary judgment that experts apply to overnight signal data during manual sleep staging, and the number M of sample points in each stage is set according to the duration of that stage; S22, preparing the specific data sets required by the algorithm model: S data sets are divided out of the original data set, and their dimensions must be correct; each data set comprises N × M sample points from the N individuals, each sample point is the minimum unit of sleep staging with a time span of 30 seconds, and each sample point corresponds to one of five category labels, namely WAKE, REM, N1, N2, and N3; S23, aligning the labels with the data, checking the segment data sets, and creating and saving them to file.
4. The algorithm for automatic sleep staging according to claim 1, characterized in that step S3 further comprises the following sub-steps: S31, extracting traditional features: the dimension k of the feature vector x_i is calculated from the signals, and the m features in x_i comprise time-domain, frequency-domain, and nonlinear-dynamics feature quantities, where the time-domain feature quantities comprise statistical and geometric feature quantities, the frequency-domain feature quantities comprise power-spectral-density and time-frequency feature quantities, and the nonlinear-dynamics feature quantities comprise fractal-dimension and complexity feature quantities, each feature quantity being determined by its own parameters and calculation method; S32, extracting abstract features: an auto-encoder from the field of artificial neural networks is used; exploiting unsupervised learning, an artificial neural network is fitted that represents the input data efficiently without additional manual work, so that the input signal data are represented by a fixed low-dimensional vector, namely the self-encoding; the output dimension of the auto-encoder is generally smaller than the input signal dimension, which is its dimensionality-reduction property; the auto-encoder adopted has several encoding layers, its encoding complexity depends on the number of stacked neural-network layers, and suitably increasing the number of stacked layers allows the input data to be compressed and represented effectively; S33, performing feature engineering on the sample points and splicing the results into feature vectors x_i (x_i ∈ R^k); the sample points in each segment must be arranged into a segment sequence in time order, the corresponding feature vectors are likewise arranged into a segment sequence seq, and the whole feature space constructed from the segment sequences is in fact a three-dimensional tensor X ∈ R^{N×M×k}.
5. The algorithm for automatic sleep staging according to claim 1, characterized in that step S4 further comprises the following sub-steps: S41, dividing the three-dimensional feature-space tensor X generated from seq into a final training data set, validation data set, and test data set; S42, inputting the time-ordered sequence seq into the Bi-GRU network model, where the feature-space tensor generated from seq, X ∈ R^{N×M×k}, should meet the requirements of the Bi-GRU input layer; the structure, training mode, and initial parameters of the network model are set reasonably, and the data set is then loaded to start training the model; S43, saving the Bi-GRU network model after it reaches a preset termination condition.
6. The algorithm for automatic sleep staging according to claim 1, characterized in that step S5 further comprises the following sub-steps: S51, propagating all data forward through the network model to obtain the sleep-stage classification label corresponding to each data sample, the data samples being the training set and validation set generated from the data sets; S52, comparing the classification labels of the machine model with the experts' manual classification labels and recording the evaluation indexes of each segment data set, including accuracy, recall, and F1 score, which completes the construction of the network model.
7. The algorithm for automatic sleep staging according to claim 1, characterized in that step S7 further comprises the following sub-steps: S71, constructing a conditional random field (CRF) model: the training set of label sequences from step S6 and the label sequences corresponding to the experts' manual staging are input into the CRF model, the number K of feature functions is set, and the optimal parameters are trained iteratively to obtain the conditional probability P(y | x) of the conditional random field; S72, testing the correction result of the CRF model: the optimal label sequence y* is calculated from the conditional probability P(y | x) and the label sequence x_s in the validation set, evaluation indexes including accuracy are then calculated, and the results of the CRF correction model and the network model are tallied and compared.
CN202010591697.XA 2020-06-24 2020-06-24 Algorithm for automatic sleep staging Active CN111631688B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010591697.XA CN111631688B (en) 2020-06-24 2020-06-24 Algorithm for automatic sleep staging

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010591697.XA CN111631688B (en) 2020-06-24 2020-06-24 Algorithm for automatic sleep staging

Publications (2)

Publication Number Publication Date
CN111631688A CN111631688A (en) 2020-09-08
CN111631688B true CN111631688B (en) 2021-10-29

Family

ID=72323139

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010591697.XA Active CN111631688B (en) 2020-06-24 2020-06-24 Algorithm for automatic sleep staging

Country Status (1)

Country Link
CN (1) CN111631688B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112263218A (en) * 2020-10-12 2021-01-26 上海大学 Sleep staging method and device
CN112294342A (en) * 2020-10-30 2021-02-02 哈尔滨理工大学 Sleep staging method based on deep residual Mask-CCNN
CN113080864B (en) * 2021-04-07 2022-02-01 电子科技大学 Common sleep disease detection method through automatic sleep staging results
CN113576410B (en) * 2021-07-20 2022-09-02 电子科技大学 Dynamic continuous analysis method for sleep process
CN113397562A (en) * 2021-07-20 2021-09-17 电子科技大学 Sleep spindle wave detection method based on deep learning
CN113456030A (en) * 2021-08-05 2021-10-01 成都云卫康医疗科技有限公司 Sleep staging method based on heart rate monitoring data
CN114707561B (en) * 2022-05-25 2022-09-30 清华大学深圳国际研究生院 PSG data automatic analysis method, device, computer equipment and storage medium
CN116898455B (en) * 2023-07-06 2024-04-16 湖北大学 Sleep electroencephalogram signal detection method and system based on deep learning model

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107106028A (en) * 2014-12-18 2017-08-29 皇家飞利浦有限公司 The system and method classified for cardiopulmonary sleep stage
CN107495962A (en) * 2017-09-18 2017-12-22 北京大学 A kind of automatic method by stages of sleep of single lead brain electricity
CN108062756A (en) * 2018-01-29 2018-05-22 重庆理工大学 Image, semantic dividing method based on the full convolutional network of depth and condition random field
CN109602410A (en) * 2018-11-16 2019-04-12 青岛真时科技有限公司 A kind of wearable device and its monitoring of pulse method
CN109864750A (en) * 2019-01-31 2019-06-11 华南理工大学 Based on the state of mind assessment and regulating system and its working method stimulated through cranium
CN110801221A (en) * 2019-12-09 2020-02-18 中山大学 Sleep apnea fragment detection method and device based on unsupervised feature learning
CN110890155A (en) * 2019-11-25 2020-03-17 中国科学技术大学 Multi-class arrhythmia detection method based on lead attention mechanism
CN111091116A (en) * 2019-12-31 2020-05-01 华南师范大学 Signal processing method and system for judging arrhythmia

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7509163B1 (en) * 2007-09-28 2009-03-24 International Business Machines Corporation Method and system for subject-adaptive real-time sleep stage classification
US11810670B2 (en) * 2018-11-13 2023-11-07 CurieAI, Inc. Intelligent health monitoring

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
基于BiGRU深度神经网络的心肌梗死检测;张行进 等;《计算机应用与软件》;20200229;第37卷(第2期);第48-52页 *
睡眠分期算法研究;何垣谛;《中国优秀硕士学位论文全文数据集(电子期刊) 医药卫生科技辑》;20200131;第E060-1018页 *

Also Published As

Publication number Publication date
CN111631688A (en) 2020-09-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant