CN110796110A - Human behavior identification method and system based on graph convolution network - Google Patents
Human behavior identification method and system based on graph convolution network
- Publication number
- CN110796110A CN201911070446.0A
- Authority
- CN
- China
- Prior art keywords
- graph
- node
- time
- topological
- space
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/23—Recognition of whole body movements, e.g. for sport training
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/217—Validation; Performance evaluation; Active pattern learning techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Biophysics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Evolutionary Biology (AREA)
- Multimedia (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- Human Computer Interaction (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a human behavior identification method and system based on a graph convolution network. The identification method comprises the following steps: extracting human skeleton information from images containing human behavior, acquiring a sequence of human joint positions, and constructing a topological graph sequence of arbitrary length for the human skeleton; performing feature extraction and adaptive evolution of the topological structure on the topological graph sequence through a spatio-temporal graph convolution network based on topology learnable graph convolution, to obtain new node features that fuse local spatio-temporal features and a topological graph sequence with a new topological structure; extracting features through a graph convolution long short-term memory neural network; obtaining global spatio-temporal features with a global pooling operation; and recognizing human behavior from the global spatio-temporal features with a classifier. The method learns the features of the whole graph directly, extends the weight matrix in graph convolution to the structure of the whole topological graph, learns the relationship between any two nodes in the graph without being limited by the topological structure, and achieves high identification accuracy.
Description
Technical Field
The invention belongs to the technical field of artificial intelligence, and relates to a human behavior identification method and a human behavior identification system based on a graph convolution network, which can be used for action identification of a topological graph sequence.
Background
Convolutional neural networks have achieved tremendous success in many areas, but they rely on data with a grid structure. Data in many fields is not grid-structured; data in irregular domains usually takes the form of a topological graph, which makes convolutional neural networks difficult to generalize to the graph domain. To preserve the properties of convolution on graphs, a transition matrix is usually defined on each node and a weight matrix is defined according to the degree of the node, so that graph convolution can learn on different topological subgraphs, and a corresponding spatial partitioning rule and its determination rule are designed according to the number of neighborhood subsets of the nodes of the spatio-temporal graph. Existing adaptive graph convolution can only learn the adaptive topological relationship between adjacent nodes, and its ability to learn relationships between more distant nodes is insufficient. Furthermore, due to the limitations of the transition matrix in graph convolution, effective modeling of long-term temporal relationships within topological graph sequences is often lacking.
The topology of the topological graph data is usually fixed at all layers of the network, but the natural topology is not necessarily optimal, so the graph convolution network with the ability to learn arbitrary topologies has great significance to the convolutional neural network in the field of topological structure data.
Disclosure of Invention
In order to solve the problems, the invention provides a human behavior identification method based on a graph convolution network, which directly learns the characteristics of the whole graph, expands a weight matrix in the graph convolution to the structure of the whole topological graph, learns the relationship between any two nodes in the graph without being limited by the topological structure, and has high identification accuracy; meanwhile, a recurrent neural network is introduced to model the long-term time relation of the topological graph sequence, so that the problems in the prior art are solved.
Another object of the invention is to provide a human behavior identification system based on a graph convolution network.
The invention adopts the technical scheme that a human behavior identification method based on a graph convolution network comprises the following steps:
S1, extracting human skeleton information from images containing human behavior, obtaining a sequence of human joint positions, and constructing a topological graph sequence of arbitrary length for the human skeleton by taking each joint as a node and the bones between joints as edges;
S2, performing feature extraction and adaptive evolution of the topological structure on the topological graph sequence through a spatio-temporal graph convolution network based on topology learnable graph convolution, to obtain new node features that fuse local spatio-temporal features and a topological graph sequence with a new topological structure;
S3, extracting features of the new topological graph sequence through a graph convolution long short-term memory neural network to obtain a topological graph sequence with long-term spatio-temporal features;
S4, further fusing the features of the topological graph sequence with a global pooling operation to obtain global spatio-temporal features;
S5, recognizing human behavior with a classifier based on the global spatio-temporal features.
Further, in step S1, the topological graph sequence of the human skeleton consists of a plurality of topological graph structures, each represented by formula (1-1):
$G = (V, E) = (f_v, w_E)$ (1-1)
where G is the topological graph structure of the human skeleton; the node set $V = \{v_{ti} \mid t = 1, \dots, T,\ i = 1, \dots, N\}$ represents the human joints, T is the number of frames in the sequence, N is the number of joints, and V contains all nodes of the skeleton sequence at every time step; the edge set E consists of a spatial-domain edge set and a time-domain edge set: the spatial-domain edge set $E_S = \{v_{ti} v_{tj} \mid (i, j) \in H\}$ contains the edges between node i and node j in frame t, where H is the set of natural connections between human joints; the time-domain edge set $E_T = \{v_{ti} v_{(t+1)i}\}$ connects the same node in consecutive frames; $f_v$ denotes the feature vectors of the nodes, and $w_E$ denotes the connection weights of the edges.
Further, step S2 specifically comprises:
S21, the spatio-temporal graph convolution network based on topology learnable graph convolution has a plurality of graph convolution blocks, and each graph convolution block learns spatial-domain features and time-domain features separately to obtain node feature vectors that fuse local spatio-temporal features;
Spatial-domain feature learning: the spatial-domain features are learned with a node feature learning function to obtain node feature vectors that fuse local spatial-domain features, as shown in formula (1-2):
where W is the node feature learning parameter matrix and $W_m$ is its m-th dimension; $v_i$ is the i-th node in the topological graph; $f_{v_i}$ is the feature vector of node $v_i$, namely the content stored in the data structure corresponding to the node; and M is the dimension of the vector or matrix. The learned spatial-domain features are normalized with a batch normalization function and finally processed with a rectified linear unit (ReLU) activation function;
Time-domain feature learning: the time-domain features are learned with a time-domain convolution function, and the learned time-domain features are then normalized with a batch normalization function;
S22, after spatial-domain feature learning, the spatial-domain feature vectors are fused through a node fusion function GFusion(·) to obtain the connection weights of a new edge set; GFusion(·) is implemented as a matrix multiplication between the specifically initialized topology learnable fusion weights and the node features, as shown in formula (1-3):
where L is the topology learnable fusion parameter matrix; $f_{v_i}$ is the feature vector of node $v_i$; $L_{ij}$ is the learnable fusion weight between nodes $v_i$ and $v_j$, initialized with the normalized adjacency matrix or an all-zero matrix; $v_j$ ranges over all nodes in the topological graph other than $v_i$; "⊙" denotes the element-wise product of two matrices; and $f_{v_j}$ is the feature vector of node $v_j$. The topology learnable fusion parameter matrix L is adaptive and is implemented with a two-dimensional convolution with a 1×1 kernel or with a matrix multiplication;
S23, substituting the node feature vectors that fuse the local spatio-temporal features and the connection weights of the new edge set into formula (1-1) to obtain a topological graph sequence with a new topological structure.
Further, the topological graph sequence with long-term spatio-temporal features in step S3 is determined according to formula (1-4):
$F_{vt} = \mathrm{GCNLSTM}(\mathrm{STGCN}(I))$ (1-4)
where $F_{vt}$ is the long-term spatio-temporal feature of node v in frame t, I is the human skeleton topological graph sequence given by formula (1-1), STGCN is the spatio-temporal graph convolution network based on topology learnable graph convolution, and GCNLSTM is the graph convolution long short-term memory network, implemented as shown in formula (1-5):
where $W_{xi}$ and $W_{hi}$ are the weights of the input and the hidden state in the input gate; $W_{xf}$ and $W_{hf}$ are the weights of the input and the hidden state in the forget gate; $W_{xo}$ and $W_{ho}$ are the weights of the input and the hidden state in the output gate; $W_{xc}$ is the weight of the input in the cell state and $W_{hc}$ is the weight of the hidden state in the cell state; "$*_g$" denotes the graph convolution operation; $X_t$ is the input at the current time; $H_t$ is the hidden state at the current time and $H_{t-1}$ the hidden state at the previous time; $b_i$, $b_f$, $b_o$ and $b_c$ are the biases of the input gate, the forget gate, the output gate and the cell state respectively; $\sigma$ is the sigmoid function; $i_t$, $f_t$ and $o_t$ are the gate values of the input, forget and output gates respectively; $C_{t-1}$ is the cell state at the previous time; "$\circ$" denotes the Hadamard product; tanh is the hyperbolic tangent function; and $C_t$ is the cell state at the current time t.
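The image for formula (1-5) is not reproduced in this text. Assuming the standard LSTM gate structure with each matrix product replaced by the graph convolution $*_g$, and using only the symbols defined above, the update would take the following form. The symbol $\circ$ for the Hadamard product is a notational assumption, and the original formula may differ in detail.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} *_g X_t + W_{hi} *_g H_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} *_g X_t + W_{hf} *_g H_{t-1} + b_f\right) \\
o_t &= \sigma\left(W_{xo} *_g X_t + W_{ho} *_g H_{t-1} + b_o\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} *_g X_t + W_{hc} *_g H_{t-1} + b_c\right) \\
H_t &= o_t \circ \tanh\left(C_t\right)
\end{aligned}
```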
Further, step S4 is specifically performed according to the following steps:
S41, first performing a mean pooling operation over all node features at each time step to obtain a feature vector for each time step, as shown in formula (1-6):
where $F_{vt}$ is the long-term spatio-temporal feature, $F_t$ is the fused feature vector at time t, and GPooling(·) is the node feature mean pooling function, which averages over all nodes of the feature graph at each time step to obtain the feature vector for that time step;
S42, aggregating the feature vectors of all time steps with a time-domain mean global pooling operation to obtain the global spatio-temporal feature, as shown in formula (1-7):
where $F_t$ is the fused feature vector at time t, F is the global spatio-temporal feature obtained by fusion, and TPooling(·) is the time-domain mean global pooling function, which pools the feature vectors of all time steps to obtain the global spatio-temporal feature.
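Steps S41 and S42 amount to two successive averages: first over the nodes of each frame, then over the frames. The sketch below is a minimal pure-Python illustration; the images for formulas (1-6) and (1-7) are not reproduced in this text, so the plain arithmetic mean and the function name `global_spatiotemporal_feature` are assumptions consistent with the phrase "mean pooling".

```python
def global_spatiotemporal_feature(F):
    """F[t][v] is the feature vector of node v in frame t (T x N x C nested lists)."""
    def mean_vectors(vectors):
        # Average a list of equal-length vectors, channel by channel.
        return [sum(vals) / len(vectors) for vals in zip(*vectors)]

    per_frame = [mean_vectors(frame) for frame in F]  # GPooling: average over nodes
    return mean_vectors(per_frame)                    # TPooling: average over frames


# Hypothetical sequence: 2 frames, 2 nodes, 2 channels.
F_vt = [[[1.0, 2.0], [3.0, 4.0]],
        [[5.0, 6.0], [7.0, 8.0]]]
F_global = global_spatiotemporal_feature(F_vt)
```

Because the second average runs over however many frames are present, sequences of different lengths map to a vector of the same size, which is what lets one classifier handle actions of different duration.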
Further, step S5 is specifically given by formula (1-8):
where C is the number of behavior classes, $C_k$ is the k-th behavior class, $S_k$ and $S_i$ are the scores, computed by a known fully connected layer function, for the global spatio-temporal feature F belonging to the k-th and the i-th behavior class respectively, and e is the natural constant (the base of the natural logarithm).
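The symbols above (class scores from a fully connected layer, the constant e, and a sum over the C classes) read like a softmax classifier, although the image for formula (1-8) is not reproduced in this text. Under that assumption, a minimal sketch looks as follows; `softmax` and `classify` are illustrative names, and subtracting the maximum score is a common numerical-stability step rather than something the source describes.

```python
import math


def softmax(scores):
    """Turn fully connected layer scores S_1..S_C into class probabilities."""
    shift = max(scores)                       # stability shift; does not change the result
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return (index of the most probable class, list of probabilities)."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs


# Hypothetical scores for three behavior classes.
predicted, probs = classify([1.0, 3.0, 2.0])
```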
Further, the topology learnable fusion parameter matrix L and the node feature learning parameter matrix W are learned and optimized through back propagation.
Further, the spatial-domain edge set of the topological graph with the new topological structure is determined as follows: when the fusion weight between node $v_{ti}$ and node $v_{tj}$ in frame t is not 0, a spatial relationship exists between node $v_{ti}$ and node $v_{tj}$, and a new edge is formed.
A behavior recognition system based on spatio-temporal graph convolution and a graph convolution long short-term memory network, which adopts the above human behavior identification method based on a graph convolution network, comprises:
a topological graph sequence construction module, used for extracting human skeleton information from an input image, acquiring a sequence of human joint positions, and constructing a topological graph sequence of the human skeleton by taking all joints as nodes and the bones between joints as edges;
a spatio-temporal graph convolution network, used for performing feature extraction and adaptive evolution of the topological structure on the topological graph sequence to obtain new node features that fuse local spatio-temporal features and a topological graph sequence with a new topological structure;
a graph convolution long short-term memory neural network, used for extracting features of the topological graph sequence with the new topological structure to obtain a topological graph sequence with long-term spatio-temporal features;
a global pooling module, used for further fusing the features of the topological graph sequence to obtain global spatio-temporal features;
and a classifier, used for recognizing human behavior based on the global spatio-temporal features.
The invention has the beneficial effects that:
(1) The graph convolution is separated into two operations, feature learning and node fusion. By expanding the range of node fusion, a new topological graph beyond the manually set topological structure is learned; the adaptive relationships of the whole topological graph are learned through a specifically initialized topology learnable fusion parameter matrix; and the weight matrix in the graph convolution is extended to the whole topological graph structure, so that relationships can be learned both between connected nodes and between two unconnected nodes, which makes the method flexible to use and highly adaptable. The features of the whole topological graph structure are learned directly, so that the human skeleton sequence is converted into deep spatio-temporal features; the integrity of the topological structure's features is preserved throughout learning, the features of the whole topological graph are extracted more effectively, and identification accuracy is improved. This addresses the insufficient ability of existing adaptive graph convolution to learn topological graphs beyond the manually set topological structure.
(2) The method is combined with a recurrent neural network to learn the long-term spatio-temporal features of the topological graph sequence and is used for human behavior recognition based on skeleton sequence data. It learns the features of topologically structured data effectively and addresses the insufficient ability of existing adaptive graph convolution to model long-term temporal features of topological graph sequences.
(3) The invention can use the latest classifiers to improve performance and has good flexibility and extensibility. By converting the action sequence into a global spatio-temporal feature, it effectively handles the inconsistent duration of different actions; it effectively models relationships between body parts that lack a physical connection and learns dynamic topological structure sequences; and it can be applied to skeleton-sequence-based human behavior recognition, gesture recognition, facial expression recognition, and similar tasks.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a flow chart of a behavior recognition method based on spatio-temporal graph convolution and a graph convolution long short-term memory network according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of obtaining a new topological graph sequence by the topology learnable graph convolution method according to an embodiment of the present invention.
FIG. 3 is a block diagram of a behavior recognition system based on spatio-temporal graph convolution and a graph convolution long short-term memory network according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a human behavior recognition method based on spatio-temporal graph convolution and a graph convolution long short-term memory network. As shown in FIG. 1, the method comprises the following steps:
S1, extracting human skeleton information from images containing human behavior, obtaining a sequence of human joint positions, and constructing a topological graph sequence of the human skeleton by taking each joint as a node and the bones between joints as edges, where the skeleton information is usually extracted from a color image or a depth image;
The topological graph sequence of the human skeleton consists of a plurality of topological graph structures, each represented by formula (1):
$G = (V, E) = (f_v, w_E)$ (1)
where G is the topological graph structure of the human skeleton; the node set $V = \{v_{ti} \mid t = 1, \dots, T,\ i = 1, \dots, N\}$, where T is the number of frames in the sequence and N is the number of joints, contains all nodes of the skeleton sequence at every time step; the edge set E consists of a spatial-domain edge set and a time-domain edge set: the spatial-domain edge set $E_S = \{v_{ti} v_{tj} \mid (i, j) \in H\}$ contains the edges between node i and node j in frame t, where H is the set of natural connections between human joints; the time-domain edge set $E_T = \{v_{ti} v_{(t+1)i}\}$ connects the same node in consecutive frames; $f_v$ denotes the feature vectors of the nodes, and $w_E$ denotes the connection weights of the edges.
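The construction of V, E_S and E_T in formula (1) can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name `build_skeleton_graph`, the 0-based indices, and the 3-joint example skeleton are assumptions made for the example.

```python
from itertools import product


def build_skeleton_graph(num_frames, num_joints, bones):
    """Return (nodes, spatial_edges, temporal_edges) for a skeleton sequence.

    Nodes are (t, i) pairs. `bones` plays the role of H, the set of natural
    joint connections, given as (i, j) index pairs.
    """
    nodes = [(t, i) for t, i in product(range(num_frames), range(num_joints))]
    # E_S: within each frame, connect joints that share a bone.
    spatial_edges = [((t, i), (t, j)) for t in range(num_frames) for i, j in bones]
    # E_T: connect each joint to the same joint in the next frame.
    temporal_edges = [((t, i), (t + 1, i))
                      for t in range(num_frames - 1) for i in range(num_joints)]
    return nodes, spatial_edges, temporal_edges


# Hypothetical skeleton: a 3-joint chain 0-1-2 observed over 2 frames.
nodes, e_s, e_t = build_skeleton_graph(2, 3, [(0, 1), (1, 2)])
```

The sequence length is a free parameter here, which matches the statement that the topological graph sequence may have any length.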
S2, performing feature extraction and adaptive evolution of the topological structure on the topological graph sequence through a spatio-temporal graph convolution network based on topology learnable graph convolution (GCN), to obtain new node features that fuse local spatio-temporal features and a topological graph sequence with a new topological structure;
The topology learnable graph convolution is carried out in two steps, node feature learning and node feature fusion, so the topology learnable graph convolution function is composed of a node feature learning function and a node fusion function, as shown in formula (2):
where GraphConv(·) is the topology learnable graph convolution function, GFusion(·) is the node fusion function, L is the topology learnable fusion parameter matrix, and FConv(·) is the node feature learning function; the output of the topology learnable graph convolution function is the new node features that fuse local spatio-temporal features and the topological graph sequence with a new topological structure.
S21, the spatio-temporal graph convolution network based on topology learnable graph convolution has a plurality of graph convolution blocks; each graph convolution block learns spatial-domain features and time-domain features separately to obtain the node feature vector $f'_v$ that fuses local spatio-temporal features;
Spatial-domain features: the spatial-domain features are learned with a node feature learning function to obtain node feature vectors that fuse local spatial-domain features, as shown in formula (3):
where W is the node feature learning parameter matrix and $W_m$ is its m-th dimension; $v_i$ is the i-th node in the topological graph; $f_{v_i}$ is the feature vector of node $v_i$, namely the content stored in the data structure corresponding to the node; and M is the dimension of the vector or matrix. Batch normalization (BN) is applied to the learned spatial-domain features, which accelerates convergence, alleviates overfitting, makes the network less sensitive to the initial weights, and allows a larger learning rate; finally, the features are processed with a rectified linear unit (ReLU), which reduces computation, mitigates vanishing gradients, and alleviates overfitting.
Time-domain features: the time-domain features are learned with a time-domain convolution function, and the learned time-domain features are then normalized with a batch normalization function. The convolution applied to the spatial-domain features is a graph convolution, because the spatial-domain features have a topological graph structure; the convolution applied to the time-domain features is an ordinary convolution, because the time-domain features are grid-structured rather than topological.
S22, after spatial-domain feature learning, the spatial-domain features are fused through the node fusion function GFusion(·) to obtain the connection weights of the new edge set; GFusion(·) is implemented as a matrix multiplication between the specifically initialized topology learnable fusion weights and the node features, as shown in formula (4):
where L is the topology learnable fusion parameter matrix; $f_{v_i}$ is the feature vector of node $v_i$; $L_{ij}$ is the learnable fusion weight between nodes $v_i$ and $v_j$, initialized with the normalized adjacency matrix or an all-zero matrix; $v_j$ ranges over all nodes in the topological graph other than $v_i$ (including, but not limited to, the nodes adjacent to $v_i$), so that relationships can be learned not only between nodes connected by an edge but between any two nodes; "⊙" denotes the element-wise product of two matrices; and $f_{v_j}$ is the feature vector of node $v_j$. The topology learnable fusion parameter matrix L is adaptive and is implemented with a two-dimensional convolution with a 1×1 kernel or with a matrix multiplication, so that it can learn not only the fusion weights of existing edges but also the fusion weights between any two nodes. For example, in the human skeleton there is no natural connection between the joints of the left and right hands, but the two are often correlated while an action is performed, and this implicit relationship can be learned by the topology learnable graph convolution. The topology learnable fusion parameter matrix L and the node feature learning parameter matrix W are learned through back propagation, which optimizes the parameters.
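The two-step structure described in S21 and S22 (per-node feature learning, then fusion across nodes with a dense learnable matrix) can be illustrated with a small pure-Python sketch. The images for formulas (2) to (4) are not reproduced in this text, so the concrete forms used here are assumptions: FConv is taken as the product of the features with W, GFusion as the product of L with the result, batch normalization is omitted, and L is initialized as a row-normalized adjacency matrix with self-loops. All function names are illustrative.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(bones, n):
    """Row-normalized adjacency matrix; the self-loops are an assumption."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in bones:
        a[i][j] = a[j][i] = 1.0
    return [[v / sum(row) for v in row] for row in a]


def topology_learnable_graph_conv(features, W, L):
    """features: N x C_in, W: C_in x C_out, L: dense N x N fusion matrix."""
    learned = matmul(features, W)                                # FConv (assumed form)
    activated = [[max(0.0, v) for v in row] for row in learned]  # ReLU; BN omitted
    return matmul(L, activated)                                  # GFusion (assumed form)


# Hypothetical 3-joint chain 0-1-2 with 2-channel features and an identity W.
L = normalized_adjacency([(0, 1), (1, 2)], 3)
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
out = topology_learnable_graph_conv(features, W, L)
```

Because L is a full N x N matrix rather than a mask restricted to existing bones, training can move any entry away from zero, including entries for joint pairs with no natural connection.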
In the embodiment, the spatio-temporal graph convolution network based on topology learnable graph convolution has 7 graph convolution blocks, and the number of channels of each graph convolution block is 64, 128, 256 and 256, respectively. There is no specific requirement on the number of graph convolution blocks or the corresponding numbers of channels; both are hyper-parameter settings of the neural network.
S23, substituting the node feature vector $f'_v$ that fuses the local spatio-temporal features and the connection weights of the new edge set into formula (1) to obtain a topological graph sequence with a new topological structure, as shown in formula (5):
where $G_{GCN}$ is the new topological graph sequence, V is the node set, $E'_S$ is the edge set of the topological graph with the new topological structure, $f'_v$ is the node feature vector that fuses the local spatio-temporal features, and the edge weights are the connection weights of the new edge set.
The spatial-domain edge set of the topological graph with the new topological structure is $E'_S = \{v_{ti} v_{tj} \mid L_{ij} \neq 0\}$; that is, when the fusion weight between node $v_{ti}$ and node $v_{tj}$ in frame t is not 0, a spatial relationship exists between the two nodes, a new edge is formed, and the topological structure of the graph is updated. The time-domain convolution does not change the topology of the graph; it only updates the features of each graph node. Only the topology learnable graph convolution proposed by the invention updates the graph node features and the graph topology at the same time; it is applied only to the spatial-domain graph convolution, while the time-domain convolution still uses the known time-domain convolution method.
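Reading the new spatial edge set off the fusion matrix is a direct filter on its non-zero entries. The sketch below is illustrative only: the function name, the `tol` threshold parameter, and the exclusion of self-pairs (i = j) are assumptions added for the example.

```python
def spatial_edges_from_fusion(L, tol=0.0):
    """Return ordered index pairs (i, j), i != j, whose fusion weight is non-zero."""
    n = len(L)
    return [(i, j) for i in range(n) for j in range(n)
            if i != j and abs(L[i][j]) > tol]


# Hypothetical learned fusion matrix for 3 joints. Joints 0 and 2 share no bone,
# but the non-zero weight L[2][0] creates an edge between them.
L = [[0.5, 0.5, 0.0],
     [0.3, 0.4, 0.3],
     [0.2, 0.0, 0.8]]
edges = spatial_edges_from_fusion(L)
```

The pairs are ordered because nothing in the text requires L to be symmetric; a positive `tol` could be used to ignore weights that are numerically close to zero.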
In FIG. 2, a topological graph sequence of T frames is first input, where the topological graph of frame t is represented by formula (1); the topological graph has N nodes, each node has a feature vector $f_v$, and $C_{in}$ is the number of input channels of the spatio-temporal graph convolutional neural network. Feature learning is performed on the input topological graph sequence through the node feature learning parameter matrix W and the learnable fusion weight parameter matrix L in the network, where each edge between the i-th joint and the j-th joint in frame t has a connection weight, and $L_{ij}$ denotes the entry in row i and column j of the matrix L. A new topological graph sequence is obtained through the spatio-temporal graph convolutional neural network, in which each node has a new feature vector $f'_v$, and $C_{out}$ is the number of output channels of the network.
Existing graph convolution learns only the relationships between connected nodes: it first extracts sub-graph features in a bottom-up manner and then fuses them to obtain the features of the whole graph. The invention separates the graph convolution into two operations, feature learning and node fusion. By expanding the range of node fusion, it learns a new topological graph beyond the manually set topological structure, and the adaptive relationships of the whole topological graph can be learned through the specifically initialized topology learnable fusion parameter matrix L. The weight matrix in the graph convolution is extended to the whole topological graph structure, so that the relationship between any two nodes in the graph is learned without being limited by the topological structure, and the learning range is extended from connected nodes to any two nodes. Rather than working bottom-up, the method learns the features of the whole graph directly by learning the relationships between any two nodes; the number of parameters is larger than when learning sub-graphs, but the extracted features are more effective.
Instead of learning the features of sub-topological graphs and then fusing them, the invention learns the features of the whole topological graph. To keep the number of parameters and the amount of computation comparable to other adaptive graph convolutional neural networks, the invention uses a simple topology learnable convolution with a specific initialization, which shows better performance.
S3, extracting features of the new topological graph sequence through a graph convolution long short-term memory neural network (GCNLSTM) to obtain a topological graph sequence with long-term space-time features, as shown in formula (6):
F_vt = GCNLSTM(STGCN(I)) (6)
where I is the human skeleton topological graph sequence shown in formula (1), STGCN is the space-time graph convolution network based on topology-learnable graph convolution, and GCNLSTM is the graph convolution long short-term memory network, whose specific implementation is shown in formula (7); F_vt is the long-term space-time feature of node v in the t-th frame, namely the new feature vector of the node.
The graph convolution long short-term memory neural network is shown in formula (7);
where W_xi and W_hi are the weights of the input and the hidden state in the input gate; W_xf and W_hf are the weights of the input and the hidden state in the forget gate; W_xo and W_ho are the weights of the input and the hidden state in the output gate; W_xc is the weight of the input in the cell state and W_hc is the weight of the hidden state in the cell state; "+g" denotes the graph convolution operation; X_t is the input at the current time; H_t is the hidden state at the current time and H_(t-1) is the hidden state at the previous time; b_i, b_f, b_o and b_c are the biases of the input gate, the forget gate, the output gate and the cell state, respectively; σ is the S-shaped sigmoid function; i_t, f_t and o_t are the gate values of the input gate, the forget gate and the output gate, respectively; C_(t-1) is the cell state at the previous time; "⊙" denotes the Hadamard product; tanh is the hyperbolic tangent function; and C_t is the cell state at the current time t.
The space-time graph convolutional network can learn only short-term high-level space-time features, and for a data sequence with temporal relationships, short-term high-level features are not sufficient for pattern recognition. The recurrent neural network performs long-term temporal modeling on the learned short-term high-level space-time features, fully learns the temporal relationships in the sequence, and markedly improves pattern recognition.
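Formula (7) itself is not reproduced in this text, so the following NumPy sketch only shows how the symbols defined above would fit the usual LSTM gate structure. Several points are assumptions rather than statements from the source: the graph convolution is taken to be `a_hat @ x @ w` with a fixed matrix `a_hat`, the hidden state is updated as H_t = o_t ⊙ tanh(C_t), and the helper names (`gconv`, `init_params`, `gcn_lstm`) are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def gconv(a_hat, x, w):
    """Assumed form of the graph convolution: aggregate over a_hat, then project."""
    return a_hat @ x @ w


def gcn_lstm_step(a_hat, x_t, h_prev, c_prev, p):
    """One time step. x_t: (N, C_in); h_prev and c_prev: (N, C_h)."""
    i_t = sigmoid(gconv(a_hat, x_t, p["W_xi"]) + gconv(a_hat, h_prev, p["W_hi"]) + p["b_i"])
    f_t = sigmoid(gconv(a_hat, x_t, p["W_xf"]) + gconv(a_hat, h_prev, p["W_hf"]) + p["b_f"])
    o_t = sigmoid(gconv(a_hat, x_t, p["W_xo"]) + gconv(a_hat, h_prev, p["W_ho"]) + p["b_o"])
    c_cand = np.tanh(gconv(a_hat, x_t, p["W_xc"]) + gconv(a_hat, h_prev, p["W_hc"]) + p["b_c"])
    c_t = f_t * c_prev + i_t * c_cand   # "*" on arrays is the Hadamard product
    h_t = o_t * np.tanh(c_t)            # assumed hidden-state update
    return h_t, c_t


def init_params(c_in, hidden, rng, scale=0.1):
    """Random weights and zero biases for the four gates (hypothetical helper)."""
    p = {}
    for gate in "ifoc":
        p[f"W_x{gate}"] = rng.normal(scale=scale, size=(c_in, hidden))
        p[f"W_h{gate}"] = rng.normal(scale=scale, size=(hidden, hidden))
        p[f"b_{gate}"] = np.zeros(hidden)
    return p


def gcn_lstm(a_hat, xs, p, hidden):
    """Run the cell over a sequence. xs: (T, N, C_in) -> (T, N, hidden)."""
    n = xs.shape[1]
    h = np.zeros((n, hidden))
    c = np.zeros((n, hidden))
    outputs = []
    for x_t in xs:
        h, c = gcn_lstm_step(a_hat, x_t, h, c, p)
        outputs.append(h)
    return np.stack(outputs)


rng = np.random.default_rng(0)
xs = rng.normal(size=(6, 5, 3))      # 6 frames, 5 joints, 3 channels (toy data)
params = init_params(3, 4, rng)
f_vt = gcn_lstm(np.eye(5), xs, params, hidden=4)
print(f_vt.shape)  # (6, 5, 4)
```

Because the cell state C_t is carried across all frames, each output F_vt can depend on the whole preceding sequence, which is the long-term modeling the paragraph above describes.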
S4, further fusing the features of the topological graph sequence with a global pooling operation to obtain global space-time features, which effectively handles the inconsistent durations of different actions and effectively models the relationships between body parts that lack a physical connection;
S41, first performing a mean pooling operation on all node features at each time to obtain a feature vector at each time, as shown in formula (8):
where F_vt is the long-term space-time feature, F_t is the fused feature vector at time t, and GPooling() is the node feature mean pooling function, which performs mean pooling over all nodes of the feature map at each time to obtain the feature vector at that time.
S42, aggregating the feature vectors at each time with a time-domain mean global pooling operation to obtain the global space-time feature, as shown in formula (9):
where F_t is the feature vector at time t, F is the global space-time feature obtained by fusion, and TPooling() is the time-domain mean global pooling function, which pools the feature vectors at all times to obtain the global space-time feature.
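Steps S41 and S42 amount to two successive mean reductions. The sketch below, with hypothetical function names `gpooling` and `tpooling` and an assumed array layout of (frames, nodes, channels), illustrates why sequences of different lengths end up with a feature vector of the same size.

```python
import numpy as np


def gpooling(f_vt):
    """Node mean pooling: (T, N, C) -> (T, C), one vector per frame."""
    return f_vt.mean(axis=1)


def tpooling(f_t):
    """Time-domain mean pooling: (T, C) -> (C,), one vector per sequence."""
    return f_t.mean(axis=0)


rng = np.random.default_rng(0)
short_action = rng.normal(size=(20, 25, 64))    # 20 frames (toy data)
long_action = rng.normal(size=(300, 25, 64))    # 300 frames (toy data)

f_short = tpooling(gpooling(short_action))
f_long = tpooling(gpooling(long_action))
print(f_short.shape, f_long.shape)  # (64,) (64,)
```

Both actions are reduced to a 64-dimensional vector regardless of duration, so the same classifier can be applied to either.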
S5, recognizing human behavior with a Softmax classifier based on the global space-time feature; the Softmax classifier offers good flexibility and extensibility and improves performance;
specifically, as shown in formula (10);
where C is the number of behavior classes, C_k is the k-th behavior class (common behavior classes include drinking, eating, brushing teeth, combing hair, reading and similar actions), S_k and S_i are the probabilities that the global space-time feature F belongs to the k-th behavior class and the i-th behavior class, obtained through the known fully connected layer function, and e is the base of the natural logarithm, approximately 2.718.
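Formula (10) is not reproduced in this text. A common reading, assumed here, is that a fully connected layer maps the global feature F to one value S_i per class and that Softmax normalizes these values as e^(S_k) divided by the sum of e^(S_i) over all C classes. The sketch below follows that reading; `w_fc`, `b_fc` and `classify` are hypothetical names, and subtracting the maximum is a standard numerical-stability step that does not change the result.

```python
import numpy as np


def softmax(scores):
    """Return e^(S_k) / sum_i e^(S_i) for every class k."""
    shifted = scores - np.max(scores)   # stability only; the ratio is unchanged
    exp = np.exp(shifted)
    return exp / exp.sum()


def classify(f, w_fc, b_fc):
    """Map a global feature f of shape (D,) to (predicted class index, probabilities)."""
    scores = f @ w_fc + b_fc            # assumed fully connected layer output
    probs = softmax(scores)
    return int(np.argmax(probs)), probs


rng = np.random.default_rng(0)
f = rng.normal(size=64)                 # toy global space-time feature
w_fc = rng.normal(scale=0.1, size=(64, 5))
b_fc = np.zeros(5)
label, probs = classify(f, w_fc, b_fc)
print(label, probs.sum())
```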
An embodiment of the invention discloses a behavior recognition system based on space-time graph convolution and a graph convolution long short-term memory network. As shown in FIG. 3, the system adopts the human behavior recognition method based on a graph convolution network and comprises:
the topological graph sequence construction module is used for extracting human skeleton information from an input image, acquiring a human joint point position information sequence, and constructing a topological graph sequence of a human skeleton by taking all joint points as nodes and bones among the joint points as edges;
the space-time graph convolution network is used for carrying out feature extraction and adaptive evolution of a topological structure on the topological graph sequence to obtain new node features fusing local space-time features and the topological graph sequence with the new topological structure;
the graph convolution long-short term memory neural network is used for extracting the characteristics of the topological graph sequence of the new topological structure to obtain the topological graph sequence with long-term space-time characteristics;
the global pooling module is used for further fusing the characteristics of the topological graph sequence to obtain global space-time characteristics;
and the classifier is used for carrying out human behavior identification based on the global space-time characteristics.
The invention is compared on two data sets with the space-time graph convolutional neural network (ST-GCN), currently one of the most advanced graph convolutional neural networks. On the Kinetics-Skeleton data set, the best recognition accuracy of the method reaches 36.2%, which is 5.5 percentage points higher than that of ST-GCN; on the NTU-RGBD data set, the best recognition accuracy of the method reaches 89.2%, which is 7.7 percentage points higher than that of ST-GCN. Compared with existing human behavior recognition methods based on space-time graph convolution, the method learns the features of the whole graph directly; the learned features represent human skeleton information better, and relationships between two joints without a physical connection can be learned, which benefits the recognition of human behavior. In addition, the graph convolution long short-term memory network learns the temporal relationships of the sequence and can therefore learn long-term temporal features, which is superior to learning the temporal relationships of the sequence with ordinary convolution.
The method is used for human behavior recognition, including action recognition, gesture recognition and facial expression recognition: based on human skeleton information, human joints are taken as the nodes of a topological graph to form a topological graph sequence, which is then recognized with the method. For example, the method can be applied to purchasing behavior recognition in unmanned supermarkets; in home life, an intelligent robot can achieve better human-machine interaction by recognizing people's behavior; in security monitoring, the behavior of specific people in specific places can be recognized; and so on. The method can also be applied to data analysis and other applications with a relational-model data structure.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.
Claims (9)
1. A human behavior recognition method based on graph convolution network is characterized by comprising the following steps:
S1, extracting human skeleton information from an image containing human behavior, obtaining a human joint point position information sequence, and constructing a topological graph sequence of the human skeleton of arbitrary length by taking each joint point as a node and the bones between joint points as edges;
S2, performing feature extraction and adaptive evolution of the topological structure on the topological graph sequence through a space-time graph convolution network based on topology-learnable graph convolution to obtain new node features fusing local space-time features and a topological graph sequence with a new topological structure;
S3, extracting features of the new topological graph sequence through the graph convolution long short-term memory neural network to obtain a topological graph sequence with long-term space-time features;
S4, further fusing the features of the topological graph sequence with a global pooling operation to obtain global space-time features;
and S5, recognizing human behavior with a classifier based on the global space-time features.
2. The method for recognizing human body behaviors based on graph convolution network as claimed in claim 1, wherein in step S1, the topological graph sequence of the human body skeleton is composed of a plurality of topological graph structures, and the topological graph structures are represented by formula (1-1);
G = (V, E) = (f_v, w_E) (1-1)
where G is the topological graph structure of the human skeleton; the node set V = {v_ti | t = 1, …, T; i = 1, …, N}, where T is the number of frames in the sequence and N is the number of joints, contains all nodes of the skeleton sequence at every time; the edge set E consists of two edge sets, one in the spatial domain and one in the time domain: the spatial-domain edge set E_S = {v_ti v_tj | (i, j) ∈ H} represents the edges between node i and node j in the t-th frame, where H is the set of natural connections of human joints, and the time-domain edge set E_T = {v_ti v_(t+1)i} represents the connections of the same node between consecutive frames; f_v denotes the feature vector of a node, and w_E denotes the connection weight of an edge.
3. The human behavior recognition method based on graph convolution network of claim 2, wherein the step S2 specifically includes:
S21, the space-time graph convolution network based on topology-learnable graph convolution has a plurality of graph convolution blocks, and each graph convolution block learns spatial-domain features and time-domain features respectively to obtain node feature vectors fusing local space-time features;
spatial-domain feature learning: learning the spatial-domain features with a node feature learning function to obtain node feature vectors fusing local spatial-domain features, as shown in formula (1-2):
where W is the node feature learning parameter matrix and W_m denotes the m-th dimension of the matrix W; node v_i is the i-th node in the topological graph; the feature vector corresponding to node v_i is the content stored in the data structure corresponding to that node; and M denotes the corresponding dimension of the vector or matrix; the learned spatial-domain features are normalized with a batch normalization function and finally processed with a rectified linear activation function;
time-domain feature learning: learning the time-domain features with a time-domain convolution function, and then normalizing the learned time-domain features with a batch normalization function;
S22, after spatial-domain feature learning, fusing the spatial-domain feature vectors through the node fusion function GFusion(·) to obtain the connection weights of a new edge set; GFusion(·) is implemented as a matrix multiplication between the topology-learnable fusion weights with a specific initialization and the node features, as shown in formula (1-3):
where L denotes the topology-learnable fusion parameter matrix; L_ij is the learnable fusion weight between nodes v_i and v_j, initialized with the normalized adjacency matrix or with the all-zero matrix; v_j ranges over all nodes in the topological graph other than v_i; "⊙" denotes the element-wise product of two matrices; the remaining symbols denote the feature vectors of nodes v_i and v_j; the topology-learnable fusion parameter matrix L is adaptive and is implemented with a two-dimensional convolution with a kernel size of 1×1 or with matrix multiplication;
and S23, substituting the node feature vectors fusing local space-time features and the connection weights of the new edge set into formula (1-1) to obtain a topological graph sequence with a new topological structure.
4. The method for recognizing human body behaviors based on graph convolution network as claimed in claim 3, wherein the topological graph sequence with long-term spatiotemporal features in step S3 is determined according to equation (1-4):
F_vt = GCNLSTM(STGCN(I)) (1-4)
where F_vt is the long-term space-time feature of node v in the t-th frame, I is the human skeleton topological graph sequence shown in formula (1-1), STGCN is the space-time graph convolution network based on topology-learnable graph convolution, and GCNLSTM is the graph convolution long short-term memory network, whose specific implementation is shown in formula (1-5):
where W_xi and W_hi are the weights of the input and the hidden state in the input gate; W_xf and W_hf are the weights of the input and the hidden state in the forget gate; W_xo and W_ho are the weights of the input and the hidden state in the output gate; W_xc is the weight of the input in the cell state and W_hc is the weight of the hidden state in the cell state; "+g" denotes the graph convolution operation; X_t is the input at the current time; H_t is the hidden state at the current time and H_(t-1) is the hidden state at the previous time; b_i, b_f, b_o and b_c are the biases of the input gate, the forget gate, the output gate and the cell state, respectively; σ is the sigmoid function; i_t, f_t and o_t are the gate values of the input gate, the forget gate and the output gate, respectively; C_(t-1) is the cell state at the previous time; "⊙" denotes the Hadamard product; tanh is the hyperbolic tangent function; and C_t is the cell state at the current time t.
5. The method for recognizing human body behaviors based on graph convolution network according to claim 4, wherein the step S4 is specifically performed according to the following steps:
S41, first performing a mean pooling operation on all node features at each time to obtain a feature vector at each time, as shown in formula (1-6):
where F_vt is the long-term space-time feature, F_t is the fused feature vector at time t, and GPooling() is the node feature mean pooling function, which performs mean pooling over all nodes of the feature map at each time to obtain the feature vector at that time;
S42, aggregating the feature vectors at each time with a time-domain mean global pooling operation to obtain the global space-time feature, as shown in formula (1-7):
where F_t is the feature vector at time t, F is the global space-time feature obtained by fusion, and TPooling() is the time-domain mean global pooling function, which pools the feature vectors at all times to obtain the global space-time feature.
6. The method for recognizing human body behaviors based on graph convolution network according to claim 5, wherein the step S5 is specifically represented by formula (1-8);
where C is the number of behavior classes, C_k is the k-th behavior class, S_k and S_i are the probabilities that the global space-time feature F belongs to the k-th behavior class and the i-th behavior class, obtained through the known fully connected layer function, and e is a constant.
7. The human behavior recognition method based on the graph convolution network, characterized in that the topology learnable fusion parameter matrix L and the node feature learning parameter matrix W are both learned and optimized through back propagation.
8. The human body behavior recognition method based on the graph convolution network is characterized in that the spatial-domain edge set of the topological graph with the new topological structure is determined as follows: when, in the t-th frame, the fusion weight between node v_ti and node v_tj is not 0, a spatial relationship exists between node v_ti and node v_tj and a new edge is formed.
9. A behavior recognition system based on spatio-temporal graph convolution and graph convolution long and short term memory network, characterized in that, a human behavior recognition method based on graph convolution network as claimed in any one of claims 1-8 is adopted, which includes:
the topological graph sequence construction module is used for extracting human skeleton information from an input image, acquiring a human joint point position information sequence, and constructing a topological graph sequence of a human skeleton by taking all joint points as nodes and bones among the joint points as edges;
the space-time graph convolution network is used for carrying out feature extraction and adaptive evolution of a topological structure on the topological graph sequence to obtain new node features fusing local space-time features and the topological graph sequence with the new topological structure;
the graph convolution long-short term memory neural network is used for extracting the characteristics of the topological graph sequence of the new topological structure to obtain the topological graph sequence with long-term space-time characteristics;
the global pooling module is used for further fusing the characteristics of the topological graph sequence to obtain global space-time characteristics;
and the classifier is used for carrying out human behavior identification based on the global space-time characteristics.
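As an illustration of the edge sets in claim 2 and the new-edge rule in claim 8, the following Python sketch builds E_S and E_T for a toy skeleton and then reads spatial edges off a fusion matrix L. The function names, the `tol` parameter, the representation of a node v_ti as a `(t, i)` tuple, and the three-joint skeleton are assumptions made for this example only.

```python
import numpy as np


def spatial_edges(num_frames, bones):
    """E_S: connect v_ti and v_tj in every frame t for each natural connection (i, j) in H."""
    return [((t, i), (t, j)) for t in range(num_frames) for i, j in bones]


def temporal_edges(num_frames, num_joints):
    """E_T: connect each joint to itself in the next frame, v_ti to v_(t+1)i."""
    return [((t, i), (t + 1, i))
            for t in range(num_frames - 1)
            for i in range(num_joints)]


def learned_spatial_edges(l, tol=0.0):
    """Claim 8 rule: joints i != j share an edge when their fusion weight is nonzero.

    tol is a hypothetical threshold for treating tiny weights as zero.
    """
    n = l.shape[0]
    return [(i, j) for i in range(n) for j in range(n)
            if i != j and abs(l[i, j]) > tol]


bones = [(0, 1), (1, 2)]                 # toy skeleton: 0 - 1 - 2
e_s = spatial_edges(3, bones)            # 3 frames
e_t = temporal_edges(3, 3)
print(len(e_s), len(e_t))                # 6 6

# A fusion matrix in which joints 0 and 2, which share no bone, have a nonzero weight.
l = np.zeros((3, 3))
l[0, 1] = l[1, 0] = 0.5
l[0, 2] = 0.3
print(learned_spatial_edges(l))          # [(0, 1), (0, 2), (1, 0)]
```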
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911070446.0A CN110796110B (en) | 2019-11-05 | 2019-11-05 | Human behavior identification method and system based on graph convolution network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110796110A (en) | 2020-02-14 |
CN110796110B (en) | 2022-07-26 |
Family
ID=69442713
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911070446.0A Active CN110796110B (en) | 2019-11-05 | 2019-11-05 | Human behavior identification method and system based on graph convolution network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110796110B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109754605A (en) * | 2019-02-27 | 2019-05-14 | 中南大学 | A kind of traffic forecast method based on attention temporal diagram convolutional network |
CN110059620A (en) * | 2019-04-17 | 2019-07-26 | 安徽艾睿思智能科技有限公司 | Bone Activity recognition method based on space-time attention |
CN110222653A (en) * | 2019-06-11 | 2019-09-10 | 中国矿业大学(北京) | A kind of skeleton data Activity recognition method based on figure convolutional neural networks |
Non-Patent Citations (2)
Title |
---|
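Chenyang Si et al.: "An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition", arXiv * |
Ma Jing: "Research and Implementation of Behavior Recognition Methods Based on Pose and Skeleton Information", China Master's Theses Full-text Database, Information Science and Technology * |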
CN114821799A (en) * | 2022-05-10 | 2022-07-29 | 清华大学 | Motion recognition method, device and equipment based on space-time graph convolutional network |
WO2024036825A1 (en) * | 2022-08-16 | 2024-02-22 | 深圳先进技术研究院 | Attitude processing method, apparatus and system, and storage medium |
CN115797841A (en) * | 2022-12-12 | 2023-03-14 | 南京林业大学 | Quadruped animal behavior identification method based on adaptive space-time diagram attention Transformer network |
CN115797841B (en) * | 2022-12-12 | 2023-08-18 | 南京林业大学 | Quadruped behavior recognition method based on self-adaptive space-time diagram attention Transformer network |
CN116434339A (en) * | 2023-04-13 | 2023-07-14 | 江南大学 | Behavior recognition method based on space-time characteristic difference and correlation of skeleton data |
CN116434339B (en) * | 2023-04-13 | 2023-10-27 | 江南大学 | Behavior recognition method based on space-time characteristic difference and correlation of skeleton data |
Also Published As
Publication number | Publication date |
---|---|
CN110796110B (en) | 2022-07-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110796110B (en) | Human behavior identification method and system based on graph convolution network | |
CN112395945A (en) | Graph convolution behavior identification method and device based on skeletal joint points | |
CN107492121B (en) | Two-dimensional human body bone point positioning method of monocular depth video | |
CN108846384A (en) | Multitask collaborative recognition method and system fusing video perception | |
Pronobis et al. | Learning deep generative spatial models for mobile robots | |
CN111079931A (en) | State space probabilistic multi-time-series prediction method based on graph neural network | |
Pajares et al. | A Hopfield Neural Network for combining classifiers applied to textured images | |
CN110826698A (en) | Method for embedding and representing crowd moving mode through context-dependent graph | |
CN104537684A (en) | Real-time moving object extraction method in static scene | |
CN111199216A (en) | Motion prediction method and system for human skeleton | |
CN107341471B (en) | Human behavior recognition method based on bilayer conditional random field | |
CN113822419A (en) | Self-supervision graph representation learning operation method based on structural information | |
CN114898467A (en) | Human motion action recognition method, system and storage medium | |
CN117373116A (en) | Human body action detection method based on lightweight characteristic reservation of graph neural network | |
CN113627326A (en) | Behavior identification method based on wearable device and human skeleton | |
CN110348395B (en) | Skeleton behavior identification method based on space-time relationship | |
CN111325336B (en) | Rule extraction method based on reinforcement learning and application | |
US20230116148A1 (en) | Method and system for multimodal classification based on brain-inspired unsupervised learning | |
Xu et al. | Causal structure learning with one-dimensional convolutional neural networks | |
CN114254214A (en) | Traffic prediction method and system based on space-time hierarchical network | |
Hwang et al. | Adaptive reinforcement learning in box-pushing robots | |
CN116844225B (en) | Personalized human body action recognition method based on knowledge distillation | |
Tsang et al. | Convergence analysis of a discrete Hopfield neural network with delay and its application to knowledge refinement | |
CN112598021A (en) | Graph structure searching method based on automatic machine learning | |
Machireddy et al. | Extracting Temporal Correlations Using Hierarchical Spatio-Temporal Feature Maps | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |