CN114582030A - Behavior recognition method based on service robot - Google Patents

Behavior recognition method based on service robot

Info

Publication number
CN114582030A
Authority
CN
China
Prior art keywords
joint
human
service robot
convolution
behavior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210484610.8A
Other languages
Chinese (zh)
Other versions
CN114582030B (en)
Inventor
李婕
王恩果
李毅
李青清
刘钊
高澄
肖克爽
张峻嘉
张振平
巩朋成
李刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hubei University of Technology
Original Assignee
Hubei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hubei University of Technology filed Critical Hubei University of Technology
Priority to CN202210484610.8A priority Critical patent/CN114582030B/en
Publication of CN114582030A publication Critical patent/CN114582030A/en
Application granted granted Critical
Publication of CN114582030B publication Critical patent/CN114582030B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a behavior recognition method based on a service robot, which comprises the following specific steps: extracting human body joint point sequences of 13 common behavior categories in the service robot application scene to form a training data set; preprocessing the training data set; carrying out weighted optimization on the joint point data in combination with the actual application scene to output 17 main joint points; constructing a lightweight multi-scale aggregation space-time graph convolution deep learning neural network model from multi-scale space-time graph convolution and temporal convolution modules; training and testing on the data set with the constructed network model; identifying human body behaviors in video images of the real scene to be identified with the trained model; and having the service robot receive the human behavior recognition result and make a corresponding response. The invention can accurately identify human body behaviors in the scene and ensures the service quality of the service robot.

Description

Behavior recognition method based on service robot
Technical Field
The application relates to the technical field of human behavior recognition, in particular to a behavior recognition method based on a service robot.
Background
With the development of science and technology and the deepening of artificial intelligence research, the application field of robotics is no longer limited to industrial robots but is spreading toward livelihood and civilian use, and service robots are gradually entering people's daily lives. In recent years, service robots have developed toward intelligence, their functions have become increasingly rich, and they are widely applied in cleaning, medical treatment, rescue, logistics, maintenance, security and other fields. The development of the service robot industry can effectively relieve the social pressure of caring for the elderly and the disabled, improve people's quality of life, and promote the rapid development of civilian science and technology; it is a strategic measure for bringing the benefits of advanced scientific and technological achievements to the public, so countries around the world attach great importance to the development of the service robot industry and invest large amounts of resources in research and development. Although the relevant research technologies of service robots are relatively mature, the complex external environment is still a great challenge for service robots in research on positioning and navigation, human-computer interaction, computer vision, reasoning tasks and the like. By performing algorithmic analysis on the video images captured by the service robot, the behavior of people in the scene can be judged and a corresponding response can then be made. To identify human behavior in a video, information highly relevant to the target human behavior must first be extracted from the video, key information is then obtained through algorithmic processing, and finally the obtained key information is used to identify the human behavior.
With the miniaturization, integration and intelligence of cameras and the flexibility of their interfaces, a service robot can capture indoor environment pictures in real time by carrying a camera. Traditional feature extraction methods extract high-dimensional visual features through spatio-temporal key point sampling, dense trajectory sampling, body part sampling and the like, and perform behavior prediction with classifiers such as SVM (Support Vector Machine) and RF (Random Forest). Deep learning methods perform end-to-end feature extraction and recognition through automatic feature learning; in particular, applying graph convolution networks to the human skeleton avoids, as much as possible, the influence of complex background, shape, RGB color and other information on recognition accuracy. A key point recognition algorithm (such as OpenPose or MediaPipe) is applied to the captured video pictures to obtain sequence information such as human key points, the key point sequence information is sent to the constructed multi-scale aggregation space-time graph convolution network model for calculation to obtain the behavior information of the corresponding person, and the service robot then makes a corresponding response (for example, on a hand-waving motion, the robot recognizes the motion and approaches the person).
In existing schemes, human skeleton behavior recognition methods based on graph convolution mostly treat the human skeleton sequence as a series of disjoint graphs and extract features through a graph convolution (GCN) module in the spatial dimension and a temporal convolution (TCN) module in the time dimension. In the complex working environment of a service robot, the recognition efficiency of a behavior recognition model built on ordinary graph convolution is not high, and wrong recognition causes wrong interaction of the robot, which affects the service quality of the robot and the experience of the service object. Therefore, a lightweight human behavior recognition model is urgently needed for application on service robots.
Disclosure of Invention
The embodiment of the application aims to provide a behavior recognition method based on a service robot, and designs a lightweight graph convolution human behavior recognition model capable of capturing cross-space-time relations, which ensures the overall recognition effect, reduces the false recognition of similar actions, and improves the quality of remote visual interaction of the service robot.
In order to achieve the above purpose, the present application provides the following technical solutions:
the embodiment of the application provides a behavior recognition method based on a service robot, which comprises the following specific steps:
S1, extracting human body joint point sequences of 13 behavior categories commonly used in the service robot application scene to form a training data set;
S2, preprocessing the training data set, firstly extracting key frames of the joint point sequence, and then optimizing the joint point data in combination with the actual application scene;
S3, for a video shot in a real scene, firstly carrying out key point estimation by adopting the BODY_25 human posture estimation model in OpenPose to obtain 25 key point coordinates and confidence coefficients, then filling key point vacancy values in the obtained key point data by adopting a K nearest neighbor method, and finally carrying out weighted optimization on the joint point data in combination with the actual application scene to output 17 main joint points;
S4, constructing a lightweight multi-scale aggregation space-time graph convolution deep learning neural network model by using multi-scale space-time graph convolution and temporal convolution modules;
S5, training and testing on the data set by using the constructed network model;
S6, identifying human body behaviors in the video image of the real scene to be identified by using the trained model;
and S7, the service robot receives the human behavior recognition result and makes a corresponding response.
In the step S1, the training data set is derived from an NTU-RGB + D human behavior data set, and 13 behavior categories are selected: drinking, picking up, throwing away, sitting down, standing up, jumping, shaking head, tumbling, chest pain, waving hands, kicking, hugging and walking, 12324 skeleton files in total.
The step S2 of extracting key frames from the skeleton sequence includes:
one frame is extracted every 30 frames from each video segment corresponding to the different behavior categories in the service robot application scene, 300 frames of data are retained as a training sample, videos with fewer than 300 frames are repeatedly sampled from the beginning of the video, the number of people in the joint point data is judged, and only joint point data containing a single person is retained for training and validating the model.
The step S3 specifically includes:
S31, detecting the person key points in the video image of the real scene by using the OpenPose human body key point detection algorithm model, obtaining the horizontal and vertical coordinate values (x, y) of 25 skeletal joint points with the BODY_25 human body joint point labeling model, splicing the discrete joint points according to the physical connection mode of the human body joints to form a human body skeleton spatial topology model, and then splicing the spatial topology graphs of all frames in time order to finally obtain a space-time graph of human body skeleton structure change;
S32, for missed detection of a whole frame of data, the 0th, 1st and 8th joint points are defined as main key points; if any one of these three groups of data is missing in a certain frame of the joint point data output for the video image, the whole frame is judged to be a missed detection and the joint point data corresponding to that video frame is deleted; for the case where part of the key points of a certain frame are missing, a 2-order K nearest neighbor method is adopted for filling, which requires no training or parameter estimation and directly takes the average of the horizontal and vertical coordinate values (x, y) of the frames before and after that point as the supplement.
The step S4 specifically includes:
S41, graph convolution calculation process: after the joint point coordinates are obtained, the joint points are taken as vertices and the natural connections between joint points as bone edges, so that the human skeleton is represented as a graph $G=(V,E)$; the $T$ frame skeleton graphs are arranged in time order $t=1,\dots,T$ and the same-position joint points are connected to form a space-time skeleton graph; the node set $V=\{v_{ti}\mid t=1,\dots,T;\ i=1,\dots,N\}$ is the set of all joint points in each skeleton graph, where $N$ is the number of joints per frame; the edge set $E$ is represented by two subsets, the first subset representing the intra-skeleton connections of each frame, written as $E_S=\{v_{ti}v_{tj}\mid (i,j)\in H\}$, where $H$ is the set of naturally connected human joints, and the second subset representing the connecting edges of identically located joint points between adjacent frames, written as $E_F=\{v_{ti}v_{(t+1)i}\}$, where $i$ is the serial number of the joint point; from the node set $V$ and the edge set $E$ the adjacency matrix $A$ can be obtained, and the graph convolution is calculated as follows:

$$Y=\sum_{k=1}^{K_v} A_k X W_k$$

where $X$ is the input, $Y$ is the output, $A_k$ is the adjacency matrix, $W_k$ is a learnable weight, and $K_v$ is the spatial dimension kernel size;
S42, adaptive graph convolution calculation process: as shown in the following formula, on the basis of $A_k$, two matrices $B_k$ and $C_k$ are newly added, where $B_k$ is a trainable weight and $C_k$ learns a unique graph for each sample:

$$Y=\sum_{k=1}^{K_v}\left(A_k+B_k+C_k\right) X W_k$$

S43, multi-scale space-time graph convolution calculation process: to better connect the spatial and temporal skeleton information, the $k$-hop adjacency matrices of each node are tiled to form one large matrix $\bar{A}$; each node in $\bar{A}$ is directly connected with its corresponding neighbor nodes on all frames, so that jump connections between nodes are realized; the calculation process is as follows:

$$Y=\bar{A}\, X\, W$$

S44, MS-GCN multi-scale space-time graph convolution module: for the input node information, the $k$-hop adjacency matrices are extracted respectively, and finally the $K$ matrices are spliced together, where $i$ is the serial number of the joint point, $v_i$ is the coordinates of the joint point, and $d(v_i,v_j)$ represents the shortest hop distance between nodes $v_i$ and $v_j$;
S45, MS-TCN temporal dilated convolution module: a convolution is first used to adjust the number of channels of the input information, a convolution kernel then processes the integrated information, the convolved features are further processed in a manner similar to dilated convolution, the extracted features are concatenated together, and finally a convolution with a stride of 2 is added to output the processed features of the information;
S46, lightweight multi-scale space-time graph convolution network MS-SGTCN_S: to increase the robustness of the extracted features, two network branches are designed to perform inference on the input joint point data; the first network branch consists of a convolution module, MS-GCN modules and a fully connected layer, with 4 MS-GCN modules in the middle used to extract multi-scale space-time features, realized with different temporal and spatial sliding windows; the second branch consists of an MS-GCN module and two MS-TCN modules, and a long-range temporal module is adopted to strengthen the network's attention to contextual changes of the joint points in the time dimension; the feature information obtained by the two branches is then sent together to an MS-TCN module, the features are spliced together through a fully connected layer, and the category with the maximum probability after processing by the softmax classifier is the predicted human behavior; to further improve the accuracy of the algorithm, a dual-stream network is designed to train on the joint point sequences and the bone sequences respectively, confidence statistics are then performed on the prediction results of the joint and bone streams, and the human behavior with the higher confidence is taken as the final output prediction value.
In step S46, a dual-flow network is designed to train the joint point and the skeleton sequence, and then a confidence statistic is performed on the prediction results of the joint point and the skeleton dual-flow network, and the human behavior with higher confidence is the final output prediction value.
In step S7, in order to reduce the influence of the complex external environment on the working quality of the service robot, the robot is designed to respond to a behavior only after continuously receiving the corresponding behavior signal for more than 2 seconds; for dangerous behaviors, the service robot sends alarm information to remind a worker to handle them.
Compared with the prior art, the invention has the beneficial effects that: the invention can accurately identify the human body behaviors in the scene, and ensures the service quality of the service robot.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.
FIG. 1 is a schematic flow chart of a method according to an embodiment of the present invention;
FIG. 2 is a visualization diagram of a training data skeleton according to an embodiment of the present invention;
FIG. 3 is a body skeleton spatial topology model according to an embodiment of the present invention;
FIG. 4 is a time-space diagram illustrating the change of the skeleton structure of a human body according to an embodiment of the present invention;
FIG. 5 is a block diagram illustrating a MS-GCN multi-scale space-time graph convolution module according to an embodiment of the present invention;
FIG. 6 is a block diagram of an MS-TCN time-dilation convolution module in accordance with an embodiment of the present invention;
FIG. 7 is a multi-scale space-time graph convolution network according to an embodiment of the present invention;
FIG. 8 shows a test set RGB video image test result 1 according to an embodiment of the present invention;
FIG. 9 shows a test set RGB video image test result 2 according to an embodiment of the present invention;
fig. 10 is a result 1 of human behavior recognition in a real scene according to an embodiment of the present invention;
fig. 11 is a result 2 of human behavior recognition in a real scene according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
As shown in fig. 1, an embodiment of the present application provides a behavior recognition method based on a service robot, including the following specific steps:
S1, extracting human body joint point sequences of 13 behavior categories commonly used in the service robot application scene to form a training data set;
S2, preprocessing the training data set, firstly extracting key frames of the joint point sequence, and then optimizing the joint point data in combination with the actual application scene;
S3, for a video shot in a real scene, firstly carrying out key point estimation by adopting the BODY_25 human posture estimation model in OpenPose to obtain 25 key point coordinates and confidence coefficients, then filling key point vacancy values in the obtained key point data by adopting a K nearest neighbor method, and finally carrying out weighted optimization on the joint point data in combination with the actual application scene to output 17 main joint points;
S4, constructing a lightweight multi-scale aggregation space-time graph convolution deep learning neural network model by using multi-scale space-time graph convolution and temporal convolution modules;
S5, training and testing on the data set by using the constructed network model;
S6, identifying human body behaviors in the video image of the real scene to be identified by using the trained model;
and S7, the service robot receives the human behavior recognition result and responds correspondingly.
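As a rough orientation before the detailed steps, the inference-time flow of steps S3 to S7 can be sketched as follows; every function and object name here is an illustrative placeholder and not part of the disclosed method.

```python
# Illustrative sketch of the deployed flow (S3-S7); all helpers are hypothetical placeholders.
def service_robot_loop(video_stream, model, robot):
    for clip in video_stream:                          # frames captured by the robot camera
        keypoints = estimate_keypoints(clip)           # S3: OpenPose BODY_25 keypoints per frame
        keypoints = fill_missing_keypoints(keypoints)  # S3: K-nearest-neighbour gap filling
        joints17 = weight_optimize(keypoints)          # S3: 25 -> 17 weighted joint points
        behavior = model.predict(joints17)             # S4-S6: lightweight MS-SGTCN_S inference
        robot.respond(behavior)                        # S7: debounce, react or raise an alarm
```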
In step S1, the training data is derived from the NTU-RGB+D human behavior data set produced by Nanyang Technological University, Singapore, and 13 daily and medical behaviors are selected: drinking, picking up, throwing away, sitting down, standing up, jumping, shaking head, tumbling, chest pain, waving hands, kicking, hugging and walking, 12324 skeleton files in total.
In step S2, to handle the different video durations of the different action categories, the original data is processed by interval sampling and cyclic repetition from the starting frame: one frame is extracted every 30 frames from each video segment, 200 frames of data are retained as a training sample, and videos yielding fewer than 200 frames are repeatedly sampled from the beginning of the video. A designed algorithm judges the number of people in the joint point data, and only joint point data containing a single person is retained for training and validating the model; specifically, the joint points are counted, and if the total number of joint points is greater than 25, it is judged that an interfering crowd appears in the joint point data, which is then deleted.
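A minimal sketch of the interval sampling with cyclic repetition and the single-person filter described above, assuming each skeleton file has already been parsed into a NumPy array of shape (frames, joints, 2); the stride of 30 and target length of 200 are taken from the text.

```python
import numpy as np

def sample_clip(joints, stride=30, target_len=200):
    """Keep one frame every `stride` frames; if fewer than `target_len` frames
    remain, repeat the sampled sequence from the beginning of the video."""
    sampled = joints[::stride]
    reps = int(np.ceil(target_len / len(sampled)))
    return np.tile(sampled, (reps, 1, 1))[:target_len]

def is_single_person(frame_joints, joints_per_person=25):
    """A frame with more than 25 joint points indicates an interfering crowd."""
    return frame_joints.shape[0] <= joints_per_person
```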
In order to further improve the efficiency of the algorithm and to be compatible with the key point data of the BODY_25 human posture estimation model in OpenPose, the 25 joint points in the original data set are subjected to weighted optimization to remove some joint points that have little influence on recognizing the behavior of the service robot's service object, and the joint point data are re-encoded.
The node set of the training data set is represented by the following formula:

$$V_1=\{v_{ti}\mid t=1,\dots,200;\ i=1,\dots,25\}$$

where $V_1$ is the training data set node set and $v_{ti}$ is the joint point coordinate value taken at time $t$; since the data set has been frame-sampled, the maximum value of $t$ here is set to 200, and $i$ is the joint point serial number, with 25 joint points in total.

The set of 17 joint points after the weighted optimization process is represented by the following formula:

$$V_2=\{v'_{ti}\mid t=1,\dots,200;\ i=1,\dots,17\}$$

where $V_2$ is the joint point set after weighted optimization and $v'_{ti}$ is the joint point coordinate value at time $t$ after weighted optimization; as in the formula above, the maximum value of $t$ is set to 200, and $i$ is the joint point serial number, with 17 joint points in total.

The joint points numbered 1, 3, 4, 6, 7, 9, 10, 11, 12, 13, 14, 15, 16 and 17 in $V_2$ correspond to the joint points numbered 1, 5, 6, 9, 10, 13, 14, 15, 16, 17, 18, 19, 20 and 21 in $V_1$ respectively. The head joint points 3 and 4 in $V_1$ are integrated into joint point 2 of $V_2$ through weighted optimization calculation; the left-hand joint points 7, 8, 22 and 23 in $V_1$ are integrated into joint point 5 of $V_2$; the right-hand joint points 11, 12, 24 and 25 in $V_1$ are integrated into joint point 8 of $V_2$. The weighted calculation formula is as follows:

$$v'=\sum_{i} w_i\, v_{i}$$

where $v'$ is the integrated joint point from the set $V_2$, the $v_{i}$ are the corresponding joint points from the set $V_1$, and the $w_i$ are the joint weighted optimization coefficients.
After re-encoding, 17 joint points are finally output as the training set data, and the resulting human skeleton topology diagram is shown in FIG. 2.
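A sketch of this 25-to-17 re-encoding for a single frame, assuming (x, y) coordinates indexed from 1 as in the text; the merge weights are not disclosed in the description, so uniform averaging is used as a stand-in.

```python
import numpy as np

# Direct correspondences: 17-point index -> original 25-point index (1-based, from the text).
DIRECT_MAP = {1: 1, 3: 5, 4: 6, 6: 9, 7: 10, 9: 13, 10: 14, 11: 15, 12: 16,
              13: 17, 14: 18, 15: 19, 16: 20, 17: 21}
# Groups merged by weighted optimisation: 17-point index -> original indices.
MERGE_MAP = {2: [3, 4], 5: [7, 8, 22, 23], 8: [11, 12, 24, 25]}

def reencode_25_to_17(frame_xy, merge_weights=None):
    """frame_xy: (25, 2) array of joint coordinates; returns a (17, 2) array."""
    out = np.zeros((17, 2), dtype=frame_xy.dtype)
    for dst, src in DIRECT_MAP.items():
        out[dst - 1] = frame_xy[src - 1]
    for dst, srcs in MERGE_MAP.items():
        idx = [s - 1 for s in srcs]
        w = merge_weights[dst] if merge_weights else np.full(len(idx), 1.0 / len(idx))
        out[dst - 1] = np.average(frame_xy[idx], axis=0, weights=w)
    return out
```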
The specific flow of step S3 is:
S31, the person key points in the video image of the real scene are detected by using the OpenPose human body key point detection algorithm model, and the horizontal and vertical coordinate values (x, y) and the confidence coefficient s of the 25 skeletal joint points are obtained with the BODY_25 human body joint point labeling model. The discrete joint points are spliced together according to the physical connection mode of the human body joints to form a human body skeleton spatial topology model, as shown in FIG. 2, and the spatial topology graphs of all frames are then spliced together in time order to finally obtain a space-time graph of human body skeleton structure change, as shown in FIG. 3.
S32, due to the influence of external factors such as lighting, occlusion and changes in the person's behavior, missed detections are difficult to avoid when the OpenPose human posture estimation algorithm is used for key point estimation; there are two cases, missed detection of a whole frame and missing of part of the key points in a certain frame. For the first case, the 0th, 1st and 8th joint points are defined as main key points; if any one of these three groups of data is missing in a certain frame, the whole frame is judged to be a missed detection and the data of that frame is deleted. For the second case, where part of the key points are missing, a 2-order K nearest neighbor method is adopted for filling, taking the mean of the data of the frames before and after that point; this ensures reasonable filling with a small amount of calculation, and complete joint point data is closely tied to the accuracy of human behavior recognition.
S33, in the same manner as in step S2, the 25 joint points in FIG. 3 are re-encoded by the weighted optimization algorithm, and 17 joint points are output.
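A sketch of both missed-detection cases handled in S32 above, assuming the OpenPose output is stored as a (T, 25, 2) array with NaN for undetected joints; the main key points 0, 1 and 8 and the before/after-frame averaging follow the description.

```python
import numpy as np

MAIN_JOINTS = [0, 1, 8]  # nose, neck and mid-hip in BODY_25 numbering

def drop_missed_frames(seq):
    """Whole-frame missed detection: drop any frame whose main key points are missing."""
    keep = [t for t in range(len(seq)) if not np.isnan(seq[t][MAIN_JOINTS]).any()]
    return seq[keep]

def fill_partial_gaps(seq):
    """Partially missing key points: fill with the mean of the same joint in the
    previous and next frames (in the spirit of the 2-order K nearest neighbor filling)."""
    seq = seq.copy()
    for t in range(1, len(seq) - 1):
        missing = np.isnan(seq[t]).any(axis=1)
        seq[t, missing] = (seq[t - 1, missing] + seq[t + 1, missing]) / 2.0
    return seq
```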
The 25 joint point set output by the BODY_25 model of OpenPose human body posture estimation is represented by the following formula:

$$V_3=\{\hat{v}_{ti}\mid t=1,\dots,T;\ i=1,\dots,25\}$$

where $V_3$ is the test joint point set and $\hat{v}_{ti}$ is the joint point coordinate value taken at time $t$; since this data is used for testing, the maximum value of $t$ here is the duration $T$ of the entire video segment, and $i$ is the joint point serial number, with 25 joint points in total.

The joint points numbered 1, 3, 4, 5, 6, 7, 8, 9, 10, 13, 14 and 17 in $V_2$ correspond to the joint points numbered 8, 5, 6, 7, 2, 3, 4, 12, 13, 9, 10 and 1 in $V_3$ respectively. The head joint points 0, 15, 16, 17 and 18 in $V_3$ are integrated into joint point 2 of $V_2$ through weighted optimization calculation; the left-foot joint points 14 and 21 in $V_3$ are integrated into joint point 11 of $V_2$, and joint points 19 and 20 are integrated into joint point 12; the right-foot joint points 11 and 24 in $V_3$ are integrated into joint point 15 of $V_2$, and joint points 22 and 23 are integrated into joint point 16. The weighted calculation formula is as follows:

$$v'=\sum_{i}\beta_i\,\hat{v}_{i}$$

where $v'$ is the integrated joint point from the set $V_2$, the $\hat{v}_{i}$ are the corresponding joint points from the set $V_3$, and the $\beta_i$ are the joint weighted optimization coefficients.
The specific flow of step S4 is:
S41, graph convolution calculation process: after the joint point coordinates are obtained, with the joint points as the vertices $v$ and the natural connections of the joint points as the bone edges $e$, the human skeleton can be represented as a graph $G=(V,E)$; the $T$ frame skeleton graphs are arranged in time order and the same-position joint points are connected to form a space-time skeleton graph. The node set $V$ is the set of all joint points in each skeleton graph, and the calculation process of the graph convolution is as follows:

$$Y=\sum_{k=1}^{K_v} A_k X W_k$$

where $X$ is the input, $Y$ is the output, $A_k$ is the adjacency matrix, and $W_k$ is a learnable weight.
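A minimal PyTorch sketch of this graph convolution for skeleton tensors of shape (N, C, T, V); realizing the learnable weights W_k as 1×1 convolutions is a common implementation choice and an assumption here.

```python
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    """Y = sum_k A_k X W_k over K_v adjacency partitions."""
    def __init__(self, in_channels, out_channels, A):      # A: (K, V, V) adjacency partitions
        super().__init__()
        self.register_buffer("A", A)
        self.conv = nn.Conv2d(in_channels, out_channels * A.size(0), kernel_size=1)

    def forward(self, x):                                   # x: (N, C, T, V)
        n, _, t, v = x.shape
        k = self.A.size(0)
        x = self.conv(x).view(n, k, -1, t, v)               # apply W_k
        return torch.einsum("nkctv,kvw->nctw", x, self.A)   # multiply by A_k and sum over k
```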
S42, adaptive graph convolution calculation process: because the fixed topology of graph convolution is not friendly to joint points that are not physically connected but strongly correlated, researchers have successively proposed adaptive graph convolution. The calculation process is shown in the following formula: on the basis of the original adjacency matrix $A_k$, two new matrices $B_k$ and $C_k$ are added. $B_k$ is a trainable weight on which no constraint such as normalization is imposed, that is, $B_k$ is a parameter learned entirely from the data; it can indicate not only whether two nodes are connected but also the strength of the connection, and its difference from ST-GCN lies in the fusion mode: ST-GCN uses multiplication, whereas addition is used here, which can create associations that do not exist in the physical skeleton. $C_k$ learns a unique graph for each sample and adopts the classical embedded Gaussian function, so that the similarity between joints can be captured:

$$Y=\sum_{k=1}^{K_v}\left(A_k+B_k+C_k\right) X W_k$$
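A sketch of the adaptive variant: B_k is a freely trainable additive matrix and C_k is computed per sample from embedded-Gaussian similarity between joints; the embedding dimension and the 1×1-convolution realization of W_k are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AdaptiveGraphConv(nn.Module):
    """Y = sum_k (A_k + B_k + C_k) X W_k (adaptive adjacency in the 2s-AGCN spirit)."""
    def __init__(self, in_channels, out_channels, A, embed_channels=16):
        super().__init__()
        k, v, _ = A.shape
        self.register_buffer("A", A)
        self.B = nn.Parameter(torch.zeros(k, v, v))            # learned without constraints
        self.theta = nn.Conv2d(in_channels, k * embed_channels, 1)
        self.phi = nn.Conv2d(in_channels, k * embed_channels, 1)
        self.conv = nn.Conv2d(in_channels, out_channels * k, 1)

    def forward(self, x):                                      # x: (N, C, T, V)
        n, _, t, v = x.shape
        k = self.A.size(0)
        th = self.theta(x).view(n, k, -1, v)                   # embedded joints
        ph = self.phi(x).view(n, k, -1, v)
        C = torch.softmax(torch.einsum("nkcv,nkcw->nkvw", th, ph), dim=-1)  # per-sample graph
        adj = self.A.unsqueeze(0) + self.B.unsqueeze(0) + C    # A_k + B_k + C_k
        y = self.conv(x).view(n, k, -1, t, v)                  # apply W_k
        return torch.einsum("nkctv,nkvw->nctw", y, adj)
```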
S43, multi-scale space-time graph convolution calculation process: to better connect the spatial and temporal skeleton information, the $k$-hop adjacency matrices of each node are tiled to form one large matrix $\bar{A}$; each node in $\bar{A}$ is directly connected with its corresponding neighbor nodes on all frames, so that jump connections between nodes are realized; the calculation process is as follows:

$$Y=\bar{A}\, X\, W$$

S44, MS-GCN multi-scale space-time graph convolution module: for the input node information, the $k$-hop adjacency matrices are extracted respectively, and finally the $K$ matrices are spliced together.
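A sketch of how the k-hop adjacency matrices used by the MS-GCN module can be built from shortest hop distances on the skeleton graph; the convention of "distance exactly k, with a self-loop layer at k = 0" is an assumption consistent with the shortest-hop-distance definition given in the summary.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

def k_hop_adjacencies(edges, num_joints, max_hop):
    """edges: list of (i, j) bone connections (0-based). Returns a
    (max_hop + 1, V, V) stack where layer k marks pairs at hop distance exactly k."""
    A = np.zeros((num_joints, num_joints))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    dist = shortest_path(A, unweighted=True)      # hop distance between every joint pair
    return np.stack([(dist == k).astype(np.float32) for k in range(max_hop + 1)])
```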
S45, MS-TCN temporal dilated convolution module: a convolution is first used to adjust the number of channels of the input information, a convolution kernel then processes the integrated information, the convolved features are further processed in a manner similar to dilated convolution and the extracted features are concatenated together, and finally a convolution with a stride of 2 is added to output the processed information; this convolution also has a certain correction effect on the extracted features.
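A sketch of a multi-scale temporal convolution block matching this description: a 1×1 channel-adjustment convolution, parallel dilated temporal branches whose outputs are concatenated, and a final stride-2 1×1 output convolution; the branch count, kernel sizes and dilation rates are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiScaleTCN(nn.Module):
    def __init__(self, in_channels, out_channels, dilations=(1, 2, 3, 4), kernel_t=3):
        super().__init__()
        branch_c = out_channels // len(dilations)
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_channels, branch_c, 1),          # 1x1 channel adjustment
                nn.BatchNorm2d(branch_c),
                nn.ReLU(inplace=True),
                nn.Conv2d(branch_c, branch_c, (kernel_t, 1),  # dilated temporal convolution
                          padding=(d * (kernel_t - 1) // 2, 0), dilation=(d, 1)),
            ) for d in dilations
        ])
        self.out = nn.Conv2d(branch_c * len(dilations), out_channels, 1, stride=(2, 1))

    def forward(self, x):                                     # x: (N, C, T, V)
        y = torch.cat([branch(x) for branch in self.branches], dim=1)
        return self.out(y)                                    # stride-2 1x1 output convolution
```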
S46, lightweight multi-scale space-time graph convolution network MS-SGTCN_S: to increase the robustness of the extracted features, two network branches are designed to perform inference on the input joint point data. The first network branch consists of a convolution module, MS-GCN modules and a fully connected layer, with 4 MS-GCN modules in the middle used to extract multi-scale space-time features, realized with different temporal and spatial sliding windows; the second branch consists of an MS-GCN module and two MS-TCN modules, and a long-range temporal module is adopted to strengthen the network's attention to contextual changes of the joint points in the time dimension. The feature information obtained by the two branches is then sent together to an MS-TCN module, the features are spliced together through a fully connected layer, and the category with the maximum probability after processing by the softmax classifier is the predicted human behavior. To further improve the accuracy of the algorithm, a dual-stream network is designed to train on the joint point sequences and the bone sequences respectively, confidence statistics are then performed on the prediction results of the joint and bone streams, and the human behavior with the higher confidence is taken as the final output prediction value.
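A sketch of the dual-stream fusion: the bone stream is formed from differences of connected joints, each stream produces softmax scores, and the class predicted by the more confident stream is kept; the edge list and the exact fusion rule are assumptions consistent with the description.

```python
import torch

def bones_from_joints(joints, edges):
    """joints: (N, C, T, V); the bone stream is child joint minus parent joint."""
    bones = torch.zeros_like(joints)
    for child, parent in edges:
        bones[..., child] = joints[..., child] - joints[..., parent]
    return bones

def fuse_streams(joint_logits, bone_logits):
    """Keep, per sample, the class predicted by the stream with the higher confidence."""
    j_conf, j_cls = joint_logits.softmax(-1).max(-1)
    b_conf, b_cls = bone_logits.softmax(-1).max(-1)
    return torch.where(j_conf >= b_conf, j_cls, b_cls)
```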
In step S5, the constructed network model is used to train and test on the data set.
The dual-stream behavior recognition model is implemented based on PyTorch and run under CUDA 11.1 on a 3080 Ti GPU. A mini-batch stochastic gradient descent algorithm is used to learn the network parameters, the batch size is set to 32, the momentum to 0.9 and the initial learning rate to 0.05; the learning rate is reduced at the 25th and 35th training epochs, and the weight decay is set to 0.0005. The accuracy under the X-Sub and X-View split rules is shown in the following table:
[Table: recognition accuracy of the proposed model under the X-Sub and X-View evaluation splits]
in order to verify that the designed network can reduce the false recognition of similar human behaviors, the accuracy rates of 13 human behaviors are respectively counted, and the results are shown in the following table:
[Table: recognition accuracy for each of the 13 human behavior categories]
the accuracy of recognizing three human behaviors with high similarity, namely drinking, shaking head and chest pain, still reaches more than 95%, and the designed lightweight multi-scale space-time diagram convolutional network can reduce the false recognition of similar human behaviors.
The results of testing the test set of RGB video images are shown in fig. 8 and 9 below;
the result of identifying the human behavior in the real scene is shown in fig. 10 and fig. 11, and thus, the multi-scale space-time graph convolution model constructed by the invention can quickly and effectively identify the human behavior in the real scene, and the service quality of the service robot is ensured.
In order to reduce the influence of the complex external environment on the working quality of the service robot, the robot is designed to react to a behavior only after continuously receiving the corresponding behavior signal for more than 2 seconds. For some dangerous behaviors, the service robot sends alarm information to remind a worker to handle them. Combined with the positioning and obstacle-avoidance technologies of the service robot, for example, when the service robot receives a hand-waving action, the robot can move to the customer to provide service; combined with face recognition technology, guest information can be registered and the guest can be guided to a designated seat for service.
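A sketch of the 2-second debounce and alarm logic described above; the behavior labels, the set of dangerous behaviors and the robot interface are illustrative placeholders.

```python
import time

DANGEROUS_BEHAVIORS = {"tumbling", "chest pain"}   # illustrative choices

class BehaviorResponder:
    """React only when the same behavior has been reported continuously for 2 seconds."""
    def __init__(self, robot, hold_seconds=2.0):
        self.robot, self.hold = robot, hold_seconds
        self.current, self.since = None, None

    def on_recognition(self, behavior):
        now = time.monotonic()
        if behavior != self.current:               # behavior changed: restart the timer
            self.current, self.since = behavior, now
            return
        if now - self.since >= self.hold:
            if behavior in DANGEROUS_BEHAVIORS:
                self.robot.send_alarm(behavior)    # remind a worker to handle it
            else:
                self.robot.respond(behavior)       # e.g. approach the person on a wave
            self.since = now                       # avoid immediate re-triggering
```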
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (7)

1. A behavior identification method based on a service robot is characterized by comprising the following specific steps:
S1, extracting human body joint point sequences of 13 behavior categories commonly used in the service robot application scene to form a training data set;
S2, preprocessing the training data set, firstly extracting key frames of the joint point sequence, and then optimizing the joint point data in combination with the actual application scene;
S3, for a video shot in a real scene, firstly carrying out key point estimation by adopting the BODY_25 human posture estimation model in OpenPose to obtain 25 key point coordinates and confidence coefficients, then filling key point vacancy values in the obtained key point data by adopting a K nearest neighbor method, and finally carrying out weighted optimization on the joint point data in combination with the actual application scene to output 17 main joint points;
S4, constructing a lightweight multi-scale aggregation space-time graph convolution deep learning neural network model by using multi-scale space-time graph convolution and temporal convolution modules;
S5, training and testing on the data set by using the constructed network model;
S6, identifying human body behaviors in the video image of the real scene to be identified by using the trained model;
and S7, the service robot receives the human behavior recognition result and responds correspondingly.
2. The behavior recognition method based on the service robot as claimed in claim 1, wherein the training data set in step S1 is derived from NTU-RGB + D human behavior data set, and 13 behavior categories are selected: drinking, picking up, throwing away, sitting down, standing up, jumping, shaking head, tumbling, chest pain, waving hands, kicking, hugging and walking, 12324 skeleton files in total.
3. The service robot-based behavior recognition method as claimed in claim 1, wherein the step S2 of performing key frame extraction on the skeleton sequence comprises:
one frame is extracted every 30 frames from each video segment corresponding to the different behavior categories in the service robot application scene, 300 frames of data are retained as a training sample, videos with fewer than 300 frames are repeatedly sampled from the beginning of the video, the number of people in the joint point data is judged, and only joint point data containing a single person is retained for training and validating the model.
4. The service robot-based behavior recognition method according to claim 1, wherein the step S3 specifically comprises:
S31, detecting the person key points in the video image of the real scene by using the OpenPose human body key point detection algorithm model, obtaining the horizontal and vertical coordinate values (x, y) of 25 skeletal joint points with the BODY_25 human body joint point labeling model, splicing the discrete joint points according to the physical connection mode of the human body joints to form a human body skeleton spatial topology model, and then splicing the spatial topology graphs of all frames in time order to finally obtain a space-time graph of human body skeleton structure change;
S32, for missed detection of a whole frame of data, the 0th, 1st and 8th joint points are defined as main key points; if any one of these three groups of data is missing in a certain frame of the joint point data output for the video image, the whole frame is judged to be a missed detection and the joint point data corresponding to that video frame is deleted; for the case where part of the key points of a certain frame are missing, a 2-order K nearest neighbor method is adopted for filling, which requires no training or parameter estimation and directly takes the average of the horizontal and vertical coordinate values (x, y) of the frames before and after that point as the supplement.
5. The service robot-based behavior recognition method according to claim 1, wherein the step S4 specifically comprises:
S41, graph convolution calculation process: after the joint point coordinates are obtained, the joint points are taken as vertices and the natural connections between joint points as bone edges, so that the human skeleton is represented as a graph $G=(V,E)$; the $T$ frame skeleton graphs are arranged in time order $t=1,\dots,T$ and the same-position joint points are connected to form a space-time skeleton graph; the node set $V=\{v_{ti}\mid t=1,\dots,T;\ i=1,\dots,N\}$ is the set of all joint points in each skeleton graph, where $N$ is the number of joints per frame; the edge set $E$ is represented by two subsets, the first subset representing the intra-skeleton connections of each frame, written as $E_S=\{v_{ti}v_{tj}\mid (i,j)\in H\}$, where $H$ is the set of naturally connected human joints, and the second subset representing the connecting edges of identically located joint points between adjacent frames, written as $E_F=\{v_{ti}v_{(t+1)i}\}$, where $i$ is the serial number of the joint point; from the node set $V$ and the edge set $E$ the adjacency matrix $A$ can be obtained, and the graph convolution is calculated as follows:

$$Y=\sum_{k=1}^{K_v} A_k X W_k$$

where $X$ is the input, $Y$ is the output, $A_k$ is the adjacency matrix, $W_k$ is a learnable weight, and $K_v$ is the spatial dimension kernel size;
S42, adaptive graph convolution calculation process: as shown in the following formula, on the basis of $A_k$, two matrices $B_k$ and $C_k$ are newly added, where $B_k$ is a trainable weight and $C_k$ learns a unique graph for each sample:

$$Y=\sum_{k=1}^{K_v}\left(A_k+B_k+C_k\right) X W_k$$

S43, multi-scale space-time graph convolution calculation process: to better connect the spatial and temporal skeleton information, the $k$-hop adjacency matrices of each node are tiled to form one large matrix $\bar{A}$; each node in $\bar{A}$ is directly connected with its corresponding neighbor nodes on all frames, so that jump connections between nodes are realized; the calculation process is as follows:

$$Y=\bar{A}\, X\, W$$

S44, MS-GCN multi-scale space-time graph convolution module: for the input node information, the $k$-hop adjacency matrices are extracted respectively, and finally the $K$ matrices are spliced together, where $i$ is the serial number of the joint point, $v_i$ is the coordinates of the joint point, and $d(v_i,v_j)$ represents the shortest hop distance between nodes $v_i$ and $v_j$;
S45, MS-TCN temporal dilated convolution module: a convolution is first used to adjust the number of channels of the input information, a convolution kernel then processes the integrated information, the convolved features are further processed in a manner similar to dilated convolution, the extracted features are concatenated together, and finally a convolution with a stride of 2 is added to output the processed features of the information;
S46, lightweight multi-scale space-time graph convolution network MS-SGTCN_S: to increase the robustness of the extracted features, two network branches are designed to perform inference on the input joint point data; the first network branch consists of a convolution module, MS-GCN modules and a fully connected layer, with 4 MS-GCN modules in the middle used to extract multi-scale space-time features, realized with different temporal and spatial sliding windows; the second branch consists of an MS-GCN module and two MS-TCN modules, and a long-range temporal module is adopted to strengthen the network's attention to contextual changes of the joint points in the time dimension; the feature information obtained by the two branches is then sent together to an MS-TCN module, the features are spliced together through a fully connected layer, and the category with the maximum probability after processing by the softmax classifier is the predicted human behavior; to further improve the accuracy of the algorithm, a dual-stream network is designed to train on the joint point sequences and the bone sequences respectively, confidence statistics are then performed on the prediction results of the joint and bone streams, and the human behavior with the higher confidence is taken as the final output prediction value.
6. The behavior recognition method based on the service robot as claimed in claim 5, wherein in step S46, a dual-flow network is designed to train the joint point and the skeleton sequence, and then a confidence statistic is performed on the prediction results of the joint point and the skeleton dual-flow network, and the human behavior with higher confidence is the final predicted value.
7. The behavior recognition method based on the service robot as claimed in claim 1, wherein in step S7, in order to reduce the influence of the complex external environment on the working quality of the service robot, the robot is designed to respond to a behavior only after continuously receiving the corresponding behavior signal for more than 2 seconds, and for dangerous behaviors the service robot sends alarm information to remind a worker to handle them.
CN202210484610.8A 2022-05-06 2022-05-06 Behavior recognition method based on service robot Active CN114582030B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210484610.8A CN114582030B (en) 2022-05-06 2022-05-06 Behavior recognition method based on service robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210484610.8A CN114582030B (en) 2022-05-06 2022-05-06 Behavior recognition method based on service robot

Publications (2)

Publication Number Publication Date
CN114582030A true CN114582030A (en) 2022-06-03
CN114582030B CN114582030B (en) 2022-07-22

Family

ID=81769365

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210484610.8A Active CN114582030B (en) 2022-05-06 2022-05-06 Behavior recognition method based on service robot

Country Status (1)

Country Link
CN (1) CN114582030B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114881179A (en) * 2022-07-08 2022-08-09 济南大学 Intelligent experiment method based on intention understanding
CN115035596A (en) * 2022-06-05 2022-09-09 东北石油大学 Behavior detection method and apparatus, electronic device, and storage medium
CN115586834A (en) * 2022-11-03 2023-01-10 天津大学温州安全(应急)研究院 Intelligent cardio-pulmonary resuscitation training system
CN115810203A (en) * 2022-12-19 2023-03-17 天翼爱音乐文化科技有限公司 Obstacle avoidance identification method, system, electronic equipment and storage medium
CN116386087A (en) * 2023-03-31 2023-07-04 阿里巴巴(中国)有限公司 Target object processing method and device
CN116665312A (en) * 2023-08-02 2023-08-29 烟台大学 Man-machine cooperation method based on multi-scale graph convolution neural network

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109064471A (en) * 2018-07-18 2018-12-21 中北大学 A kind of three-dimensional point cloud model dividing method based on skeleton
CN111652124A (en) * 2020-06-02 2020-09-11 电子科技大学 Construction method of human behavior recognition model based on graph convolution network
CN112949569A (en) * 2021-03-25 2021-06-11 南京邮电大学 Effective extraction method of human body posture points for tumble analysis
CN113657349A (en) * 2021-09-01 2021-11-16 重庆邮电大学 Human body behavior identification method based on multi-scale space-time graph convolutional neural network
WO2022000420A1 (en) * 2020-07-02 2022-01-06 浙江大学 Human body action recognition method, human body action recognition system, and device
CN114187653A (en) * 2021-11-16 2022-03-15 复旦大学 Behavior identification method based on multi-stream fusion graph convolution network
CN114220176A (en) * 2021-12-22 2022-03-22 南京华苏科技有限公司 Human behavior recognition method based on deep learning
CN114399648A (en) * 2022-01-17 2022-04-26 Oppo广东移动通信有限公司 Behavior recognition method and apparatus, storage medium, and electronic device
US20220138536A1 (en) * 2020-10-29 2022-05-05 Hong Kong Applied Science And Technology Research Institute Co., Ltd Actional-structural self-attention graph convolutional network for action recognition

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109064471A (en) * 2018-07-18 2018-12-21 中北大学 A kind of three-dimensional point cloud model dividing method based on skeleton
CN111652124A (en) * 2020-06-02 2020-09-11 电子科技大学 Construction method of human behavior recognition model based on graph convolution network
WO2022000420A1 (en) * 2020-07-02 2022-01-06 浙江大学 Human body action recognition method, human body action recognition system, and device
US20220138536A1 (en) * 2020-10-29 2022-05-05 Hong Kong Applied Science And Technology Research Institute Co., Ltd Actional-structural self-attention graph convolutional network for action recognition
CN112949569A (en) * 2021-03-25 2021-06-11 南京邮电大学 Effective extraction method of human body posture points for tumble analysis
CN113657349A (en) * 2021-09-01 2021-11-16 重庆邮电大学 Human body behavior identification method based on multi-scale space-time graph convolutional neural network
CN114187653A (en) * 2021-11-16 2022-03-15 复旦大学 Behavior identification method based on multi-stream fusion graph convolution network
CN114220176A (en) * 2021-12-22 2022-03-22 南京华苏科技有限公司 Human behavior recognition method based on deep learning
CN114399648A (en) * 2022-01-17 2022-04-26 Oppo广东移动通信有限公司 Behavior recognition method and apparatus, storage medium, and electronic device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LEI SHI et al.: "Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition", arXiv:1805.07694v3 *
ZIYU LIU et al.: "Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition", IEEE *
ZHENG Shiyu: "Research on Human Action Recognition Method Based on Adaptive Spatio-Temporal Fusion Graph Convolutional Network", China Master's Theses Full-text Database, Information Science and Technology Series *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115035596A (en) * 2022-06-05 2022-09-09 东北石油大学 Behavior detection method and apparatus, electronic device, and storage medium
CN115035596B (en) * 2022-06-05 2023-09-08 东北石油大学 Behavior detection method and device, electronic equipment and storage medium
CN114881179A (en) * 2022-07-08 2022-08-09 济南大学 Intelligent experiment method based on intention understanding
CN114881179B (en) * 2022-07-08 2022-09-06 济南大学 Intelligent experiment method based on intention understanding
CN115586834A (en) * 2022-11-03 2023-01-10 天津大学温州安全(应急)研究院 Intelligent cardio-pulmonary resuscitation training system
CN115810203A (en) * 2022-12-19 2023-03-17 天翼爱音乐文化科技有限公司 Obstacle avoidance identification method, system, electronic equipment and storage medium
CN115810203B (en) * 2022-12-19 2024-05-10 天翼爱音乐文化科技有限公司 Obstacle avoidance recognition method, system, electronic equipment and storage medium
CN116386087A (en) * 2023-03-31 2023-07-04 阿里巴巴(中国)有限公司 Target object processing method and device
CN116386087B (en) * 2023-03-31 2024-01-09 阿里巴巴(中国)有限公司 Target object processing method and device
CN116665312A (en) * 2023-08-02 2023-08-29 烟台大学 Man-machine cooperation method based on multi-scale graph convolution neural network
CN116665312B (en) * 2023-08-02 2023-10-31 烟台大学 Man-machine cooperation method based on multi-scale graph convolution neural network

Also Published As

Publication number Publication date
CN114582030B (en) 2022-07-22

Similar Documents

Publication Publication Date Title
CN114582030B (en) Behavior recognition method based on service robot
CN109829436B (en) Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network
CN108830252B (en) Convolutional neural network human body action recognition method fusing global space-time characteristics
CN107463949B (en) Video action classification processing method and device
CN108537743B (en) Face image enhancement method based on generation countermeasure network
CN110472554A (en) Table tennis action identification method and system based on posture segmentation and crucial point feature
CN110569795A (en) Image identification method and device and related equipment
CN110472612B (en) Human behavior recognition method and electronic equipment
CN110414432A (en) Training method, object identifying method and the corresponding device of Object identifying model
CN110472604B (en) Pedestrian and crowd behavior identification method based on video
CN109685037B (en) Real-time action recognition method and device and electronic equipment
CN107256386A (en) Human behavior analysis method based on deep learning
CN111274916A (en) Face recognition method and face recognition device
CN112131908A (en) Action identification method and device based on double-flow network, storage medium and equipment
CN110070029A (en) A kind of gait recognition method and device
CN113128424B (en) Method for identifying action of graph convolution neural network based on attention mechanism
CN116343330A (en) Abnormal behavior identification method for infrared-visible light image fusion
CN110765839B (en) Multi-channel information fusion and artificial intelligence emotion monitoring method for visible light facial image
CN112200176B (en) Method and system for detecting quality of face image and computer equipment
CN114529984A (en) Bone action recognition method based on learnable PL-GCN and ECLSTM
CN113516005A (en) Dance action evaluation system based on deep learning and attitude estimation
CN113312973A (en) Method and system for extracting features of gesture recognition key points
CN112906520A (en) Gesture coding-based action recognition method and device
CN113239885A (en) Face detection and recognition method and system
CN111797705A (en) Action recognition method based on character relation modeling

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant