CN115294652B - Behavior similarity calculation method and system based on deep learning - Google Patents

Behavior similarity calculation method and system based on deep learning

Info

Publication number
CN115294652B
CN115294652B (application CN202210939966.6A)
Authority
CN
China
Prior art keywords
limb
action
corrected
standard
offset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210939966.6A
Other languages
Chinese (zh)
Other versions
CN115294652A (en)
Inventor
孙昌霞
司海平
李飞涛
郭玉峰
王云鹏
李艳玲
王晓平
费尔南多.巴桑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan Agricultural University
Original Assignee
Henan Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan Agricultural University
Priority to CN202210939966.6A
Publication of CN115294652A
Application granted
Publication of CN115294652B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/23 Recognition of whole body movements, e.g. for sport training
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Biophysics (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Psychiatry (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a behavior similarity calculation method and system based on deep learning. The method comprises the following steps: respectively determining coordinate information of limb key points of a standard action and an action to be corrected according to the first image data and the second image data based on the human body posture estimation model; respectively calculating the limb offset of the standard action and the limb offset of the action to be corrected at each limb key point according to the coordinate information of the limb key point of the standard action and the coordinate information of the limb key point of the action to be corrected; determining a limb offset angle between the action to be corrected and the standard action at each limb key point according to the limb offset of the standard action at each limb key point and the limb offset of the action to be corrected; and determining the comprehensive similarity between the action to be corrected and the standard action according to the limb offset angle between the action to be corrected and the standard action at each limb key point.

Description

Behavior similarity calculation method and system based on deep learning
Technical Field
The application relates to the technical field of image and graphic processing, in particular to a behavior similarity calculation method and system based on deep learning.
Background
With the development of Internet technology, computer-vision-based teaching methods have become increasingly integrated with physical-education teaching. In traditional dance teaching, learners' actions could be corrected only through the teacher's repeated demonstrations, so that students correctly master the content of the dance lessons being taught. In current dance teaching, however, whether a student's demonstrated action is standard depends mainly on the teacher's subjective evaluation, and quantitative, objective evaluation indexes are lacking. It is therefore desirable to provide a solution to the above deficiencies of the prior art.
Disclosure of Invention
The present application is directed to a method and system for behavior similarity calculation based on deep learning, so as to solve or alleviate the above problems in the prior art.
In order to achieve the above purpose, the present application provides the following technical solutions:
the application provides a behavior similarity calculation method based on deep learning, which comprises the following steps: step S101, respectively determining coordinate information of limb key points of a standard action and an action to be corrected according to first image data and second image data based on a pre-constructed human body posture estimation model; wherein the limb key points are determined according to the human skeleton visual image; the first image data represents an image of the standard action, and the second image data represents an image of the action to be corrected; step S102, respectively calculating the limb offset of the standard action and the limb offset of the action to be corrected at each limb key point according to the coordinate information of the limb key point of the standard action and the coordinate information of the limb key point of the action to be corrected; step S103, determining a limb offset angle between the action to be corrected and the standard action at each limb key point according to the limb offset of the standard action and the limb offset of the action to be corrected at each limb key point; step S104, determining comprehensive similarity between the action to be corrected and the standard action according to the limb offset angle between the action to be corrected and the standard action at each limb key point.
Preferably, in step S101, based on a HigherHRNet model, human posture estimation is performed on the standard motion in the first image data and on the motion to be corrected in the second image data, and the coordinate information of the limb key points of the standard motion and of the motion to be corrected is determined correspondingly.
Preferably, the performing, based on the HigherHRNet model, human posture estimation on the standard motion in the first image data and the motion to be corrected in the second image data, and correspondingly determining the coordinate information of the limb key points of the standard motion and of the motion to be corrected, includes: using a pre-trained HigherHRNet model, starting from 1/32 resolution, gradually increasing the resolution of the feature maps of the first image data and the second image data to 1/4 using bilinear upsampling with lateral connections; starting directly from the 1/4 resolution, generating higher-resolution feature maps by deconvolution; and, from the higher-resolution feature maps generated by deconvolution, generating scale-aware high-resolution heat maps of the first image data and the second image data respectively through a multi-resolution heat map aggregation strategy, so as to determine the coordinate information of the limb key points.
Preferably, in step S102, the limb offset α_im at each limb key point is determined according to formula (1). (The formula itself appears only as an image in the source and is not reproduced here.)
Wherein i ∈ {1, 2}, i = 1 denotes the standard action and i = 2 denotes the action to be corrected; m denotes the index of the limb key point and is a positive integer; (x, y) denotes the coordinate information of the limb key points.
Preferably, in step S103, when |α_1m| + |α_2m| ≤ 180°, the limb offset angle between the standard action and the action to be corrected at each limb key point is determined according to the formula:
γ_k = |α_1m| + |α_2m|
and when |α_1m| + |α_2m| > 180°, according to the formula:
γ_k = 360° - (|α_1m| + |α_2m|)
wherein m and k are positive integers with k < m; γ_k denotes the limb offset angle between the standard action and the action to be corrected at the kth limb part, which corresponds to the mth limb key point; α_1m denotes the limb offset at the mth limb key point in the standard action; α_2m denotes the limb offset at the mth limb key point in the action to be corrected.
Preferably, in step S104, the comprehensive similarity δ between the standard action and the action to be corrected is determined according to the formula:
δ = (12 × 180° - Σ_{k=1}^{12} γ_k) / (12 × 180°) × 100
wherein γ_k denotes the limb offset angle between the action to be corrected and the standard action at the kth limb part, which corresponds to the mth limb key point; m denotes the index of the limb key point and is a positive integer.
Preferably, the method further comprises: in response to determining that the comprehensive similarity between the action to be corrected and the standard action is smaller than a preset similarity threshold, performing action correction on the limb key point corresponding to the maximum limb offset angle between the action to be corrected and the standard action among all limb key points.
The embodiment of the present application further provides a behavior similarity calculation system based on deep learning, including: the limb coordinate determination unit is configured to respectively determine coordinate information of limb key points of a standard action and an action to be corrected according to the first image data and the second image data based on a human body posture estimation model established in advance; the limb key points are determined according to the human skeleton visual image; the first image data represents an image of the standard action, and the second image data represents an image of the action to be corrected; the offset calculation unit is configured to calculate the limb offset of the standard action and the limb offset of the action to be corrected at each limb key point according to the coordinate information of the limb key point of the standard action and the coordinate information of the limb key point of the action to be corrected; the offset angle calculation unit is configured to determine a limb offset angle between the action to be corrected and the standard action at each limb key point according to the limb offset of the standard action and the limb offset of the action to be corrected at each limb key point; and the similarity calculation unit is configured to determine comprehensive similarity between the action to be corrected and the standard action according to limb offset angles between the action to be corrected and the standard action at each limb key point.
Beneficial effects:
In the deep-learning-based behavior similarity calculation scheme, first, based on a pre-constructed human body posture estimation model, the coordinate information of the limb key points of the standard action and of the action to be corrected is determined from the image of the standard action and the image of the action to be corrected, respectively. Second, the limb offset of the standard action and of the action to be corrected at each limb key point is calculated from that coordinate information. Next, the limb offset angle between the action to be corrected and the standard action is calculated from the limb offsets of the standard action and of the action to be corrected at each limb key point. Finally, the comprehensive similarity between the action to be corrected and the standard action is determined from the limb offset angles at each limb key point, and a corresponding posture-correction suggestion is given according to the comprehensive similarity. In this way, the standardness of actions in the teaching of international standard dance and the like can be quantified effectively, and timely feedback and correction suggestions can be given to the learner based on the calculated results, helping the learner adjust the action posture quickly and improving learning efficiency.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application. Wherein:
fig. 1 is a schematic flow chart of a behavior similarity calculation method based on deep learning according to some embodiments of the present application;
FIG. 2 is a schematic diagram of the division of limb keypoints provided according to some embodiments of the present application;
fig. 3 is a schematic structural diagram of a behavior similarity calculation system based on deep learning according to some embodiments of the present application.
Detailed Description
The present application will be described in detail below with reference to the embodiments with reference to the attached drawings. The various examples are provided by way of explanation of the application and are not limiting of the application. In fact, it will be apparent to those skilled in the art that modifications and variations can be made in the present application without departing from the scope or spirit of the application. For instance, features illustrated or described as part of one embodiment, can be used with another embodiment to yield a still further embodiment. It is therefore intended that the present application cover the modifications and variations of this invention provided they come within the scope of the appended claims and their equivalents.
For convenience of explanation, dance movements are taken as the example throughout: the standard dance movement described below is the standard action, and the learner's dance movement is the action to be corrected.
At present, research related to dance-motion similarity mostly concerns the recognition of dance motions: for example, images of dance motions are first processed with image-processing methods, the processed images are then used to train a supervised-learning Support Vector Machine (SVM) classifier, and the dance motions are finally classified and recognized. Such methods mainly identify dance motions; they neither measure or judge how standard a dance motion is, nor provide correction suggestions for non-standard dance motions.
On this basis, the applicant provides a deep-learning-based behavior similarity calculation technique. First, a bottom-up human body posture estimation model computes the human body key points of the dancer (learner). Then, the offset angles of the dancer's limbs are calculated from the specific coordinates of these key points, and a dance-action similarity calculation method is designed over the obtained offset angles. The standardness of the dancer's movements is then evaluated quantitatively, and finally corrections to the dancer's movements are suggested according to the calculated results, realizing automatic analysis of and feedback on the dancer's movements. With this technique, the standardness of learners' movements in the teaching of international standard dance and the like can be quantified effectively, and timely feedback and improvement suggestions help learners adjust their dance postures quickly and improve their dance-learning efficiency.
Exemplary method
Fig. 1 is a schematic flow chart of a behavior similarity calculation method based on deep learning according to some embodiments of the present application; FIG. 2 is a schematic diagram of the division of limb keypoints provided according to some embodiments of the present application; as shown in fig. 1 and 2, the behavior similarity calculation method based on deep learning includes:
and S101, respectively determining coordinate information of the key points of the limbs of the standard action and the action to be corrected according to the first image data and the second image data based on a pre-constructed human posture estimation model.
Wherein, the limb key points are determined according to the human skeleton visual image; the first image data represents an image of a standard dance movement (i.e., standard movement) and the second image data represents an image of a learner dance movement (i.e., movement to be corrected).
In the present application, the human body posture estimation model is constructed on a deep learning framework. A top-down posture estimation method may be adopted, such as Convolutional Pose Machines (CPM), the multi-person pose recognition framework AlphaPose, or the High-Resolution Network (HRNet); a bottom-up posture estimation method may also be adopted, such as OpenPose or HigherHRNet.
In the present application, the images of the standard dance movements correspond one-to-one to the images of the learner's dance movements; that is, each movement in a dance comprises one image of the standard dance movement and one image of the learner's dance movement, and each such pair forms a group of images, so that the learner's dance movements can be corrected one by one. It can be understood that the dance movements are collected from the international standard dances (including the Latin and modern dances), mainly the rumba, cha-cha, jive, samba, and paso doble among the Latin dances, and the waltz among the modern dances.
In the process of determining the coordinate information of the limb key points, the limb key point coordinates of the standard dance movement and of the learner's dance movement are estimated by a bottom-up method. Specifically, based on the HigherHRNet model, human body posture estimation is performed on the standard dance movement in the first image data and on the learner's dance movement in the second image data, and the coordinate information of the limb key points of each is determined correspondingly.
It can be understood that the limb key points of the standard dance movement correspond one-to-one to those of the learner's dance movement, and that the limb key points are determined from the human skeleton visual image; as shown in fig. 2, the skeleton visual image may be that of either the standard dance movement or the learner's dance movement. For convenience of representation, S = {s_1, s_2, s_3, ..., s_15} denotes the limb key points, and calculation is performed for a total of 15 limb key points s_m (m = 1, 2, 3, ..., 15). From top to bottom, these 15 limb key points are: the nose, the left and right shoulder joints, the middle of the shoulders, the left and right elbow joints, the left and right hands, the left and right hip joints, the middle of the hips, the left and right knee joints, and the left and right feet. According to the 15 limb key points, the human body is divided into 12 limb parts (k = 1, 2, ..., 12), namely: nose-middle of the shoulders, left shoulder-right shoulder, middle of the shoulders-middle of the hips, left hip-right hip, left shoulder-left elbow, left elbow-left hand, right shoulder-right elbow, right elbow-right hand, left hip-left knee, left knee-left foot, right hip-right knee, and right knee-right foot. This effectively simplifies the calculation of the limb offset angles.
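The 15 limb key points and 12 limb parts described above can be written out as plain Python data structures. This is an illustrative encoding only; the key-point names and their ordering are one plausible assignment, since the patent does not fix a numbering.

```python
# The 15 limb key points s_1 .. s_15 (names and order are illustrative).
KEYPOINTS = [
    "nose", "left_shoulder", "right_shoulder", "mid_shoulder",
    "left_elbow", "right_elbow", "left_hand", "right_hand",
    "left_hip", "right_hip", "mid_hip",
    "left_knee", "right_knee", "left_foot", "right_foot",
]

# The 12 limb parts (k = 1 .. 12); each part joins two key points.
LIMB_PARTS = [
    ("nose", "mid_shoulder"),
    ("left_shoulder", "right_shoulder"),
    ("mid_shoulder", "mid_hip"),
    ("left_hip", "right_hip"),
    ("left_shoulder", "left_elbow"),
    ("left_elbow", "left_hand"),
    ("right_shoulder", "right_elbow"),
    ("right_elbow", "right_hand"),
    ("left_hip", "left_knee"),
    ("left_knee", "left_foot"),
    ("right_hip", "right_knee"),
    ("right_knee", "right_foot"),
]
```

Every limb part references exactly two of the 15 key points, which is what makes the per-part offset-angle calculation below well defined.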
In the present application, a pre-trained HigherHRNet model performs human body posture estimation on the first image data and the second image data respectively, yielding the specific position information of the limb key points. Specifically, starting from 1/32 resolution, the pre-trained HigherHRNet model gradually raises the feature-map resolution of the first and second image data to 1/4 using bilinear upsampling with lateral connections. That is, with HRNet as the backbone network, the input images (the first and second image data) are first compressed by the Stem module, through two strided convolutions, into feature maps at only 1/4 of the original resolution. After the Stem comes the backbone, which comprises 4 stages, each stage extending a lower-resolution branch from the previous stage's network. The first stage of the backbone has only one 1/4-resolution branch, containing 4 BottleNeck modules (a 1 × 1 convolution, a 3 × 3 convolution, and a 1 × 1 convolution), with the number of feature-map channels held at 64.
At the end of the last BottleNeck module of the first stage, the network splits into two branches: the first keeps the original resolution with a 3 × 3 convolution but reduces the number of channels to C (32 or 48); the second uses a strided 3 × 3 convolution to downsample the resolution to 1/8 of the original image, with the number of channels controlled at 2C. In the second stage, the basic modules in the two branches no longer use the BottleNeck module but instead use two cascaded 3 × 3 convolutions.
Then, the high-resolution feature pyramid starts directly from the backbone's 1/4 resolution and generates higher-resolution feature maps (above 1/4 resolution) by deconvolution. In this process, a multi-resolution supervision strategy supervises the scale changes of the first and second image data separately, so that the HigherHRNet model can handle those scale changes, and training targets (feature maps of the first and second image data at different resolutions) are assigned to the corresponding feature-pyramid levels. Cross-resolution information fusion occurs as the second stage enters the third: the initial value of the third stage's 1/4-resolution branch is the sum of the second stage's 1/4-resolution branch output and the upsampled result of its 1/8-resolution branch. Specifically, the upsampling first aligns the channel count with a 1 × 1 convolution and then doubles the resolution by bilinear interpolation.
Similarly, the initial value of the 1/8-resolution branch is the sum of the second stage's 1/8-resolution output and the result of downsampling the 1/4-resolution branch with a strided convolution. In addition, the 1/16-resolution branch added in the third stage is obtained by adding one strided convolution of the second stage's 1/8-resolution branch output and two strided convolutions of its 1/4-resolution branch output, with the number of channels maintained at 4C.
As in the second stage, the basic block of the third stage is a cascade of two 3 × 3 convolutions. The difference is that the third stage is divided into four sub-stages, each containing 4 basic modules, with one cross-resolution feature aggregation at the end of each sub-stage. The input of each branch of a sub-stage is the sum of that branch's output from the previous sub-stage and the outputs of the other two branches upsampled/downsampled to the same resolution and channel count.
Finally, according to the higher-resolution feature maps (above 1/4 resolution) generated by deconvolution, scale-aware high-resolution heat maps of the first and second image data are generated through a multi-resolution heat map aggregation strategy, yielding the specific position information of the limb key points.
That is to say, the input of the fourth stage is obtained from the output of the third stage's last sub-stage through cross-resolution feature aggregation, and a 1/32 branch is added. The fourth stage comprises 3 sub-stages, each containing 4 basic modules of two cascaded 3 × 3 convolutions; as before, one cross-resolution feature aggregation is performed at the end of each sub-stage.
At the end of the fourth stage, cross-resolution feature aggregation unifies all resolution branches into a feature map at 1/4 of the original resolution; this map is upsampled to 1/2 of the original resolution by one 4 × 4 deconvolution layer, and 1/2-resolution heat maps for the N key points are generated by four convolution layers.
HigherHRNet adopts a multi-resolution supervision strategy. Specifically, 1/4-resolution heat maps of the N key points are generated by a group of convolutions from the 1/4-resolution feature map before the deconvolution layer, and an L2 loss is computed on them; the N heat maps are then concatenated (Cat) with the earlier 1/4 feature map, 1/2-resolution heat maps of the key points are generated through the deconvolution and the four-layer convolution module, and the L2 loss is computed again.
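The multi-resolution heat map aggregation and the arg-max decoding of key-point coordinates described above can be sketched as follows. This is a minimal illustration, not the patented implementation: nearest-neighbour upsampling via np.kron stands in for HigherHRNet's bilinear interpolation, and the function names are invented for the example.

```python
import numpy as np

def aggregate_heatmaps(heatmaps):
    # Upsample every per-keypoint heat map to the largest resolution in
    # the list and average them (the idea behind HigherHRNet's
    # multi-resolution heat map aggregation). np.kron with a block of
    # ones performs nearest-neighbour upsampling, a simplification of
    # the bilinear interpolation used by the real model.
    target_h = max(hm.shape[1] for hm in heatmaps)
    target_w = max(hm.shape[2] for hm in heatmaps)
    acc = np.zeros((heatmaps[0].shape[0], target_h, target_w))
    for hm in heatmaps:
        fh, fw = target_h // hm.shape[1], target_w // hm.shape[2]
        acc += np.kron(hm, np.ones((1, fh, fw)))
    return acc / len(heatmaps)

def keypoint_coords(heatmaps):
    # Arg-max decoding: (x, y) position of the peak in each keypoint map.
    n, h, w = heatmaps.shape
    flat = heatmaps.reshape(n, -1).argmax(axis=1)
    return [(int(i % w), int(i // w)) for i in flat]
```

The decoded (x, y) pairs play the role of the limb key point coordinate information used in the later steps.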
Step S102: according to the coordinate information of the limb key points of the standard action and of the action to be corrected, respectively calculate the limb offset of the standard action and of the action to be corrected at each limb key point.
In the present application, after the coordinate information of the limb key points of the standard dance movement and of the learner's dance movement has been determined, the limb offset of the limb part corresponding to each limb key point is determined, for both movements, from that coordinate information.
Specifically, the limb offset α_im of the limb part corresponding to each limb key point is determined according to formula (1). (Formula (1) appears only as an image in the source and is not reproduced here.)
Wherein i ∈ {1, 2}: i = 1 denotes the standard dance movement and i = 2 denotes the learner's dance movement; m denotes the index of the limb key point and is a positive integer; (x, y) denotes the coordinate information of a limb key point.
It should be understood that the limb offset is the offset, between the learner's dance movement and the standard dance movement, of each of the 12 limb parts into which the human body is divided according to the 15 limb key points; it is not an offset of individual limb key points, and each limb part contains at least two limb key points. By calculating the limb offset of each limb part, the difference between the learner's dance movement and the standard dance movement can be effectively determined, and the corresponding limb key points can then be adjusted.
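Since formula (1) survives only as an image, the sketch below substitutes one common definition of a limb offset: the angle of the limb segment between its two key points, measured against the horizontal axis with atan2. This is an assumption made for illustration, not the patent's actual formula.

```python
import math

def limb_offset(p_a, p_b):
    # Angle (degrees, in (-180, 180]) of the limb segment from key point
    # p_a = (x_a, y_a) to p_b = (x_b, y_b), measured against the
    # horizontal axis. Assumed stand-in for formula (1), which is only
    # available as an image in the source.
    (x_a, y_a), (x_b, y_b) = p_a, p_b
    return math.degrees(math.atan2(y_b - y_a, x_b - x_a))
```

Any angle convention works here as long as the same one is applied to both the standard and the learner's movement, since only the combination of the two offsets enters the later formulas.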
Step S103, determining a limb offset angle between the action to be corrected and the standard action at each limb key point according to the limb offset of the standard action at each limb key point and the limb offset of the action to be corrected.
Specifically, the limb offset angle between the standard dance movement and the learner's dance movement at each limb key point is determined according to formulas (2) and (3):
when |α_1m| + |α_2m| ≤ 180°,
γ_k = |α_1m| + |α_2m| …………………………(2)
when |α_1m| + |α_2m| > 180°,
γ_k = 360° - (|α_1m| + |α_2m|) …………………………(3)
wherein m and k are positive integers with k < m; γ_k denotes the limb offset angle between the standard dance movement and the learner's dance movement at the kth limb part, which corresponds to the mth limb key point; α_1m denotes the limb offset at the mth limb key point in the standard dance movement; α_2m denotes the limb offset at the mth limb key point in the learner's dance movement.
It can be understood that, when determining the limb offset angle, formulas (2) and (3) are complementary branches: for each limb part exactly one of the two conditions holds, and the corresponding formula is applied. The limb offset angle specifically refers to the offset angle of the limb part corresponding to the limb key point.
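The two branches of formulas (2) and (3) fold into one small function. Note that formula (2) is taken here as γ_k = |α_1m| + |α_2m|, reconstructed as the complement of formula (3), since its image is not reproduced in the text; the result always lies in [0°, 180°].

```python
def limb_offset_angle(alpha_1m, alpha_2m):
    # Formulas (2)/(3): mutually exclusive branches keeping gamma_k in
    # [0, 180] degrees. alpha_1m is the limb offset of the standard
    # movement, alpha_2m that of the learner's movement, at key point m.
    # Formula (2) is an inference (its image is not reproduced).
    s = abs(alpha_1m) + abs(alpha_2m)
    return s if s <= 180.0 else 360.0 - s
```

For example, offsets of 170° and 100° sum to 270°, so the >180° branch applies and γ_k = 360° - 270° = 90°.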
Step S104: determine the comprehensive similarity between the action to be corrected and the standard action according to the limb offset angle between the action to be corrected and the standard action at each limb key point.
In the present application, the offset angles obtained between the standard dance motion and the learner's dance motion at each limb part are analysed, and the comprehensive similarity between the learner's dance motion and the standard dance motion is determined according to formula (4). Formula (4) is as follows:

δ = (12 × 180° − Σγ_k) / (12 × 180°) × 100 …………………………(4)

where the sum runs over the 12 limb parts (k = 1, …, 12), and γ_k represents the limb offset angle between the learner's dance motion and the standard dance motion at the kth limb part corresponding to the mth limb key point, m denoting the number of the limb key point, m a positive integer.

That is, formula (4) adds the offset angles of the 12 limb parts between the standard dance motion and the learner's dance motion, subtracts the resulting total offset angle from 12 × 180° (2160°), and divides by 12 × 180°, yielding the comprehensive similarity over the 12 limb parts.
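The computation of formula (4), as described in words above, can be sketched as follows. Formula (4) itself is only an image in the source; the 0-100 scaling is an assumption made to match the [94, 100] interval discussed in the text:

```python
def comprehensive_similarity(offset_angles):
    """Comprehensive similarity per formula (4): the 12 per-limb offset
    angles are summed, the total is subtracted from 12 * 180 deg
    (2160 deg), and the result is normalised by 12 * 180 deg.
    The 0-100 scale is an assumption inferred from the text."""
    if len(offset_angles) != 12:
        raise ValueError("expected one offset angle per limb part (12)")
    total = sum(offset_angles)
    return (12 * 180.0 - total) / (12 * 180.0) * 100.0
```

With zero offset at every limb part the score is 100; with the maximum 180° offset at every part it is 0.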
Further, when the comprehensive similarity between the learner's dance motion and the standard dance motion is smaller than a preset similarity threshold, the dance motion is corrected at the limb key point corresponding to the largest limb offset angle between the learner's dance motion and the standard dance motion. That is, the limb offset angles of the limb parts are compared, and the limb part with the largest offset angle is selected for correction.

Specifically, when the comprehensive similarity of the learner's motion is smaller than the preset threshold, the learner's dance motion is corrected by addressing the largest limb offset angle at the limb key point of the corresponding limb part. When the comprehensive similarity lies in the interval [94, 100], the learner's dance motion is considered standard; when it lies in [0, 94), the best correction suggestion for the dance motion is given for the limb part with the largest limb offset angle.
In this way, the deviation of each limb part of the learner's dance motion is quantified: the limb part with the largest error is identified from the limb offset angles between the learner's dance motion and the standard dance motion, a corresponding correction suggestion is given for that limb part, and the learner's dance motion is adjusted through timely feedback, enabling the learner to adjust the motion posture quickly and improving dance learning efficiency.
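The correction-selection logic above can be sketched as follows; the function name, return convention, and the default threshold of 94 (taken from the [94, 100] "standard" band in the text) are illustrative assumptions, not the patent's verbatim procedure:

```python
def correction_target(offset_angles, similarity, threshold=94.0):
    """Pick the limb part to correct first: if the comprehensive
    similarity falls below the threshold, return the index of the limb
    part with the largest offset angle; otherwise return None, meaning
    the motion is already considered standard. Illustrative sketch."""
    if similarity >= threshold:
        return None
    return max(range(len(offset_angles)), key=lambda k: offset_angles[k])
```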
Exemplary System
FIG. 3 is a block diagram of a deep learning based behavior similarity calculation system according to some embodiments of the present application; as shown in fig. 3, the behavior similarity calculation system based on deep learning includes: a limb coordinate determination unit 301 configured to determine coordinate information of a limb key point of a standard motion and a to-be-corrected motion, respectively, according to the first image data and the second image data based on a human body posture estimation model constructed in advance; wherein, the key points of the limbs are determined by a visual image of a human skeleton; the first image data represents an image of a standard action; the second image data represents an image of the action to be corrected; the offset calculating unit 302 is configured to calculate the limb offset of the standard motion and the limb offset of the motion to be corrected at each limb key point according to the coordinate information of the limb key point of the standard motion and the coordinate information of the limb key point of the motion to be corrected; the offset angle calculation unit 303 is configured to determine a limb offset angle between the action to be corrected and the standard action at each limb key point according to the limb offset of the standard action at each limb key point and the limb offset of the action to be corrected; and the similarity calculation unit 304 is configured to determine the comprehensive similarity between the action to be corrected and the standard action according to the limb offset angle between the action to be corrected and the standard action at each limb key point.
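The data flow through units 302-304 can be sketched end to end as follows. The two-limb configuration, the arctan offset form, and the 0-100 scale are illustrative assumptions; a real deployment would use the 12 limb parts over 15 key points described above, with coordinates produced by the limb coordinate determination unit 301:

```python
import math

# Hypothetical limb definition: each limb part is a pair of key-point
# indices into the pose returned by unit 301.
LIMBS = [(0, 1), (1, 2)]

def segment_angle(p, q):
    # Offset calculation unit 302 (sketch): per-limb offset in degrees.
    # The arctan form is an assumption; formula (1) is only an image.
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))

def offset_angle(a1, a2):
    # Offset angle calculation unit 303: formulas (2)/(3).
    s = abs(a1) + abs(a2)
    return s if s <= 180.0 else 360.0 - s

def similarity(std_pts, corr_pts, limbs=LIMBS):
    # Similarity calculation unit 304: normalised 0-100 score over the
    # configured limb parts, per the worded description of formula (4).
    gammas = [offset_angle(segment_angle(std_pts[i], std_pts[j]),
                           segment_angle(corr_pts[i], corr_pts[j]))
              for i, j in limbs]
    n = len(limbs)
    return (n * 180.0 - sum(gammas)) / (n * 180.0) * 100.0
```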
The behavior similarity calculation system based on deep learning provided by the embodiment of the application can realize the steps and the flow of the behavior similarity calculation method based on deep learning of any embodiment, and achieve the same technical effects, and is not repeated herein.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made to the present application by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (5)

1. A behavior similarity calculation method based on deep learning is characterized by comprising the following steps:
step S101, respectively determining coordinate information of limb key points of a standard motion and a motion to be corrected according to first image data and second image data based on a pre-constructed human body posture estimation model; wherein the limb key points are determined according to the human skeleton visual image; the first image data represents an image of the standard action, and the second image data represents an image of the action to be corrected;
step S102, respectively calculating the limb offset of the standard action and the limb offset of the action to be corrected at each limb key point according to the coordinate information of the limb key points of the standard action and the coordinate information of the limb key points of the action to be corrected; wherein a limb offset α_im at each limb key point is determined according to formula (1) [the formula appears only as an image in the original publication and is not reproduced here];

wherein i ∈ {1, 2}, i = 1 represents the standard action and i = 2 represents the action to be corrected; m represents the number of the limb key point, and m is a positive integer; (x, y) represents the coordinate information of the limb key points;
step S103, determining the limb offset angle between the action to be corrected and the standard action at each limb key point according to the limb offset of the standard action and the limb offset of the action to be corrected at each limb key point; wherein, when |α_1m| + |α_2m| ≤ 180°, the limb offset angle between the standard action and the action to be corrected at each limb key point is determined according to the formula:

γ_k = |α_1m| + |α_2m|

and when |α_1m| + |α_2m| > 180°, the limb offset angle between the standard action and the action to be corrected at each limb key point is determined according to the formula:

γ_k = 360° − (|α_1m| + |α_2m|)

wherein k is a positive integer, k < m; γ_k represents the limb offset angle between the standard action and the action to be corrected at the kth limb part corresponding to the mth limb key point; α_1m represents the limb offset at the mth limb key point in the standard action; α_2m represents the limb offset at the mth limb key point in the action to be corrected;
step S104, determining the comprehensive similarity between the action to be corrected and the standard action according to the limb offset angles between the action to be corrected and the standard action at each limb key point; wherein the comprehensive similarity δ between the standard action and the action to be corrected is determined according to the formula:

δ = (12 × 180° − Σγ_k) / (12 × 180°) × 100, the sum running over the 12 limb parts (k = 1, …, 12),

wherein γ_k represents the limb offset angle between the action to be corrected and the standard action at the kth limb part corresponding to the mth limb key point, and m represents the number of the limb key point.
2. The deep learning-based behavior similarity calculation method according to claim 1, wherein in step S101,
and respectively carrying out human posture estimation on the standard action in the first image data and the action to be corrected in the second image data based on a HigherHRNet model, and correspondingly determining the coordinate information of the limb key points of the standard action and the action to be corrected.
3. The behavior similarity calculation method based on deep learning according to claim 2, wherein the estimating of the human body pose of the standard motion in the first image data and of the motion to be corrected in the second image data based on the HigherHRNet model, and the corresponding determination of the coordinate information of the limb key points of the standard motion and the motion to be corrected, comprises:
respectively starting from 1/32 resolution by adopting a pre-trained HigherHRNet model, and gradually increasing the resolution of the feature maps of the first image data and the second image data to 1/4 by using bilinear upsampling with transverse connection;
directly starting from 1/4 resolution, generating a feature map with higher resolution by deconvolution;
and generating scale-aware high-resolution heat maps of the first image data and the second image data respectively by a multi-resolution heat map aggregation strategy according to the higher-resolution feature maps generated by deconvolution so as to determine the coordinate information of the limb key points.
4. The behavior similarity calculation method based on deep learning according to any one of claims 1 to 3, further comprising:
and in response to the fact that the comprehensive similarity between the action to be corrected and the standard action is smaller than a preset similarity threshold, performing action correction on the limb key point corresponding to the maximum limb offset angle between the action to be corrected and the standard action at each limb key point.
5. A deep learning based behavioral similarity computing system, comprising:
the limb coordinate determination unit is configured to respectively determine coordinate information of limb key points of a standard action and an action to be corrected according to the first image data and the second image data based on a human body posture estimation model established in advance; the limb key points are determined according to the human skeleton visual image; the first image data represents an image of the standard action, and the second image data represents an image of the action to be corrected;
the offset calculation unit is configured to calculate the limb offset of the standard action and the limb offset of the action to be corrected at each limb key point respectively according to the coordinate information of the limb key points of the standard action and the coordinate information of the limb key points of the action to be corrected; wherein a limb offset α_im at each limb key point is determined according to formula (1) [the formula appears only as an image in the original publication and is not reproduced here];

wherein i ∈ {1, 2}, i = 1 represents the standard action and i = 2 represents the action to be corrected; m represents the number of the limb key point, and m is a positive integer; (x, y) represents the coordinate information of the limb key points;
the offset angle calculation unit is configured to determine the limb offset angle between the action to be corrected and the standard action at each limb key point according to the limb offset of the standard action and the limb offset of the action to be corrected at each limb key point; wherein, when |α_1m| + |α_2m| ≤ 180°, the limb offset angle between the standard action and the action to be corrected at each limb key point is determined according to the formula:

γ_k = |α_1m| + |α_2m|

and when |α_1m| + |α_2m| > 180°, the limb offset angle between the standard action and the action to be corrected at each limb key point is determined according to the formula:

γ_k = 360° − (|α_1m| + |α_2m|)

wherein k is a positive integer, k < m; γ_k represents the limb offset angle between the standard action and the action to be corrected at the kth limb part corresponding to the mth limb key point; α_1m represents the limb offset at the mth limb key point in the standard action; α_2m represents the limb offset at the mth limb key point in the action to be corrected;
the similarity calculation unit is configured to determine the comprehensive similarity between the action to be corrected and the standard action according to the limb offset angles between the action to be corrected and the standard action at each limb key point; wherein the comprehensive similarity δ between the standard action and the action to be corrected is determined according to the formula:

δ = (12 × 180° − Σγ_k) / (12 × 180°) × 100, the sum running over the 12 limb parts (k = 1, …, 12),

wherein γ_k represents the limb offset angle between the action to be corrected and the standard action at the kth limb part corresponding to the mth limb key point, and m represents the number of the limb key point.
CN202210939966.6A 2022-08-05 2022-08-05 Behavior similarity calculation method and system based on deep learning Active CN115294652B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210939966.6A CN115294652B (en) 2022-08-05 2022-08-05 Behavior similarity calculation method and system based on deep learning


Publications (2)

Publication Number Publication Date
CN115294652A CN115294652A (en) 2022-11-04
CN115294652B true CN115294652B (en) 2023-04-18

Family

ID=83827531


Country Status (1)

Country Link
CN (1) CN115294652B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110633608A (en) * 2019-03-21 2019-12-31 广州中科凯泽科技有限公司 Human body limb similarity evaluation method of posture image

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107220590B (en) * 2017-04-24 2021-01-05 广东数相智能科技有限公司 Anti-cheating network investigation method, device and system based on in-vivo detection
CN113058260B (en) * 2021-04-22 2024-02-02 杭州当贝网络科技有限公司 Method, system and storage medium for identifying motion of body feeling based on player image
CN113197572A (en) * 2021-05-08 2021-08-03 解辉 Human body work correction system based on vision
CN113869263A (en) * 2021-10-11 2021-12-31 武汉中体智美科技有限公司 Intelligent training method for motion technology of teenager athletes
CN114724185A (en) * 2022-04-13 2022-07-08 浙江工业大学 Light-weight multi-person posture tracking method




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant