CN117576781A - Training intensity monitoring system and method based on behavior recognition - Google Patents

Training intensity monitoring system and method based on behavior recognition Download PDF

Info

Publication number
CN117576781A
CN117576781A
Authority
CN
China
Prior art keywords
training state
sequence
training
feature
time sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202311663182.6A
Other languages
Chinese (zh)
Inventor
王亚利
黄冀周
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changchun Leiheng Era Technology Co ltd
Original Assignee
Changchun Leiheng Era Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changchun Leiheng Era Technology Co ltd filed Critical Changchun Leiheng Era Technology Co ltd
Priority to CN202311663182.6A priority Critical patent/CN117576781A/en
Publication of CN117576781A publication Critical patent/CN117576781A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • Image Analysis (AREA)

Abstract

A training intensity monitoring system and method based on behavior recognition are disclosed. First, a training state monitoring video of a monitored predetermined object, acquired by a camera over a predetermined time period, is obtained. Training state time sequence features are then extracted from the training state monitoring video to obtain a sequence of training state time sequence associated feature maps, and training state change features are constructed from this sequence to obtain a training state semantic change time sequence feature vector. Finally, a fatigue degree grade is determined based on the training state semantic change time sequence feature vector. In this way, trainers and instructors can better understand the fatigue condition during training, so that the training plan can be adjusted in time and the training effect and safety are improved.

Description

Training intensity monitoring system and method based on behavior recognition
Technical Field
The present application relates to the field of training intensity monitoring, and more particularly, to a training intensity monitoring system and method based on behavior recognition.
Background
With the popularity of fitness exercise, more and more people choose to perform physical training in gymnasiums or at home to improve physical fitness. However, the effectiveness and safety of physical training depend largely on the control of training intensity: too low a training intensity may lead to poor training results, while too high a training intensity may increase the risk of injury. That is, during training it is necessary to know the fatigue degree of the trainer so that the training intensity can be adjusted in time and excessive fatigue can be avoided.
Traditional fatigue monitoring methods often rely on subjective feedback. However, this self-perception-based evaluation depends excessively on the subjective judgment of the individual and lacks an objective basis. An optimized solution is therefore desired.
Disclosure of Invention
In view of this, the present application proposes a training intensity monitoring system and method based on behavior recognition, which can collect the motion state of a trainer by using a camera, and analyze the behavior characteristics of the trainer in combination with a deep learning algorithm to monitor the fatigue degree of the trainer and generate a corresponding fatigue grade label.
According to an aspect of the present application, there is provided a training intensity monitoring method based on behavior recognition, including:
acquiring training state monitoring videos of a monitored preset object acquired by a camera in a preset time period;
extracting training state time sequence characteristics of the training state monitoring video to obtain a sequence of a training state time sequence associated feature map;
constructing training state change characteristics from the sequence of the training state time sequence association characteristic diagram to obtain training state semantic change time sequence characteristic vectors; and
determining the fatigue degree level based on the training state semantic change time sequence feature vector.
According to another aspect of the present application, there is provided a training intensity monitoring system based on behavior recognition, comprising:
the video acquisition module is used for acquiring training state monitoring videos of the monitored preset object acquired by the camera in a preset time period;
the training state time sequence feature extraction module is used for extracting training state time sequence features of the training state monitoring video to obtain a sequence of a training state time sequence associated feature map;
the training state change feature construction module is used for constructing training state change features from the sequence of the training state time sequence associated feature map so as to obtain training state semantic change time sequence feature vectors; and
the fatigue degree grade analysis module is used for determining the fatigue degree grade based on the training state semantic change time sequence feature vector.
According to embodiments of the present application, a training state monitoring video of a monitored predetermined object, acquired by a camera over a predetermined time period, is first obtained; training state time sequence features of the training state monitoring video are then extracted to obtain a sequence of training state time sequence associated feature maps; training state change features are then constructed from the sequence of training state time sequence associated feature maps to obtain a training state semantic change time sequence feature vector; and finally, a fatigue degree grade is determined based on the training state semantic change time sequence feature vector. In this way, trainers and instructors can better understand the fatigue condition during training, so that the training plan can be adjusted in time and the training effect and safety are improved.
Other features and aspects of the present application will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features and aspects of the present application and together with the description, serve to explain the principles of the present application.
FIG. 1 illustrates a flow chart of a training intensity monitoring method based on behavior recognition according to an embodiment of the present application.
Fig. 2 shows an architectural diagram of a training intensity monitoring method based on behavior recognition according to an embodiment of the present application.
Fig. 3 shows a flowchart of sub-step S120 of a behavior recognition based training intensity monitoring method according to an embodiment of the present application.
Fig. 4 shows a flowchart of sub-step S130 of a behavior recognition based training intensity monitoring method according to an embodiment of the present application.
FIG. 5 illustrates a block diagram of a behavior recognition based training intensity monitoring system, according to an embodiment of the present application.
Fig. 6 shows an application scenario diagram of a training intensity monitoring method based on behavior recognition according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some, but not all embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments of the present application without making any inventive effort, are also within the scope of the present application.
As used in this application and in the claims, the terms "a," "an," and/or "the" do not refer specifically to the singular and may include the plural, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that explicitly identified steps and elements are included; they do not constitute an exclusive list, and a method or apparatus may also include other steps or elements.
Various exemplary embodiments, features and aspects of the present application will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Although various aspects of the embodiments are illustrated in the accompanying drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
In addition, numerous specific details are set forth in the following detailed description in order to provide a better understanding of the present application. It will be understood by those skilled in the art that the present application may be practiced without some of these specific details. In some instances, methods, means, elements, and circuits have not been described in detail as not to unnecessarily obscure the present application.
To address the above technical problems, the technical concept of the present application is to collect the motion state of a trainer by using a camera and to analyze the behavior characteristics of the trainer in combination with a deep learning algorithm, so as to monitor the fatigue degree of the trainer and generate a corresponding fatigue grade label. In this way, trainers and instructors can better understand the fatigue condition during training, so that the training plan can be adjusted in time and the training effect and safety are improved.
Based on this, fig. 1 shows a flowchart of a training intensity monitoring method based on behavior recognition according to an embodiment of the present application. Fig. 2 shows an architectural diagram of a training intensity monitoring method based on behavior recognition according to an embodiment of the present application. As shown in fig. 1 and 2, a training intensity monitoring method based on behavior recognition according to an embodiment of the present application includes the steps of: s110, acquiring training state monitoring videos of a monitored preset object acquired by a camera in a preset time period; s120, extracting training state time sequence characteristics of the training state monitoring video to obtain a sequence of a training state time sequence associated characteristic diagram; s130, constructing training state change features from the sequence of the training state time sequence association feature map to obtain training state semantic change time sequence feature vectors; and S140, determining the fatigue degree level based on the training state semantic change time sequence feature vector.
It should be understood that the purpose of step S110 is to acquire the state information of the monitored object in the training process by capturing the video through the camera, and the camera may record the action, posture, expression and other features of the trainer for subsequent analysis and processing. In step S120, time series features of the training state are extracted from the training state monitoring video, and these features may include an action sequence, an action frequency, an action amplitude, and the like, and by extracting these features, a sequence of time series correlation feature maps describing the training state may be obtained. In step S130, according to the sequence of the training state timing correlation feature map, the change features of the training states are constructed, and these features may include duration of the training states, transition frequency between the training states, stability of the training states, and so on, by constructing these features, a timing feature vector describing the semantic change of the training states may be obtained. In step S140, the fatigue level of the monitored object is determined according to the time sequence feature vector of the semantic change of the training state, and the fatigue level of the monitored object can be judged and classified into the corresponding level by analyzing the fatigue index in the feature vector, such as frequent change of the training state, change of the duration time, and the like. The purpose of these steps is to identify the training state and fatigue level of the monitored subject by behavior recognition and training state monitoring video analysis. This method may be applied to a variety of training scenarios, such as physical training, fitness training, etc., to provide real-time monitoring and assessment of a trainer.
Specifically, in the technical scheme of the present application, a training state monitoring video of a monitored predetermined object, acquired by a camera over a predetermined time period, is first obtained; video segment slicing is then performed on the training state monitoring video to obtain a sequence of training state monitoring video segments. It should be appreciated that the state of the trainer may change over time during the training process. By slicing the training state monitoring video into small video segments, detailed information about these state changes can be captured. Each training state monitoring video segment obtained after slicing may represent a relatively continuous training action. In this way, fine posture changes and state changes can be captured more precisely.
Then, the sequence of training state monitoring video segments is passed through a training state time sequence correlation feature extractor based on a three-dimensional convolutional neural network model to obtain a sequence of training state time sequence associated feature maps. That is, a three-dimensional convolutional neural network model is utilized to extract time sequence correlation features about the training state from the video.
Here, conventional image feature extraction methods generally consider only the static information of each image, whereas time sequence correlation information is also very important for video sequences. A three-dimensional convolutional neural network (3D CNN) is a deep learning model that can process video sequences. By applying convolution operations in the time dimension, it can effectively capture the time sequence correlation features in a video sequence. Specifically, by using the training state time sequence correlation feature extractor based on the three-dimensional convolutional neural network model, the time sequence correlation features of each video segment can be extracted from the sequence of training state monitoring video segments. This feature information can reflect the state of the trainer at different time points, guiding the model to better understand the behavior pattern of the trainer.
Accordingly, in step S120, as shown in fig. 3, the training state timing feature of the training state monitoring video is extracted to obtain a sequence of training state timing association feature graphs, which includes: s121, carrying out data preprocessing on the training state monitoring video to obtain a sequence of training state monitoring video fragments; and S122, performing feature extraction on the sequence of the training state monitoring video clips by using a deep learning network model to obtain a sequence of the training state time sequence association feature map.
It should be understood that in step S121, the data preprocessing may include operations such as video segmentation, downsampling, denoising, etc. to extract a segment sequence of the training state monitoring video, and by segmenting the video into segments, the time sequence information of the training state may be divided into a series of continuous segments, so as to facilitate subsequent processing and analysis. In step S122, the deep learning network model may be a Convolutional Neural Network (CNN) or a Recurrent Neural Network (RNN), or the like. By inputting the video clips into a deep learning network, the network can learn advanced feature representations of the video clips, such as motion patterns, gesture changes, etc., and the extracted features can be used to construct a sequence of training state timing-related feature maps to describe the timing features of the training states. The aim of the two steps is to preprocess the training state monitoring video and extract the characteristics so as to obtain the time sequence characteristics of the training state. The data preprocessing may help remove noise, reduce data dimensionality, etc., so that subsequent feature extraction is more accurate and efficient. The advanced representation of the video clip can be learned by feature extraction using a deep learning network model to better capture the time-series-related features of the training state. These features will be used for subsequent training state analysis and fatigue level determination.
In step S121, performing data preprocessing on the training state monitoring video to obtain the sequence of training state monitoring video segments includes: performing video segment slicing on the training state monitoring video to obtain the sequence of training state monitoring video segments. It should be appreciated that video segment slicing is an important step in the data preprocessing of the training state monitoring video; its purpose is to split the entire training state monitoring video into a plurality of successive video segments for subsequent processing and analysis. The uses of video segment slicing include: 1. Time sequence analysis: by slicing the video into segments, the time sequence information of the training state can be divided into a series of consecutive segments. In this way, the change process and time sequence characteristics of the training state can be better captured, and the duration, change frequency and so on of the training state can be observed by analyzing each video segment. 2. Feature extraction: video segment slicing provides the input for subsequent feature extraction. Each video segment can be regarded as an independent data sample, and the training state information contained in each segment can be represented by computing its features. These features may include the action sequence, action frequency, action amplitude and so on, and are used for subsequent training state analysis and fatigue degree determination. 3. Data dimension reduction: in some cases, the complete training state monitoring video may contain a large amount of redundant information, and processing the entire video may impose a computation and storage burden. By slicing the video into segments, the dimensionality of the data can be reduced and only the key training state information retained, thereby reducing processing and storage costs. In summary, video segment slicing is an important preprocessing step for the training state monitoring video; it facilitates time sequence analysis, feature extraction and data dimension reduction, and provides a more efficient and feasible data representation for subsequent training state analysis.
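By way of illustration only, the following is a minimal sketch of such a slicing step, assuming the monitoring video has already been decoded into an array of frames; the function name, clip length and stride are hypothetical choices and not values fixed by this application.

```python
import numpy as np

def slice_into_clips(frames: np.ndarray, clip_len: int = 16, stride: int = 8) -> list:
    """Slice a decoded video of shape (T, H, W, C) into overlapping fixed-length clips.

    Each returned clip covers a relatively continuous training action and serves as
    one sample for the subsequent feature extractor.
    """
    clips = []
    num_frames = frames.shape[0]
    for start in range(0, num_frames - clip_len + 1, stride):
        clips.append(frames[start:start + clip_len])  # shape (clip_len, H, W, C)
    return clips

# Example: a 10-second video at 25 fps with 224x224 RGB frames
video = np.random.rand(250, 224, 224, 3).astype(np.float32)
clips = slice_into_clips(video)
print(len(clips), clips[0].shape)  # 30 clips, each (16, 224, 224, 3)
```

Overlapping clips (stride smaller than the clip length) are one possible design choice for preserving continuity between adjacent training actions; non-overlapping clips would further reduce storage at the cost of temporal resolution.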
In step S122, the deep learning network model is a training state time sequence correlation feature extractor based on a three-dimensional convolutional neural network model. Performing feature extraction on the sequence of training state monitoring video segments by using the deep learning network model to obtain the sequence of training state time sequence associated feature maps includes: passing the sequence of training state monitoring video segments through the training state time sequence correlation feature extractor based on the three-dimensional convolutional neural network model to obtain the sequence of training state time sequence associated feature maps.
It should be noted that a three-dimensional convolutional neural network (3D CNN) is a deep learning network model that is particularly suitable for processing video data or data with a time dimension. Unlike a conventional two-dimensional convolutional neural network (2D CNN), a three-dimensional convolutional neural network takes the time dimension into account in the convolution operation and can capture the time sequence correlation features in video or time series data. The basic structure of a three-dimensional convolutional neural network is similar to that of a 2D CNN, but three-dimensional convolution kernels are used in the convolutional layers. These perform convolution operations in the spatial and temporal dimensions simultaneously, thereby effectively extracting spatio-temporal features from video or time series data. A three-dimensional convolutional neural network is typically composed of multiple convolutional layers, pooling layers and fully-connected layers, and a deep network can be constructed by stacking these layers. In training state time sequence correlation feature extraction, a model based on a three-dimensional convolutional neural network is used as the feature extractor. It accepts the sequence of training state monitoring video segments as input and extracts features in the space-time domain through convolution operations. These features may be high-level representations of actions, gestures, movement patterns and the like. By inputting the sequence of video segments into the feature extractor based on the three-dimensional convolutional neural network, the sequence of training state time sequence associated feature maps can be obtained. These feature maps can then be used for subsequent tasks such as training state analysis and fatigue degree assessment. In general, a three-dimensional convolutional neural network is a deep learning model suitable for processing video or time series data; it captures spatio-temporal features and is used here to extract the time sequence correlation features of the training state.
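A minimal PyTorch sketch of such a 3D-CNN feature extractor is given below; the class name, layer counts, channel widths and kernel sizes are illustrative assumptions, since the application does not fix a specific architecture.

```python
import torch
import torch.nn as nn

class TrainingStateFeatureExtractor3D(nn.Module):
    """3D CNN mapping one video clip to one training state time sequence associated feature map."""

    def __init__(self, in_channels: int = 3, out_channels: int = 64):
        super().__init__()
        self.backbone = nn.Sequential(
            # Convolve jointly over time (frames) and space.
            nn.Conv3d(in_channels, 32, kernel_size=3, padding=1),
            nn.BatchNorm3d(32),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),      # pool spatially only
            nn.Conv3d(32, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm3d(out_channels),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d((1, 28, 28)),         # collapse the time axis
        )

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, channels, frames, height, width)
        features = self.backbone(clip)                 # (batch, C, 1, 28, 28)
        return features.squeeze(2)                     # (batch, C, 28, 28) feature map

extractor = TrainingStateFeatureExtractor3D()
clip = torch.randn(1, 3, 16, 224, 224)                 # one sliced clip in NCDHW layout
feature_map = extractor(clip)
print(feature_map.shape)                               # torch.Size([1, 64, 28, 28])
```

Applying the extractor to every clip in the sliced sequence yields the sequence of training state time sequence associated feature maps referred to above.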
Then, the sequence of training state time sequence associated feature maps is passed through a self-attention association strengthening module to obtain a sequence of self-strengthening training state time sequence associated feature maps. The self-attention association strengthening module learns the element-by-element associations between each position and the other positions in each training state time sequence associated feature map, and, based on the global distribution of the corresponding weights, guides the model to attend to the feature distributions at different positions in the feature map. In this process, the more important feature distribution regions obtain higher weights and more attention, so that the state detail information of key actions is better captured.
Then, the training state change semantic coefficients between every two adjacent corrected self-strengthening training state time sequence associated feature maps in the sequence of corrected self-strengthening training state time sequence associated feature maps are calculated to obtain a training state semantic change time sequence feature vector composed of the training state change semantic coefficients. Here, calculating the training state change semantic coefficient between every two adjacent corrected self-strengthening training state time sequence associated feature maps quantifies the semantic difference and the degree of change between the two feature maps. The training state semantic change time sequence feature vector composed of these coefficients can reflect the evolution and change trend of the trainer's training state during the training process.
Accordingly, in step S130, as shown in fig. 4, constructing training state change features from the sequence of training state time sequence associated feature maps to obtain the training state semantic change time sequence feature vector includes: S131, passing the sequence of training state time sequence associated feature maps through a self-attention association strengthening module to obtain a sequence of self-strengthening training state time sequence associated feature maps; S132, performing feature distribution correction on the sequence of self-strengthening training state time sequence associated feature maps to obtain a sequence of corrected self-strengthening training state time sequence associated feature maps; and S133, calculating the training state change semantic coefficients between every two adjacent corrected self-strengthening training state time sequence associated feature maps in the sequence of corrected self-strengthening training state time sequence associated feature maps to obtain the training state semantic change time sequence feature vector composed of the training state change semantic coefficients.
It should be understood that the process of constructing training state change features from the sequence of training state time sequence associated feature maps to obtain the training state semantic change time sequence feature vector includes three steps, S131, S132 and S133. In step S131, the self-attention mechanism allows the model to globally focus on the features at different positions in the sequence and to perform weighted aggregation of the features at each position, which enhances the expression capability of the training state time sequence associated feature maps and extracts richer semantic information. In step S132, feature distribution correction is applied to the sequence of self-strengthening training state time sequence associated feature maps to obtain the corrected sequence; the purpose of feature distribution correction is to give the features of different time steps consistent distribution characteristics by normalizing and adjusting the feature distributions. This helps to reduce the variance between features and improve the consistency of the training state changes. In step S133, the training state change semantic coefficient between every two adjacent feature maps in the sequence of corrected self-strengthening training state time sequence associated feature maps is calculated. These semantic coefficients represent the degree of change and the semantic relevance between training states. By calculating these coefficients, the training state semantic change time sequence feature vector composed of the training state change semantic coefficients is obtained, and this feature vector can be used for further tasks such as training state analysis and fatigue degree assessment. In summary, step S131 enhances the expressive power of the feature maps through the self-attention association strengthening module, step S132 improves the consistency of the features through feature distribution correction, and step S133 obtains the training state semantic change time sequence feature vector by calculating the semantic coefficients. The purpose of these steps is to extract more semantically meaningful training state change features from the sequence of training state time sequence associated feature maps for subsequent analysis and application.
In step S131, passing the sequence of training state time sequence associated feature maps through the self-attention association strengthening module to obtain the sequence of self-strengthening training state time sequence associated feature maps includes: passing the sequence of training state time sequence associated feature maps through a first convolution layer of the self-attention association strengthening module to obtain a sequence of first feature maps; passing the sequence of first feature maps through a second convolution layer of the self-attention association strengthening module to obtain a sequence of second feature maps; expanding each feature matrix of the second feature maps along the channel dimension into feature vectors to obtain sequences of a plurality of first feature vectors; calculating the cosine similarity between any two first feature vectors in the sequences of first feature vectors to obtain a sequence of cosine similarity feature maps; normalizing the sequence of cosine similarity feature maps with a Softmax function to obtain a sequence of normalized cosine similarity feature maps; multiplying the sequence of normalized cosine similarity feature maps and the sequence of cosine similarity feature maps position-wise to obtain a sequence of similarity mapping optimization feature maps; passing the sequence of similarity mapping optimization feature maps through a first deconvolution layer of the self-attention association strengthening module to obtain a sequence of first deconvolution feature maps; calculating the element-wise sum of the sequence of first deconvolution feature maps and the sequence of first feature maps to obtain a sequence of first fusion feature maps; passing the sequence of first fusion feature maps through a second deconvolution layer of the self-attention association strengthening module to obtain a sequence of second deconvolution feature maps; and calculating the element-wise sum of the sequence of second deconvolution feature maps and the sequence of training state time sequence associated feature maps to obtain the sequence of self-strengthening training state time sequence associated feature maps.
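The following PyTorch sketch mirrors this sequence of operations for a single feature map. Because the application does not specify the tensor shapes around the two deconvolution layers, the sketch assumes that the optimized channel-association matrix is first projected back onto the feature map (by a matrix product with the flattened first feature map) before the deconvolution layers are applied; all layer hyperparameters and names are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttentionAssociationStrengthening(nn.Module):
    """Self-attention association strengthening applied to one time sequence associated feature map."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.deconv1 = nn.ConvTranspose2d(channels, channels, kernel_size=3, padding=1)
        self.deconv2 = nn.ConvTranspose2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width) training state time sequence associated feature map
        f1 = self.conv1(x)                                    # first feature map
        f2 = self.conv2(f1)                                   # second feature map
        b, c, h, w = f2.shape
        vectors = f2.flatten(2)                               # each channel matrix unfolded: (b, c, h*w)
        sim = F.cosine_similarity(vectors.unsqueeze(2), vectors.unsqueeze(1), dim=-1)  # (b, c, c)
        sim_norm = torch.softmax(sim, dim=-1)                 # normalized cosine similarity map
        sim_opt = sim_norm * sim                              # position-wise product: optimization map
        # Assumption: project the channel-association matrix back onto the feature map.
        attended = torch.bmm(sim_opt, f1.flatten(2)).view(b, c, h, w)
        fused = self.deconv1(attended) + f1                   # first fusion feature map
        return self.deconv2(fused) + x                        # self-strengthened feature map

module = SelfAttentionAssociationStrengthening(channels=64)
feature_map = torch.randn(1, 64, 28, 28)
strengthened = module(feature_map)
print(strengthened.shape)                                     # torch.Size([1, 64, 28, 28])
```

Applying this module to every feature map in the sequence yields the sequence of self-strengthening training state time sequence associated feature maps.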
In step S133, calculating the training state change semantic coefficient between every two adjacent corrected self-strengthening training state time sequence associated feature maps in the sequence of corrected self-strengthening training state time sequence associated feature maps to obtain the training state semantic change time sequence feature vector composed of the training state change semantic coefficients includes: calculating the training state change semantic coefficient between every two adjacent corrected self-strengthening training state time sequence associated feature maps according to a change semantic coefficient calculation formula, so as to obtain the training state semantic change time sequence feature vector composed of the training state change semantic coefficients. In the change semantic coefficient calculation formula, F1(i,j,k) is the feature value at the (i,j,k)-th position of the previous corrected self-strengthening training state time sequence associated feature map, F2(i,j,k) is the feature value at the (i,j,k)-th position of the subsequent corrected self-strengthening training state time sequence associated feature map, W and H are the height and width of each corrected self-strengthening training state time sequence associated feature map, C is the channel dimension of each corrected self-strengthening training state time sequence associated feature map, S(n) is the n-th training state change semantic coefficient, and log2 denotes a logarithmic operation with base 2; the n-th coefficient is obtained by aggregating, over all W x H x C positions, a base-2 logarithmic comparison of the corresponding feature values F1(i,j,k) and F2(i,j,k) of the two adjacent feature maps.
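Since the exact change semantic coefficient formula is not reproduced above, the sketch below assumes one plausible instantiation consistent with the stated symbol definitions: a base-2 log-ratio comparison of corresponding feature values averaged over all W x H x C positions of two adjacent corrected feature maps. The function names and the small epsilon added for numerical stability are likewise assumptions.

```python
import torch

def change_semantic_coefficient(prev_map: torch.Tensor, next_map: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """One assumed instantiation of the training state change semantic coefficient.

    prev_map, next_map: adjacent corrected self-strengthening feature maps of shape (C, H, W).
    Returns a scalar that grows with the semantic difference between the two maps.
    """
    ratio = (next_map.abs() + eps) / (prev_map.abs() + eps)
    return (next_map.abs() * torch.log2(ratio)).mean()        # aggregate over all C*H*W positions

def semantic_change_vector(feature_maps: list) -> torch.Tensor:
    """Coefficients for every adjacent pair form the semantic change time sequence feature vector."""
    coeffs = [change_semantic_coefficient(a, b) for a, b in zip(feature_maps[:-1], feature_maps[1:])]
    return torch.stack(coeffs)

maps = [torch.rand(64, 28, 28) for _ in range(5)]             # sequence of 5 corrected feature maps
vector = semantic_change_vector(maps)
print(vector.shape)                                           # torch.Size([4]): one coefficient per adjacent pair
```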
Further, the training state semantic change time sequence feature vector is passed through a classifier to obtain a classification result, and the classification result is used for representing a fatigue degree grade label.
Accordingly, in step S140, determining the fatigue degree level based on the training state semantic change time sequence feature vector includes: passing the training state semantic change time sequence feature vector through a classifier to obtain a classification result, wherein the classification result is used for representing a fatigue degree grade label. It should be appreciated that a classifier is a trained machine learning model that can assign an input feature vector to different classes. Here, the classifier is used to judge the current fatigue degree grade according to the information in the training state semantic change time sequence feature vector.
The classification result is used to represent the fatigue degree grade label, and the current fatigue degree grade can be determined from the output of the classifier. In general, the fatigue degree may be divided into a plurality of grades, such as mild fatigue, moderate fatigue and severe fatigue. The classification result can be used to identify the current fatigue degree grade so that corresponding measures can be taken, such as resting, adjusting the training intensity, or performing other interventions. This method of determining the fatigue degree grade based on the training state semantic change time sequence feature vector is used to monitor and evaluate an individual's fatigue degree in real time. By analyzing the semantic change of the training state, the individual's fatigue state can be understood more accurately, and corresponding measures can be taken according to the fatigue degree grade label to ensure the training effect and the individual's health and safety.
Specifically, passing the training state semantic change time sequence feature vector through the classifier to obtain the classification result, where the classification result is used for representing the fatigue degree grade label, includes: performing full-connection coding on the training state semantic change time sequence feature vector by using a fully-connected layer of the classifier to obtain a coding classification feature vector; and inputting the coding classification feature vector into a Softmax classification function of the classifier to obtain the classification result.
It should be appreciated that the role of the classifier is to learn classification rules from labeled training data of known classes and then to classify (or predict) unknown data. Logistic regression, SVM and the like are commonly used to solve classification problems. For multi-class classification problems, logistic regression or SVM can also be used, but multiple binary classifiers must then be combined, which is error-prone and inefficient; a commonly used multi-class classification method is the Softmax classification function.
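A minimal sketch of such a classification head is shown below; the hidden width, the three-grade label set and the class name are illustrative assumptions rather than values fixed by this application.

```python
import torch
import torch.nn as nn

FATIGUE_GRADES = ["mild fatigue", "moderate fatigue", "severe fatigue"]  # assumed label set

class FatigueGradeClassifier(nn.Module):
    """Full-connection coding followed by Softmax classification of the semantic change vector."""

    def __init__(self, in_features: int, num_grades: int = len(FATIGUE_GRADES)):
        super().__init__()
        self.encoder = nn.Linear(in_features, 32)             # full-connection coding
        self.head = nn.Linear(32, num_grades)

    def forward(self, change_vector: torch.Tensor) -> torch.Tensor:
        encoded = torch.relu(self.encoder(change_vector))     # coding classification feature vector
        return torch.softmax(self.head(encoded), dim=-1)      # probability per fatigue grade

classifier = FatigueGradeClassifier(in_features=4)
probs = classifier(torch.rand(1, 4))                          # e.g. vector from 5 video segments
print(FATIGUE_GRADES[int(probs.argmax(dim=-1))])              # predicted fatigue degree grade label
```

In practice the Softmax would usually be folded into a cross-entropy loss during training; it is kept explicit here only to match the description above.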
In the above technical solution, each training state time sequence associated feature map in the sequence of training state time sequence associated feature maps expresses the time sequence correlated image semantic features of the corresponding training state monitoring video segment. After passing through the self-attention association strengthening module, the time sequence correlation distribution along the channel dimension can be constrained by the spatial distribution correlation of the image semantic features, so that each self-strengthening training state time sequence associated feature map in the obtained sequence has, over its overall distribution dimension, a spatial information attribute corresponding to the spatial distribution of the image semantic features. In this way, if the spatial information expression effect of each self-strengthening training state time sequence associated feature map, taken as a high-dimensional feature, can be improved, the expression effect of each self-strengthening training state time sequence associated feature map can be improved, and the expression effect of the training state semantic change time sequence feature vector can thereby be improved. Based on this, the applicant of the present application optimizes each self-strengthening training state time sequence associated feature map.
Specifically, in one example, in step S132, performing feature distribution correction on the sequence of self-strengthening training state time sequence associated feature maps to obtain the sequence of corrected self-strengthening training state time sequence associated feature maps includes: correcting the feature distribution of the sequence of self-strengthening training state time sequence associated feature maps according to a correction formula to obtain the sequence of corrected self-strengthening training state time sequence associated feature maps. In the correction formula, f(i,j) is the feature value at the (i,j)-th position of each self-strengthening training state time sequence associated feature map, k is the size of the local neighborhood, a local spatial partition coefficient is additionally used, and f'(i,j) is the feature value at the (i,j)-th position of each optimized self-strengthening training state time sequence associated feature map.
Specifically, taking the local segmentation space in the expanded Hilbert space as a reference, a local integration of the surface of the feature manifold of each self-strengthening training state time sequence associated feature map in the high-dimensional feature space is performed. Based on this local integral processing of the integral function, the phase transition discontinuity points of the feature manifold expressed by the non-stationary data sequence after local spatial expansion are corrected, so that finer structural and geometric characteristics of the feature manifold are obtained. This improves the spatial information expression effect of each self-strengthening training state time sequence associated feature map in the high-dimensional feature space, thereby improving the expression effect of the training state semantic change time sequence feature vector and improving the accuracy of the classification result obtained by the classifier.
In summary, with the training intensity monitoring method based on behavior recognition, trainers and instructors can better understand the fatigue condition during training, so that the training plan can be adjusted in time and the training effect and safety are improved.
Fig. 5 shows a block diagram of a training intensity monitoring system 100 based on behavior recognition according to an embodiment of the present application. As shown in fig. 5, a training intensity monitoring system 100 based on behavior recognition according to an embodiment of the present application includes: a video acquisition module 110, configured to acquire a training state monitoring video of a monitored predetermined object acquired by a camera in a predetermined period of time; the training state time sequence feature extraction module 120 is configured to extract training state time sequence features of the training state monitoring video to obtain a sequence of training state time sequence associated feature graphs; a training state change feature construction module 130, configured to construct training state change features from the sequence of the training state time sequence correlation feature map to obtain training state semantic change time sequence feature vectors; and a fatigue level analysis module 140, configured to determine a fatigue level based on the training state semantic change time sequence feature vector.
In one possible implementation, the training state timing feature extraction module 120 includes: the data preprocessing unit is used for carrying out data preprocessing on the training state monitoring video to obtain a sequence of training state monitoring video fragments; and the feature extraction unit is used for carrying out feature extraction on the sequence of the training state monitoring video clips by using a deep learning network model so as to obtain the sequence of the training state time sequence associated feature map.
Here, it will be understood by those skilled in the art that the specific functions and operations of the respective units and modules in the above-described behavior recognition-based training intensity monitoring system 100 have been described in detail in the above description of the behavior recognition-based training intensity monitoring method with reference to fig. 1 to 4, and thus, repetitive descriptions thereof will be omitted.
As described above, the training intensity monitoring system 100 based on behavior recognition according to the embodiment of the present application may be implemented in various wireless terminals, such as a server or the like having a training intensity monitoring algorithm based on behavior recognition. In one possible implementation, the behavior recognition based training intensity monitoring system 100 according to embodiments of the present application may be integrated into the wireless terminal as a software module and/or hardware module. For example, the behavior recognition based training intensity monitoring system 100 may be a software module in the operating system of the wireless terminal or may be an application developed for the wireless terminal; of course, the behavior recognition based training intensity monitoring system 100 could equally be one of many hardware modules of the wireless terminal.
Alternatively, in another example, the behavior-recognition-based training intensity monitoring system 100 and the wireless terminal may be separate devices, and the behavior-recognition-based training intensity monitoring system 100 may be connected to the wireless terminal through a wired and/or wireless network and transmit interaction information in an agreed data format.
Fig. 6 shows an application scenario diagram of a training intensity monitoring method based on behavior recognition according to an embodiment of the present application. As shown in fig. 6, in this application scenario, first, a training state monitoring video (e.g., D illustrated in fig. 6) of a monitored predetermined object acquired by a camera for a predetermined period of time is acquired, and then the training state monitoring video is input to a server (e.g., S illustrated in fig. 6) in which a training intensity monitoring algorithm based on behavior recognition is deployed, wherein the server is capable of processing the training state monitoring video using the training intensity monitoring algorithm based on behavior recognition to obtain a classification result for representing a fatigue level label.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The embodiments of the present application have been described above, the foregoing description is exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the improvement of technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. A training intensity monitoring method based on behavior recognition, comprising:
acquiring training state monitoring videos of a monitored preset object acquired by a camera in a preset time period;
extracting training state time sequence characteristics of the training state monitoring video to obtain a sequence of a training state time sequence associated feature map;
constructing training state change characteristics from the sequence of the training state time sequence association characteristic diagram to obtain training state semantic change time sequence characteristic vectors; and
determining the fatigue degree level based on the training state semantic change time sequence feature vector.
2. The behavior recognition-based training intensity monitoring method according to claim 1, wherein extracting training state timing characteristics of the training state monitoring video to obtain a sequence of training state timing correlation characteristic diagrams comprises:
performing data preprocessing on the training state monitoring video to obtain a sequence of training state monitoring video fragments; and
extracting features of the sequence of the training state monitoring video segments by using a deep learning network model to obtain the sequence of the training state time sequence associated feature map.
3. The behavior recognition-based training intensity monitoring method of claim 2, wherein the training state monitoring video is subjected to data preprocessing to obtain a sequence of training state monitoring video segments, comprising:
carrying out video segment segmentation on the training state monitoring video to obtain the sequence of the training state monitoring video segments.
4. The training intensity monitoring method based on behavior recognition according to claim 3, wherein the deep learning network model is a training state time sequence correlation feature extractor based on a three-dimensional convolutional neural network model;
the feature extraction of the sequence of the training state monitoring video segments by using a deep learning network model to obtain the sequence of the training state time sequence associated feature map comprises the following steps:
passing the sequence of the training state monitoring video segments through the training state time sequence correlation feature extractor based on the three-dimensional convolutional neural network model to obtain the sequence of the training state time sequence associated feature map.
5. The behavior recognition-based training intensity monitoring method of claim 4, wherein constructing training state change features from the sequence of training state timing-related feature maps to obtain training state semantic change timing feature vectors comprises:
the sequence of the training state time sequence correlation characteristic diagram is passed through a self-attention correlation strengthening module to obtain the sequence of the self-strengthening training state time sequence correlation characteristic diagram;
performing feature distribution correction on the sequence of the self-strengthening training state time sequence associated feature map to obtain a corrected sequence of the self-strengthening training state time sequence associated feature map; and
calculating training state change semantic coefficients between every two adjacent corrected self-strengthening training state time sequence associated feature graphs in the sequence of corrected self-strengthening training state time sequence associated feature graphs to obtain the training state semantic change time sequence feature vector composed of the training state change semantic coefficients.
6. The behavior recognition based training intensity monitoring method of claim 5, wherein passing the sequence of training state timing related feature maps through a self-attention related reinforcement module to obtain the sequence of self-reinforced training state timing related feature maps comprises:
respectively passing the sequence of the training state time sequence associated feature map through a first convolution layer of the self-attention associated strengthening module to obtain a sequence of a first feature map;
respectively passing the sequence of the first feature map through a second convolution layer of the self-attention association strengthening module to obtain a sequence of a second feature map;
respectively expanding the sequence of the second feature map into feature vectors along each feature matrix of the channel dimension to obtain a sequence of a plurality of first feature vectors;
respectively calculating cosine similarity between any two first feature vectors in the sequences of the first feature vectors to obtain a sequence of cosine similarity feature graphs;
respectively carrying out normalization processing on the sequences of the cosine similarity feature graphs through a Softmax function to obtain the sequences of the normalized cosine similarity feature graphs;
multiplying the sequence of the normalized cosine similarity feature map and the sequence of the cosine similarity feature map according to position points to obtain a sequence of a similarity mapping optimization feature map;
respectively passing the sequences of the similarity mapping optimization feature graphs through a first deconvolution layer of the self-attention association strengthening module to obtain sequences of first deconvolution feature graphs;
respectively calculating the element-by-element sum of the sequence of the first deconvolution feature map and the sequence of the first feature map to obtain a sequence of a first fusion feature map;
respectively passing the sequence of the first fusion feature map through a second deconvolution layer of the self-attention association strengthening module to obtain a sequence of a second deconvolution feature map; and
respectively calculating element-by-element sums of the sequence of the second deconvolution feature map and the sequence of the training state time sequence associated feature map to obtain the sequence of the self-strengthening training state time sequence associated feature map.
7. The behavior recognition-based training intensity monitoring method of claim 6, wherein calculating training state change semantic coefficients between every two adjacent post-correction self-strengthening training state timing correlation feature graphs in the sequence of post-correction self-strengthening training state timing correlation feature graphs to obtain the training state semantic change timing feature vector composed of training state change semantic coefficients, comprises:
calculating training state change semantic coefficients between every two adjacent corrected self-strengthening training state time sequence associated feature graphs in the sequence of the corrected self-strengthening training state time sequence associated feature graphs according to a change semantic coefficient calculation formula, so as to obtain the training state semantic change time sequence feature vector composed of the training state change semantic coefficients;
the change semantic coefficient calculation formula is as follows:wherein (1)>For the previous corrected self-strengthening training state time sequence correlation characteristic diagram +.>Characteristic value of the location->For the later self-strengthening training state timing related feature map after correction +.>Characteristic value of the location->And->For each of the height and width of the corrected self-strengthening training state timing related feature map, +.>For each channel dimension of the corrected self-strengthening training state time sequence associated feature map, +.>Is->Individual training state change semantic coefficients,/->A logarithmic function operation with a base of 2 is represented.
8. The behavior recognition based training intensity monitoring method of claim 7, wherein determining a fatigue level based on the training state semantic change timing feature vector comprises:
passing the training state semantic change time sequence feature vector through a classifier to obtain a classification result, wherein the classification result is used for representing a fatigue degree grade label.
9. A training intensity monitoring system based on behavior recognition, comprising:
the video acquisition module is used for acquiring training state monitoring videos of the monitored preset object acquired by the camera in a preset time period;
the training state time sequence feature extraction module is used for extracting training state time sequence features of the training state monitoring video to obtain a sequence of a training state time sequence associated feature map;
the training state change feature construction module is used for constructing training state change features from the sequence of the training state time sequence associated feature map so as to obtain training state semantic change time sequence feature vectors; and
the fatigue degree grade analysis module is used for determining the fatigue degree grade based on the training state semantic change time sequence feature vector.
10. The behavior recognition based training intensity monitoring system of claim 9, wherein the training state timing feature extraction module comprises:
the data preprocessing unit is used for carrying out data preprocessing on the training state monitoring video to obtain a sequence of training state monitoring video fragments; and
the feature extraction unit is used for carrying out feature extraction on the sequence of the training state monitoring video clips by using a deep learning network model so as to obtain the sequence of the training state time sequence associated feature map.
CN202311663182.6A 2023-12-06 2023-12-06 Training intensity monitoring system and method based on behavior recognition Withdrawn CN117576781A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311663182.6A CN117576781A (en) 2023-12-06 2023-12-06 Training intensity monitoring system and method based on behavior recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311663182.6A CN117576781A (en) 2023-12-06 2023-12-06 Training intensity monitoring system and method based on behavior recognition

Publications (1)

Publication Number Publication Date
CN117576781A true CN117576781A (en) 2024-02-20

Family

ID=89860771

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311663182.6A Withdrawn CN117576781A (en) 2023-12-06 2023-12-06 Training intensity monitoring system and method based on behavior recognition

Country Status (1)

Country Link
CN (1) CN117576781A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117942045A (en) * 2024-03-27 2024-04-30 吉林大学 Intelligent anesthesia drug administration control system and method based on artificial intelligence

Similar Documents

Publication Publication Date Title
CN110348319B (en) Face anti-counterfeiting method based on face depth information and edge image fusion
CN107766894B (en) Remote sensing image natural language generation method based on attention mechanism and deep learning
KR101977174B1 (en) Apparatus, method and computer program for analyzing image
CN110929622A (en) Video classification method, model training method, device, equipment and storage medium
KR20200022739A (en) Method and device to recognize image and method and device to train recognition model based on data augmentation
CN111709311A (en) Pedestrian re-identification method based on multi-scale convolution feature fusion
CN113111968B (en) Image recognition model training method, device, electronic equipment and readable storage medium
WO2020088491A1 (en) Method, system, and device for classifying motion behavior mode
CN112560829B (en) Crowd quantity determination method, device, equipment and storage medium
CN117576781A (en) Training intensity monitoring system and method based on behavior recognition
CN117037427B (en) Geological disaster networking monitoring and early warning system
CN111126155B (en) Pedestrian re-identification method for generating countermeasure network based on semantic constraint
CN116110089A (en) Facial expression recognition method based on depth self-adaptive metric learning
Wang et al. Distortion recognition for image quality assessment with convolutional neural network
CN117198468A (en) Intervention scheme intelligent management system based on behavior recognition and data analysis
CN117078670B (en) Production control system of cloud photo frame
CN113762041A (en) Video classification method and device, computer equipment and storage medium
CN117257302A (en) Personnel mental health state assessment method and system
Amin A face recognition system based on deep learning (FRDLS) to support the entry and supervision procedures on electronic exams
Sharma et al. Fingerprint image quality assessment and scoring
Li et al. Research on hybrid information recognition algorithm and quality of golf swing
CN114841887A (en) Image restoration quality evaluation method based on multi-level difference learning
Cao et al. No-reference image quality assessment by using convolutional neural networks via object detection
CN113313210A (en) Method and apparatus for data processing
Manap et al. Nonparametric quality assessment of natural images

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
WW01 Invention patent application withdrawn after publication
WW01 Invention patent application withdrawn after publication

Application publication date: 20240220