CN112307898A - Data glove action segmentation method and device in virtual reality environment - Google Patents

Data glove action segmentation method and device in virtual reality environment

Info

Publication number
CN112307898A
Authority
CN
China
Prior art keywords
spatial information
dimensional spatial
dimensional
neural network
matrices
Prior art date
Legal status
Pending
Application number
CN202011032007.3A
Other languages
Chinese (zh)
Inventor
金淼
张军
黄天富
郭志伟
吴志武
张颖
雷民
陈习文
陈卓
卢冰
汪泉
王斯琪
王旭
聂高宁
周玮
付济良
齐聪
郭子娟
余雪芹
刘俊
郭鹏
朱赤丹
伍翔
周志森
Current Assignee
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
State Grid Fujian Electric Power Co Ltd
Marketing Service Center of State Grid Fujian Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
State Grid Fujian Electric Power Co Ltd
Marketing Service Center of State Grid Fujian Electric Power Co Ltd
Priority date
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, China Electric Power Research Institute Co Ltd CEPRI, State Grid Fujian Electric Power Co Ltd, Marketing Service Center of State Grid Fujian Electric Power Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN202011032007.3A
Publication of CN112307898A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/49Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • G06T19/006Mixed reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/20Scenes; Scene-specific elements in augmented reality scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a data glove motion segmentation method and device for a virtual reality environment. The method processes the motion data of a data glove in the virtual reality environment, acquired during an operation, into M groups of high-dimensional spatial information matrices corresponding in sequence to M frame time slices; from the N-th group to the M-th group, it calculates one by one the probability that the k-th group of high-dimensional spatial information matrices belongs to the multi-dimensional cloud model determined by the preceding Q groups of high-dimensional spatial information matrices; when the probability is less than a preset value, the sampling time corresponding to that group of matrices MM_k is determined to be a time division point; and, according to all the F determined time division points, the motion data of the data glove acquired during the operation are divided in sequence into (F+1) relatively independent motions. The method segments the motions accurately, with high precision, strong objectivity, a high degree of automation and high processing efficiency.

Description

Data glove action segmentation method and device in virtual reality environment
Technical Field
The invention belongs to the technical field of virtual reality, and particularly relates to a data glove motion segmentation method and device in a virtual reality environment.
Background
In an operation training virtual reality system, standard actions are required as a yardstick in order to evaluate how normative the trainees' operating actions are.
Currently, in the virtual environment of Virtual Reality (VR), data gloves are used to acquire data such as the coordinate offsets and rotation angles of the operator's wrist and fingers and of the operated object. This data can objectively represent the operator's actions, for example as a visualization on a three-dimensional model after data reconstruction.
At present, standard operations are segmented manually, which takes a long time and is highly subjective; the working efficiency is low and the objectivity of the segmentation result is insufficient.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a data glove motion segmentation method in a virtual reality environment and aims to solve the problems of low efficiency, insufficient objectivity and the like of the data glove motion segmentation method in the virtual reality environment in the prior art.
In a first aspect, the present invention provides a method for segmenting data glove actions in a virtual reality environment, including:
processing the motion data of the data glove in the virtual reality environment acquired during the operation, at a sampling interval T, into M groups of high-dimensional spatial information matrices MM_j corresponding in sequence to the M frame time slices, wherein 1 ≤ j ≤ M, and the duration of the operation is recorded as M × T;
starting from the N-th group of high-dimensional spatial information matrices and proceeding to the M-th group,
calculating one by one the probability P_k that the k-th group of high-dimensional spatial information matrices MM_k belongs to the multi-dimensional cloud model determined by the preceding Q groups of high-dimensional spatial information matrices, wherein N ≤ k ≤ M and 2 ≤ Q ≤ (k-1); and
when the probability P_k is greater than the preset value, determining that the group of high-dimensional spatial information matrices MM_k belongs to the same action as the preceding Q groups; and
when the probability P_k is less than the preset value, determining that the group of high-dimensional spatial information matrices MM_k does not belong to the same action as the preceding Q groups, and that the sampling time corresponding to MM_k is a time division point;
and, according to all the F determined time division points, sequentially dividing the motion data of the data glove in the virtual reality environment acquired during the operation into (F+1) relatively independent motions.
Further,
calculating the probability P_k that the k-th group of high-dimensional spatial information matrices MM_k belongs to the multi-dimensional cloud model determined by the preceding Q groups of high-dimensional spatial information matrices comprises:
determining the Q groups of high-dimensional spatial information matrices preceding the k-th group MM_k: MM_{k-Q}, MM_{k-Q+1}, …, MM_{k-1};
respectively training a convolutional neural network with the (Q-s) groups of high-dimensional spatial information matrices preceding the k-th group MM_k and predicting with the convolutional neural network the high-dimensional spatial information matrix EE_{Q-s} corresponding to the k-th frame time slice, wherein 0 ≤ s ≤ S, S is the shortest training length, and 1 < S < Q;
establishing a multi-dimensional cloud model CL_k from all (S+1) high-dimensional spatial information matrices corresponding to the k-th frame time slice; and
calculating the probability P_k that the k-th group of high-dimensional spatial information matrices MM_k belongs to the multi-dimensional cloud model CL_k.
Further,
training the convolutional neural network with the (Q-s) groups of high-dimensional spatial information matrices preceding the k-th group MM_k and predicting with it the high-dimensional spatial information matrix EE_{Q-s} corresponding to the k-th frame time slice comprises:
training the convolutional neural network with the Q groups of high-dimensional spatial information matrices preceding MM_k and predicting with it the high-dimensional spatial information matrix EE_Q corresponding to the k-th frame time slice;
training the convolutional neural network with the (Q-1) groups preceding MM_k and predicting with it the high-dimensional spatial information matrix EE_{Q-1} corresponding to the k-th frame time slice;
successively decreasing the number of high-dimensional spatial information matrices, training the convolutional neural network with the (Q-s) groups preceding MM_k and predicting the corresponding matrix EE_{Q-s}, until s increases to S-1; and
training the convolutional neural network with the (Q-S) groups preceding MM_k and predicting with it the high-dimensional spatial information matrix EE_{Q-S} corresponding to the k-th frame time slice.
Further,
determining the Q groups of high-dimensional spatial information matrices preceding the k-th group MM_k comprises:
removing, from the k groups of high-dimensional spatial information matrices up to MM_k, the high-dimensional spatial information matrices corresponding to the frame time slices preceding all the time division points already determined, to obtain the Q groups of high-dimensional spatial information matrices preceding the k-th group MM_k.
Further,
establishing the multi-dimensional cloud model CL_k from all (S+1) high-dimensional spatial information matrices corresponding to the k-th frame time slice comprises:
determining, from all (S+1) high-dimensional spatial information matrices corresponding to the k-th frame time slice, the expectation, entropy and hyper-entropy of a multi-dimensional normal forward cloud generator described by the (S+1) cloud droplets, wherein the hyper-entropy He ≤ 0.5; and
establishing, from the expectation, entropy and hyper-entropy of the multi-dimensional normal forward cloud generator, the multi-dimensional normal cloud model CL_k corresponding to all (S+1) high-dimensional spatial information matrices of the k-th frame time slice.
Further,
processing the motion data of the data glove in the virtual reality environment acquired during the operation, at the sampling interval T, into the M groups of high-dimensional spatial information matrices MM_j corresponding in sequence to the M frame time slices comprises:
determining, according to the spatial position relationship of the sensors in the data glove, high-dimensional spatial information matrices of 5 columns and 6 rows, ordered from the thumb to the little finger;
wherein the end of an action is represented by E, and E serves as a redundant item for judging the action edge information.
Further,
when each high-dimensional spatial information matrix has 5 columns and 6 rows, the structure of the convolutional neural network to be trained is, from input to output: an input layer, a convolutional layer, a pooling layer, a fully connected layer and an output layer;
the convolution kernel of the convolutional layer is of size 6 × 2 with a sliding step of 3.
Further,
when the convolutional neural network is trained, the following evaluation indexes are adopted for selecting the training precision: the mean absolute error MAE and the mean relative error MRE,

$$\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|f_i-\hat{f}_i\right|,\qquad \mathrm{MRE}=\frac{1}{n}\sum_{i=1}^{n}\frac{\left|f_i-\hat{f}_i\right|}{f_i},$$

where $f_i$ is the actual displacement value, $\hat{f}_i$ is the predicted displacement value, and $n$ is the number of test samples.
Further,
the data glove comprises a plurality of sensors arranged at predetermined positions or joints of the hand, each sensor acquiring position or coordinate data in multiple dimensions.
In a second aspect, the present invention provides a device for segmenting the motion of a data glove in a virtual reality environment, comprising:
an action data processing unit, configured to: process the motion data of the data glove in the virtual reality environment acquired during the operation, at the sampling interval T, into M groups of high-dimensional spatial information matrices MM_j corresponding in sequence to the M frame time slices, wherein 1 ≤ j ≤ M and the duration of the operation is recorded as M × T;
a time division point determination unit, configured to: starting from the N-th group of high-dimensional spatial information matrices and proceeding to the M-th group,
calculate one by one the probability P_k that the k-th group of high-dimensional spatial information matrices MM_k belongs to the multi-dimensional cloud model determined by the preceding Q groups of high-dimensional spatial information matrices, wherein N ≤ k ≤ M and 2 ≤ Q ≤ (k-1); and
when the probability P_k is greater than the preset value, determine that the group of high-dimensional spatial information matrices MM_k belongs to the same action as the preceding Q groups; and
when the probability P_k is less than the preset value, determine that the group MM_k does not belong to the same action as the preceding Q groups, and that the sampling time corresponding to MM_k is a time division point;
an action division unit, configured to: according to all the F determined time division points, sequentially divide the motion data of the data glove acquired during the operation into (F+1) relatively independent motions.
The method and the device for segmenting the data glove actions in the virtual reality environment solve the problem of feature extraction of high-density data for recording action tracks generated in operation in the virtual reality environment, can accurately segment the actions, have high precision, strong objectivity, high automation degree and high processing efficiency, and can be used as standard actions of an operation training virtual reality system.
Drawings
A more complete understanding of exemplary embodiments of the present invention may be had by reference to the following drawings in which:
FIG. 1 is a flow chart of a data glove motion segmentation method in a virtual reality environment according to a preferred embodiment of the present invention;
FIG. 2 is a schematic diagram of a data glove motion segmentation apparatus in a virtual reality environment according to a preferred embodiment of the present invention;
FIG. 3 is a schematic view of joint nodes of a data glove in a virtual reality environment according to a preferred embodiment of the present invention;
FIG. 4 is a schematic diagram of a data matrix of a data glove at a certain time in a virtual reality environment according to a preferred embodiment of the present invention;
FIG. 5 is a schematic diagram of a convolutional neural network in accordance with a preferred embodiment of the present invention;
FIG. 6 is a schematic diagram of the digital features (He, Ex, En) of a multi-dimensional cloud model according to a preferred embodiment of the present invention;
FIG. 7 is a schematic diagram of virtual simulation of a feature frame in 4 segments of action features after division of an electroscopy operation in a virtual reality environment according to a preferred embodiment of the present invention;
FIG. 8 is a cloud model of x and y predicted results of a certain motion of a data glove in a virtual reality environment at a certain time in accordance with a preferred embodiment of the present invention;
fig. 9 is a membership degree result of each motion frame determined when an electroscopy operation is divided in a virtual reality environment in the preferred embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings; however, the present invention may be embodied in many different forms and is not limited to the embodiments described herein, which are provided so that the disclosure is thorough and complete and fully conveys the scope of the invention to those skilled in the art. The terminology used in the exemplary embodiments illustrated in the accompanying drawings is not intended to limit the invention. In the drawings, the same units/elements are denoted by the same reference numerals.
Unless otherwise defined, terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. In addition, it will be understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their context in the relevant art and will not be interpreted in an idealized or overly formal sense.
As shown in fig. 1, the method for segmenting the motion of a data glove in a virtual reality environment according to the present embodiment includes:
step S11: processing the motion data of the data glove in the virtual reality environment acquired during the operation, at the sampling interval T, into M groups of high-dimensional spatial information matrices MM_j corresponding in sequence to the M frame time slices, wherein 1 ≤ j ≤ M, and the duration of the operation is recorded as M × T;
step S12: starting from the N-th group of high-dimensional spatial information matrices and proceeding to the M-th group,
calculating one by one the probability P_k that the k-th group of high-dimensional spatial information matrices MM_k belongs to the multi-dimensional cloud model determined by the preceding Q groups of high-dimensional spatial information matrices, wherein N ≤ k ≤ M and 2 ≤ Q ≤ (k-1); and
when the probability P_k is greater than the preset value, determining that the group of high-dimensional spatial information matrices MM_k belongs to the same action as the preceding Q groups; and
when the probability P_k is less than the preset value, determining that the group MM_k does not belong to the same action as the preceding Q groups, and that the sampling time corresponding to MM_k is a time division point;
step S13: according to all the F determined time division points, sequentially dividing the motion data of the data glove in the virtual reality environment acquired during the operation into (F+1) relatively independent motions.
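Read together, steps S11 to S13 amount to a single pass over the frame sequence. The Python sketch below is a minimal illustration of that loop; the helper names (build_matrices, predict_frame_set, build_cloud_model, membership) are hypothetical stand-ins, each sketched individually later in this description, and preset_prob is a placeholder threshold.

```python
# Minimal sketch of steps S11-S13; the helpers named here are hypothetical
# stand-ins sketched later in this description, not the patent's own code.
def segment_actions(raw_samples, n_start, q_window, s_shortest, preset_prob):
    mm = build_matrices(raw_samples)          # step S11: one 15x6 matrix per frame
    division_points, segment_start = [], 0
    for k in range(n_start, len(mm)):         # step S12: test frames N..M
        window = mm[max(segment_start, k - q_window):k]
        if len(window) <= s_shortest:         # not enough history after a cut
            continue
        preds = predict_frame_set(window, s_shortest)   # (S+1) predicted frames
        cloud = build_cloud_model(preds)                # multi-dimensional cloud CL_k
        p_k = membership(mm[k], cloud)                  # probability P_k
        if p_k < preset_prob:                           # MM_k starts a new action
            division_points.append(k)
            segment_start = k
    # step S13: F division points cut the record into F+1 independent actions
    bounds = [0] + division_points + [len(mm)]
    return [mm[a:b] for a, b in zip(bounds[:-1], bounds[1:])]
```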
In particular, calculating the probability P_k that the k-th group of high-dimensional spatial information matrices MM_k belongs to the multi-dimensional cloud model determined by the preceding Q groups comprises:
determining the Q groups of high-dimensional spatial information matrices preceding the k-th group MM_k: MM_{k-Q}, MM_{k-Q+1}, …, MM_{k-1};
respectively training a convolutional neural network with the (Q-s) groups of high-dimensional spatial information matrices preceding MM_k and predicting with it the high-dimensional spatial information matrix EE_{Q-s} corresponding to the k-th frame time slice, wherein 0 ≤ s ≤ S, S is the shortest training length, and 1 < S < Q;
establishing a multi-dimensional cloud model CL_k from all (S+1) high-dimensional spatial information matrices corresponding to the k-th frame time slice; and
calculating the probability P_k that the k-th group of high-dimensional spatial information matrices MM_k belongs to the multi-dimensional cloud model CL_k.
In particular, training the convolutional neural network with the (Q-s) groups of high-dimensional spatial information matrices preceding the k-th group MM_k and predicting with it the high-dimensional spatial information matrix EE_{Q-s} corresponding to the k-th frame time slice comprises:
training the convolutional neural network with the Q groups of high-dimensional spatial information matrices preceding MM_k and predicting with it the matrix EE_Q corresponding to the k-th frame time slice;
training the convolutional neural network with the (Q-1) groups preceding MM_k and predicting with it the matrix EE_{Q-1} corresponding to the k-th frame time slice;
successively decreasing the number of high-dimensional spatial information matrices, training the convolutional neural network with the (Q-s) groups preceding MM_k and predicting the corresponding matrix EE_{Q-s}, until s increases to S-1; and
training the convolutional neural network with the (Q-S) groups preceding MM_k and predicting with it the matrix EE_{Q-S} corresponding to the k-th frame time slice.
In particular, determining the Q groups of high-dimensional spatial information matrices preceding the k-th group MM_k comprises:
removing, from the k groups of high-dimensional spatial information matrices up to MM_k, the matrices corresponding to the frame time slices preceding all the time division points already determined, to obtain the Q groups of high-dimensional spatial information matrices preceding MM_k.
Specifically, establishing the multi-dimensional cloud model CL_k from all (S+1) high-dimensional spatial information matrices corresponding to the k-th frame time slice comprises:
determining, from all (S+1) matrices corresponding to the k-th frame time slice, the expectation, entropy and hyper-entropy of a multi-dimensional normal forward cloud generator described by the (S+1) cloud droplets, wherein the hyper-entropy He ≤ 0.5; and
establishing, from the expectation, entropy and hyper-entropy of the multi-dimensional normal forward cloud generator, the multi-dimensional normal cloud model CL_k corresponding to all (S+1) matrices of the k-th frame time slice.
Specifically, processing the motion data of the data glove in the virtual reality environment acquired during the operation, at the sampling interval T, into the M groups of high-dimensional spatial information matrices MM_j corresponding in sequence to the M frame time slices comprises:
determining, according to the spatial position relationship of the sensors in the data glove, high-dimensional spatial information matrices of 5 columns and 6 rows, ordered from the thumb to the little finger;
wherein the end of an action is represented by E, and E serves as a redundant item for judging the action edge information.
Specifically, when each high-dimensional spatial information matrix has 5 columns and 6 rows, the structure of the convolutional neural network to be trained is, from input to output: an input layer, a convolutional layer, a pooling layer, a fully connected layer and an output layer;
the convolution kernel of the convolutional layer is of size 6 × 2 with a sliding step of 3.
Specifically, when the convolutional neural network is trained, the following evaluation indexes are adopted for selecting the training precision: the mean absolute error MAE and the mean relative error MRE,

$$\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|f_i-\hat{f}_i\right|,\qquad \mathrm{MRE}=\frac{1}{n}\sum_{i=1}^{n}\frac{\left|f_i-\hat{f}_i\right|}{f_i},$$

where $f_i$ is the actual displacement value, $\hat{f}_i$ is the predicted displacement value, and $n$ is the number of test samples.
In particular, the data glove comprises a plurality of sensors for positioning at respective predetermined positions or joints of the hand, each sensor for acquiring position or coordinate data in a plurality of dimensions.
As shown in fig. 2, the device for segmenting the motion of a data glove in a virtual reality environment according to the present embodiment includes:
an action data processing unit 100, configured to: process the motion data of the data glove in the virtual reality environment acquired during the operation, at the sampling interval T, into M groups of high-dimensional spatial information matrices MM_j corresponding in sequence to the M frame time slices, wherein 1 ≤ j ≤ M and the duration of the operation is recorded as M × T;
a time division point determining unit 200, configured to: starting from the N-th group of high-dimensional spatial information matrices and proceeding to the M-th group,
calculate one by one the probability P_k that the k-th group of high-dimensional spatial information matrices MM_k belongs to the multi-dimensional cloud model determined by the preceding Q groups of high-dimensional spatial information matrices, wherein N ≤ k ≤ M and 2 ≤ Q ≤ (k-1); and
when the probability P_k is greater than the preset value, determine that the group of high-dimensional spatial information matrices MM_k belongs to the same action as the preceding Q groups; and
when the probability P_k is less than the preset value, determine that the group MM_k does not belong to the same action as the preceding Q groups, and that the sampling time corresponding to MM_k is a time division point;
an action dividing unit 300, configured to: according to all the F determined time division points, sequentially divide the motion data of the data glove in the virtual reality environment acquired during the operation into (F+1) relatively independent motions.
The method uses a convolutional neural network to extract features from the multi-dimensional data corresponding to several consecutive time frames, and uses the extracted features to predict the multi-dimensional data of the next time frame; several prediction results, obtained beforehand from different but mutually associated multi-dimensional data, form a result set from which a multi-dimensional cloud model is constructed; and the multi-dimensional cloud model is used to calculate the probability that the actually measured multi-dimensional data of the next time frame belongs with the multi-dimensional data of the preceding time frames, thereby determining whether the next time frame belongs to the action formed by those preceding frames.
It should be understood that the data glove includes a plurality of sensors for placement at predetermined locations or joints of the hand, each sensor for acquiring position or coordinate data in a plurality of dimensions. The operator wears the data glove to generate different actions in the set operation, and accordingly, the data acquired by all the sensors in the data glove is a high-density data set with compact density in space.
The segmentation method solves the problem of feature extraction of high-density data which are generated in the operation under the virtual reality environment and used for recording the action track, can accurately segment the action, has high precision and strong objectivity, and can be used as the standard action of an operation training virtual reality system.
Step 1, preprocessing data of multiple sensors in a data glove
As shown in fig. 3, the palm sensor O is spatially adjacent to the finger sensors (a, b, c, d, e, in order from left to right), i.e. to the sensors at the finger roots of the Thumb, Index, Middle, Ring and Pinky fingers; the remaining sensors are adjacent to the sensors at the corresponding positions of the same finger and of the neighbouring fingers. The joints at the finger roots are adjacent to the wrist, and the other joints of each finger are spatially adjacent to the corresponding joints of the neighbouring fingers. Overall, there are 24 sensors in the data glove shown in FIG. 3.
As shown in fig. 4, the model hand data are arranged as an image according to the spatial position relationship of the sensors in the data glove. With the representative letters for the thumb to the little finger being a, b, c, d and e, the corresponding finger joint points or sensor position nodes in the matrix are S, a2, a3, a4, E; b, b1, b2, b3, b4, E; c, c1, c2, c3, c4, E; d, d1, d2, d3, d4, E; and e, e1, e2, e3, e4, E, completing the matrix construction of the sensor data. Specifically, a1 is absent because the thumb has one sensor node fewer in the data glove. Overall, apart from S and E, the nodes correspond to the 24 sensors of the data glove shown in FIG. 3.
The first of the two joint points at the wrist is assigned to the thumb as the initial data item; it is expressed by S (start) and serves as filler data marking the start of the matrix. The last node, the end item, is expressed by E (end) and is placed in the data matrix as a redundant item used for judging the action edge information. The spatial positions recorded for each time frame are processed into a two-dimensional matrix of size 5 × 6, thirty data nodes in total; each node contains the coordinate offsets in the x, y and z directions (relative offsets from the origin of coordinates) recording the spatial information/motion data. The input data of the subsequent multi-dimensional cloud model and of each CNN model are therefore 15 × 6 two-dimensional matrices.
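As a concrete illustration of this preprocessing, the following Python/NumPy sketch packs one frame's per-node offsets into a 15 × 6 matrix, with each finger occupying three consecutive rows (x, y, z) and each joint one column. The node labels, the placement of S in the thumb column, and the frame_sample dict are assumptions made for the example, not details fixed by the text.

```python
import numpy as np

# Hypothetical node labels per finger following Fig. 4; a1 is absent, so S is
# assumed to fill the thumb column (an assumption, not stated in the text).
NODE_LAYOUT = [
    ["S", "a", "a2", "a3", "a4", "E"],   # thumb
    ["b", "b1", "b2", "b3", "b4", "E"],  # index
    ["c", "c1", "c2", "c3", "c4", "E"],  # middle
    ["d", "d1", "d2", "d3", "d4", "E"],  # ring
    ["e", "e1", "e2", "e3", "e4", "E"],  # little
]

def build_matrix(frame_sample):
    """Pack one frame's (x, y, z) offsets into a 15 x 6 matrix."""
    mm = np.zeros((15, 6))
    for f, labels in enumerate(NODE_LAYOUT):
        for j, label in enumerate(labels):
            # S and E are filler/redundant items and default to zero offsets.
            x, y, z = frame_sample.get(label, (0.0, 0.0, 0.0))
            mm[3 * f:3 * f + 3, j] = (x, y, z)
    return mm

def build_matrices(raw_samples):
    """One 15 x 6 matrix per frame, stacked into shape (M, 15, 6)."""
    return np.stack([build_matrix(s) for s in raw_samples])
```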
Step 2, convolutional neural network and training
In combination with the characteristics of the motion data, the spatial information prediction convolutional neural network shown in fig. 5 is set up. Its structure, from the input layer to the output layer, comprises in order: an input layer, a convolutional layer, a pooling layer, a fully connected layer and an output layer.
Specifically, each element constituting a convolution kernel corresponds to a weight coefficient and a bias vector, similar to a neuron of a feedforward neural network. The fully connected layer in the convolutional neural network is equivalent to the hidden layer in a traditional feedforward neural network. Owing to feature invariance, the output of the convolutional layer is aggregated by the pooling layer, which helps prevent the convolutional neural network model from over-fitting during training. Finally, the result is output through the fully connected layer.
Spatial information prediction convolutional neural network
Specifically, after raw data is preprocessed, the result is input to an input layer of a convolutional neural network. The size of a convolution kernel and the convolution kernel sliding step length are set on the convolution layer, the convolution kernel is output to the pooling layer after being processed by the activation function, the feature mapping result of the convolution is sampled and processed in the pooling layer, and the dimension is reduced so as to reduce the algorithm complexity. And outputting the processing result of the pooling layer to a full-connection layer, fusing the characteristics obtained by the previous calculation by the full-connection layer, and outputting the characteristics to an output layer after the characteristics are subjected to the action of a weighted sum and an activation function.
The output layer of the spatial information prediction convolutional neural network is designed to output the motion data of the data glove for one time frame, that is, each node contains the coordinate offsets of the spatial motion in x, y, z (relative offsets with respect to the origin of coordinates). The coordinate offset and rotation angle of each sensor together express the spatial information of the glove at that moment.
After the structure of a Convolutional Neural Network (CNN) is determined, the parameter setting of the Convolutional Neural Network determines the quality of a prediction result. When the convolutional neural network is trained, a trained framework for predicting the motion of the previous time frame is adopted, and modification is carried out on the basis of the framework.
Specifically, the parameters are set according to engineering practice and experience, then adjusted, and a reasonable model structure is determined through trial and error.
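For illustration, the sketch below expresses this structure in Python, assuming a PyTorch implementation. Only the layer order and the 6 × 2 kernel with sliding step 3 come from the text; the channel count, activation and hidden width are illustrative guesses.

```python
import torch
import torch.nn as nn

class SpatialInfoCNN(nn.Module):
    """Input -> conv -> pool -> fully connected -> output, as in Fig. 5."""
    def __init__(self):
        super().__init__()
        # 15x6 input, 6x2 kernel, sliding step 3 -> 4x2 feature maps
        self.conv = nn.Conv2d(1, 8, kernel_size=(6, 2), stride=3)
        self.pool = nn.MaxPool2d(kernel_size=(2, 2))  # 4x2 -> 2x1
        self.fc = nn.Linear(8 * 2 * 1, 15 * 6)        # predict the next frame

    def forward(self, x):          # x: (batch, 1, 15, 6)
        x = torch.relu(self.conv(x))
        x = self.pool(x)
        x = x.flatten(start_dim=1)
        return self.fc(x).view(-1, 15, 6)
```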
Specifically, the step of setting the training data set is as follows:
Let the spatial information/motion data corresponding to a section of operation be given, and record the spatial information of the first k time frames in sequence as T_1, T_2, …, T_k. The spatial information of these k frames is now used to predict the spatial information at time T_{k+1}.
The spatial information of the k frames before time T_{k+1} is arranged into training data sets of different lengths, used to train the spatial information prediction convolutional neural network, as follows:
take T_1, T_2, …, T_k as training data set 1, and obtain by convolutional neural network training the spatial information prediction result M_1 for frame T_{k+1};
take T_2, T_3, …, T_k as training data set 2, and obtain by convolutional neural network training the spatial information prediction result M_2 for frame T_{k+1};
continue in the same way until
taking T_{k-n}, T_{k-n+1}, …, T_k as training data set (k-n), obtaining by convolutional neural network training the spatial information prediction result M_{k-n} for frame T_{k+1};
wherein the shortest training length is (k-n), and training data set (k-n): T_{k-n}, T_{k-n+1}, …, T_k is the shortest training data set. The shortest training length is set to ensure the effectiveness of the training results of the spatial information prediction convolutional neural network.
At this point, M_1, M_2, …, M_{k-n} form the prediction result set for time T_{k+1}.
In the next step, this prediction result set for time T_{k+1} is used to construct the multi-dimensional cloud model; the spatial information data used to construct it comprise (k-n) spatial information samples, that is, the multi-dimensional cloud model is constructed from (k-n) cloud droplets.
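A sketch of how this prediction result set could be assembled is given below, reusing the SpatialInfoCNN sketch from step 2. The training objective (each frame mapped to its successor) and the fit routine are simplified assumptions; the patent does not fix the optimisation settings.

```python
import torch

def fit(model, window, epochs=50, lr=1e-3):
    """Train on consecutive frame pairs (T_i -> T_{i+1}) within the window."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    xs = torch.tensor(window[:-1], dtype=torch.float32).unsqueeze(1)
    ys = torch.tensor(window[1:], dtype=torch.float32)
    for _ in range(epochs):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(xs), ys)
        loss.backward()
        opt.step()
    return model

def predict_frame_set(window, shortest_len):
    """Train on T_1..T_k, T_2..T_k, ... (down to the shortest training set)
    and collect the predictions M_1, M_2, ... for frame T_{k+1}."""
    predictions = []
    last = torch.tensor(window[-1:], dtype=torch.float32).unsqueeze(1)
    for start in range(len(window) - shortest_len):
        model = fit(SpatialInfoCNN(), window[start:])
        with torch.no_grad():
            predictions.append(model(last).squeeze(0).numpy())
    return predictions
```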
Step 3, constructing a multi-dimensional cloud model
As shown in fig. 6, the cloud model characterizes the uncertainty of qualitative concepts in a quantitative domain of discourse using digital features consisting of the expectation, entropy and hyper-entropy. The expectation Ex is the center of all cloud droplets belonging to the same qualitative concept in the quantitative domain and represents the cloud droplet value that best expresses the qualitative concept.
The entropy En is a composite indicator of the randomness and fuzziness of a qualitative concept in the quantitative domain. On the one hand, entropy is similar to the variance of a normal distribution in probability theory, expressing the acceptable range of the qualitative concept in the quantitative domain; on the other hand, the larger the entropy, the larger the distribution range of the cloud droplets, and the broader and fuzzier the qualitative concept. The hyper-entropy He is a measure of the uncertainty of the membership degree. According to the definition of the cloud model, the stable tendency of the cloud droplets and their certainty degree form the expected curve of the cloud model, a normal curve with mean Ex and variance En. The degree of deviation between the cloud droplets and this expected curve is reflected by the hyper-entropy: the larger the hyper-entropy, the larger the deviation, which appears as a greater thickness of the cloud.
Each numerical feature of the cloud represents a numerical feature of the qualitative concept, denoted as C (Ex, En, He). This is the feature vector represented by the digital feature in the cloud model, i.e., the cloud vector. The digital nature of the cloud is unique in that Ex, En and He can be used to define a number of cloud droplets on a scale that forms the cloud and integrates the ambiguity and randomness of the concept.
Further, for the multi-dimensional cloud model, let U{x_1, x_2, …, x_N} denote an N-dimensional quantitative domain of discourse represented by exact numerical values, and let C be a qualitative concept on U{x_1, x_2, …, x_N}. If the quantitative value X ∈ U, X(x_1, x_2, …, x_N), is a random realization of the qualitative concept C satisfying

$$X(x_1,\dots,x_N)\sim N\!\left(Ex,\;(En')^{2}\right),\qquad En'(En'_1,\dots,En'_N)\sim N\!\left(En,\;He^{2}\right),$$

and the certainty degree of X(x_1, x_2, …, x_N) for C,

$$\mu\big(X(x_1,\dots,x_N)\big)=\exp\!\left(-\sum_{i=1}^{N}\frac{(x_i-Ex_i)^{2}}{2\,(En'_i)^{2}}\right)\in[0,1],$$

then the distribution of X(x_1, x_2, …, x_N) on U{x_1, x_2, …, x_N} is called an N-dimensional normal cloud.
Further, the N-dimensional normal forward cloud generator algorithm is implemented as follows:

(1) generate a normal random vector En'(En'_1, …, En'_N) with expectation En(En_1, …, En_N) and standard deviation He(He_1, …, He_N);

(2) generate a normal random vector X(x_1, …, x_N) with expectation Ex(Ex_1, …, Ex_N) and standard deviation En'(En'_1, …, En'_N);

(3) calculate the certainty degree

$$\mu\big(X(x_1,\dots,x_N)\big)=\exp\!\left(-\sum_{i=1}^{N}\frac{(x_i-Ex_i)^{2}}{2\,(En'_i)^{2}}\right);$$

then (X(x_1, …, x_N), μ(X(x_1, …, x_N))) is a cloud droplet of the normal cloud;
(4) repeating (1) to (3) until a preset number of cloud droplets are generated. It should be understood that the larger the number of data used for prediction, the more accurate the multidimensional cloud model is constructed.
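The generator transcribes directly into code. The Python/NumPy sketch below follows steps (1) to (4) above; the digital features are assumed to be given per dimension as arrays, and the small epsilon guarding against a near-zero En' is an implementation detail added for the example.

```python
import numpy as np

def forward_cloud(ex, en, he, n_drops, rng=None):
    """Generate n_drops droplets (X, mu(X)) of the cloud C(Ex, En, He)."""
    rng = rng or np.random.default_rng()
    drops = []
    for _ in range(n_drops):
        en_prime = np.abs(rng.normal(en, he)) + 1e-12  # step (1): En' ~ N(En, He^2)
        x = rng.normal(ex, en_prime)                   # step (2): X ~ N(Ex, En'^2)
        mu = np.exp(-np.sum((x - ex) ** 2 / (2 * en_prime ** 2)))  # step (3)
        drops.append((x, float(mu)))                   # step (4): repeat
    return drops
```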
The multi-dimensional cloud model established in the above steps is a model for a single time/single time frame. For example, the aforementioned convolutional neural network models yield the (k-n) spatial information prediction results for frame T_{k+1}; using these (k-n) samples of the spatial information of frame T_{k+1}, a multi-dimensional cloud model of time T_{k+1} can be established, describing the distribution of the spatial information of frame T_{k+1} in the multi-dimensional cloud space.
Step 4, action division
Determine, for each frame of the operation action video, its probability in the multi-dimensional cloud model, where the cloud model of each frame is determined by the spatial information of the continuous multiple frames preceding it;
and when the probability value of the spatial information of a certain frame in the corresponding cloud model is lower than the preset probability, judge that the time corresponding to that frame is a division point of the operation action.
And finally, performing action segmentation on the whole operation according to the starting time and the time corresponding to each segmentation point to obtain a plurality of segmented sub-actions.
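For the probability test itself, the certainty degree of a measured frame can be evaluated against the cloud's expected curve. The sketch below is an assumption rather than the patent's stated procedure: it averages the certainty over several draws of En', mirroring the forward generator, to obtain P_k for one frame.

```python
import numpy as np

def membership(frame, cloud, n_draws=100, rng=None):
    """Probability P_k that a measured frame belongs to cloud = (ex, en, he)."""
    ex, en, he = cloud
    rng = rng or np.random.default_rng()
    x = frame.ravel()
    mus = []
    for _ in range(n_draws):
        en_prime = np.abs(rng.normal(en, he)) + 1e-12
        mus.append(np.exp(-np.sum((x - ex) ** 2 / (2 * en_prime ** 2))))
    return float(np.mean(mus))
```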
Specifically, the electric power operation of electricity testing (electroscopy) was divided into actions with the segmentation method of this embodiment. For electricity testing, the overall operation flow is, in sequence: pick up the test pencil, move the test pencil to the distribution box under test, test the distribution box, and return the test pencil. Fig. 7 shows the virtual simulation images corresponding to the key time frames determined after manually dividing the overall electricity-testing workflow in the VR environment.
In conjunction with this workflow, the manual action segmentation result is compared with the automatic action segmentation performed by the segmentation method of this embodiment.
When the selected training data and verification data are used to train the parameters of the spatial information prediction convolutional neural network, two evaluation indexes are adopted for selecting the training precision, the mean absolute error (MAE) and the mean relative error (MRE):

$$\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|f_i-\hat{f}_i\right|,\qquad \mathrm{MRE}=\frac{1}{n}\sum_{i=1}^{n}\frac{\left|f_i-\hat{f}_i\right|}{f_i},$$

where $f_i$ is the actual displacement value, $\hat{f}_i$ is the predicted displacement value, and $n$ is the number of test samples.
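Both indexes transcribe directly into code; in the sketch below, f_true and f_pred are hypothetical arrays of the actual and predicted displacement values of the n test samples.

```python
import numpy as np

def mae(f_true, f_pred):
    """Mean absolute error over the n test samples."""
    return float(np.mean(np.abs(f_true - f_pred)))

def mre(f_true, f_pred):
    """Mean relative error over the n test samples."""
    return float(np.mean(np.abs(f_true - f_pred) / f_true))
```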
In the segmentation method of this embodiment, the experimental test results in the training process for the spatial information prediction convolutional neural network are listed in table 1.
TABLE 1 Comparison of number of test samples to prediction accuracy

Number of training samples | True displacement value | Predicted displacement value | Absolute error | Relative error
50  | 0.8 | 0.33 | 0.47 | 0.59
100 | 1.0 | 1.52 | 0.52 | 0.52
150 | 5.6 | 6.70 | 1.10 | 0.20
200 | 1.1 | 1.27 | 0.17 | 0.15
250 | 4.6 | 5.20 | 0.60 | 0.13
300 | 7.9 | 9.01 | 1.11 | 0.14
350 | 3.6 | 3.61 | 0.64 | 0.18
MAE: 0.66
MRE: 0.27
From these two parameters, the spatial information prediction convolutional neural network of this example has a good prediction effect; in particular, when the number of training samples is greater than 200, the relative error is stable. Therefore, the number of training samples is set to 300-600 (i.e. starting from the 300th-600th high-dimensional spatial information matrix, the probability P_k that each subsequent matrix belongs to the multi-dimensional cloud model determined by the preceding matrices is calculated), and the length interval (i.e. the number (k-n) of spatial information samples, associated with the shortest training length) is selected according to the number of cloud droplets required by the multi-dimensional cloud model.
Specifically, when the 15 × 6 two-dimensional matrix is used as the input, the convolutional layer operation is performed with a small number of convolution kernels, in line with repeated practice and experience, because the 15 × 6 matrix has small dimensions.
Preferably, the convolutional layer of the convolutional neural network model has a convolution kernel of size 6 × 2 and a sliding step of 3. The weights of each layer of the model are adjusted step by step through multiple experiments using the training data set. After the convolutional neural network has been trained with the training sample data, the connection weight and bias value of each node are determined. Some of these parameters are listed in Table 2.

TABLE 2 Weight values of the first convolutional layer of the CNN
Number | (1,1) | (1,2) | (1,3) | (1,4)
1 | 0.070605 | 0.002318 | -0.067692 | 0.004617
2 | 0.082346 | 0.006948 | -0.073171 | 0.009502
3 | 0.043874 | 0.003815 | -0.096552 | 0.079520
4 | -0.018687 | 0.048976 | -0.164631 | -0.070936
5 | -0.026614 | 0.084153 | -0.211651 | -0.023158
6 | -0.025934 | 0.036556 | -0.094665 | -0.045263
The multi-dimensional cloud model serves to comprehensively evaluate the position data/spatial information and is generated according to the characteristics of each datum; passing new position data through the multi-dimensional cloud yields the probability that the spatial information at that moment (i.e. the motion position of each sensor) conforms to the change of the operation action.
After the actions of the electric power operation training were summarized and analyzed in conjunction with the operation instructions, the following holds: while the human hand is moving, remaining still or moving within a certain range can be regarded as belonging to the same operation. For example, when the hand moves horizontally, it may shift in all directions; as long as the shift angle is less than 30°, it can still be regarded as the same operation action.
From all the results predicted for each time frame, a corresponding cloud model is built through the digital features of the cloud. Specifically, each time frame is predicted one by one, and a multi-dimensional cloud model is built from the multiple time frames preceding it; that is, as many cloud models are constructed as there are nodes to be predicted.
Specifically, when the multi-dimensional cloud model is established, the following steps are adopted:
(1) Determining the expectation Ex

For a variable with a certain range of variation VQ_a ∈ [B_min, B_max], Ex can be calculated as:

Ex = (B_min + B_max) / 2

where B_min and B_max are the minimum and maximum boundaries of the variable VQ_a.

For a variable with only a one-sided boundary, e.g. VQ_a ∈ [B_min, +∞) or VQ_a ∈ (-∞, B_max], the default boundary parameter or the expected value is determined from the variable's upper and lower bounds, and the digital feature Ex of the cloud is then calculated by the formula above.

(2) Determining the entropy En

Since the established multi-dimensional cloud model comprehensively considers the change of each variable, i.e. the variation value of each variable, the digital feature entropy En of the cloud is determined from the maximum range of variation; the En of a variable is fixed and determined by:

En = Ex / 3

where Ex is the expected value of the given variable. This formula is set according to the 3En rule.
(3) Determining the hyper-entropy He

He is usually chosen directly as a suitable constant, typically He ≤ 0.5. If He is greater than 0.5, the distance between cloud droplets becomes too large and the droplets too dispersed, so that the qualitative concept is no longer well represented.
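Combining the three rules above gives a per-dimension estimate of C(Ex, En, He) from the prediction result set. In the sketch below, reducing the (S+1) predicted matrices to per-dimension minimum/maximum boundaries and fixing He as a constant are assumptions made for the example.

```python
import numpy as np

def build_cloud_model(predictions, he_const=0.5):
    """Estimate (Ex, En, He) per dimension from the prediction result set."""
    samples = np.stack([p.ravel() for p in predictions])   # (S+1, N) cloud drops
    b_min, b_max = samples.min(axis=0), samples.max(axis=0)
    ex = (b_min + b_max) / 2          # rule (1): Ex = (Bmin + Bmax) / 2
    en = np.abs(ex) / 3               # rule (2): En = Ex / 3 (abs() is an assumption)
    he = np.full_like(ex, he_const)   # rule (3): He <= 0.5
    return ex, en, he
```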
According to the three characteristics, the cloud model can be generated in the space by utilizing an N-dimensional normal forward cloud generator algorithm. FIG. 8 illustrates a two-dimensional cloud model generated from x and y at a time.
As can be seen from fig. 8, the average offset of the joint movement of the model hand (including the palm) is about 5 mm. The membership degree on the vertical axis indicates with what probability the real displacement value of the point occurs at that moment; since the points in fig. 8 are distributed approximately within a width of 10 mm, the average deviation is approximately 5 mm, which corresponds to the position changes during actual work.
Based on the constructed multi-dimensional cloud model, the actual movement position is calculated and then compared in the cloud space to obtain the probability that the change of the actual movement position conforms to the power operation. Specifically, the first 400 of all acquired frames are used to train the convolutional neural network models; an independent three-dimensional cloud model, containing the offset characteristics in the x, y and z directions, is constructed for each subsequent frame (i.e. from the 401st frame onward); and, after the membership degrees of the cloud droplets are aggregated, the digital feature describing each cloud droplet is in fact a four-dimensional value. The probability membership degree of each frame's actual displacement value in the multi-dimensional cloud space is taken as the probability that the overall operation action switches/divides/changes at that frame.
As shown in fig. 8, the points with membership (i.e., probability) greater than 0.6 are dense, which indicates that when position data prediction is performed by using each convolutional neural network, the prediction result at most of the time/time frame conforms to the action change rule during power operation.
The variation of the distribution probability for motion segmentation is shown in fig. 9. By comparison, 5 division points are obtained (the 5 falling peaks in fig. 9), so the whole operation can be divided into 6 segments.
Comparing the times of the division-point frame numbers with the time points of the original motion process, and combining them with the overall operation flow, the operation can be segmented into the following actions: turning to the tool table (the video screenshots in fig. 7 do not show the tool table), moving, grabbing the test pencil, moving to the distribution box, contact-testing with the test pencil, returning to the tool table, and putting down the test pencil. This agrees exactly with the manually segmented sub-action sequence shown in fig. 7. Compared with manual segmentation, the segmentation method of this embodiment therefore offers high segmentation precision, strong objectivity, a high degree of automation and high processing efficiency.
In summary, the segmentation method of the embodiment has the following characteristics:
1. the multi-sensor data glove data are processed with a convolutional neural network to obtain a multi-dimensional data set for prediction, solving the problem that spatially dense multi-sensor data are difficult to apply;
2. a multi-dimensional cloud model is established for the predicted multi-dimensional data set; it characterizes the spatial position distribution and fully integrates the qualitative concepts and quantitative data of the data glove's movement during operation;
3. the segmentation time points are determined from the probability of the operation motion data in the corresponding multi-dimensional cloud model, and the operation actions in the virtual reality environment are segmented accordingly.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The invention has been described above with reference to a few embodiments. However, as would be apparent to a person skilled in the art from the appended claims, other embodiments than those disclosed above are equally possible within the scope of the invention.
Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to "a/an/the [device, component, etc.]" are to be interpreted openly as at least one instance of that device, component, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.

Claims (10)

1. A data glove action segmentation method in a virtual reality environment, characterized by comprising the following steps:
processing the motion data of the data glove in the virtual reality environment acquired during the operation process into M groups of high-dimensional spatial information matrices MM_j corresponding in sequence to M frame time slices according to a sampling interval t, wherein 1 ≤ j ≤ M and the duration of the operation process is denoted T = M·t;
starting from the N-th group of high-dimensional spatial information matrices and proceeding to the M-th group,
calculating one by one the probability P_k that the k-th group of high-dimensional spatial information matrices MM_k belongs to the multi-dimensional cloud model determined by the Q groups of high-dimensional spatial information matrices preceding it, wherein N ≤ k ≤ M and 2 ≤ Q ≤ (k-1), and
when the probability P_k is greater than a preset value, determining that this group of high-dimensional spatial information matrices MM_k belongs to the same action as the preceding Q groups of high-dimensional spatial information matrices; and
when the probability P_k is less than the preset value, determining that this group of high-dimensional spatial information matrices MM_k does not belong to the same action as the preceding Q groups of high-dimensional spatial information matrices, and that the sampling time corresponding to MM_k is a time division point; and
sequentially dividing the motion data of the data glove in the virtual reality environment acquired during the operation process into (F+1) relatively independent actions according to all F determined time division points.
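Purely as a reading aid, the control flow of claim 1 may be sketched as follows; membership_probability stands for the cloud-model computation of claims 2 to 5 and is passed in as a parameter, since claim 1 itself does not fix its implementation:

    def find_division_points(frames, membership_probability, N, Q, preset):
        """frames[k-1] is the k-th group MM_k; returns the F time division
        points; claim 1 guarantees N <= k <= M and 2 <= Q <= (k-1)."""
        points = []
        for k in range(N, len(frames) + 1):        # k runs from N to M
            history = frames[k - 1 - Q:k - 1]      # the Q groups preceding MM_k
            p_k = membership_probability(frames[k - 1], history)
            if p_k < preset:                       # MM_k opens a new action
                points.append(k)
        return points                              # F points -> (F+1) actions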
2. The method of claim 1, wherein calculating the probability P_k that the k-th group of high-dimensional spatial information matrices MM_k belongs to the multi-dimensional cloud model determined by the Q groups of high-dimensional spatial information matrices preceding it comprises:
determining the Q groups of high-dimensional spatial information matrices preceding the k-th group MM_k: MM_{k-Q}, MM_{k-Q+1}, …, MM_{k-1};
training a convolutional neural network with the (Q-s) groups of high-dimensional spatial information matrices preceding the k-th group MM_k, respectively, and predicting with the convolutional neural network the high-dimensional spatial information matrix EE_{Q-s} corresponding to the k-th frame time slice, wherein 0 ≤ s ≤ S, S is the shortest training length, and 1 < S < Q;
establishing a multi-dimensional cloud model CL_k from all (S+1) high-dimensional spatial information matrices corresponding to the k-th frame time slice;
calculating the probability P_k that the k-th group of high-dimensional spatial information matrices MM_k belongs to the multi-dimensional cloud model CL_k.
3. The method of claim 2, wherein training the convolutional neural network with the (Q-s) groups of high-dimensional spatial information matrices preceding the k-th group MM_k and predicting with the convolutional neural network the high-dimensional spatial information matrix EE_{Q-s} corresponding to the k-th frame time slice comprises:
training the convolutional neural network with the Q groups of high-dimensional spatial information matrices preceding the k-th group MM_k, and predicting with it the high-dimensional spatial information matrix EE_Q corresponding to the k-th frame time slice;
training the convolutional neural network with the (Q-1) groups of high-dimensional spatial information matrices preceding the k-th group MM_k, and predicting with it the high-dimensional spatial information matrix EE_{Q-1} corresponding to the k-th frame time slice;
successively decreasing the number of high-dimensional spatial information matrices in this way, training the convolutional neural network with the (Q-s) groups preceding the k-th group MM_k and predicting the corresponding matrix EE_{Q-s}, until s increases to S-1;
training the convolutional neural network with the (Q-S) groups of high-dimensional spatial information matrices preceding the k-th group MM_k, and predicting with it the high-dimensional spatial information matrix EE_{Q-S} corresponding to the k-th frame time slice.
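Schematically, the shrinking-window ensemble of claim 3 can be expressed as below; train_and_predict is a placeholder for training the convolutional neural network on one window and predicting the k-th frame, and dropping the oldest matrices first is an assumption of this sketch:

    def ensemble_predict(history, train_and_predict, S):
        """history holds the Q matrices preceding MM_k, oldest first; returns
        the (S+1) predicted matrices EE_Q, EE_{Q-1}, ..., EE_{Q-S}."""
        predictions = []
        for s in range(S + 1):                 # s = 0, 1, ..., S
            window = history[s:]               # the (Q - s) most recent matrices
            predictions.append(train_and_predict(window))
        return predictions                     # the droplets for cloud model CL_k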
4. The method of claim 2, wherein determining the Q groups of high-dimensional spatial information matrices preceding the k-th group MM_k comprises:
removing, from the k groups of high-dimensional spatial information matrices up to MM_k, the high-dimensional spatial information matrices corresponding to the frame time slices preceding all previously determined time division points, to obtain the Q groups of high-dimensional spatial information matrices preceding the k-th group MM_k.
5. The method of claim 2, wherein establishing the multi-dimensional cloud model CL_k from all (S+1) high-dimensional spatial information matrices corresponding to the k-th frame time slice comprises:
determining, from all (S+1) high-dimensional spatial information matrices corresponding to the k-th frame time slice, the expectation, entropy and hyper-entropy of a multi-dimensional normal forward cloud generator described by the (S+1) cloud droplets, wherein the hyper-entropy satisfies He ≤ 0.5;
establishing, according to the expectation, entropy and hyper-entropy of the multi-dimensional normal forward cloud generator, the multi-dimensional normal cloud model CL_k corresponding to all (S+1) high-dimensional spatial information matrices of the k-th frame time slice.
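One classical way to obtain these three numerical characteristics from the (S+1) droplets is the backward-cloud moment estimate sketched below; the estimators are standard in the cloud-model literature and are an assumption here, as the claim itself fixes only the cap He ≤ 0.5:

    import numpy as np

    def fit_normal_cloud(droplets, he_cap=0.5):
        """Estimate (Ex, En, He) per dimension from an (S+1, D) array of
        cloud droplets using the classical backward-cloud moment estimators."""
        x = np.asarray(droplets, dtype=float)
        Ex = x.mean(axis=0)
        En = np.sqrt(np.pi / 2.0) * np.abs(x - Ex).mean(axis=0)
        var = x.var(axis=0, ddof=1)
        He = np.sqrt(np.maximum(var - En ** 2, 0.0))
        return Ex, En, np.minimum(He, he_cap)   # enforce He <= 0.5 per claim 5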
6. The method of claim 2, wherein processing the motion data of the data glove in the virtual reality environment acquired during the operation process into M groups of high-dimensional spatial information matrices MM_j corresponding in sequence to M frame time slices according to the sampling interval t comprises:
determining, according to the spatial positional relationship of the sensors in the data glove, high-dimensional spatial information matrices of 5 columns and 6 rows, ordered from the thumb to the little finger;
wherein the end of an action is denoted E and serves as a redundant item for judging action edge information.
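A sketch of this frame-matrix layout follows; the claim fixes only the 6-row, 5-column shape, the thumb-to-little-finger column order and the end item E, so the assignment of five sensor readings per finger plus one redundant end row is an assumption for illustration:

    import numpy as np

    END_MARKER = 0.0   # stands in for the redundant end item E (assumption)

    def frame_matrix(finger_readings):
        """Arrange one frame of glove data as a 6-row, 5-column matrix with
        one column per finger, ordered from thumb to little finger."""
        m = np.full((6, 5), END_MARKER)
        for col, readings in enumerate(finger_readings):
            m[:5, col] = readings              # five readings per finger
        return m                               # row 6 keeps the end marker E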
7. The method of claim 6, wherein, when each high-dimensional spatial information matrix is a 5-column, 6-row matrix and the convolutional neural network is trained, the structure of the convolutional neural network comprises, in order from input to output: an input layer, a convolutional layer, a pooling layer, a fully-connected layer and an output layer;
the convolution kernel size of the convolutional layer is 6 × 2, and the sliding stride is 3.
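Under the dimensions fixed by claim 7 (a 6-row, 5-column input, a 6 × 2 kernel, stride 3), one possible realization in PyTorch is sketched below; the channel count and the output size are assumptions:

    import torch
    import torch.nn as nn

    class GloveCNN(nn.Module):
        """Claim-7 structure: input -> convolution -> pooling -> fully
        connected -> output; 8 channels and a 6x5 predicted frame matrix
        as output are assumptions of this sketch."""
        def __init__(self):
            super().__init__()
            self.conv = nn.Conv2d(1, 8, kernel_size=(6, 2), stride=3)  # -> (8, 1, 2)
            self.pool = nn.MaxPool2d(kernel_size=(1, 2))               # -> (8, 1, 1)
            self.fc = nn.Linear(8, 30)                                 # -> 6*5 values

        def forward(self, x):                  # x: (batch, 1, 6, 5)
            x = torch.relu(self.conv(x))
            x = self.pool(x).flatten(1)
            return self.fc(x).view(-1, 6, 5)   # predicted frame matrix

With these choices the 6 × 2 kernel sliding at stride 3 reduces the 6 × 5 input to a 1 × 2 feature map per channel, which the pooling layer collapses before the fully-connected layer emits one predicted frame matrix.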
8. The method of claim 7, wherein, when the convolutional neural network is trained, the following evaluation indexes are adopted for training precision selection: the mean absolute error MAE and the mean relative error MRE,

MAE = (1/n) · Σ_{i=1}^{n} |f_i − f̂_i|,

MRE = (1/n) · Σ_{i=1}^{n} (|f_i − f̂_i| / f_i),

wherein f_i is the actual displacement value, f̂_i is the predicted displacement value, and n is the number of test samples.
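The two indexes of claim 8 in a minimal sketch (assuming nonzero actual displacement values for the relative error):

    import numpy as np

    def mae_mre(actual, predicted):
        """Mean absolute error and mean relative error over n test samples."""
        f = np.asarray(actual, dtype=float)        # actual displacement values
        f_hat = np.asarray(predicted, dtype=float) # predicted displacement values
        mae = np.abs(f - f_hat).mean()
        mre = (np.abs(f - f_hat) / np.abs(f)).mean()
        return mae, mre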
9. The method of claim 1, wherein the data glove comprises a plurality of sensors arranged at preset positions or joints of the hand, each sensor being used to acquire position or coordinate data in multiple dimensions.
10. A data glove action segmentation device in a virtual reality environment, characterized by comprising:
an action data processing unit configured to: process the motion data of the data glove in the virtual reality environment acquired during the operation process into M groups of high-dimensional spatial information matrices MM_j corresponding in sequence to M frame time slices according to a sampling interval t, wherein 1 ≤ j ≤ M and the duration of the operation process is denoted T = M·t;
a time division point determination unit configured to: starting from the N-th group of high-dimensional spatial information matrices and proceeding to the M-th group,
calculate one by one the probability P_k that the k-th group of high-dimensional spatial information matrices MM_k belongs to the multi-dimensional cloud model determined by the Q groups of high-dimensional spatial information matrices preceding it, wherein N ≤ k ≤ M and 2 ≤ Q ≤ (k-1), and
when the probability P_k is greater than a preset value, determine that this group of high-dimensional spatial information matrices MM_k belongs to the same action as the preceding Q groups of high-dimensional spatial information matrices; and
when the probability P_k is less than the preset value, determine that this group of high-dimensional spatial information matrices MM_k does not belong to the same action as the preceding Q groups of high-dimensional spatial information matrices, and that the sampling time corresponding to MM_k is a time division point;
an action division unit configured to: sequentially divide the motion data of the data glove in the virtual reality environment acquired during the operation process into (F+1) relatively independent actions according to all F determined time division points.
CN202011032007.3A — Filed 2020-09-27 — Data glove action segmentation method and device in virtual reality environment — Pending

Priority Applications (1)

Application Number: CN202011032007.3A — Priority/Filing Date: 2020-09-27 — Title: Data glove action segmentation method and device in virtual reality environment

Publications (1)

Publication Number: CN112307898A — Publication Date: 2021-02-02

Family: ID=74489838

Country Status (1)

CN — CN112307898A (en)

Cited By (1)

Publication number: CN114833830A * — Priority date: 2022-04-27 — Publication date: 2022-08-02 — Assignee: Beijing SenseTime Technology Development Co., Ltd. — Title: Grabbing method and device, electronic equipment and storage medium
(* Cited by examiner, † Cited by third party)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114833830A (en) * 2022-04-27 2022-08-02 北京市商汤科技开发有限公司 Grabbing method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
Lim et al. Multimodal degradation prognostics based on switching Kalman filter ensemble
Le et al. Interactive boundary prediction for object selection
CN108734151A (en) Robust long-range method for tracking target based on correlation filtering and the twin network of depth
CN109215013A (en) Automatic stone age prediction technique, system, computer equipment and storage medium
CN112735094B (en) Geological disaster prediction method and device based on machine learning and electronic equipment
KR20160131071A (en) Neural network and method of neural network training
CN105426929B (en) Object shapes alignment device, object handles devices and methods therefor
CN109741268B (en) Damaged image complement method for wall painting
CN104634265B (en) A kind of mineral floating froth bed soft measurement method of thickness based on multiplex images Fusion Features
Abudu et al. Modeling of daily pan evaporation using partial least squares regression
CN113779882A (en) Method, device, equipment and storage medium for predicting residual service life of equipment
CN110838179B (en) Human body modeling method and device based on body measurement data and electronic equipment
CN112307898A (en) Data glove action segmentation method and device in virtual reality environment
CN117133277B (en) Virtual character expression control method and system for man-machine interaction
CN109858326A (en) Based on classification semantic Weakly supervised online visual tracking method and system
CN109190505A (en) The image-recognizing method that view-based access control model understands
CN106296747A (en) Robust multi-model approximating method based on structure decision diagram
CN116824138A (en) Interactive image segmentation method and device based on click point influence enhancement
CN113868597B (en) Regression fairness measurement method for age estimation
HATANO et al. Detection of phalange region based on U-Net
Hernández et al. Adding uncertainty to an object detection system for mobile robots
CN114819344A (en) Global space-time meteorological agricultural disaster prediction method based on key influence factors
CN110264154B (en) Crowd-sourced signal map construction method based on self-encoder
CN112861689A (en) Searching method and device of coordinate recognition model based on NAS technology
JP6929260B2 (en) Time-series feature extraction device, time-series feature extraction method and program

Legal Events

Code — Description
PB01 — Publication
SE01 — Entry into force of request for substantive examination