CN110442238A - Method and device for determining a dynamic effect - Google Patents

Method and device for determining a dynamic effect

Info

Publication number
CN110442238A
CN110442238A (application CN201910703830.3A)
Authority
CN
China
Prior art keywords
gesture
dynamic effect
user
gestures
additional dynamic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910703830.3A
Other languages
Chinese (zh)
Inventor
李峰
邱日明
左小祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201910703830.3A priority Critical patent/CN110442238A/en
Publication of CN110442238A publication Critical patent/CN110442238A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

This application relates to the field of computer technology, and in particular to a method and device for determining a dynamic effect. The method includes: obtaining a gesture image of a user; determining the user's gesture type and gesture feature point information from the gesture image; determining a gesture additional dynamic effect according to the gesture type, and determining a play mode of the gesture additional dynamic effect according to the gesture feature point information; and displaying the gesture additional dynamic effect according to the play mode. The embodiments of this application achieve gesture-based control without adding other equipment, are not limited to particular usage scenarios, and simplify the human-computer interaction process. Because the concrete display mode of the additional effect is determined from the user's gesture feature point information, different feature point information yields different display modes, so the resulting control effects are diverse.

Description

Method and device for determining a dynamic effect
Technical field
This application relates to the field of computer technology, and in particular to a method and device for determining a dynamic effect.
Background technique
Intelligent terminals have become one of the electronic products that users use every day. People not only obtain information and browse videos through a terminal, but can also control the various virtual objects presented on the terminal interface, interact with the terminal, and even interact with remote users through the terminal. Among the various control methods, controlling a computer with gesture actions is a practical human-computer interaction technique. Compared with using a keyboard or a touch screen, gesture control requires no contact with a mounting medium, so the user can operate within a larger space; moreover, gesture actions are a more natural interaction technique.
In the prior art, a user's gesture actions are usually captured with dedicated equipment, and a three-dimensional model of the gesture is then built in a computer to achieve gesture-based control. This method, however, needs additional equipment to capture the user's gesture actions, which limits the scenarios in which gesture control can be used and increases the complexity of the human-computer interaction process.
Summary of the invention
The embodiments of this application provide a method and device for determining a dynamic effect, to solve the prior-art problems that gesture control scenarios are limited and the human-computer interaction process is complicated because additional equipment is needed to capture the user's gesture actions.
In one aspect, an embodiment of this application provides a method for determining a dynamic effect, comprising:
obtaining a gesture image of a user;
determining the gesture type of the user and the gesture feature point information of the user from the gesture image;
determining a gesture additional dynamic effect of the user according to the gesture type, and determining a play mode of the gesture additional dynamic effect according to the gesture feature point information of the user;
displaying the gesture additional dynamic effect according to the play mode of the gesture additional dynamic effect.
In one aspect, an embodiment of this application provides a device for determining a dynamic effect, comprising:
an acquiring unit, configured to obtain a gesture image of a user;
a determination unit, configured to determine the gesture type of the user and the gesture feature point information of the user from the gesture image; determine a gesture additional dynamic effect of the user according to the gesture type; and determine a play mode of the gesture additional dynamic effect according to the gesture feature point information of the user;
a display unit, configured to display the gesture additional dynamic effect according to the play mode of the gesture additional dynamic effect.
Optionally, the gesture feature point information is coordinate position information of gesture feature points, and the determination unit is specifically configured to:
connect the coordinate positions of the feature points in a preset order to form a feature point trajectory;
determine the play mode of the gesture additional dynamic effect according to the feature point trajectory.
Optionally, the determination unit is specifically configured to:
determine, according to the feature point trajectory, a play direction and a play range of the gesture additional dynamic effect.
Optionally, the determination unit is specifically configured to:
determine the display direction of the gesture additional dynamic effect according to the play direction of the play mode of the gesture additional dynamic effect;
determine the display range of the gesture additional dynamic effect according to the play range of the play mode of the gesture additional dynamic effect.
Optionally, the feature points are a set number of hand skeleton nodes, and the gesture additional dynamic effect of the user is a heart dynamic effect;
the determination unit is specifically configured to:
determine, according to the hand skeleton node information of the user, that the play mode of the gesture additional dynamic effect is to play the effect between the user's index finger and thumb;
the display unit is specifically configured to:
play the heart dynamic effect between the index finger and thumb of the user.
In one aspect, an embodiment of this application provides a computer device, including a memory, a processor, and a computer program stored on the memory and runnable on the processor, where the processor, when executing the program, performs the steps of the above method for determining a dynamic effect.
In one aspect, an embodiment of this application provides a computer-readable storage medium storing a computer program executable by a computer device; when the program runs on the computer device, the computer device performs the steps of the above method for determining a dynamic effect.
By obtaining a gesture image of the user, determining the user's gesture type and gesture feature point information from the gesture image, determining from the gesture type the additional dynamic effect to be shown on the user's gesture, and determining from the feature point information the play mode of that additional effect — that is, how the effect is displayed — the gesture additional effect is played according to the determined display mode when the video to be played is shown. Compared with the prior art, the embodiments of this application achieve gesture-based control without adding other equipment, are no longer limited in usage scenario, simplify the human-computer interaction process, and are convenient to use. Moreover, because the dynamic-effect control mode is determined by the user's gesture feature point information, and the concrete display mode of the additional effect is determined by that control mode, different feature point information yields different display modes; the resulting control effects are diverse, which increases the variety and interest of human-computer interaction.
Detailed description of the invention
Fig. 1 is an application scenario architecture diagram provided by an embodiment of this application;
Fig. 2 is a schematic flowchart of a method for determining a dynamic effect provided by an embodiment of this application;
Fig. 3 is a schematic diagram of determining gesture images from a video to be played, provided by an embodiment of this application;
Fig. 4a is a schematic diagram of a gesture image provided by an embodiment of this application;
Fig. 4b is a schematic diagram of the region of interest in a gesture image, provided by an embodiment of this application;
Fig. 5 is a schematic structural diagram of a CNN semantic segmentation model provided by an embodiment of this application;
Fig. 6 is a schematic structural diagram of a region-of-interest detection model provided by an embodiment of this application;
Fig. 7 is a schematic structural diagram of a CNN classification model provided by an embodiment of this application;
Fig. 8 is a schematic structural diagram of a feature point detection model provided by an embodiment of this application;
Fig. 9 is a schematic diagram of hand skeleton nodes provided by an embodiment of this application;
Fig. 10 is a schematic diagram of gestures and their corresponding additional dynamic effects, provided by an embodiment of this application;
Fig. 11 is a schematic diagram of playing a video according to a display mode, provided by an embodiment of this application;
Fig. 12 is a schematic flowchart of a method for determining a dynamic effect provided by an embodiment of this application;
Fig. 13 is a schematic processing flowchart of determining a gesture type from a gesture image, provided by an embodiment of this application;
Fig. 14 is a schematic processing flowchart of determining gesture feature point information from a gesture image, provided by an embodiment of this application;
Fig. 15 is a schematic diagram of a specific implementation scenario provided by an embodiment of this application;
Fig. 16 is a schematic diagram of the content played by a smart flat-panel TV in a specific implementation scenario, provided by an embodiment of this application;
Fig. 17 is a schematic structural diagram of a device for determining a dynamic effect provided by an embodiment of this application;
Fig. 18 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
Specific embodiment
The technical solutions in the embodiments of the present invention will be described below clearly and completely with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
To facilitate understanding of the embodiments of the present invention, several concepts are briefly introduced first:
Human-computer interaction: the process of exchanging information between a person and a computer, using some dialogue language between them and a certain interaction mode, to complete a determined task.
Gesture: a posture of the hand; a specific language system that humans establish, alongside spoken language, using the positions and shapes of the palm and fingers.
Gesture action control: a mode of human-computer interaction that realizes a certain dialogue between the user and the computer through gestures; the information exchange between the user and the computer can be completed by gesture actions.
Gesture image: image information containing a user's gesture.
Region of interest: a part of an image; in machine vision and image processing, a region to be processed that is outlined in the processed image with a box, circle, ellipse, irregular polygon, or the like.
In concrete practice, the applicant of this application found that prior-art methods for determining gesture actions usually add equipment on the user side to capture the user's gesture actions — for example, devices such as Leap Motion or Kinect are used to capture the gesture actions, and a three-dimensional model of the gesture is then built in a computer, so that the user's true gesture action can be determined.
However, in the prior art such added equipment is usually expensive, cannot be carried by the user, and prevents the user from performing gesture control anytime and anywhere, so this human-computer interaction technique cannot be popularized on a large scale.
Based on the above problems, the applicant first conceived an approach that needs no additional equipment — only an ordinary camera device, such as any terminal capable of shooting with a camera — to obtain the user's gesture image, determine the gesture type from the gesture image, and then, according to the gesture type and the preset gesture-triggered effect corresponding to that type, achieve the purpose of gesture action control.
However, in actual experiments the applicant found that, if gesture control is carried out by gesture type alone, the display mode of the gesture-triggered effect cannot be well controlled; for example, the display direction of the triggered effect may mismatch the effect, or the display range of the triggered effect may mismatch the gesture effect.
Based on the above problems, the applicant further conceived an approach that needs no additional equipment, only an ordinary camera device: the gesture image is used not only to determine the corresponding gesture type but also to determine the user's gesture feature point information; a dynamic-effect play mode is determined from the gesture feature point information, and the display mode of the gesture-triggered effect is determined from that play mode. This realizes the purpose of gesture action control for the user and plays the gesture-triggered effect better. The method of the embodiments of this application needs no additional equipment and no three-dimensional gesture modeling; it simplifies the human-computer interaction process, improves interaction efficiency, better understands the meaning of the user's gesture action, indicates it clearly to the user, and improves the user experience.
The playback method for target objects in the embodiments of this application can be applied to the application scenario shown in Fig. 1, which includes a camera device 101, a computer device 102, and a display device 103. The camera device 101 is an electronic device with a camera function and may be a smartphone, a tablet computer, a portable personal computer, or the like. When the user uses the camera device 101, it shoots the user's gesture actions to form the video to be played; the camera device 101 is connected to the computer device 102 through a network. After obtaining the video to be played, the computer device 102 determines the user's gesture images from the video, determines the user's gesture type and gesture feature point information from those images, determines the user's gesture additional dynamic effect — the dynamic effect to be shown — according to the gesture type, determines a dynamic-effect play mode according to the gesture feature point information, and determines from the play mode the display mode of the effect, i.e., how the dynamic effect is displayed. After determining the display mode, the computer device 102 shows the video to be played through the display device 103 and, while showing it, displays the gesture additional dynamic effect according to the determined display mode, realizing the human-computer interaction between the user and the computer device 102.
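The end-to-end flow just described can be summarized in a short sketch. The `models` object, `effect_table`, and `derive_play_mode` are hypothetical placeholders for the components described in the rest of this document; this is a sketch under those assumptions, not the patent's implementation.

```python
# Sketch of the Fig. 1 flow. `models` (detect_roi, classify_gesture,
# detect_keypoints), `effect_table`, and `derive_play_mode` are hypothetical
# placeholders for the models and lookups introduced below.
def process_frame(frame, models, effect_table, derive_play_mode):
    """Return (effect, play_mode) for one captured frame, or None if no hand."""
    roi = models.detect_roi(frame)                # region of interest (S202a-c)
    if roi is None:                               # frame contains no gesture
        return None
    gesture_type = models.classify_gesture(roi)   # CNN classification model
    keypoints = models.detect_keypoints(roi)      # 21 hand skeleton nodes
    effect = effect_table.get(gesture_type)       # gesture -> additional effect
    play_mode = derive_play_mode(keypoints)       # play direction + range (S203)
    return effect, play_mode
```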
It is worth noting that the architecture diagram in the embodiments of the present invention is intended to more clearly illustrate the technical solutions and does not limit them; for other application-scenario architectures and service applications, the technical solutions provided by the embodiments of the present invention are equally applicable to similar problems.
Based on the application scenario shown in Fig. 1, an embodiment of this application provides a method for determining a dynamic effect. The flow of the method can be executed by a device for determining a dynamic effect and, as shown in Fig. 2, comprises the following steps:
Step S201: obtain a gesture image of the user.
Specifically, when the user performs human-computer interaction through gesture actions, to ensure that the gesture instructions issued by the user are obtained in real time, the user's continuous gesture actions need to be shot in real time during the interaction until it ends. The video to be played refers to the content shot from the moment the user starts the human-computer interaction until the interaction ends.
In the embodiments of this application, the image frames containing the user's gesture image are first identified in the video to be played by frame extraction. That is, the video to be played may contain frames without a user gesture; the frames containing a user gesture are processed as the gesture images. A sketch of this step is shown below.
For example, in the embodiments of this application, as shown in Fig. 3, the video to be played contains 5 image frames, of which the 2nd, 3rd, and 4th frames contain a user gesture image, so the 2nd, 3rd, and 4th frames are the user's gesture images.
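As a concrete illustration of the frame-extraction step, the following is a minimal sketch using OpenCV; the `contains_gesture` predicate is a hypothetical stand-in (e.g. a light-weight hand detector), since the patent only states that frames without a user gesture are skipped.

```python
import cv2  # OpenCV for video decoding

def extract_gesture_frames(video_path, contains_gesture, step=1):
    """Sample every `step`-th frame and keep those containing a user gesture."""
    frames, idx = [], 0
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:                                  # end of the video to be played
            break
        if idx % step == 0 and contains_gesture(frame):
            frames.append(frame)                    # keep frames with a gesture
        idx += 1
    cap.release()
    return frames
```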
Step S202: determine the gesture type of the user and the gesture feature point information of the user from the gesture image.
Specifically, in the embodiments of this application, after the gesture images in the video to be played have been determined, the user's gesture type and gesture feature point information are determined from them. The user's gesture type refers to the posture of the user's hand that the computer can understand during the interaction. The gesture types corresponding to gesture images are preset; the user's gesture type is determined by matching the hand posture in the gesture image against the hand postures of the preset gesture types.
The user's gesture feature point information refers to a set number of detection points arranged in the gesture image. These points characterize the skeletal features of the user's gesture, so that information such as the range and direction of the hand posture can be better understood.
Optionally, in the embodiments of this application, determining the user's gesture type and gesture feature point information first requires determining the region of interest in the gesture image, i.e., the image region used for gesture-type recognition. For example, as shown in Figs. 4a and 4b, Fig. 4a is a gesture image and Fig. 4b is the region of interest determined from that gesture image.
Optionally, in the embodiments of this application, the region of interest can be determined by an image detection model, which detects whether a target image contains a target and the approximate region of the target in that image. In the embodiments of this application, the image detection model is used to detect the approximate region of the gesture in the gesture image.
Optionally, the image detection model is trained based on a CNN (Convolutional Neural Network) semantic segmentation model. The steps of determining the region of interest with the CNN semantic segmentation model are specifically:
Step S202a: use the CNN semantic segmentation model to obtain, for each pixel of the gesture image, the probability that it is a pixel of the target region.
The CNN semantic segmentation model extracts features for every pixel of the gesture image and matches each pixel's feature extraction result against preset image features. The degree of match between the extraction result and the preset features measures the probability that the corresponding pixel belongs to the target region: the higher the match, the larger the probability; the lower the match, the smaller the probability. The preset image features may be the image features corresponding to the pixels that make up the target, obtained after training of the CNN semantic segmentation model is complete.
In addition, after the probability of each pixel of the gesture image is obtained, the probabilities can be represented by a probability matrix, in which each entry corresponds one-to-one to a pixel of the gesture image. For example, the value in row 4, column 3 of the probability matrix indicates the probability corresponding to the pixel in row 4, column 3 of the gesture image.
Step S202b: determine the region of interest according to the probability of each pixel.
The region of interest contains the pixels whose probability is greater than a preset threshold. Optionally, after the probability of each pixel is determined, the probabilities are binarized: probabilities greater than or equal to the preset threshold are set to 1, and probabilities below the preset threshold are set to 0. The region of interest can be determined in this way.
Step S202c: extract the region of interest from the gesture image.
Through the above steps, the region of interest can be extracted from the gesture image, and the user's gesture type and gesture feature point information are then determined from the region of interest.
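Steps S202b and S202c together amount to thresholding the probability matrix and cropping the gesture image. A minimal NumPy sketch, assuming a 0.5 threshold (the patent only requires some preset threshold) and a bounding-box crop:

```python
import numpy as np

def roi_from_probability(prob, image, threshold=0.5):
    """Binarize the per-pixel probability matrix and crop the gesture image."""
    mask = (prob >= threshold).astype(np.uint8)   # 1 inside the region of interest
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:                              # no target pixels found
        return None
    x0, x1, y0, y1 = xs.min(), xs.max(), ys.min(), ys.max()
    return image[y0:y1 + 1, x0:x1 + 1]            # cropped region of interest
```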
Optionally, the CNN semantic segmentation model in the embodiments of this application comprises 2n+1 levels of convolutional layers, n levels of pooling layers, and n levels of deconvolutional layers, where in the 1st to n-th convolutional levels a pooling layer follows each convolutional level, i.e., the first n convolutional levels alternate with the n pooling levels. Optionally, each convolutional level performs at least one convolution. Accordingly, after the gesture image is processed by the n convolutional levels and n pooling levels, the feature map corresponding to the gesture image is obtained, where the number of channels of the feature map is larger than that of the gesture image and the size of the feature map is smaller than that of the gesture image.
The following takes as an example a U-shaped network structure in which the CNN semantic segmentation model consists of 7 convolutional levels, 3 pooling levels, and 3 deconvolutional levels. A convolutional layer is a layer for extracting features and is divided into a convolution operation and an activation operation. In the convolution operation, features are extracted using convolution kernels learned in advance through training; in the activation operation, the feature map obtained by convolution is activated with an activation function. Common activation functions include the rectified linear unit (ReLU) function, the sigmoid function, and the hyperbolic tangent (tanh) function.
A pooling layer follows a convolutional layer and is used to reduce the feature vectors output by the convolutional layer, i.e., to reduce the size of the feature map, while alleviating overfitting. Common pooling modes include average pooling (mean-pooling), max pooling (max-pooling), and stochastic pooling (stochastic-pooling).
A deconvolutional layer (deconvolution) is a layer for up-sampling feature vectors, i.e., for increasing the size of the feature map.
As shown in Fig. 5, first the i-th convolutional level convolves and activates the (i-1)-th feature map, and the processed (i-1)-th feature map is input to the i-th pooling level, 2≤i≤n. For the first convolutional level, the input is the gesture picture; for the i-th convolutional level, the input is the feature map output by the (i-1)-th pooling level. Optionally, after the first convolutional level receives the gesture picture, it performs a convolution operation on it with preset convolution kernels and then an activation operation with a preset activation function; after the i-th convolutional level receives the (i-1)-th feature map output by the (i-1)-th pooling level, it convolves that feature map with preset kernels and then activates it with a preset activation function, thereby extracting features. After convolution, the number of channels of the feature map increases. As shown in Fig. 5, the first convolutional level performs two convolutions on the gesture image; the second convolutional level performs two convolutions on the first feature map output by the first pooling level; the third convolutional level performs two convolutions on the second feature map output by the second pooling level; and the fourth convolutional level performs two convolutions on the third feature map output by the third pooling level. In the figure, the height of a multi-channel feature map indicates its size and the width indicates the number of channels.
Next, the i-th pooling level pools the processed (i-1)-th feature map to obtain the i-th feature map. After the i-th convolutional level completes the convolution, the processed (i-1)-th feature map is input to the i-th pooling level, which performs pooling and outputs the i-th feature map. The pooling levels reduce the size of the feature map while retaining the important information in it. Optionally, each pooling level applies max pooling to its input feature map. Schematically, as shown in Fig. 5, the first pooling level processes the feature map output by the first convolutional level to obtain the first feature map; the second pooling level processes the feature map output by the second convolutional level to obtain the second feature map; and the third pooling level processes the feature map output by the third convolutional level to obtain the third feature map.
Finally, the i-th feature map is input to the (i+1)-th convolutional level. After pooling is completed, the i-th pooling level inputs the i-th feature map to the next convolutional level for further feature extraction. As shown in Fig. 5, the gesture image passes in turn through the first convolutional level, the first pooling level, the second convolutional level and second pooling level, and the third convolutional level and third pooling level; the third pooling level then inputs the third feature map into the fourth convolutional level. The above embodiment is illustrated with three rounds of convolution and pooling only; in other possible embodiments, the CNN semantic segmentation model may perform more convolution and pooling operations, and this embodiment does not limit this.
After the alternating convolutional and pooling operations, the region of interest still needs to be obtained through the deconvolutional layers: the (n+1)-th to (2n+1)-th convolutional levels and the n deconvolutional levels convolve and deconvolve the intermediate feature maps to obtain the region of interest, whose size equals the size of the gesture image.
In one possible embodiment, the processing by the (n+1)-th to (2n+1)-th convolutional levels and the n deconvolutional levels includes the following steps:
First, the j-th deconvolutional level deconvolves the feature map output by the (j+n)-th convolutional level, 1≤j≤n. Schematically, as shown in Fig. 5, the first deconvolutional level deconvolves the feature map output by the fourth convolutional level; the second deconvolutional level deconvolves the feature map output by the fifth convolutional level; and the third deconvolutional level deconvolves the feature map output by the sixth convolutional level. Deconvolution, as the inverse of convolution, up-samples the feature map, thereby increasing its size. As shown in Fig. 5, after processing by a deconvolutional level, the size of the feature map increases.
Next, the deconvolved feature map is concatenated with the feature map output by the (n-j+1)-th convolutional level — the two have the same size — and the concatenated feature map is input to the (j+n+1)-th convolutional level. Schematically, as shown in Fig. 5, the feature map output by the third convolutional level is concatenated with the feature map output by the first deconvolutional level as the input of the fifth convolutional level; the feature map output by the second convolutional level is concatenated with the feature map output by the second deconvolutional level as the input of the sixth convolutional level; and the feature map output by the first convolutional level is concatenated with the feature map output by the third deconvolutional level as the input of the seventh convolutional level.
Finally, the (j+n+1)-th convolutional level convolves the concatenated feature map, and the final output is a region of interest whose size is consistent with the original gesture picture.
After the structure and processing flow of the CNN semantic segmentation model have been determined, it can be trained with the regions of interest in historical gesture images, and the trained CNN semantic segmentation model is then used to extract regions of interest.
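For concreteness, a minimal PyTorch sketch of the 7-convolutional-level / 3-pooling / 3-deconvolution U-shaped structure of Fig. 5 follows. The channel widths, kernel sizes, and activation choices are assumptions; the patent fixes only the level counts, the alternation of convolution and pooling, and the skip-connection (concatenation) pattern.

```python
import torch
import torch.nn as nn

def double_conv(c_in, c_out):
    # one "convolutional level": two 3x3 convolutions, each followed by ReLU
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True))

class HandSegNet(nn.Module):
    """U-shaped sketch: 7 conv levels, 3 max-poolings, 3 deconvolutions."""
    def __init__(self):
        super().__init__()
        self.enc1 = double_conv(3, 16)                          # level 1
        self.enc2 = double_conv(16, 32)                         # level 2
        self.enc3 = double_conv(32, 64)                         # level 3
        self.enc4 = double_conv(64, 128)                        # level 4
        self.pool = nn.MaxPool2d(2)
        self.up1, self.dec1 = nn.ConvTranspose2d(128, 64, 2, stride=2), double_conv(128, 64)
        self.up2, self.dec2 = nn.ConvTranspose2d(64, 32, 2, stride=2), double_conv(64, 32)
        self.up3, self.dec3 = nn.ConvTranspose2d(32, 16, 2, stride=2), double_conv(32, 16)
        self.head = nn.Conv2d(16, 1, 1)                         # per-pixel score map

    def forward(self, x):
        f1 = self.enc1(x)
        f2 = self.enc2(self.pool(f1))
        f3 = self.enc3(self.pool(f2))
        f4 = self.enc4(self.pool(f3))
        d1 = self.dec1(torch.cat([self.up1(f4), f3], dim=1))    # level 5 + skip from level 3
        d2 = self.dec2(torch.cat([self.up2(d1), f2], dim=1))    # level 6 + skip from level 2
        d3 = self.dec3(torch.cat([self.up3(d2), f1], dim=1))    # level 7 + skip from level 1
        return torch.sigmoid(self.head(d3))                     # probability per pixel
```

Feeding an input whose height and width are divisible by 8 yields a probability map of the same size as the gesture image, matching the description above.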
The above method determines the region of interest with a CNN semantic segmentation model. There are also other methods for determining the region of interest, for example performing feature extraction on the pixels of the gesture image with another region-of-interest detection model to determine, in the pixel coordinate system, the x-direction offset, the y-direction offset, the width, and the height of the region of interest in the gesture image; the region of interest can be determined from this four-tuple.
Illustratively, the concrete structure of the region-of-interest detection model is shown in Fig. 6. In Fig. 6, the gesture image is the input of the region-of-interest detection model; image features are extracted by convolution operations, the resolution of the feature maps is reduced by pooling operations to facilitate extracting higher-level semantic features, and the alternate use of convolution and pooling extracts features of the gesture image from low-level semantic features to high-level semantic features, yielding feature maps.
In this application, to use the information of different feature scales to learn more robust features, as shown in Fig. 6, three feature maps with different semantics and different resolutions are selected, fused, and used to predict the position four-tuple of the region of interest. In this application, if the resolutions of the selected feature maps differ, scaling operations are needed to adjust the resolutions of the feature maps. In Fig. 6, these three feature maps are the output feature map of the fourth convolutional level, the feature map obtained by applying two convolution operations to the input feature map of the fifth convolutional level, and the output feature map of the fifth convolutional level; since the output feature map of the fourth convolutional level differs in resolution from the other feature maps, it also needs a resolution adjustment operation. For convenience of computation, dimension transformation operations are applied to the feature maps before feature fusion.
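The fusion-and-regression step can be sketched as follows, assuming the three selected feature maps are already available from a backbone and that a small regression head maps the fused map to the four-tuple; resizing via bilinear interpolation is an assumption, since the patent says only that resolutions are adjusted before fusion.

```python
import torch
import torch.nn.functional as F

def fuse_and_regress(feat_a, feat_b, feat_c, head):
    """Fuse three multi-scale feature maps and regress the ROI 4-tuple."""
    target = feat_c.shape[-2:]                       # align to one resolution
    a = F.interpolate(feat_a, size=target, mode="bilinear", align_corners=False)
    b = F.interpolate(feat_b, size=target, mode="bilinear", align_corners=False)
    fused = torch.cat([a, b, feat_c], dim=1)         # channel-wise fusion
    return head(fused)                               # (N, 4): x-offset, y-offset, w, h
```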
In the embodiments of this application, determining the user's gesture type and gesture feature point information from the gesture image may be two sequential processes or two parallel processes.
Specifically, in the embodiments of this application, a CNN classification model can be used to determine the gesture type corresponding to the region of interest, and a feature point detection model can be used to determine the gesture feature point information of the user corresponding to the region of interest.
Specifically, the CNN classification model is trained on the true classification results of regions of interest in historical gesture images. Illustratively, the network structure of the CNN classification model is shown in Fig. 7. First the i-th convolutional level convolves and activates the (i-1)-th feature map, and the processed (i-1)-th feature map is input to the i-th pooling level, 2≤i≤n. For the first convolutional level, the input is the region of interest; for the i-th convolutional level, the input is the feature map output by the (i-1)-th pooling level. Optionally, after the first convolutional level receives the region of interest, it performs convolution with preset kernels and then activation with a preset activation function; after the i-th convolutional level receives the (i-1)-th feature map output by the (i-1)-th pooling level, it convolves that feature map with preset kernels and then activates it with a preset activation function, thereby extracting features; after convolution, the number of channels of the feature map increases. As shown in Fig. 7, the first convolutional level performs two convolutions on the region of interest; the second convolutional level performs two convolutions on the first feature map output by the first pooling level; the third convolutional level performs two convolutions on the second feature map output by the second pooling level; and the fourth convolutional level performs two convolutions on the third feature map output by the third pooling level. The height of a multi-channel feature map indicates its size and the width indicates the number of channels.
Next, the i-th pooling level pools the processed (i-1)-th feature map to obtain the i-th feature map. After the i-th convolutional level completes the convolution, the processed (i-1)-th feature map is input to the i-th pooling level, which performs pooling and outputs the i-th feature map. The pooling levels reduce the size of the feature maps while retaining the important information in them. Optionally, each pooling level applies max pooling to its input. Schematically, as shown in Fig. 7, the first pooling level processes the output of the first convolutional level to obtain the first feature map; the second pooling level processes the output of the second convolutional level to obtain the second feature map; and the third pooling level processes the output of the third convolutional level to obtain the third feature map.
Finally, the i-th feature map is input to the (i+1)-th convolutional level for further feature extraction. As shown in Fig. 7, the region of interest passes in turn through the first convolutional level, the first pooling level, the second convolutional level and second pooling level, and the third convolutional level and third pooling level; the third pooling level then inputs the third feature map into the fourth convolutional level. The above embodiment is illustrated with three rounds of convolution and pooling only; in other possible embodiments, the CNN classification model may perform more convolution and pooling operations, and this embodiment does not limit this.
The last convolution of the fourth convolutional level is a 1×1 convolution, which maps the feature map from feature space to class space to obtain the classification result.
The classification ability of the CNN classification model is learned from the image features in training samples. In an optional embodiment, a training sample set is first obtained and then used to train the CNN classification model. The training sample set contains multiple training samples. The recognition result corresponding to a training sample is a recognition result that has actually been fixed, i.e., the true gesture type corresponding to the region of interest in the training sample has been determined. For example, the training samples contain multiple regions of interest, of which some have the true gesture type "GOOD", some have the true gesture type "OK", and so on.
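A minimal sketch of one supervised training step under these assumptions — cross-entropy loss, and a model whose 1×1-convolution head has been reduced to per-class scores, e.g. by global average pooling; both are assumptions, as the patent does not specify the loss or the reduction:

```python
import torch.nn.functional as F

def train_step(model, optimizer, roi_batch, labels):
    """One training step on a batch of regions of interest.

    `labels` are the annotated true gesture types ("GOOD", "OK", ...)
    encoded as class indices.
    """
    optimizer.zero_grad()
    logits = model(roi_batch)                 # assumed shape (N, num_gesture_types)
    loss = F.cross_entropy(logits, labels)    # supervised classification loss
    loss.backward()
    optimizer.step()
    return loss.item()
```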
In the embodiments of this application, the feature point detection model is determined from the structure and features of the feature points in historical gesture images. Illustratively, the structure of the feature point detection model is shown in Fig. 8. The feature point detection model is also a CNN detection model, and its processing is as follows: the region of interest passes in turn through the first convolutional level, the first pooling level, the second convolutional level and second pooling level, and the third convolutional level and third pooling level; the third pooling level then inputs the third feature map into the fourth convolutional level to obtain feature maps.
The feature point detection model in the embodiments of this application includes a processing step of predicting feature information and a processing step of fusion-feature learning. The feature-information prediction step selects feature maps of different semantics and different resolutions in the fourth convolutional level for fusion; fusing feature maps of different resolutions also requires resolution adjustment. As shown in Fig. 8, two feature maps are chosen in the fourth convolutional level and fused once, and the fused feature map is used as the input of the fifth convolutional level; to improve the robustness of the fused features, two feature maps in the fourth and fifth convolutional levels are also fused once, and the fused feature map is used as the input of the sixth convolutional level. Through the feature-point prediction step, the feature point information in the region of interest can be predicted. In the embodiments of this application, the number of fusions is not restricted, and schemes with other fusion counts also fall within the protection scope of this application.
After the input feature map of the sixth convolutional level is obtained, the fusion-feature learning step begins. In this step, the predicted feature point information is up-sampled or deconvolved to increase the resolution of the feature map, and is then fused with the feature maps obtained through the convolutional levels, realizing error correction and fine-tuning of the feature point information. As shown in Fig. 8, the deconvolved feature map is fused with the input feature map of the fourth convolutional level, and the fused feature map is input to the sixth convolutional level to finally obtain the feature point information. In the embodiments of this application, there is no restriction on which convolutional level's feature map is fused; other feature maps may also be chosen for fusion.
Optionally, in the embodiments of this application, the feature points are a set number of hand skeleton nodes; from the hand skeleton node information, at least information such as the gesture range and the gesture direction can be determined. Specifically, in the embodiments of this application, 21 hand skeleton nodes are chosen as feature points; as shown in Fig. 9, the information of the 21 hand skeleton nodes can be detected by the feature point detection model. This application sets 21 hand skeleton nodes; schemes with other numbers of hand skeleton nodes also fall within the protection scope of this application.
Optionally, in the embodiments of this application, the detected feature point information is the position information of the feature points. The position information may be the coordinate information of the feature points, and the reference of the coordinate information may be the coordinate system of the camera device or the coordinate system of the gesture image.
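One common way to turn such a detector's output into the 21 node coordinates is per-node heatmap decoding; the patent does not specify the output encoding, so the heatmap design in this sketch is an assumption.

```python
import numpy as np

def decode_keypoints(heatmaps):
    """Decode hand-skeleton node coordinates from per-node heatmaps.

    Assumes the model emits one heatmap per node, shape (21, H, W).
    """
    coords = []
    for hm in heatmaps:                        # one heatmap per skeleton node
        y, x = np.unravel_index(np.argmax(hm), hm.shape)
        coords.append((int(x), int(y)))        # (x, y) in image coordinates
    return coords                              # 21 (x, y) pairs
```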
As described above, in step S202 the user's gesture type and gesture feature point information can be determined respectively by the CNN classification model and the feature point detection model. Both models process the gesture image after its region of interest has been determined, and the region of interest can at least be determined by the CNN semantic segmentation model. That is, in step S202, determining the user's gesture type requires the CNN semantic segmentation model and the CNN classification model, and determining the user's gesture feature point information requires the CNN semantic segmentation model and the feature point detection model. This cascaded design of the processing flow has advantages in two respects:
First, model accuracy is high and performance is excellent. It avoids the problem of inaccurate recognition when the region of interest occupies a small proportion of the whole gesture image. The CNN semantic segmentation model first finds the region of interest and crops it out before passing it to the CNN classification model or the feature point detection model, so those models need not attend to large amounts of uninteresting redundancy. This greatly improves recognition accuracy while also reducing model complexity and improving model performance.
Second, the models are flexible, extensible, and highly reusable. Each model can be designed and optimized separately, and the models can be flexibly disassembled and reassembled. For example, the region of interest may be determined with the CNN semantic segmentation model shown in Fig. 5 or with the region-of-interest detection model shown in Fig. 6, without modifying the model structures of Fig. 7 or Fig. 8.
Step S203: determine the gesture additional dynamic effect of the user according to the gesture type, and determine the play mode of the gesture additional dynamic effect according to the gesture feature point information of the user.
Specifically, after the gesture type is determined, the gesture additional dynamic effect can be determined from the correspondence between gesture types and gesture additional dynamic effects. Illustratively, as shown in Fig. 10, different gesture types correspond to different additional dynamic effects: when the gesture type is "finger heart", the additional dynamic effect is floating hearts; when the gesture type is "shooting", the additional dynamic effect is a bullet-emission effect.
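A minimal sketch of this correspondence as a lookup table; only the two pairs shown in Fig. 10 are from the source, and the key/value names are illustrative:

```python
# Gesture type -> additional dynamic effect, as in Fig. 10.
EFFECT_TABLE = {
    "finger_heart": "floating_hearts",   # "finger heart" gesture
    "shooting": "bullet_emission",       # "shooting" gesture
}

def effect_for(gesture_type):
    return EFFECT_TABLE.get(gesture_type)   # None if the gesture has no effect
```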
In the embodiments of this application, after the gesture feature point information is determined, the dynamic-effect play mode — how the gesture additional effect is displayed — is determined from it. Optionally, in the embodiments of this application, the gesture feature point information is the position information of the gesture feature points; further, the gesture feature point information is the coordinate position information of the hand skeleton nodes, whose reference may be the coordinate system of the gesture image or another coordinate system.
In this application, after the feature point position information is determined, the coordinate positions of the feature points are connected in a preset order to form a feature point trajectory. The preset order is the set connection order between the feature points; it may be the same as the numbering order of the hand skeleton nodes in Fig. 9 or different from it.
After the feature point trajectory is determined, the dynamic-effect play mode is determined from it. That is, different feature point trajectories determine different play modes: for example, feature point trajectory 1 corresponds to a gradual-change display mode, feature point trajectory 2 corresponds to a shutter display mode, and so on.
Further, in this application, determining the play mode from the feature point trajectory at least includes determining a play direction and a play range of the dynamic effect. That is, the direction in which the dynamic effect is played and the range of the dynamic effect can be determined from the feature point trajectory.
In the embodiments of this application, the feature point trajectory may determine only the play direction of the dynamic effect, or only the play range, or both the play direction and the play range.
Illustratively, the play direction corresponding to feature point trajectory 3 is gradual display from left to right, and its play range is full screen; the play direction corresponding to feature point trajectory 4 is diagonal display from the upper left to the lower right, and its play range is within a set range of the region of interest, i.e., the effect is displayed within a set range around the user's fingers. The play range corresponding to feature point trajectory 5 is the center region of the screen, and so on.
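As a sketch of how a play direction and play range might be derived from a feature point trajectory — the unit-vector and bounding-box encodings are assumptions, since the patent states only that both are determined from the trajectory:

```python
import numpy as np

def play_mode_from_trajectory(points):
    """Derive (direction, range) from a feature point trajectory.

    `points` are the keypoint coordinates connected in the preset order.
    """
    pts = np.asarray(points, dtype=float)
    vec = pts[-1] - pts[0]                      # overall trajectory direction
    norm = np.linalg.norm(vec)
    direction = vec / norm if norm > 0 else np.zeros(2)
    x0, y0 = pts.min(axis=0)                    # play range encoded as the
    x1, y1 = pts.max(axis=0)                    # trajectory's bounding box
    return direction, (x0, y0, x1, y1)
```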
Step S204: display the gesture additional dynamic effect according to the play mode of the gesture additional dynamic effect.
After the play mode is determined, the concrete display mode of the gesture additional dynamic effect can be determined from it. That is, the gesture additional dynamic effect is displayed according to the dynamic-effect play mode.
Optionally, in the embodiments of this application, the display direction of the user's gesture additional dynamic effect is determined from the play direction of the play mode, and the display range of the user's gesture additional dynamic effect from the play range of the play mode. Illustratively, the play direction corresponding to feature point trajectory 3 is gradual display from left to right and its play range is full screen, so when this dynamic effect is shown, its display mode is a left-to-right gradual display shown full-screen. The play direction corresponding to feature point trajectory 4 is diagonal display from the upper left to the lower right and its play range is a set range of the region of interest, i.e., within a set range around the user's fingers, so this dynamic effect is displayed within the set range of the region of interest, diagonally from the upper left to the lower right.
When the video to be played is shown, in addition to playing the video content — the user's video frames obtained by the camera device — the determined gesture additional effect is also played, according to the determined display mode.
Illustratively, a section of the user's video to be played is obtained by the camera device; as shown in Fig. 11, in the video the user performs human-computer interaction through a gesture, the user's gesture type is "finger heart", and the display mode of the gesture additional dynamic effect is a heart between index finger and thumb. Therefore, when the video content is played, a heart dynamic effect appears on the user's "finger heart" gesture, and the heart appears between the index finger and thumb of that gesture.
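For the heart effect of Fig. 11, the play position between index finger and thumb might be computed from the skeleton nodes as follows; the node indices assume the common 21-node hand layout (thumb tip = 4, index fingertip = 8), which may differ from the patent's Fig. 9 numbering:

```python
def heart_anchor(keypoints, thumb_tip=4, index_tip=8):
    """Place the heart effect midway between thumb tip and index fingertip."""
    (tx, ty), (ix, iy) = keypoints[thumb_tip], keypoints[index_tip]
    return ((tx + ix) / 2, (ty + iy) / 2)     # effect center in image coordinates
```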
To better explain the embodiments of this application, a method for determining a dynamic effect provided by an embodiment of this application is described below with reference to a specific implementation scenario. In this application, the user performs human-computer interaction through gestures. The gesture images are obtained from the acquired video to be played that contains them; the gesture type and the gesture feature point information are determined separately from the gesture images; the gesture additional dynamic effect is determined from the gesture type; the dynamic-effect play mode is determined from the gesture feature point information; the display mode of the dynamic effect is determined from the play mode; and when the acquired video to be played is played, the gesture additional dynamic effect is shown according to the display mode, specifically as shown in Fig. 12.
In Fig. 12, the gesture image is obtained from the video to be played; the region of interest in the gesture image is then determined by the region-of-interest detection method and cropped out of the gesture image. The gesture-type detection method determines that the gesture type of the region of interest is "shooting", whose gesture additional effect is a "shooting effect"; the gesture-feature-point detection method determines the gesture feature point trajectory of the region of interest, and the additional-dynamic-effect play mode is then determined from the feature point estimation as "emission in the direction of the index finger". When the video to be played is played, a bullet is "shot" at the index-finger position of the gesture image, completing the process of human-computer interaction between the user and the computer through gesture actions.
In order to better explain the embodiment of the present application, a method of determining a dynamic effect provided by the embodiment of the present application is described below with reference to another specific implementation scenario. This embodiment describes the specific process of obtaining a video to be played that includes gesture images, extracting the gesture images, and determining the gesture type according to the gesture images. As shown in Figure 13, the gesture image in the video to be played is obtained, the gesture image is input into a CNN semantic segmentation model, the region of interest in the gesture image is determined, the region of interest is cropped out of the gesture image by an intermediate connecting module, the region of interest is then input into a CNN classification model, and the obtained classification result is "GOOD".
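A minimal PyTorch sketch of this two-stage arrangement, assuming pretrained segmentation and classification models; the box-from-mask cropping rule and the background-is-class-0 convention are assumptions for illustration:

```python
import torch

def classify_gesture(image, seg_model, cls_model):
    """Two-stage sketch: semantic segmentation -> crop -> classification.
    `image` is a (3, H, W) float tensor; seg_model and cls_model are
    assumed to be trained torch.nn.Module instances."""
    with torch.no_grad():
        logits = seg_model(image.unsqueeze(0))        # (1, C, H, W)
        mask = logits.argmax(dim=1)[0]                # (H, W) class labels
        ys, xs = torch.nonzero(mask, as_tuple=True)   # non-background pixels
        if len(xs) == 0:                              # assumes class 0 = background
            return None                               # no hand found
        # "Intermediate connecting module": crop the mask's bounding box.
        y0, y1 = int(ys.min()), int(ys.max()) + 1
        x0, x1 = int(xs.min()), int(xs.max()) + 1
        roi = image[:, y0:y1, x0:x1].unsqueeze(0)
        return cls_model(roi).argmax(dim=1).item()    # e.g. the "GOOD" class
```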
The embodiment of the present application also provides a specific implementation scenario to describe a method of determining a dynamic effect provided by the embodiment of the present application. This embodiment describes the specific process of obtaining a video to be played that includes gesture images, extracting the gesture images, and determining the gesture feature point information according to the gesture images. In this application, the gesture feature points are hand skeleton nodes. Specifically, as shown in Figure 14, the gesture image in the video to be played is obtained, the gesture image is input into a region of interest detection model, the position of the region of interest in the gesture image is found, the region of interest is cropped out of the gesture image by an intermediate connecting module, the region of interest is then input into a feature point detection model, and the trajectory of the feature points in the region of interest is determined.
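A short sketch of the step that turns detected hand skeleton nodes into a feature point trajectory; the node indices and the preset connection order are illustrative assumptions rather than values fixed by the present application:

```python
# Hypothetical sketch: turn detected hand skeleton nodes into a feature
# point trajectory by connecting coordinates in a preset order.
PRESET_ORDER = [0, 1, 2, 3, 4]  # e.g. wrist -> thumb joints, illustrative only

def feature_point_trajectory(keypoints):
    """`keypoints` maps node index -> (x, y) pixel coordinates."""
    return [keypoints[i] for i in PRESET_ORDER if i in keypoints]

trajectory = feature_point_trajectory({0: (100, 200), 1: (110, 190),
                                       2: (120, 175), 3: (128, 160),
                                       4: (135, 150)})
# The trajectory can then be matched against known trajectories to pick a play mode.
```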
The embodiment of the present application also provides a specific implementation scenario to describe a method of determining a dynamic effect provided by the embodiment of the present application. As shown in Figure 15, in this embodiment of the present application, the image pickup device, the computer device and the display device are on the same electronic device. The electronic device can be any terminal with a camera function, a computing function and a display function, such as a smart phone, a tablet computer or a portable personal computer. In the embodiment of the present application, a smart flat-panel television is taken as an example for illustration.
In the scenario of the embodiment of the present application, the user starts to perform human-computer interaction with the smart flat-panel television through gesture actions. The content of the human-computer interaction is to determine the gesture additional dynamic effect corresponding to the user's gesture action, which is finally displayed on the smart flat-panel television.
The specific process is as follows: the user triggers a gesture action, and the smart flat-panel television shoots the gesture action of the user to form the video to be played. The smart flat-panel television determines the gesture type corresponding to the gesture image and the feature point information of the gesture, and then determines the gesture additional dynamic effect and the dynamic effect display mode; when the video to be played is shown, the gesture additional dynamic effect is displayed according to the display mode. As shown in Figure 16, the video shown on the smart flat-panel television includes not only the obtained video content to be played but also the gesture additional dynamic effect, shown according to the display mode. In Figure 16, the user gesture type is determined to be a "celebration gesture", the gesture additional dynamic effect corresponding to this gesture is a "firework effect", and the display mode of the gesture additional dynamic effect is to display within a set range near the gesture. Therefore, when the video content to be played is played, the dynamic effect "firework effect" is added and shown within the set range near the gesture.
Based on the above embodiments, referring to Figure 17, an embodiment of the present invention provides a device 1700 of determining a dynamic effect, comprising:
an acquiring unit 1701, configured to obtain the gesture image of a user;
a determination unit 1702, configured to determine the gesture type of the user and the gesture feature point information of the user according to the gesture image; determine the gesture additional dynamic effect of the user according to the gesture type; and determine a gesture additional dynamic effect play mode according to the gesture feature point information of the user;
a display unit 1703, configured to show the gesture additional dynamic effect according to the gesture additional dynamic effect play mode.
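Assuming the apply_gesture_effect sketch from the Figure 12 scenario above, the three units of device 1700 could be composed as follows; the class and method names are illustrative only, not mandated by the application:

```python
class DynamicEffectDevice:
    """Sketch of device 1700 with its three units composed sequentially."""

    def __init__(self, camera, models, effect_table, screen):
        self.camera = camera              # image pickup source
        self.models = models              # ROI / type / keypoint callables
        self.effect_table = effect_table  # gesture type -> effect lookup
        self.screen = screen              # display sink

    def run_once(self):
        frame = self.camera.read()                          # acquiring unit 1701
        effect, direction, extent = apply_gesture_effect(
            frame, self.models, self.effect_table)          # determination unit 1702
        self.screen.show(frame, effect, direction, extent)  # display unit 1703
```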
Further, the determination unit 1702 is specifically configured to:
determine the region of interest in the gesture image;
determine a feature point detection model according to the structural features of the feature points in historical gesture images;
determine the feature point information in the region of interest according to the feature point detection model and the region of interest.
Further, the determination unit 1702 is specifically configured to:
input the region of interest into the feature point detection model, and obtain the image features of the region of interest through alternating convolution operations and pooling operations in the feature point detection model;
determine the feature point information corresponding to the image features according to the structural features of the feature points in the historical gesture images learned by the feature point detection model.
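A minimal PyTorch sketch of such an alternating convolution-and-pooling feature extractor with a coordinate regression head; the layer sizes and the 21-node count are assumptions, as the application does not specify an architecture:

```python
import torch.nn as nn

# Sketch: alternating convolution and pooling, ending in a regression head
# that outputs (x, y) coordinates for N feature points. Sizes are assumed.
N_POINTS = 21  # a common hand-skeleton node count, assumed here

feature_point_net = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),    # conv -> pool
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # conv -> pool
    nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # conv -> pool
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(128, N_POINTS * 2),  # (x, y) per feature point
)
```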
Further, the gesture feature point information is gesture feature point coordinate position information, and the determination unit 1702 is specifically configured to:
connect the coordinate position information of each feature point in sequence according to a preset order to form a feature point trajectory;
determine the gesture additional dynamic effect play mode according to the feature point trajectory.
Further, the determination unit 1702 is specifically configured to:
determine a gesture additional dynamic effect play direction and a dynamic effect play range according to the feature point trajectory.
Further, the determination unit 1702 is specifically configured to:
determine the display direction of the gesture additional dynamic effect according to the play direction of the gesture additional dynamic effect play mode;
determine the display range of the gesture additional dynamic effect according to the play range of the dynamic effect play mode.
Further, the feature points are hand skeleton nodes of a set quantity, and the gesture additional dynamic effect of the user is a love-heart dynamic effect;
the determination unit is specifically configured to:
determine, according to the hand skeleton node information of the user, that the gesture additional dynamic effect play mode is to play the gesture additional dynamic effect between the index finger and thumb of the user;
the display unit is specifically configured to:
play the love-heart dynamic effect between the index finger and thumb of the user.
Based on the same technical idea, the embodiment of the present application provides a computer device, as shown in Figure 18, comprising at least one processor 1801 and a memory 1802 connected to the at least one processor. The embodiment of the present application does not limit the specific connection medium between the processor 1801 and the memory 1802; in Figure 18, the processor 1801 and the memory 1802 are connected by a bus. The bus can be divided into an address bus, a data bus, a control bus, and so on.
In the embodiment of the present application, the memory 1802 stores instructions executable by the at least one processor 1801, and by executing the instructions stored in the memory 1802, the at least one processor 1801 can perform the steps included in the aforementioned method of determining a dynamic effect.
The processor 1801 is the control center of the computer device. It can use various interfaces and lines to connect the various parts of the terminal device, and obtain the client address by running or executing the instructions stored in the memory 1802 and calling the data stored in the memory 1802. Optionally, the processor 1801 may include one or more processing units, and the processor 1801 may integrate an application processor and a modem processor, wherein the application processor mainly handles the operating system, the user interface, application programs and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 1801. In some embodiments, the processor 1801 and the memory 1802 can be implemented on the same chip; in some embodiments, they can also be implemented on separate chips.
The processor 1801 can be a general-purpose processor, such as a central processing unit (CPU), a digital signal processor, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, and can implement or execute the methods, steps and logic diagrams disclosed in the embodiments of the present application. The general-purpose processor can be a microprocessor, any conventional processor, or the like. The steps of the methods disclosed in the embodiments of the present application can be directly executed and completed by a hardware processor, or executed and completed by a combination of hardware and software modules in the processor.
As a non-volatile computer-readable storage medium, the memory 1802 can be used to store non-volatile software programs, non-volatile computer-executable programs and modules. The memory 1802 may include at least one type of storage medium, for example a flash memory, a hard disk, a multimedia card, a card-type memory, a random access memory (Random Access Memory, RAM), a static random access memory (Static Random Access Memory, SRAM), a programmable read-only memory (Programmable Read Only Memory, PROM), a read-only memory (Read Only Memory, ROM), an electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), a magnetic memory, a magnetic disk, an optical disc, and so on. The memory 1802 can be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 1802 in the embodiment of the present application can also be a circuit or any other device capable of realizing a storage function, and is configured to store program instructions and/or data.
Based on the same technical idea, the embodiment of the present application provides a computer-readable storage medium storing a computer program executable by a computer device. When the program is run on the computer device, the computer device is caused to execute the steps of the method of determining a dynamic effect.
It should be understood by those skilled in the art that the embodiments of the present invention can be provided as a method, a system or a computer program product. Therefore, the present invention can take the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical memory and the like) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, the device (system) and the computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be realized by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device generate a device for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions can also be stored in a computer-readable memory capable of guiding a computer or another programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce a manufactured article including an instruction device, and the instruction device realizes the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although the preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they know the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications falling within the scope of the present invention.
Obviously, those skilled in the art can make various modifications and variations to the embodiments of the present invention without departing from the spirit and scope of the embodiments of the present invention. In this way, if these modifications and variations of the embodiments of the present invention fall within the scope of the claims of the present invention and their equivalent technologies, the present invention is also intended to include these modifications and variations.

Claims (12)

1. A method of determining a dynamic effect, characterized in that the method comprises:
obtaining a gesture image of a user;
determining the gesture type of the user and the gesture feature point information of the user according to the gesture image;
determining the gesture additional dynamic effect of the user according to the gesture type, and determining a gesture additional dynamic effect play mode according to the gesture feature point information of the user;
showing the gesture additional dynamic effect according to the gesture additional dynamic effect play mode.
2. The method according to claim 1, characterized in that determining the gesture feature point information of the user according to the gesture image comprises:
determining the region of interest in the gesture image;
determining a feature point detection model according to the structural features of the feature points in historical gesture images;
determining the feature point information in the region of interest according to the feature point detection model and the region of interest.
3. The method according to claim 2, characterized in that determining the feature point information in the region of interest according to the feature point detection model and the region of interest comprises:
inputting the region of interest into the feature point detection model, and obtaining the image features of the region of interest through alternating convolution operations and pooling operations in the feature point detection model;
determining the feature point information corresponding to the image features according to the structural features of the feature points in the historical gesture images learned by the feature point detection model.
4. The method according to any one of claims 1 to 3, characterized in that the gesture feature point information is gesture feature point coordinate position information, and determining the gesture additional dynamic effect play mode according to the gesture feature point information of the user comprises:
connecting the coordinate position information of each feature point in sequence according to a preset order to form a feature point trajectory;
determining the gesture additional dynamic effect play mode according to the feature point trajectory.
5. The method according to claim 4, characterized in that determining the gesture additional dynamic effect play mode according to the feature point trajectory comprises:
determining a gesture additional dynamic effect play direction and a gesture additional dynamic effect play range according to the feature point trajectory.
6. The method according to claim 5, characterized in that showing the gesture additional dynamic effect according to the gesture additional dynamic effect play mode comprises:
determining the display direction of the gesture additional dynamic effect according to the play direction of the gesture additional dynamic effect play mode;
determining the display range of the gesture additional dynamic effect according to the play range of the gesture additional dynamic effect play mode.
7. The method according to claim 2 or 3, characterized in that the feature points are hand skeleton nodes of a set quantity, and the gesture additional dynamic effect of the user is a love-heart dynamic effect;
determining the gesture additional dynamic effect play mode according to the gesture feature point information of the user comprises:
determining, according to the hand skeleton node information of the user, that the gesture additional dynamic effect play mode is to play the gesture additional dynamic effect between the index finger and thumb of the user;
showing the gesture additional dynamic effect according to the gesture additional dynamic effect play mode comprises:
playing the love-heart dynamic effect between the index finger and thumb of the user.
8. A device of determining a dynamic effect, characterized by comprising:
an acquiring unit, configured to obtain a gesture image of a user;
a determination unit, configured to determine the gesture type of the user and the gesture feature point information of the user according to the gesture image; determine the gesture additional dynamic effect of the user according to the gesture type; and determine a gesture additional dynamic effect play mode according to the gesture feature point information of the user;
a display unit, configured to show the gesture additional dynamic effect according to the gesture additional dynamic effect play mode.
9. The device according to claim 8, characterized in that the determination unit is specifically configured to:
determine the region of interest in the gesture image;
determine a feature point detection model according to the structural features of the feature points in historical gesture images;
determine the feature point information in the region of interest according to the feature point detection model and the region of interest.
10. The device according to claim 9, characterized in that the determination unit is specifically configured to:
input the region of interest into the feature point detection model, and obtain the image features of the region of interest through alternating convolution operations and pooling operations in the feature point detection model;
determine the feature point information corresponding to the image features according to the structural features of the feature points in the historical gesture images learned by the feature point detection model.
11. A computer device, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the method according to any one of claims 1 to 7.
12. A computer-readable storage medium, characterized in that it stores a computer program executable by a computer device, and when the program is run on the computer device, the computer device is caused to perform the steps of the method according to any one of claims 1 to 7.
CN201910703830.3A 2019-07-31 2019-07-31 A kind of method and device of determining dynamic effect Pending CN110442238A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910703830.3A CN110442238A (en) 2019-07-31 2019-07-31 A kind of method and device of determining dynamic effect

Publications (1)

Publication Number Publication Date
CN110442238A true CN110442238A (en) 2019-11-12

Family

ID=68432675

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106933334A (en) * 2015-12-30 2017-07-07 上海风语筑展览有限公司 A kind of spectators and three-dimensional development Interactive Experience method
US20180101237A1 (en) * 2016-01-04 2018-04-12 Boe Technology Group Co., Ltd. System, method, and apparatus for man-machine interaction
CN107340852A (en) * 2016-08-19 2017-11-10 北京市商汤科技开发有限公司 Gestural control method, device and terminal device
CN107168527A (en) * 2017-04-25 2017-09-15 华南理工大学 The first visual angle gesture identification and exchange method based on region convolutional neural networks
CN108594997A (en) * 2018-04-16 2018-09-28 腾讯科技(深圳)有限公司 Gesture framework construction method, apparatus, equipment and storage medium
CN108932053A (en) * 2018-05-21 2018-12-04 腾讯科技(深圳)有限公司 Drawing practice, device, storage medium and computer equipment based on gesture
CN108762505A (en) * 2018-05-29 2018-11-06 腾讯科技(深圳)有限公司 Virtual object control method, device, storage medium based on gesture and equipment
CN108958475A (en) * 2018-06-06 2018-12-07 阿里巴巴集团控股有限公司 virtual object control method, device and equipment

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111766947A (en) * 2020-06-30 2020-10-13 歌尔科技有限公司 Display method, display device, wearable device and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination