CN110516572A - Method for identifying sports event video clip, electronic equipment and storage medium - Google Patents
- Publication number
- CN110516572A CN110516572A CN201910759733.6A CN201910759733A CN110516572A CN 110516572 A CN110516572 A CN 110516572A CN 201910759733 A CN201910759733 A CN 201910759733A CN 110516572 A CN110516572 A CN 110516572A
- Authority
- CN
- China
- Prior art keywords
- action classification
- recognition result
- sample data
- video clip
- preset model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000000034 method Methods 0.000 title claims abstract description 53
- 230000009471 action Effects 0.000 claims abstract description 108
- 230000002860 competitive effect Effects 0.000 claims description 36
- 238000013527 convolutional neural network Methods 0.000 claims description 31
- 238000004590 computer program Methods 0.000 claims description 9
- 238000004422 calculation algorithm Methods 0.000 claims description 8
- 230000015654 memory Effects 0.000 claims description 8
- 230000001052 transient effect Effects 0.000 claims description 5
- 230000008901 benefit Effects 0.000 abstract description 6
- 238000001514 detection method Methods 0.000 description 3
- 230000000694 effects Effects 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 3
- 230000008569 process Effects 0.000 description 3
- 238000013528 artificial neural network Methods 0.000 description 2
- 230000006399 behavior Effects 0.000 description 2
- 238000013135 deep learning Methods 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 230000008859 change Effects 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 239000000284 extract Substances 0.000 description 1
- 238000005286 illumination Methods 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 230000013016 learning Effects 0.000 description 1
- 230000007787 long-term memory Effects 0.000 description 1
- 238000007781 pre-processing Methods 0.000 description 1
- 230000011218 segmentation Effects 0.000 description 1
- 238000000926 separation method Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V20/42—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- Human Computer Interaction (AREA)
- Image Analysis (AREA)
Abstract
The embodiment of the invention provides a method for identifying a video clip of a sports event, electronic equipment and a storage medium, wherein the method comprises the following steps: identifying the action category of the sports event video clip by adopting a first preset model; the training of the first preset model adopts first sample data; the first sample data is data related to action categories; if the accuracy of the recognition result is lower than a preset threshold, re-recognizing the action type by adopting a second preset model, and taking the re-recognition result as a final recognition result of the action type; training of the second preset model adopts second sample data; the second sample data is data related to the relative position of a target reference object in the video clip of the sports event; the relative position is a position between the target reference object and the trigger portion of the action type. The method for identifying the video clips of the sports events, the electronic device and the storage medium provided by the embodiment of the invention can accurately identify the video clips of the sports events and also have the advantages of high efficiency, simplicity and strong universality.
Description
Technical field
The present invention relates to the technical field of video processing, and more particularly to a method for identifying sports event video clips, an electronic device, and a storage medium.
Background art
When a video app needs to publish short highlight compilations of sports tournaments for different scenes (such as goals, penalty kicks, free throws, etc.), besides the traditional method of manually editing short videos, match videos can also be automatically edited by AI using deep learning algorithms. The first task of AI-automated editing is to recognize the scenes in the match video. Many deep learning methods can already identify certain video scenes: for example, a 3D convolutional neural network applied to the Kinetics dataset (human behavior classes) achieves an average accuracy of up to 83.6%, and an LSTM network applied to the UCF-101 dataset (101 action classes) achieves an average accuracy of up to 88.6%. It can be seen that existing technical solutions achieve good accuracy for recognizing simple, isolated human actions. However, for scene recognition in sports tournaments, especially basketball and football, the results are far less satisfactory: the recognition rate for such scenes is only around 60%. This is mainly because these scenes involve dense crowds, large motion spans in person-to-person interaction, and various environmental differences such as multiple viewing angles, illumination changes, and low resolution. These factors make the training samples highly complex, so the accuracy of the recognition and classification model is low.
Therefore, how to avoid the above drawbacks and accurately identify sports event video clips has become a problem to be solved.
Summary of the invention
In view of the problems in the prior art, embodiments of the present invention provide a method for identifying sports event video clips, an electronic device, and a storage medium.
An embodiment of the present invention provides a method for identifying sports event video clips, comprising:
identifying the action category of a sports event video clip using a first preset model; the first preset model is trained with first sample data; the first sample data are data related to action categories;
if the accuracy of the recognition result is lower than a preset threshold, re-identifying the action category using a second preset model, and taking the re-recognition result as the final recognition result of the action category; the second preset model is trained with second sample data; the second sample data are data related to the relative position of a target reference object in the sports event video clip; the relative position is the position between the target reference object and the body part that triggers the action category.
An embodiment of the present invention provides an electronic device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the following method steps:
identifying the action category of a sports event video clip using a first preset model; the first preset model is trained with first sample data; the first sample data are data related to action categories;
if the accuracy of the recognition result is lower than a preset threshold, re-identifying the action category using a second preset model, and taking the re-recognition result as the final recognition result of the action category; the second preset model is trained with second sample data; the second sample data are data related to the relative position of a target reference object in the sports event video clip; the relative position is the position between the target reference object and the body part that triggers the action category.
An embodiment of the present invention provides a non-transient computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the following method steps are implemented:
identifying the action category of a sports event video clip using a first preset model; the first preset model is trained with first sample data; the first sample data are data related to action categories;
if the accuracy of the recognition result is lower than a preset threshold, re-identifying the action category using a second preset model, and taking the re-recognition result as the final recognition result of the action category; the second preset model is trained with second sample data; the second sample data are data related to the relative position of a target reference object in the sports event video clip; the relative position is the position between the target reference object and the body part that triggers the action category.
With the method, electronic device, and storage medium for identifying sports event video clips provided by the embodiments of the present invention, the action category of a sports event video clip is first recognized, and the model used for the second recognition is trained with data related to the relative position of the target reference object in the sports event video clip as the second sample data. Sports event video clips can thus be identified accurately, with the additional advantages of being efficient, simple, and highly general.
Brief description of the drawings
In order to explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below illustrate some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a flowchart of an embodiment of the method for identifying sports event video clips according to the present invention;
Fig. 2 is a flowchart of another embodiment of the method for identifying sports event video clips according to the present invention;
Fig. 3 is a schematic diagram of the physical structure of an electronic device provided by an embodiment of the present invention.
Detailed description of the embodiments
In order to make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
Fig. 1 is a flowchart of an embodiment of the method for identifying sports event video clips according to the present invention. As shown in Fig. 1, the method provided by an embodiment of the present invention comprises the following steps:
S101: identifying the action category of a sports event video clip using a first preset model; the first preset model is trained with first sample data; the first sample data are data related to action categories.
Specifically, a device identifies the action category of the sports event video clip using the first preset model; the first preset model is trained with first sample data, which are data related to action categories. The device may be an electronic device. Taking basketball as an example, action categories may include layup, dunk, rebound, free throw, and so on. Since the technical motions of a layup and a dunk are highly similar, the first preset model cannot easily distinguish whether the action category is a layup or a dunk. Therefore, the accuracy of the recognition results for layups and dunks is relatively low, usually lower than that for rebounds and free throws.
The first preset model may be a convolutional neural network combining a non-local module (nonlocal) with two-stream inflated 3D convolution (I3D). The embodiment of the present invention uses a two-stream inflated 3D convolution (I3D) convolutional neural network as the base network and adds non-local modules to it to obtain better global context. An I3D convolutional neural network is a network structure in which the convolution kernels and pooling kernels are inflated into 3D form, i.e., a time dimension is added on top of the original length and width of all convolution and pooling kernels. The non-local module extracts spatio-temporal information beyond local neighborhoods of the video, providing the deep neural network with long-range memory and global information; it is efficient, simple, and general, and can be conveniently embedded into existing network architectures. Therefore, the embodiment of the present invention uses a convolutional neural network combining nonlocal and I3D, trains this network, and then uses the trained network to identify the action category of the sports event video clip. The training of the first preset model is described later.
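As a rough illustration only (the patent does not disclose source code, and all names below are assumptions), the core of a non-local module — a self-attention-style weighted sum over all spatio-temporal positions, which is what gives the network its "global information" — can be sketched in NumPy:

```python
import numpy as np

def nonlocal_block(x, w_theta, w_phi, w_g):
    """Simplified embedded-Gaussian non-local operation (illustrative).

    x: (N, C) feature matrix, one row per spatio-temporal position
       (N = T*H*W after flattening a video feature map).
    w_theta, w_phi, w_g: (C, C') projection matrices.
    Returns an (N, C') response in which every position aggregates
    information from all other positions, i.e., the long-range
    context the description attributes to the module.
    """
    theta = x @ w_theta            # queries, (N, C')
    phi = x @ w_phi                # keys,    (N, C')
    g = x @ w_g                    # values,  (N, C')
    scores = theta @ phi.T         # pairwise similarity, (N, N)
    scores -= scores.max(axis=1, keepdims=True)   # numeric stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # softmax over positions
    return attn @ g                # (N, C')

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 4))                  # 6 positions, 4 channels
w = [rng.standard_normal((4, 4)) for _ in range(3)]
y = nonlocal_block(x, *w)
print(y.shape)                                   # (6, 4)
```

In a real I3D-nonlocal network this operation would be wrapped with a residual connection and inserted between convolutional stages; the sketch above only shows the attention computation itself.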
Fig. 2 is a flowchart of another embodiment of the method for identifying sports event video clips according to the present invention. As shown in Fig. 2, the video stream data may first be obtained and then cut into video segments, thereby obtaining the sports event video clips; prediction with the I3D-nonlocal model then corresponds to identifying the action category of the sports event video clip using the first preset model.
S102: if the accuracy of the recognition result is lower than a preset threshold, re-identifying the action category using a second preset model, and taking the re-recognition result as the final recognition result of the action category; the second preset model is trained with second sample data; the second sample data are data related to the relative position of a target reference object in the sports event video clip; the relative position is the position between the target reference object and the body part that triggers the action category.
Specifically, if the device determines that the accuracy of the recognition result is lower than the preset threshold, it re-identifies the action category using the second preset model and takes the re-recognition result as the final recognition result of the action category. The second preset model is trained with second sample data, which are data related to the relative position of the target reference object in the sports event video clip; the relative position is the position between the target reference object and the body part that triggers the action category. The preset threshold can be set independently according to the actual situation; for the basketball match scene described above, the preset threshold may be chosen as 70%, and the accuracy of the recognition result can be represented by the confidence of the action category.
Referring to the example above, if the accuracy of the recognition results for layups and dunks is lower than the preset threshold, the action category is re-identified with the second preset model, which may be a convolutional neural network (CNN) classifier. Taking basketball as an example, the target reference object is the backboard, and the relative position of the target reference object is the position between the backboard and the hand. It can be understood that if the backboard is relatively far from the hand, i.e., the distance between the backboard and the hand is greater than a preset distance, the action category is a layup; if the backboard is relatively close to the hand, i.e., the distance between the backboard and the hand is less than the preset distance, the action category is a dunk. The preset distance can be set independently according to the actual situation. The training of the second preset model is described later.
The relative position can be obtained by detecting every frame of the sports event video clip with the YOLO algorithm. The full name of YOLO is "You Only Look Once: Unified, Real-Time Object Detection": "You Only Look Once" means only a single CNN pass is needed, "Unified" means it is a unified framework providing end-to-end prediction, and "Real-Time" reflects the high speed of the YOLO algorithm.
With the method for identifying sports event video clips provided by the embodiment of the present invention, the action category of a sports event video clip is first recognized, and the model for the second recognition takes data related to the relative position of the target reference object in the sports event video clip as the second training sample data. Sports event video clips can thus be identified accurately, with the additional advantages of being efficient, simple, and highly general.
On the basis of the above embodiments, re-identifying the action category using the second preset model and taking the re-recognition result as the final recognition result of the action category comprises:
if the re-recognition result includes multiple action categories, obtaining the count of each action category, and taking the action category with the largest count as the final recognition result.
Specifically, if the device determines that the re-recognition result includes multiple action categories, it obtains the count of each action category and takes the action category with the largest count as the final recognition result. Referring to the example above, the multiple action categories may include the two categories layup and dunk, and the counts of the layup action category and the dunk action category are obtained respectively. If the count of the layup category is greater than that of the dunk category, layup is taken as the final recognition result; if the count of the dunk category is greater than that of the layup category, dunk is taken as the final recognition result.
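The majority-vote step can be expressed with the standard library. This is a sketch under the assumption that the re-recognition result is a list of per-frame (or per-sample) category labels:

```python
from collections import Counter

def majority_action(per_frame_labels):
    """Return the most frequent action category among re-recognition
    results, plus a flag saying whether the largest count is unique
    (a tie is resolved by confidence in a later embodiment)."""
    counts = Counter(per_frame_labels)
    ranked = counts.most_common()
    best_label, best_count = ranked[0]
    unique = len(ranked) == 1 or ranked[1][1] < best_count
    return best_label, unique

label, unique = majority_action(["layup", "dunk", "dunk", "dunk", "layup"])
print(label, unique)  # dunk True
```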
The method for identifying sports event video clips provided by the embodiment of the present invention determines the final recognition result by the counts of the action categories, further enabling accurate identification of sports event video clips.
On the basis of the above embodiments, re-identifying the action category using the second preset model and taking the re-recognition result as the final recognition result of the action category comprises:
if it is determined that the action category with the largest count is not unique, obtaining the confidence of each action category with the largest count, and taking the action category with the highest confidence value as the final recognition result.
Specifically, if the device determines that the action category with the largest count is not unique, it obtains the confidence of each action category with the largest count and takes the action category with the highest confidence value as the final recognition result. Referring to the example above: if the count of the layup action category equals the count of the dunk action category, the confidences of the layup action and the dunk action are obtained respectively. If the confidence value of the layup action is greater than that of the dunk action, layup is taken as the final recognition result; if the confidence value of the layup action is less than that of the dunk action, dunk is taken as the final recognition result.
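The tie-break can be sketched in one line; the confidences here are hypothetical values standing in for the CNN classifier's output scores:

```python
def resolve_tie(category_confidences):
    """When several action categories share the largest count, pick
    the one with the highest confidence value (illustrative sketch;
    in practice the confidences come from the classifier's output)."""
    return max(category_confidences, key=category_confidences.get)

tied = {"layup": 0.62, "dunk": 0.87}
print(resolve_tie(tied))  # dunk
```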
The method for identifying sports event video clips provided by the embodiment of the present invention determines the final recognition result by the confidence values of the action categories, further enabling accurate identification of sports event video clips.
On the basis of the above embodiments, the relative position is obtained by detecting every frame of the sports event video clip with the YOLO algorithm.
Specifically, the relative position used by the device is obtained by detecting every frame of the sports event video clip with the YOLO algorithm; see the description above, which is not repeated here.
In the method for identifying sports event video clips provided by the embodiment of the present invention, obtaining the relative position by YOLO-based detection of every frame of the clip guarantees the accuracy of the obtained relative position and further enables accurate identification of sports event video clips.
On the basis of the above embodiments, the first preset model is a convolutional neural network combining a non-local module (nonlocal) with two-stream inflated 3D convolution (I3D).
Specifically, the first preset model in the device is a convolutional neural network combining a non-local module with two-stream inflated 3D convolution; see the description above, which is not repeated here.
In the method for identifying sports event video clips provided by the embodiment of the present invention, choosing such a network as the first preset model further enables accurate identification of sports event video clips, with the additional advantages of being efficient, simple, and highly general.
On the basis of the above embodiments, the second preset model is a convolutional neural network (CNN) classifier.
Specifically, the second preset model in the device is a CNN classifier; see the description above, which is not repeated here.
In the method for identifying sports event video clips provided by the embodiment of the present invention, choosing a CNN classifier as the second preset model further enables accurate identification of sports event video clips.
On the basis of the above embodiments, the training of the first preset model comprises:
collecting action category data of each sports event video clip as the first sample data.
Specifically, the device collects the action category data of each sports event video clip as the first sample data. Referring to the example above, the first sample data may be video clips of the five categories layup, dunk, rebound, free throw, and background; each video clip may be 64 frames.
The first sample data are pre-processed, and the convolutional neural network combining nonlocal and I3D is trained with the pre-processed first sample data.
Specifically, the device pre-processes the first sample data and trains the network with the pre-processed first sample data. Pre-processing can be divided into the following steps. First, since the sample sizes of different categories differ greatly, the samples are augmented by image enhancement operations such as flipping, rotation, and noise addition, to increase the number of training samples and balance the sample sizes of the categories. Second, every frame of the training samples is uniformly scaled to the same size; in the embodiment of the present invention the length and width of each picture are adjusted to 256*320, to increase the model's loading speed during training. Third, each sampled picture is cropped into multiple new pictures by random cropping; in the embodiment of the present invention, three pictures of size 224*224 are cropped as input samples of the algorithm, to increase the generalization of the training samples. Finally, the processed samples are converted to the input format suitable for the model of the embodiment of the present invention, i.e., into LMDB form, in which the sample directory addresses and label data are stored.
A suitable learning rate, number of iterations, and other training parameters are then tuned. In the embodiment of the present invention, the sampling rate is set to 8; therefore, a 64-frame short video contributes 8 picture samples as input. A model is saved every 400 iterations; the error rate is lowest when the number of iterations is 12800. Therefore, the model at that number of iterations is selected as the trained first preset model.
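The temporal subsampling and random-cropping steps described above can be sketched as follows, assuming frames have already been scaled to 256*320 (function and parameter names are illustrative):

```python
import numpy as np

def preprocess_clip(clip, rng, sample_rate=8, crop=(224, 224), n_crops=3):
    """Sketch of the described pre-processing: temporally subsample a
    64-frame clip at a sampling rate of 8 (keeping 8 frames), then take
    n_crops random 224x224 spatial crops from the kept frames.

    clip: (T, H, W, C) uint8 array, frames assumed already 256x320.
    Returns (n_crops, T//sample_rate, 224, 224, C).
    """
    frames = clip[::sample_rate]                  # (8, 256, 320, C)
    t, h, w, c = frames.shape
    ch, cw = crop
    crops = []
    for _ in range(n_crops):
        y = rng.integers(0, h - ch + 1)           # random top-left corner
        x = rng.integers(0, w - cw + 1)
        crops.append(frames[:, y:y + ch, x:x + cw, :])
    return np.stack(crops)

rng = np.random.default_rng(0)
clip = rng.integers(0, 256, size=(64, 256, 320, 3), dtype=np.uint8)
batch = preprocess_clip(clip, rng)
print(batch.shape)  # (3, 8, 224, 224, 3)
```

The augmentation (flip, rotation, noise) and LMDB conversion steps are omitted here; they would run before and after this stage respectively.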
The convolutional neural network combining nonlocal and I3D at the time a first preset condition is reached is taken as the trained first preset model.
Specifically, the device takes the network at the time the first preset condition is reached as the trained first preset model. The first preset condition may be that the number of iterations reaches 12800, which is not specifically limited.
In the method for identifying sports event video clips provided by the embodiment of the present invention, training the first preset model guarantees its accuracy, further enabling accurate identification of sports event video clips, with the additional advantages of being efficient, simple, and highly general.
On the basis of the above embodiments, the training of the second preset model comprises:
collecting the relative-position data corresponding to the target reference object of each sports event video clip as the second sample data.
Specifically, the device collects the relative-position data corresponding to the target reference object of each sports event video clip as the second sample data. The second sample data may be the position data of the backboard relative to the hand in video clips of the two categories layup and dunk; each video clip may be around 64 frames.
The second sample data are pre-processed, and the CNN classifier is trained with the pre-processed second sample data.
Specifically, the device pre-processes the second sample data and trains the CNN classifier with the pre-processed second sample data. The steps of pre-processing the second sample data and training the CNN classifier can be the same as the steps described above for pre-processing the first sample data and training the network combining nonlocal and I3D, and are not repeated here.
The CNN classifier at the time a second preset condition is reached is taken as the trained second preset model.
Specifically, the device takes the CNN classifier at the time the second preset condition is reached as the trained second preset model. The second preset condition may include the number of iterations reaching a preset number or the model error falling below a preset error, which is not specifically limited; the preset number and the preset error can be set independently according to the actual situation.
In the method for identifying sports event video clips provided by the embodiment of the present invention, training the second preset model guarantees its accuracy, further enabling accurate identification of sports event video clips.
It should be understood that a single sports event video clip may contain different types of action classifications, for example layups, dunks, rebounds and free throws. Suppose that the accuracy of the recognition results obtained with the first preset model for rebounds and free throws is above the preset threshold, while the accuracy of its recognition results for layups and dunks is below the preset threshold. The method may then further include, after step S101, the following step:
S102': if there is a target action classification for which the accuracy of the recognition result is below the preset threshold, the target action classification is identified again using the second preset model, and the final recognition result of the action classification is determined from the re-identification result together with the recognition results above the preset threshold. The second preset model is trained with second sample data; the second sample data are data relating to the relative position of the target reference object in the sports event video clip; the relative position is the position between the target reference object and the triggering position of the action classification.
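A minimal sketch of how the relative position between the target reference object and the triggering position of an action might be encoded as a feature. The choice of the hoop as the target reference object, the box coordinates, and the centre-offset normalisation are illustrative assumptions; the patent itself only specifies that the relative position relates the target reference object to the action's triggering position:

```python
def relative_position(ref_box, trigger_box):
    """Relative position between the target reference object (e.g. the hoop)
    and the triggering position of the action, expressed as a (dx, dy)
    offset of box centres normalised by the reference-box size.
    Boxes are (x1, y1, x2, y2) in pixels."""
    rx = (ref_box[0] + ref_box[2]) / 2.0   # reference-box centre
    ry = (ref_box[1] + ref_box[3]) / 2.0
    tx = (trigger_box[0] + trigger_box[2]) / 2.0  # trigger-position centre
    ty = (trigger_box[1] + trigger_box[3]) / 2.0
    w = max(ref_box[2] - ref_box[0], 1e-6)  # guard against zero-size boxes
    h = max(ref_box[3] - ref_box[1], 1e-6)
    return ((tx - rx) / w, (ty - ry) / h)


# Illustrative boxes: a hoop and a release point below and left of it.
hoop = (100, 50, 140, 90)       # 40x40 reference box
release = (60, 130, 100, 170)   # triggering position of the action
dx, dy = relative_position(hoop, release)
```

Features of this kind, collected per frame, would form the second sample data on which the CNN classifier is trained.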
This is described with reference to Fig. 2: the recognition results above the preset threshold are rebounds and free throws; a short video is synthesized; layups and dunks are taken as the target action classifications and identified again using the second preset model. At this point the re-identification results, together with the recognition results above the preset threshold, may include layups, dunks, rebounds and free throws, and the final recognition result may further be determined from the number of each action classification and the confidence of each action classification, respectively. For how the final recognition result is determined from the numbers and confidences of the action classifications, refer to the above explanation for the two action classifications of layups and dunks, which is not repeated here.
Fig. 3 is a schematic diagram of the physical structure of an electronic device provided by an embodiment of the present invention. As shown in Fig. 3, the electronic device includes a processor (processor) 301, a memory (memory) 302 and a bus 303,
wherein the processor 301 and the memory 302 communicate with each other through the bus 303.
The processor 301 is configured to call program instructions in the memory 302 to execute the methods provided by the above method embodiments, which include, for example: identifying the action classification of a sports event video clip using a first preset model, the first preset model being trained with first sample data, and the first sample data being data relating to action classifications; and, if the accuracy of the recognition result is below a preset threshold, identifying the action classification again using a second preset model and taking the re-identification result as the final recognition result of the action classification, the second preset model being trained with second sample data, the second sample data being data relating to the relative position of the target reference object in the sports event video clip, and the relative position being the position between the target reference object and the triggering position of the action classification.
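Taken together, the two-stage flow the processor executes can be sketched as follows. The callable model objects, segment identifiers and accuracy values are placeholders for illustration (in the patent, the first preset model is the nonlocal+I3D network and the second is a CNN classifier over relative-position data):

```python
def identify_clip(clip, first_model, second_model, threshold=0.8):
    """Two-stage recognition: the first preset model labels every action
    in the clip; any action whose recognition accuracy falls below the
    preset threshold is re-identified by the second preset model."""
    final = {}
    for action, (label, accuracy) in first_model(clip).items():
        if accuracy < threshold:
            final[action] = second_model(clip, action)  # re-identification
        else:
            final[action] = label  # first-stage result kept as final
    return final


# Placeholder models: the first stage is unsure about segment 'a2'.
first = lambda clip: {"a1": ("rebound", 0.93), "a2": ("layup", 0.55)}
second = lambda clip, action: "dunk"
out = identify_clip("clip.mp4", first, second)
```

Only the low-confidence segment is re-examined, so the cost of the second model is paid solely where the first model is unreliable.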
The present embodiment discloses a computer program product. The computer program product includes a computer program stored on a non-transitory computer-readable storage medium, and the computer program includes program instructions. When the program instructions are executed by a computer, the computer can perform the methods provided by the above method embodiments, which include, for example: identifying the action classification of a sports event video clip using a first preset model, the first preset model being trained with first sample data, and the first sample data being data relating to action classifications; and, if the accuracy of the recognition result is below a preset threshold, identifying the action classification again using a second preset model and taking the re-identification result as the final recognition result of the action classification, the second preset model being trained with second sample data, the second sample data being data relating to the relative position of the target reference object in the sports event video clip, and the relative position being the position between the target reference object and the triggering position of the action classification.
The present embodiment provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the methods provided by the above method embodiments, which include, for example: identifying the action classification of a sports event video clip using a first preset model, the first preset model being trained with first sample data, and the first sample data being data relating to action classifications; and, if the accuracy of the recognition result is below a preset threshold, identifying the action classification again using a second preset model and taking the re-identification result as the final recognition result of the action classification, the second preset model being trained with second sample data, the second sample data being data relating to the relative position of the target reference object in the sports event video clip, and the relative position being the position between the target reference object and the triggering position of the action classification.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be completed by hardware related to program instructions. The aforementioned program may be stored in a computer-readable storage medium; when the program is executed, it performs the steps of the above method embodiments. The aforementioned storage medium includes various media capable of storing program code, such as ROM, RAM, magnetic disk or optical disk.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separated, and components shown as units may or may not be physical units; they may be located in one place, or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement this without creative labour.
Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus a necessary general hardware platform, and can of course also be implemented by hardware. Based on this understanding, the above technical solution, or the part of it that contributes to the prior art, can essentially be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk or optical disk, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in each embodiment or in certain parts of the embodiments.
Finally, it should be noted that the above embodiments are merely intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. A method for identifying competitive sports video clips, characterized by comprising:
identifying an action classification of a sports event video clip using a first preset model, wherein the first preset model is trained with first sample data, and the first sample data are data relating to action classifications;
if the accuracy of a recognition result is below a preset threshold, identifying the action classification again using a second preset model, and taking the re-identification result as the final recognition result of the action classification, wherein the second preset model is trained with second sample data, the second sample data are data relating to the relative position of a target reference object in the sports event video clip, and the relative position is the position between the target reference object and the triggering position of the action classification.
2. The method for identifying competitive sports video clips according to claim 1, wherein identifying the action classification again using the second preset model and taking the re-identification result as the final recognition result of the action classification comprises:
if the re-identification result includes multiple action classifications, obtaining the number of each action classification respectively, and taking the most numerous action classification as the final recognition result.
3. The method for identifying competitive sports video clips according to claim 2, wherein identifying the action classification again using the second preset model and taking the re-identification result as the final recognition result of the action classification comprises:
if it is determined that the most numerous action classification is not unique, obtaining the confidence of each of the most numerous action classifications respectively, and taking the action classification with the largest confidence value as the final recognition result.
4. The method for identifying competitive sports video clips according to any one of claims 1 to 3, wherein the relative position is obtained by detecting each frame picture of the sports event video clip using the yolo algorithm.
5. The method for identifying competitive sports video clips according to any one of claims 1 to 3, wherein the first preset model is a convolutional neural network combining a non-local module (nonlocal) and two-stream inflated 3D convolution (I3D).
6. The method for identifying competitive sports video clips according to any one of claims 1 to 3, wherein the second preset model is a convolutional neural network (CNN) classifier.
7. The method for identifying competitive sports video clips according to claim 5, wherein the training of the first preset model comprises:
collecting the action classification data of each sports event video clip as the first sample data;
pre-processing the first sample data, and training the convolutional neural network combining nonlocal and I3D with the pre-processed first sample data;
taking the convolutional neural network combining nonlocal and I3D at the time a first preset condition is reached as the trained first preset model.
8. The method for identifying competitive sports video clips according to claim 6, wherein the training of the second preset model comprises:
collecting the relative-position data corresponding to the target reference object of each sports event video clip as the second sample data;
pre-processing the second sample data, and training the CNN classifier with the pre-processed second sample data;
taking the CNN classifier at the time a second preset condition is reached as the trained second preset model.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and runnable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 8.
10. A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910759733.6A CN110516572B (en) | 2019-08-16 | 2019-08-16 | Method for identifying sports event video clip, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110516572A true CN110516572A (en) | 2019-11-29 |
CN110516572B CN110516572B (en) | 2022-06-28 |
Family
ID=68625506
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910759733.6A Active CN110516572B (en) | 2019-08-16 | 2019-08-16 | Method for identifying sports event video clip, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110516572B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9330726B1 (en) * | 2015-10-09 | 2016-05-03 | Sports Logic Group, LLC | System capable of integrating user-entered game event data with corresponding video data |
CN105608690A (en) * | 2015-12-05 | 2016-05-25 | 陕西师范大学 | Graph theory and semi supervised learning combination-based image segmentation method |
CN107766839A (en) * | 2017-11-09 | 2018-03-06 | 清华大学 | Action identification method and device based on neutral net |
CN107967491A (en) * | 2017-12-14 | 2018-04-27 | 北京木业邦科技有限公司 | Machine learning method, device, electronic equipment and the storage medium again of plank identification |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111209897A (en) * | 2020-03-09 | 2020-05-29 | 腾讯科技(深圳)有限公司 | Video processing method, device and storage medium |
CN111209897B (en) * | 2020-03-09 | 2023-06-20 | 深圳市雅阅科技有限公司 | Video processing method, device and storage medium |
CN111598035A (en) * | 2020-05-22 | 2020-08-28 | 北京爱宾果科技有限公司 | Video processing method and system |
CN111598035B (en) * | 2020-05-22 | 2023-05-23 | 北京爱宾果科技有限公司 | Video processing method and system |
CN113542774A (en) * | 2021-06-04 | 2021-10-22 | 北京格灵深瞳信息技术股份有限公司 | Video synchronization method and device, electronic equipment and storage medium |
CN113542774B (en) * | 2021-06-04 | 2023-10-20 | 北京格灵深瞳信息技术股份有限公司 | Video synchronization method, device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110516572B (en) | 2022-06-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10748376B2 (en) | Real-time game tracking with a mobile device using artificial intelligence | |
CN110533097B (en) | Image definition recognition method and device, electronic equipment and storage medium | |
CN110166827B (en) | Video clip determination method and device, storage medium and electronic device | |
CN109145784B (en) | Method and apparatus for processing video | |
US20190294871A1 (en) | Human action data set generation in a machine learning system | |
CN110837856B (en) | Neural network training and target detection method, device, equipment and storage medium | |
CN109543713A (en) | The modification method and device of training set | |
CN104679818B (en) | A kind of video key frame extracting method and system | |
CN103262119B (en) | For the method and system that image is split | |
CN110516572A (en) | Method for identifying sports event video clip, electronic equipment and storage medium | |
CN106599907A (en) | Multi-feature fusion-based dynamic scene classification method and apparatus | |
CN110298844A (en) | X-ray contrastographic picture blood vessel segmentation and recognition methods and device | |
CN109558892A (en) | A kind of target identification method neural network based and system | |
CN110503076A (en) | Video classification methods, device, equipment and medium based on artificial intelligence | |
CN109753884A (en) | A kind of video behavior recognition methods based on key-frame extraction | |
CN109544592A (en) | For the mobile moving object detection algorithm of camera | |
CN113822254B (en) | Model training method and related device | |
CN108647571A (en) | Video actions disaggregated model training method, device and video actions sorting technique | |
CN105430394A (en) | Video data compression processing method, apparatus and equipment | |
CN113435355A (en) | Multi-target cow identity identification method and system | |
CN109919296A (en) | A kind of deep neural network training method, device and computer equipment | |
Dai et al. | Tan: Temporal aggregation network for dense multi-label action recognition | |
CN109063790A (en) | Object identifying model optimization method, apparatus and electronic equipment | |
CN115187772A (en) | Training method, device and equipment of target detection network and target detection method, device and equipment | |
CN104978583B (en) | The recognition methods of figure action and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||