CN110149531A - Method and apparatus for identifying video scenes in video data - Google Patents
- Publication number: CN110149531A
- Application number: CN201910522913.2A
- Authority
- CN
- China
- Prior art keywords
- video
- scene
- video scene
- video data
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/23418—Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Image Analysis (AREA)
Abstract
An embodiment of the invention discloses a method and apparatus for identifying video scenes in video data. The method includes: determining, by a preset image recognition model, the types of video scene that target video data contains, and obtaining the confidences of the video frames that each type of video scene contains; normalizing the confidences of the video frames contained in each class of video scene, to obtain a weight value for each class of video scene within the target video data; and obtaining parameter information input by a user, determining a target video scene from among the video scenes according to the parameter information and the magnitudes of the weight values, and returning the target video scene to a client. With the method of the present invention, the target video scene corresponding to the target video data can be identified quickly by comparing the weight values of the video scenes within the target video data, thereby improving the efficiency and accuracy of target-video-scene recognition.
Description
Technical field
Embodiments of the present invention relate to the field of video data processing, and in particular to a method and apparatus for identifying video scenes in video data; they further relate to an electronic device and a storage device.
Background technique
With the rapid development of science and technology, video data has become increasingly abundant, and a complete piece of video data often contains multiple video scenes. Identifying the video scenes in video data is a common problem, but accurately identifying those scenes remains relatively difficult. How to improve the accuracy of video scene recognition and reduce the scene misclassification rate of a target video has therefore become a technical problem that those skilled in the art urgently need to solve.
To address this problem, the video scene recognition method generally adopted in the prior art is to extract image features of the video frames contained in the video data to characterize it, and then to use a preset classifier to recognize and classify those image features, with different classes corresponding to different video scenes. However, this approach is easily affected by classifier performance, so its scene classification accuracy is still not high and cannot satisfy current users' needs.
Summary of the invention
To this end, embodiments of the present invention provide a method and apparatus for identifying video scenes in video data, to solve the problem in the prior art that, because current video scene segmentation techniques are not yet mature, the extracted target scenes are inaccurate.
To achieve the above goals, embodiments of the present invention provide the following technical solutions.
According to an embodiment of the present invention, a method for identifying video scenes in video data comprises: segmenting complete video data to be detected according to the differences between the video scenes it contains, to obtain target video data, where the target video data comprises the video frames of at least one video scene; determining, by a preset image recognition model, the types of video scene that the target video data contains, and obtaining the confidences of the video frames that each type of video scene contains, where the image recognition model is a deep neural network model that classifies the video scenes contained in the target video data according to their feature information; normalizing the confidences of the video frames contained in each class of video scene, to obtain a weight value for each class of video scene within the target video data; and obtaining parameter information input by a user, obtaining a target video scene from the video scenes according to the parameter information and the magnitudes of the weight values, and returning the target video scene to a client.
Further, obtaining the target video scene from the video scenes according to the parameter information and the magnitudes of the weight values specifically includes: extracting a target parameter value from the parameter information; and, according to the target parameter value and the magnitudes of the weight values, obtaining from the video scenes a first video scene whose weight value meets or exceeds a preset weight threshold, and taking the first video scene as the target video scene, where the first video scene comprises at least one of the video scenes.
Further, the video scene recognition method also includes: extracting the target parameter value from the parameter information; judging whether the target parameter value is greater than the number of video scene types that the target video data contains; and, if so, returning an alarm prompt to the client.
Further, the confidence of a video frame refers to the probability that the video frame belongs to the corresponding video scene.
Further, segmenting the complete video data to be detected according to the differences between the video scenes it contains, to obtain the target video data, specifically includes: obtaining the complete video data to be detected; obtaining the color features of the video scenes in the complete video data by a feature extraction algorithm; obtaining the color features of the video frames in the complete video data by the same feature extraction algorithm; judging the switching positions between adjacent video scenes according to the first color-feature differences between adjacent video scenes and the second color-feature differences between adjacent video frames in the complete video data; and segmenting the complete video data according to the switching positions.
Correspondingly, the present application also provides an apparatus for identifying video scenes in video data, comprising: a segmentation unit, configured to segment complete video data to be detected according to the differences between the video scenes it contains, to obtain target video data, where the target video data comprises the video frames of at least one video scene; a video scene recognition unit, configured to determine, by a preset image recognition model, the types of video scene that the target video data contains and to obtain the confidences of the video frames that each type of video scene contains, where the image recognition model is a deep neural network model that classifies the video scenes contained in the target video data according to their feature information; a video scene weight analysis unit, configured to normalize the confidences of the video frames contained in each class of video scene, to obtain a weight value for each class of video scene within the target video data; and a target video data obtaining unit, configured to obtain parameter information input by a user, to obtain a target video scene from the video scenes according to the parameter information and the magnitudes of the weight values, and to return the target video scene to a client.
Further, the segmentation unit is specifically configured to: extract a target parameter value from the parameter information; and, according to the target parameter value and the magnitudes of the weight values, obtain from the video scenes a first video scene whose weight value meets or exceeds a preset weight threshold, and take the first video scene as the target video scene, where the first video scene comprises at least one of the video scenes.
Further, the target video data obtaining unit is specifically configured to: obtain the complete video data to be detected; obtain the color features of the video scenes in the complete video data by a feature extraction algorithm; obtain the color features of the video frames in the complete video data by the same feature extraction algorithm; judge the switching positions between adjacent video scenes according to the first color-feature differences between adjacent video scenes and the second color-feature differences between adjacent video frames in the complete video data; and segment the complete video data according to the switching positions.
Correspondingly, the present application also provides an electronic device comprising a processor and a memory, where the memory stores a program of the method for identifying video scenes in video data, and where, after the device is powered on and the processor runs the program, the following steps are executed:
segmenting complete video data to be detected according to the differences between the video scenes it contains, to obtain target video data, where the target video data comprises the video frames of at least one video scene; determining, by a preset image recognition model, the types of video scene that the target video data contains, and obtaining the confidences of the video frames that each type of video scene contains, where the image recognition model is a deep neural network model that classifies the video scenes contained in the target video data according to their feature information; normalizing the confidences of the video frames contained in each class of video scene, to obtain a weight value for each class of video scene within the target video data; and obtaining parameter information input by a user, obtaining a target video scene from the video scenes according to the parameter information and the magnitudes of the weight values, and returning the target video scene to a client.
Correspondingly, the present application also provides a storage device that stores a program of the method for identifying video scenes in video data, where the program, when run by a processor, executes the following steps:
segmenting complete video data to be detected according to the differences between the video scenes it contains, to obtain target video data, where the target video data comprises the video frames of at least one video scene; determining, by a preset image recognition model, the types of video scene that the target video data contains, and obtaining the confidences of the video frames that each type of video scene contains, where the image recognition model is a deep neural network model that classifies the video scenes contained in the target video data according to their feature information; normalizing the confidences of the video frames contained in each class of video scene, to obtain a weight value for each class of video scene within the target video data; and obtaining parameter information input by a user, obtaining a target video scene from the video scenes according to the parameter information and the magnitudes of the weight values, and returning the target video scene to a client.
With the method for identifying video scenes in video data of the present invention, the confidences of the video frames contained in each type of video scene can be obtained by a preset image recognition model and then normalized to obtain the weight value of each video scene within the target video data; by comparing the magnitudes of the weight values, the target video scene corresponding to the target video data is determined quickly, which improves the efficiency and accuracy with which target video scenes are obtained and thereby improves the user experience.
Detailed description of the invention
To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the accompanying drawings needed to describe them are briefly introduced below. Evidently, the drawings in the following description are merely exemplary; for those of ordinary skill in the art, other implementation drawings can be derived from them without creative effort.
The structures, proportions, and sizes depicted in this specification are provided only to accompany the content disclosed in the specification, for those skilled in the art to understand and read; they are not intended to limit the conditions under which the invention can be practiced and thus have no essential technical meaning. Any structural modification, change of proportion, or adjustment of size that does not affect the functions and purposes the invention can achieve shall still fall within the scope that the disclosed technical content can cover.
Fig. 1 is a flowchart of a method for identifying video scenes in video data according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of an apparatus for identifying video scenes in video data according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of an electronic device according to an embodiment of the present invention.
Specific embodiment
The embodiments of the present invention are illustrated below by particular specific examples, and those skilled in the art can readily understand other advantages and effects of the present invention from the content disclosed in this specification. Evidently, the described embodiments are only some, rather than all, of the embodiments of the present invention. Based on these embodiments, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
The method for identifying video scenes in video data of the present invention is described in detail below through embodiments. As shown in Fig. 1, a flowchart of a method for identifying video scenes in video data according to an embodiment of the present invention, the specific implementation process includes the following steps.
Step S101: segment the complete video data to be detected according to the differences between the video scenes it contains, to obtain target video data; the target video data comprises the video frames of at least one video scene.
In embodiments of the present invention, the complete video data may contain at least two video scenes. Accordingly, segmenting the complete video data according to the differences between the video scenes it contains can yield at least two video segments (that is, target video data), each of which may contain several video frames. Because current video data cutting is immature, the segments obtained after segmentation may be mixed with video frames that do not belong to the same scene; the target video data therefore contains the video frames of at least one video scene, and the misclassification rate is high when an existing neural network identifies such target video data directly, so further processing is needed.
It should be noted that segmenting the complete video data to be detected according to the differences between the video scenes it contains, to obtain the target video data, can specifically be accomplished as follows: obtain the complete video data to be detected; obtain the color features of the video scenes in the complete video data by a feature extraction algorithm; obtain the color features of the video frames in the complete video data by the same algorithm; judge the switching positions between adjacent video scenes according to the first color-feature differences between adjacent video scenes and the second color-feature differences between adjacent video frames; and segment the complete video data at the switching positions.
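The segmentation procedure above can be sketched in code. The following is a minimal illustration, not the patent's actual implementation: it assumes a gray-level histogram as the color feature and an L1 distance between histograms, treats a frame as a flat list of pixel values, and invents the function names and the cut threshold for the example.

```python
def color_histogram(pixels, bins=16, max_val=256):
    """Normalized gray-level histogram of a frame, the frame being a flat
    list of pixel values (a stand-in for the patent's color feature)."""
    width = max_val // bins
    hist = [0] * bins
    for p in pixels:
        hist[min(p // width, bins - 1)] += 1
    total = float(len(pixels))
    return [h / total for h in hist]

def feature_difference(h1, h2):
    """L1 distance between two normalized histograms; ranges from 0 to 2."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def find_scene_cuts(frames, threshold=0.5):
    """Indices where the color-feature difference between adjacent frames
    exceeds the threshold, i.e. candidate scene switching positions."""
    feats = [color_histogram(f) for f in frames]
    return [i for i in range(1, len(feats))
            if feature_difference(feats[i - 1], feats[i]) > threshold]

def split_at_cuts(frames, cuts):
    """Segment the complete frame list at the switching positions."""
    bounds = [0] + list(cuts) + [len(frames)]
    return [frames[a:b] for a, b in zip(bounds, bounds[1:])]
```

For instance, three dark frames followed by two bright frames are cut at index 3 into two target segments of lengths 3 and 2.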
Step S102: determine, by a preset image recognition model, the types of video scene that the target video data contains, and obtain the confidences of the video frames that each type of video scene contains; the image recognition model is a deep neural network model that classifies the video scenes contained in the target video data according to their feature information.
After the complete video data to be identified is segmented into target video data in step S101, the data preparation for analyzing the video scenes in this step is complete. In step S102, by analyzing the video scenes with the preset image recognition model, the types of video scene that the target video data contains can be determined, and the confidences of the video frames that each type of video scene contains can be obtained.
In an embodiment of the present invention, the confidence of a video frame refers to the probability that the video frame belongs to the corresponding video scene.
In actual implementation, because current video data cutting is immature, the target video data obtained by segmenting the complete video data according to the differences between its video scenes contains the video frames of at least one video scene. Therefore, recognizing and classifying the video scenes contained in the target video data with the preset image recognition model may yield at least one video scene class, together with the confidences of the video frames that each class contains.
It should be noted that the preset image recognition model of the present invention can refer to a neural-network-based video scene classifier. Such a classifier can be trained by collecting sample images of different types, extracting image features from them, and then training the classifier with the extracted image features and the sample image types. Common video scene classifiers include the support vector machine (SVM), the Bayes classifier, and the logistic regression classifier.
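The classification step can be illustrated as follows. This is a hedged sketch: `model` is a stand-in for the patent's preset image recognition model, whose internals the text does not specify, so it is taken here to be any callable that maps a frame to a (scene label, confidence) pair; `classify_segment` is an invented name.

```python
from collections import defaultdict

def classify_segment(frames, model):
    """Group per-frame confidences by the scene class the model assigns.

    `model` is any callable mapping a frame to a (scene_label, confidence)
    pair, where the confidence is the probability that the frame belongs to
    that scene. Returns {scene_label: [confidences of its frames]}.
    """
    per_scene = defaultdict(list)
    for frame in frames:
        label, confidence = model(frame)
        per_scene[label].append(confidence)
    return dict(per_scene)
```

A toy model that labels dark frames as scene A with confidence 0.6 and bright frames as scene B with confidence 0.5 shows the shape of the output: `classify_segment([10, 20, 150], toy_model)` gives `{"A": [0.6, 0.6], "B": [0.5]}`.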
Step S103: normalize the confidences of the video frames contained in each class of video scene, to obtain the weight value of each class of video scene within the target video data.
After the confidences of the video frames contained in each type of video scene are obtained in step S102, the data preparation for the normalization in this step is complete. In step S103, the confidences of the video frames contained in each class of video scene are normalized separately, yielding the weight value of each class of video scene within the target video data.
For example, suppose that segmenting the complete video data according to the differences between its video scenes yields target video data containing the video frames of three different video scenes A, B, and C. Recognizing and classifying the video scenes contained in the target video data with the preset image recognition model may then yield the three scene types A, B, and C. Suppose scene class A contains 5 video frames with confidences 0.6, 0.5, 0.4, 0.4, and 0.6; scene class B contains 3 video frames with confidences 0.6, 0.1, and 0.5; and scene class C contains 2 video frames with confidences 0.3 and 0.3.
Using a weighted-average algorithm, the confidences of the video frames contained in each class of video scene are normalized separately to obtain the weighted average of each video scene. Specifically, the confidences 0.6, 0.5, 0.4, 0.4, and 0.6 of the 5 frames of class A are summed and divided by the number of frames, 5, giving a weighted average of 0.5. Similarly, the weighted average of scene class B is 0.4 and that of scene class C is 0.3. These weighted averages are the weight values of the different classes of video scene within the target video data: the weight of the class A scene in the target video data is 0.5, that of the class B scene is 0.4, and that of the class C scene is 0.3.
In actual implementation, frames with too low a confidence may be the result of interference from other factors. Therefore, in a specific operation process, a confidence threshold can first be set, and only the frames whose confidence meets or exceeds it are used in the subsequent calculation. For example, with a confidence threshold of 0.5, the weighted average of each scene is computed from the qualifying frames only: scene class A considers only the frames with confidences 0.6, 0.5, and 0.6; scene class B considers only the frames with confidences 0.6 and 0.5; and no frame of scene class C exceeds the threshold. The weighted average of the class A scene is then (0.6+0.5+0.6)/3, that of the class B scene is (0.6+0.5)/2, and that of the class C scene is 0. To reduce the amount of calculation when many frames in a class meet the confidence threshold, the 3 to 6 top-scoring frames of each class can be selected for the computation.
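The weighted-average normalization above, including the optional confidence threshold and the top-scoring-frame cap, can be sketched as follows; the function name and parameter names are illustrative, not from the patent.

```python
def scene_weights(per_scene, conf_threshold=None, top_k=None):
    """Normalize per-frame confidences into per-scene weight values.

    `per_scene` maps a scene class to the confidences of its frames. If a
    confidence threshold is given, only frames at or above it are averaged
    (a class with no qualifying frame gets weight 0); `top_k` optionally
    keeps only the highest-scoring frames to cut down the calculation.
    """
    weights = {}
    for label, confs in per_scene.items():
        kept = list(confs)
        if conf_threshold is not None:
            kept = [c for c in kept if c >= conf_threshold]
        if top_k is not None:
            kept = sorted(kept, reverse=True)[:top_k]
        weights[label] = sum(kept) / len(kept) if kept else 0.0
    return weights
```

With the example confidences above, this yields weights of 0.5, 0.4, and 0.3 for classes A, B, and C; with a confidence threshold of 0.5 it yields (0.6+0.5+0.6)/3 for class A, 0.55 for class B, and 0 for class C.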
Step S104: obtain the parameter information input by a user, obtain a target video scene from the video scenes according to the parameter information and the magnitudes of the weight values, and return the target video scene to a client.
After the weight values of the different classes of video scene within the target video data are obtained in step S103, the data preparation for obtaining the target video scene in this step is complete. In step S104, the target video scene can be obtained from the video scenes according to the parameter information input by the user and the magnitudes of the weight values, and returned to the client.
In embodiments of the present invention, a target parameter value can be extracted from the parameter information input by the user. According to the target parameter value and the magnitudes of the weight values, a first video scene whose weight meets or exceeds a preset weight threshold can be obtained from the video scenes and taken as the target video scene, where the first video scene comprises at least one of the video scenes.
Specifically, suppose the target parameter value N input by the user is 2. First judge whether N = 2 is greater than the number of video scene types that the target video data contains, which is 3. It is not, and the weights of the video scenes within the target video data, from largest to smallest, are: class A scene > class B scene > class C scene. The preset weight threshold can then be set (for instance to 0.4), and the first video scenes whose weights meet or exceed it, namely the class A and class B scenes, are obtained from the video scenes and taken as the target video scenes of the identified video data. When the target parameter value N input by the user is 1, the preset weight threshold can be set to 0.5, and the first video scene whose weight meets or exceeds it (the class A scene) is obtained and taken as the target video scene of the identified video data.
In an embodiment of the present invention, it may further be judged whether the targeted parameter value is greater than the number of scene types contained in the target video data; if so, an alarm prompt is returned to the client. That is, when the targeted parameter value N input by the user is 5, N (5) is greater than the number of scene types (3) contained in the target video data, and an alarm prompt is returned to the client to remind the user.
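The selection-and-alarm step above can be sketched in Python as follows. The function name and the signaling of the alarm via an exception are illustrative choices, and the thresholds 0.4 and 0.5 follow the weight scale of the worked example (0.5/0.4/0.3), since the preset weight threshold is an implementation choice rather than a value fixed by the disclosure:

```python
def select_target_scenes(scene_weights, n, weight_threshold):
    """Pick the video scenes whose weight value reaches or exceeds the
    preset weight threshold.

    scene_weights: mapping of scene label -> weight in the target video data.
    n: targeted parameter value extracted from the user's parameter info.
    weight_threshold: preset weight threshold.
    Raises ValueError to stand in for the alarm prompt returned to the
    client when n exceeds the number of scene types in the video data.
    """
    if n > len(scene_weights):
        raise ValueError("alarm: n exceeds the number of scene types")
    # Rank scenes by weight, largest first, then keep those at/over threshold.
    ranked = sorted(scene_weights.items(), key=lambda kv: kv[1], reverse=True)
    return [label for label, w in ranked if w >= weight_threshold]

weights = {"A": 0.5, "B": 0.4, "C": 0.3}
print(select_target_scenes(weights, 2, 0.4))  # ['A', 'B']
print(select_target_scenes(weights, 1, 0.5))  # ['A']
```

With N = 5 the same call raises the alarm, mirroring the prompt returned to the client in the paragraph above.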
With the method for recognizing video scenes in video data of the present invention, the confidences of the video frames contained in each type of video scene can be obtained through a preset image recognition model and then normalized to obtain the weight of each video scene in the target video data. By comparing the magnitudes of the weight values, the target video scene corresponding to the target video data is determined quickly, which improves the efficiency and accuracy of collecting target video scenes and thereby the user experience.
Corresponding to the method for recognizing video scenes in video data provided above, the present invention also provides an apparatus for recognizing video scenes in video data. Since the embodiment of the apparatus is similar to the method embodiment above, its description is relatively brief; for relevant details, refer to the explanation of the method embodiment. The apparatus embodiment described below is merely illustrative. Please refer to Fig. 2, which is a schematic diagram of an apparatus for recognizing video scenes in video data provided by an embodiment of the present invention.
The apparatus for recognizing video scenes in video data of the present invention includes the following parts:
A segmentation processing unit 201, configured to segment the complete video data to be detected according to the differences between the video scenes it contains, obtaining target video data; the target video data includes video frames of at least one of the video scenes.
In an embodiment of the present invention, the complete video data may include at least two video scenes. Accordingly, segmenting the complete video data according to the differences between the video scenes it contains can yield at least two video data segments (namely target video data), each of which may include several video frames. Because current video data segmentation is imperfect, the segments obtained after segmentation may be contaminated with video frames from other video scenes; the target video data therefore contains video frames of at least one video scene, the misclassification rate is high when an existing neural network is used to recognize it directly, and further processing is required.
It should be noted that segmenting the complete video data to be detected according to the differences between the video scenes it contains, thereby obtaining the target video data, can specifically be accomplished as follows: obtain the complete video data to be detected; obtain the color features of the video scenes in the complete video data through a feature extraction algorithm; obtain the color features of the video frames in the complete video data through the same feature extraction algorithm; judge the switch positions between adjacent video scenes according to the first color-feature differences between adjacent video scenes in the complete video data and the second color-feature differences between adjacent video frames in the complete video data; and segment the complete video data at the switch positions.
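The segmentation steps above can be sketched as follows. The disclosure does not fix a particular feature extraction algorithm, so the color histogram, bin count, and difference threshold below are illustrative assumptions rather than the claimed method:

```python
import numpy as np

def find_cut_points(frames, diff_threshold=0.5):
    """Locate switch positions by comparing color features of adjacent frames.

    frames: sequence of H x W x 3 uint8 arrays.
    A switch is declared where the normalized color-histogram difference
    between two adjacent frames exceeds diff_threshold (an assumed value).
    """
    def color_hist(frame):
        # 8x8x8 RGB histogram, normalized to sum to 1 (the "color feature").
        hist, _ = np.histogramdd(
            frame.reshape(-1, 3), bins=(8, 8, 8), range=((0, 256),) * 3)
        return hist.ravel() / hist.sum()

    hists = [color_hist(f) for f in frames]
    cuts = []
    for i in range(1, len(hists)):
        # Half the L1 distance lies in [0, 1]; large values indicate a switch.
        if np.abs(hists[i] - hists[i - 1]).sum() / 2 > diff_threshold:
            cuts.append(i)
    return cuts

def split_video(frames, cuts):
    """Segment the complete video data at the detected switch positions."""
    bounds = [0] + cuts + [len(frames)]
    return [frames[a:b] for a, b in zip(bounds, bounds[1:])]
```

For example, three red frames followed by three blue frames yield a single cut at index 3 and two segments of three frames each.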
A video scene recognition unit 202, configured to determine, through a preset image recognition model, the types of video scene contained in the target video data, and to obtain the confidences of the video frames contained in each type of video scene; the image recognition model is a deep neural network model that classifies the video scenes contained in the target video data according to the feature information of those video scenes.
In an embodiment of the present invention, the confidence of a video frame refers to the probability that the video frame is a video frame corresponding to the video scene.
In actual implementation, because current video data segmentation is imperfect, the target video data obtained by segmenting the complete video data to be detected according to the differences between its video scenes contains video frames of at least one video scene. Therefore, by classifying the video scenes contained in the target video data with the preset image recognition model, at least one video scene class may be obtained, together with the confidences of the video frames contained in each class.
It should be noted that the preset image recognition model of the present invention can be a neural-network-based video scene classifier. The classifier can be trained by collecting sample images of different types, extracting image features from the samples of each type, and then training the classifier with the extracted features and the sample image types. Common video scene classifiers include support vector machines (Support Vector Machine, SVM), Bayesian classifiers, logistic regression classifiers, and the like.
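The training recipe above (collect labeled samples, extract image features, fit a classifier) can be sketched as follows. To keep the example dependency-free, a nearest-centroid classifier over a toy mean-color feature stands in for the SVM, Bayesian, or logistic-regression classifiers named in the text, and all sample data and scene labels are synthetic assumptions:

```python
import numpy as np

def extract_features(images):
    """'Extract image features': here simply the mean RGB color per image."""
    return np.array([img.reshape(-1, 3).mean(axis=0) for img in images])

class NearestCentroidScene:
    """Toy stand-in for a trained video scene classifier."""
    def fit(self, X, y):
        self.labels = sorted(set(y))
        self.centroids = np.array(
            [X[[i for i, t in enumerate(y) if t == c]].mean(axis=0)
             for c in self.labels])
        return self

    def predict(self, X):
        # Assign each feature vector to the nearest class centroid.
        d = np.linalg.norm(X[:, None, :] - self.centroids[None, :, :], axis=2)
        return [self.labels[i] for i in d.argmin(axis=1)]

rng = np.random.default_rng(0)
# Synthetic sample images for two assumed scene types.
dark = rng.integers(0, 100, size=(20, 8, 8, 3), dtype=np.uint8)     # "indoor"
light = rng.integers(150, 256, size=(20, 8, 8, 3), dtype=np.uint8)  # "outdoor"
X = extract_features(np.concatenate([dark, light]))
y = ["indoor"] * 20 + ["outdoor"] * 20
clf = NearestCentroidScene().fit(X, y)
```

A production system would substitute richer features and one of the named classifiers (or the deep neural network model of the embodiments) while keeping the same fit/predict structure.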
A video scene weight analysis unit 203, configured to normalize the confidences of the video frames contained in each class of video scene respectively, obtaining the weight of each class of video scene in the target video data.
For example, in a specific implementation, the complete video data is segmented according to the differences between its video scenes, and the resulting target video data is assumed to include video frames of three different video scenes A, B, and C. Classifying the video scenes in the target video data with the preset image recognition model may then yield three scene types A, B, and C. Assume the class-A video scene includes 5 video frames with confidences 0.6, 0.5, 0.4, 0.4, 0.6; the class-B video scene includes 3 video frames with confidences 0.6, 0.5, 0.1; and the class-C video scene includes 2 video frames with confidences 0.3 and 0.3. The confidences of the video frames in each class are normalized with a weighted-average algorithm to obtain the weighted average of each video scene. Specifically, the confidences of the 5 class-A frames (0.6, 0.5, 0.4, 0.4, 0.6) are summed and divided by the frame count 5, giving a weighted average of 0.5. Similarly, the weighted average of the class-B video scene is 0.4 and that of the class-C video scene is 0.3. These weighted averages are the weights of the respective video scene classes in the target video data of the present invention: the weight of the class-A video scene in the target video data is 0.5, that of the class-B video scene is 0.4, and that of the class-C video scene is 0.3.
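The weighted-average normalization above can be sketched in a few lines. The function name is illustrative, and the per-frame confidences below are chosen to reproduce the scene weights 0.5, 0.4, and 0.3 used in the worked example:

```python
def scene_weights(scene_confidences):
    """Normalize per-frame confidences into one weight per scene class:
    sum the confidences of each class's frames and divide by the frame count."""
    return {scene: round(sum(confs) / len(confs), 4)
            for scene, confs in scene_confidences.items()}

confidences = {
    "A": [0.6, 0.5, 0.4, 0.4, 0.6],  # 5 frames
    "B": [0.6, 0.5, 0.1],            # 3 frames
    "C": [0.3, 0.3],                 # 2 frames
}
print(scene_weights(confidences))  # {'A': 0.5, 'B': 0.4, 'C': 0.3}
```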
In actual implementation, video frames with too low a confidence may be the result of interference from other factors. Therefore, in a specific operation a confidence threshold can be set first, and only video frames meeting or exceeding the confidence threshold are used in the subsequent calculation. For example, with the confidence threshold set to 0.5, the weighted average of each video scene is computed from the frames meeting or exceeding the threshold: the class-A video scene only considers the frames with confidences 0.6, 0.5, and 0.6; the class-B video scene only considers the frames with confidences 0.6 and 0.5; and no frame in the class-C video scene exceeds the confidence threshold. The weighted average of the class-A video scene is then (0.6+0.5+0.6)/3, that of the class-B video scene is (0.6+0.5)/2, and that of the class-C video scene is 0. To reduce computation, when many frames in a class meet or exceed the confidence threshold, the 3-6 top-scoring frames of each class can be selected for the calculation.
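The confidence-threshold filtering and top-scoring-frame cap described above can be sketched as follows; the function name and defaults are illustrative, with the threshold 0.5 and cap of 6 taken from the example in the text:

```python
def filter_confidences(scene_confidences, threshold=0.5, top_k=6):
    """Keep only frames at or above the confidence threshold, cap each class
    at its top_k best-scoring frames, then average what survives
    (weight 0 when no frame in a class passes the threshold)."""
    weights = {}
    for scene, confs in scene_confidences.items():
        kept = sorted((c for c in confs if c >= threshold), reverse=True)[:top_k]
        weights[scene] = round(sum(kept) / len(kept), 4) if kept else 0.0
    return weights

confidences = {"A": [0.6, 0.5, 0.4, 0.4, 0.6],
               "B": [0.6, 0.5, 0.1],
               "C": [0.3, 0.3]}
print(filter_confidences(confidences))
# {'A': 0.5667, 'B': 0.55, 'C': 0.0}
```

Class A averages its three surviving frames, class B its two, and class C falls to 0 because no frame clears the threshold, matching the (0.6+0.5+0.6)/3, (0.6+0.5)/2, and 0 results above.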
A target video data obtaining unit 204, configured to obtain the parameter information input by the user, obtain the target video scene from the video scenes according to the parameter information and the magnitudes of the weight values, and return the target video scene to the client.
In an embodiment of the present invention, a targeted parameter value can be extracted from the parameter information input by the user. According to the targeted parameter value and the magnitudes of the weight values, a first video scene whose weight value reaches or exceeds a preset weight threshold can be obtained from the video scenes, and the first video scene is taken as the target video scene. The first video scene includes at least one of the video scenes.
Specifically, assume the targeted parameter value N input by the user is 2. It is first judged whether the targeted parameter value N (2) is greater than the number of scene types contained in the target video data (3). If not, the video scenes are ranked by their weight values in the target video data from largest to smallest: class-A video scene > class-B video scene > class-C video scene. The preset weight threshold can then be set to 0.4, and the first video scenes whose weight values reach or exceed 0.4 (namely the class-A and class-B video scenes) are obtained from the video scenes and taken as the target video scenes identifying the video data. When the targeted parameter value N input by the user is 1, the preset weight threshold can be set to 0.5, and the first video scene whose weight value reaches or exceeds 0.5 (namely the class-A video scene) is obtained and taken as the target video scene identifying the video data.
In an embodiment of the present invention, it may further be judged whether the targeted parameter value is greater than the number of scene types contained in the target video data; if so, an alarm prompt is returned to the client. That is, when the targeted parameter value N input by the user is 5, N (5) is greater than the number of scene types (3) contained in the target video data, and an alarm prompt is returned to the client to remind the user.
With the apparatus for recognizing video scenes in video data of the present invention, the confidences of the video frames contained in each type of video scene can be obtained through a preset image recognition model and then normalized to obtain the weight of each video scene in the target video data. By comparing the magnitudes of the weight values, the target video scene corresponding to the target video data is determined quickly, which improves the efficiency and accuracy of collecting target video scenes and thereby the user experience.
Corresponding to the method for recognizing video scenes in video data provided above, the present invention also provides an electronic device. Since the embodiment of the electronic device is similar to the method embodiment above, its description is relatively brief; for relevant details, refer to the explanation of the method embodiment. The electronic device described below is merely illustrative. Please refer to Fig. 3, which is a schematic diagram of an electronic device provided by an embodiment of the present invention.
The electronic device provided by the present invention specifically includes a processor 301 and a memory 302. The memory 302 is configured to store a program for the method of recognizing video scenes in video data; after the device is powered on and runs the program for the method of recognizing video scenes in video data through the processor 301, the following steps are executed:
segment the complete video data to be detected according to the differences between the video scenes it contains, obtaining target video data, where the target video data includes video frames of at least one of the video scenes; determine, through a preset image recognition model, the types of video scene contained in the target video data, and obtain the confidences of the video frames contained in each type of video scene, where the image recognition model is a deep neural network model that classifies the video scenes contained in the target video data according to the feature information of those video scenes; normalize the confidences of the video frames contained in each class of video scene respectively, obtaining the weight of each class of video scene in the target video data; obtain the parameter information input by the user, obtain the target video scene from the video scenes according to the parameter information and the magnitudes of the weight values, and return the target video scene to the client.
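The four steps executed by the processor can be tied together in one end-to-end sketch. The `segment` and `classify` callables are assumed stand-ins for the segmentation step and the preset image recognition model; the function name and the alarm representation are illustrative:

```python
def recognize_video_scenes(frames, segment, classify, n, weight_threshold):
    """End-to-end sketch: segment the complete video data, classify every
    frame into (scene label, confidence), normalize confidences into
    per-scene weights, then select target scenes by weight threshold.

    segment: callable mapping a frame list to a list of segments.
    classify: callable mapping one frame to (scene_label, confidence).
    """
    per_scene = {}
    for segment_frames in segment(frames):
        for frame in segment_frames:
            label, confidence = classify(frame)
            per_scene.setdefault(label, []).append(confidence)
    # Weighted-average normalization: one weight per scene class.
    weights = {s: sum(c) / len(c) for s, c in per_scene.items()}
    if n > len(weights):
        # Stand-in for the alarm prompt returned to the client.
        return {"alarm": "n exceeds the number of scene types"}
    ranked = sorted(weights, key=weights.get, reverse=True)
    return [s for s in ranked if weights[s] >= weight_threshold]
```

With dummy callables (two segments, frames 0-1 classified as scene A at 0.6 and frames 2-3 as scene B at 0.3), N = 1 and threshold 0.5 return `['A']`, while N = 3 triggers the alarm.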
Correspondingly, the present invention also provides a storage device storing a program for the method of recognizing video scenes in video data; when the program is run by a processor, the following steps are executed:
segment the complete video data to be detected according to the differences between the video scenes it contains, obtaining target video data, where the target video data includes video frames of at least one of the video scenes; determine, through a preset image recognition model, the types of video scene contained in the target video data, and obtain the confidences of the video frames contained in each type of video scene, where the image recognition model is a deep neural network model that classifies the video scenes contained in the target video data according to the feature information of those video scenes; normalize the confidences of the video frames contained in each class of video scene respectively, obtaining the weight of each class of video scene in the target video data; obtain the parameter information input by the user, obtain the target video scene from the video scenes according to the parameter information and the magnitudes of the weight values, and return the target video scene to the client.
Although the present invention has been described in detail above through general descriptions and specific embodiments, modifications or improvements can be made on the basis of the present invention, as will be apparent to those skilled in the art. Accordingly, such modifications or improvements made without departing from the spirit of the present invention all fall within the scope of protection claimed by the present invention.
Claims (10)
1. A method for recognizing video scenes in video data, characterized by comprising:
segmenting complete video data to be detected according to differences between the video scenes it contains, obtaining target video data, wherein the target video data comprises video frames of at least one of the video scenes;
determining, through a preset image recognition model, the types of video scene contained in the target video data, and obtaining confidences of the video frames contained in each type of video scene, wherein the image recognition model is a deep neural network model that classifies the video scenes contained in the target video data according to feature information of those video scenes;
normalizing the confidences of the video frames contained in each class of video scene respectively, obtaining a weight value of each class of video scene in the target video data;
obtaining parameter information input by a user, obtaining a target video scene from the video scenes according to the parameter information and magnitudes of the weight values, and returning the target video scene to a client.
2. The method for recognizing video scenes in video data according to claim 1, characterized in that segmenting the complete video data to be detected according to the differences between the video scenes it contains to obtain the target video data specifically comprises:
obtaining the complete video data to be detected;
obtaining color features of the video scenes in the complete video data through a feature extraction algorithm;
obtaining color features of the video frames in the complete video data through the feature extraction algorithm;
judging switch positions between adjacent video scenes according to first color-feature differences between adjacent video scenes in the complete video data and second color-feature differences between adjacent video frames in the complete video data;
segmenting the complete video data according to the switch positions.
3. The method for recognizing video scenes in video data according to claim 1, characterized in that obtaining the target video scene from the video scenes according to the parameter information and the magnitudes of the weight values specifically comprises:
extracting a targeted parameter value from the parameter information;
obtaining, according to the targeted parameter value and the magnitudes of the weight values, a first video scene whose weight value reaches or exceeds a preset weight threshold from the video scenes, and taking the first video scene as the target video scene, wherein the first video scene comprises at least one of the video scenes.
4. The method for recognizing video scenes in video data according to claim 1, characterized by further comprising:
extracting a targeted parameter value from the parameter information;
judging whether the targeted parameter value is greater than the number of scene types contained in the target video data, and if so, returning an alarm prompt to the client.
5. The method for recognizing video scenes in video data according to claim 1, characterized in that the confidence of a video frame refers to the probability that the video frame is a video frame corresponding to the video scene.
6. An apparatus for recognizing video scenes in video data, characterized by comprising:
a segmentation processing unit, configured to segment complete video data to be detected according to differences between the video scenes it contains, obtaining target video data, wherein the target video data comprises video frames of at least one of the video scenes;
a video scene recognition unit, configured to determine, through a preset image recognition model, the types of video scene contained in the target video data, and to obtain confidences of the video frames contained in each type of video scene, wherein the image recognition model is a deep neural network model that classifies the video scenes contained in the target video data according to feature information of those video scenes;
a video scene weight analysis unit, configured to normalize the confidences of the video frames contained in each class of video scene respectively, obtaining a weight value of each class of video scene in the target video data;
a target video data obtaining unit, configured to obtain parameter information input by a user, obtain a target video scene from the video scenes according to the parameter information and magnitudes of the weight values, and return the target video scene to a client.
7. The apparatus for recognizing video scenes in video data according to claim 6, characterized in that the segmentation processing unit is specifically configured to:
obtain the complete video data to be detected;
obtain color features of the video scenes in the complete video data through a feature extraction algorithm;
obtain color features of the video frames in the complete video data through the feature extraction algorithm;
judge switch positions between adjacent video scenes according to first color-feature differences between adjacent video scenes in the complete video data and second color-feature differences between adjacent video frames in the complete video data;
segment the complete video data according to the switch positions.
8. The apparatus for recognizing video scenes in video data according to claim 6, characterized in that the target video data obtaining unit is specifically configured to:
extract a targeted parameter value from the parameter information;
obtain, according to the targeted parameter value and the magnitudes of the weight values, a first video scene whose weight value reaches or exceeds a preset weight threshold from the video scenes, and take the first video scene as the target video scene, wherein the first video scene comprises at least one of the video scenes.
9. An electronic device, characterized by comprising:
a processor; and
a memory, configured to store a program for the method of recognizing video scenes in video data, wherein after the device is powered on and runs the program for the method of recognizing video scenes in video data through the processor, the following steps are executed:
segmenting complete video data to be detected according to differences between the video scenes it contains, obtaining target video data, wherein the target video data comprises video frames of at least one of the video scenes;
determining, through a preset image recognition model, the types of video scene contained in the target video data, and obtaining confidences of the video frames contained in each type of video scene, wherein the image recognition model is a deep neural network model that classifies the video scenes contained in the target video data according to feature information of those video scenes;
normalizing the confidences of the video frames contained in each class of video scene respectively, obtaining a weight value of each class of video scene in the target video data;
obtaining parameter information input by a user, obtaining a target video scene from the video scenes according to the parameter information and magnitudes of the weight values, and returning the target video scene to a client.
10. A storage device, characterized in that it stores a program for the method of recognizing video scenes in video data, wherein when the program is run by a processor, the following steps are executed:
segmenting complete video data to be detected according to differences between the video scenes it contains, obtaining target video data, wherein the target video data comprises video frames of at least one of the video scenes;
determining, through a preset image recognition model, the types of video scene contained in the target video data, and obtaining confidences of the video frames contained in each type of video scene, wherein the image recognition model is a deep neural network model that classifies the video scenes contained in the target video data according to feature information of those video scenes;
normalizing the confidences of the video frames contained in each class of video scene respectively, obtaining a weight value of each class of video scene in the target video data;
obtaining parameter information input by a user, obtaining a target video scene from the video scenes according to the parameter information and magnitudes of the weight values, and returning the target video scene to a client.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910522913.2A CN110149531A (en) | 2019-06-17 | 2019-06-17 | The method and apparatus of video scene in a kind of identification video data |
PCT/CN2019/108434 WO2020252975A1 (en) | 2019-06-17 | 2019-09-27 | Method and apparatus for recognizing video scene in video data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110149531A true CN110149531A (en) | 2019-08-20 |
Family
ID=67591546
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910522913.2A Pending CN110149531A (en) | 2019-06-17 | 2019-06-17 | The method and apparatus of video scene in a kind of identification video data |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110149531A (en) |
WO (1) | WO2020252975A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9754351B2 (en) * | 2015-11-05 | 2017-09-05 | Facebook, Inc. | Systems and methods for processing content using convolutional neural networks |
CN108053420A (en) * | 2018-01-05 | 2018-05-18 | 昆明理工大学 | A kind of dividing method based on the unrelated attribute dynamic scene of limited spatial and temporal resolution class |
CN108537134A (en) * | 2018-03-16 | 2018-09-14 | 北京交通大学 | A kind of video semanteme scene cut and mask method |
CN109145840A (en) * | 2018-08-29 | 2019-01-04 | 北京字节跳动网络技术有限公司 | video scene classification method, device, equipment and storage medium |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8316301B2 (en) * | 2005-08-04 | 2012-11-20 | Samsung Electronics Co., Ltd. | Apparatus, medium, and method segmenting video sequences based on topic |
CN102207966B (en) * | 2011-06-01 | 2013-07-10 | 华南理工大学 | Video content quick retrieving method based on object tag |
CN109213895A (en) * | 2017-07-05 | 2019-01-15 | 合网络技术(北京)有限公司 | A kind of generation method and device of video frequency abstract |
CN108848422B (en) * | 2018-04-19 | 2020-06-02 | 清华大学 | Video abstract generation method based on target detection |
CN110149531A (en) * | 2019-06-17 | 2019-08-20 | 北京影谱科技股份有限公司 | The method and apparatus of video scene in a kind of identification video data |
- 2019-06-17: CN application CN201910522913.2A filed (status: active, pending)
- 2019-09-27: PCT application PCT/CN2019/108434 filed (status: active, application filing)
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020252975A1 (en) * | 2019-06-17 | 2020-12-24 | 北京影谱科技股份有限公司 | Method and apparatus for recognizing video scene in video data |
CN110933462A (en) * | 2019-10-14 | 2020-03-27 | 咪咕文化科技有限公司 | Video processing method, system, electronic device and storage medium |
CN110933462B (en) * | 2019-10-14 | 2022-03-25 | 咪咕文化科技有限公司 | Video processing method, system, electronic device and storage medium |
CN113177603A (en) * | 2021-05-12 | 2021-07-27 | 中移智行网络科技有限公司 | Training method of classification model, video classification method and related equipment |
CN113177603B (en) * | 2021-05-12 | 2022-05-06 | 中移智行网络科技有限公司 | Training method of classification model, video classification method and related equipment |
CN115334351A (en) * | 2022-08-02 | 2022-11-11 | Vidaa国际控股(荷兰)公司 | Display device and adaptive image quality adjusting method |
CN115334351B (en) * | 2022-08-02 | 2023-10-31 | Vidaa国际控股(荷兰)公司 | Display equipment and self-adaptive image quality adjusting method |
Also Published As
Publication number | Publication date |
---|---|
WO2020252975A1 (en) | 2020-12-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110149531A (en) | The method and apparatus of video scene in a kind of identification video data | |
Tudor Ionescu et al. | How hard can it be? Estimating the difficulty of visual search in an image | |
CN110175549B (en) | Face image processing method, device, equipment and storage medium | |
CN110378235B (en) | Fuzzy face image recognition method and device and terminal equipment | |
CN109697416B (en) | Video data processing method and related device | |
CN110633745B (en) | Image classification training method and device based on artificial intelligence and storage medium | |
US11176418B2 (en) | Model test methods and apparatuses | |
CN104143079B (en) | The method and system of face character identification | |
KR101725651B1 (en) | Identification apparatus and method for controlling identification apparatus | |
US9613296B1 (en) | Selecting a set of exemplar images for use in an automated image object recognition system | |
CN104778481A (en) | Method and device for creating sample library for large-scale face mode analysis | |
CN104992148A (en) | ATM terminal human face key points partially shielding detection method based on random forest | |
CN112883902B (en) | Video detection method and device, electronic equipment and storage medium | |
CN113111690B (en) | Facial expression analysis method and system and satisfaction analysis method and system | |
Gunasekar et al. | Face detection on distorted images augmented by perceptual quality-aware features | |
CN107180056A (en) | The matching process and device of fragment in video | |
CN109857864A (en) | Text sentiment classification method, device, computer equipment and storage medium | |
CN112182269B (en) | Training of image classification model, image classification method, device, equipment and medium | |
CN113723157B (en) | Crop disease identification method and device, electronic equipment and storage medium | |
CN103177266A (en) | Intelligent stock pest identification system | |
CN111401343B (en) | Method for identifying attributes of people in image and training method and device for identification model | |
WO2015146113A1 (en) | Identification dictionary learning system, identification dictionary learning method, and recording medium | |
CN110222582A (en) | A kind of image processing method and camera | |
CN112926429A (en) | Machine audit model training method, video machine audit method, device, equipment and storage medium | |
CN110874835B (en) | Crop leaf disease resistance identification method and system, electronic equipment and storage medium |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190820 |