CN109495766A - Method, apparatus, device, and storage medium for video audit - Google Patents
Method, apparatus, device, and storage medium for video audit
- Publication number
- CN109495766A (application number CN201811438719.8A)
- Authority
- CN
- China
- Prior art keywords
- video
- audit
- classification
- content information
- pending video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/23418—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
- G06V20/635—Overlay text, e.g. embedded captions in a TV program
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/233—Processing of audio elementary streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/235—Processing of additional data, e.g. scrambling of additional data or processing content descriptors
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/435—Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/439—Processing of audio elementary streams
- H04N21/4394—Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Abstract
The invention discloses a method, apparatus, device, and storage medium for video audit. The method comprises: extracting different types of content information from a video to be audited; fusing the feature vectors of the different types of content information according to the correlations between them, to obtain a feature vector of the video to be audited; obtaining, from that feature vector, the video's weight under each of several different preset audit categories; and determining the audit category of the video to be audited according to the weights under the different preset audit categories. The technical solution provided by embodiments of the invention realizes video audit under multi-type fusion, addresses the limitations of video audit in the prior art, and improves the comprehensiveness and accuracy of video audit.
Description
Technical field
Embodiments of the present invention relate to the field of Internet technology, and in particular to a method, apparatus, device, and storage medium for video audit.
Background art
With the rapid development of Internet technology, Internet video traffic has grown significantly in recent years. Novel forms of user-generated content such as short videos and live streams have made the video propagated on the Internet ever richer. At the same time, large numbers of violating videos, involving topics such as terror, violence, pornography, or political sensitivity, are uploaded to the Internet by malicious users and spread rapidly. How to filter out these violating videos efficiently and at low cost is therefore a difficulty commonly faced by Internet video products such as short-video and live-streaming applications.

Because the volume of video propagated on the Internet keeps growing, auditing every video uploaded to the Internet manually for violating content necessarily consumes substantial human cost and is inefficient. At present, machine learning is commonly used to audit video content automatically: information under a single modality, such as the pictures, text, or sound in the video content, is analyzed in isolation to judge whether the current video contains violating content, and the Internet video is audited accordingly.

Since the prior art mostly analyzes only information of a single type, such as the pictures, text, or sound in a video, to determine whether the content of the video is in violation, the audit of Internet video is subject to limitations, which reduces the accuracy of video audit.
Summary of the invention
Embodiments of the invention provide a method, apparatus, device, and storage medium for video audit, so as to solve the limitations of video audit in the prior art, realize video audit under multi-type fusion, and improve the comprehensiveness and accuracy of video audit.
In a first aspect, an embodiment of the invention provides a method of video audit, the method comprising:

extracting different types of content information from a video to be audited;

fusing the feature vectors of the different types of content information according to the correlations between them, to obtain a feature vector of the video to be audited;

obtaining, from the feature vector of the video to be audited, the video's weight under each of several different preset audit categories;

determining the audit category of the video to be audited according to the weights under the different preset audit categories.
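The four steps above can be sketched end to end. The feature extractors, the fusion rule (plain concatenation here), the linear scorer, and the category names below are all illustrative placeholders, not the learned sub-models the patent actually claims.

```python
from typing import Dict, List

def audit_video(type_features: Dict[str, List[float]],
                categories: List[str]) -> str:
    """Toy end-to-end sketch of the claimed pipeline.

    type_features maps a content type ("picture", "audio", "text")
    to an already-extracted feature vector; real embodiments would
    use learned per-type sub-models instead.
    """
    # Step 2: fuse the per-type vectors (here: simple concatenation).
    fused: List[float] = []
    for name in sorted(type_features):
        fused.extend(type_features[name])
    # Step 3: score each preset audit category (here: a fixed toy
    # linear scorer followed by normalisation into weights).
    scores = [sum(fused) * (i + 1) for i in range(len(categories))]
    total = sum(scores) or 1.0
    weights = [s / total for s in scores]
    # Step 4: the audit category is the one with the largest weight.
    return categories[weights.index(max(weights))]

label = audit_video({"picture": [0.2, 0.4], "audio": [0.1], "text": [0.3]},
                    ["normal", "violation"])
```

In an actual embodiment the scorer would be the preset regression function of the fusion sub-model, trained as described below.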
Further, fusing the feature vectors of the different types of content information according to the correlations between them, to obtain the feature vector of the video to be audited, comprises:

inputting each type of content information into a pre-constructed fusion learning model, and extracting the feature vector of each type of content information through the corresponding per-type learning sub-model in the fusion learning model;

fusing the feature vectors of the different types of content information through the fusion sub-model in the fusion learning model, according to the correlations between the different types of content information, to obtain the feature vector of the video to be audited.
Further, obtaining, from the feature vector of the video to be audited, the video's weight under each of the different preset audit categories comprises:

obtaining, from the feature vector of the video to be audited, the weight of the video under each of the different preset audit categories through a preset regression function in the fusion sub-model.
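The patent does not fix the form of the preset regression function; a common choice for mapping per-category scores to normalised weights is the softmax, sketched below with made-up scores.

```python
import math

def softmax(scores):
    """Map per-category scores to weights that sum to 1 (one common
    form of a 'preset regression function'; the patent does not
    mandate this particular one)."""
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# e.g. raw scores for (violation, suspect, normal)
weights = softmax([2.0, 0.5, 0.1])
```

The resulting weights are directly comparable across categories, which is what the threshold test against the violation category (below) relies on.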
Further, the fusion learning model is constructed by performing the following operations:

extracting different types of sample content information from a training sample, the training sample being a historical video under a target audit category;

extracting the feature vector of each type of sample content information through the per-type learning sub-models, and fusing the feature vectors of the sample content information through the fusion sub-model according to the correlations between them, to obtain the feature vector of the training sample;

obtaining, from the feature vector of the training sample, the training sample's weight under each of the different preset audit categories through the preset regression function in the fusion sub-model;

determining a classification loss from the target audit category of the training sample and its weights under the different preset audit categories, back-propagating the classification loss to revise each learning sub-model and the fusion sub-model, and continuing to acquire new training samples under the target audit category until the classification loss under the target audit category falls below a preset loss threshold;

reacquiring training samples under the other audit categories and training again, until the classification loss under every preset audit category is below its corresponding preset loss threshold, and then constructing the fusion learning model from the resulting learning sub-models and fusion sub-model.
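The training schedule above can be sketched as a loop over categories with a per-category loss threshold. The forward pass and back-propagation are left as caller-supplied stubs (`compute_loss`, `update_model`), since the patent leaves them to standard methods; all names here are illustrative.

```python
def train_fusion_model(categories, sample_stream, compute_loss,
                       update_model, loss_threshold=0.05,
                       max_steps=1000):
    """Sketch of the claimed training schedule: train on one target
    audit category until its classification loss falls below the
    preset threshold, then move on to the next category.

    sample_stream(cat) yields training samples (historical videos)
    for a category; compute_loss and update_model stand in for the
    forward pass and the back-propagation step.
    """
    for cat in categories:
        for step, sample in enumerate(sample_stream(cat)):
            loss = compute_loss(cat, sample)
            if loss < loss_threshold:
                break                         # this category has converged
            update_model(cat, sample, loss)   # back-propagate the loss
            if step >= max_steps:
                raise RuntimeError(f"category {cat!r} did not converge")
```

One design point worth noting: because every category must independently drop below its threshold, the loop naturally revisits the shared sub-models for each audit category in turn, as the patent describes.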
Further, the different types of content information comprise a picture sequence, an audio sequence, and a text sequence in the video to be audited.
Further, extracting the different types of content information from the video to be audited comprises:

segmenting the video to be audited, and extracting video frames from each segment;

combining the extracted video frames to obtain the corresponding picture sequence.
Further, extracting the different types of content information from the video to be audited comprises:

segmenting the video to be audited, and resampling the audio information in each segment;

extracting the spectral features of the resampled audio information through a Mel-frequency cepstral (MFC) algorithm;

combining the extracted spectral features to obtain the corresponding audio sequence.
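As a much-reduced stand-in for the Mel-cepstral step, the sketch below frames the resampled signal and takes a log-magnitude DFT per frame; a real embodiment would add a Mel filter bank and a DCT on top (a library such as librosa provides full MFCCs). The frame length and bin count are arbitrary choices for illustration.

```python
import cmath, math

def frame_spectrum(samples, frame_len=64):
    """Split a (resampled) audio signal into fixed-length frames and
    compute a small log-magnitude spectrum per frame. This is only
    the 'spectral feature' skeleton, not a full MFC computation."""
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # naive DFT magnitudes for the first few frequency bins
        mags = []
        for k in range(8):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, x in enumerate(frame))
            mags.append(math.log(1e-9 + abs(acc)))
        features.append(mags)
    return features  # one 8-dimensional spectral vector per frame

# a pure tone at 4 cycles per frame should light up DFT bin 4
tone = [math.sin(2 * math.pi * 4 * n / 64) for n in range(128)]
feats = frame_spectrum(tone)
```

Concatenating the per-frame vectors in segment time order yields the "audio sequence" the claim refers to.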
Further, extracting the different types of content information from the video to be audited comprises:

obtaining the text information in the video to be audited through an optical character recognition (OCR) algorithm, to obtain the corresponding text sequence.
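The OCR engine itself is outside the patent's scope, so the sketch below takes it as a caller-supplied function and only shows how per-frame OCR results might be combined into a text sequence; collapsing identical consecutive captions (common with burned-in subtitles) is an assumption of this sketch, not a claim of the patent.

```python
from typing import Callable, List

def build_text_sequence(frames: List[bytes],
                        ocr: Callable[[bytes], str]) -> List[str]:
    """Combine per-frame OCR output into a text sequence. `ocr` is a
    stand-in for a real OCR engine; empty results are dropped and
    identical consecutive captions are collapsed."""
    sequence: List[str] = []
    for frame in frames:
        text = ocr(frame).strip()
        if text and (not sequence or sequence[-1] != text):
            sequence.append(text)
    return sequence

# toy OCR stub: pretend each frame decodes to a fixed caption
captions = {b"f1": "hello", b"f2": "hello", b"f3": "world", b"f4": ""}
seq = build_text_sequence([b"f1", b"f2", b"f3", b"f4"], captions.get)
```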
Further, determining the audit category of the video to be audited according to the weights under the different preset audit categories comprises:

if the weight of the video to be audited under a violation category exceeds a preset violation threshold, sending the video to be audited to a manual review platform;

determining the audit category of the video to be audited according to feedback information from the manual review platform.
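The escalation rule above can be written as a small decision function. The two-category ordering (normal, violation), the threshold value, and the callback interface to the manual review platform are all assumptions made for illustration.

```python
def route_video(weights, violation_threshold=0.8, manual_review=None):
    """If the weight under the violation category exceeds the preset
    threshold, escalate to the manual review platform, whose feedback
    fixes the final audit category; otherwise keep the machine label.
    `weights` is assumed to be (normal_weight, violation_weight)."""
    normal_w, violation_w = weights
    if violation_w > violation_threshold and manual_review is not None:
        return manual_review(weights)      # human feedback decides
    return "violation" if violation_w > normal_w else "normal"

label = route_video((0.1, 0.9), manual_review=lambda w: "confirmed")
```

The threshold trades off reviewer workload against the risk of a violating video slipping through on the machine label alone.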
In a second aspect, an embodiment of the invention provides an apparatus for video audit, the apparatus comprising:

an information extraction module, configured to extract different types of content information from a video to be audited;

a feature fusion module, configured to fuse the feature vectors of the different types of content information according to the correlations between them, to obtain the feature vector of the video to be audited;

a weight determination module, configured to obtain, from the feature vector of the video to be audited, the video's weight under each of the different preset audit categories;

an audit category determination module, configured to determine the audit category of the video to be audited according to the weights under the different preset audit categories.
Further, the feature fusion module is specifically configured to:

input each type of content information into a pre-constructed fusion learning model, and extract the feature vector of each type of content information through the corresponding per-type learning sub-model in the fusion learning model;

fuse the feature vectors of the different types of content information through the fusion sub-model in the fusion learning model, according to the correlations between the different types of content information, to obtain the feature vector of the video to be audited.
Further, the weight determination module is specifically configured to:

obtain, from the feature vector of the video to be audited, the weight of the video under each of the different preset audit categories through a preset regression function in the fusion sub-model.
Further, the fusion learning model is constructed by performing the following operations:

extracting different types of sample content information from a training sample, the training sample being a historical video under a target audit category;

extracting the feature vector of each type of sample content information through the per-type learning sub-models, and fusing the feature vectors of the sample content information through the fusion sub-model according to the correlations between them, to obtain the feature vector of the training sample;

obtaining, from the feature vector of the training sample, the training sample's weight under each of the different preset audit categories through the preset regression function in the fusion sub-model;

determining a classification loss from the target audit category of the training sample and its weights under the different preset audit categories, back-propagating the classification loss to revise each learning sub-model and the fusion sub-model, and continuing to acquire new training samples under the target audit category until the classification loss under the target audit category falls below a preset loss threshold;

reacquiring training samples under the other audit categories and training again, until the classification loss under every preset audit category is below its corresponding preset loss threshold, and then constructing the fusion learning model from the resulting learning sub-models and fusion sub-model.
Further, the different types of content information comprise a picture sequence, an audio sequence, and a text sequence in the video to be audited.
Further, the information extraction module is specifically configured to:

segment the video to be audited, and extract video frames from each segment;

combine the extracted video frames to obtain the corresponding picture sequence.
Further, the information extraction module is also specifically configured to:

segment the video to be audited, and resample the audio information in each segment;

extract the spectral features of the resampled audio information through a Mel-frequency cepstral (MFC) algorithm;

combine the extracted spectral features to obtain the corresponding audio sequence.
Further, the information extraction module is also specifically configured to:

obtain the text information in the video to be audited through an optical character recognition (OCR) algorithm, to obtain the corresponding text sequence.
Further, the audit category determination module is specifically configured to:

if the weight of the video to be audited under a violation category exceeds a preset violation threshold, send the video to be audited to a manual review platform;

determine the audit category of the video to be audited according to feedback information from the manual review platform.
In a third aspect, an embodiment of the invention provides a device, the device comprising:

one or more processors;

a storage apparatus for storing one or more programs;

wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method of video audit described in any embodiment of the invention.
In a fourth aspect, an embodiment of the invention provides a computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the method of video audit described in any embodiment of the invention.
Embodiments of the invention provide a method, apparatus, device, and storage medium for video audit: the feature vectors of the different types of content information in a video to be audited are extracted and fused according to the correlations between them, the fused feature vector is used to obtain the video's weight under each of the different preset audit categories, and the audit category of the video to be audited is thereby determined. This realizes video audit under multi-type fusion, addresses the limitations of video audit in the prior art, and improves the comprehensiveness and accuracy of video audit.
Brief description of the drawings
Other features, objects, and advantages of the invention will become more apparent from the detailed description of non-limiting embodiments given below with reference to the accompanying drawings:
Fig. 1 is a flowchart of a method of video audit provided by Embodiment One of the present invention;

Fig. 2 is a schematic diagram of a video audit process provided by Embodiment Two of the present invention;

Fig. 3 is a schematic diagram of the construction of the fusion learning model in the method of video audit provided by Embodiment Three of the present invention;

Fig. 4A is a scene architecture diagram of an application scenario to which the method of video audit provided by Embodiment Four of the present invention is applicable;

Fig. 4B is a schematic diagram of the video audit process provided by Embodiment Four of the present invention;

Fig. 5 is a structural schematic diagram of an apparatus for video audit provided by Embodiment Five of the present invention;

Fig. 6 is a structural schematic diagram of a device provided by Embodiment Six of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention, not to limit it. It should also be noted that, for convenience of description, the drawings show only the parts related to the invention rather than the entire structure.
Embodiments of the present invention involve various Internet technologies such as optical character recognition (OCR), natural language processing, speech recognition, computer vision, and machine learning, and are primarily applicable to the content audit of user-generated content (UGC) videos, a novel kind of user-original content in Internet video streams uploaded by users, such as short videos and live-stream videos. Embodiments of the invention mainly use multiple deep neural network models to analyze the different types of content information in a large number of existing videos, effectively mine the correlations between the different types of content information so as to fuse them, obtain the corresponding trained sub-models, and construct a fusion learning model from those trained sub-models. The different types of content information in a video to be audited are then fused according to the fusion learning model to obtain the video's weight under each of the different preset audit categories, thereby determining the audit category of the video to be audited. In this way, the inherent correlations between the various types of content information in the video to be audited can be effectively mined, improving the comprehensiveness and accuracy of video audit.
Embodiment one
Fig. 1 is a flowchart of a method of video audit provided by Embodiment One of the present invention. This embodiment is applicable to any video audit terminal that audits original videos uploaded by users, and its scheme is applicable to solving the limitations of video audit. The method of video audit provided by this embodiment may be executed by the apparatus for video audit provided by embodiments of the invention; the apparatus may be realized in software and/or hardware and integrated in the device executing the method, which may be any smart terminal device carrying the corresponding video audit capability.
Specifically, with reference to Fig. 1, the method may comprise the following steps:

S110: extracting the different types of content information from the video to be audited.
Here, the video to be audited may be any user-original video to be audited that a user uploads through any of various applications for propagation on the Internet, such as a short video or a network live-stream video recorded by the user. To prevent the rapid propagation on the Internet of user-original videos containing violating content, the content of every type of user-original video needs to be audited before transmission, and videos containing violating content need to be filtered out. The different types of content information are the content information under the various types separated out of the video to be audited, such as pictures, sound, and text; in this embodiment, the different types of content information may comprise the picture sequence, the audio sequence, and the text sequence in the video to be audited. Specifically, in order to overcome the limitations of video audit, it is first necessary to determine the content under each type in the video to be audited, so that the inherent correlations between the different types of content information can subsequently be effectively mined to obtain fused global information.

Optionally, this embodiment first needs to acquire the corresponding video to be audited: a wireless connection may be established with each user terminal so that the video to be audited is acquired when the user uploads the corresponding video; alternatively, an administrator may input the video to be audited directly into the terminal device executing the video audit method provided in this embodiment, so that the video is acquired directly. After the video to be audited is acquired, existing information-separation techniques can first be used to respectively extract the different types of content information contained in it, such as pictures, sound, and text, so that the inherent correlations between the different types of content information in the video to be audited can subsequently be effectively mined to obtain the global information features of the video.
Optionally, since in this embodiment the different types of content information may comprise the picture sequence, the audio sequence, and the text sequence in the video to be audited, extracting the different types of content information from the video to be audited may include separately extracting these three kinds of content information.
1) When extracting the picture sequence in the video to be audited in this embodiment, extracting the different types of content information in the video may specifically include: segmenting the video to be audited, and extracting a video frame from each segment; and combining the extracted video frames to obtain the corresponding picture sequence.
Specifically, since consecutive frames of the obtained video may be similar, in order to reduce the amount of data processing, in this embodiment the video may be segmented at a certain time interval so that the video frames within each segment have a certain similarity. One video frame is then randomly selected from all the video frames contained in each segment, and the other video frames of that segment are discarded, thereby ensuring a certain picture difference between the frames extracted from different segments. This greatly reduces the redundancy of video frames during processing and improves the rate of the subsequent frame analysis.
Further, since the frame sizes of videos uploaded by different users may differ, while the picture sequences of different videos must have identical picture sizes for the subsequent analysis, each video frame extracted from each segment needs to be scaled to a predefined picture size, so that picture sequences of the same size are analysed subsequently. Meanwhile, the extracted video frames scaled to the predefined picture size are combined in the time order of the segments to obtain the picture sequence among the different types of content information in the video to be audited, so that the picture sequence can subsequently be analysed and fused accordingly, improving the rate of the video audit.
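The per-segment random frame sampling described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name, the one-second default interval and the fixed seed are all assumptions made here for the example.

```python
import random

def sample_frame_indices(num_frames, fps, interval_s=1.0, seed=0):
    """Split a video's frames into fixed-length time segments and randomly
    keep one frame index per segment, discarding the rest of that segment."""
    rng = random.Random(seed)
    seg_len = max(1, int(fps * interval_s))  # frames per segment
    picks = []
    for start in range(0, num_frames, seg_len):
        end = min(start + seg_len, num_frames)
        picks.append(rng.randrange(start, end))  # one random frame per segment
    return picks

# e.g. a 10-second clip at 25 fps, one frame kept per second:
indices = sample_frame_indices(250, fps=25)
```

Scaling each selected frame to the predefined picture size would then be done with an image library before the frames are combined into the picture sequence.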
2) When extracting the audio sequence in the video to be audited in this embodiment, extracting the different types of content information may specifically include: segmenting the video to be audited, and resampling the audio information in each segment; extracting spectral features from the resampled audio information by the Mel-Frequency Cepstrum (MFC) algorithm; and combining the extracted spectral features to obtain the corresponding audio sequence.
Here, the MFC is a spectrum that can be used to represent a short-term speech signal; it is based on a log spectrum expressed on the non-linear mel scale and its linear cosine transform. The frequency bands of the MFC are evenly distributed on the mel scale, which matches the non-linear human auditory system more closely than the linearly spaced bands of an ordinary cepstral representation. The MFC algorithm is therefore generally chosen to observe speech features in speech recognition systems; for example, it can automatically recognise digits spoken by a user over the telephone.
Specifically, in order to subsequently analyse the inherent correlation between the audio information and the picture sequence, after the video to be audited is obtained it may be segmented at the same time interval used for the picture-sequence processing, so as to guarantee the matching degree with the picture sequence. Because different videos may have been recorded with different audio sampling frequencies, the audio information in each segment of the video needs to be resampled at a predefined frequency, so as to ensure that the subsequent audio processing is performed at the same frequency.
Further, after the resampled audio information of each segment is obtained, the spectral features of the resampled audio information of each segment may be obtained by the MFC algorithm in order to analyse it and recognise the speech information of the audio in that segment. Since the MFC algorithm cannot obtain the spectral features corresponding to an entire segment's resampled audio information in one pass, a sliding window of preset fixed size may be slid over the audio information of each segment, and the MFC algorithm used at each step to extract the spectral features corresponding to the audio signal inside the window. Because successive sliding windows may contain the same audio information, the spectral features of that audio information may be extracted repeatedly, which ensures the completeness and accuracy of the spectral features corresponding to the audio signal and improves the audit effect. After the spectral features of the resampled audio information are extracted, they may be combined in the corresponding time order to obtain the audio sequence among the different types of content information in the video to be audited, so that the audio sequence can subsequently be analysed and fused accordingly, improving the rate of the video audit. In addition, other audio feature extraction algorithms, such as the linear prediction cepstral coefficient algorithm, may also be used in this embodiment in place of the MFC algorithm to extract the spectral features of the resampled audio information.
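The overlapping sliding window described above can be sketched as follows. This shows only the framing step, not the MFC computation itself; the window and hop sizes (25 ms and 10 ms at 16 kHz, values common in speech processing, not fixed by the patent) are assumptions for the example.

```python
def sliding_windows(samples, win=400, hop=160):
    """Slide a fixed-size window over resampled audio samples; with
    hop < win, consecutive windows overlap, so the same samples are
    analysed more than once, mirroring the repeated extraction above."""
    frames = []
    for start in range(0, len(samples) - win + 1, hop):
        frames.append(samples[start:start + win])
    return frames

# 1 s of 16 kHz audio framed into 25 ms windows every 10 ms:
frames = sliding_windows(list(range(16000)))
```

Each returned frame would then be fed to the spectral-feature extractor (MFC or a substitute such as linear prediction cepstral coefficients).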
3) When extracting the word sequence in the video to be audited in this embodiment, extracting the different types of content information may specifically include: obtaining the text information in the video by an OCR algorithm, to obtain the corresponding word sequence.
Here, the OCR algorithm refers to the process by which an electronic device (such as a scanner or digital camera) examines characters printed on paper or displayed on a screen, determines their shapes by detecting dark and bright patterns, and then translates the shapes into computer text with a character recognition method, so as to recognise text automatically.
Specifically, the text information in the video to be audited may come from the bullet comments appearing in the video, the comment information in the comment area, the text information superimposed on the video pictures, and the text information in the video scenes; it is not limited to these sources, and other text sources may also exist. In this embodiment, the text information appearing in the video may be recognised by an existing OCR algorithm. Since the bullet-comment text and the comment-area comments are entered by other users watching the video and can be obtained directly from the background system, in this embodiment only the text information carried by the video itself needs to be obtained with the existing OCR algorithm. The obtained text information is then processed with the Word2Vec algorithm, which effectively maps it into low-dimensional dense feature vectors, yielding the corresponding word sequence. In addition, other word embedding algorithms, such as the GloVe algorithm, the WordRank model or the FastText classifier, may be used in this embodiment in place of the Word2Vec algorithm to process the obtained text information and obtain the corresponding word sequence.
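The token-to-vector mapping can be sketched as below. This is a toy stand-in: a real system would look each token up in a trained Word2Vec (or GloVe/FastText) embedding table, whereas here a deterministic hash simply produces a low-dimensional dense vector per token; the function name and dimension are assumptions for the example.

```python
import hashlib

def embed_tokens(tokens, dim=8):
    """Toy stand-in for a trained word-embedding lookup: deterministically
    map each recognised token to a low-dimensional dense vector."""
    seq = []
    for tok in tokens:
        h = hashlib.md5(tok.encode("utf-8")).digest()
        vec = [b / 255.0 for b in h[:dim]]  # dim values in [0, 1]
        seq.append(vec)
    return seq

# tokens recognised by OCR or pulled from the background system:
word_sequence = embed_tokens(["danmaku", "comment", "subtitle"])
```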
S120: Fuse the feature vectors of the different types of content information according to the correlations between the different types of content information, to obtain the feature vector of the video to be audited.
Specifically, after the different types of content information in the video to be audited are extracted, the feature vector of each type of content information must first be obtained separately in order to analyse the correlations between the types. A corresponding single-type analysis model may be set for each type of content information; each single-type analysis model is trained on a large amount of historical training data of its type by methods such as existing neural networks, support vector machines, random forests or big-data analysis. The trained single-type analysis models can then derive, from the different types of content information, the feature vector of each type, where each feature vector represents the characteristics that distinguish its type of content information from the other content information.
Optionally, when the feature vectors of the different types of content information are obtained, the feature values in each dimension of the different types of content information are analysed to judge the correlations between the types, and the feature vectors of the different types of content information are fused accordingly. Optionally, by analysing the feature values in the same dimension of the feature vectors of the different types of content information, the correlation of the types in that dimension is judged, and the feature values of the feature vectors in the same dimension are fused, yielding a feature vector that can represent the overall characteristics of the video to be audited.
Illustratively, after the different types of content information contained in the video to be audited, such as the picture sequence, the audio sequence and the word sequence, are extracted, single-type analysis is performed on the picture sequence, the audio sequence and the word sequence respectively to obtain their corresponding feature vectors. The feature vectors extracted for the picture sequence and the audio sequence can be analysed along the corresponding time dimension, so the specific feature values of their feature vectors in the same time dimension can be compared to judge the correlation of the two. Meanwhile, because the word sequence may include bullet comments and comment information, its real-time requirement is lower; therefore, when the picture features and audio features at each time step are analysed, the feature values of the feature vector of the whole word sequence may be analysed as a whole. According to the inherent correlations among the picture sequence, the audio sequence and the word sequence, their corresponding feature vectors are fused to obtain a feature vector that can represent the overall characteristic information of the video to be audited, which can subsequently be analysed under the different audit categories.
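The fusion just described can be sketched as a simple concatenation: per-segment picture and audio vectors are joined on the shared time dimension, and one whole-video text vector is appended since the word sequence has no strict time alignment. This is an illustrative simplification of the learned fusion; all names are assumptions for the example.

```python
def fuse_features(pic_feats, audio_feats, text_feat):
    """Concatenate time-aligned per-segment picture and audio feature
    vectors, then append one global text feature vector."""
    assert len(pic_feats) == len(audio_feats)  # same segmentation interval
    fused = []
    for p, a in zip(pic_feats, audio_feats):
        fused.extend(p + a)      # per-time-step concatenation
    fused.extend(text_feat)      # whole-video text features appended once
    return fused

# one segment with a 2-d picture vector and 1-d audio vector,
# plus a 2-d whole-video text vector:
v = fuse_features([[1.0, 2.0]], [[3.0]], [4.0, 5.0])
```

In the learned scheme of the later embodiments, this concatenation would be replaced by the fusion sub-model's trained non-linear mapping.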
S130: According to the feature vector of the video to be audited, obtain the proportions of the video under the different preset audit categories.
Here, a preset audit category is any video type to which the video to be audited may belong, and may include a normal category and a violation category; the violation category may be further subdivided into violation types containing various kinds of violating content, such as a violence category, a horror category, a pornography category and a politically sensitive category.
Specifically, after the feature vectors of the different types of content information are fused into the feature vector of the video to be audited, that feature vector can be analysed to judge the degree of difference between the video and the videos belonging to each of the different preset audit categories. According to the degree of difference between the feature vector of the video and the videos of the different preset audit categories, the proportions of the video under the different preset audit categories are determined, so that the audit category of the video can subsequently be judged from those proportions. Optionally, in this embodiment the features of a large number of historical videos under the different preset audit categories may be analysed to derive the common features that the videos of each preset audit category should have; the proportions of the video under the different preset audit categories are then determined from the degree of difference between its feature vector and the common features of each preset audit category.
S140: Determine the audit category of the video to be audited according to its proportions under the different preset audit categories.
Specifically, after the proportions of the video under the different preset audit categories are obtained in this embodiment, each proportion is analysed to judge whether the video is a normal video. Illustratively, if the proportion of the video in the normal category is much larger than its proportions in each subdivided violation category, the audit category of the video is determined to be the normal category. If the proportion of the video in the normal category is lower than the sum of its proportions in the subdivided violation categories, and its proportion in the violence category among the subdivided violation categories is much higher than its proportions in the other violation categories, the audit category of the video is determined to be the violence category within the violation category. Meanwhile, if the preset audit categories include only the normal category and the violation category, the category with the higher of the two proportions is directly determined to be the audit category of the video.
Optionally, in order to avoid errors in the audit category of the video caused by machine audit alone, this embodiment may further improve the audit accuracy by combining machine audit with manual review. In this case, determining the audit category of the video according to its proportions under the different preset audit categories may specifically include: if the proportion of the video under a violation category exceeds a preset violation threshold, sending the video to a manual-review platform; and determining the audit category of the video according to the feedback information of the manual-review platform.
Specifically, the violation threshold is the lower limit of the proportion at which the video is determined to contain the violating content of the corresponding violation category. In this embodiment, the violation thresholds of the different violation categories may each be defined according to a certain suspected illegal push ratio (SIPR), where the SIPR is the amount of data pushed to the manual platform per day divided by the total amount of business data per day. If the proportions of the video under all the subdivided violation categories are below the corresponding violation thresholds, the audit category of the video is determined to be the normal category; if its proportion under some subdivided violation category exceeds that category's violation threshold, the video may contain the corresponding violating content.
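The threshold-based routing can be sketched as below. The category names and threshold values are illustrative assumptions; in the described scheme each threshold would be tuned via the SIPR push ratio.

```python
def route_video(proportions, thresholds):
    """Escalate a video to the manual-review platform if its proportion
    under any violation sub-category exceeds that category's violation
    threshold; otherwise audit it as normal."""
    flagged = [c for c, p in proportions.items()
               if p > thresholds.get(c, 1.0)]
    return ("manual_review", flagged) if flagged else ("normal", [])

decision = route_video({"violence": 0.40, "porn": 0.05},
                       {"violence": 0.30, "porn": 0.30})
```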
Optionally, in order to further improve the accuracy of the video audit, when the proportion of the video under a certain violation category exceeds the corresponding preset violation threshold, this embodiment may send the video to the corresponding manual-review platform, where staff further review the video manually and determine its audit category. After the manual review is completed, the manual-review platform returns the manual-review result as the corresponding feedback information to the video-audit terminal executing the video audit method of this embodiment, so that the video-audit terminal determines the corresponding manual-review result from the feedback information and thereby determines the audit category of the video.
In the technical solution provided by this embodiment, the feature vectors of the different types of content information in the video to be audited are extracted, the feature vectors are fused according to the correlations between them, and the proportions of the video under the different preset audit categories are obtained from the fused feature vector, thereby determining the audit category of the video. This realises video auditing under multi-type fusion, solves the problem that video auditing in the prior art has limitations, and improves the comprehensiveness and accuracy of the video audit.
Embodiment Two
Fig. 2 is a schematic diagram of a video audit process provided by Embodiment Two of the present invention. This embodiment is optimised on the basis of the technical solution provided by the above embodiment. Specifically, this embodiment mainly explains in detail the fusion process of the feature vectors of the different types of content information and the process of determining the proportions of the video to be audited under the different preset audit categories.
Optionally, this embodiment may include the following steps:
S210: Extract the different types of content information in the video to be audited.
S220: Input each type of content information into a pre-constructed fusion learning model, and extract the feature vectors of the different types of content information through the learning sub-models for the different types in the fusion learning model.
Here, the fusion learning model is a training model that can perform single-type analysis on the different types of content information in the video to be audited and then directly fuse them, so as to obtain, under multi-type fusion, the proportions of the video under the different preset audit categories; it may be composed of multiple sub-models. Specifically, the fusion learning model may include learning sub-models that each analyse a single type of content information, and a fusion sub-model that fuses the analysis results of the different types of content information. The learning sub-models correspond one-to-one with the different types of content information extracted from the video; since the different types of content information in this embodiment may include three kinds, namely the picture sequence, the audio sequence and the word sequence, the fusion learning model in this embodiment includes three learning sub-models, as shown in Fig. 2, corresponding one-to-one with the picture sequence, the audio sequence and the word sequence extracted from the video. Each learning sub-model performs single-type analysis on its sequence, extracting the feature vectors corresponding to the picture sequence, the audio sequence and the word sequence; the fusion sub-model can then fuse the feature vectors extracted by the three learning sub-models to obtain a feature vector that can represent the overall characteristics of the video. Optionally, each learning sub-model and the fusion sub-model in this embodiment is a deep neural network model; through self-learning and adaptive training on a large number of historical videos, the learning sub-models and the fusion sub-model each acquire a particular processing capability, so as to produce the corresponding target analysis results. The learning sub-models and the fusion sub-model in this embodiment may also be trained with other machine learning models, such as XGBoost, support vector machines or random forest models, in place of deep neural network models.
Optionally, in this embodiment, after the different types of content information in the video to be audited are extracted, they are first separately input into the pre-constructed fusion learning model; that is, the picture sequence, the audio sequence and the word sequence extracted from the video are input into the fusion learning model, and feature extraction is performed on each type of content information by the learning sub-model corresponding one-to-one with that type: the feature vector of the picture sequence is extracted by the learning sub-model corresponding to the picture sequence; the feature vector of the audio sequence is extracted by the learning sub-model corresponding to the audio sequence; and the feature vector of the word sequence is extracted by the learning sub-model corresponding to the word sequence. The feature vectors can then be fused according to the inherent correlations among the picture sequence, the audio sequence and the word sequence.
S230: Fuse the feature vectors of the different types of content information through the fusion sub-model in the fusion learning model according to the correlations between the different types of content information, to obtain the feature vector of the video to be audited.
Specifically, after the feature vectors of the different types of content information are obtained by the respective learning sub-models, they are passed to the fusion sub-model in the fusion learning model, so that the fusion sub-model effectively mines the correlations between the different types of content information and fuses their feature vectors according to those correlations, obtaining a feature vector that can represent the overall characteristics of the video to be audited. Optionally, in this embodiment the feature vectors of the picture sequence, the audio sequence and the word sequence are passed to the fusion sub-model, which analyses the inherent correlations among them; for example, the correlation between the picture sequence and the audio sequence may be judged by analysing the feature values of their feature vectors in the same time dimension. Since the word sequence includes bullet comments, comment information and the like, its real-time requirement is lower; therefore, after the correlation between the picture sequence and the audio sequence is determined, the correlations of the feature vector of the word sequence with the picture sequence and the audio sequence are analysed in other dimensions, and the feature vectors of the picture sequence, the audio sequence and the word sequence are fused to obtain the feature vector of the video. Specifically, in this embodiment the feature vectors of the different types of content information may be fused through the non-linear mappings between the neurons in the fusion sub-model, obtaining the feature vector of the video to be audited. A non-linear mapping can learn and store a large number of input-output patterns as mapping relationships without knowing in advance the mathematical equations describing them; as long as enough sample data are provided for prior learning training of the network model, the non-linear mapping from an n-dimensional input space to an m-dimensional output space can be completed.
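The kind of learned non-linear mapping between neurons described above can be sketched as one hidden layer with a tanh activation. The weights here are fixed toy values chosen for the example; in the described scheme they would be learned from historical videos.

```python
import math

def fuse_nonlinear(x, w, b):
    """Apply one layer of neurons with a tanh non-linearity to a
    concatenated per-type feature vector x: output[i] = tanh(w[i]·x + b[i])."""
    return [math.tanh(sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(w, b)]

# a 2-d fused input mapped to a 2-d output by toy weights:
fused = fuse_nonlinear([1.0, -1.0],
                       w=[[0.5, 0.5], [1.0, 0.0]],
                       b=[0.0, 0.0])
```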
S240: According to the feature vector of the video to be audited, obtain the proportions of the video under the different preset audit categories through the preset regression function in the fusion sub-model.
Here, the preset regression function is a regression model for multi-class classification in a neural network, used to normalise the classification results of the video to be audited over the different preset audit categories. Optionally, in this embodiment the Softmax regression function is used to map the outputs of the feature vector of the video under each preset audit category into the corresponding interval (0, 1), obtaining the proportions of the video under the different preset audit categories; this guarantees that the proportions of the video under the different preset audit categories sum to 1. It should be noted that the preset regression function in the fusion sub-model may be the Softmax regression function, the Logistic regression function, a polynomial regression function or the like — any regression model that can determine the proportions under the different audit categories from the feature vector of the video — which is not limited in this embodiment.
Optionally, in this embodiment the feature vector of the video to be audited is analysed to judge the degree of difference between the video and the videos belonging to each of the different preset audit categories, thereby determining the analysis results of the feature vector of the video under the different preset audit categories; the Softmax regression function in the fusion sub-model then maps the analysis results output under each preset audit category into the corresponding interval (0, 1), thereby determining the proportions of the video under the different preset audit categories, from which the audit category of the video can subsequently be determined.
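The Softmax mapping of per-category analysis results into proportions that sum to 1 can be sketched as follows; the example category scores are assumptions.

```python
import math

def softmax(logits):
    """Map per-category scores into (0, 1) proportions summing to 1,
    as the Softmax regression function in the fusion sub-model does."""
    m = max(logits)                       # subtract max for stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

# e.g. scores for (normal, violence, horror, pornography):
props = softmax([2.0, 0.5, 0.1, -1.0])
```

The audit category can then be read off from the largest proportion, or each violation proportion compared against its violation threshold.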
S250: Determine the audit category of the video to be audited according to its proportions under the different preset audit categories.
In the technical solution provided by this embodiment, the feature vectors of the different types of content information are fused by the pre-constructed fusion learning model, and the proportions of the video to be audited under the different preset audit categories are obtained from the fused feature vector, thereby determining the audit category of the video. This realises video auditing under multi-type fusion, solves the problem that video auditing in the prior art has limitations, and improves the comprehensiveness and accuracy of the video audit.
Embodiment Three
Fig. 3 is a schematic diagram of the construction of the fusion learning model in the video audit method provided by Embodiment Three of the present invention. This embodiment is optimised on the basis of the above embodiments, and mainly explains in detail the specific training process of the fusion learning model.
Optionally, this embodiment may include the following steps:
S310: Extract the different types of sample content information in a training sample.
Here, a training sample is a historical video under a target audit category. In order to guarantee that the constructed fusion learning model can audit all types of videos, a large number of historical videos whose audit categories have already been determined need to be trained on in advance. First, a large number of historical videos under each category are obtained as training samples; the sources of these historical videos may be short videos or live videos uploaded to and propagated on the internet, or videos obtained through other channels — the source of the historical videos is not limited in this embodiment. The large number of obtained training samples are assembled into a corresponding training set, and each training sample is labelled: according to the content of the historical video, a corresponding sample label is set for each training sample in advance. The sample label is the target audit category to which the training sample belongs, including the normal category and the violation category, or a specific subdivision of the violation category; the violation label in this embodiment may therefore be set as a single classification label or multiple classification labels, so as to indicate whether the violation category is subdivided into specific violation types.
Optionally, this embodiment performs deep neural network training separately on the training samples under each audit category. First, the training samples in the training set whose label information is a certain target audit category are obtained, and the different types of sample information in each training sample are separated, so as to extract the different types of sample content information contained in the training sample, such as pictures, sound and text. Specifically, in this embodiment the sample picture sequence, the sample audio sequence and the sample word sequence contained in the training sample may be extracted separately, so that each sub-model can subsequently be trained.
S320, extracts the feature vector of samples of content information by the study submodel under different type respectively, and according to
Correlation between each samples of content information merges the feature vector of each samples of content information by merging submodel,
Obtain the feature vector of training sample.
Specifically, after the different types of sample content information have been extracted from the training sample, as shown in Fig. 3, each piece of sample content information is input into the corresponding learning submodel: the sample picture sequence is input into the learning submodel trained for pictures, the sample audio sequence into the learning submodel trained for audio, and the sample word sequence into the learning submodel trained for text. According to the preset training parameters in each learning submodel, features are extracted from the sample picture sequence, the sample audio sequence and the sample word sequence respectively, yielding the feature vector of each piece of sample content information. The feature vectors of the sample content information are then input together into the fusion submodel, which judges the correlation between the pieces of sample content information through its preset training parameters, that is, analyzes the inherent correlation between the feature vectors of the sample picture sequence, the sample audio sequence and the sample word sequence. Based on this correlation analysis, the fusion submodel uses the corresponding training parameters to fuse the feature vectors of the sample content information, obtaining a feature vector that represents the overall character of the training sample. Specifically, in this embodiment the feature vectors of the different types of sample content information in the training sample are fused together through the nonlinear mapping between the neurons of the fusion submodel, yielding the feature vector of the training sample.
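As a rough illustration of this fusion step, the sketch below assumes each learning submodel has already produced a fixed-length feature vector and stands in for the fusion submodel with concatenation followed by a single nonlinear layer; in the real model `w` and `b` would be trained parameters, here they are random:

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse_features(picture_vec, audio_vec, word_vec, w, b):
    """Fuse per-modality feature vectors into one sample-level vector.

    Concatenation followed by one tanh layer stands in for the fusion
    submodel's nonlinear mapping between neurons; w and b are stand-ins
    for trained parameters."""
    joint = np.concatenate([picture_vec, audio_vec, word_vec])
    return np.tanh(w @ joint + b)

# Toy dimensions: 4-d per modality, 8-d fused vector.
w = rng.normal(size=(8, 12))
b = np.zeros(8)
fused = fuse_features(rng.normal(size=4), rng.normal(size=4),
                      rng.normal(size=4), w, b)
```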
S340: according to the feature vector of the training sample, obtain the specific gravity of the training sample under each of the different preset audit classifications through the preset regression function in the fusion submodel.
Optionally, once the feature vector of the training sample has been obtained, the model parameter features preset in the fusion submodel for each of the different preset audit classifications are used: the degree of difference between the feature vector of the training sample and the preset model parameter features under each preset audit classification is judged, yielding a difference-analysis result for the feature vector of the training sample under each preset audit classification. The preset regression function in the fusion submodel, that is, the Softmax regression function, then maps the difference-analysis results output for the various preset audit classifications into the interval (0, 1), thereby determining the specific gravity of the training sample under each preset audit classification, so that the audit classification to which the training sample belongs in this round of training can subsequently be determined from these specific gravities.
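The Softmax mapping from per-classification difference-analysis scores to specific gravities in (0, 1) can be shown concretely; the score values below are made up for illustration:

```python
import numpy as np

def softmax(scores):
    """Map per-classification scores into (0, 1) weights that sum to 1,
    as the fusion submodel's Softmax regression function does."""
    shifted = scores - np.max(scores)   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Illustrative difference-analysis scores for three preset audit classifications.
weights = softmax(np.array([2.0, 1.0, 0.1]))
```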
S350: according to the target audit classification of the training sample and its specific gravity under the different preset audit classifications, determine the corresponding classification loss, back-propagate the classification loss to correct each learning submodel and the fusion submodel, and continue to obtain new training samples under the target audit classification until the classification loss under the target audit classification is lower than the preset loss threshold.
Specifically, once the specific gravity of the training sample under each of the different preset audit classifications has been obtained, the audit classification that the fusion learning model assigns to the training sample during this round of training can be broadly determined; this audit classification is an estimate. The target audit classification to which the training sample actually belongs can then be compared with the audit classification determined from the specific gravities under the different preset audit classifications, that is, the true class of the video audit is judged against this estimate, and the classification loss of this round of auditing is determined. The classification loss clearly indicates how accurately the submodels currently being trained classify the target audit type. Optionally, any existing loss function may be used to judge the classification loss of this round of training; no limitation is imposed here. In this embodiment, the classification loss of this round is determined by computing the cross entropy between the sample label of the training sample, that is, its target audit classification, and the specific gravity of the training sample under the different preset audit classifications. Moreover, in this embodiment, after the classification loss has been obtained, it must also be judged: if the classification loss of this round exceeds the preset loss threshold, the submodels trained in this round do not yet audit video with sufficient accuracy and need to be trained again. In that case, the classification loss obtained in this round is back-propagated according to the model training procedure, that is, it is propagated backwards in turn through the fusion submodel and each learning submodel, and the training parameters in each submodel are corrected according to the classification loss, so that the training parameters are continually adjusted and the classification accuracy of each submodel is continually improved.
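The cross-entropy computation described above can be sketched as follows, with the target audit classification given as an index and the specific gravities as a probability-like vector; the numbers are illustrative:

```python
import numpy as np

def cross_entropy(target_index, specific_gravities, eps=1e-12):
    """Cross entropy between the sample's target audit classification
    (one-hot label) and its specific gravities over the preset audit
    classifications; eps guards against log(0)."""
    return -np.log(specific_gravities[target_index] + eps)

# A confident, correct prediction yields a small loss.
loss = cross_entropy(0, np.array([0.7, 0.2, 0.1]))
```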
Further, after the submodels have been corrected, new training samples belonging to the target audit classification can continue to be obtained from the training set constructed in advance, and the new classification loss determined when training on each new sample is obtained, cycling in this way until the classification loss under the target audit classification falls below the preset loss threshold. At that point the submodels trained in this round have reached a certain accuracy in auditing video, and no further training on the training samples under this target audit classification is needed; the other audit classifications are then trained.
S360: reacquire the training samples under the other audit classifications and train again, until the classification loss under every preset audit classification is below the corresponding preset loss threshold; the resulting learning submodels and fusion submodel are then assembled into the fusion learning model.
Specifically, when training on the training samples under a given target audit classification is complete, the training samples under the other audit classifications among the preset audit classifications must also be trained. Following the training procedure above, the training samples corresponding to the normal category and to each specific violation type subdivided within the violation category are trained in turn, until the classification loss under every preset audit classification is below the corresponding preset loss threshold. At that point it can be determined that the trained submodels can accurately judge the audit classification of any pending video, and the resulting learning submodels and fusion submodel are assembled into the fusion learning model of this embodiment, so that pending video can subsequently be audited accurately.
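The per-classification training loop of S350 and S360 can be outlined as below; `train_step` is a hypothetical callback standing in for one forward/backward pass on a newly drawn sample for the given classification:

```python
def train_all_categories(categories, train_step, loss_threshold):
    """Train classification by classification: keep drawing new samples
    for a classification until its classification loss falls below the
    threshold, then move on to the next one.

    train_step(category) is assumed to fetch a new training sample for
    the classification, run forward/backward, and return the loss."""
    losses = {}
    for category in categories:
        loss = float("inf")
        while loss >= loss_threshold:
            loss = train_step(category)
        losses[category] = loss
    return losses

# Illustrative usage with a fake step that halves the loss each call.
state = {"violation": 1.0, "normal": 0.8}
def fake_step(cat):
    state[cat] *= 0.5
    return state[cat]

losses = train_all_categories(["violation", "normal"], fake_step, 0.1)
```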
In the technical solution provided by this embodiment, the different types of sample content information in a large number of training samples are input into the learning submodels and the fusion submodel for training, constructing a fusion learning model that can analyze and fuse the different types of content information in a pending video. Video audit under multi-type fusion is thereby realized, the limitations of video audit in the prior art are overcome, and the comprehensiveness and accuracy of video audit are improved.
Embodiment four
Fig. 4A is a diagram of the scene framework of an application scenario to which the method of video audit provided by embodiment four of the present invention is applicable, and Fig. 4B is a schematic diagram of the video review process provided by embodiment four of the present invention. This embodiment mainly describes in detail, through a specific application scenario, the overall process by which video is audited. Referring to Fig. 4A, this embodiment includes a video audit terminal 40, user terminals 41 and a manual review platform 42; the video audit terminal 40 establishes wireless connections with the user terminals 41 and the manual review platform 42 respectively.
Optionally, a user can upload a pending video through the user terminal 41 at hand. Before the pending video spreads on the internet, the video audit terminal 40 first obtains the pending video newly uploaded by the user on the user terminal 41 and audits it using the method of video audit provided in the embodiments of the present invention, obtaining the specific gravity of the pending video under each of the different preset audit classifications. The specific gravities of the pending video under the violation categories are then examined: if the specific gravity under some violation category exceeds the corresponding preset violation threshold, the pending video is sent to the corresponding manual review platform 42, where staff at the manual review platform 42 further review the pending video manually, thereby determining its audit classification. After the manual review is complete, the manual review platform 42 returns the manual review result to the video audit terminal 40 as corresponding feedback information, so that the video audit terminal 40 determines the manual review result from this feedback information and thereby determines the audit classification of the pending video. If the specific gravity under every violation category is below the corresponding preset violation threshold, the pending video can be directly determined to be a normal video. In this embodiment, machine audit is combined with manual review, further improving the accuracy of video audit.
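The thresholding-and-routing decision described above can be sketched in plain Python; the category names and threshold values are made up for illustration:

```python
def route_video(violation_gravities, thresholds):
    """Decide whether a pending video is a normal video or must be sent
    to the manual review platform.

    violation_gravities: specific gravity per violation category.
    thresholds: preset violation threshold per category."""
    flagged = [cat for cat, g in violation_gravities.items()
               if g > thresholds.get(cat, 1.0)]
    if flagged:
        return "manual_review", flagged
    return "normal", []

decision, flagged = route_video({"violent": 0.72, "adult": 0.10},
                                {"violent": 0.60, "adult": 0.60})
```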
Specifically, referring to Fig. 4B, when auditing the pending video, the video audit terminal 40 first segments the pending video and extracts the different types of information it contains: image information, audio information and text information. The corresponding video frames are extracted from the segmented image information and combined in temporal order to obtain the corresponding picture sequence. The segmented audio information is resampled, the spectral features of each resampled audio segment are extracted, and these are combined in temporal order to obtain the corresponding audio sequence. The text information in the pending video is obtained and processed with the Word2Vec algorithm to obtain the corresponding word sequence. The picture sequence, the audio sequence and the word sequence are each passed through the corresponding learning submodel in the fusion learning model to extract their feature vectors; the fusion submodel then fuses the extracted feature vectors, the specific gravity of the pending video under each of the different preset audit classifications is determined from the fused feature vector, and the audit classification of the pending video is then determined from these specific gravities.
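A toy version of the audio branch of this pipeline (segmentation, resampling, spectral features) can be written with numpy alone; a plain FFT magnitude stands in for the Mel-cepstral features named in the patent, and all sizes are illustrative:

```python
import numpy as np

def segment(signal, seg_len):
    """Split a 1-D audio signal into fixed-length segments (tail dropped)."""
    n = len(signal) // seg_len
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(n)]

def resample(seg, target_len):
    """Linear-interpolation resampling of one segment to a fixed length."""
    x_old = np.linspace(0.0, 1.0, num=len(seg))
    x_new = np.linspace(0.0, 1.0, num=target_len)
    return np.interp(x_new, x_old, seg)

def spectral_feature(seg, n_bins=8):
    """FFT magnitude summary of one segment; a stand-in for MFC features."""
    mag = np.abs(np.fft.rfft(seg))
    return mag[:n_bins]

# Toy signal: 1000 samples, cut into 4 segments of 250, resampled to 64.
signal = np.sin(np.linspace(0, 40 * np.pi, 1000))
audio_sequence = [spectral_feature(resample(s, 64)) for s in segment(signal, 250)]
```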
It should be noted that this embodiment places no limit on the number of user terminals 41; it is determined by the number of users uploading videos.
In the technical solution provided by this embodiment, the feature vectors of the different types of content information in a pending video are extracted, the feature vectors are fused according to the correlation between them, and the specific gravity of the pending video under each of the different preset audit classifications is obtained from the fused feature vector, thereby determining the audit classification of the pending video. Video audit under multi-type fusion is realized, the limitations of video audit in the prior art are overcome, and the comprehensiveness and accuracy of video audit are improved.
Embodiment five
Fig. 5 is a schematic structural diagram of a device for video audit provided by embodiment five of the present invention. Specifically, as shown in Fig. 5, the device may include:
an information extraction module 510, configured to extract the different types of content information in a pending video;
a feature fusion module 520, configured to fuse the feature vectors of the different types of content information according to the correlation between them, obtaining the feature vector of the pending video;
a specific gravity determining module 530, configured to obtain, according to the feature vector of the pending video, the specific gravity of the pending video under each of the different preset audit classifications;
an audit category determining module 540, configured to determine the audit classification of the pending video according to the specific gravities under the different preset audit classifications.
In the technical solution provided by this embodiment, the feature vectors of the different types of content information in a pending video are extracted, the feature vectors are fused according to the correlation between them, and the specific gravity of the pending video under each of the different preset audit classifications is obtained from the fused feature vector, thereby determining the audit classification of the pending video. Video audit under multi-type fusion is realized, the limitations of video audit in the prior art are overcome, and the comprehensiveness and accuracy of video audit are improved.
Further, the feature fusion module 520 may be specifically configured to:
input each different type of content information into a fusion learning model constructed in advance, and extract the feature vector of each type of content information through the learning submodel for that type within the fusion learning model;
fuse, according to the correlation between the different types of content information, the feature vectors of the different types of content information through the fusion submodel in the fusion learning model, obtaining the feature vector of the pending video.
Further, the specific gravity determining module 530 may be specifically configured to:
obtain, according to the feature vector of the pending video, the specific gravity of the pending video under each of the different preset audit classifications through the preset regression function in the fusion submodel.
Further, the fusion learning model may be constructed by performing the following operations:
extracting the different types of sample content information in a training sample, the training sample being a historical video under a target audit classification;
extracting the feature vector of each piece of sample content information through the learning submodel for its type, and fusing, according to the correlation between the pieces of sample content information, the feature vectors of the sample content information through the fusion submodel, obtaining the feature vector of the training sample;
obtaining, according to the feature vector of the training sample, the specific gravity of the training sample under each of the different preset audit classifications through the preset regression function in the fusion submodel;
determining, according to the target audit classification of the training sample and its specific gravity under the different preset audit classifications, the corresponding classification loss; back-propagating the classification loss to correct each learning submodel and the fusion submodel; and continuing to obtain new training samples under the target audit classification until the classification loss under the target audit classification is lower than a preset loss threshold;
reacquiring training samples under the other audit classifications and training again, until the classification loss under every preset audit classification is below the corresponding preset loss threshold, and then assembling the resulting learning submodels and fusion submodel into the fusion learning model.
Further, the different types of content information may include the picture sequence, audio sequence and word sequence in the pending video.
Further, the information extraction module 510 may be specifically configured to:
segment the pending video and extract video frames from the segmented pending video;
combine the extracted video frames to obtain the corresponding picture sequence.
Further, the information extraction module 510 may also be specifically configured to:
segment the pending video and resample the audio information in the segmented pending video;
extract the spectral features of the resampled audio information through the Mel-cepstral (MFC) algorithm;
combine the extracted spectral features to obtain the corresponding audio sequence.
Further, the information extraction module 510 may also be specifically configured to:
obtain the text information in the pending video through the optical character recognition (OCR) algorithm, obtaining the corresponding word sequence.
Further, the audit category determining module 540 may be specifically configured to:
when the specific gravity of the pending video under a violation category exceeds the preset violation threshold, send the pending video to the manual review platform;
determine the audit classification of the pending video according to the feedback information of the manual review platform.
The device of video audit provided in this embodiment is applicable to the method of video audit provided by any of the embodiments above, and has the corresponding functions and beneficial effects.
Embodiment six
Fig. 6 is a schematic structural diagram of equipment provided by embodiment six of the present invention. As shown in Fig. 6, the equipment includes a processor 60, a storage device 61 and a communication device 62. There may be one or more processors 60 in the equipment; Fig. 6 takes one processor 60 as an example. The processor 60, the storage device 61 and the communication device 62 in the equipment may be connected by a bus or in other ways; in Fig. 6, connection by a bus is taken as an example.
As a computer-readable storage medium, the storage device 61 may be used to store software programs, computer-executable programs and modules, such as the program instructions/modules corresponding to the method of video audit provided in the embodiments of the present invention. By running the software programs, instructions and modules stored in the storage device 61, the processor 60 executes the various functional applications and data processing of the equipment, that is, realizes the method of video audit described above.
The storage device 61 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and the application programs required by at least one function, and the data storage area may store data created according to the use of the terminal, and the like. In addition, the storage device 61 may include high-speed random access memory and may also include nonvolatile memory, for example at least one magnetic disk memory, flash memory device or other nonvolatile solid-state memory. In some examples, the storage device 61 may further include memory located remotely from the processor 60; such remote memory may be connected to the equipment over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks and combinations thereof.
The communication device 62 may be used to realize network connections or mobile data connections between equipment.
The equipment provided in this embodiment may be used to execute the method of video audit provided by any of the embodiments above, and has the corresponding functions and beneficial effects.
Embodiment seven
Embodiment seven of the present invention further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program can realize the method of video audit in any of the embodiments above. The method may specifically include:
extracting the different types of content information in a pending video;
fusing, according to the correlation between the different types of content information, the feature vectors of the different types of content information, obtaining the feature vector of the pending video;
obtaining, according to the feature vector of the pending video, the specific gravity of the pending video under each of the different preset audit classifications;
determining the audit classification of the pending video according to the specific gravities under the different preset audit classifications.
Of course, for the storage medium containing computer-executable instructions provided by the embodiments of the present invention, the method operations performed by the computer-executable instructions are not limited to those described above; the relevant operations in the method of video audit provided by any embodiment of the present invention may also be performed.
From the description of the embodiments above, it will be clear to those skilled in the art that the present invention may be realized by software plus the necessary general-purpose hardware, and naturally also by hardware alone, but in many cases the former is the better embodiment. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as a computer floppy disk, read-only memory (ROM), random access memory (RAM), flash memory (FLASH), hard disk or optical disc, and includes instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in the embodiments of the present invention.
It is worth noting that, in the embodiment of the device of video audit above, the units and modules included are merely divided according to functional logic, and the division is not limited to the above, as long as the corresponding functions can be realized; in addition, the specific names of the functional units are only for the convenience of distinguishing them from one another and are not intended to restrict the protection scope of the present invention.
The above is only a preferred embodiment of the present invention and is not intended to restrict the invention; for those skilled in the art, the invention may have various modifications and changes. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.
Claims (12)
1. A method of video audit, characterized by comprising:
extracting the different types of content information in a pending video;
fusing, according to the correlation between the different types of content information, the feature vectors of the different types of content information, obtaining the feature vector of the pending video;
obtaining, according to the feature vector of the pending video, the specific gravity of the pending video under each of different preset audit classifications;
determining the audit classification of the pending video according to the specific gravities under the different preset audit classifications.
2. The method according to claim 1, characterized in that fusing, according to the correlation between the different types of content information, the feature vectors of the different types of content information to obtain the feature vector of the pending video comprises:
inputting each different type of content information into a fusion learning model constructed in advance, and extracting the feature vector of each type of content information through the learning submodel for that type within the fusion learning model;
fusing, according to the correlation between the different types of content information, the feature vectors of the different types of content information through the fusion submodel in the fusion learning model, obtaining the feature vector of the pending video.
3. The method according to claim 2, characterized in that obtaining, according to the feature vector of the pending video, the specific gravity of the pending video under each of the different preset audit classifications comprises:
obtaining, according to the feature vector of the pending video, the specific gravity of the pending video under each of the different preset audit classifications through the preset regression function in the fusion submodel.
4. The method according to claim 2, characterized in that the fusion learning model is constructed by performing the following operations:
extracting the different types of sample content information in a training sample, the training sample being a historical video under a target audit classification;
extracting the feature vector of each piece of sample content information through the learning submodel for its type, and fusing, according to the correlation between the pieces of sample content information, the feature vectors of the sample content information through the fusion submodel, obtaining the feature vector of the training sample;
obtaining, according to the feature vector of the training sample, the specific gravity of the training sample under each of the different preset audit classifications through the preset regression function in the fusion submodel;
determining, according to the target audit classification of the training sample and its specific gravity under the different preset audit classifications, the corresponding classification loss; back-propagating the classification loss to correct each learning submodel and the fusion submodel; and continuing to obtain new training samples under the target audit classification until the classification loss under the target audit classification is lower than a preset loss threshold;
reacquiring training samples under the other audit classifications and training again, until the classification loss under every preset audit classification is below the corresponding preset loss threshold, and then assembling the resulting learning submodels and fusion submodel into the fusion learning model.
5. The method according to claim 1, characterized in that the different types of content information comprise the picture sequence, audio sequence and word sequence in the pending video.
6. The method according to claim 5, characterized in that extracting the different types of content information in the pending video comprises:
segmenting the pending video and extracting video frames from the segmented pending video;
combining the extracted video frames to obtain the corresponding picture sequence.
7. The method according to claim 5, characterized in that extracting the different types of content information in the pending video comprises:
segmenting the pending video and resampling the audio information in the segmented pending video;
extracting the spectral features of the resampled audio information through the Mel-cepstral (MFC) algorithm;
combining the extracted spectral features to obtain the corresponding audio sequence.
8. The method according to claim 5, characterized in that extracting the different types of content information in the pending video comprises:
obtaining the text information in the pending video through the optical character recognition (OCR) algorithm, obtaining the corresponding word sequence.
9. The method according to claim 1, characterized in that determining the audit classification of the pending video according to the specific gravities under the different preset audit classifications comprises:
when the specific gravity of the pending video under a violation category exceeds the preset violation threshold, sending the pending video to a manual review platform;
determining the audit classification of the pending video according to the feedback information of the manual review platform.
10. A device of video audit, characterized by comprising:
an information extraction module, configured to extract the different types of content information in a pending video;
a feature fusion module, configured to fuse the feature vectors of the different types of content information according to the correlation between them, obtaining the feature vector of the pending video;
a specific gravity determining module, configured to obtain, according to the feature vector of the pending video, the specific gravity of the pending video under each of different preset audit classifications;
an audit category determining module, configured to determine the audit classification of the pending video according to the specific gravities under the different preset audit classifications.
11. Equipment, characterized in that the equipment comprises:
one or more processors;
a storage device, configured to store one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors realize the method of video audit according to any of claims 1-9.
12. A computer-readable storage medium on which a computer program is stored, characterized in that, when executed by a processor, the program realizes the method of video audit according to any of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811438719.8A CN109495766A (en) | 2018-11-27 | 2018-11-27 | A kind of method, apparatus, equipment and the storage medium of video audit |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109495766A true CN109495766A (en) | 2019-03-19 |
Family
ID=65698526
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811438719.8A Pending CN109495766A (en) | 2018-11-27 | 2018-11-27 | A kind of method, apparatus, equipment and the storage medium of video audit |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109495766A (en) |
Cited By (20)
Application Events

- 2018-11-27: Application CN201811438719.8A filed in China (CN); published as CN109495766A; status: Pending
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050238238A1 (en) * | 2002-07-19 | 2005-10-27 | Li-Qun Xu | Method and system for classification of semantic content of audio/video data |
CN101035280A (en) * | 2007-04-19 | 2007-09-12 | 鲍东山 | Classified content auditing terminal system |
US20140037269A1 (en) * | 2012-08-03 | 2014-02-06 | Mrityunjay Kumar | Video summarization using group sparsity analysis |
CN103049530A (en) * | 2012-12-22 | 2013-04-17 | 深圳先进技术研究院 | System and method for deep fused video examination |
CN107580259A (en) * | 2016-07-04 | 2018-01-12 | 北京新岸线网络技术有限公司 | Video content verification method and system
CN108124191A (en) * | 2017-12-22 | 2018-06-05 | 北京百度网讯科技有限公司 | Video reviewing method, device and server
Non-Patent Citations (2)
Title |
---|
陈斌: "基于双模态特征和支持向量机的视频自动分类算法研究", 《万方数据库》 * |
韦鹏程 等: "《基于R语言数据挖掘的统计与分析》", 31 December 2017 * |
Cited By (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109862394A (en) * | 2019-03-27 | 2019-06-07 | 北京周同科技有限公司 | Video content checking method, device, equipment and storage medium
CN110085213A (en) * | 2019-04-30 | 2019-08-02 | 广州虎牙信息科技有限公司 | Audio abnormality monitoring method, device, equipment and storage medium
CN110225373A (en) * | 2019-06-13 | 2019-09-10 | 腾讯科技(深圳)有限公司 | Video reviewing method, device and electronic equipment
CN110225373B (en) * | 2019-06-13 | 2023-01-24 | 腾讯科技(深圳)有限公司 | Video auditing method and device and electronic equipment |
CN110647905A (en) * | 2019-08-02 | 2020-01-03 | 杭州电子科技大学 | Method for identifying terrorist-related scene based on pseudo brain network model |
CN110647905B (en) * | 2019-08-02 | 2022-05-13 | 杭州电子科技大学 | Method for identifying terrorist-related scene based on pseudo brain network model |
WO2021051607A1 (en) * | 2019-09-18 | 2021-03-25 | 平安科技(深圳)有限公司 | Video data-based fraud detection method and apparatus, computer device, and storage medium |
CN110781916A (en) * | 2019-09-18 | 2020-02-11 | 平安科技(深圳)有限公司 | Video data fraud detection method and device, computer equipment and storage medium |
CN111079816A (en) * | 2019-12-11 | 2020-04-28 | 北京金山云网络技术有限公司 | Image auditing method and device and server |
CN110990631A (en) * | 2019-12-16 | 2020-04-10 | 腾讯科技(深圳)有限公司 | Video screening method and device, electronic equipment and storage medium |
CN111090776A (en) * | 2019-12-20 | 2020-05-01 | 广州市百果园信息技术有限公司 | Video auditing method, device, auditing server and storage medium |
CN111090776B (en) * | 2019-12-20 | 2023-06-30 | 广州市百果园信息技术有限公司 | Video auditing method and device, auditing server and storage medium |
CN111770352A (en) * | 2020-06-24 | 2020-10-13 | 北京字节跳动网络技术有限公司 | Security detection method and device, electronic equipment and storage medium |
CN111770353A (en) * | 2020-06-24 | 2020-10-13 | 北京字节跳动网络技术有限公司 | Live broadcast monitoring method and device, electronic equipment and storage medium |
CN111813399A (en) * | 2020-07-23 | 2020-10-23 | 平安医疗健康管理股份有限公司 | Machine learning-based auditing rule processing method and device and computer equipment |
CN114157906A (en) * | 2020-09-07 | 2022-03-08 | 北京达佳互联信息技术有限公司 | Video detection method and device, electronic equipment and storage medium |
CN114157906B (en) * | 2020-09-07 | 2024-04-02 | 北京达佳互联信息技术有限公司 | Video detection method, device, electronic equipment and storage medium |
CN112579771A (en) * | 2020-12-08 | 2021-03-30 | 腾讯科技(深圳)有限公司 | Content title detection method and device |
CN112579771B (en) * | 2020-12-08 | 2024-05-07 | 腾讯科技(深圳)有限公司 | Content title detection method and device |
CN114760484A (en) * | 2021-01-08 | 2022-07-15 | 腾讯科技(深圳)有限公司 | Live video identification method and device, computer equipment and storage medium |
CN114760484B (en) * | 2021-01-08 | 2023-11-07 | 腾讯科技(深圳)有限公司 | Live video identification method, live video identification device, computer equipment and storage medium |
CN115460433A (en) * | 2021-06-08 | 2022-12-09 | 京东方科技集团股份有限公司 | Video processing method and device, electronic equipment and storage medium |
CN115460433B (en) * | 2021-06-08 | 2024-05-28 | 京东方科技集团股份有限公司 | Video processing method and device, electronic equipment and storage medium |
CN114915779A (en) * | 2022-04-08 | 2022-08-16 | 阿里巴巴(中国)有限公司 | Video quality evaluation method, device, equipment and storage medium |
CN114979051B (en) * | 2022-04-18 | 2023-08-15 | 中移互联网有限公司 | Message processing method and device, electronic equipment and storage medium |
CN114979051A (en) * | 2022-04-18 | 2022-08-30 | 中移互联网有限公司 | Message processing method and device, electronic equipment and storage medium |
CN115297360A (en) * | 2022-09-14 | 2022-11-04 | 百鸣(北京)信息技术有限公司 | Intelligent auditing system for multimedia software video uploading |
CN115834935A (en) * | 2022-12-21 | 2023-03-21 | 阿里云计算有限公司 | Multimedia information auditing method, advertisement auditing method, equipment and storage medium |
CN116824107A (en) * | 2023-07-13 | 2023-09-29 | 北京万物镜像数据服务有限公司 | Processing method, device and equipment for three-dimensional model review information |
CN116824107B (en) * | 2023-07-13 | 2024-03-19 | 北京万物镜像数据服务有限公司 | Processing method, device and equipment for three-dimensional model review information |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109495766A (en) | Video audit method, apparatus, equipment and storage medium | |
WO2021051607A1 (en) | Video data-based fraud detection method and apparatus, computer device, and storage medium | |
CN111462735A (en) | Voice detection method and device, electronic equipment and storage medium | |
CN114694076A (en) | Multi-modal emotion analysis method based on multi-task learning and stacked cross-modal fusion | |
CN105957531B (en) | Speech content extraction method and device based on cloud platform | |
CN110717324B (en) | Judgment document answer information extraction method, device, extractor, medium and equipment | |
CN112182229A (en) | Text classification model construction method, text classification method and device | |
CN110472548B (en) | Video continuous sign language recognition method and system based on grammar classifier | |
CN110991165A (en) | Method and device for extracting character relation in text, computer equipment and storage medium | |
CN113035311A (en) | Medical image report automatic generation method based on multi-mode attention mechanism | |
CN116955699B (en) | Video cross-mode search model training method, searching method and device | |
CN110188195A (en) | Text intention recognition method, device and equipment based on deep learning | |
CN110232564A (en) | Automatic traffic accident legal judgment method based on multi-modal data | |
CN117079299A (en) | Data processing method, device, electronic equipment and storage medium | |
US11238289B1 (en) | Automatic lie detection method and apparatus for interactive scenarios, device and medium | |
CN113076720B (en) | Long text segmentation method and device, storage medium and electronic device | |
CN114372532A (en) | Method, device, equipment, medium and product for determining label marking quality | |
CN113762503A (en) | Data processing method, device, equipment and computer readable storage medium | |
CN113378826B (en) | Data processing method, device, equipment and storage medium | |
CN116186258A (en) | Text classification method, equipment and storage medium based on multi-mode knowledge graph | |
CN112199954B (en) | Disease entity matching method and device based on voice semantics and computer equipment | |
CN114780757A (en) | Short media label extraction method and device, computer equipment and storage medium | |
CN115273856A (en) | Voice recognition method and device, electronic equipment and storage medium | |
CN114064968A (en) | News subtitle abstract generating method and system | |
CN113837910B (en) | Test question recommending method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190319 |