CN110427519A - Video processing method and apparatus - Google Patents

Video processing method and apparatus

Info

Publication number
CN110427519A
CN110427519A (application CN201910704248.9A)
Authority
CN
China
Prior art keywords
interactive information
video
negative
text
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910704248.9A
Other languages
Chinese (zh)
Inventor
白肇强
白雪峰
程文文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201910704248.9A
Publication of CN110427519A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/70 - Information retrieval of video data
    • G06F 16/73 - Querying
    • G06F 16/735 - Filtering based on additional data, e.g. user or group profiles
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/70 - Information retrieval of video data
    • G06F 16/78 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/783 - Retrieval using metadata automatically derived from the content
    • G06F 16/7844 - Retrieval using original textual content or text extracted from visual content or transcript of audio data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a video processing method and apparatus. The method includes: obtaining at least one piece of interactive information corresponding to a video, the interactive information being used to characterize a user's sentiment orientation toward the video; determining the emotional category of the at least one piece of interactive information according to a mapping relationship between interactive information and emotional categories; obtaining, from the at least one piece of interactive information, the interactive information whose emotional category is the negative category; and, when the negative-category interactive information satisfies a preset condition, determining that the video is a target video carrying negative content. By means of the invention, videos carrying negative content can be identified automatically.

Description

Video processing method and apparatus
Technical field
The present invention relates to Internet technology, and more particularly to a video processing method and apparatus.
Background Art
With the development of the Internet, videos carrying all kinds of content continually pour into users' view, among which there is no shortage of videos carrying negative content, such as false or deceptive news, which give users a very poor viewing experience.
In the related art, videos carrying negative content are identified mainly by playing the videos and reviewing them manually, which wastes a great deal of manpower and material resources.
Summary of the invention
Embodiments of the present invention provide a video processing method capable of automatically identifying videos that carry negative content.
The technical solutions of the embodiments of the present invention are implemented as follows:
An embodiment of the present invention provides a video processing method, including:
obtaining at least one piece of interactive information corresponding to a video, the interactive information being used to characterize a user's sentiment orientation toward the video;
determining the emotional category of the at least one piece of interactive information according to a mapping relationship between interactive information and emotional categories;
obtaining, from the at least one piece of interactive information, the interactive information whose emotional category is the negative category;
when the negative-category interactive information satisfies a preset condition, determining that the video is a target video carrying negative content.
An embodiment of the present invention provides a video processing apparatus, including:
an information acquisition unit, configured to obtain at least one piece of interactive information corresponding to a video, the interactive information being used to characterize a user's sentiment orientation toward the video;
a category determination unit, configured to determine the emotional category of the at least one piece of interactive information according to a mapping relationship between interactive information and emotional categories;
an information screening unit, configured to obtain, from the at least one piece of interactive information, the interactive information whose emotional category is the negative category;
a video determination unit, configured to determine, when the negative-category interactive information satisfies a preset condition, that the video is a target video carrying negative content.
In the above solution, the category determination unit is further configured to, when the interactive information includes text interactive information, obtain the word vector sequence corresponding to the text interactive information;
input the word vector sequence corresponding to the text interactive information into a sentiment classification model, and output the emotional category corresponding to the text interactive information.
In the above solution, the apparatus further includes a model training unit, configured to construct a training sample set, the training sample set including interactive information samples and the reference emotional categories corresponding to the interactive information samples;
input the word vector sequences corresponding to the interactive information samples into the sentiment classification model, and output the target emotional categories corresponding to the interactive information samples;
update the model parameters of the sentiment classification model based on the difference between the target emotional categories and the reference emotional categories.
In the above solution, the category determination unit is further configured to perform word segmentation on the text interactive information to obtain the word set of the interactive information;
obtain the word vector corresponding to each word in the word set;
combine the word vectors corresponding to the words in the word set into the word vector sequence corresponding to the text interactive information.
In the above solution, the information screening unit is further configured to, when the interactive information includes multiple pieces of text interactive information, obtain text information associated with the video;
determine the semantic similarity between each piece of text interactive information and the text information;
screen the multiple pieces of text interactive information according to the semantic similarity to obtain a target text interactive information set.
In the above solution, the video determination unit is further configured to obtain the text interactive information and the non-text interactive information in the negative-category interactive information;
determine a first quantity of the text interactive information and a second quantity of the non-text interactive information;
when the first quantity satisfies a first quantity condition and the second quantity satisfies a second quantity condition, determine that the video is a target video carrying negative content.
In the above solution, the video determination unit is further configured to determine the quantity of the interactive information whose emotional category is the negative category;
when the quantity of the negative-category interactive information satisfies a preset condition, determine that the video is a target video carrying negative content.
In the above solution, the video determination unit is further configured to obtain the total quantity of the interactive information corresponding to the video;
when the ratio of the quantity of the negative-category interactive information to the total quantity of the interactive information reaches a preset threshold, determine that the video is a target video carrying negative content.
In the above solution, the apparatus further includes a video processing unit, configured to delete the target video; or,
send prompt information for the target video to a terminal, the prompt information being used to indicate that the target video carries negative content.
An embodiment of the present invention provides a server, including:
a memory, configured to store executable instructions;
a processor, configured to implement the video processing method provided by the embodiments of the present invention when executing the executable instructions stored in the memory.
An embodiment of the present invention provides a storage medium storing executable instructions which, when executed by a processor, cause the processor to implement the video processing method provided by the embodiments of the present invention.
The embodiments of the present invention have the following beneficial effects:
Through the mapping relationship between interactive information and emotional categories, the emotional category of at least one piece of interactive information is determined, so that the interactive information whose emotional category is the negative category is obtained from the at least one piece of interactive information; when the negative-category interactive information satisfies a preset condition, the video is determined to be a target video carrying negative content. In this way, the negative-category interactive information can be determined automatically according to the mapping relationship between interactive information and emotional categories, and then, according to the negative-category interactive information, it can be determined whether the video is a target video carrying negative content, thereby achieving automatic identification of videos carrying negative content.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the network architecture of a video processing system provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a video processing apparatus provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a video processing method provided by an embodiment of the present invention;
Fig. 4 is a schematic flowchart of a video processing method provided by an embodiment of the present invention;
Fig. 5 is a schematic flowchart of a video processing method provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a video processing apparatus provided by an embodiment of the present invention.
Detailed Description of Embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings. The described embodiments are not to be construed as limiting the present invention, and all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
In the following description, reference is made to "some embodiments", which describes subsets of all possible embodiments; it is to be understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other where no conflict arises.
Where descriptions such as "first/second" appear in the application documents, the following note applies: in the description below, the terms "first" and "second" are used merely to distinguish similar objects and do not represent a particular ordering of the objects. It is to be understood that, where permitted, the specific order or sequence of "first/second" may be interchanged, so that the embodiments of the invention described herein can be implemented in an order other than that illustrated or described herein.
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field to which the present invention belongs. The terms used herein are merely for the purpose of describing the embodiments of the present invention and are not intended to limit the present invention.
Before the embodiments of the present invention are described in further detail, the nouns and terms involved in the embodiments of the present invention are explained; they are subject to the following interpretations.
1) Word segmentation: the process of recombining a continuous character sequence into a word sequence according to certain specifications.
2) Pipeline: decomposing the solution of a complex problem into individual processing units that are processed in sequence, where the output of the previous processing unit is the input of the next.
3) Negative content: content such as false or deceptive news, content that does not match its title, and politically sensitive content; such content easily gives rise to complaints or reports and damages a product's reputation.
The video processing system of the embodiments of the present invention is described first. Fig. 1 is a schematic diagram of the network architecture of a video processing system provided by an embodiment of the present invention. Referring to Fig. 1, in order to support an exemplary application, the video processing system 100 includes terminals (including terminal 400-1 and terminal 400-2) and a server 200. A terminal connects to the server 200 through a network 300; the network 300 may be a wide area network or a local area network, or a combination of the two, and uses wireless links to implement data transmission.
The server 200 is configured to obtain at least one piece of interactive information corresponding to a video, the interactive information being used to characterize a user's sentiment orientation toward the video; determine the emotional category of the at least one piece of interactive information according to a mapping relationship between interactive information and emotional categories; obtain, from the at least one piece of interactive information, the interactive information whose emotional category is the negative category; when the negative-category interactive information satisfies a preset condition, determine that the video is a target video carrying negative content; and send prompt information for the target video to a terminal, the prompt information being used to indicate that the target video carries negative content.
The terminals (terminal 400-1 and/or terminal 400-2) are configured to receive the prompt information for the target video.
Next, the video processing apparatus provided by the embodiments of the present invention is described. The video processing apparatus of the embodiments of the present invention may be implemented in various forms, for example, implemented by a terminal alone, such as a smartphone, tablet computer, or desktop computer, or implemented cooperatively by a terminal and a server. The video processing apparatus of the embodiments of the present invention may be implemented as hardware or as a combination of software and hardware; various exemplary implementations of the video processing apparatus provided by the embodiments of the present invention are described below.
The hardware structure of the video processing apparatus of the embodiments of the present invention is now described in detail. Fig. 2 is a schematic structural diagram of the video processing apparatus provided by an embodiment of the present invention. It is to be understood that Fig. 2 shows only an exemplary structure of the video processing apparatus rather than the entire structure, and part of the structure or the entire structure shown in Fig. 2 may be implemented as required.
The video processing apparatus provided by the embodiments of the present invention includes: at least one processor 201, a memory 202, a user interface 203, and at least one network interface 204. The various components in the video processing apparatus 20 are coupled together through a bus system 205. It is to be understood that the bus system 205 is used to implement connection and communication between these components. In addition to a data bus, the bus system 205 also includes a power bus, a control bus, and a status signal bus. For clarity of description, however, the various buses are all labelled as the bus system 205 in Fig. 2.
The user interface 203 may include a display, a keyboard, a mouse, a trackball, a click wheel, keys, buttons, a touch pad, a touch screen, or the like.
It is to be understood that the memory 202 may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memories. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), a flash memory, or the like. The volatile memory may be a random access memory (RAM) serving as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as static random access memory (SRAM) and synchronous static random access memory (SSRAM). The memory 202 described in the embodiments of the present invention is intended to include these and any other suitable types of memory.
The memory 202 in the embodiments of the present invention can store data to support the operation of the server 200. Examples of such data include any computer program for running on the server 200, such as an operating system and application programs. The operating system includes various system programs, such as a framework layer, a core library layer, and a driver layer, used to implement various basic services and process hardware-based tasks. The application programs may include various application programs.
As an example in which the video processing apparatus provided by the embodiments of the present invention is implemented by a combination of software and hardware, the video processing apparatus may be directly embodied as a combination of software modules executed by the processor 201. The software modules may be located in a storage medium, the storage medium is located in the memory 202, and the processor 201 reads the executable instructions included in the software modules in the memory 202 and, in combination with necessary hardware (for example, including the processor 201 and other components connected to the bus 205), completes the video processing method provided by the embodiments of the present invention.
As an example, the processor 201 may be an integrated circuit chip with signal processing capability, for example, a general-purpose processor, a digital signal processor (DSP), another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, where the general-purpose processor may be a microprocessor or any conventional processor.
As an example in which the video processing apparatus provided by the embodiments of the present invention is implemented by hardware, the apparatus may be executed directly by the processor 201 in the form of a hardware decoding processor, for example, executed by one or more application-specific integrated circuits (ASICs), DSPs, programmable logic devices (PLDs), complex programmable logic devices (CPLDs), field-programmable gate arrays (FPGAs), or other electronic components to implement the video processing method provided by the embodiments of the present invention.
The memory 202 in the embodiments of the present invention is used to store various types of data to support the operation of the video processing apparatus 20. Examples of such data include any executable instructions for running on the video processing apparatus 20; a program implementing the video processing method of the embodiments of the present invention may be contained in the executable instructions.
The video processing method provided by the embodiments of the present invention is described with reference to an exemplary application and implementation of the server provided by the embodiments of the present invention. Fig. 3 is a schematic flowchart of the video processing method provided by an embodiment of the present invention. Referring to Fig. 3, the video processing method provided by the embodiments of the present invention includes:
Step 301: The server obtains at least one piece of interactive information corresponding to a video, the interactive information being used to characterize a user's sentiment orientation toward the video.
In practical applications, there are many types of interactive information corresponding to a video, such as comments, reposts, likes, dislikes, and shares. The server may obtain one or more kinds of interactive information corresponding to the video, and there may be multiple pieces of each kind of interactive information; for example, the server may obtain all the comments corresponding to the video, or all the comments and reposts corresponding to the video.
Here, sentiment orientation denotes the user's subjective inner tendency of liking or disliking, and of evaluating, the video, such as liking, supporting, disliking, or being angered by the video.
Step 302: Determine the emotional category of the at least one piece of interactive information according to a mapping relationship between interactive information and emotional categories.
It should be noted that the emotional categories include a positive category and a negative category: the positive category represents, for example, support for or liking of the video, and the negative category represents, for example, dislike of or anger toward the video. Here, there are multiple pieces of interactive information, and the emotional category of each piece is determined according to the mapping relationship between interactive information and emotional categories.
In some embodiments, when the interactive information includes non-text interactive information, the server may determine the emotional category of the non-text interactive information according to a mapping relationship between attributes of the non-text interactive information and emotional categories. For example, when the interactive information is a like, the corresponding emotional category is positive; when the interactive information is a dislike, the corresponding emotional category is negative.
In some embodiments, when the interactive information includes text interactive information, the server may determine the emotional category of the interactive information in the following manner: obtain the word vector sequence corresponding to the text interactive information, input the word vector sequence into a trained sentiment classification model, and output the emotional category corresponding to the text interactive information.
In actual implementation, the text interactive information is a comment. The word vector sequence corresponding to the comment is input into the trained sentiment classification model to obtain the probability that the comment is negative. A probability threshold is set: when the obtained probability is greater than the threshold, the emotional category of the comment is determined to be negative; when the obtained probability is less than the threshold, the emotional category of the comment is determined to be positive. For example, with a probability threshold of 0.5, if inputting the word vector sequence of a comment into the trained sentiment classification model yields a probability of 0.8, the emotional category of the comment is negative; if the obtained probability is 0.2, the emotional category of the comment is positive.
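As an illustration of the two mapping rules above, attribute-based mapping for non-text interactions and probability thresholding for comments, consider the following minimal Python sketch; the function and constant names are hypothetical, and only the 0.5 threshold and the 0.8/0.2 examples come from this description:

    # A minimal sketch, not the claimed implementation.
    NON_TEXT_CATEGORY = {"like": "positive", "dislike": "negative"}

    def category_of_non_text(interaction_type):
        # Attribute-based mapping for non-text interactive information.
        return NON_TEXT_CATEGORY.get(interaction_type)

    def category_of_text(negative_probability, threshold=0.5):
        # Probability above the threshold means negative, below means positive.
        return "negative" if negative_probability > threshold else "positive"

    # Examples from the description: a comment scored 0.8 is negative,
    # one scored 0.2 is positive.
    assert category_of_text(0.8) == "negative"
    assert category_of_text(0.2) == "positive"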
In some embodiments, the server obtains the word vectors of the text interactive information in the following manner: perform word segmentation on the text interactive information to obtain the word set of the text interactive information; obtain the word vector corresponding to each word in the word set; and combine the word vectors corresponding to the words in the word set into the word vector sequence of the text interactive information.
Here, word segmentation refers to the process of recombining a continuous character sequence into a word sequence according to certain specifications. For example, the interactive information "I love literature" is segmented into "I", "love", "literature".
In some embodiments, the server may segment the text interactive information using a segmentation method based on string matching, that is, matching the character sequence in the text interactive information against the entries in a machine dictionary. In other embodiments, the server may segment the text interactive information using a segmentation method based on understanding, that is, performing syntactic and semantic analysis while segmenting, and using syntactic and semantic information to resolve ambiguity. In still other embodiments, the server may segment the text interactive information using a segmentation method based on statistics, that is, learning the rules of word segmentation from a large amount of already-segmented text through a statistical machine learning model, so as to segment unknown text. In actual implementation, an existing segmentation tool, such as jieba, SnowNLP, or THULAC, may be used to segment the text interactive information.
In actual implementation, the server may use the Chinese Word Vectors corpus open-sourced by Beijing Normal University and Renmin University of China to obtain the word vector corresponding to each word in the word set. The corpus includes word vectors trained on dozens of corpora from various domains (Baidu Baike, Wikipedia, People's Daily 1947-2017, Zhihu, Weibo, literature, finance, classical Chinese, etc.), covering many fields and offering a variety of training settings.
In some embodiments, word segmentation and word vector conversion may be performed through a pipeline model, so that when a new segmentation tool or set of word vectors appears, the previous module can be replaced and the model rebuilt using the new module.
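A minimal sketch of such a segmentation-plus-word-vector pipeline, assuming jieba for segmentation and a pretrained Chinese Word Vectors file loaded with gensim; the corpus file name is hypothetical:

    import jieba
    from gensim.models import KeyedVectors

    # Hypothetical path to a pretrained Chinese Word Vectors file.
    word_vectors = KeyedVectors.load_word2vec_format(
        "sgns.weibo.word.txt", binary=False)

    def to_word_vector_sequence(text):
        # Step 1: segment the continuous character sequence into words,
        # e.g. "I love literature" into three words.
        words = jieba.lcut(text)
        # Step 2: look up the pretrained vector for each word, skipping
        # out-of-vocabulary words.
        return [word_vectors[w] for w in words if w in word_vectors]

Because segmentation and vector lookup are separate stages, either module can be swapped for a newer tool without touching the other, which is the pipeline property described above.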
In some embodiments, the sentiment classification model is trained in the following manner: construct a training sample set, the training sample set including interactive information samples and the reference emotional categories corresponding to the interactive information samples; input the word vector sequences corresponding to the interactive information samples into the sentiment classification model to obtain the target emotional categories corresponding to the interactive information samples; and update the model parameters of the sentiment classification model based on the difference between the target emotional categories and the reference emotional categories.
In some embodiments, the server may implement the updating of the sentiment classification model in the following manner:
Based on the output target emotional category and the reference emotional category, the server determines the value of a loss function and judges whether the value exceeds a preset threshold. When the value of the loss function exceeds the preset threshold, the server determines an error signal of the sentiment classification model based on the loss function, back-propagates the error information through the sentiment classification model, and updates the model parameters of each layer during the propagation.
Back-propagation is explained here: training sample data is input to the input layer of the neural network model, passes through the hidden layers, and finally reaches the output layer, where the result is output; this is the forward propagation process of the sentiment classification model. Since there is an error between the output result of the sentiment classification model and the actual result, the error between the output result and the actual value is calculated and back-propagated from the output layer through the hidden layers to the input layer; during back-propagation, the values of the model parameters are adjusted according to the error. This process is iterated until convergence.
In some implementations, the loss function of the sentiment classification model is, for example, a binary cross-entropy: L(ω1(θ)) = -[y1 · log ŷ1 + (1 - y1) · log(1 - ŷ1)],
where ŷ1 is the target emotional category output by the model, y1 is the reference emotional category, and ω1(θ) denotes the model parameters of the current iteration.
In actual implementation, by analyzing content that receives a lot of negative feedback every day, the server can find the negative comments that users post with high frequency, label them as samples of negative sentiment, construct rules based on them, and use the rules to extract more similar comments as new samples of negative sentiment, so that a large amount of sample data with negative sentiment can be obtained quickly.
In some embodiments, the server shuffles the obtained sample set at random and divides it into a training sample set, a validation sample set, and a test sample set according to a certain proportion; it then uses the training sample set and the validation sample set to tune the parameters of a single model, longitudinally compares the prediction accuracy of different parameters, and selects the parameter combination with the highest accuracy, as illustrated by the sketch below.
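A minimal sketch of the shuffle-and-split step, assuming scikit-learn and an 8:1:1 split ratio (the ratio itself is not specified in this description):

    from sklearn.model_selection import train_test_split

    # Placeholder sample set; in practice, labelled (comment, category) pairs.
    samples = [("comment %d" % i, i % 2) for i in range(1000)]

    # Shuffle and split; 8:1:1 is an assumed proportion.
    train, rest = train_test_split(samples, test_size=0.2,
                                   shuffle=True, random_state=42)
    validation, test = train_test_split(rest, test_size=0.5, random_state=42)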
In some embodiments, the server may train multiple different models simultaneously, obtain the parameter combination with the highest accuracy for each model, and then cross-compare the models under their respective best parameter combinations to determine the optimal model; the emotional category of the interactive information is then determined by the optimal model. For example, the server may simultaneously train a text convolutional neural network (text-CNN) model, a gated recurrent unit (GRU) model, and a long short-term memory (LSTM) model, and find through cross-comparison that the LSTM model has the highest accuracy.
In actual implementation, the sentiment classification model can be built using TensorFlow + Keras. TensorFlow is a framework that implements and executes machine learning algorithms in the form of tensors (Tensor) flowing (Flow) on a computation graph (Graph). TensorFlow is flexible and portable, supports automatic differentiation and multiple languages, and has high performance. Keras is an open-source artificial neural network library written in Python that can serve as a high-level application programming interface for TensorFlow, Microsoft CNTK, and Theano, for the design, debugging, evaluation, application, and visualization of deep learning models. Therefore, a sentiment classification model can be built quickly with TensorFlow + Keras.
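A minimal sketch of an LSTM sentiment classifier built with TensorFlow/Keras, consistent with the description above; the layer sizes, sequence length, and word-vector dimension are assumptions:

    from tensorflow import keras

    # Assumed dimensions: sequences of up to 100 word vectors of size 300;
    # the single sigmoid output is the probability that a comment is negative.
    model = keras.Sequential([
        keras.layers.Masking(mask_value=0.0, input_shape=(100, 300)),  # skip padding
        keras.layers.LSTM(128),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",  # matches the loss given above
                  metrics=["accuracy"])
    # model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=5)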
Step 303: According to the determined emotional categories, obtain, from the at least one piece of interactive information, the interactive information whose emotional category is the negative category.
In some embodiments, when the interactive information includes multiple pieces of text interactive information, the server obtains text information associated with the video, determines the semantic similarity between each piece of text interactive information and the text information, and screens the multiple pieces of text interactive information according to the semantic similarity to obtain a target text interactive information set.
In practical applications, the text information associated with the video may be the title, abstract, tags, etc. of the video. When there are multiple kinds of text information, the semantic similarity between the text interactive information and each kind of text information is determined separately; for example, when the text information is the title and the abstract, the semantic similarity between the text interactive information and the title and the semantic similarity between the text interactive information and the abstract are obtained respectively. When any of the similarities meets a preset threshold, the text interactive information is added to the target text interactive information set.
In actual implementation, the server may obtain the semantic similarity between the text interactive information and the text information through a trained neural network model: the text interactive information and the text information are input into the neural network model, and the semantic similarity between the two is output.
In some embodiments, the neural network model may be trained in the following manner: construct a training sample set, the training sample set including text interactive information samples, text information samples associated with videos, and the reference semantic similarities between the text interactive information and the text information; input the word vector sequences corresponding to the text interactive information samples and the word vector sequences corresponding to the text information samples into the neural network model to obtain the target semantic similarity between the text interactive information samples and the text information samples; determine the value of the loss function of the neural network model based on the output target semantic similarity and the reference semantic similarity; and update the model parameters of the neural network model based on the value of the loss function.
The loss function of the neural network model is, for example, a squared error: L(ω2(θ)) = (ŷ2 - y2)²,
where ŷ2 is the output target semantic similarity, y2 is the reference semantic similarity, and ω2(θ) denotes the current model parameters.
It should be noted that the semantic similarity between the text interactive information and the text information characterizes the degree of correlation between the text interactive information and the video; the interactive information is screened by semantic similarity, rejecting text interactive information with a low degree of correlation with the video.
In some embodiments, the sentiment classification model may be a dual-output model: its input includes not only the text interactive information but also the text information of the video, and inputting the text interactive information and the text information into the trained sentiment classification model outputs both the emotional category corresponding to the text interactive information and the similarity between the text interactive information and the text information.
In actual implementation, since the sentiment classification model is a dual-output model, it has two loss functions, corresponding respectively to the two outputs. During training, the parameters of the sentiment classification model are updated per output, that is, according to each of the two loss functions respectively.
In this way, the embodiments of the present invention only need to train one sentiment classification model to simultaneously obtain the emotional category corresponding to the text interactive information and the similarity between the text interactive information and the text information.
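A sketch of such a dual-output model in the Keras functional style, with one output for the comment's emotional category and one for its similarity to the video's title or abstract; the layer sizes and the fusion by concatenation are assumptions:

    from tensorflow import keras

    comment_in = keras.layers.Input(shape=(100, 300), name="comment")
    title_in = keras.layers.Input(shape=(30, 300), name="title")

    comment_vec = keras.layers.LSTM(128)(comment_in)
    title_vec = keras.layers.LSTM(128)(title_in)
    merged = keras.layers.Concatenate()([comment_vec, title_vec])

    # Output 1: emotional category of the comment (probability of negative).
    category_out = keras.layers.Dense(
        1, activation="sigmoid", name="category")(comment_vec)
    # Output 2: semantic similarity between comment and title.
    similarity_out = keras.layers.Dense(
        1, activation="sigmoid", name="similarity")(merged)

    model = keras.Model([comment_in, title_in], [category_out, similarity_out])
    # Each output has its own loss, and the parameters are updated per loss,
    # as described above.
    model.compile(optimizer="adam",
                  loss={"category": "binary_crossentropy", "similarity": "mse"})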
In some implementations, the server may obtain the text interactive information of the corresponding video whose emotional category is the negative category in the following manner: according to the determined emotional categories, determine, from the target text interactive information set, the text interactive information whose emotional category is the target category.
In actual implementation, what the server obtains is text interactive information that is related to the video content and whose emotional category is the negative category; in this way, mindless abusive text interactive information can be weeded out, improving the accuracy of determining that a video is a target video carrying negative content.
Step 304: When the negative-category interactive information satisfies a preset condition, determine that the video is a target video carrying negative content.
In some embodiments, the server determines the quantity of the interactive information whose emotional category is the negative category; when the quantity of the target-category interactive information satisfies a preset condition, the video is determined to be a target video carrying negative content.
In some embodiments, the server may set a quantity threshold and judge whether the quantity of the target-category interactive information is greater than the preset quantity threshold; if so, the video is determined to be a target video carrying negative content; otherwise, the video is not a target video carrying negative content. For example, with the quantity threshold set to 100 and the interactive information being comments, the video is determined to be a target video carrying negative content when it has more than 100 negative comments.
In some embodiments, the server obtains the quantity of the target-category interactive information and the total quantity of the interactive information corresponding to the video; when the ratio of the quantity of the negative-category interactive information to the total quantity of the interactive information reaches a preset ratio threshold, the video is determined to be a target video. For example, with the ratio threshold set to 30%, if the video has 100 comments in total, of which 40 are negative, the ratio of negative comments to total comments is 40%, and the video is determined to be a target video carrying negative information.
In some embodiments, the server may combine the quantity of the target-category interactive information with the ratio of that quantity to the total quantity of interactive information to determine whether the video is a target video carrying negative information: when the ratio of the quantity of the target-category interactive information to the total quantity of interactive information reaches the preset ratio threshold, and the quantity of the target-category interactive information meets the preset quantity threshold, the video is determined to be a target video.
For example, with the ratio threshold set to 30% and the quantity threshold set to 100, when there are 5 negative comments out of 10 comments in total, the video cannot be determined to be a target video even though the ratio meets the preset ratio threshold.
In some embodiments, the server obtains the text interactive information and the non-text interactive information in the negative-category interactive information, determines a first quantity of the text interactive information and a second quantity of the non-text interactive information, and, when the first quantity satisfies a first quantity condition and the second quantity satisfies a second quantity condition, determines that the video is a target video.
In actual implementation, the negative-category non-text interactive information may be dislikes: when the quantity of negative comments satisfies the first quantity condition and the quantity of dislikes satisfies the second quantity condition, the video is determined to be a video carrying negative content. For example, if the first quantity condition is "greater than 100" and the second quantity condition is "greater than 500", the video is determined to be a video carrying negative content when there are 200 negative comments and 600 dislikes.
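A minimal sketch combining the three decision rules described above (quantity threshold, ratio threshold, and the combined text/non-text rule), using the example thresholds from this description; the function itself is an illustration, not the claimed implementation:

    def is_target_video(negative_comments, total_comments, negative_dislikes,
                        count_threshold=100, ratio_threshold=0.3,
                        dislike_threshold=500):
        # Rule 1: enough negative comments in absolute terms.
        enough_negative = negative_comments > count_threshold
        # Rule 2: negative comments are a large enough share of all comments.
        ratio_ok = (total_comments > 0 and
                    negative_comments / total_comments >= ratio_threshold)
        # Rule 3 (text + non-text): negative comments and dislikes both
        # exceed their respective quantity conditions.
        text_and_non_text_ok = (negative_comments > count_threshold and
                                negative_dislikes > dislike_threshold)
        return (enough_negative and ratio_ok) or text_and_non_text_ok

    # From the examples above: 200 negative comments and 600 dislikes pass.
    assert is_target_video(200, 400, 600)
    # 5 negative of 10 total meets the ratio but not the count: not a target.
    assert not is_target_video(5, 10, 0)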
In some embodiments, when the quantity of the target-category interactive information satisfies the preset condition, the server may further judge whether posterior data satisfies a preset condition. Here, the posterior data may be user page views (PV, Page View), visit views (VV, Visit View), the number of likes, the number of comments, the number of reposts, and so on.
For example, when the interactive information is comments and the quantity of comments is determined to satisfy the preset condition, the server further judges whether the user page views satisfy a condition; only when the user page views also satisfy the condition can the video be determined to be a target video.
By judging whether the posterior data satisfies a preset condition, the embodiments of the present invention improve the validity of the interactive data and thereby improve the accuracy of determining that a video is a target video.
In some embodiments, when determining that the video is a target video carrying negative content, the server deletes the target video; or, it sends prompt information for the target video to a terminal, the prompt information being used to indicate that the target video carries negative content.
In actual implementation, the server may delete the target video from the server's video repository, preventing users from watching a video carrying negative content; or, when a user plays the target video, send the prompt information for the target video to the terminal, reminding the user that the video carries negative content; or feed the target video back to an administrator, who operates on the target video, for example, restricting its playback rights.
Through the mapping relationship between interactive information and emotional categories, the embodiments of the present invention determine the emotional category of at least one piece of interactive information, so as to obtain the interactive information whose emotional category is the negative category; when the negative-category interactive information satisfies a preset condition, the video is determined to be a target video carrying negative content. In this way, the negative-category interactive information can be determined automatically according to the mapping relationship between interactive information and emotional categories, and then, according to the negative-category interactive information, it can be determined whether the video is a target video carrying negative content, achieving automatic identification of videos carrying negative content.
Taking comments as the interactive information, the video processing method provided by the present invention is described below. Fig. 4 is a schematic flowchart of the video processing method provided by an embodiment of the present invention. Referring to Fig. 4, the video processing method provided by the embodiments of the present invention includes:
Step 401: The server obtains the title of a video in the video repository and all the comments corresponding to the video.
Step 402: Using the jieba segmentation tool, perform word segmentation on the title and on each comment to obtain word sets corresponding to the title and to each comment.
Step 403: According to the Chinese Word Vectors corpus, obtain the word vector corresponding to each word in each word set, resulting in word vector sequences corresponding to the title and to each comment.
Step 404: Input the word vector sequences corresponding to the title and each comment into the sentiment classification model, and output the similarity between each comment and the title, as well as the emotional category corresponding to each comment.
Here, the word vector sequence of the title and the word vector sequence of one comment are input into the sentiment classification model at a time, outputting the similarity between that comment and the title and the emotional category corresponding to the comment. Through repeated inputs, every comment is input into the sentiment classification model, so that the similarity between each comment and the title and the emotional category of each comment are output.
In actual implementation, the model can be trained in the following manner: construct a training sample set, the training sample set including the titles of video samples, comments on the video samples, the emotional categories of the comments, and the reference similarities between the comments and the titles; input the word vector sequences corresponding to the comments and the word vector sequences corresponding to the titles into the sentiment classification model to obtain the target emotional categories of the comments and the similarities between the comments and the titles; and update the model parameters of the sentiment classification model based respectively on the difference between the output target emotional categories and the reference emotional categories, and on the difference between the output target similarities and the reference similarities.
Step 405: According to the similarity between each comment and the title, filter out from all the comments those whose similarity with the title exceeds a preset threshold, forming a target comment set; a sketch of steps 404-405 follows.
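Continuing the dual-output model sketch above, inference over the comments of one video and the similarity filtering of steps 404-405 might look as follows; the 0.5 similarity threshold is an assumption:

    import numpy as np

    def score_comments(model, title_seq, comment_seqs, similarity_threshold=0.5):
        # Run the dual-output model once per (title, comment) pair.
        results = []
        for comment_seq in comment_seqs:
            category_prob, similarity = model.predict(
                [comment_seq[np.newaxis], title_seq[np.newaxis]], verbose=0)
            results.append((float(category_prob[0, 0]), float(similarity[0, 0])))
        # Keep only comments sufficiently similar to the title (step 405).
        return [r for r in results if r[1] > similarity_threshold]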
Step 406: According to the emotional category corresponding to each comment, filter out from the target comment set the comments whose emotional category is negative.
Step 407: Obtain the quantity of comments screened out whose emotional category is negative, and the total number of comments on the video.
Step 408: Calculate the ratio of the quantity of negative comments to the total number of comments.
Step 409: When the ratio is determined to be greater than 30 percent, obtain the video's PV, VV, number of likes, number of comments, and number of reposts.
Step 410: When the video's PV, VV, number of likes, number of comments, and number of reposts are all greater than 500, determine that the video is a target video.
Step 411: Feed the target video back to the administrator client.
Step 412: The administrator sends an instruction to delete the target video through the client.
Step 413: The server deletes the target video from the video repository.
In the following, the video processing method provided by the embodiments of the present invention is described further. Fig. 5 is a schematic flowchart of the video processing method provided by an embodiment of the present invention. In actual implementation, the video processing method includes a model training part and a part in which the trained model is applied to process videos.
First, the model training part is described.
Here, in actual implementation, the server extracts some typical comments and labels them as positive or negative, using the extracted comments and their positive/negative labels as a sample set. At the same time, it obtains the title or abstract of the video to which each comment belongs, as well as the degree of correlation between each comment and its video, so that the extracted video comments, the positive/negative labels of the comments, the video titles or abstracts, and the degrees of correlation between the comments and the videos form the sample set. By analyzing content that receives a lot of negative feedback every day, the server finds the negative comments that users post with high frequency, labels them as negative samples, constructs rules based on them, and uses the rules to extract more similar comments as new negative samples, so that a large number of negatively-labelled samples can be obtained quickly.
In actual implementation, the data in the sample set is shuffled at random, and the sample set is divided into a training set, a validation set, and a test set according to a certain proportion.
Next, the sentiment classification model is built using TensorFlow + Keras, and its network structure is adaptively adjusted to suit the business. Here, the input of the sentiment classification model includes the comment and the abstract or title of the video, and the output is the positive/negative prediction of the comment together with the degree of correlation between the comment and the video, so as to identify mindless abusive comments.
Here, multiple sentiment classification models are built; the training set and the validation set are used to tune parameters on each model for longitudinal comparison, the test set is then used to evaluate the effect of each model, the best-performing parameter combination is chosen for each model, and the models are cross-compared under their optimal parameters to finally determine the optimal model. For example, in actual implementation, a text-CNN model, a GRU model, and an LSTM model may be built simultaneously, with cross-comparison showing that the LSTM model has the highest accuracy.
It should be noted that what is input into the model is the word vector sequence corresponding to the comment and the word vector sequence corresponding to the title or abstract, obtained through a segmentation tool and a word vector conversion tool.
In actual implementation, open-source segmentation and word vector tools are pipelined: the segmentation tool used is jieba, and the word vectors used are from the Chinese Word Vectors corpus open-sourced by Beijing Normal University and Renmin University of China. Through the pipeline, when a new segmentation tool or set of word vectors appears, the previous module can be replaced and the model rebuilt with the new module.
Then, the part in which the trained model is applied to process videos is described.
First, the massive comment data of the entire network is obtained, together with the title or abstract of the video corresponding to each comment. Each comment and its corresponding title or abstract are input into the trained affective feature model, which outputs the emotional category of each comment and its degree of correlation with the video. The comments are then grouped by video to obtain the proportion of negative comments for each video; when determining this proportion, negative comments unrelated to the video are eliminated. Videos whose proportion of negative comments exceeds a preset value are screened, and secondary screening is then performed according to posterior data, such as PV/VV, the number of likes, the number of comments, and the number of reposts. For example, the k videos with the most comments can be fed back to the administrator.
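A minimal sketch of this application stage, grouping scored comments per video, discarding comments unrelated to the video, computing the negative-comment ratio, and applying a posterior-data check; all thresholds and field layouts are assumptions:

    from collections import defaultdict

    def screen_videos(scored_comments, posterior_data,
                      similarity_threshold=0.5, ratio_threshold=0.3,
                      pv_threshold=500):
        # scored_comments: (video_id, is_negative, similarity_to_video) tuples
        # produced by the trained model; posterior_data: video_id -> page views.
        negative, total = defaultdict(int), defaultdict(int)
        for video_id, is_negative, similarity in scored_comments:
            if similarity < similarity_threshold:
                continue  # eliminate comments unrelated to the video
            total[video_id] += 1
            negative[video_id] += int(is_negative)
        flagged = []
        for video_id, n in total.items():
            if negative[video_id] / n >= ratio_threshold:
                # Secondary screening on posterior data such as page views.
                if posterior_data.get(video_id, 0) > pv_threshold:
                    flagged.append(video_id)
        return flagged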
Next, the video processing apparatus provided by the embodiments of the present invention is described. In some embodiments, the video processing apparatus may be implemented as software modules. Fig. 6 is a schematic structural diagram of the video processing apparatus provided by an embodiment of the present invention. Referring to Fig. 6, the video processing apparatus includes:
an information acquisition unit 601, configured to obtain at least one piece of interactive information corresponding to a video, the interactive information being used to characterize a user's sentiment orientation toward the video;
a category determination unit 602, configured to determine the emotional category of the at least one piece of interactive information according to a mapping relationship between interactive information and emotional categories;
an information screening unit 603, configured to obtain, from the at least one piece of interactive information, the interactive information whose emotional category is the negative category;
a video determination unit 604, configured to determine, when the negative-category interactive information satisfies a preset condition, that the video is a target video carrying negative content.
An embodiment of the present invention further provides a server, including:
a memory, configured to store executable instructions;
a processor, configured to implement the video processing method provided by the embodiments of the present invention when executing the executable instructions stored in the memory.
In some embodiments, the executable instructions may take the form of a program, software, a software module, a script, or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and may be deployed in any form, including as an independent program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
As an example, the executable instructions may, but do not necessarily, correspond to a file in a file system, and may be stored as part of a file that holds other programs or data, for example, in one or more scripts within a Hypertext Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (for example, files storing one or more modules, subprograms, or code sections).
As an example, the executable instructions may be deployed to be executed on one computing device, or on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network.
The above are merely embodiments of the present invention and are not intended to limit the protection scope of the present invention. Any modifications, equivalent replacements, and improvements made within the spirit and scope of the present invention are all included within the protection scope of the present invention.

Claims (10)

1. A video processing method, characterized in that the method comprises:
obtaining at least one piece of interactive information corresponding to a video, the interactive information being used to characterize a user's sentiment orientation toward the video;
determining the emotional category of the at least one piece of interactive information according to a mapping relationship between interactive information and emotional categories;
obtaining, from the at least one piece of interactive information, the interactive information whose emotional category is the negative category;
when the negative-category interactive information satisfies a preset condition, determining that the video is a target video carrying negative content.
2. The method according to claim 1, characterized in that the determining the emotional category of the at least one piece of interactive information according to a mapping relationship between interactive information and emotional categories comprises:
when the interactive information includes text interactive information, obtaining the word vector sequence corresponding to the text interactive information;
inputting the word vector sequence corresponding to the text interactive information into a sentiment classification model, and outputting the emotional category corresponding to the text interactive information.
3. The method according to claim 2, characterized in that the method further comprises:
constructing a training sample set, the training sample set comprising interactive information samples and reference emotion categories corresponding to the interactive information samples;
inputting word vector sequences corresponding to the interactive information samples into the sentiment classification model, and outputting target emotion categories corresponding to the interactive information samples; and
updating model parameters of the sentiment classification model based on differences between the target emotion categories and the reference emotion categories.
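As an illustration only, one training step under the same PyTorch assumption could look as follows; batches of word vector sequences and their reference emotion categories are assumed to be available.

import torch
import torch.nn as nn
import torch.nn.functional as F

def train_step(model: nn.Module, optimizer: torch.optim.Optimizer,
               vectors: torch.Tensor, reference_categories: torch.Tensor) -> float:
    logits = model(vectors)                               # target emotion categories (logits)
    loss = F.cross_entropy(logits, reference_categories)  # difference from the reference categories
    optimizer.zero_grad()
    loss.backward()                                       # back-propagate the difference
    optimizer.step()                                      # update the model parameters
    return loss.item()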
4. The method according to claim 2, characterized in that the obtaining a word vector sequence corresponding to the text interactive information comprises:
performing word segmentation on the text interactive information to obtain a word set of the text interactive information;
obtaining a word vector corresponding to each word in the word set; and
combining the word vectors corresponding to the words in the word set into the word vector sequence corresponding to the text interactive information.
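As an illustration only, the segmentation and lookup could be sketched as follows; jieba is one common Chinese word-segmentation library, and word_vectors stands for any pretrained embedding table (both are assumptions, not requirements of the claim).

import jieba
import numpy as np

def to_word_vector_sequence(text: str, word_vectors: dict, embed_dim: int = 100) -> np.ndarray:
    words = list(jieba.cut(text))  # word segmentation -> word set
    # Look up a vector for each word; unknown words fall back to a zero vector.
    vectors = [word_vectors.get(word, np.zeros(embed_dim)) for word in words]
    return np.stack(vectors) if vectors else np.zeros((0, embed_dim))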
5. The method according to claim 1, characterized in that the method further comprises:
when the interactive information comprises a plurality of pieces of text interactive information, obtaining text information associated with the video;
determining a semantic similarity between each piece of text interactive information and the text information; and
filtering the plurality of pieces of text interactive information according to the semantic similarities to obtain a target set of text interactive information.
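As an illustration only, the similarity-based screening could be sketched as follows; embed is a hypothetical sentence-embedding function (for example, the mean of the word vectors), and the 0.3 threshold is an arbitrary example value.

import numpy as np

def filter_by_similarity(comments, video_text, embed, threshold=0.3):
    reference = embed(video_text)  # e.g. an embedding of the video title or description

    def cosine(a, b):
        denominator = np.linalg.norm(a) * np.linalg.norm(b)
        return float(np.dot(a, b) / denominator) if denominator else 0.0

    # Keep only the comments whose semantics relate to the video itself.
    return [c for c in comments if cosine(embed(c), reference) >= threshold]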
6. The method according to claim 1, characterized in that the determining, when the interactive information of the negative category meets a preset condition, that the video is a target video carrying negative content comprises:
obtaining the text interactive information and the non-text interactive information in the interactive information of the negative category;
determining a first quantity of the text interactive information and a second quantity of the non-text interactive information; and
determining, when the first quantity meets a first quantity condition and the second quantity meets a second quantity condition, that the video is a target video carrying negative content.
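As an illustration only, the two-condition test could be sketched as follows; is_text distinguishes text interactions (e.g. comments) from non-text interactions (e.g. dislikes), and the two minimum values are hypothetical examples, not values from the patent.

def meets_dual_quantity_condition(negative_infos, is_text,
                                  min_text: int = 10, min_non_text: int = 50) -> bool:
    first_quantity = sum(1 for info in negative_infos if is_text(info))  # text interactions
    second_quantity = len(negative_infos) - first_quantity               # non-text interactions
    return first_quantity >= min_text and second_quantity >= min_non_text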
7. The method according to claim 1, characterized in that the determining, when the interactive information of the negative category meets a preset condition, that the video is a target video carrying negative content comprises:
determining a quantity of the interactive information whose emotion category is the negative category; and
determining, when the quantity of the interactive information of the negative category meets a preset condition, that the video is a target video carrying negative content.
8. The method according to claim 7, characterized in that the determining, when the interactive information of the negative category meets a preset condition, that the video is a target video carrying negative content comprises:
obtaining a total quantity of the interactive information corresponding to the video; and
determining, when a ratio of the quantity of the interactive information of the negative category to the total quantity of the interactive information reaches a preset threshold, that the video is a target video carrying negative content.
9. The method according to claim 1, characterized in that the method further comprises:
deleting the target video; or
sending prompt information for the target video to a terminal, the prompt information being used to prompt that the target video carries negative content.
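As an illustration only, the two handling options could be sketched as follows; delete_video and send_prompt are hypothetical platform-side operations supplied by the caller, not APIs defined by the patent.

def handle_target_video(video_id: str, delete: bool, delete_video, send_prompt) -> None:
    if delete:
        delete_video(video_id)  # remove the flagged video
    else:
        send_prompt(video_id, "This video carries negative content.")  # notify the terminal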
10. A video processing apparatus, characterized in that the apparatus comprises:
an information obtaining unit, configured to obtain at least one piece of interactive information corresponding to a video, the interactive information being used to characterize a user's sentiment toward the video;
a category determination unit, configured to determine an emotion category of the at least one piece of interactive information according to a mapping relationship between interactive information and emotion categories;
an information filtering unit, configured to obtain, from the at least one piece of interactive information, interactive information whose emotion category is a negative category; and
a video determination unit, configured to determine, when the interactive information of the negative category meets a preset condition, that the video is a target video carrying negative content.
CN201910704248.9A (priority date 2019-07-31, filing date 2019-07-31) The processing method and processing device of video, published as CN110427519A (en), status: Pending

Priority Applications / Applications Claiming Priority (1)

Application Number   Priority Date  Filing Date  Title
CN201910704248.9A    2019-07-31     2019-07-31   The processing method and processing device of video

Publications (1)

Publication Number  Publication Date
CN110427519A        2019-11-08

Family ID: 68413543

Family Applications (1)

Application Number   Title                                                 Priority Date  Filing Date  Status
CN201910704248.9A    The processing method and processing device of video  2019-07-31     2019-07-31   Pending

Country Status (1)

Country  Link
CN       CN110427519A (en)



Patent Citations (8)

* Cited by examiner, † Cited by third party

Publication number  Priority date  Publication date  Assignee  Title
US20080249764A1 *  2007-03-01  2008-10-09  Microsoft Corporation  Smart Sentiment Classifier for Product Reviews
CN101593273A *  2009-08-13  2009-12-02  北京邮电大学  Video emotional content recognition method based on fuzzy comprehensive evaluation
CN104809108A *  2015-05-20  2015-07-29  成都布林特信息技术有限公司  Information monitoring and analyzing system
CN107807914A *  2016-09-09  2018-03-16  阿里巴巴集团控股有限公司  Sentiment orientation recognition method, object classification method, and data processing system
CN107016107A *  2017-04-12  2017-08-04  四川九鼎瑞信软件开发有限公司  Public opinion analysis method and system
CN107547555A *  2017-09-11  2018-01-05  北京匠数科技有限公司  Website security monitoring method and device
CN109145151A *  2018-06-20  2019-01-04  北京达佳互联信息技术有限公司  Method and device for obtaining sentiment classification of a video
CN109271512A *  2018-08-29  2019-01-25  中国平安保险(集团)股份有限公司  Sentiment analysis method, apparatus, and storage medium for public opinion comment information

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party

Title
Li Hui et al., "Sentiment classification techniques for online video comments on the Internet," 信息网络安全 (Netinfo Security), no. 05, 10 May 2019 *
Hu Chuanzhi et al., "Research on an adaptive multiple-filtering model for sensitive network information," 计算机科学 (Computer Science), no. 01, 15 January 2015 *

Cited By (4)

* Cited by examiner, † Cited by third party

Publication number  Priority date  Publication date  Assignee  Title
CN111553171A *  2020-04-09  2020-08-18  北京小米松果电子有限公司  Corpus processing method, apparatus, and storage medium
CN111553171B *  2020-04-09  2024-02-06  北京小米松果电子有限公司  Corpus processing method, corpus processing device, and storage medium
CN112784048A *  2021-01-26  2021-05-11  海尔数字科技(青岛)有限公司  Method, apparatus, device, and storage medium for sentiment analysis of user questions
CN112784048B *  2021-01-26  2023-03-28  海尔数字科技(青岛)有限公司  Method, apparatus, device, and storage medium for sentiment analysis of user questions


Legal Events

Code  Title
PB01  Publication
SE01  Entry into force of request for substantive examination