Detailed Description
The disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the disclosure, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the disclosure.
It should be noted that, in the absence of conflict, the embodiments of the disclosure and the features in the embodiments may be combined with one another. The disclosure is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 in which embodiments of the method or apparatus for generating a classification model, or of the video classification method or apparatus, of the disclosure may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102 and 103, a network 104, and a server 105. The network 104 serves as a medium providing communication links between the terminal devices 101, 102 and 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
A user may use the terminal devices 101, 102 and 103 to interact with the server 105 through the network 104 to receive or send messages. Various communication client applications may be installed on the terminal devices 101, 102 and 103, such as video playback applications, video processing applications, web browser applications, and social platform software.
The terminal devices 101, 102 and 103 may be hardware or software. When the terminal devices 101, 102 and 103 are hardware, they may be various electronic devices. When they are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, software or modules for providing distributed services) or as a single piece of software or a single software module. No specific limitation is made here.
The server 105 may be a server providing various services, for example a background model server that performs model training using sample sets uploaded by the terminal devices 101, 102 and 103. The background model server may use an acquired sample set to perform model training and generate a classification model; it may also send the classification model to a terminal device, or use the classification model to process a video to be classified and obtain classification information for that video.
It should be noted that the method for generating a classification model provided by the embodiments of the disclosure may be executed by the server 105 or by the terminal devices 101, 102 and 103; correspondingly, the apparatus for generating a classification model may be arranged in the server 105 or in the terminal devices 101, 102 and 103. Likewise, the video classification method provided by the embodiments of the disclosure may be executed by the server 105 or by the terminal devices 101, 102 and 103, and the video classification apparatus may correspondingly be arranged in the server 105 or in the terminal devices 101, 102 and 103.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, software or modules for providing distributed services) or as a single piece of software or a single software module. No specific limitation is made here.
It should be understood that the numbers of terminal devices, networks and servers in Fig. 1 are merely illustrative. Any number of terminal devices, networks and servers may be provided as the implementation requires. In cases where the sample set needed for training the model, or the video to be classified, does not need to be obtained remotely, the system architecture may include no network, and only a server or a terminal device.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for generating a classification model according to the disclosure is shown. The method for generating a classification model comprises the following steps:
Step 201: acquire a sample set.
In this embodiment, the executing body of the method for generating a classification model (for example, the server or terminal device shown in Fig. 1) may acquire the sample set remotely through a wired or wireless connection, or acquire it locally. Each sample in the sample set corresponds to preset classification information, and the classification information characterizes whether the sample is a positive sample or a negative sample. Specifically, a positive sample is a sample belonging to one of at least two preset categories, and a negative sample is a sample not belonging to one of the at least two preset categories. The classification information corresponding to a positive sample characterizes the category to which the positive sample belongs; the classification information corresponding to a negative sample characterizes the category, indicated by that classification information, to which the negative sample does not belong.
In this embodiment, the samples in the sample set may be of various types, including but not limited to any of the following: videos, images, text, and so on.
As an example, suppose the samples in the sample set are videos. The classification information corresponding to one positive sample may be "1", characterizing that the video type of that positive sample is "cat" (i.e., the video contains pictures depicting a cat); the classification information corresponding to another positive sample may be "2", characterizing that its video type is "dog". Suppose the classification information corresponding to one negative sample is "101", which corresponds to the video type "cat" and characterizes that the negative sample is not a video of the "cat" type; the classification information corresponding to another negative sample is "102", which corresponds to the video type "dog" and characterizes that the negative sample is not a video of the "dog" type.
In some optional implementations of this embodiment, the classification information of a sample in the sample set is represented by a vector with a preset number of elements, where each element corresponds to a category among multiple preset categories.
As an example, a target element in the vector corresponding to a positive sample characterizes that the positive sample belongs to its corresponding category, and a target element in the vector corresponding to a negative sample characterizes that the negative sample does not belong to its corresponding category; the target element is the element at the position that has been associated in advance with the category corresponding to the sample. Suppose the samples are videos and the preset number is 200. For a positive sample whose video category is "cat", the corresponding classification information may be the vector (1, 0, 0, 0, ..., 0) of 200 elements, where the first element (the target element) corresponds to the "cat" class. Here the number 1 indicates that the video belongs to the "cat" class, and each 0 indicates that the video does not belong to the category corresponding to that element's position. Similarly, for another positive sample whose video category is "dog", the corresponding classification information may be the vector (0, 1, 0, 0, ..., 0), where the second element corresponds to the "dog" class.
In addition, for a negative sample, the corresponding classification information may be the vector (0, 0, 0, 0, ..., 0, 1, 0, ..., 0) corresponding to the "cat" class, where the 101st element (the target element) is the number 1 and the other elements are the number 0, indicating that the negative sample does not belong to the "cat" class. For another negative sample, the corresponding classification information may be the vector (0, 0, 0, 0, ..., 0, 0, 1, ..., 0) corresponding to the "dog" class, where the 102nd element is the number 1, indicating that the negative sample does not belong to the "dog" class. In general, the number of video categories associated with the negative samples in the sample set is less than or equal to the number of video categories associated with the positive samples. For example, if the video categories of the positive samples comprise the three categories "cat", "dog" and "fox", the video categories of the negative samples may comprise at least one of those three categories.
It should be noted that the values in the vector may also be values other than 0 and 1. Characterizing classification information with vectors allows the set of categories the classification model recognizes to be extended flexibly. For example, suppose only 10 categories need to be recognized in practice while the vector contains more than 10 elements, with the 1st through 10th elements corresponding to the preset categories. When the classification model later needs to recognize more categories, it suffices to assign categories to the remaining elements, so that the recognition capability of the classification model can be extended flexibly.
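The index-vector encoding described above can be sketched as follows. The 200-element length, the class indices, and the offset of 100 for the "not this class" slots are assumptions drawn from the examples in this section, not a normative layout:

```python
import numpy as np

NUM_CLASSES = 200  # preset number of elements, per the "cat"/"dog" example


def positive_label(class_index: int) -> np.ndarray:
    """Vector for a positive sample: a 1 at the position of the class it belongs to."""
    v = np.zeros(NUM_CLASSES)
    v[class_index] = 1.0
    return v


def negative_label(class_index: int, offset: int = 100) -> np.ndarray:
    """Vector for a negative sample: a 1 at an offset position paired with the
    class the sample does NOT belong to (slot 100, i.e. the 101st element,
    pairs with class 0, "cat", in this sketch)."""
    v = np.zeros(NUM_CLASSES)
    v[class_index + offset] = 1.0
    return v


# "cat" is class 0 and "dog" is class 1 in this sketch
cat_pos = positive_label(0)   # (1, 0, 0, ..., 0)
dog_pos = positive_label(1)   # (0, 1, 0, ..., 0)
cat_neg = negative_label(0)   # 1 at position 100: "not a cat video"
```

Because the categories live in vector positions, extending the model to new categories only requires assigning meanings to so-far-unused slots.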
Step 202: select a sample from the sample set.
In this embodiment, the executing body may select a sample from the sample set in various ways, for example by random selection, or in a preset numbering order of the samples.
Step 203: determine, based on the classification information of the selected sample, whether the selected sample is a positive sample.
In this embodiment, the executing body may determine whether the selected sample is a positive sample based on the classification information of the selected sample.
Step 204: in response to determining that the selected sample is a positive sample, adjust the parameters of the classification model using a first loss function, by taking the selected sample as the input of the classification model and the classification information of the selected sample as the desired output of the classification model.
In this embodiment, the executing body may, in response to determining that the selected sample is a positive sample, take the selected sample as the input of the classification model, take the classification information of the selected sample as the desired output of the classification model, and adjust the parameters of the classification model using the first loss function. The first loss function may include various loss functions for training classification models.
Specifically, the classification model may be a model of any of various types, such as a recurrent neural network model or a convolutional neural network model. During training of the classification model, an actual output is obtained for each positive or negative sample used as training input; the actual output is the data actually produced by the classification model, characterizing classification information. The executing body may then use gradient descent and backpropagation: based on the actual output and the desired output, it determines a loss value with the loss function, which characterizes the gap between the actual output and the desired output, and adjusts the parameters of the classification model according to the loss value (the parameters are adjusted so that the loss value gradually decreases).
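The adjustment loop just described (actual output versus desired output, loss value, gradient step) can be illustrated on a toy example. The linear "model", squared loss and learning rate below are stand-ins for illustration only, not the disclosure's classification model:

```python
import numpy as np


def train_step(w, x, desired, lr=0.1):
    """One gradient-descent update: compute the actual output, measure its gap
    to the desired output with a loss value, and step the parameters so the
    loss value shrinks (toy linear model, squared loss)."""
    actual = w @ x                                  # actual output of the model
    loss = 0.5 * np.sum((actual - desired) ** 2)    # gap between actual and desired
    grad_w = np.outer(actual - desired, x)          # backpropagated gradient
    return w - lr * grad_w, loss


rng = np.random.default_rng(0)
w = rng.standard_normal((2, 3))                     # initial parameters
x = np.array([1.0, 0.5, -0.5])                      # one training input
desired = np.array([1.0, 0.0])                      # desired output
losses = []
for _ in range(50):
    w, loss = train_step(w, x, desired)
    losses.append(loss)
# the loss value decreases gradually as the parameters are adjusted
```

The same structure — forward pass, loss against the desired output, gradient step — underlies the neural-network training the text refers to; only the model and loss differ.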
In this embodiment, the classification model may include a feature extraction layer and a classification layer. The feature extraction layer extracts features of the input sample (for example, when the sample is a video or an image, the features may include color, shape, texture, and the like) and produces feature data characterizing the features of the sample. The classification layer may be any of various classifiers for classifying feature data (for example, a support vector machine or a softmax classification function).
Step 205: in response to determining that the selected sample is a negative sample, adjust the parameters of the classification model using a second loss function, by taking the selected sample as the input of the classification model and the classification information of the selected sample as the desired output of the classification model.
In this embodiment, the executing body may, in response to determining that the selected sample is a negative sample, take the selected sample as the input of the classification model, take the classification information of the selected sample as the desired output of the classification model, and adjust the parameters of the classification model using the second loss function. The second loss function may include various loss functions for training classification models.
In some optional implementations of this embodiment, the first loss function is a multi-class classification loss function and the second loss function is a binary classification loss function.
In general, because the classification information corresponding to an input positive sample characterizes the category to which the positive sample belongs, and the classification model can usually recognize multiple categories, the executing body may determine the loss value for an input positive sample using the multi-class classification loss function; the determined loss value characterizes the gap between the actual output and the desired output. As an example, the multi-class classification loss function may be a cross-entropy loss function. The executing body may use gradient descent and backpropagation to adjust the parameters of the classification model based on the multi-class classification loss function.
Because the classification information corresponding to an input negative sample characterizes whether the negative sample belongs to the category indicated by that classification information, the actual output of the model characterizes one of two recognition results (belongs or does not belong). The executing body may therefore determine the loss value for an input negative sample using the binary classification loss function. As an example, the binary classification loss function may also be a cross-entropy loss function; it should be noted that although both the binary classification loss function and the multi-class classification loss function here are cross-entropy losses, the two cross-entropy loss functions take different forms. The executing body may use gradient descent and backpropagation to adjust the parameters of the classification model based on the binary classification loss function.
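One plausible concrete form of the two cross-entropy losses — multi-class for positive samples, binary for negative samples — is sketched below. The softmax/sigmoid parameterization is an assumption for illustration; the disclosure only states that both losses are cross-entropy losses of different forms:

```python
import numpy as np


def softmax(z):
    e = np.exp(z - z.max())  # shift for numerical stability
    return e / e.sum()


def multiclass_ce(logits, true_class):
    """Multi-class cross-entropy, used for positive samples: penalizes low
    probability on the one category the sample belongs to."""
    return -np.log(softmax(logits)[true_class])


def binary_ce(logit, belongs):
    """Binary cross-entropy, used for negative samples: scores the two-way
    question "does the sample belong to this category or not?"."""
    p = 1.0 / (1.0 + np.exp(-logit))
    return -(belongs * np.log(p) + (1 - belongs) * np.log(1 - p))


logits = np.array([2.0, 0.5, -1.0])          # one logit per category
loss_pos = multiclass_ce(logits, 0)          # positive sample of category 0
loss_neg = binary_ce(logits[0], belongs=0)   # negative sample: "not category 0"
```

Both are cross-entropy losses, but the multi-class form normalizes over all categories jointly while the binary form looks at a single category's logit, which matches the different shapes of positive- and negative-sample labels.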
It should be noted that the classification layer of the classification model may include multiple binary classifiers and one multi-class classifier, each binary classifier corresponding to one category. The parameters of the binary classifiers and the multi-class classifier are shared, so that when training with either the binary or the multi-class classification loss function, the parameters of both can be optimized simultaneously. When the trained classification model is used for video classification, the multi-class classifier is used to perform the classification.
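One way to realize the parameter sharing between the binary classifiers and the multi-class classifier is a single logit matrix read through either a per-class sigmoid (binary) or a joint softmax (multi-class). This wiring is only an interpretation of the shared-parameter description, not something the disclosure specifies:

```python
import numpy as np


class SharedHead:
    """Classification layer sketch: one logit per category, shared by the
    binary classifiers and the multi-class classifier, so a gradient step
    through either loss updates the same weights."""

    def __init__(self, num_classes, feat_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((num_classes, feat_dim)) * 0.1

    def binary(self, features, class_idx):
        """P(sample belongs to category class_idx) - paired with the binary loss."""
        return 1.0 / (1.0 + np.exp(-self.W[class_idx] @ features))

    def multi(self, features):
        """Distribution over all categories - paired with the multi-class loss
        and used at inference time."""
        z = self.W @ features
        e = np.exp(z - z.max())
        return e / e.sum()
```

Under this reading, training on positive samples (softmax path) and on negative samples (sigmoid path) both shape the same weight matrix `W`, which is why the two kinds of samples reinforce each other.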
In some optional implementations of this embodiment, the sample set is a set of video samples and the selected sample is a video sample. The executing body may, in response to determining that the selected sample is a positive sample, adjust the parameters of the classification model as follows:
First, extract a set of sample video frames from the selected video sample. Specifically, the executing body may extract the set of sample video frames from the selected video sample in various ways.
In some optional implementations of this embodiment, the executing body may extract the set of sample video frames from the selected video sample in at least one of the following ways:
Mode one: extract the set of key frames of the selected video sample as the set of sample video frames. It can be understood that when the selected video sample is a positive sample, the extracted set is a set of positive-sample video frames, and when the selected video sample is a negative sample, the extracted set is a set of negative-sample video frames. The executing body may extract the key frames according to existing methods for extracting key frames from a video.
Mode two: extract the set of sample video frames from the selected video sample at a preset playback interval (for example, 10 seconds). Through this implementation, a limited number of video frames can be extracted from a video sample for classifying it, which reduces the computational load of the model and improves the efficiency of model training.
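Mode two above (sampling at a preset playback interval) reduces to an index computation over the video's frames. The sketch below omits the actual frame decoding, and the frame count, frame rate and 10-second interval are illustrative assumptions:

```python
def sample_frames(total_frames: int, fps: float, interval_s: float = 10.0):
    """Pick frame indices at a fixed playback interval. A real extractor would
    then decode just these frames (e.g. with a video library) instead of
    processing every frame, which is what cuts the computational load."""
    step = max(1, int(round(interval_s * fps)))  # frames per interval
    return list(range(0, total_frames, step))


# a 2-minute clip at 25 fps, sampled every 10 seconds -> 12 frames
indices = sample_frames(total_frames=3000, fps=25.0)
```

Only these indices, rather than all 3000 frames, would be passed on to the feature extraction layer.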
Then, extract feature data from the set of sample video frames. Specifically, the set of sample video frames may be input into the feature extraction layer of the classification model, and the feature extraction layer outputs the feature data.
Finally, based on the extracted feature data and the classification information of the selected video sample, adjust the parameters of the classification model using the first loss function. In general, the feature data may be taken as the input of the classification layer of the classification model, and the classification information as the desired output of the classification layer; the gap between the actual output and the desired output is determined using the first loss function, and the parameters of the classification model are adjusted so that the gap decreases.
In some optional implementations of this embodiment, the executing body may, in response to determining that the selected sample is a negative sample, adjust the parameters of the classification model as follows:
First, extract a set of sample video frames from the selected video sample.
Then, extract feature data from the extracted set of sample video frames.
Finally, based on the extracted feature data and the classification information of the selected video sample, adjust the parameters of the classification model using the second loss function.
It should be noted that, apart from the loss function used, the steps of this implementation are essentially identical to the method described above for adjusting the parameters of the classification model using positive samples, and are not repeated here.
In practice, the process of training the classification model is usually executed in a loop: the executing body determines that training of the classification model is complete when the model with adjusted parameters satisfies a preset condition, and continues selecting samples and adjusting the parameters of the classification model while the preset condition is not satisfied. As an example, the preset condition may include, but is not limited to, at least one of the following: the training time exceeds a preset duration; the number of training iterations exceeds a preset count; the loss values computed with the first loss function and the second loss function are respectively smaller than a preset first loss threshold and a preset second loss threshold.
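The looped procedure of steps 202-205 with its preset stopping conditions might be organized as follows. `fit_positive` and `fit_negative` are hypothetical stand-ins for the first-loss and second-loss parameter updates; the step budget and loss thresholds are illustrative:

```python
import random


def train(samples, model, max_steps=10_000, loss_threshold=1e-3):
    """Keep selecting samples and adjusting parameters until a preset
    condition holds: here, an iteration budget, or both loss values
    falling below their thresholds."""
    pos_loss = neg_loss = float("inf")
    for _ in range(max_steps):
        sample = random.choice(samples)            # step 202: select a sample
        if sample["is_positive"]:                  # step 203: positive or negative?
            pos_loss = model.fit_positive(sample)  # step 204: first (multi-class) loss
        else:
            neg_loss = model.fit_negative(sample)  # step 205: second (binary) loss
        if pos_loss < loss_threshold and neg_loss < loss_threshold:
            break                                  # preset condition satisfied
    return model
```

A wall-clock limit could be added alongside the step budget to realize the "training time exceeds a preset duration" condition.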
With continued reference to Fig. 3, Fig. 3 is a schematic diagram of one application scenario of the method for generating a classification model according to this embodiment. In the application scenario of Fig. 3, an electronic device 301 first acquires a sample set 302 whose samples are video samples. The sample set 302 includes positive video samples and negative video samples. Each positive video sample corresponds to preset classification information characterizing the video category to which the positive video sample belongs; each negative video sample corresponds to preset classification information characterizing that the negative sample does not belong to the video category corresponding to the classification information. For example, the classification information 3021' of the positive video sample 3021 characterizes that the positive video sample 3021 belongs to the video category "cat"; the classification information 3022' of the positive video sample 3022 characterizes that it belongs to the video category "dog"; the classification information 3023' of the negative video sample 3023 characterizes that it does not belong to the video category "cat"; and the classification information 3024' of the negative video sample 3024 characterizes that it does not belong to the video category "dog".
Then, the electronic device 301 selects video samples one by one from the sample set in a preset numbering order and executes the following training step with each selected video sample: in response to determining that the selected sample is a positive video sample, take the selected sample as the input of the classification model, take its classification information as the desired output of the classification model, and adjust the parameters of the classification model using the first loss function (for example, a multi-class classification loss function, L1 in the figure); in response to determining that the selected sample is a negative sample, take the selected sample as the input of the classification model, take its classification information as the desired output of the classification model, and adjust the parameters of the classification model using the second loss function (for example, a binary classification loss function, L2 in the figure).
As shown in the figure, the classification model is trained with the video samples 3021, 3022, 3023 and 3024. After training on each video sample, the classification model retains the adjusted parameters and continues training with the other video samples. After each training pass, the electronic device 301 determines whether a preset end-of-training condition is satisfied (for example, the loss values computed with the multi-class classification loss function and the binary classification loss function are respectively smaller than a preset multi-class loss threshold and a preset binary loss threshold). If the end-of-training condition is satisfied, the classification model with the most recently adjusted parameters is determined as the classification model 303.
The method provided by the above embodiments of the disclosure acquires a sample set in which each sample has corresponding classification information characterizing whether the sample is a positive or a negative sample, then selects samples from the sample set, training the classification model with the first loss function when the selected sample is a positive sample and with the second loss function when it is a negative sample, and finally obtains the trained classification model. The first and second loss functions thus optimize the model in a targeted manner using different types of training samples, which helps to improve the accuracy with which the trained classification model classifies various kinds of information. Compared with the prior art, which usually requires a large number of training samples to train a model, the embodiments of the disclosure can improve classification accuracy with fewer training samples and reduce the demand for training samples; this can improve the efficiency of model training, help reduce the storage resources spent acquiring large numbers of training samples, and reduce the processor time occupied when training the model.
With further reference to Fig. 4, a flow 400 of one embodiment of the video classification method according to the disclosure is shown. The video classification method comprises the following steps:
Step 401: acquire a video to be classified.
In this embodiment, the executing body of the video classification method (for example, the server or terminal device shown in Fig. 1) may acquire the video to be classified locally or remotely. The video to be classified is a video that is to be classified, for example a video extracted from the videos provided by a video playback website or a video playback application.
Step 402: input the video to be classified into a classification model trained in advance, to generate classification information characterizing the video category to which the video to be classified belongs.
In this embodiment, the executing body may input the video to be classified into the classification model trained in advance, generating classification information that characterizes the video category to which the video to be classified belongs. The generated classification information may include, but is not limited to, information in at least one of the following forms: text, numbers, symbols.
In this embodiment, the classification model is generated according to the method described in the embodiment corresponding to Fig. 2; reference may be made to the steps described in that embodiment, which are not repeated here. It should be noted that the classification model includes a classification layer comprising binary classifiers and a multi-class classifier; in this embodiment, the trained classification model classifies the input video using the multi-class classifier.
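Inference with the multi-class classifier can be sketched as follows. The mean-pooling of per-frame features, the linear head, and the label list are assumptions for illustration, not details given by the disclosure:

```python
import numpy as np


def classify_video(frame_features: np.ndarray, head_weights: np.ndarray, labels):
    """Pool the per-frame feature data, apply the multi-class classifier head,
    and return a human-readable video category."""
    pooled = frame_features.mean(axis=0)   # aggregate features across frames
    logits = head_weights @ pooled         # multi-class classifier output
    return labels[int(np.argmax(logits))]  # category with the highest score


labels = ["cat", "dog"]
feats = np.array([[1.0, 0.0], [0.8, 0.2]])  # two frames, 2-dim features each
W = np.array([[1.0, 0.0], [0.0, 1.0]])      # toy multi-classifier weights
print(classify_video(feats, W, labels))     # -> cat
```

The returned label is the "classification information" of step 402, here rendered as text.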
Optionally, the executing body may output the classification information in various ways, for example by displaying it on a display screen of the executing body, or by sending the classification information and the video to be classified to other electronic devices communicatively connected to the executing body.
In some optional implementations of this embodiment, the executing body may also send the video to be classified to the terminals of users associated with the video category characterized by the classification information. Specifically, the correspondence between users and video categories may be characterized in forms such as a two-dimensional table, a linked list, or pointers. By establishing this correspondence, users can be grouped into classes, each class corresponding to at least one video category. For example, suppose a class corresponds to video categories of pet types such as "cat" and "dog"; when the classification information of a video to be classified, obtained with the steps above, characterizes a video of the "cat" type, the executing body may send the video to be classified to the terminals used by the users whose user names have been assigned to that class in advance. Because this implementation exploits the high accuracy with which the classification model classifies videos, videos can be pushed to users' terminals in a more targeted manner, saving network resources as well as the time and data traffic users spend searching for videos of interest.
The method provided by the above embodiment of the disclosure classifies a video to be classified using the classification model generated in the embodiment corresponding to Fig. 2 and generates classification information for the video, thereby improving the accuracy of the classification information generated for videos.
With further reference to Fig. 5, as an implementation of the method shown in Fig. 2, the disclosure provides one embodiment of an apparatus for generating a classification model. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus may specifically be applied in various electronic devices.
As shown in Fig. 5, the apparatus 500 for generating a classification model of this embodiment includes: a sample acquisition unit 501 configured to acquire a sample set, where each sample in the sample set has corresponding classification information characterizing whether the sample is a positive or a negative sample; a selecting unit 502 configured to select a sample from the sample set; a determination unit 503 configured to determine, based on the classification information of the selected sample, whether the selected sample is a positive sample; a first training unit 504 configured to, in response to determining that the selected sample is a positive sample, adjust the parameters of the classification model using a first loss function by taking the selected sample as the input of the classification model and the classification information of the selected sample as the desired output of the classification model; and a second training unit 505 configured to, in response to determining that the selected sample is a negative sample, adjust the parameters of the classification model using a second loss function by taking the selected sample as the input of the classification model and the classification information of the selected sample as the desired output of the classification model.
In this embodiment, the sample acquisition unit 501 may acquire the sample set remotely through a wired or wireless connection, or acquire it locally. Each sample in the sample set corresponds to preset classification information characterizing whether the sample is a positive or a negative sample. Specifically, a positive sample is a sample belonging to one of at least two preset categories, and a negative sample is a sample not belonging to one of the at least two preset categories. The classification information corresponding to a positive sample characterizes the category to which the positive sample belongs; the classification information corresponding to a negative sample characterizes the category, indicated by that classification information, to which the negative sample does not belong.
In this embodiment, the samples in the sample set may be of various types, including but not limited to any of the following: videos, images, text, and so on.
As an example, suppose the samples in the sample set are videos. The classification information corresponding to one positive sample may be "1", characterizing that the video type of that positive sample is "cat" (i.e., the video contains pictures depicting a cat); the classification information corresponding to another positive sample may be "2", characterizing that its video type is "dog". Suppose the classification information corresponding to one negative sample is "101", which corresponds to the video type "cat" and characterizes that the negative sample is not a video of the "cat" type; the classification information corresponding to another negative sample is "102", which corresponds to the video type "dog" and characterizes that the negative sample is not a video of the "dog" type.
In the present embodiment, the selecting unit 502 may select samples from the sample set in various manners, for example by selecting at random, or by selecting according to a preset numbering order of the samples.
In the present embodiment, the determination unit 503 may determine whether the selected sample is a positive sample based on the classification information of the selected sample.
In the present embodiment, the first training unit 504 may, in response to determining that the selected sample is a positive sample, adjust the parameters of the classification model using a first loss function, by taking the selected sample as the input of the classification model and the classification information of the selected sample as the desired output of the classification model. The first loss function may include any of various loss functions for training a classification model.
Specifically, the classification model may be any of various types of models, such as a recurrent neural network model or a convolutional neural network model. During training of the classification model, an actual output may be obtained for each positive or negative sample input for training, where the actual output is the data actually output by the classification model and characterizes classification information. The first training unit 504 may then use gradient descent and back propagation: based on the actual output and the desired output, it determines a loss value using the loss function (a loss value characterizing the gap between the actual output and the desired output), and adjusts the parameters of the classification model according to the loss value (i.e., adjusts the parameters so that the loss value gradually decreases).
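The gradient-descent adjustment just described can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: a single weight stands in for the model parameters, the input and desired output are toy numbers, and a squared-error loss stands in for the first or second loss function; only the "adjust parameters so the loss gradually decreases" logic mirrors the description.

```python
def loss(w: float, x: float, desired: float) -> float:
    """Squared gap between the actual output w*x and the desired output."""
    return (w * x - desired) ** 2

def gradient_step(w: float, x: float, desired: float, lr: float = 0.1) -> float:
    """Back-propagate d(loss)/dw and move the parameter against the gradient."""
    grad = 2.0 * (w * x - desired) * x
    return w - lr * grad

w = 0.0                                   # initial parameter
before = loss(w, x=1.0, desired=1.0)
w = gradient_step(w, x=1.0, desired=1.0)  # one adjustment of the parameter
after = loss(w, x=1.0, desired=1.0)
assert after < before                     # the loss value gradually decreases
```

Repeating the step drives the loss toward zero, which is the sense in which the parameters are "adjusted so that the loss value gradually decreases".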
In the present embodiment, the classification model may include a feature extraction layer and a classification layer. The feature extraction layer extracts features of the input sample (for example, when the sample is a video or an image, the features may include color, shape, texture, and the like) to obtain feature data characterizing the features of the sample. The classification layer may be any of various classifiers for classifying the feature data (such as a support vector machine, a softmax classification function, etc.).
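The two-stage structure above can be sketched as follows. The toy "features" (mean and spread of raw values) are assumptions made for illustration; the softmax classification layer is one of the concrete classifier choices the description names.

```python
import math

def extract_features(pixels: list[float]) -> list[float]:
    """Feature extraction layer: reduce raw input to feature data."""
    mean = sum(pixels) / len(pixels)
    spread = max(pixels) - min(pixels)
    return [mean, spread]

def softmax(scores: list[float]) -> list[float]:
    """Classification layer: turn scores into one probability per category."""
    exps = [math.exp(s - max(scores)) for s in scores]   # shift for stability
    total = sum(exps)
    return [e / total for e in exps]

features = extract_features([0.1, 0.5, 0.9])
probs = softmax(features)
assert abs(sum(probs) - 1.0) < 1e-9   # softmax yields a valid distribution
```

In a real embodiment the feature extraction layer would be, for example, convolutional layers, and the scores fed to softmax would come from a learned linear layer; the split of responsibilities is what this sketch shows.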
In the present embodiment, the second training unit 505 may, in response to determining that the selected sample is a negative sample, adjust the parameters of the classification model using a second loss function, by taking the selected sample as the input of the classification model and the classification information of the selected sample as the desired output of the classification model. The second loss function may include any of various loss functions for training a classification model.
In some optional implementations of the present embodiment, the first loss function includes a multi-class classification loss function, and the second loss function includes a binary classification loss function.
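This optional implementation can be sketched with standard instances of the two loss-function families. The concrete formulas (multi-class cross-entropy for positive samples, binary cross-entropy for negative samples) are assumed here; the disclosure names only the families, not these exact formulas.

```python
import math

def multiclass_loss(probs: list[float], true_index: int) -> float:
    """Cross-entropy against the category a positive sample belongs to."""
    return -math.log(probs[true_index])

def binary_loss(prob_of_category: float) -> float:
    """For a negative sample: penalize assigning it to the indicated category."""
    return -math.log(1.0 - prob_of_category)

def loss_for(sample_is_positive: bool, probs: list[float], index: int) -> float:
    """Select the first or second loss according to the sample's type."""
    return (multiclass_loss(probs, index) if sample_is_positive
            else binary_loss(probs[index]))

probs = [0.7, 0.2, 0.1]
assert loss_for(True, probs, 0) < loss_for(True, probs, 2)    # confident match is cheap
assert loss_for(False, probs, 0) > loss_for(False, probs, 2)  # negatives punish high prob
```

The asymmetry is the point: a positive sample pushes probability toward one specific category, while a negative sample only pushes probability away from the category its classification information indicates.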
In some optional implementations of the present embodiment, the classification information of a sample in the sample set is represented by a vector having a preset number of elements, where each of the preset number of elements corresponds to one category among a plurality of preset categories.
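A one-hot encoding is one natural concrete form of the vector representation just described; the choice of one-hot (rather than, say, a multi-hot or signed encoding) is an assumption for illustration.

```python
def to_vector(category_index: int, preset_number: int) -> list[int]:
    """One element per preset category; the sample's category is marked 1."""
    vec = [0] * preset_number
    vec[category_index] = 1
    return vec

# With four preset categories, classification information for category 1:
assert to_vector(1, 4) == [0, 1, 0, 0]
```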
In some optional implementations of the present embodiment, the sample set includes a video sample set, and the selected sample is a video sample. The first training unit 504 may include: a first extraction module (not shown) configured to extract a sample video frame set from the selected video sample; a second extraction module (not shown) configured to extract feature data from the extracted sample video frame set; and a first adjustment module (not shown) configured to adjust the parameters of the classification model using the first loss function, based on the extracted feature data and the classification information of the selected video sample.
In some optional implementations of the present embodiment, the first extraction module is configured to perform at least one of the following: extracting a set of key frames from the selected video sample as the sample video frame set; or extracting the sample video frame set from the selected video sample based on a preset playback time interval.
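The second extraction option above, sampling frames at a preset playback time interval, can be sketched as follows. The frame rate, clip duration, and 2-second interval are illustrative assumptions; the key-frame option would instead select the encoder's I-frames and is not modeled here.

```python
def sample_frame_indices(duration_s: float, fps: float,
                         interval_s: float) -> list[int]:
    """Return the index of one frame every `interval_s` seconds of playback."""
    indices = []
    t = 0.0
    while t < duration_s:
        indices.append(int(t * fps))   # playback time -> frame index
        t += interval_s
    return indices

# A 10-second clip at 30 fps, sampled every 2 seconds -> frames 0, 60, ..., 240
assert sample_frame_indices(10.0, 30.0, 2.0) == [0, 60, 120, 180, 240]
```

The resulting frame set is what the second extraction module would then reduce to feature data.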
In some optional implementations of the present embodiment, the second training unit 505 may include: a third extraction module configured to extract a sample video frame set from the selected video sample; a fourth extraction module configured to extract feature data from the extracted sample video frame set; and a second adjustment module configured to adjust the parameters of the classification model using the second loss function, based on the extracted feature data and the classification information of the selected video sample.
The apparatus provided by the above embodiment of the disclosure acquires a sample set in which each sample has corresponding classification information characterizing whether it is a positive sample or a negative sample; selects a sample from the sample set; trains the classification model using the first loss function if the selected sample is a positive sample, and using the second loss function if the selected sample is a negative sample; and finally obtains a trained classification model. The first and second loss functions thus optimize the model in a targeted manner with different types of training samples, which helps improve the accuracy with which the trained classification model classifies various information. Compared with the prior art, which usually requires a large number of training samples to train a model, the embodiment of the disclosure can reduce the demand for training samples while improving classification accuracy, thereby improving the efficiency of model training, helping reduce the storage resources spent on acquiring a large number of training samples, and reducing the processor time occupied when training the model.
With further reference to Fig. 6, as an implementation of the method shown in Fig. 4 above, the present disclosure provides an embodiment of a video classification apparatus. This apparatus embodiment corresponds to the method embodiment shown in Fig. 4, and the apparatus may specifically be applied to various electronic devices.
As shown in Fig. 6, the video classification apparatus 600 of the present embodiment includes: a video acquisition unit 601 configured to acquire a video to be classified; and a generation unit 602 configured to input the video to be classified into a pre-trained classification model to generate classification information characterizing the video category to which the video to be classified belongs, wherein the classification model is generated according to the method described in the embodiment corresponding to Fig. 2 above.
In the present embodiment, the video acquisition unit 601 may acquire the video to be classified locally or remotely. The video to be classified is a video on which classification is to be performed, for example a video extracted from the videos provided by a certain video playback website or video playback application.
In the present embodiment, the generation unit 602 may input the video to be classified into the pre-trained classification model to generate classification information characterizing the video category to which the video to be classified belongs. The generated classification information may include, but is not limited to, information in at least one of the following forms: text, numbers, symbols.
In the present embodiment, the classification model is generated according to the method described in the embodiment corresponding to Fig. 2 above; for details, reference may be made to the steps of that embodiment, which are not repeated here. It should be noted that the above classification model includes a classification layer, and the classification layer includes a binary classifier and a multi-class classifier. In the present embodiment, the trained classification model uses the multi-class classifier to classify the input video.
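The inference step performed by the generation unit 602 reduces to selecting the category with the highest multi-class classifier score. In this sketch the category names and the score values are assumed examples; a real embodiment would obtain the scores from the trained model's classification layer.

```python
CATEGORIES = ["cat", "dog", "bird"]          # assumed preset categories

def classify(scores: list[float]) -> str:
    """Generate classification information: pick the top-scoring category."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return CATEGORIES[best]

# A video whose "dog" score dominates is labeled "dog"
assert classify([0.1, 0.8, 0.1]) == "dog"
```

The returned string is one possible form of the generated classification information (text); a number or symbol, as noted above, would serve equally.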
Optionally, the above apparatus 600 may output the classification information in various manners, for example by displaying the classification information on a display screen included in the apparatus 600, or by sending the classification information and the video to be classified to other electronic devices communicatively connected with the apparatus 600.
In some optional implementations of the present embodiment, the apparatus 600 may further include a sending unit (not shown) configured to send the video to be classified to a terminal of a user for whom a correspondence with the video category characterized by the classification information has been established in advance.
The apparatus 600 provided by the above embodiment of the disclosure classifies the video to be classified using the classification model generated in the embodiment corresponding to Fig. 2 and generates the classification information of the video to be classified, thereby improving the accuracy of the output classification information of the video.
Referring now to Fig. 7, a schematic structural diagram is shown of an electronic device 700 (for example, the server or terminal device shown in Fig. 1) suitable for implementing embodiments of the disclosure. The terminal device in embodiments of the disclosure may include, but is not limited to, mobile terminals such as mobile phones, laptop computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and vehicle-mounted terminals (such as in-vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in Fig. 7 is only an example and should not impose any limitation on the function and scope of use of embodiments of the disclosure.
As shown in Fig. 7, the electronic device 700 may include a processing device (such as a central processing unit, a graphics processor, etc.) 701, which may perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage device 708 into a random access memory (RAM) 703. The RAM 703 also stores various programs and data required for the operation of the electronic device 700. The processing device 701, the ROM 702, and the RAM 703 are connected to one another via a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
Generally, the following devices may be connected to the I/O interface 705: an input device 706 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, etc.; an output device 707 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.; a storage device 708 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 709. The communication device 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. Although Fig. 7 shows an electronic device 700 having various devices, it should be understood that it is not required to implement or possess all of the devices shown; more or fewer devices may alternatively be implemented or provided. Each block shown in Fig. 7 may represent one device, or may represent multiple devices as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded from a network through the communication device 709 and installed, or installed from the storage device 708, or installed from the ROM 702. When the computer program is executed by the processing device 701, the above-described functions defined in the methods of the embodiments of the disclosure are performed. It should be noted that the computer-readable medium described in embodiments of the disclosure may be a computer-readable signal medium or a computer-readable medium, or any combination of the two. A computer-readable medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
In embodiments of the disclosure, the computer-readable medium may be any tangible medium that contains or stores a program, which program may be used by or in connection with an instruction execution system, apparatus, or device. In embodiments of the disclosure, a computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than the computer-readable medium described above, and may send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium may be transmitted by any suitable medium, including but not limited to: an electric wire, an optical cable, RF (radio frequency), etc., or any suitable combination of the above.
The above computer-readable medium may be included in the above electronic device, or may exist separately without being assembled into the electronic device. The above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire a sample set, wherein each sample in the sample set has corresponding classification information characterizing whether the sample is a positive sample or a negative sample; select a sample from the sample set; determine whether the selected sample is a positive sample based on the classification information of the selected sample; in response to determining that the selected sample is a positive sample, adjust the parameters of the classification model using a first loss function, by taking the selected sample as the input of the classification model and the classification information of the selected sample as the desired output of the classification model; and in response to determining that the selected sample is a negative sample, adjust the parameters of the classification model using a second loss function, by taking the selected sample as the input of the classification model and the classification information of the selected sample as the desired output of the classification model.
In addition, when the one or more programs are executed by the electronic device, they may also cause the electronic device to: acquire a video to be classified; and input the video to be classified into a pre-trained classification model to generate classification information characterizing the video category to which the video to be classified belongs.
Computer program code for carrying out the operations of embodiments of the disclosure may be written in one or more programming languages or a combination thereof. The programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, and also include conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending upon the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in embodiments of the disclosure may be implemented by software or by hardware. The described units may also be provided in a processor; for example, a processor may be described as including a sample acquisition unit, a selecting unit, a determination unit, a first training unit, and a second training unit. The names of these units do not in some cases constitute a limitation on the units themselves; for example, the sample acquisition unit may also be described as "a unit for acquiring a sample set".
The above description is only a preferred embodiment of the disclosure and an explanation of the applied technical principles. Those skilled in the art should understand that the scope of invention involved in embodiments of the disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) embodiments of the disclosure.