CN105045819A - Model training method and device for training data - Google Patents

Model training method and device for training data

Info

Publication number
CN105045819A
CN105045819A
Authority
CN
China
Prior art keywords
training data
training
index vector
model
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510362322.5A
Other languages
Chinese (zh)
Other versions
CN105045819B (en)
Inventor
李超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Tencent Computer Systems Co Ltd
Original Assignee
Shenzhen Tencent Computer Systems Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Tencent Computer Systems Co Ltd filed Critical Shenzhen Tencent Computer Systems Co Ltd
Priority to CN201510362322.5A priority Critical patent/CN105045819B/en
Publication of CN105045819A publication Critical patent/CN105045819A/en
Application granted granted Critical
Publication of CN105045819B publication Critical patent/CN105045819B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24553Query execution of query operations
    • G06F16/24554Unary operations; Data partitioning operations
    • G06F16/24556Aggregation; Duplicate elimination


Abstract

The invention discloses a model training method and device for training data. The method comprises the following steps: obtaining original training data; aggregating the original training data to obtain aggregated training data; establishing an index vector from the original training data and the aggregated training data, wherein the absolute value of each index vector entry indicates the position, in the aggregated training data, of the corresponding row of the original training data; randomly reading values of the index vector and obtaining the corresponding training data from the aggregated training data according to each value; and performing model training with the obtained training data. With the training data aggregated, randomly reading values of the index vector still retrieves the corresponding rows from the aggregated set, preserving the randomness of the data used for training, so the model training effect is improved while memory is saved.

Description

Model training method and device for training data
Technical field
The present invention relates to the field of computing technology, and in particular to a model training method and device for training data.
Background technology
Click-through rate (CTR) estimation for online advertising plays an important role in ad delivery. Industry currently relies mainly on simple linear models such as logistic regression (LR) for ad CTR modeling: such models are concise and fast to solve, and to a certain extent resist overfitting the data. Although LR is simple to solve, in the big-data era its computational performance still needs to be exploited more fully. Stochastic gradient descent (SGD) is the optimization algorithm commonly used to train LR models, and it converges quickly even under massive-data scenarios.
The training data generated for ad CTR estimation often contains a large number of duplicate rows, and these duplicates waste storage space to a great extent. If the duplicated rows are aggregated so that only a single copy of each is retained, memory is indeed saved; but after aggregation the identical rows pile up in one place, destroying the uniform distribution of the data and losing its original randomness. The SGD algorithm, however, needs a training set with guaranteed randomness to reach a good training result, so training on the aggregated set often yields a poor model.
Summary of the invention
The object of the present invention is to provide a model training method and device for training data, intended to improve the model training effect.
To solve the above technical problem, an embodiment of the present invention provides the following technical solution:
A model training method for training data, comprising:
obtaining original training data;
aggregating the original training data to obtain aggregated training data;
establishing an index vector from the original training data and the aggregated training data, wherein the absolute value of each index vector entry indicates the position, in the aggregated training data, of the corresponding row of the original training data;
randomly reading values of the index vector, and obtaining the corresponding training data from the aggregated training data according to each value;
performing model training with the obtained training data.
To solve the above technical problem, an embodiment of the present invention further provides the following technical solution:
A model training apparatus for training data, comprising:
an acquiring unit, configured to obtain original training data;
an aggregation unit, configured to aggregate the original training data to obtain aggregated training data;
a vector establishing unit, configured to establish an index vector from the original training data and the aggregated training data, wherein the absolute value of each index vector entry indicates the position, in the aggregated training data, of the corresponding row of the original training data;
a reading unit, configured to randomly read values of the index vector and obtain the corresponding training data from the aggregated training data according to each value;
a training unit, configured to perform model training with the obtained training data.
Compared with the prior art, this embodiment aggregates the original training data into aggregated training data and establishes an index vector whose entries' absolute values indicate, for each original row, its position in the aggregated training data. During model training, values of the index vector are read at random, the corresponding rows are fetched from the aggregated training data according to those values, and the fetched rows are used for training. Thus, even with the training data aggregated, randomly reading the index vector still retrieves the corresponding rows and guarantees the randomness of the data used for model training, so the training effect can be improved while memory is saved.
Accompanying drawing explanation
The technical solution of the present invention and its other beneficial effects will become apparent from the following detailed description of specific embodiments, taken in conjunction with the accompanying drawings.
Fig. 1a is a schematic diagram of a scenario for the model training method for training data provided by the present invention;
Fig. 1b is a schematic flowchart of the model training method for training data provided by the first embodiment of the present invention;
Fig. 2a is a schematic flowchart of the model training method for training data provided by the second embodiment of the present invention;
Fig. 2b is a schematic diagram of original training data in the model training method for training data provided by the second embodiment of the present invention;
Fig. 2c is a schematic diagram of the aggregated training data and the index vector in the model training method for training data provided by the second embodiment of the present invention;
Fig. 3a is a schematic structural diagram of the model training apparatus for training data provided by the third embodiment of the present invention;
Fig. 3b is another schematic structural diagram of the model training apparatus for training data provided by the third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the server provided by the fourth embodiment of the present invention.
Embodiment
Referring to the drawings, in which like reference numerals denote like components, the principles of the present invention are illustrated as implemented in a suitable computing environment. The following description is based on the illustrated specific embodiments of the invention, which should not be regarded as limiting other specific embodiments not detailed herein.
In the description that follows, specific embodiments of the invention are described with reference to steps and symbolic operations performed by one or more computers, unless otherwise stated. These steps and operations are therefore referred to at times as computer-executed; they include the manipulation, by a computer processing unit, of electrical signals representing data in structured form. This manipulation transforms the data, or maintains them at locations in the computer's memory system, which reconfigures or otherwise alters the operation of the computer in a manner well understood by those skilled in the art. The data structures in which the data are maintained are physical locations of the memory that have particular properties defined by the data format. While the principles of the invention are described in the foregoing terms, this is not meant to be limiting, as those skilled in the art will appreciate that the steps and operations described below may also be implemented in hardware.
The embodiment of the present invention provides a kind of model training method and device of training data.
Referring to Fig. 1a, a schematic diagram of a scenario for the model training method for training data, the method may be applied to a training system for an ad click-through rate prediction model. The system may comprise the model training apparatus for training data, which is mainly used to obtain original training data and aggregate it to obtain aggregated training data; then establish an index vector from the original training data and the aggregated training data, wherein the absolute value of each index vector entry indicates the position, in the aggregated training data, of the corresponding row of the original training data; and thereafter randomly read values of the index vector, obtain the corresponding training data from the aggregated training data according to each value, and perform model training with the fetched data, for example CTR modeling with a linear model such as logistic regression (LR), so that ad CTR estimation can be carried out based on the trained model, and so on.
In addition, the model training system may further comprise a plurality of ad service servers, mainly used to generate training data such as user age, user gender, and ad ID (identity number) from click-stream data. The system may also comprise an online storage server, an ad delivery device, and terminals, where the online storage server mainly stores the training data and ad-delivery-related information, the ad delivery device delivers ads according to the training results and the ad-delivery-related information, and the terminals display the delivered ads to users.
Each of these is described in detail below.
First embodiment
This embodiment is described from the perspective of the model training apparatus for training data, hereinafter simply the model training apparatus, which may be integrated in a network device such as a server or a gateway.
A model training method for training data comprises: obtaining original training data; aggregating the original training data to obtain aggregated training data; establishing an index vector from the original training data and the aggregated training data, wherein the absolute value of each index vector entry indicates the position, in the aggregated training data, of the corresponding row of the original training data; randomly reading values of the index vector and obtaining the corresponding training data from the aggregated training data according to each value; and performing model training with the obtained training data.
Referring to Fig. 1b, a schematic flowchart of the model training method for training data provided by the first embodiment of the present invention, the method comprises:
In step S101, original training data is obtained.
In step S102, the original training data is aggregated to obtain aggregated training data.
Steps S101 and S102 may specifically be as follows:
In the embodiment of the present invention, the original training data may be historical data stored in the online storage server. Because the original training data is highly repetitive, data aggregation is adopted here: the original training data is aggregated to obtain aggregated training data. Data aggregation refers to the data processing method of merging multiple copies of identical content and retaining only a single copy.
That is, the duplicated rows in the original training data are aggregated so that only one copy of each is retained, and the retained copies together constitute the aggregated training data; data aggregation can effectively reduce the storage space occupied by the data.
For example, suppose the original training data contains M rows. The duplicated rows are aggregated, retaining only one copy of each; the retained copies form the aggregated training data, which is determined to contain N rows, where M and N are positive integers and M is greater than N. In practice M can be several times N.
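By way of illustration only (this sketch is not the patented implementation; the function name and the (features, label) row layout are assumptions), the aggregation of M original rows into N unique rows can be written as:

```python
def aggregate(original):
    """Collapse duplicate feature rows, keeping one copy of each.

    `original` is a list of (features, label) pairs, where `features`
    is hashable (e.g. a tuple) and `label` is +1 or -1. Returns the
    aggregated rows plus a map from features to their 1-based position,
    so M original rows shrink to N <= M unique rows.
    """
    aggregated = []
    position = {}  # features -> 1-based slot in `aggregated`
    for features, _label in original:
        if features not in position:
            aggregated.append(features)
            position[features] = len(aggregated)
    return aggregated, position

# Example: 4 rows with one duplicated feature vector collapse to 3 rows.
rows = [((1, 0), 1), ((1, 0), -1), ((0, 1), 1), ((1, 1), 1)]
agg, pos = aggregate(rows)
```

Note that aggregation keys on the feature content only; the sample class of each original row is preserved separately, via the sign of the index vector described below.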
In step S103, an index vector is established from the original training data and the aggregated training data, wherein the absolute value of each index vector entry indicates the position, in the aggregated training data, of the corresponding row of the original training data.
The step of establishing the index vector from the original training data and the aggregated training data may specifically be as follows:
(1) determining the number of rows in the original training data, and determining the number of rows in the aggregated training data;
(2) establishing the index vector according to the number of rows in the original training data and the number of rows in the aggregated training data.
For example, the number of rows in the original training data is determined to be M and the number of rows in the aggregated training data to be N; an index vector of length M is then established whose entries take integer values in [-N, -1] ∪ [1, N], where the absolute value of each entry indicates the position, in the aggregated training data, of the corresponding original row.
For another example, when an index vector entry is 3, its absolute value 3 indicates the position of the corresponding row in the aggregated training data; that is, through the absolute value, the row at that position in the aggregated training data can be fetched.
It can be understood that if an index vector entry is positive, the sample class of the corresponding row is a positive sample, and if the entry is negative, the sample class is a negative sample.
For example, an identifier may be set on each row of the original training data to indicate its sample class, so that when the index vector is established, the sign of each entry is determined from this identifier. That is, the magnitude of an index vector entry records the position of the row's content in the aggregated training data, while its sign records the sample class of the corresponding row in the original training data.
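The index-vector construction described above (magnitude for the position in the aggregated data, sign for the sample class) might be sketched as follows; the names and data layout are illustrative assumptions, not quoted from the patent:

```python
def build_index_vector(original, position):
    """Return one signed entry per original row.

    `original` is a list of (features, label) pairs with label in
    {+1, -1}; `position` maps features to their 1-based slot in the
    aggregated training data. Entry i is label_i * position_i, so
    abs(entry) locates the aggregated row and the sign carries the
    sample class.
    """
    return [label * position[features] for features, label in original]

# Two original rows sharing features but with opposite labels map to
# entries +1 and -1, both referring to the same aggregated slot.
original = [((1, 0), 1), ((1, 0), -1), ((0, 1), 1)]
position = {(1, 0): 1, (0, 1): 2}
index = build_index_vector(original, position)
```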
In step S104, values of the index vector are read at random, and the corresponding training data is obtained from the aggregated training data according to each value.
In certain embodiments, randomly reading the values of the index vector may specifically comprise:
(a) performing a shuffle operation on the index vector;
(b) reading the values of the index vector sequentially from the shuffled index vector.
It can be understood that shuffling the index vector amounts to shuffling the rows of the aggregated training data, ensuring that the training data is loaded in random order before model training; under the premise that randomness of the data is guaranteed, the model trained by the SGD algorithm is better.
For example, after the index vector is randomly shuffled, the values can be read in the order of the shuffled index, which guarantees the randomness of the rows read in; moreover, reading sequentially by index order is simple and easy to implement.
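Steps (a) and (b) can be sketched as follows, a minimal illustration under the same assumed (features, label) layout and not the patented code:

```python
import random

def iter_training_rows(index_vector, aggregated, rng=None):
    """Shuffle the index vector, then yield (features, label) pairs.

    abs(v) is the 1-based position of the row in `aggregated`; the
    sign of v restores the sample class. Shuffling a copy leaves the
    original index vector intact for the next training round.
    """
    rng = rng or random.Random()
    order = list(index_vector)
    rng.shuffle(order)           # step (a): shuffle the index vector
    for v in order:              # step (b): read values sequentially
        yield aggregated[abs(v) - 1], (1 if v > 0 else -1)

aggregated = [(1, 0), (0, 1)]
rows = list(iter_training_rows([1, -1, 2], aggregated, random.Random(0)))
```

Because only the small integer index is shuffled, the aggregated rows themselves never move in memory; randomness costs one permutation of M integers per round.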
In step S105, model training is performed with the obtained training data.
It can be understood that, because linear models are fast to solve in the advertising industry and can to a certain extent prevent overfitting the data, the logistic regression (LR) model is usually used for ad CTR modeling.
For example, during training, model iterations are performed with the obtained training data based on the logistic regression model structure, yielding the logistic regression model after the iteration.
Further, to obtain a better-trained logistic regression model, a precondition may be set for checking whether the post-iteration logistic regression model meets the requirement. If the post-iteration model is judged to meet the precondition, it is saved; otherwise, the step of performing a shuffle operation on the index vector (i.e., step (a) above) is triggered again to proceed with the next round of training. In this way, the randomness of data loading is guaranteed before each iteration.
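A minimal SGD training loop in this style, reshuffling the index vector before every round as in step (a), might look like the following. This is a sketch under assumed names and a standard LR update rule; the patent does not prescribe the update at this level of detail:

```python
import math
import random

def train_lr(index_vector, aggregated, dim, rounds=20, lr=0.5, seed=0):
    """Train logistic regression by SGD over the index-vector stream.

    Labels are recovered from the sign of each index entry; features
    from its absolute value. The index vector is reshuffled before
    each round so every pass sees the data in a fresh random order.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    for _ in range(rounds):
        order = list(index_vector)
        rng.shuffle(order)                   # step (a) before each round
        for v in order:
            x = aggregated[abs(v) - 1]
            y = 1 if v > 0 else -1
            z = sum(wi * xi for wi, xi in zip(w, x))
            # gradient of log(1 + exp(-y * w.x)) with respect to z
            g = -y / (1.0 + math.exp(y * z))
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
    return w

# Toy data: feature (1, 0) is always positive, (0, 1) always negative.
aggregated = [(1, 0), (0, 1)]
index = [1, 1, -2, -2]
w = train_lr(index, aggregated, dim=2)
```

After training on this toy stream, the weight on the first feature is positive and the weight on the second is negative, as the labels dictate.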
As can be seen from the above, in the model training method for training data provided by this embodiment, the training data is aggregated into aggregated training data, and an index vector is established whose entries' absolute values indicate, for each original row, its position in the aggregated training data. During model training, values of the index vector are read at random, the corresponding rows are fetched from the aggregated training data according to those values, and the fetched rows are used for training. Thus, even with the training data aggregated, randomly reading the index vector retrieves the corresponding rows and guarantees the randomness of the data used for model training, so the training effect can be improved while memory is saved.
Second embodiment
The method described in the first embodiment is further detailed below by way of example.
Referring to Fig. 2a, a schematic flowchart of the model training method for training data provided by the second embodiment of the present invention, the method comprises:
In step S201, original training data is obtained.
In step S202, the original training data is aggregated to obtain aggregated training data.
Steps S201 and S202 may specifically be as follows:
For example, suppose the original training data contains M rows. The duplicated rows are aggregated, retaining only one copy of each; the retained copies form the aggregated training data, which is determined to contain N rows, where M and N are positive integers, M is greater than N, and in practice M can be several times N. Data aggregation can effectively reduce the storage space occupied by the data.
It can be understood that the original training data may carry a flag indicating the sample class.
For example, the flag may be placed at the end of each row of the original training data. Fig. 2b illustrates original training data whose last column is the sample-class flag, where 1 denotes a positive sample and -1 a negative sample.
Further, in the advertising industry, a simple linear model such as logistic regression (LR) is usually used for ad CTR modeling. The LR model is generally used to solve binary classification problems (i.e., the outcome has two different classes): given an input, it outputs the probabilities of the two classes for that input. That is, before model training, the classes of the training data need to be labeled manually (marking positive and negative samples), as in Fig. 2b, so that the probability of a positive/negative sample under a given scenario can finally be predicted.
In general, the class the user currently cares about is taken as the positive example (its sample class is the positive sample), and the remaining class as the negative example (its sample class is the negative sample).
More specifically, in computational advertising, what people mostly care about is the probability that a user clicks on an ad, or the conversion probability, where "conversion" generally refers to a purchase or similar behavior by the user. Therefore "click" and "conversion" are generally treated as positive samples, and "exposure" (i.e., the user takes no action on the ad) as a negative sample.
In step S203, the number of rows in the original training data and the number of rows in the aggregated training data are determined.
In step S204, the index vector is established according to the number of rows in the original training data and the number of rows in the aggregated training data.
Steps S203 and S204 together constitute the process of establishing the index vector from the original training data and the aggregated training data, wherein the absolute value of each index vector entry indicates the position, in the aggregated training data, of the corresponding row of the original training data; through the absolute value, the row at the corresponding position in the aggregated training data can be fetched.
For example, as shown in Fig. 2b, the number of rows M in the original training data is determined to be 8. By aggregating the training data, the aggregated training data is obtained, and its number of rows N is determined to be 4. An index vector of length 8 is then established, as shown in Fig. 2c, whose entries take integer values in [-4, -1] ∪ [1, 4].
Further, for example, the rows 2, 3, and 4 shown in Fig. 2b are identical; they are aggregated into the row 1 shown in Fig. 2c, and the index vector entries at the corresponding positions take the values 1, -1, and 1 respectively.
For another example, the rows 6 and 7 shown in Fig. 2b are identical; they are aggregated into the row 3 shown in Fig. 2c, and the index vector entries at the corresponding positions take the values 3 and -3 respectively.
In summary, the magnitude of each index vector entry is determined by the position, in the aggregated training data, of the corresponding original row, and the sign of each entry is determined by the sample class of that row in the original training data.
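An 8-row-to-4-row case like the one above can be reproduced end to end with hypothetical data; the concrete feature values below are invented for illustration and need not match Fig. 2b:

```python
# Hypothetical 8-row dataset: rows with identical features collapse to
# one copy, and each index entry's sign records that row's class flag.
original = [
    ((1, 0, 0), 1),
    ((0, 1, 0), 1),    # rows 2-4 share the same features
    ((0, 1, 0), -1),
    ((0, 1, 0), 1),
    ((0, 0, 1), 1),
    ((1, 1, 0), 1),    # rows 6-7 share the same features
    ((1, 1, 0), -1),
    ((1, 0, 0), -1),   # duplicate of row 1, negative class
]
aggregated, position, index = [], {}, []
for features, label in original:
    if features not in position:
        aggregated.append(features)
        position[features] = len(aggregated)
    index.append(label * position[features])

# M = 8 rows compress to N = 4; entries lie in [-4, -1] U [1, 4].
```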
It can be understood that, when the index vector is established, the sign of each entry can be determined from the aforementioned sample-class flag: a positive entry indicates that the sample class of the corresponding row is a positive sample, and a negative entry indicates a negative sample.
In step S205, a shuffle operation is performed on the index vector.
It can be understood that shuffling the index vector amounts to shuffling the rows of the aggregated training data, ensuring that the training data is loaded in random order before model training; under the premise that randomness of the data is guaranteed, the model trained by the SGD algorithm is better.
Preferably, after the index vector is randomly shuffled, the values can be read in sequentially in the order of the shuffled index, which ensures that the rows read in are random.
That is, establishing the index vector effectively achieves both compression of the data and preservation of the distribution of the original training data: the index vector records each original row's position in the aggregated training data, and its sign distinguishes positive from negative samples, so the data is compressed under the premise that the training effect of the algorithm is guaranteed.
In step S207, from polymerization training data, obtain corresponding training data according to above-mentioned value.
Preferably, after the step of " obtaining corresponding training data according to above-mentioned value from polymerization training data ", can also comprise:
1. the sample class indicated by corresponding training data is determined according to above-mentioned value.
Wherein, this sample class comprises positive sample and negative sample.
Such as, in calculating advertising, general using " click " (as user sees the click probability of an advertisement), as positive sample, " exposure " (namely user does not deal with to corresponding advertisement) is as negative sample for " conversion " (behaviors such as purchase occurring after an advertisement as user sees).
2. according to this sample class, gradient calculation is carried out to the training data of correspondence.
Because stochastic gradient descent SGD algorithm is the optimized algorithm being usually used in training LR model, therefore before carrying out repetitive exercise, needs the sample class according to training data, according to the gradient algorithm of correspondence, gradient calculation is carried out to the training data of correspondence.
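For completeness, with labels y ∈ {+1, −1} the per-sample LR loss and the gradient used by SGD can be written as follows; this is the standard formulation, which the patent itself does not spell out:

```latex
\ell(w; x, y) = \log\bigl(1 + e^{-y\, w^{\top} x}\bigr), \qquad
\nabla_w \ell = \frac{-y\, x}{1 + e^{\,y\, w^{\top} x}}, \qquad
w \leftarrow w - \eta\, \nabla_w \ell
```

The sign of y enters the gradient directly, which is why the sample class recovered from the index entry's sign suffices to pick the correct update.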
In step S208, based on the logistic regression model structure, model iterations are performed with the obtained training data, yielding the logistic regression model after the iteration.
For example, based on the logistic regression model structure, model training is carried out with the result of the gradient calculation, yielding the post-iteration logistic regression model.
Specifically, the logistic regression model is mainly used for binary classification problems (i.e., there are only two possible outputs, representing the two classes, such as whether a user clicks a given ad). Logistic regression amounts to y = f(X), where f denotes a functional mapping expressing the relationship between the independent variable X and the dependent variable y. Solving/training the model means learning this mapping from the given historical data set (i.e., the training data above). Once the model is trained, the outcome for newly arriving data (a vector X) can be predicted with this function.
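Concretely, the mapping y = f(X) of logistic regression takes the standard sigmoid form; this is a textbook formulation, not quoted from the patent:

```latex
P(y = +1 \mid X) = \sigma\bigl(w^{\top} X\bigr) = \frac{1}{1 + e^{-w^{\top} X}},
\qquad
P(y = -1 \mid X) = 1 - \sigma\bigl(w^{\top} X\bigr)
```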
In step S209, judge whether the Logic Regression Models after above-mentioned iteration meets pre-conditioned.
In this embodiment, whether the logistic regression model after iteration meets the preset condition may be judged in the following manner:
For example, judge whether the logistic regression model after the iteration has converged, and at the same time judge whether the number of iterations in the model training process has reached a preset threshold.
According to the judgment result, step S210 or step S211 is performed.
In step S210, if it is judged that the logistic regression model after the iteration meets the preset condition, the logistic regression model after the iteration is saved.
For example, when judging whether the logistic regression model after the iteration has converged, if it is determined that the model has converged, the model can be considered to meet the preset condition, and the logistic regression model after the iteration is saved;
For another example, when judging whether the number of iterations has reached the preset threshold, if it is determined that the number of iterations has reached the preset threshold, the model can be considered to meet the preset condition, and the logistic regression model after the iteration is saved.
In step S211, if it is judged that the logistic regression model after the iteration does not meet the preset condition, execution of step S205 is triggered (step S211 is not shown in the drawings);
That is, when the logistic regression model after the iteration does not meet the preset condition, a shuffle operation is performed on the index vector so as to continue the iterative training of the model.
For example, if it is determined both that the logistic regression model after the iteration has not converged and that the number of iterations has not reached the preset threshold, the model can be considered not to meet the preset condition, and a shuffle operation is performed on the index vector to proceed with the next round of training.
It can be understood that, in some embodiments, it is sufficient to judge only whether the logistic regression model after the iteration has converged, or only whether the number of iterations has reached the preset threshold, to determine whether the model meets the preset condition; for example, when it is judged that the model has not converged, or that the number of iterations has not reached the preset threshold, the model can be considered not to meet the preset condition.
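As an illustrative sketch of the two-part preset condition described above (the threshold values eps and max_iters below are illustrative assumptions, not values taken from the patent), the stopping check can be expressed as a simple predicate:

```python
def meets_preset_condition(loss_delta, iterations, eps=1e-6, max_iters=100):
    """Preset condition from the embodiment: the model is considered
    trained when it has converged (loss change below eps) OR the
    iteration count has reached the preset threshold max_iters.
    eps and max_iters are illustrative values, not from the patent."""
    converged = abs(loss_delta) < eps
    return converged or iterations >= max_iters
```

When the predicate is false, training continues: the index vector is re-shuffled and another round of iteration is performed.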
As can be seen from the above, in the model training method provided by this embodiment, the training data is aggregated to obtain aggregated training data, and an index vector is established whose values have absolute values indicating the positions of training data entries in the aggregated training data. During model training, values of the index vector are read randomly, the corresponding training data is fetched from the aggregated training data according to each value, and the fetched training data is used for model training. In the embodiment of the present invention, under the premise that the training data is aggregated, randomly reading the index vector values and fetching the corresponding training data from the aggregated training data guarantees the randomness of the training data used for model training, so the model training effect can be improved while memory is saved.
3rd embodiment
To better implement the model training method provided by the embodiments of the present invention, an embodiment of the present invention further provides an apparatus based on the above model training method. The terms used here have the same meanings as in the above model training method, and specific implementation details can be found in the description of the method embodiments.
Referring to Fig. 3, a schematic structural diagram of the model training apparatus provided by the embodiment of the present invention, the apparatus may comprise an acquiring unit 301, an aggregating unit 302, a vector establishing unit 303, a reading unit 304 and a training unit 305, as follows:
The acquiring unit 301 is configured to acquire original training data.
The aggregating unit 302 is configured to aggregate the original training data to obtain aggregated training data.
In the embodiment of the present invention, the original training data may specifically be historical data stored on an online storage server. Because the original training data is highly repetitive, data aggregation is adopted here: the original training data is aggregated to obtain the aggregated training data. Data aggregation refers to the data processing method of merging multiple copies of identical content and retaining only one copy.
That is, repeated entries in the original training data are aggregated so that only one copy of each is retained, and the set of retained copies is defined as the aggregated training data; data aggregation effectively reduces the data storage space.
For example, suppose the original training data contains M entries. The repeated entries are aggregated, only one copy of each is retained, and the retained copies form the aggregated training data, which is determined to contain N entries, where M and N are positive integers, M is greater than N, and in practice M can reach several times N.
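As a minimal sketch of the aggregation step (function and variable names are hypothetical, not from the patent), keeping a single copy of each repeated entry can look like this:

```python
def aggregate(original):
    """Keep one copy of each distinct training entry.
    Returns the aggregated list and a mapping from each distinct
    entry to its 0-based position in the aggregated list."""
    aggregated, position = [], {}
    for row in original:
        if row not in position:
            position[row] = len(aggregated)
            aggregated.append(row)
    return aggregated, position

original = ["a", "b", "a", "c", "b", "a"]   # M = 6 entries
agg, pos = aggregate(original)              # N = 3 entries after aggregation
```

The position mapping is what the index vector described below is built from: each of the M original entries can be recovered from one of the N aggregated copies.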
The vector establishing unit 303 is configured to establish an index vector according to the original training data and the aggregated training data, where the absolute value of each index vector value indicates the position, in the aggregated training data, of a training data entry from the original training data.
The vector establishing unit 303 may specifically comprise:
(1) a determining subunit, configured to determine the quantity of training data in the original training data, and determine the quantity of training data in the aggregated training data;
(2) an establishing subunit, configured to establish the index vector according to the quantity of training data in the original training data and the quantity of training data in the aggregated training data.
For example, determine that the original training data contains M entries and the aggregated training data contains N entries, and then establish an index vector of length M whose values range over the integers in [-N, -1] ∪ [1, N], where the absolute value of each value indicates the position, in the aggregated training data, of the corresponding entry of the original training data.
For another example, when an index vector value is 3, its absolute value 3 indicates the position of the training data in the aggregated training data; that is, through the absolute value, the training data at the corresponding position in the aggregated training data can be fetched.
It can be understood that a positive index vector value indicates that the sample class of the corresponding training data is a positive sample, and a negative value indicates a negative sample.
For example, an identifier may be set on each entry of the original training data to indicate the sample class of that entry, so that when the index vector is established, the sign of each value is determined according to the identifier. That is, the magnitude of an index vector value relates to the position of the training data in the original training data, and the sign relates to the sample class of that training data.
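The signed, 1-based index vector described above can be sketched as follows (names, and the encoding of the sample class as labels in {-1, +1}, are illustrative assumptions, not from the patent):

```python
def build_index_vector(original, labels):
    """Build a signed index vector of length M: the absolute value of
    each entry is the (1-based) position of the row in the aggregated
    data, and the sign encodes the sample class (+ positive sample,
    - negative sample), so all values lie in [-N, -1] ∪ [1, N]."""
    aggregated, position = [], {}
    for row in original:
        if row not in position:
            position[row] = len(aggregated)
            aggregated.append(row)
    index_vector = [(position[row] + 1) * (1 if y > 0 else -1)
                    for row, y in zip(original, labels)]
    return aggregated, index_vector

agg, vec = build_index_vector(["a", "b", "a"], [1, -1, -1])
```

Note that the same aggregated row can appear with both signs, since identical feature rows in the original data may carry different sample classes.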
The reading unit 304 is configured to randomly read values of the index vector, and fetch the corresponding training data from the aggregated training data according to each value.
In certain embodiments, the reading unit 304 may specifically comprise:
(a) a shuffling subunit, configured to perform a shuffle operation on the index vector;
(b) a reading subunit, configured to read the values of the index vector sequentially from the shuffled index vector.
It can be understood that performing a shuffle operation on the index vector can be regarded as shuffling the training data in the aggregated training data, guaranteeing that the training data is loaded randomly before model training; under the premise of randomness in the data, the model trained by the SGD algorithm performs better.
For example, after the index vector is randomly shuffled, reading in the order of the shuffled index guarantees the randomness of the training data read in, and reading sequentially by index order is simple and easy to implement.
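The shuffle-then-read-sequentially scheme can be sketched as below (a hedged illustration; the generator name and seed parameter are assumptions). Only the small index vector is shuffled, never the aggregated data itself:

```python
import random

def shuffled_pairs(index_vector, aggregated, seed=None):
    """Shuffle the index vector rather than the aggregated data, then
    read it sequentially, resolving each value v to a (row, label)
    pair: abs(v) - 1 is the position, the sign is the sample class."""
    order = list(index_vector)          # keep the original vector intact
    random.Random(seed).shuffle(order)
    for v in order:
        yield aggregated[abs(v) - 1], (1 if v > 0 else -1)

agg = ["a", "b", "c"]
vec = [1, -2, 1, 3, -2, 1]              # length M = 6, N = 3
pairs = list(shuffled_pairs(vec, agg, seed=0))
```

Because each element of the index vector is a small integer, re-shuffling it before every training round is cheap even when the aggregated rows are large.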
The training unit 305 is configured to perform model training using the fetched training data.
It can be understood that, because linear models are relatively fast to solve and can to a certain extent prevent overfitting of the data, logistic regression (LR) models are commonly used in the advertising field for click-through rate modeling.
For example, during training, model iterative training is performed based on the logistic regression model structure using the fetched training data, and the logistic regression model after iteration is obtained.
Further, in order to obtain a logistic regression model with a better training effect, a preset condition may be set for detecting whether the logistic regression model after iteration meets the requirement. If the model after iteration is judged to meet the preset condition, it is saved; otherwise, the shuffling subunit is triggered to perform a shuffle operation on the index vector so as to proceed with the next round of training. That is, before each iteration, the randomness of the loaded training data can be guaranteed.
Preferably, based on the model training apparatus described above, as shown in Fig. 3b, the apparatus may further comprise:
1. a determining unit 306, configured to determine, according to the above value, the sample class indicated by the corresponding training data, the sample class comprising a positive sample and a negative sample.
For example, in computational advertising, a "click" (a user clicks an advertisement after seeing it) or a "conversion" (behavior such as a purchase after the user sees an advertisement) is generally taken as a positive sample, and an "exposure" (the user takes no action on the advertisement) is taken as a negative sample.
2. a computing unit 307, configured to perform gradient calculation on the corresponding training data according to the sample class.
Because stochastic gradient descent (SGD) is the optimization algorithm commonly used to train LR models, before iterative training the gradient of each training data entry needs to be calculated, according to its sample class, using the corresponding gradient formula.
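As a minimal sketch of the gradient calculation mentioned above (assuming the sample class is encoded as y ∈ {-1, +1} and plain Python lists serve as the weight and feature vectors; names are illustrative), one SGD update for logistic regression is:

```python
import math

def sgd_step(w, x, y, lr=0.1):
    """One stochastic-gradient-descent update for logistic regression.
    For the log-loss log(1 + exp(-y * w·x)) with y in {-1, +1}, the
    gradient w.r.t. w is -y * sigmoid(-y * w·x) * x, so the descent
    update is w <- w + lr * y * sigmoid(-y * w·x) * x."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    scale = 1.0 / (1.0 + math.exp(margin))    # sigmoid(-margin)
    return [wi + lr * y * scale * xi for wi, xi in zip(w, x)]

w = sgd_step([0.0, 0.0], [1.0, 2.0], y=1)     # one positive-sample update
```

The sign of y is exactly what the index vector supplies, which is why the gradient can be computed directly from each signed value and its aggregated row.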
The training unit 305 may then also be configured to perform model training according to the result of the gradient calculation.
In some embodiments, such as in computational advertising, the training unit 305 may be specifically configured to perform model iterative training based on the logistic regression model structure using the fetched training data, obtaining the logistic regression model after iteration.
After the logistic regression model after iteration is obtained, the apparatus may further judge whether it meets the preset condition, as follows:
a judging unit 308, configured to judge whether the logistic regression model after the iteration meets the preset condition;
a storage unit 309, configured to save the logistic regression model after the iteration if it is judged to meet the preset condition;
a triggering unit 310, configured to trigger execution of the shuffle operation on the index vector if the logistic regression model after the iteration is judged not to meet the preset condition.
Further, the judging unit may comprise:
a first judging subunit, configured to judge that the logistic regression model after the iteration meets the preset condition if it is determined that the model has converged or the number of iterations has reached the preset threshold;
a second judging subunit, configured to judge that the logistic regression model after the iteration does not meet the preset condition if it is determined that the model has not converged and the number of iterations has not reached the preset threshold.
It can be understood that estimating the advertisement click-through rate with the logistic regression model after iteration can be implemented in an existing manner, which is not specifically limited here.
In specific implementation, the above units may be implemented as independent entities, or combined arbitrarily and implemented as one or several entities; for the specific implementation of the above units, refer to the foregoing method embodiments, which are not repeated here.
The model training apparatus of the training data may specifically be integrated in a network device such as a server or a gateway.
As can be seen from the above, in the model training apparatus provided by this embodiment, the training data is aggregated to obtain aggregated training data, and an index vector is established whose values have absolute values indicating the positions of training data entries in the aggregated training data. During model training, values of the index vector are read randomly, the corresponding training data is fetched from the aggregated training data according to each value, and the fetched training data is used for model training. In the embodiment of the present invention, under the premise that the training data is aggregated, randomly reading the index vector values and fetching the corresponding training data from the aggregated training data guarantees the randomness of the training data used for model training, so the model training effect can be improved while memory is saved.
4th embodiment
An embodiment of the present invention further provides a server, in which the model training apparatus of the training data of the embodiments of the present invention may be integrated. As shown in Fig. 4, which illustrates a schematic structural diagram of the server involved in the embodiment of the present invention, specifically:
The server may comprise a processor 401 with one or more processing cores, a memory 402 with one or more computer-readable storage media, a radio frequency (RF) circuit 403, a power supply 404, an input unit 405, a display unit 406, and other components. Those skilled in the art will understand that the server structure shown in Fig. 4 does not limit the server; the server may comprise more or fewer components than illustrated, combine some components, or arrange the components differently. Wherein:
The processor 401 is the control center of the server. It connects all parts of the server through various interfaces and lines, and performs the various functions of the server and processes data by running or executing the software programs and/or modules stored in the memory 402 and calling the data stored in the memory 402, thereby monitoring the server as a whole. Optionally, the processor 401 may comprise one or more processing cores. Preferably, the processor 401 may integrate an application processor, which mainly handles the operating system, user interface, application programs and the like, and a modem processor, which mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 401.
The memory 402 may be configured to store software programs and modules, and the processor 401 executes various functional applications and data processing by running the software programs and modules stored in the memory 402. The memory 402 may mainly comprise a program storage area and a data storage area, where the program storage area may store the operating system and the application programs required by at least one function (such as a sound playing function or an image playing function), and the data storage area may store data created according to the use of the server. In addition, the memory 402 may comprise a high-speed random access memory, and may further comprise a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another solid-state storage device. Correspondingly, the memory 402 may further comprise a memory controller to provide the processor 401 with access to the memory 402.
The RF circuit 403 may be configured to receive and send signals during information transmission and reception; in particular, after receiving downlink information from a base station, it delivers the information to the one or more processors 401 for processing, and it sends uplink data to the base station. Generally, the RF circuit 403 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a subscriber identity module (SIM) card, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 403 may also communicate with networks and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System of Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Messaging Service (SMS), and so on.
The server further comprises a power supply 404 (such as a battery) for supplying power to the components. Preferably, the power supply may be logically connected to the processor 401 through a power management system, so that functions such as charging, discharging, and power consumption management are implemented through the power management system. The power supply 404 may further comprise one or more DC or AC power sources, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, or any other such component.
The server may also comprise an input unit 405, which may be configured to receive input numeric or character information, and to generate keyboard, mouse, joystick, optical, or trackball signal input related to user settings and function control.
The server may also comprise a display unit 406, which may be configured to display information input by the user or provided to the user, as well as the various graphical user interfaces of the server; these graphical user interfaces may be composed of graphics, text, icons, video, and any combination thereof. The display unit 406 may comprise a display panel, which may optionally be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
Specifically, in this embodiment, the processor 401 in the server loads the executable files corresponding to the processes of one or more application programs into the memory 402 according to the following instructions, and runs the application programs stored in the memory 402, thereby implementing various functions, as follows:
acquire original training data; aggregate the original training data to obtain aggregated training data; establish an index vector according to the original training data and the aggregated training data, where the absolute value of each index vector value indicates the position, in the aggregated training data, of a training data entry from the original training data; randomly read values of the index vector, and fetch the corresponding training data from the aggregated training data according to each value; and perform model training using the fetched training data.
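Putting these steps together, an end-to-end sketch might look like the following (toy data, illustrative hyper-parameters, sample classes encoded as labels in {-1, +1}; none of the names come from the patent):

```python
import math
import random

def train(original, labels, lr=0.1, max_iters=50, eps=1e-6):
    """Aggregate -> build signed index vector -> shuffle and read ->
    SGD updates of a logistic regression model -> stop on convergence
    or iteration cap. A hedged sketch, not the patented implementation."""
    # Aggregate: keep one copy of each distinct entry.
    aggregated, position = [], {}
    for row in original:
        if tuple(row) not in position:
            position[tuple(row)] = len(aggregated)
            aggregated.append(row)
    # Signed 1-based index vector: |v| is the position, the sign the class.
    vec = [(position[tuple(r)] + 1) * (1 if y > 0 else -1)
           for r, y in zip(original, labels)]
    w, prev_loss = [0.0] * len(original[0]), float("inf")
    for _ in range(max_iters):
        random.shuffle(vec)              # re-shuffle before every pass
        loss = 0.0
        for v in vec:
            x, y = aggregated[abs(v) - 1], (1 if v > 0 else -1)
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            loss += math.log(1.0 + math.exp(-margin))
            g = 1.0 / (1.0 + math.exp(margin))    # sigmoid(-margin)
            w = [wi + lr * y * g * xi for wi, xi in zip(w, x)]
        if abs(prev_loss - loss) < eps:  # converged: preset condition met
            break
        prev_loss = loss
    return w

w = train([[1.0], [2.0], [1.0], [-1.0]], [1, 1, 1, -1])
```

Only the aggregated rows and the small integer vector reside in memory, which is the memory saving the embodiments describe.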
Preferably, the processor 401 may further be configured to: perform a shuffle operation on the index vector, and read the values of the index vector sequentially from the shuffled index vector.
Preferably, the processor 401 may further be configured to: determine the quantity of training data in the original training data, determine the quantity of training data in the aggregated training data, and establish the index vector according to the quantity of training data in the original training data and the quantity of training data in the aggregated training data.
Preferably, the processor 401 may further be configured to: determine, according to the value, the sample class indicated by the corresponding training data, the sample class comprising a positive sample and a negative sample; perform gradient calculation on the corresponding training data according to the sample class; and perform model training according to the result of the gradient calculation.
Preferably, the processor 401 may further be configured to: perform model iterative training based on the logistic regression model structure using the fetched training data, obtaining the logistic regression model after iteration.
Preferably, the processor 401 may further be configured to: judge whether the logistic regression model after iteration meets a preset condition; if the model is judged to meet the preset condition, save the logistic regression model after iteration; and if the model is judged not to meet the preset condition, trigger execution of the shuffle operation on the index vector.
Preferably, the processor 401 may further be configured to: judge that the logistic regression model after iteration meets the preset condition if it is determined that the model has converged or the number of iterations has reached a preset threshold; and judge that the model does not meet the preset condition if it is determined that the model has not converged and the number of iterations has not reached the preset threshold.
As can be seen from the above, in the server provided by this embodiment, the original training data is aggregated to obtain aggregated training data, and an index vector is established whose values have absolute values indicating the positions, in the aggregated training data, of training data entries from the original training data. During model training, values of the index vector are read randomly, the corresponding training data is fetched from the aggregated training data according to each value, and the fetched training data is used for model training. In the embodiment of the present invention, under the premise that the training data is aggregated, randomly reading the index vector values and fetching the corresponding training data from the aggregated training data guarantees the randomness of the training data used for model training, so the model training effect can be improved while memory is saved.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, refer to the detailed description of the model training method above, which is not repeated here.
The model training apparatus provided by the embodiment of the present invention may be, for example, a computer, a tablet computer, or a mobile phone with a touch function. The model training apparatus and the model training method in the foregoing embodiments belong to the same concept; any of the methods provided in the method embodiments may be run on the apparatus, and the specific implementation process is described in the method embodiments and is not repeated here.
It should be noted that, for the model training method of the present invention, those of ordinary skill in the art will understand that all or part of the flow of the model training method described in the embodiments of the present invention can be completed by a computer program controlling the relevant hardware. The computer program may be stored in a computer-readable storage medium, for example in the memory of a terminal, and executed by at least one processor in the terminal; the execution process may include the flow of the embodiments of the model training method. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
For the model training apparatus of the embodiment of the present invention, its functional modules may be integrated into one processing chip, each module may exist physically alone, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. When implemented in the form of a software functional module and sold or used as an independent product, the integrated module may also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disc.
The model training method and apparatus provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is only intended to help understand the method and core idea of the present invention. Meanwhile, those skilled in the art will make changes to the specific implementations and application scope according to the idea of the present invention. In summary, the content of this description should not be construed as limiting the present invention.

Claims (14)

1. A model training method for training data, comprising:
acquiring original training data;
aggregating the original training data to obtain aggregated training data;
establishing an index vector according to the original training data and the aggregated training data, wherein the absolute value of each index vector value indicates the position, in the aggregated training data, of a training data entry from the original training data;
randomly reading values of the index vector, and fetching the corresponding training data from the aggregated training data according to each value; and
performing model training using the fetched training data.
2. The model training method according to claim 1, wherein the randomly reading values of the index vector comprises:
performing a shuffle operation on the index vector; and
reading the values of the index vector sequentially from the shuffled index vector.
3. The model training method according to claim 2, wherein the establishing an index vector according to the original training data and the aggregated training data comprises:
determining the quantity of training data in the original training data, and determining the quantity of training data in the aggregated training data; and
establishing the index vector according to the quantity of training data in the original training data and the quantity of training data in the aggregated training data.
4. The model training method according to claim 2, further comprising, after the fetching the corresponding training data from the aggregated training data according to each value:
determining, according to the value, the sample class indicated by the corresponding training data, the sample class comprising a positive sample and a negative sample; and
performing gradient calculation on the corresponding training data according to the sample class;
wherein the performing model training using the fetched training data comprises: performing model training according to the result of the gradient calculation.
5. The model training method according to any one of claims 2 to 4, wherein the performing model training using the fetched training data comprises:
performing model iterative training based on a logistic regression model structure using the fetched training data, to obtain a logistic regression model after iteration.
6. The model training method according to claim 5, further comprising, after the obtaining a logistic regression model after iteration:
judging whether the logistic regression model after the iteration meets a preset condition;
if the logistic regression model after the iteration is judged to meet the preset condition, saving the logistic regression model after the iteration; and
if the logistic regression model after the iteration is judged not to meet the preset condition, triggering execution of the performing a shuffle operation on the index vector.
7. The model training method according to claim 6, wherein the judging whether the logistic regression model after the iteration meets a preset condition comprises:
if it is determined that the logistic regression model after the iteration has converged or the number of iterations has reached a preset threshold, judging that the logistic regression model after the iteration meets the preset condition; and
if it is determined that the logistic regression model after the iteration has not converged and the number of iterations has not reached the preset threshold, judging that the logistic regression model after the iteration does not meet the preset condition.
8. A model training apparatus for training data, comprising:
an acquiring unit, configured to acquire original training data;
an aggregating unit, configured to aggregate the original training data to obtain aggregated training data;
a vector establishing unit, configured to establish an index vector according to the original training data and the aggregated training data, wherein the absolute value of each index vector value indicates the position, in the aggregated training data, of a training data entry from the original training data;
a reading unit, configured to randomly read values of the index vector, and fetch the corresponding training data from the aggregated training data according to each value; and
a training unit, configured to perform model training using the fetched training data.
9. The model training apparatus according to claim 8, wherein the reading unit comprises:
a shuffling subunit, configured to perform a shuffle operation on the index vector; and
a reading subunit, configured to read the values of the index vector sequentially from the shuffled index vector.
10. The model training apparatus according to claim 9, wherein the vector establishing unit comprises:
a determining subunit, configured to determine the quantity of training data in the original training data, and determine the quantity of training data in the aggregated training data; and
an establishing subunit, configured to establish the index vector according to the quantity of training data in the original training data and the quantity of training data in the aggregated training data.
11. The model training apparatus according to claim 9, further comprising:
a determining unit, configured to determine, according to the value, the sample class indicated by the corresponding training data, the sample class comprising a positive sample and a negative sample; and
a computing unit, configured to perform gradient calculation on the corresponding training data according to the sample class;
wherein the training unit is configured to perform model training according to the result of the gradient calculation.
12. The model training apparatus according to any one of claims 9 to 11, wherein the training unit is specifically configured to perform model iterative training based on a logistic regression model structure using the fetched training data, to obtain a logistic regression model after iteration.
13. The model training apparatus for training data according to claim 12, wherein the apparatus further comprises:
A judging unit, configured to judge whether the iterated logistic regression model meets a preset condition;
A storage unit, configured to save the iterated logistic regression model if the judging unit judges that the preset condition is met;
A triggering unit, configured to trigger execution of the shuffle operation on the index vector if the judging unit judges that the preset condition is not met.
14. The model training apparatus for training data according to claim 13, wherein the judging unit comprises:
A first judging subunit, configured to judge that the iterated logistic regression model meets the preset condition if the model has converged or the number of iterations has reached a preset threshold;
A second judging subunit, configured to judge that the iterated logistic regression model does not meet the preset condition if the model has not converged and the number of iterations has not reached the preset threshold.
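Taken together, claims 9–14 describe a training loop: build an index vector covering the original and aggregated training data, shuffle it, read indices in order, classify each record as a positive or negative sample from its stored value, take a logistic-regression gradient step per sample, and repeat until the model converges or an iteration cap is reached. A minimal sketch of that loop follows; all function names, the learning rate, and the threshold values are illustrative assumptions, not taken from the patent itself.

```python
import numpy as np

def train_logistic_regression(X, y, lr=0.1, max_iters=100, tol=1e-6, seed=0):
    """Iterative logistic-regression training driven by a shuffled index vector.

    X: (n, d) feature matrix covering both original and aggregated records.
    y: (n,) stored values; a value > 0 marks a positive sample, else negative.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    index_vector = np.arange(n)              # one index per training record
    for _ in range(max_iters):
        rng.shuffle(index_vector)            # shuffle operation on the index vector
        prev = w.copy()
        for i in index_vector:               # read values from the shuffled vector in order
            label = 1.0 if y[i] > 0 else 0.0     # sample class from the stored value
            pred = 1.0 / (1.0 + np.exp(-X[i] @ w))
            w -= lr * (pred - label) * X[i]      # gradient step for this sample
        if np.linalg.norm(w - prev) < tol:       # converged: preset condition met
            break                                # (iteration cap is the other condition)
    return w
```

Shuffling only the index vector, rather than the data itself, is what lets the original and aggregated records stay in place while still being visited in a randomized order each epoch.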
CN201510362322.5A 2015-06-26 2015-06-26 Model training method and device for training data Active CN105045819B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510362322.5A CN105045819B (en) 2015-06-26 2015-06-26 Model training method and device for training data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510362322.5A CN105045819B (en) 2015-06-26 2015-06-26 Model training method and device for training data

Publications (2)

Publication Number Publication Date
CN105045819A true CN105045819A (en) 2015-11-11
CN105045819B CN105045819B (en) 2018-04-20

Family

ID=54452366

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510362322.5A Active CN105045819B (en) Model training method and device for training data

Country Status (1)

Country Link
CN (1) CN105045819B (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017124974A1 (en) * 2016-01-22 2017-07-27 阿里巴巴集团控股有限公司 Calculation method and device for logical regression gradient
CN107220217A (en) * 2017-05-31 2017-09-29 北京京东尚科信息技术有限公司 Feature coefficient training method and device based on logistic regression
CN108229517A (en) * 2017-01-24 2018-06-29 北京市商汤科技开发有限公司 Neural network training and hyperspectral image decomposition method, device and electronic equipment
CN108305619A (en) * 2017-03-10 2018-07-20 腾讯科技(深圳)有限公司 Voice data set training method and device
WO2018166113A1 (en) * 2017-03-13 2018-09-20 平安科技(深圳)有限公司 Random forest model training method, electronic apparatus and storage medium
CN108573392A (en) * 2017-03-09 2018-09-25 合网络技术(北京)有限公司 Advertisement placement method and device
WO2019042450A1 (en) * 2017-09-04 2019-03-07 华为技术有限公司 Natural language processing method and apparatus
CN109740632A (en) * 2018-12-07 2019-05-10 百度在线网络技术(北京)有限公司 Similarity model training method and device based on multiple sensors and multiple measured objects
CN109784495A (en) * 2018-12-14 2019-05-21 东软集团股份有限公司 Method and device for establishing a feature processing flow, storage medium, and electronic device
CN109961198A (en) * 2017-12-25 2019-07-02 北京京东尚科信息技术有限公司 Related information generation method and device
CN110046952A (en) * 2019-01-30 2019-07-23 阿里巴巴集团控股有限公司 Training method and device for a recommendation model, and recommendation method and device
CN111092935A (en) * 2019-11-27 2020-05-01 中国联合网络通信集团有限公司 Data sharing method and virtual training device for machine learning
CN111157812A (en) * 2019-12-28 2020-05-15 杭州拓深科技有限公司 Power equipment fault monitoring method based on electromagnetic induction
CN111597934A (en) * 2020-04-30 2020-08-28 重庆科技学院 System and method for processing training data for statistical applications
CN113935407A (en) * 2021-09-29 2022-01-14 光大科技有限公司 Abnormal behavior recognition model determining method and device
CN118172225A (en) * 2024-05-16 2024-06-11 蓝象智联(杭州)科技有限公司 Watermark embedding method, training method and verification method of logistic regression model

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101361071A (en) * 2005-11-25 2009-02-04 法国参考网 Method for real time data processing to produce indexing of an advertisement in internet research tools
CN102129462A (en) * 2011-03-11 2011-07-20 北京航空航天大学 Method for optimizing collaborative filtering recommendation system by aggregation
US20110231256A1 (en) * 2009-07-25 2011-09-22 Kindsight, Inc. Automated building of a model for behavioral targeting
CN102567391A (en) * 2010-12-20 2012-07-11 ***通信集团广东有限公司 Method and device for building classification forecasting mixed model
CN103996088A (en) * 2014-06-10 2014-08-20 苏州工业职业技术学院 Advertisement click-through rate prediction method based on multi-dimensional feature combination logical regression
CN104536983A (en) * 2014-12-08 2015-04-22 北京掌阔技术有限公司 Method and device for predicting advertisement click rate

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101361071A (en) * 2005-11-25 2009-02-04 法国参考网 Method for real time data processing to produce indexing of an advertisement in internet research tools
US20110231256A1 (en) * 2009-07-25 2011-09-22 Kindsight, Inc. Automated building of a model for behavioral targeting
CN102567391A (en) * 2010-12-20 2012-07-11 ***通信集团广东有限公司 Method and device for building classification forecasting mixed model
CN102129462A (en) * 2011-03-11 2011-07-20 北京航空航天大学 Method for optimizing collaborative filtering recommendation system by aggregation
CN103996088A (en) * 2014-06-10 2014-08-20 苏州工业职业技术学院 Advertisement click-through rate prediction method based on multi-dimensional feature combination logical regression
CN104536983A (en) * 2014-12-08 2015-04-22 北京掌阔技术有限公司 Method and device for predicting advertisement click rate

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Wang Bing: "Research on a Search Advertising Click-Through Rate Prediction Method Based on Logistic Regression Models", China Masters' Theses Full-text Database, Information Science and Technology *
Ji Wendi et al.: "A Survey of Advertising Click-Through Rate Estimation Techniques", Journal of East China Normal University (Natural Science) *

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10970596B2 (en) 2016-01-22 2021-04-06 Alibaba Group Holding Limited Logistic regression gradient calculation method and apparatus
WO2017124974A1 (en) * 2016-01-22 2017-07-27 阿里巴巴集团控股有限公司 Calculation method and device for logical regression gradient
CN108229517A (en) * 2018-06-29 Neural network training and hyperspectral image decomposition method, device and electronic equipment
CN108229517B (en) * 2017-01-24 2020-08-04 北京市商汤科技开发有限公司 Neural network training and hyperspectral image interpretation method and device and electronic equipment
CN108573392A (en) * 2018-09-25 Advertisement placement method and device
CN108305619A (en) * 2018-07-20 Voice data set training method and device
CN108305619B (en) * 2017-03-10 2020-08-04 腾讯科技(深圳)有限公司 Voice data set training method and device
WO2018166113A1 (en) * 2017-03-13 2018-09-20 平安科技(深圳)有限公司 Random forest model training method, electronic apparatus and storage medium
CN107220217A (en) * 2017-09-29 Feature coefficient training method and device based on logistic regression
US11630957B2 (en) 2017-09-04 2023-04-18 Huawei Technologies Co., Ltd. Natural language processing method and apparatus
WO2019042450A1 (en) * 2017-09-04 2019-03-07 华为技术有限公司 Natural language processing method and apparatus
CN109961198B (en) * 2017-12-25 2021-12-31 北京京东尚科信息技术有限公司 Associated information generation method and device
CN109961198A (en) * 2017-12-25 2019-07-02 北京京东尚科信息技术有限公司 Related information generation method and device
CN109740632B (en) * 2018-12-07 2023-11-10 阿波罗智能技术(北京)有限公司 Similarity model training method and device based on multiple sensors and multiple measured objects
CN109740632A (en) * 2019-05-10 Similarity model training method and device based on multiple sensors and multiple measured objects
CN109784495B (en) * 2018-12-14 2021-05-04 东软集团股份有限公司 Method and device for establishing characteristic processing flow, storage medium and electronic equipment
CN109784495A (en) * 2019-05-21 Method and device for establishing a feature processing flow, storage medium, and electronic device
CN110046952B (en) * 2019-01-30 2021-12-10 创新先进技术有限公司 Recommendation model training method and device, and recommendation method and device
CN110046952A (en) * 2019-07-23 Training method and device for a recommendation model, and recommendation method and device
CN111092935B (en) * 2019-11-27 2022-07-12 中国联合网络通信集团有限公司 Data sharing method and virtual training device for machine learning
CN111092935A (en) * 2019-11-27 2020-05-01 中国联合网络通信集团有限公司 Data sharing method and virtual training device for machine learning
CN111157812A (en) * 2019-12-28 2020-05-15 杭州拓深科技有限公司 Power equipment fault monitoring method based on electromagnetic induction
CN111597934A (en) * 2020-04-30 2020-08-28 重庆科技学院 System and method for processing training data for statistical applications
CN113935407A (en) * 2021-09-29 2022-01-14 光大科技有限公司 Abnormal behavior recognition model determining method and device
CN118172225A (en) * 2024-05-16 2024-06-11 蓝象智联(杭州)科技有限公司 Watermark embedding method, training method and verification method of logistic regression model
CN118172225B (en) * 2024-05-16 2024-07-23 蓝象智联(杭州)科技有限公司 Watermark embedding method, training method and verification method of logistic regression model

Also Published As

Publication number Publication date
CN105045819B (en) 2018-04-20

Similar Documents

Publication Publication Date Title
CN105045819A (en) Model training method and device for training data
TWI682304B (en) Abnormal account prevention and control method, device and equipment based on graph structure model
CN104133765B (en) The test case sending method of network activity and test case server
CN113315742B (en) Attack behavior detection method and device and attack detection equipment
CN110210605B (en) Hardware operator matching method and related product
CN113010896B (en) Method, apparatus, device, medium and program product for determining abnormal object
CN107437189A (en) Method, apparatus and system for delivering promotion information
CN105763431A (en) Information pushing method, device and system
CN107545451A (en) Advertisement sending method and device
CN112700006B (en) Network architecture searching method, device, electronic equipment and medium
CN113795039B (en) Operator network switching method, device, equipment and computer readable storage medium
CN114095567A (en) Data access request processing method and device, computer equipment and medium
CN106257507B (en) Risk assessment method and device for user behavior
CN104968008A (en) Access scheduling method, apparatus and system
CN105023170A (en) Processing method and device of click stream data
CN105005588B (en) Training data processing method and device
CN111813517A (en) Task queue allocation method and device, computer equipment and medium
CN105094278A (en) information processing method and device
CN108632054B (en) Information transmission quantity prediction method and device
CN102237117A (en) Method and device for calculating service life of mobile storage device
CN104035871B (en) Based on fault handling method and the device of the application program in geographic position
CN109743755A (en) Link aggregation rights management method and related product
WO2015084224A1 (en) User service prediction in a communication network
CN110086835A (en) Application program management-control method, terminal, server and system
CN114091773A (en) System life prediction method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant