CN112990480A - Method and device for building model, electronic equipment and storage medium - Google Patents
Method and device for building model, electronic equipment and storage medium
- Publication number
- CN112990480A (application number CN202110262260.6A)
- Authority
- CN
- China
- Prior art keywords
- model
- target
- user
- training
- target parameters
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiment of the invention provides a method and a device for constructing a model, electronic equipment and a storage medium. The model building method of the embodiment of the invention comprises the following steps: acquiring a target parameter; selecting target features related to the target parameter according to the target parameter and an automatic feature selection model, wherein the feature selection model is obtained by training a reinforcement learning algorithm on user data authorized by the user, the user data comprises embedded vectors and features, the embedded vectors are extracted from user behavior data through a feature extraction model and represent user behaviors, the features are labeled in advance, and both are stored in a database in advance; and training a model that takes the target features as input and the target parameter as output, so as to determine the target model. By determining target features from embedded vectors, the method can improve the efficiency of model training.
Description
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a method and a device for constructing a model, electronic equipment and a storage medium.
Background
With the rapid development of electronic technology, user behavior generates huge amounts of data on the network. Data analysis can aggregate and extract the information hidden in large volumes of disordered data, thereby revealing the internal rules of the object under study. Data is typically analyzed by building a data model. However, model development cycles are long and inefficient, so a method for automatically constructing models is needed.
Disclosure of Invention
Based on the above problems, the embodiments of the present invention provide a new technical solution, in which target features related to a target parameter are determined according to a feature selection model, and a target model is trained on those target features. This enables automatic feature selection and improves the training efficiency of the model.
According to a first aspect of an embodiment of the present invention, a method for constructing a model is provided, which includes:
obtaining target parameters, wherein the target parameters are used for representing behavior preference related to a user;
selecting target characteristics related to target parameters according to target parameters and an automatic characteristic selection model, wherein the characteristic selection model is obtained by training through a reinforcement learning algorithm by taking user data authorized by a user as training data, the user data comprises embedded vectors and characteristics, the embedded vectors are extracted from user behavior data through a characteristic extraction model and used for representing user behaviors, the characteristics are user attributes marked in advance, and the embedded vectors and the characteristics are stored in a database in advance; and
and determining a target model by taking the target characteristics related to the target parameters as input and the target parameters as an output training model, wherein the target model is used for predicting the behavior of the user.
Preferably, extracting the embedded vector comprises:
determining a time sequence matrix of the user according to the user behavior data;
embedding the time sequence matrix to generate an initial vector;
performing a convolution operation on the initial vector to generate a plurality of convolution vectors;
performing global average pooling and full connection on the plurality of convolution vectors to extract an embedded vector.
Preferably, the embedding process includes: processing the time sequence matrix with a CBOW algorithm or a Skip-Gram algorithm.
Preferably, the convolution operation comprises:
carrying out 1 × 1 convolution dimensionality reduction on the initial vector; and
and carrying out multidimensional convolution on the vector obtained by the dimension reduction processing to generate a plurality of different convolution vectors.
Preferably, the training model with the target characteristics related to the target parameters as input and the target parameters as output includes:
and automatically tuning parameters by adopting any one method of grid search, random search and Bayesian optimization.
Preferably, the method further comprises:
automatically updating the target model in response to model attenuation being less than a predetermined threshold; and
in response to the model decay being greater than the predetermined threshold, target features associated with the target parameters are selected again according to the target parameters and the automatic feature selection model to retrain the target model again.
According to a second aspect of an embodiment of the present invention, there is provided an apparatus for constructing a model, including:
the parameter acquisition unit is used for acquiring target parameters, and the target parameters are used for representing behavior preference related to a user;
the system comprises a characteristic selection unit, a characteristic selection unit and a characteristic selection unit, wherein the characteristic selection unit is used for selecting target characteristics related to target parameters according to the target parameters and an automatic characteristic selection model, the characteristic selection model is obtained by training through a reinforcement learning algorithm by taking user data authorized by a user as training data, the user data comprises an embedded vector and characteristics, the embedded vector is extracted from user behavior data through a characteristic extraction model and is used for representing user behaviors, the characteristics are user attributes marked in advance, and the embedded vector and the characteristics are stored in a database in advance; and
and the model determining unit is used for determining a target model by taking the target characteristics related to the target parameters as input and the target parameters as output training models, and the target model is used for predicting the behavior of the user.
Preferably, extracting the embedded vector comprises:
determining a time sequence matrix of the user according to the user behavior data;
embedding the time sequence matrix to generate an initial vector;
performing a convolution operation on the initial vector to generate a plurality of convolution vectors;
performing global average pooling and full connection on the plurality of convolution vectors to extract an embedded vector.
Preferably, the embedding process includes: processing the time sequence matrix with a CBOW algorithm or a Skip-Gram algorithm.
Preferably, the convolution operation comprises:
carrying out 1 × 1 convolution dimensionality reduction on the initial vector; and
and carrying out multidimensional convolution on the vector obtained by the dimension reduction processing to generate a plurality of different convolution vectors.
Preferably, the model determination unit includes:
and the automatic parameter adjusting module is used for automatically adjusting parameters by adopting any one method of grid search, random search and Bayesian optimization.
Preferably, the apparatus further comprises:
a first model updating unit for automatically updating the target model in response to a model decay being less than a predetermined threshold; and
and the second model updating unit is used for responding to the model attenuation larger than the preset threshold value, and selecting the target characteristics related to the target parameters again according to the target parameters and the automatic characteristic selection model so as to train the target model again.
According to a third aspect of embodiments of the present invention, a computer-readable storage medium is presented, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method according to any of the first aspects.
According to a fourth aspect of embodiments of the present invention, an electronic device is presented, wherein the electronic device comprises a processor for implementing the method according to any of the first aspect when executing a computer program stored in a memory.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent from the following description of the embodiments of the present invention with reference to the accompanying drawings, in which:
FIG. 1 is a schematic flow chart diagram of a method of constructing a model according to an embodiment of the invention;
FIG. 2 is a flow chart of the extraction of the embedded vector according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a timing matrix according to an embodiment of the invention;
FIG. 4 is a schematic diagram of a feature extraction model of an embodiment of the present invention;
FIG. 5 is a schematic diagram of a model building apparatus according to an embodiment of the present invention;
fig. 6 is a schematic diagram of an electronic device of an embodiment of the invention.
Detailed Description
The embodiments of the present invention are described below based on examples, but the embodiments of the present invention are not limited to only these examples. In the following detailed description of embodiments of the invention, certain specific details are set forth. It will be apparent to one skilled in the art that the present invention may be practiced without these specific details. Well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the embodiments of the invention.
Further, those of ordinary skill in the art will appreciate that the drawings provided herein are for illustrative purposes and are not necessarily drawn to scale.
Unless the context clearly requires otherwise, in embodiments of the invention, the words "comprise", "comprising", and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is, what is meant is "including, but not limited to".
In the description of the embodiments of the present invention, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Further, in the description of the embodiments of the present invention, "a plurality" means two or more unless otherwise specified.
In the existing model training process, many algorithm labels related to people (mainly passengers and drivers) are used in production, such as basic attributes (gender, age and the like), the probability that a passenger becomes a high-conversion user, the probability that a driver becomes a referrer, and so on. These algorithm labels all center on people and could share many features; however, such features are not fully reused. It is therefore desirable to share labeled features across the training processes of different models, reducing the data-labeling workload and thereby improving the efficiency of model training.
Meanwhile, tuning the parameters of a model takes a long time, resulting in long training cycles and low efficiency. Data mining tasks in most scenarios mainly consist of feature association, feature screening, model establishment and model deployment, and these workflows are highly similar across scenarios. To improve the efficiency of model training, an automated data mining and modeling system therefore needs to be constructed.
A user generates a large amount of behavior data through a user terminal, yet effective means are lacking for mining useful information from this vast trove. Features extracted from raw data by window-sliding methods are too coarse, and much information may be lost during feature processing. On the other hand, the engineered features can number in the thousands of dimensions and are difficult to use in practice. To solve these problems, the embodiment of the present invention adopts an unsupervised embedding scheme: an unsupervised model is trained to extract embedded features from raw behavior data. Because training is unsupervised, the extracted features do not drift toward any supervision signal, which strengthens their generalization ability; the embedding also greatly reduces feature dimensionality and improves the computational efficiency of the model.
In this embodiment, the model building method of the present application is described by taking the data generated in ride-hailing and car-rental scenarios as an example.
FIG. 1 is a flow chart of a method of constructing a model according to an embodiment of the invention. As shown in fig. 1, the method for constructing a model according to the embodiment of the present invention includes:
step S110, a target parameter is acquired.
The target parameter is an output result of the target model and is used to characterize a behavioral preference associated with the user. Passengers and drivers generate various kinds of user behavior data through different user terminals, such as a passenger hailing or renting a car, or a driver refueling and maintaining a vehicle. For example, the target parameter may be the probability of a passenger becoming a high-conversion user or the probability of a driver becoming a referrer. Target parameters may also include the probability of a passenger taking a ride, the probability of a passenger placing a high-value order, the probability of a driver accepting or rejecting an order, and so on.
Step S120, selecting target features related to the target parameters according to the target parameters and the automatic feature selection model. The feature selection model is obtained by training a Deep Q-Network (DQN), a reinforcement learning algorithm, on user data authorized by the user as training data. The user data comprises embedded vectors and features: the embedded vectors are extracted from the user behavior data through a feature extraction model and represent user behaviors, the features are user attributes labeled in advance, and both are stored in a database in advance.
The characteristic may include an attribute of the passenger or the driver. Attributes of a passenger or driver, for example, may include: "age", "gender" and "educational level", etc.
The user data is pre-processed and pre-stored in the database in the form of labeled features and embedded vectors. The database also stores metadata of these features in advance, such as mean, variance, and kurtosis.
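As a hedged illustration of the metadata step, the per-feature statistics named above (mean, variance, kurtosis) could be precomputed with pandas before being stored; the column names here are hypothetical, not from the patent:

```python
import pandas as pd

def feature_metadata(df: pd.DataFrame) -> pd.DataFrame:
    """Summarize each numeric feature column for later feature selection."""
    return pd.DataFrame({
        "mean": df.mean(),
        "variance": df.var(),
        "kurtosis": df.kurt(),  # excess kurtosis, pandas default
    })

# Illustrative user features
features = pd.DataFrame({
    "age": [22, 35, 41, 29, 58],
    "orders_per_week": [1.0, 3.5, 2.0, 0.5, 4.0],
})
meta = feature_metadata(features)
print(meta.loc["age", "mean"])  # 37.0
```

Such a table would then be written to the database alongside the features themselves.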
Feature selection, which may be guided by measured feature importance, is used to pick the features that best determine the target parameters. The feature selection model takes the target parameters and the embedded vectors as input, outputs the selected features, and is determined by training with a reinforcement learning algorithm.
The embedding vector is determined by a feature extraction model, wherein the feature extraction model is determined by pre-training. Specifically, the feature extraction model is determined while training the feature selection model.
FIG. 2 is a flow chart of the extraction of the embedded vector according to an embodiment of the present invention. FIG. 3 is a schematic diagram of a timing matrix according to an embodiment of the invention. FIG. 4 is a schematic diagram of a feature extraction model according to an embodiment of the invention. As shown in fig. 2 to 4, in step S120, extracting the embedded vector from the user behavior data through the feature extraction model specifically includes:
step S121, determining a time sequence matrix of the user according to the user behavior data.
As shown in fig. 3, time is one dimension of the timing matrix and behavior is another dimension of the timing matrix.
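A minimal sketch of the time-sequence matrix described above: one axis is time (here, a day index), the other is behavior type, and each cell counts occurrences. The behavior names are hypothetical examples for the ride-hailing scenario:

```python
import numpy as np

BEHAVIORS = ["open_app", "request_ride", "cancel_order", "pay"]

def timing_matrix(events, n_days):
    """Build a (behavior x time) count matrix from (day_index, behavior) events."""
    m = np.zeros((len(BEHAVIORS), n_days), dtype=int)
    for day, behavior in events:
        m[BEHAVIORS.index(behavior), day] += 1
    return m

events = [(0, "open_app"), (0, "request_ride"), (1, "open_app"), (2, "pay")]
m = timing_matrix(events, n_days=3)
print(m.shape)  # (4, 3)
```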
And step S122, performing embedding processing on the time sequence matrix to generate an initial vector.
Specifically, the embedding process includes: the Continuous Bag Of Words (CBOW) algorithm and the Skip-Gram (Continuous Skip-Gram) algorithm process the timing matrix. It should be understood that in alternative implementations, other embedding algorithms may be used for the embedding process, such as a graph embedding algorithm, etc.
Step S123, performing convolution operation on the initial vector to generate a plurality of convolution vectors.
Specifically, the convolution operation includes: carrying out convolution dimensionality reduction on the initial vector, that is, subjecting the initial vector to 1 × 1 convolution processing; and carrying out multidimensional convolution on the vector obtained by the dimensionality reduction to generate a plurality of different convolution vectors.
And step S124, performing global average pooling and full connection on the convolution vectors to extract an embedded vector.
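Steps S122 to S124 can be sketched in plain numpy: a 1 × 1 convolution reduces the channel dimension, several parallel 1-D convolutions of different kernel sizes produce the convolution vectors, and global average pooling plus a fully connected layer yields the embedded vector. Weights are random and all shapes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 30))        # initial vector: 16 channels x 30 time steps

# 1x1 convolution = per-step linear map across channels (16 -> 4)
w_1x1 = rng.normal(size=(4, 16))
reduced = w_1x1 @ x                   # (4, 30)

def conv1d(signal, kernel):
    """Valid-mode 1-D convolution applied to each channel independently."""
    return np.stack([np.convolve(ch, kernel, mode="valid") for ch in signal])

# Multi-scale convolutions produce several different convolution vectors
conv_vecs = [conv1d(reduced, rng.normal(size=k)) for k in (2, 3, 5)]

# Global average pooling over time, then concatenation
pooled = np.concatenate([cv.mean(axis=1) for cv in conv_vecs])  # (12,)

# Fully connected layer yields the final embedded vector
w_fc = rng.normal(size=(8, pooled.size))
embedded = w_fc @ pooled
print(embedded.shape)  # (8,)
```

In practice the weights would of course be learned, e.g. with a deep learning framework, rather than drawn at random.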
In the embodiment of the invention, a feature selection algorithm based on Q-learning and DQN is adopted: features are divided into a plurality of feature groups according to the business domain, and the algorithm iterates over different feature groups, training models and adjusting its strategy so as to continuously search for the optimal feature combination. This preserves model quality while reducing manual intervention, improving automation, and increasing selection efficiency.
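A toy sketch of the Q-learning idea above: the agent repeatedly picks a feature group to add, and the improvement in a validation score serves as the reward. The feature groups and the score table are hypothetical stubs (a real system would train a model per subset), and this tabular, one-step version omits the deep network of DQN:

```python
import random

random.seed(0)
GROUPS = ["demographics", "order_history", "app_usage"]

# Stubbed validation score for each subset of feature groups
SCORES = {
    frozenset(): 0.50,
    frozenset({"demographics"}): 0.55,
    frozenset({"order_history"}): 0.62,
    frozenset({"app_usage"}): 0.53,
    frozenset({"demographics", "order_history"}): 0.66,
    frozenset({"demographics", "app_usage"}): 0.57,
    frozenset({"order_history", "app_usage"}): 0.64,
    frozenset(GROUPS): 0.65,
}

q = {}                      # Q[(state, action)] -> estimated reward
alpha, epsilon = 0.5, 0.2   # learning rate, exploration rate
for _ in range(200):
    state = frozenset()
    while len(state) < len(GROUPS):
        actions = [g for g in GROUPS if g not in state]
        if random.random() < epsilon:
            action = random.choice(actions)                      # explore
        else:                                                    # exploit
            action = max(actions, key=lambda g: q.get((state, g), 0.0))
        nxt = state | {action}
        reward = SCORES[nxt] - SCORES[state]                     # score gain
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + alpha * (reward - old)
        state = nxt

best_first = max(GROUPS, key=lambda g: q.get((frozenset(), g), 0.0))
print(best_first)
```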
Step S130, the target characteristics related to the target parameters are used as input, the target parameters are used as an output training model, and a target model is determined and used for predicting the behavior of the user.
The objective model is used for predicting the behavior of the user, and specifically, the objective parameter may be determined according to user data. For example, the goal model may predict whether a passenger will cancel an order, whether a driver will take an order, and so on.
Specifically, with the target features related to the target parameters as input and the target parameters as output, a plurality of different models are trained with different algorithms, each model is hyper-parameter tuned and optimized separately, and the model with the highest resulting accuracy is determined as the target model.
The different algorithms may include, but are not limited to, a Logistic Regression (LR) algorithm, a Gradient Boosting Decision Tree (GBDT) algorithm, the eXtreme Gradient Boosting library (XGBoost), a deep learning algorithm, or an end-to-end algorithm.
The LR algorithm is a model commonly used for classification tasks in machine learning; it is essentially a generalized linear regression model, with a simple structure, fast training, and a good probabilistic interpretation of its output. The XGBoost algorithm is a scalable machine learning system, available as an open-source package, whose influence has been widely recognized in machine learning and data mining competitions; in the embodiment of the invention, XGBoost can classify well as the data volume keeps growing. The deep learning model is built on a deep neural network and can achieve accurate classification thanks to its strong learning ability. The end-to-end model differs from a traditional machine learning model (composed of several independent modules): it integrates the modules and trains them as a whole, simplifying the training process and increasing fault tolerance. The GBDT model is a decision-tree model trained with the gradient boosting strategy and can classify data based on decision trees. In addition, a GBDT model used alone is prone to overfitting; therefore, in practical applications, the GBDT and LR models may be combined to implement the data classification function, that is, behavior classification via a GBDT + LR model.
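The GBDT + LR combination mentioned above is usually realized by feeding the leaf indices of the boosted trees, one-hot encoded, into a logistic regression. A sketch with scikit-learn on an illustrative synthetic dataset:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

gbdt = GradientBoostingClassifier(n_estimators=30, random_state=0).fit(X_tr, y_tr)

# Each sample is re-encoded by the leaf it falls into in every tree
enc = OneHotEncoder(handle_unknown="ignore")
leaves_tr = gbdt.apply(X_tr).reshape(len(X_tr), -1)
lr = LogisticRegression(max_iter=1000).fit(enc.fit_transform(leaves_tr), y_tr)

leaves_te = gbdt.apply(X_te).reshape(len(X_te), -1)
acc = lr.score(enc.transform(leaves_te), y_te)
print(round(acc, 3))
```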
And the super-parameter tuning in the model training process adopts any one of grid search, random search and Bayesian optimization to automatically tune parameters.
Automatic parameter tuning adjusts the optimal hyper-parameters of the model through an automatic learning algorithm, so as to approach, or even exceed, the best results of manual tuning. Automatic tuning algorithms can be managed in an integrated way and provided as a service, realizing a true end-to-end machine learning pipeline.
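As a hedged sketch of the tuning step, grid search over an LR model is shown below; random search (`RandomizedSearchCV`) or Bayesian optimization (via external libraries) could be swapped in, per the methods named above. The grid values and dataset are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, n_features=8, random_state=0)

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},  # hyper-parameter grid
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)
print(search.best_params_["C"])
```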
In an alternative implementation, take producing a high-consumption prediction label for users as an example: prepare a sample table of high-consumption users, select the required features and the model to be tested (LR/XGBoost), create an experiment, and then automatically perform hyper-parameter tuning on the model.
Specifically, during the hyper-parameter tuning process, the hyper-parameters are evaluated. In response to the failure of the evaluation, target features related to the target parameters are selected again according to the target parameters and the automatic feature selection model to train the target model again. In response to the evaluation passing, a set of hyper-parameters is selected to automatically deploy the target model.
For different types of data, the training and test sets are split in different ways, and the target model is evaluated with different cross-validation schemes and validation indices. The validation indices may include mean absolute error, mean squared error, and the R-squared value, among others. Specifically, predetermined thresholds may be set for different validation data, and the relationship between the computed validation index and the predetermined threshold determines whether the evaluation passes.
Cross-validation, also known as rotation estimation, is a practical, statistically grounded method of cutting data samples into smaller subsets; the theory was proposed by Seymour Geisser. From a given modeling sample, most samples are taken out to build a model, a small part is held out to be predicted by the just-built model, and the prediction errors of this held-out part are recorded. The process continues until every sample has been predicted exactly once. The squared prediction errors of all samples are summed, yielding the predicted residual error sum of squares (PRESS).
The basic idea of cross-validation is to group the raw data (dataset) in a way that one part is used as training set (train) and the other part is used as validation set or test set. Firstly, training a classifier by using a training set, and then testing a model (model) obtained by training by using a verification set to serve as a performance index for evaluating the classifier. Both classification and regression models can be evaluated using cross-validation.
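The evaluation scheme above can be sketched with scikit-learn's k-fold cross-validation, scoring the validation indices named earlier (mean absolute error, mean squared error, R-squared). The regression dataset and model choice are illustrative:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate

X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)

scores = cross_validate(
    LinearRegression(), X, y, cv=5,
    scoring=["neg_mean_absolute_error", "neg_mean_squared_error", "r2"],
)
print(len(scores["test_r2"]))  # 5
```

A deployment decision would then compare, e.g., the mean R-squared against the predetermined threshold.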
In an optional implementation, the method further includes:
in step S140, it is determined whether the attenuation of the target model is greater than a predetermined threshold.
In the present embodiment, the model retraining condition is set in advance as the degree of model attenuation. In an alternative implementation, different model retraining conditions may also be set, such as training sample magnitude or model time period.
And step S150, automatically updating the target model.
Specifically, step S150 is performed in response to the model decay being less than the predetermined threshold. The parameters of the target model can be continuously adjusted according to the new behavior data of the users in the database so as to update the target model.
In response to the model decay being greater than the predetermined threshold, step S120 is performed. And selecting the target characteristics related to the target parameters according to the target parameters and the automatic characteristic selection model again so as to train the target model again.
In an alternative implementation, the model update mode may also be a timed update or a manual update.
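The retraining policy of steps S140 and S150 reduces to a threshold decision: small decay triggers an incremental parameter update, large decay triggers full feature reselection and retraining. A minimal sketch, with a hypothetical threshold value and stubbed outcomes:

```python
def maintain_model(decay: float, threshold: float = 0.05) -> str:
    """Decide the maintenance action given the measured model decay."""
    if decay < threshold:
        return "update"   # fine-tune parameters on new behavior data
    return "retrain"      # re-run feature selection, then retrain fully

print(maintain_model(0.01))  # update
print(maintain_model(0.20))  # retrain
```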
For long-running deployed tasks, the embodiment of the invention supports daily automatic iterative tuning: the system tries new schemes (algorithms, features, hyper-parameters) on its own, so online tasks never go unmaintained and unoptimized.
Fig. 5 is a schematic structural diagram of a model building apparatus according to an embodiment of the present invention. As shown in fig. 5, in an alternative implementation manner, the model building apparatus according to the embodiment of the present invention includes: a parameter acquisition unit 510, a feature selection unit 520, and a model determination unit 530.
The parameter obtaining unit 510 is configured to obtain a target parameter, where the target parameter is used to characterize a behavior preference related to a user.
The feature selection unit 520 is configured to select a target feature related to the target parameter according to the target parameter and an automatic feature selection model. The feature selection model is obtained by training with a reinforcement learning algorithm, using user-authorized user data as training data. The user data includes an embedded vector and features: the embedded vector is extracted from user behavior data through a feature extraction model and represents user behavior, the features are pre-labeled user attributes, and both the embedded vector and the features are pre-stored in a database.
Extracting the embedded vector comprises:
determining a time-series matrix of the user according to the user behavior data; and
performing embedding processing on the time-series matrix to generate an initial vector.
Specifically, the embedding processing includes processing the time-series matrix with the CBOW algorithm or the Skip-Gram algorithm.
Then, a convolution operation is performed on the initial vector to generate a plurality of convolution vectors.
Specifically, the convolution operation includes:
performing convolutional dimensionality reduction on the expansion vector; and
performing multidimensional convolution on the vector obtained by the dimensionality reduction to generate a plurality of different convolution vectors.
Finally, global average pooling and fully connected processing are performed on the plurality of convolution vectors to extract the embedded vector.
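As a rough numerical illustration of the extraction steps above (all shapes, the kernels, and the random data are assumptions for this sketch, not values from the patent), the pipeline of convolution, global average pooling, and a fully connected projection might look like:

```python
# Minimal NumPy sketch of the embedded-vector extraction pipeline:
# initial vectors -> several convolutions -> global average pooling ->
# fully connected projection to the embedded vector.
import numpy as np

rng = np.random.default_rng(0)
initial = rng.standard_normal((32, 8))  # 32 time steps, 8-dim initial vectors

def conv1d(x, kernel):
    # Valid 1-D convolution over the time axis; one output per position.
    k = kernel.shape[0]
    return np.array([np.sum(x[t:t + k] * kernel) for t in range(x.shape[0] - k + 1)])

# Several kernels of different widths -> a plurality of convolution vectors.
kernels = [rng.standard_normal((w, 8)) for w in (2, 3, 4)]
conv_vectors = [conv1d(initial, k) for k in kernels]

# Global average pooling: each convolution vector collapses to one scalar.
pooled = np.array([v.mean() for v in conv_vectors])

# Fully connected layer projecting the pooled values to the embedded vector.
W = rng.standard_normal((16, pooled.shape[0]))
embedded = W @ pooled
print(embedded.shape)  # -> (16,)
```

A production implementation would learn the kernels and the fully connected weights rather than draw them at random; this only shows the data flow.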
The model determining unit 530 is configured to train a model with the target feature related to the target parameter as input and the target parameter as output, so as to determine a target model, where the target model is used to predict the behavior of the user.
The model determining unit 530 includes:
the automatic parameter-tuning module is used to tune hyperparameters automatically by any one of grid search, random search, and Bayesian optimization.
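The grid-search branch of such an automatic tuning module can be sketched with scikit-learn (`RandomizedSearchCV` covers random search; Bayesian optimization would need a third-party library such as scikit-optimize). The estimator and the parameter grid below are illustrative assumptions, not taken from the patent.

```python
# Hedged sketch of automatic hyperparameter tuning by grid search:
# every combination in the grid is cross-validated and the best kept.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
param_grid = {"n_estimators": [10, 50], "max_depth": [3, None]}

search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)  # automatically chosen hyperparameters
```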
In an optional implementation, the apparatus further includes: a first model updating unit 540 and a second model updating unit 550.
The first model updating unit 540 is configured to automatically update the target model in response to the model decay being less than a predetermined threshold.
The second model updating unit 550 is configured to, in response to the model decay being greater than the predetermined threshold, select again the target features related to the target parameters according to the target parameters and the automatic feature selection model, so as to retrain the target model.
Fig. 6 is a schematic diagram of an electronic device of an embodiment of the invention. The electronic device shown in fig. 6 is a general-purpose data processing apparatus comprising a general-purpose computer hardware structure including at least a processor 601 and a memory 602. The processor 601 and the memory 602 are connected by a bus 603. The memory 602 is adapted to store instructions or programs executable by the processor 601. Processor 601 may be a stand-alone microprocessor or a collection of one or more microprocessors. Thus, the processor 601 implements the processing of data and the control of other devices by executing instructions stored by the memory 602 to perform the method flows of embodiments of the present invention as described above. The bus 603 connects the above components together, as well as to the display controller 604 and the display device and input/output (I/O) device 605. Input/output (I/O) device 605 may be a mouse, keyboard, modem, network interface, touch input device, motion sensing input device, printer, and other devices known in the art. Typically, the input/output devices 605 are connected to the system through an input/output (I/O) controller 606.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus (device) or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may employ a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations of methods, apparatus (devices) and computer program products according to embodiments of the invention. It will be understood that each flow in the flow diagrams can be implemented by computer program instructions.
These computer program instructions may be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows.
These computer program instructions may also be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows.
Another embodiment of the invention is directed to a non-transitory storage medium storing a computer-readable program for causing a computer to perform some or all of the above-described method embodiments.
That is, those skilled in the art can understand that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing related hardware, where the program is stored in a storage medium and includes several instructions enabling a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
The embodiments of the invention further provide the following technical solutions. TS1. A method for building a model, the method comprising:
obtaining target parameters, wherein the target parameters are used for representing behavior preference related to a user;
selecting target characteristics related to target parameters according to target parameters and an automatic characteristic selection model, wherein the characteristic selection model is obtained by training through a reinforcement learning algorithm by taking user data authorized by a user as training data, the user data comprises embedded vectors and characteristics, the embedded vectors are extracted from user behavior data through a characteristic extraction model and used for representing user behaviors, the characteristics are user attributes marked in advance, and the embedded vectors and the characteristics are stored in a database in advance; and
and determining a target model by taking the target characteristics related to the target parameters as input and the target parameters as an output training model, wherein the target model is used for predicting the behavior of the user.
TS2, the method of TS1, extracting the embedded vector comprising:
determining a time sequence matrix of the user according to the user behavior data;
embedding the time sequence matrix to generate an initial vector;
performing a convolution operation on the initial vector to generate a plurality of convolution vectors; and
performing global average pooling and fully connected processing on a plurality of the convolution vectors to extract an embedded vector.
TS3, the method of TS2, the embedding process comprising: the CBOW algorithm and Skip-Gram algorithm process the timing matrix.
TS4, the method of TS2, the convolution operation comprising:
carrying out convolution dimensionality reduction on the expansion vector; and
and carrying out multidimensional convolution on the vector obtained by the dimension reduction processing to generate a plurality of different convolution vectors.
TS5, the method according to TS2, wherein training the model with the target feature associated with the target parameter as input and the target parameter as output comprises:
and automatically tuning parameters by adopting any one method of grid search, random search and Bayesian optimization.
TS6, the method of TS5, the method further comprising:
automatically updating the target model in response to model attenuation being less than a predetermined threshold; and
in response to the model decay being greater than the predetermined threshold, selecting again the target features related to the target parameters according to the target parameters and the automatic feature selection model, so as to retrain the target model.
TS7, a build model apparatus, the apparatus comprising:
the parameter acquisition unit is used for acquiring target parameters, and the target parameters are used for representing behavior preference related to a user;
the system comprises a characteristic selection unit, a characteristic selection unit and a characteristic selection unit, wherein the characteristic selection unit is used for selecting target characteristics related to target parameters according to the target parameters and an automatic characteristic selection model, the characteristic selection model is obtained by training through a reinforcement learning algorithm by taking user data authorized by a user as training data, the user data comprises an embedded vector and characteristics, the embedded vector is extracted from user behavior data through a characteristic extraction model and is used for representing user behaviors, the characteristics are user attributes marked in advance, and the embedded vector and the characteristics are stored in a database in advance; and
and the model determining unit is used for determining a target model by taking the target characteristics related to the target parameters as input and the target parameters as output training models, and the target model is used for predicting the behavior of the user.
TS8, the device of TS7, the extracting the embedded vector comprising:
determining a time sequence matrix of the user according to the user behavior data;
embedding the time sequence matrix to generate an initial vector;
performing a convolution operation on the initial vector to generate a plurality of convolution vectors;
performing global average pooling and fully connected processing on a plurality of the convolution vectors to extract an embedded vector.
TS9, the apparatus of TS8, the embedding process comprising: the CBOW algorithm and Skip-Gram algorithm process the timing matrix.
TS10, the apparatus of TS8, the convolution operation comprising:
carrying out convolution dimensionality reduction on the expansion vector; and
and carrying out multidimensional convolution on the vector obtained by the dimension reduction processing to generate a plurality of different convolution vectors.
TS11, the apparatus according to TS8, the model determination unit comprising:
and the automatic parameter adjusting module is used for automatically adjusting parameters by adopting any one method of grid search, random search and Bayesian optimization.
TS12, the apparatus of TS11, the apparatus further comprising:
a first model updating unit for automatically updating the target model in response to a model decay being less than a predetermined threshold; and
and the second model updating unit is used for responding to the model attenuation larger than the preset threshold value, and selecting the target characteristics related to the target parameters again according to the target parameters and the automatic characteristic selection model so as to train the target model again.
TS13, a computer readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the method according to any one of TS1 to 6.
TS14, an electronic device comprising a processor for implementing the method as claimed in any of TS1 to 6 when executing a computer program stored in a memory.
TS15, a computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of the method of any one of TS1 to 6.
Claims (10)
1. A method for constructing a model, the method comprising:
obtaining target parameters, wherein the target parameters are used for representing behavior preference related to a user;
selecting target characteristics related to target parameters according to target parameters and an automatic characteristic selection model, wherein the characteristic selection model is obtained by training through a reinforcement learning algorithm by taking user data authorized by a user as training data, the user data comprises embedded vectors and characteristics, the embedded vectors are extracted from user behavior data through a characteristic extraction model and used for representing user behaviors, the characteristics are user attributes marked in advance, and the embedded vectors and the characteristics are stored in a database in advance; and
and determining a target model by taking the target characteristics related to the target parameters as input and the target parameters as an output training model, wherein the target model is used for predicting the behavior of the user.
2. The method of claim 1, wherein extracting the embedded vector comprises:
determining a time sequence matrix of the user according to the user behavior data;
embedding the time sequence matrix to generate an initial vector;
performing a convolution operation on the initial vector to generate a plurality of convolution vectors; and
performing global average pooling and fully connected processing on a plurality of the convolution vectors to extract an embedded vector.
3. The method of claim 2, wherein the embedding process comprises: the CBOW algorithm and Skip-Gram algorithm process the timing matrix.
4. The method of claim 2, wherein the convolution operation comprises:
carrying out convolution dimensionality reduction on the expansion vector; and
and carrying out multidimensional convolution on the vector obtained by the dimension reduction processing to generate a plurality of different convolution vectors.
5. The method of claim 2, wherein the training model with the target feature related to the target parameter as an input and the target parameter as an output comprises:
and automatically tuning parameters by adopting any one method of grid search, random search and Bayesian optimization.
6. The method of claim 5, further comprising:
automatically updating the target model in response to model attenuation being less than a predetermined threshold; and
in response to the model decay being greater than the predetermined threshold, selecting again the target features related to the target parameters according to the target parameters and the automatic feature selection model, so as to retrain the target model.
7. An apparatus for building a model, the apparatus comprising:
the parameter acquisition unit is used for acquiring target parameters, and the target parameters are used for representing behavior preference related to a user;
the system comprises a characteristic selection unit, a characteristic selection unit and a characteristic selection unit, wherein the characteristic selection unit is used for selecting target characteristics related to target parameters according to the target parameters and an automatic characteristic selection model, the characteristic selection model is obtained by training through a reinforcement learning algorithm by taking user data authorized by a user as training data, the user data comprises an embedded vector and characteristics, the embedded vector is extracted from user behavior data through a characteristic extraction model and is used for representing user behaviors, the characteristics are user attributes marked in advance, and the embedded vector and the characteristics are stored in a database in advance; and
and the model determining unit is used for determining a target model by taking the target characteristics related to the target parameters as input and the target parameters as output training models, and the target model is used for predicting the behavior of the user.
8. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1 to 6.
9. An electronic device, characterized in that the electronic device comprises a processor for implementing the method according to any of claims 1 to 6 when executing a computer program stored in a memory.
10. A computer program product comprising computer program/instructions, characterized in that the computer program/instructions, when executed by a processor, implement the steps of the method of any of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110262260.6A CN112990480A (en) | 2021-03-10 | 2021-03-10 | Method and device for building model, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112990480A true CN112990480A (en) | 2021-06-18 |
Family
ID=76334865
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110262260.6A Pending CN112990480A (en) | 2021-03-10 | 2021-03-10 | Method and device for building model, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112990480A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113771289A (en) * | 2021-09-16 | 2021-12-10 | 健大电业制品(昆山)有限公司 | Method and system for optimizing injection molding process parameters |
CN114986833A (en) * | 2022-06-06 | 2022-09-02 | 健大电业制品(昆山)有限公司 | Dynamically regulated injection molding method, system, device and medium |
CN115934809A (en) * | 2023-03-08 | 2023-04-07 | 北京嘀嘀无限科技发展有限公司 | Data processing method and device and electronic equipment |
TWI824700B (en) * | 2022-09-06 | 2023-12-01 | 中華電信股份有限公司 | An automated machine learning system, method and computer readable medium thereof |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190364123A1 (en) * | 2017-04-13 | 2019-11-28 | Tencent Technology (Shenzhen) Company Limited | Resource push method and apparatus |
WO2020020088A1 (en) * | 2018-07-23 | 2020-01-30 | 第四范式(北京)技术有限公司 | Neural network model training method and system, and prediction method and system |
CN110796232A (en) * | 2019-10-12 | 2020-02-14 | 腾讯科技(深圳)有限公司 | Attribute prediction model training method, attribute prediction method and electronic equipment |
WO2020087281A1 (en) * | 2018-10-30 | 2020-05-07 | 深圳市大疆创新科技有限公司 | Hyper-parameter optimization method and apparatus |
CN111753987A (en) * | 2020-07-08 | 2020-10-09 | 深延科技(北京)有限公司 | Method and device for generating machine learning model |
WO2020258508A1 (en) * | 2019-06-27 | 2020-12-30 | 平安科技(深圳)有限公司 | Model hyper-parameter adjustment and control method and apparatus, computer device, and storage medium |
CN112288483A (en) * | 2020-10-29 | 2021-01-29 | 北京沃东天骏信息技术有限公司 | Method and device for training model and method and device for generating information |
CN112329816A (en) * | 2020-10-09 | 2021-02-05 | 北京嘀嘀无限科技发展有限公司 | Data classification method and device, electronic equipment and readable storage medium |
CN112365051A (en) * | 2020-11-10 | 2021-02-12 | 中国平安人寿保险股份有限公司 | Agent retention prediction method and device, computer equipment and storage medium |
Non-Patent Citations (2)
Title |
---|
KE, Zhifu: "Churn Prediction Model for Mobile Operator Users", Technology and Economic Guide, No. 29 *
ZOU, Xiaohui: "Research on Data Classification Based on Logistic Regression", Intelligent Computer and Applications, No. 06 *
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |