CN116992253A

CN116992253A - Method for determining value of super-parameter in target prediction model associated with target service

Info

Publication number: CN116992253A
Application number: CN202310914743.9A
Authority: CN
Inventors: 林潮
Original assignee: Zhongdian Jinxin Software Co Ltd
Current assignee: Zhongdian Jinxin Software Co Ltd
Priority date: 2023-07-24
Filing date: 2023-07-24
Publication date: 2023-11-03

Abstract

The application provides a method for determining the values of hyperparameters in a target prediction model associated with a target service, and relates to the technical field of artificial intelligence. The method comprises the following steps: executing a plurality of first search processes on a hyperparameter value subspace, generating a first search record from the first value combination found in each first search process and its corresponding first evaluation index, and adding the first search record to a search record set; executing a second search process on the hyperparameter value space, generating a second search record from the second value combination found in the second search process and its corresponding second evaluation index, and updating the search record set according to the second search record; and determining the target value of each hyperparameter in the target prediction model according to the target search record in the search record set. The accuracy of the hyperparameter settings can thereby be improved, which in turn improves the prediction accuracy of the target prediction model in the target business scenario.

Description

Method for determining value of super-parameter in target prediction model associated with target service

Technical Field

The application relates to the technical field of artificial intelligence, in particular to a method for determining the value of a super parameter in a target prediction model associated with a target service.

Background

In the business scenarios of a financial institution (such as loans, deposits and wealth management), a prediction model associated with a specific business may be used to make predictions on a customer's business data (the data associated with that business). For example, for the loan business, the associated prediction model may be used to score the business data of the loan product applied for by the customer, where the score output by the prediction model may indicate the customer's repayment capability, default risk, and so on. If the score is relatively high, the customer's loan application may be approved and the loan issued; if the score is relatively low, the loan application may be rejected.

Such prediction models often have a large number of parameters to be optimized, which can generally be divided into two categories: model parameters, which are optimized while the model is being fitted, and parameters that cannot be optimized through fitting and must instead be preset by the user, which are called hyperparameters. At present, hyperparameters are typically tuned by engineers through trial and adjustment based on their own modeling experience, step by step, until a level that experience suggests is good enough is reached.

This optimization approach is limited by human experience and its results are unstable, so the prediction accuracy of the prediction model in a specific service scenario may be low, which can cause significant losses to a financial institution. For example, when the target service is a loan service and the prediction model is used to make predictions on the service data of a customer applying for a loan, inaccurate hyperparameter values may cause loans to be issued to customers with poor credit.

Disclosure of Invention

The present application aims to solve at least one of the technical problems in the related art to some extent.

According to the method, the target value combination is determined from the value combination obtained by multiple searches, the evaluation index corresponding to the target value combination is optimal, and the target value of each super parameter in the target prediction model is determined according to the target value combination, so that the accuracy of super parameter setting can be improved, and the prediction accuracy of the target prediction model in a target service scene is improved.

An embodiment of a first aspect of the present application provides a method for determining a value of a super parameter in a target prediction model associated with a target service, including:

Determining a super-parameter value subspace from the set super-parameter value space; wherein the super-parameter value space comprises a plurality of value combinations, and each value combination comprises a candidate value corresponding to each super-parameter in the target prediction model;

executing a plurality of first search processes on the super-parameter value subspace; wherein the first search process includes: searching from the super-parameter value subspace to obtain a first value combination, obtaining a first evaluation index corresponding to the first value combination, generating a first search record according to the first value combination and the first evaluation index, and adding the first search record into a search record set;

executing a second search process on the hyper-parameter value space to obtain a second value combination obtained by searching in the second search process, obtaining a second evaluation index corresponding to the second value combination, generating a second search record according to the second value combination and the second evaluation index, and updating the search record set according to the second search record;

determining the target value of each super parameter in the target prediction model according to the target value combination in the target search record in the last updated search record set; wherein the evaluation index in the target search record is superior to other search records.

An embodiment of a second aspect of the present application provides a device for determining a value of a super parameter in a target prediction model associated with a target service, including:

the first determining module is used for determining a super-parameter value subspace from the set super-parameter value space; wherein the super-parameter value space comprises a plurality of value combinations, and each value combination comprises a candidate value corresponding to each super-parameter in the target prediction model;

the first execution module is used for executing a plurality of first search processes on the hyper-parameter value subspace; wherein the first search process includes: searching from the super-parameter value subspace to obtain a first value combination, obtaining a first evaluation index corresponding to the first value combination, generating a first search record according to the first value combination and the first evaluation index, and adding the first search record into a search record set;

the second execution module is used for executing a second search process on the hyper-parameter value space to obtain a second value combination obtained by searching in the second search process, obtaining a second evaluation index corresponding to the second value combination, generating a second search record according to the second value combination and the second evaluation index, and updating the search record set according to the second search record;

the second determining module is used for determining the target value of each super parameter in the target prediction model according to the target value combination in the target search record in the last updated search record set; wherein the evaluation index in the target search record is superior to other search records.

An embodiment of a third aspect of the present application provides an electronic device, including: the system comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the value determination method of the super-parameters in the target prediction model related to the target service according to the embodiment of the first aspect of the application when executing the program.

An embodiment of a fourth aspect of the present application proposes a non-transitory computer readable storage medium, on which a computer program is stored, which when executed by a processor implements a method for determining a value of a super parameter in a target prediction model associated with a target service as proposed in the embodiment of the first aspect of the present application.

An embodiment of a fifth aspect of the present application proposes a computer program product, which when executed by a processor, performs a method for determining a value of a super parameter in a target prediction model associated with a target service as proposed in an embodiment of the first aspect of the present application.

The technical scheme provided by the embodiment of the application has at least the following beneficial effects:

First, a target value combination is determined from the value combinations obtained by multiple searches, where the evaluation index corresponding to the target value combination is optimal, and the target value of each hyperparameter in the target prediction model is determined according to the target value combination, so that the accuracy of the hyperparameter settings can be improved, and with it the prediction accuracy of the target prediction model in the target service scenario.

Second, a k-fold cross-validation method is adopted to obtain the first evaluation index corresponding to each first value combination, which makes the calculated first evaluation index more reasonable and more credible.

Third, the hyperparameter value subspace is selected by taking both core hyperparameters and auxiliary hyperparameters into account, so that the subsequent hyperparameter search can balance efficiency and accuracy.

Fourth, the probability-model-based search strategy and the grid search strategy are combined to determine the target value of each hyperparameter in the target prediction model, so that the accuracy of the hyperparameter settings can be improved.

Fifth, the probability-model-based search strategy, the grid search strategy and the random search strategy are combined to determine the target value of each hyperparameter in the target prediction model, so that search efficiency and the accuracy of the hyperparameter settings can both be improved, which in turn improves the prediction accuracy of the target prediction model.

Additional aspects and advantages of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.

Drawings

The foregoing and/or additional aspects and advantages of the application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a schematic flow chart of a method for determining the value of a super parameter in a target prediction model associated with a target service according to an embodiment of the present application;

FIG. 2 is a flowchart of another method for determining the value of a super parameter in a target prediction model associated with a target service according to an embodiment of the present application;

FIG. 3 is a flowchart of another method for determining the value of a super parameter in a target prediction model associated with a target service according to an embodiment of the present application;

FIG. 4 is a flowchart of another method for determining the value of a super parameter in a target prediction model associated with a target service according to an embodiment of the present application;

FIG. 5 is a flowchart of another method for determining the value of a super parameter in a target prediction model associated with a target service according to an embodiment of the present application;

FIG. 6 is a flowchart of another method for determining the value of a super parameter in a target prediction model associated with a target service according to an embodiment of the present application;

FIG. 7 is a schematic structural diagram of a device for determining the value of a super parameter in a target prediction model associated with a target service according to an embodiment of the present application;

FIG. 8 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application.

Detailed Description

Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present application and should not be construed as limiting the application.

In general, a prediction model includes a plurality of hyperparameters to be optimized. Each hyperparameter has a certain value range (which may be numerical or categorical), and the value ranges of all the hyperparameters together form a hyperparameter value space. Sampling once in the hyperparameter value space yields one value combination, which consists of an actual value for each of the hyperparameters. In practice, a user can sample the hyperparameter value space multiple times, build a trial model for each value combination, evaluate the value combinations by means such as cross-validation, and, once high-quality value combinations have been found, use them to build and output the final prediction model.

Since a practical prediction model often includes anywhere from a few to several dozen hyperparameters to be optimized, hyperparameter optimization is a critical and time-consuming part of optimizing the prediction model. At present, hyperparameter optimization in the industry generally takes one of the following two forms:

First, manual adjustment: engineers try and adjust the hyperparameters according to their own modeling experience, optimizing step by step until a level that experience suggests is good enough is reached.

However, this approach has at least the following drawbacks:

first, the optimization result depends on the skill of the engineer. If the engineer's skill is limited, the prediction accuracy of the model is lower; moreover, training and employing engineers is costly, and there is a risk of losing that talent;

second, the techniques used often have no clear specification, so the optimization result is unstable;

third, reusability is poor: in practice, different engineers often lack consensus and communication, so technical results are difficult to share;

fourth, it is difficult to run multiple optimization tasks in parallel;

fifth, it is difficult to make full use of computing resources.

Second, automatic hyperparameter search: many modeling tools today provide hyperparameter search tools with a degree of automation, which can reduce the cost of hyperparameter optimization. With the rapid development of artificial intelligence, automatic modeling tools are being widely adopted and are gradually replacing manual tuning. However, the automatic hyperparameter search methods in common use are limited, often amounting to a simple application of a single algorithm or strategy. For example, the more common hyperparameter search strategies fall into the following three categories:

First, grid search. When using grid search, a user specifies a number of value points to be searched for each hyperparameter, constructs all possible value combinations as a Cartesian product, and then tries the value combinations one by one.

However, the drawback of this search strategy is that when the number of hyperparameters or the number of value points is large, there is a very large number of value combinations to try, which makes the search difficult; and because most of the value combinations are of poor quality, a large amount of computing power is wasted.

Second, random search. Random search selects a value at random for each hyperparameter, so the resulting value combination is also randomly generated. Because a maximum number of trials is set, the excessive computational cost of grid search can be avoided.

However, the drawback of this search strategy is that, because of the randomness, some key value combinations may never be searched. In addition, because the hyperparameters of each trial are generated independently at random, the results of earlier trials cannot be taken into account, so the search is not targeted at key regions and a large amount of computing power is wasted.

Third, probability-model-based search (for example, the Tree-structured Parzen Estimator, a Bayesian optimization algorithm based on a tree structure). A probability-model-based search algorithm builds a probability model from the results of earlier searches and uses it to predict the next data point that is likely to be better, so that the search is targeted in light of the earlier results.

However, the drawback of this search strategy is that, because the probability model needs a certain amount of initialization data to be built, the quantity and quality of that data determine the quality of the hyperparameters found subsequently. The amount of initialization data is usually small, typically about 10 to 20 points, which may make the initialization unstable. Furthermore, a probability-model-based search tends to become confined to a local search range, which limits the search space it explores.
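The difference between the first two strategies can be illustrated with a minimal Python sketch. The hyperparameter names and value points below are hypothetical and chosen only for illustration; they do not come from the application.

```python
import itertools
import random

# Hypothetical search points for three hyperparameters (illustrative only).
grid_points = {
    "learning_rate": [0.01, 0.1, 0.3],
    "max_depth": [3, 5, 7],
    "subsample": [0.6, 0.8, 1.0],
}

def grid_combinations(points):
    """Yield every value combination as a dict (Cartesian product)."""
    names = list(points)
    for values in itertools.product(*(points[n] for n in names)):
        yield dict(zip(names, values))

def random_combinations(points, max_trials, seed=0):
    """Yield max_trials combinations, each drawn independently at random."""
    rng = random.Random(seed)
    for _ in range(max_trials):
        yield {name: rng.choice(values) for name, values in points.items()}

all_combos = list(grid_combinations(grid_points))
sampled = list(random_combinations(grid_points, max_trials=5))
print(len(all_combos))  # 3 * 3 * 3 = 27 combinations to try
print(len(sampled))     # capped at 5 trials
```

The grid grows multiplicatively with every added hyperparameter or value point, whereas the random variant is bounded by `max_trials` but draws each combination without reference to earlier results.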

In summary, prediction scenarios in the financial field, such as intelligent risk control prediction, asset preference prediction, prediction of users' purchases of financial products, and customer churn prediction, place relatively high demands on modeling accuracy. A large number of samples and features are therefore often required, and modeling them involves multiple stages such as data cleaning, data preprocessing, feature engineering and prediction modeling. Because each stage may have several hyperparameters to be optimized, the overall modeling pipeline ends up with a large number of hyperparameters, usually between 15 and 30 and possibly more for complex tasks. Meanwhile, to obtain a relatively stable modeling result in an automatic modeling process, a sufficiently wide hyperparameter value space needs to be provided, so that the model is flexible enough to adapt to a variety of tasks and data samples. Therefore, when the number of hyperparameters is large and their value ranges are wide, modeling with any of the three hyperparameter search strategies above runs into search difficulties.

For example, with respect to the grid search strategy, problems of excessively long search time and excessive resource consumption are likely to occur.

For another example, with a random search strategy, the randomness of each trial may make the search results unstable and the search poorly targeted.

For another example, with a probability-model-based search strategy, the search result is easily influenced by a few local points and the search tends to become trapped in a local optimum.

To address at least one of the above problems, the application provides a method and a device for determining the value of a hyperparameter in a target prediction model associated with a target service.

The following describes a method and a device for determining the value of a super parameter in a target prediction model associated with a target service according to an embodiment of the present application with reference to the accompanying drawings.

Fig. 1 is a flowchart of a method for determining the value of a super parameter in a target prediction model associated with a target service according to an embodiment of the present application.

The method for determining the value of the super-parameter in the target prediction model associated with the target service, which is provided by the embodiment of the application, can be applied to any electronic equipment so that the electronic equipment can execute the function of determining the value of the super-parameter.

The electronic device may be any device with computing capability, for example a personal computer (PC), a mobile terminal, or a server. The mobile terminal may be, for example, a mobile phone, a tablet computer, a personal digital assistant, a wearable device, an intelligent robot, or another hardware device having an operating system, a touch screen and/or a display screen.

As shown in fig. 1, the method for determining the value of the super parameter in the target prediction model associated with the target service may include the following steps:

Step S101, determining a super-parameter value subspace from a set super-parameter value space; the hyper-parameter value space comprises a plurality of value combinations, and each value combination comprises a candidate value corresponding to each hyper-parameter in the target prediction model.

In the embodiment of the present application, the hyperparameter value space is preset. For example, suppose the target prediction model includes N hyperparameters $(H_1, H_2, H_3, \dots, H_N)$. The modeling engineer may define a value range for each hyperparameter based on the model algorithm used, where the value range of the n-th hyperparameter is denoted $R_n$ and n is a positive integer not greater than N. Sampling once within the value range $R_n$ of every hyperparameter yields a specific value combination, expressed as $h_c = (h_1, h_2, h_3, \dots, h_N)$, where $h_n$ represents the specific value of the n-th hyperparameter. That is, the value combination $h_c$ includes a specific value (referred to as a candidate value in the present application) for each hyperparameter in the target prediction model. All possible values of $h_c$ constitute the hyperparameter value space $\Omega$, and the number of dimensions of $\Omega$ is equal to the number of hyperparameters.

The hyperparameters can be divided into two types, namely continuous hyperparameters and discrete hyperparameters. The value range of a continuous hyperparameter is a real-valued interval, $h \in [a, b]$, and the value range of a discrete hyperparameter is a set with a finite number of elements.

In the embodiment of the application, p value combinations $h_c$ can be selected from the hyperparameter value space $\Omega$ to form a coarse-grained hyperparameter value subspace $\psi = (h_c^1, h_c^2, h_c^3, \dots, h_c^p)$; accordingly, $\Omega$ may be referred to as the fine-grained hyperparameter value space. Here p is a positive integer and $h_c^p$ denotes the p-th selected value combination.
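As an illustration only, the relationship between the value space Ω and the subspace ψ might be sketched in Python as follows. The hyperparameter names, the value ranges, and the rule used to pick the p combinations (a few evenly spaced points per range, combined as a Cartesian product) are all assumptions made for this sketch; the passage above does not prescribe how the p combinations are selected.

```python
import itertools

# Hypothetical value ranges R_n: a (low, high) tuple marks a continuous
# hyperparameter, a list marks a discrete one.
omega = {
    "learning_rate": (0.01, 0.3),
    "subsample": (0.5, 1.0),
    "max_depth": [3, 5, 7, 9],
}

def coarse_points(value_range, n_points=2):
    """Pick a few representative values from one range R_n (n_points >= 2)."""
    if isinstance(value_range, tuple):              # continuous interval [a, b]
        low, high = value_range
        step = (high - low) / (n_points - 1)
        return [low + i * step for i in range(n_points)]
    stride = max(1, len(value_range) // n_points)   # discrete finite set
    return value_range[::stride][:n_points]

def build_subspace(space, n_points=2):
    """Return psi as a list of p value combinations h_c drawn from Omega."""
    names = list(space)
    axes = [coarse_points(space[n], n_points) for n in names]
    return [dict(zip(names, values)) for values in itertools.product(*axes)]

psi = build_subspace(omega)
p = len(psi)  # 2 * 2 * 2 = 8 combinations in this example
```

Each element of `psi` is one value combination with a candidate value for every hyperparameter, so the subspace has the same dimensions as Ω but far fewer points.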

Step S102, executing a plurality of first search processes on the super-parameter value subspace; the first search process comprises the following steps: searching from the super-parameter value-taking subspace to obtain a first value-taking combination, obtaining a first evaluation index corresponding to the first value-taking combination, generating a first search record according to the first value-taking combination and the first evaluation index, and adding the first search record into a search record set.

In the embodiment of the application, a plurality of first search processes can be performed on the hyperparameter value subspace, where each first search process may include: searching the hyperparameter value subspace to obtain a first value combination; obtaining the evaluation index (such as accuracy, precision, or loss (loss value)) corresponding to the first value combination, referred to in the application as the first evaluation index; generating a first search record according to the first value combination and the first evaluation index; and adding the first search record to a search record set. For example, when the hyperparameter value subspace $\psi$ includes p value combinations $h_c$, the first search process can be performed p times on the hyperparameter value subspace, yielding p first search records.

As one possible implementation, a grid search strategy may be applied to the hyperparameter value subspace $\psi$, that is, each $h_c$ in $\psi$ is traversed, and k-fold cross-validation is used to obtain the first evaluation index corresponding to each $h_c$.
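The traversal described above might look like the following Python sketch, which uses only the standard library. The toy one-dimensional k-nearest-neighbour classifier, the synthetic data, and the record layout (`combination` and `metric` keys) are assumptions made for illustration; they stand in for whatever target prediction model and record format are actually used.

```python
import random
import statistics

def k_fold_splits(n_samples, k):
    """Yield (train_idx, valid_idx) pairs for k roughly equal folds."""
    indices = list(range(n_samples))
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

def knn_accuracy(combo, data, train_idx, valid_idx):
    """Toy stand-in for model fitting: 1-D k-nearest-neighbour accuracy."""
    correct = 0
    for v in valid_idx:
        x, y = data[v]
        nearest = sorted(train_idx, key=lambda t: abs(data[t][0] - x))
        votes = [data[t][1] for t in nearest[:combo["n_neighbors"]]]
        correct += int(round(statistics.mean(votes)) == y)
    return correct / len(valid_idx)

def first_search(subspace, data, k=5):
    """Grid-search psi; return one search record per value combination."""
    records = []
    for combo in subspace:
        scores = [knn_accuracy(combo, data, tr, va)
                  for tr, va in k_fold_splits(len(data), k)]
        # First evaluation index = mean validation accuracy over the k folds.
        records.append({"combination": combo,
                        "metric": statistics.mean(scores)})
    return records

rng = random.Random(0)
# Synthetic labelled data: the label is 1 when x > 0.5.
data = [(x, int(x > 0.5)) for x in (rng.random() for _ in range(60))]
psi = [{"n_neighbors": n} for n in (1, 3, 5)]  # hypothetical subspace
search_records = first_search(psi, data)
```

Averaging over the k folds means every sample is used for validation exactly once, which is what makes the resulting index less sensitive to a single lucky or unlucky split.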

Step S103, a second searching process is executed on the super-parameter value space to obtain a second value combination obtained by searching in the second searching process, a second evaluation index corresponding to the second value combination is obtained, a second searching record is generated according to the second value combination and the second evaluation index, and a searching record set is updated according to the second searching record.

In the embodiment of the application, a second search process can be performed on the hyper-parameter value space to obtain a second value combination obtained by searching in the second search process, and a second evaluation index corresponding to the second value combination is obtained, wherein the determination mode of the second evaluation index is similar to the determination mode of the first evaluation index, for example, a k-fold cross validation method can be adopted to obtain the second evaluation index corresponding to the second value combination.

In the application, a second search record can be generated according to the second value combination obtained by searching in the second search process and the corresponding second evaluation index, and the second search record is added into the search record set.
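A minimal Python sketch of the second search process and the record-set update is given below. The proposal rule used here (perturb the best record so far, or occasionally sample Ω uniformly) is a deliberately simple placeholder chosen for this sketch, not the search strategy of the application, and `evaluate` is a hypothetical stand-in for obtaining the second evaluation index, for example by k-fold cross-validation.

```python
import random

# Hypothetical continuous space Omega: name -> (low, high).
omega = {"learning_rate": (0.01, 0.3), "subsample": (0.5, 1.0)}

def evaluate(combo):
    """Stand-in for the second evaluation index; higher is better."""
    return -((combo["learning_rate"] - 0.1) ** 2
             + (combo["subsample"] - 0.8) ** 2)

def propose(records, space, rng, explore=0.2, width=0.1):
    """Suggest the next combination from Omega using earlier records."""
    if not records or rng.random() < explore:
        return {n: rng.uniform(lo, hi) for n, (lo, hi) in space.items()}
    best = max(records, key=lambda r: r["metric"])["combination"]
    combo = {}
    for name, (lo, hi) in space.items():
        value = best[name] + rng.gauss(0.0, width * (hi - lo))
        combo[name] = min(hi, max(lo, value))  # clip back into R_n
    return combo

def second_search(records, space, n_iter, seed=0):
    """Run n_iter second search processes, appending one record each."""
    rng = random.Random(seed)
    for _ in range(n_iter):
        combo = propose(records, space, rng)
        records.append({"combination": combo, "metric": evaluate(combo)})
    return records

# Records from the first search over a coarse subspace psi (illustrative).
search_records = [{"combination": c, "metric": evaluate(c)}
                  for c in ({"learning_rate": 0.01, "subsample": 0.5},
                            {"learning_rate": 0.3, "subsample": 1.0})]
best_before = max(r["metric"] for r in search_records)
second_search(search_records, omega, n_iter=30)
best_after = max(r["metric"] for r in search_records)
```

Because the first search records are already in the set when the second search starts, each proposal can draw on them, and the best metric in the set can only stay the same or improve as records are added.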

It should be noted that, to make it easier to analyze the quality and accuracy of the hyperparameter values in each value combination, the index type of the second evaluation index may be the same as that of the first evaluation index. For example, when the first evaluation index is accuracy, the second evaluation index is also accuracy, or when the first evaluation index is precision, the second evaluation index is also precision. Alternatively, when the index type of the second evaluation index differs from that of the first evaluation index, the two indexes may be indexes that are mutually comparable.

Step S104, determining the target value of each super parameter in the target prediction model according to the target value combination in the target search record in the last updated search record set; wherein the evaluation index in the target search record is better than other search records.

In an embodiment of the present application, the target search record may be determined from each search record in the last updated search record set, where an evaluation index in the target search record (e.g., when the target search record is a first search record, the evaluation index may be a first evaluation index, or when the target search record is a second search record, the evaluation index may be a second evaluation index) is better than the evaluation index of the other search records in the search record set.

For example, when the evaluation index is an accuracy rate, the evaluation index in the target search record is larger than the evaluation index of the other search records.

For another example, when the evaluation index is a loss, the evaluation index in the target search record is smaller than the evaluation index of the other search records.

In the embodiment of the present application, the target value of each super parameter in the target prediction model may be determined according to a target value combination in the target search record (for example, when the target search record is a first search record, the target value combination may be a first value combination, or when the target search record is a second search record, the target value combination may be a second value combination).

That is, the target value of the corresponding super parameter in the target prediction model may be assigned according to the candidate value of each super parameter in the target value combination.
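The selection and assignment in step S104 can be sketched in Python as follows. The record layout, the metric values, and the `n_estimators` setting are hypothetical and serve only to show how the direction of the comparison depends on the type of evaluation index.

```python
def select_target_record(records, higher_is_better=True):
    """Return the record whose metric beats every other record."""
    pick = max if higher_is_better else min
    return pick(records, key=lambda r: r["metric"])

def apply_target_values(model_config, target_record):
    """Copy each candidate value into the model's hyperparameter settings."""
    updated = dict(model_config)
    updated.update(target_record["combination"])
    return updated

# Illustrative last-updated search record set (values are made up).
search_records = [
    {"combination": {"max_depth": 3, "learning_rate": 0.10}, "metric": 0.81},
    {"combination": {"max_depth": 5, "learning_rate": 0.05}, "metric": 0.86},
    {"combination": {"max_depth": 7, "learning_rate": 0.30}, "metric": 0.79},
]

# If the metric is an accuracy rate, the largest value wins.
target = select_target_record(search_records)
# If the same numbers were losses, the smallest value would win instead.
target_if_loss = select_target_record(search_records, higher_is_better=False)

final_config = apply_target_values({"n_estimators": 100}, target)
```

Keeping the comparison direction as an explicit argument avoids silently picking the worst record when the evaluation index is a loss.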

It should be noted that the present application does not limit the specific business scenario to which the target prediction model relates. For example, the target prediction model may be used by a financial institution for risk prediction, purchase and consumption prediction, customer group category prediction, customer churn probability prediction, pricing prediction, or prediction of deposit or loan interest rates for financial products. The input data of the target prediction model may be tabular data, text, images, and so on, so the method is widely applicable.

As an application scenario, taking a target service as a loan service as an example, input data of a target prediction model may be first service data associated with the loan service, where the target prediction model is used to perform credit prediction on the first service data, and output a credit score of the first service data, where the credit score is used to indicate a credit status of a first object associated with the first service data.

Wherein the first service data includes, but is not limited to: basic information (such as gender, age, education level, name, etc.), credit records, deposit balances, external data, etc. of a first object (such as a customer) applying for a certain loan product under a loan service.

For example, a relatively high credit score may indicate that the first object (e.g., a customer) associated with the first business data has good credit and a low default risk (e.g., a strong repayment ability), whereas a relatively low credit score may indicate that the first object has poor credit and a high default risk (e.g., a weak repayment ability).

As another application scenario, taking the target business as the loan business for example, the input data of the target prediction model may be first business data associated with the loan business, where the target prediction model is used for predicting the loan interest rate of the first business data, so as to obtain the loan interest rate of the first business data.

Wherein the first service data includes, but is not limited to: information on the loan amount, credit status, loan use, loan term, mortgage attribute, loan form, and guarantor attribute of the object associated with the first business data.

As still another application scenario, taking the target service as an account opening service for example, the input data of the target prediction model may be second service data associated with the account opening service, the target prediction model is used for performing account opening prediction on the second service data, and outputting account opening probability of the second service data, where the account opening probability is used for indicating probability of handling the account opening service by a second object associated with the second service data.

Wherein the second service data includes, but is not limited to: basic information (such as gender, age, educational background, name, etc.), credit record, etc. of the second object.

As still another application scenario, taking the target service as the deposit service for example, the input data of the target prediction model may be third service data associated with the deposit service, and the target prediction model is used for predicting the deposit interest rate of the third service data and outputting the deposit interest rate of the third service data.

Wherein the third service data includes, but is not limited to: the deposit amount of the object associated with the third business data, the deposit term, basic information, information on the region in which the object is located, the deposit form (e.g., demand deposit, fixed-term deposit), and the like.

As yet another example, taking the target service as the fund supervision service for illustration, the input data of the target prediction model may be fourth service data associated with the fund supervision service and at least one fifth service data associated with the fourth service data, the target prediction model is used for predicting the illegal fund transfer probability of the fourth service data based on each fifth service data, and the prediction probability of the fourth service data is output, where the prediction probability is used for indicating the probability of illegal fund transfer of a third object associated with the fourth service data.

Wherein the fourth or fifth service data includes, but is not limited to: basic information of the associated objects, fund transactions, transaction actions and the like.

Wherein there is a fund exchange between the object associated with the fourth business data and the object associated with the fifth business data, for example, the object associated with the fourth business data pays money to a bank account of the object associated with the fifth business data for a long time, or the object associated with the fifth business data pays money to a bank account of the object associated with the fourth business data for a long time.

As another example, taking the target service as the financial service as an example, the input data of the target prediction model may be sixth service data associated with the financial service, and the target prediction model is used for pricing prediction of a financial product related to the sixth service data, so as to obtain predicted pricing for purchasing the financial product by a fourth object associated with the sixth service data.

Wherein the sixth service data includes, but is not limited to: various attributes of the financial product (e.g., category, product form, product risk level, purchase channel, redemption limit, maximum limit, minimum limit, historical equity, etc.), various attributes of the fourth object (e.g., income, occupation, age, deposit history, loan history, financial products purchased historically, etc.), various attributes of the target object group (e.g., income, occupation, age, deposit history, loan history, financial products purchased historically, etc.), and the like.

The target object group may be an object group for which the financial product is mainly aimed, or an object group for which the financial product has been purchased historically.

According to the method for determining the value of the super-parameters in the target prediction model associated with the target service, a target value combination is determined from the value combinations obtained by multiple searches, where the evaluation index corresponding to the target value combination is optimal. The target value of each super-parameter in the target prediction model is then determined according to the target value combination, so that the accuracy of super-parameter setting can be improved and the prediction precision of the target prediction model in the target service scenario is improved.

In order to clearly explain how to determine the hyper-parameter value subspace from the set hyper-parameter value space in any embodiment of the application, the application also provides a hyper-parameter value determination method in a target prediction model associated with a target service.

Fig. 2 is a flowchart of another method for determining the value of the super parameter in the target prediction model associated with the target service according to the embodiment of the present application.

As shown in fig. 2, the method for determining the value of the super parameter in the target prediction model associated with the target service may include the following steps:

Step S201, determining a core super-parameter and an auxiliary super-parameter from a plurality of super-parameters in a target prediction model.

The influence degree of the core super-parameters on the prediction result of the target prediction model is higher than that of the auxiliary super-parameters on the prediction result of the target prediction model.

In the embodiment of the application, a plurality of super-parameters in the target prediction model can be divided into two types according to the influence degree, wherein the first type is a core super-parameter with larger influence on the prediction result of the target prediction model, the core super-parameter is often related to the model structure and the complexity of the model, and the second type is an auxiliary super-parameter with smaller influence on the prediction result of the target prediction model, and the auxiliary super-parameter is often used for fine adjustment or overfitting control of the model structure.

It should be noted that, to classify the multiple super parameters into core super parameters and auxiliary super parameters, reference may be made to the documentation of the framework that implements the model algorithm and to the modeling experience of engineers. For example, the parameter documentation of the XGBoost algorithm framework may be consulted to determine which super parameters are suitable for tuning and which should be classified as core super parameters.

Step S202, counting a first number of the core super parameters and a second number of the auxiliary super parameters.

In the embodiment of the present application, the number of core super parameters in the target prediction model (referred to as the first number in the present application) may be counted, and the number of auxiliary super parameters in the target prediction model (referred to as the second number in the present application) may be counted.

Step S203, a first set value number corresponding to the core super parameter and a second set value number corresponding to the auxiliary super parameter are obtained.

In the embodiment of the present application, the first set number of values is greater than the second set number of values; for example, the second set number of values may be 1, and the first set number of values may be 2 or 3.

In the embodiment of the application, a small number of values, usually 1 to 3, can be selected for each super parameter; that is, the value range of both the first set value number and the second set value number can be [1,3], and the actual number of values of each super parameter can be determined according to the total number of super parameters and the importance of each super parameter. In general, the larger the total number of super parameters, the fewer values can be selected for each super parameter. For a core super parameter, the number of values can be increased appropriately; for an auxiliary super parameter, it is recommended to select only one value.

Step S204, determining the target number of the value combinations contained in the super-parameter value subspace according to the first number, the first set value number, the second number and the second set value number.

In the embodiment of the application, the target number of the value combinations contained in the super-parameter value subspace can be determined according to the first number, the first set value number, the second number and the second set value number.

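For example, assume that the number of values of the i-th super parameter is $q_i$ (when the super parameter is a core super parameter, $q_i$ is the first set value number; when the super parameter is an auxiliary super parameter, $q_i$ is the second set value number), and that the total number of super parameters, that is, the sum of the first number and the second number, is $N$. The target number $p$ is then:

$$p = \prod_{i=1}^{N} q_i$$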

Step S205, selecting a target number of value combinations from the set hyper-parameter value space, and generating a hyper-parameter value subspace.

The super-parameter value space comprises a plurality of value combinations, and each value combination comprises a candidate value corresponding to each super-parameter in the target prediction model.

In the embodiment of the application, the value combination of the target number can be automatically selected from the set super-parameter value space to generate the super-parameter value subspace, or a plurality of value combinations with relatively better values can be selected from the super-parameter value space according to the manual experience by an algorithm engineer or a modeling engineer to generate the super-parameter value subspace.
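Steps S202 to S205 can be sketched as follows. This is only an illustration under stated assumptions: it assumes the subspace is a full grid over the few values chosen for each super parameter, so that the target number equals the product of the per-parameter value counts. The parameter names and candidate values are hypothetical placeholders borrowed from common XGBoost usage and are not specified by the application.

```python
from itertools import product
from math import prod

# Hypothetical candidate values (assumption): 3 values per core super parameter,
# 1 value per auxiliary super parameter.
core_values = {"max_depth": [3, 5, 7], "n_estimators": [100, 300, 500]}
aux_values = {"subsample": [0.8], "colsample_bytree": [0.8], "gamma": [0.0]}


def build_subspace(core, aux):
    """Return every value combination of the coarse grid as a dict."""
    names = list(core) + list(aux)
    grids = [core[n] for n in core] + [aux[n] for n in aux]
    return [dict(zip(names, combo)) for combo in product(*grids)]


subspace = build_subspace(core_values, aux_values)

# Target number under the full-grid assumption: product of the value counts.
target_number = prod(len(v) for v in {**core_values, **aux_values}.values())
```

With two core super parameters of three values each and three auxiliary super parameters of one value each, the subspace holds 3 x 3 x 1 x 1 x 1 = 9 value combinations. A manually curated subspace, which the text also allows, would simply replace `build_subspace` with a hand-written list.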

Step S206, executing a plurality of first search processes on the super parameter value subspace.

The first search process comprises the following steps: searching from the super-parameter value-taking subspace to obtain a first value-taking combination, obtaining a first evaluation index corresponding to the first value-taking combination, generating a first search record according to the first value-taking combination and the first evaluation index, and adding the first search record into a search record set.

The explanation of step S206 may be referred to the related description in any embodiment of the present application, and will not be repeated here.

In any embodiment of the present application, a k-fold cross-validation method is used to obtain a first evaluation index corresponding to each first value combination.

As an example, for any first value combination, each super parameter in the target prediction model may be assigned according to the first value combination, so as to obtain an updated target prediction model, and labeled sample data in the sample set is grouped to obtain k sample subsets.

Where k is a preset positive integer, and the intersection of any two sample subsets among the k sample subsets is an empty set.

For example, when a k-fold cross-validation algorithm is used to train the target prediction model on the k sample subsets, each of the k sample subsets can be selected in turn as the test set, with the remaining sample subsets used as the training set. The updated target prediction model is trained on the training set and evaluated on the test set, which yields one evaluation index per round and k evaluation indexes in total. The average value, the maximum value, the median, the minimum value, and the like of the k evaluation indexes may then be used as the first evaluation index corresponding to the first value combination, which is not limited in the present application.

For example, the sample data is divided into 5 sample subsets by sampling without replacement, namely sample subset 1, sample subset 2, sample subset 3, sample subset 4 and sample subset 5. First, sample subset 1 can be selected as the test set, and the set formed by sample subsets 2, 3, 4 and 5 is used as the training set; the target prediction model is trained on the training set, and the trained target prediction model is used to predict sample subset 1, so as to obtain evaluation index 1 of the target prediction model. Then, sample subset 2 can be selected as the test set, and the set formed by sample subsets 1, 3, 4 and 5 is used as the training set; the target prediction model is trained on the training set, and the trained target prediction model is used to predict sample subset 2, so as to obtain evaluation index 2 of the target prediction model. Sample subsets 3, 4 and 5 may then be selected in turn as the test set and the above operations repeated, so that a total of 5 evaluation indexes are obtained. Finally, the maximum value, the average value, the median, the weighted average value, and the like of the 5 evaluation indexes can be used as the first evaluation index corresponding to the first value combination.

In summary, by adopting a k-fold cross validation method, the first evaluation index corresponding to each first value combination is obtained, so that the rationality or credibility of the calculation of the first evaluation index can be improved.
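The k-fold procedure described above can be sketched in a few lines. The sketch below is illustrative only: the function names, the majority-label stand-in for the target prediction model, and the toy data set are all assumptions made for the example, and the mean is used as the aggregate although the text equally permits the maximum, median, and so on.

```python
import random
from statistics import mean


def k_fold_subsets(samples, k, seed=0):
    """Split samples into k disjoint subsets by sampling without replacement."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def cross_validate(samples, k, train_fn, score_fn, aggregate=mean):
    """Train on k-1 subsets, score on the held-out subset, aggregate the k scores."""
    subsets = k_fold_subsets(samples, k)
    scores = []
    for i, test_set in enumerate(subsets):
        train_set = [s for j, sub in enumerate(subsets) if j != i for s in sub]
        model = train_fn(train_set)
        scores.append(score_fn(model, test_set))
    return aggregate(scores), scores


# Toy stand-in for the target prediction model (assumption): always predict
# the majority label seen in the training set.
def train_majority(train_set):
    labels = [y for _, y in train_set]
    return max(set(labels), key=labels.count)


def accuracy(model, test_set):
    return sum(1 for _, y in test_set if y == model) / len(test_set)


data = [(x, int(x % 3 == 0)) for x in range(30)]
first_index, fold_scores = cross_validate(data, 5, train_majority, accuracy)
```

In a real setting `train_fn` would build the target prediction model with the super parameters of the first value combination, and `first_index` would be stored in the first search record.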

Step S207, a second search process is performed on the hyper-parameter value space to obtain a second value combination obtained by searching in the second search process, a second evaluation index corresponding to the second value combination is obtained, a second search record is generated according to the second value combination and the second evaluation index, and the search record set is updated according to the second search record.

Step S208, determining the target value of each super parameter in the target prediction model according to the target value combination in the target search record in the last updated search record set.

Wherein the evaluation index in the target search record is better than the evaluation indexes in the other search records.

The explanation of steps S207 to S208 can be referred to the related description in any embodiment of the present application, and will not be repeated here.

According to the method for determining the value of the super-parameter in the target prediction model associated with the target service, the super-parameter value subspace is selected by jointly considering the core super-parameters and the auxiliary super-parameters, so that both the efficiency and the accuracy of the subsequent super-parameter search can be taken into account.

In order to clearly illustrate how the second search process is performed on the hyper-parameter value space in any embodiment of the present application, the present application also provides a method for determining the hyper-parameter value in the target prediction model associated with the target service.

Fig. 3 is a flowchart of another method for determining the value of the super parameter in the target prediction model associated with the target service according to the embodiment of the present application.

As shown in fig. 3, on the basis of the embodiment shown in fig. 1 or 2, the second search process may include at least one first sub-search process, wherein any one first sub-search process may include the steps of:

Step S301, generating an index prediction model by using each search record in the search record set updated last time.

In the embodiment of the application, each search record in the most recently updated search record set can be used to generate an index prediction model (also called a probability model or a probability prediction model).

In one possible implementation of the embodiment of the present application, the index prediction model may be generated in the following manner:

1. Sequentially inputting the value combination in each search record of the most recently updated search record set into an initial index prediction model for index prediction, to obtain a prediction evaluation index corresponding to the value combination in each search record.

2. Generating a loss value according to the difference between the evaluation index in each search record and the prediction evaluation index corresponding to that search record.

That is, for any search record, the difference between the evaluation index in that search record and the prediction evaluation index that the index prediction model outputs for the value combination in that search record may be calculated, and the value of the loss function (referred to as the loss value in the present application) may be determined based on the differences corresponding to all search records, where the loss value is positively correlated with the differences.

3. Training the initial index prediction model according to the loss value to obtain a trained index prediction model.

For example, model parameters in the index prediction model may be adjusted based on the loss value to minimize the loss value.

It should be noted that, the foregoing is only exemplified by taking the termination condition of the training of the index prediction model as the minimization of the loss value, but the present application is not limited to this, and other termination conditions may be set in practical application, for example, the termination conditions may further include: the number of training times reaches the set number of times, the training time reaches the set time, etc., and the present application is not limited thereto.
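The three training steps above can be sketched as follows. The application does not specify the form of the index prediction model, so this sketch assumes, purely for illustration, a linear model fitted by gradient descent on a mean squared error loss (one loss that grows with the differences). The function names, learning rate, and toy records are hypothetical, and the value combinations are assumed to be numeric and scaled to [0, 1].

```python
def predict(weights, bias, combo):
    """Predicted evaluation index for one value combination."""
    return bias + sum(w * x for w, x in zip(weights, combo))


def train_index_model(records, lr=0.1, max_epochs=5000, tol=1e-12):
    """records: list of (value_combination, evaluation_index) pairs."""
    dim = len(records[0][0])
    weights, bias = [0.0] * dim, 0.0
    prev_loss = float("inf")
    for _ in range(max_epochs):              # termination: training-count budget
        # Steps 1 and 2: predict each record and measure the differences.
        errors = [predict(weights, bias, c) - y for c, y in records]
        loss = sum(e * e for e in errors) / len(records)
        if abs(prev_loss - loss) < tol:      # termination: loss no longer decreases
            break
        prev_loss = loss
        # Step 3: adjust the model parameters to reduce the loss value.
        for d in range(dim):
            grad = 2 * sum(e * c[d] for e, (c, _) in zip(errors, records)) / len(records)
            weights[d] -= lr * grad
        bias -= lr * 2 * sum(errors) / len(records)
    return weights, bias, loss


# Toy search records: (normalized value combination, evaluation index).
records = [((0.0, 0.0), 0.50), ((1.0, 0.0), 0.80), ((0.0, 1.0), 0.30),
           ((1.0, 1.0), 0.60), ((0.5, 0.5), 0.55)]
weights, bias, final_loss = train_index_model(records)
```

In practice a more expressive surrogate would likely be preferred; the linear form is chosen here only to keep the loss-driven training loop visible.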

Step S302, a plurality of first candidate value combinations are randomly searched from the super-parameter value space, and index prediction is respectively carried out on the plurality of first candidate value combinations by adopting an index prediction model, so as to obtain first prediction indexes of the plurality of first candidate value combinations.

In the embodiment of the application, a plurality of value combinations (marked as first candidate value combinations in the application) can be searched randomly from a super-parameter value space, and index prediction is performed on the plurality of first candidate value combinations by adopting an index prediction model to obtain prediction evaluation indexes (marked as first prediction indexes in the application) of the plurality of first candidate value combinations.

Step S303, determining a second value combination obtained by searching in the first sub-searching process from the plurality of first candidate value combinations according to the first prediction indexes of the plurality of first candidate value combinations.

In the embodiment of the application, the second value combination obtained by searching in the first sub-searching process can be determined from the plurality of first candidate value combinations according to the first prediction indexes of the plurality of first candidate value combinations. For example, the second combination of values obtained by the first sub-search process may be the first combination of candidate values with the optimal first prediction index.

Step S304, according to the second value combination obtained by searching in the first sub-searching process and the corresponding second evaluation index, generating a second searching record corresponding to the first sub-searching process, and adding the second searching record into the searching record set.

In the embodiment of the present application, a second evaluation index corresponding to a second value combination obtained by searching in the first sub-search process may be obtained, where the obtaining manner of the second evaluation index is similar to that of the first evaluation index, and will not be described herein.

In the embodiment of the application, a second search record corresponding to the first sub-search process can be generated according to the second value combination obtained by searching in the first sub-search process and the corresponding second evaluation index, and the second search record is added into the search record set.
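One pass of steps S301 to S304 can be sketched as below. Everything named here is an assumption made for the example: the nearest-record surrogate stands in for the index prediction model, `toy_evaluate` stands in for the real evaluation of a value combination (for instance by k-fold cross-validation), and the search space and seed records are invented.

```python
import random


def fit_nearest(records):
    """Toy surrogate (assumption): predict the index of the closest recorded combination."""
    def model(combo):
        def distance(other):
            return sum(abs(combo[name] - other[name]) for name in combo)
        return min(records, key=lambda rec: distance(rec[0]))[1]
    return model


def first_sub_search(records, space, fit_fn, evaluate_fn, rng,
                     n_candidates=50, higher_is_better=True):
    model = fit_fn(records)                                    # S301
    candidates = [{name: rng.choice(values) for name, values in space.items()}
                  for _ in range(n_candidates)]                # S302: random search
    pick = max if higher_is_better else min
    best = pick(candidates, key=model)                         # S303: best predicted index
    record = (best, evaluate_fn(best))                         # S304: real evaluation
    records.append(record)
    return record


space = {"max_depth": list(range(2, 11)),
         "learning_rate": [0.01, 0.05, 0.1, 0.2, 0.3]}


def toy_evaluate(combo):
    """Hypothetical evaluation index that peaks at max_depth=6, learning_rate=0.1."""
    return -abs(combo["max_depth"] - 6) - 10 * abs(combo["learning_rate"] - 0.1)


records = [({"max_depth": 3, "learning_rate": 0.3}, -5.0),
           ({"max_depth": 6, "learning_rate": 0.1}, 0.0)]
new_record = first_sub_search(records, space, fit_nearest, toy_evaluate, random.Random(0))
```

Only the selected candidate is evaluated for real, which is the point of the index prediction model: the many random candidates are screened cheaply by prediction.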

Step S305, it is determined whether the set stop search condition is satisfied, and if the set stop search condition is satisfied, execution of the second search process is ended.

In the embodiment of the application, the set stop search condition is a preset condition for stopping the search.

As one possible implementation, setting the stop search condition may include any one of:

1. The total number of searches performed is greater than or equal to a set count threshold, where the total number of searches is determined according to the number of times the first search process has been performed, the number of times the first sub-search process has been performed, and the number of times the second sub-search process (described later) has been performed.

2. The total search duration is greater than or equal to a set duration threshold, where the total search duration refers to the time between the point at which the first search process was first executed and the current point in time.

3. The improvement rate of the optimal evaluation index among the value combinations obtained by the most recent q consecutive searches, relative to the optimal evaluation index among the value combinations obtained before those q searches, is smaller than a set improvement rate threshold.

Wherein q is a set positive integer; for example, the value range of q may be [5,10], and the value range of the set improvement rate threshold may be [0.1,1.0], expressed as a percentage.

In the embodiment of the application, whether the set stop search condition is satisfied or not can be judged, and if the set stop search condition is satisfied, the execution of the second search process is ended.
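The three stop conditions can be checked with a small helper such as the one below. It is a sketch under assumptions: the function name, default thresholds, and the way the improvement rate is normalized (relative to the absolute value of the earlier best index) are illustrative choices, not details given by the application.

```python
import time


def should_stop(indexes, start_time, max_searches=200, max_seconds=3600.0,
                q=5, min_improvement_pct=0.5, higher_is_better=True, now=None):
    """indexes: evaluation indexes of all searches so far, in search order."""
    now = time.monotonic() if now is None else now
    if len(indexes) >= max_searches:         # condition 1: total number of searches
        return True
    if now - start_time >= max_seconds:      # condition 2: total search duration
        return True
    if len(indexes) > q:                     # condition 3: stalled improvement
        best = max if higher_is_better else min
        before, recent = best(indexes[:-q]), best(indexes[-q:])
        gain = (recent - before) if higher_is_better else (before - recent)
        if before != 0:
            return 100.0 * gain / abs(before) < min_improvement_pct
    return False
```

For example, with `q=5` a history of `[0.70, 0.80, 0.80, 0.80, 0.80, 0.80, 0.80]` shows no gain over the earlier best of 0.80 and triggers the third condition, whereas a history that climbs from 0.71 to 0.80 within the last five searches does not.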

The method for determining the value of the super-parameter in the target prediction model associated with the target service determines the target value of each super-parameter in the target prediction model by combining a probability-model-based search strategy with a grid search strategy, which can improve the accuracy of super-parameter setting.

In a possible implementation manner of the embodiment of the present application, in order to further improve accuracy of superparameter setting, a search strategy of a probability model, a grid search strategy and a random search strategy may be combined at the same time to determine a target value of each superparameter in a target prediction model. The above process will be described in detail with reference to fig. 4.

Fig. 4 is a flowchart of another method for determining the value of the super parameter in the target prediction model associated with the target service according to the embodiment of the present application.

As shown in fig. 4, on the basis of the embodiments shown in fig. 1 to 3, the second search process may include at least one second sub-search process in addition to the at least one first sub-search process, and in particular, the second search process may include the steps of:

Step S401, each search record in the latest updated search record set is adopted to generate an index prediction model.

Step S402, a plurality of first candidate value combinations are randomly searched from the super parameter value space, and index prediction is respectively carried out on the plurality of first candidate value combinations by adopting an index prediction model, so as to obtain first prediction indexes of the plurality of first candidate value combinations.

Step S403, determining a second value combination obtained by searching in the first sub-search process from the plurality of first candidate value combinations according to the first prediction indexes of the plurality of first candidate value combinations.

Step S404, according to the second value combination obtained by searching in the first sub-searching process and the corresponding second evaluation index, generating a second searching record corresponding to the first sub-searching process, and adding the second searching record into the searching record set.

Step S405, it is determined whether the set stop search condition is satisfied, if yes, step S406 is executed, and if no, step S407 is executed.

Step S406, the execution of the second search process is ended.

The explanation of steps S401 to S406 may be referred to the related description in any embodiment of the present application, and will not be repeated here.

Step S407, counting the execution times of the first sub-search process.

Step S408, when the number of executions reaches the first set number of times, it is determined whether the second value combinations obtained by the first set number of first sub-search processes satisfy the set re-search condition; if yes, the process returns to step S401 and the first sub-search process is re-executed, and if no, step S409 is executed.

The set re-search condition indicates that the second value combinations obtained by the first set number of first sub-search processes are located within a certain local range of the super-parameter value space.

As one possible implementation, setting the re-search condition may include: the first distance is less than a distance threshold, wherein the distance threshold is determined from a product of the set coefficient and the second distance.

The first distance is determined according to the distance between the second value combinations searched by the first sub-search process of the first set times, and the distance is determined according to the difference between elements at the same position in any two second value combinations.

For example, taking the first set number of times as m, the distance between every two second value combinations can be calculated to obtain $C_m^2 = m(m-1)/2$ distances. The distance between any two second value combinations can be determined in the following manner: calculate the difference (such as the difference value, the absolute value of the difference value, etc.) between the elements at the same position in the two second value combinations to obtain N element differences, and take the mean value, the weighted sum, etc. of the N element differences as the distance between the two second value combinations. Afterwards, the $C_m^2$ distances can be averaged to obtain the first distance.

Wherein the second distance is determined based on the distance between the value combinations in each search record in the newly updated set of search records.

It should be noted that the second distance is calculated in a manner similar to that of the first distance. For example, if the number of search records in the most recently updated search record set is u, the distance between the value combinations in every two search records can be calculated to obtain $C_u^2 = u(u-1)/2$ distances, and these distances can be averaged to obtain the second distance.

For example, denote the first distance as $l_m$ and the second distance as $l$, and take the set coefficient as 0.5. When $l_m < 0.5\,l$, it is determined that the set re-search condition is satisfied; when $l_m \ge 0.5\,l$, it is determined that the set re-search condition is not satisfied.

In the embodiment of the present application, the execution times of the first sub-search process may be counted, and whether the execution times reach the first set times may be determined, if the execution times reach the first set times, whether the second value combination obtained by searching in the first sub-search process of the first set times satisfies the set re-search condition may be further determined, if the execution times do not reach the first set times, the first sub-search process may be re-executed until the execution times of the first sub-search process reach the first set times, and then step S408 may be executed again.
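The re-search test can be illustrated with a short sketch. The assumptions are that each value combination is a tuple of numbers already scaled to a comparable range (for example [0, 1]), that the element difference is the absolute difference, and that the per-pair distance is the plain mean of the element differences; the function names are hypothetical. At least two combinations are needed in each group.

```python
from itertools import combinations
from statistics import mean


def combo_distance(a, b):
    """Mean absolute difference between elements at the same position."""
    return mean(abs(x - y) for x, y in zip(a, b))


def mean_pairwise_distance(combos):
    """Average distance over all pairs of value combinations."""
    return mean(combo_distance(a, b) for a, b in combinations(combos, 2))


def re_search_needed(recent_combos, all_combos, coefficient=0.5):
    first_distance = mean_pairwise_distance(recent_combos)    # first distance
    second_distance = mean_pairwise_distance(all_combos)      # second distance
    # Satisfied when the recent results cluster in a local region of the space.
    return first_distance < coefficient * second_distance


recent = [(0.50, 0.50), (0.52, 0.50), (0.50, 0.48)]
history = recent + [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
clustered = re_search_needed(recent, history)
```

Here the three most recent combinations lie very close together while the full record set spans the space, so `clustered` is true and the first sub-search process would be repeated from step S401.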

Step S409, at least one second sub-search process among the second search processes is performed.

In the embodiment of the present application, when the second value combination obtained by searching in the first sub-search process of the first set number of times does not satisfy the set re-search condition, at least one second sub-search process in the second search process may be further executed.

And when the second combination of values obtained by searching in the first sub-search process of the first set number of times meets the set re-search condition, the first sub-search process may be re-executed, i.e. step S401 and subsequent steps may be re-executed.

In one possible implementation manner of the embodiment of the present application, any second sub-search process may include:

1. Randomly searching a second candidate value combination from the super-parameter value space, and taking the second candidate value combination as the second value combination obtained by searching in the second sub-search process.

2. Generating a second search record corresponding to the second sub-search process according to the second value combination obtained by searching in the second sub-search process and the corresponding second evaluation index, and adding the second search record into the search record set.

The calculation manner of the second evaluation index corresponding to the second value combination obtained by the second sub-search process is similar to that of the first evaluation index, and will not be described herein.

3. Judging whether the set search stopping condition is met, if yes, ending the execution of the second search process, and if not, judging whether the arrangement order of the second search records in the latest updated search record set in the second sub-search process is larger than the set value.

The arrangement order of each search record in the search record set is determined according to the evaluation indexes in each search record, that is, each search record can be ordered from top to bottom according to the value of the corresponding evaluation index. For example, when the evaluation index is an accuracy rate, the search records may be ranked from large to small according to the value of the corresponding evaluation index, or when the evaluation index is a loss, the search records may be ranked from small to large according to the value of the corresponding evaluation index.

The set value is a preset value, for example, the set value may be 2, 3, 4, etc.

In the embodiment of the present application, after adding the second search record corresponding to the second sub-search process to the search record set, it may be determined whether a set stop search condition is satisfied, if the set stop search condition is satisfied, execution of the second search process is ended, and if the set stop search condition is not satisfied, it is further determined whether the number of arrangement bits of the second search record in the most recently updated search record set in the second sub-search process is greater than a set value.

4. If the arrangement order is smaller than or equal to the second set value, re-executing the first sub-search process, and if the arrangement order is larger than the second set value, executing the next second sub-search process.

In the present application, in the case where the arrangement order of the second search record of the second sub-search process in the most recently updated search record set is less than or equal to the second set value, the second value combination in the second search record is a high-quality value; since the set stop search condition has not been satisfied at this time, the first sub-search process may be performed again.

In the case where the arrangement order of the second search record of the second sub-search process in the most recently updated search record set is greater than the second set value, the second value combination in the second search record is not a high-quality value; since the set stop search condition has not been satisfied at this time, the second sub-search process may be performed again.
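The branching in items 3 and 4 can be illustrated with a minimal Python sketch. The names `Record`, `rank_of` and `next_sub_search` are hypothetical and do not appear in the application; the sketch assumes a search record is a pair of a value combination and its evaluation index, and that the caller states whether a larger index is better (accuracy) or worse (loss).

```python
from typing import List, Tuple

# Hypothetical representation: (value combination, evaluation index).
Record = Tuple[dict, float]


def rank_of(records: List[Record], new_record: Record,
            higher_is_better: bool = True) -> int:
    """Return the 1-based arrangement order of new_record within records.

    records must already contain new_record. Accuracy-like indexes are
    sorted from large to small, loss-like indexes from small to large.
    """
    ordered = sorted(records, key=lambda r: r[1], reverse=higher_is_better)
    return ordered.index(new_record) + 1


def next_sub_search(records: List[Record], new_record: Record,
                    set_value: int = 3, higher_is_better: bool = True) -> str:
    """Decide which sub-search follows a random (second) sub-search step."""
    rank = rank_of(records, new_record, higher_is_better)
    # Rank within the set value: high-quality value, return to the
    # model-guided first sub-search; otherwise keep searching at random.
    return "first" if rank <= set_value else "second"
```

The stop search condition is assumed to have been checked before `next_sub_search` is called, as in item 3.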

According to the method for determining the values of the super parameters in the target prediction model associated with the target service provided by the embodiment of the application, the target value of each super parameter in the target prediction model is determined by combining a search strategy based on a probability model, a grid search strategy and a random search strategy, so that the search efficiency can be improved, the accuracy of setting the super parameters can be improved, and the prediction precision of the target prediction model can be improved.

In any of the embodiments of the present application, the super parameters in the target prediction model may be optimized by the following steps, and the flow is shown in fig. 5:

Step 1, dividing all the super parameters into the following two types according to their degree of influence on the target prediction model: core superparameters and auxiliary superparameters.

Step 2, selecting p super-parameter value combinations h_c in the super-parameter value space Ω to form a coarse-grained hyper-parameter value subspace ψ = (hc^1, hc^2, hc^3, ..., hc^p). Accordingly, Ω may be referred to as a fine-grained hyper-parameter value space.

It should be noted that the purpose of constructing ψ is to have the early stage of the super-parameter search try a small number of specific value combinations, and the value combinations in ψ may be designed in advance. Because the number of value combinations in the original super-parameter value space Ω is far larger than that in the super-parameter value subspace ψ, if the early-stage search were carried out at random in Ω, the search result would be unstable due to randomness, which would affect the optimization effect of the whole model. For this reason, the value of p should not be set too large; for example, p can be kept around 30, and at most 50.

The specific method is as follows: a small number of values, typically 1-3, are selected for each superparameter, and the actual number of values is determined according to the total number of superparameters and the importance of each superparameter. Generally, the larger the total number of superparameters, the fewer values can be chosen for each superparameter; for a core superparameter the number of values may be increased appropriately, and for an auxiliary superparameter it is recommended to choose only one value. Assume that the number of values of the i-th super parameter is q_i and the number of value combinations included in ψ is p; then p = q_1 × q_2 × ... × q_n, where n is the total number of super parameters.

In step 1 and step 2, an algorithm engineer or modeling engineer is required to determine the hyper-parameter types and construct the ψ space. These steps are tailored to the specific model algorithm used for modeling and are independent of the specific task and the specific data used. Therefore, once the type of model algorithm is determined, steps 1 and 2 need only be performed once, and the result is applicable to all subsequent modeling tasks that use that model algorithm.
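As a minimal sketch of steps 1 and 2, the following Python snippet builds a subspace ψ under the assumption that ψ is the full Cartesian grid over the values selected for each super parameter, in which case p equals the product of the q_i. The hyperparameter names and values in `selected_values` are purely illustrative and are not taken from the application.

```python
from itertools import product
from math import prod

# Hypothetical selection: three values for each core super parameter,
# a single value for each auxiliary super parameter.
selected_values = {
    "max_depth": [3, 5, 7],              # core
    "learning_rate": [0.05, 0.1, 0.3],   # core
    "n_estimators": [100, 300, 500],     # core
    "subsample": [0.8],                  # auxiliary
    "min_child_weight": [1],             # auxiliary
}


def build_subspace(values):
    """Enumerate every value combination of the selected values."""
    names = list(values)
    return [dict(zip(names, combo)) for combo in product(*values.values())]


psi = build_subspace(selected_values)
p = len(psi)
q = [len(v) for v in selected_values.values()]
print(p, prod(q))  # both are 27 for this selection
```

With three core super parameters at three values each, p stays at 27, which is within the range of about 30 suggested for p.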

Step 3, executing a grid search strategy in the super-parameter value subspace ψ, namely traversing each h_c in the subspace ψ and obtaining the model evaluation value corresponding to each h_c (denoted as the evaluation index in the present application) by means of k-fold cross validation. The k-fold cross validation is implemented by splitting the original sample data into k shares (the suggested range for k is [5, 10]), selecting one share of sample data as the test set and using the remaining sample data as the training set, training the target prediction model on the training set, running the trained target prediction model on the test set, and finally calculating the evaluation index according to the prediction result on the test set and the labeling result of the sample data in the test set.

The purpose of this step is to make super-parameter attempts in the super-parameter value subspace ψ that are relatively uniform and relatively stable in value; the results of these attempts are used to guide the subsequent super-parameter optimization process performed in the super-parameter value space Ω. Assume that the evaluation index of the i-th search process is s^i; the search record of the i-th search process can then be expressed as [hc^i, s^i], and the set formed by all search records is denoted as T (denoted as the search record set in the present application).
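The k-fold evaluation in step 3 can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the application's implementation: each share serves as the test set once and the fold scores are averaged, as in standard k-fold cross validation, and `train_and_score` is a hypothetical callback that trains the target prediction model for one value combination and returns the evaluation index on the held-out share. The toy threshold "model" only serves to make the sketch runnable.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and split them into k roughly equal shares."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(train_and_score, samples, labels, k=5, seed=0):
    """Average the evaluation index over k train/test splits."""
    scores = []
    for test_idx in k_fold_indices(len(samples), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(samples)) if i not in held_out]
        scores.append(train_and_score(
            [samples[i] for i in train_idx], [labels[i] for i in train_idx],
            [samples[i] for i in test_idx], [labels[i] for i in test_idx]))
    return mean(scores)


def make_threshold_scorer(threshold):
    """Toy stand-in for a model whose only super parameter is a threshold."""
    def train_and_score(x_train, y_train, x_test, y_test):
        # Nothing is fitted here; predict 1 when the feature exceeds the threshold.
        preds = [1 if x > threshold else 0 for x in x_test]
        return mean(1.0 if p == y else 0.0 for p, y in zip(preds, y_test))
    return train_and_score
```

A search record [hc^i, s^i] is then the value combination together with the value returned by `cross_validate` for it.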

Step 4, carrying out probability modeling by using the search records in T to obtain an index prediction model M. For the i-th sample in T, hc^i is the N-dimensional feature used for modeling, and s^i is the prediction target of the modeling. The modeling algorithm is not particularly limited herein.

Step 5, randomly selecting a plurality of value combinations {hc^1, hc^2, ..., hc^i, ...} from the hyper-parameter value space Ω, using the index prediction model M to predict the predicted evaluation index of each value combination hc^i (denoted as the first prediction index in the present application), and selecting the value combination whose first prediction index is optimal, this value combination being denoted as hc_best.
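Step 5 can be sketched as follows, assuming that Ω is represented as a dictionary mapping each super parameter to its list of candidate values and that the index prediction model M is available as a callable `predict_index`. Both representations, and the function names, are assumptions made for illustration.

```python
import random


def sample_combination(space, rng):
    """Draw one value combination at random from the value space."""
    return {name: rng.choice(values) for name, values in space.items()}


def propose_best(space, predict_index, n_candidates=50,
                 higher_is_better=True, seed=0):
    """Sample candidates at random and keep the one M predicts to be best."""
    rng = random.Random(seed)
    candidates = [sample_combination(space, rng) for _ in range(n_candidates)]
    pick = max if higher_is_better else min
    return pick(candidates, key=predict_index)
```

Only the returned combination is then actually trained and evaluated in step 6, so the cost of each iteration is one real evaluation plus `n_candidates` cheap predictions.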

Step 6, taking hc_best as the value combination for a new super-parameter attempt and obtaining the result of the attempt, that is, obtaining the evaluation index corresponding to hc_best, and adding hc_best and its corresponding evaluation index to T. Judging whether the set search stopping condition is reached; if so, stopping the search process, and if not, continuing with the following steps.

Step 7, judging whether the hc_best obtained by the last m searches are within a certain local range of Ω; if so, the search of the hyper-parameters has fallen into a certain local range, and the process goes to step 8; if not, steps 4-7 are repeated.

To judge whether the hc_best obtained by the last m searches are within a certain local range of Ω, an average distance method may be used. Specifically, the average distance l_m between the hc_best obtained by the last m searches is calculated, and the average distance l between any two value combinations in T is calculated at the same time; if l_m is smaller than the product of a set coefficient and l, the search of the hyper-parameters has fallen within a certain local range of Ω.

Where m may be determined according to the maximum number of searches set by the user (denoted as the second set number in the present application), and typically 5-7 is selected.
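The average distance check in step 7 can be sketched as below. The sketch assumes that value combinations are numeric tuples of equal length, that the distance is Euclidean (the application only states that the distance depends on the differences between elements at the same position), and that the set coefficient defaults to 0.5; all three are illustrative choices.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def distance(a, b):
    """Euclidean distance between two value combinations (numeric tuples)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_pairwise_distance(points):
    """Mean distance over all unordered pairs; needs at least two points."""
    return mean(distance(a, b) for a, b in combinations(points, 2))


def fell_into_local_range(recent_best, all_combinations, coefficient=0.5):
    """True when the last m picks are much closer together than T as a whole."""
    l_m = average_pairwise_distance(recent_best)
    l = average_pairwise_distance(all_combinations)
    return l_m < coefficient * l
```

In practice the super parameters would usually be scaled to comparable ranges first, since otherwise a super parameter with a large numeric range dominates the distance.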

Step 8, carrying out a random super-parameter search in Ω, namely randomly selecting a value combination for a super-parameter attempt and obtaining the evaluation index corresponding to that value combination. Meanwhile, the record of the attempt (namely, the value combination obtained by random search and the corresponding evaluation index) is added to T. Judging whether the set search stopping condition is reached; if so, stopping the search process, and if not, continuing with the following steps.

It should be noted that, the advantage of random search is that it is easy to jump out of the current local area, and a wider parameter search can be performed.

Step 9, judging whether the newly tried value combination is a high-quality value; if so, going to step 4, and if not, continuing with the following steps.

The high-quality value is defined as follows: if the evaluation index corresponding to the newly tried value combination ranks in the top three in T (taking a set value of 3 as an example), the value combination is considered a high-quality value.

Step 10, repeating steps 8-9 until the set search stopping condition is reached, then ending the search process of the super parameters and outputting the optimal super parameters.
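The control flow of steps 3 to 10 can be summarized in a simplified Python sketch. It is not the application's implementation: the stop condition is reduced to a maximum number of searches, a larger evaluation index is assumed to be better, the local-range check uses a sliding window of the last m model-guided picks, and `evaluate`, `fit_index_model` and `is_local` are hypothetical callbacks supplied by the caller.

```python
import random


def hybrid_search(space, subspace, evaluate, fit_index_model, is_local,
                  max_searches=40, m=5, set_value=3, n_candidates=20, seed=0):
    """Return the best (value combination, evaluation index) record found."""
    rng = random.Random(seed)

    def draw():
        # Random value combination from the full value space.
        return {name: rng.choice(values) for name, values in space.items()}

    # Step 3: grid search over the subspace.
    records = [(hc, evaluate(hc)) for hc in subspace]
    mode, recent = "model", []
    while len(records) < max_searches:  # simplified stop condition
        if mode == "model":
            predict = fit_index_model(records)                  # step 4
            candidates = [draw() for _ in range(n_candidates)]  # step 5
            hc = max(candidates, key=predict)
            records.append((hc, evaluate(hc)))                  # step 6
            recent.append(hc)
            # Step 7: switch to random search once the last m picks cluster.
            if len(recent) >= m and is_local(recent[-m:],
                                             [r[0] for r in records]):
                mode, recent = "random", []
        else:
            hc = draw()                                         # step 8
            score = evaluate(hc)
            records.append((hc, score))
            rank = 1 + sum(1 for _, s in records if s > score)
            if rank <= set_value:                               # step 9
                mode = "model"
    return max(records, key=lambda r: r[1])
```

Every loop iteration adds exactly one search record, so `max_searches` bounds the number of real model evaluations, including those of the grid phase.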

Optionally, after the set search stopping condition is reached, if the search of the super-parameters is not stopped immediately, the super-parameters can be finely tuned near the current optimal super-parameters (candidate values of the super-parameters in the target value combination), and after multiple fine tuning attempts, the optimal super-parameters are finally output.

That is, at least one candidate value in the target value combination can be finely tuned at least once, and an evaluation index corresponding to the target value combination after fine tuning each time is obtained, so that an optimal super-parameter can be determined according to the value combination with the largest evaluation index, and the fine tuning process is shown in fig. 6.
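A minimal sketch of such fine tuning is given below. It assumes that the candidate values of each super parameter are stored in an ordered list, that "near" means moving one super parameter to an adjacent candidate value, and that a larger evaluation index is better; `neighbours`, `fine_tune` and `evaluate` are hypothetical names.

```python
def neighbours(best, space):
    """Combinations differing from best in one super parameter by one position."""
    result = []
    for name, values in space.items():
        pos = values.index(best[name])
        for step in (-1, 1):
            if 0 <= pos + step < len(values):
                tweaked = dict(best)
                tweaked[name] = values[pos + step]
                result.append(tweaked)
    return result


def fine_tune(best, best_score, space, evaluate, rounds=3):
    """Greedily move to the best neighbour for up to `rounds` fine-tuning passes."""
    for _ in range(rounds):
        scored = [(evaluate(hc), hc) for hc in neighbours(best, space)]
        if not scored:
            break  # every super parameter has a single candidate value
        top_score, top = max(scored, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no neighbour improves on the current optimum
        best, best_score = top, top_score
    return best, best_score
```

Each pass costs one evaluation per neighbour, so the number of passes would normally be kept small.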

Corresponding to the method for determining the value of the super parameter in the target prediction model associated with the target service provided by the embodiments of fig. 1 to 6, the present application further provides a device for determining the value of the super parameter in the target prediction model associated with the target service. Because the device provided by the embodiment of the present application corresponds to the method provided by the embodiments of fig. 1 to 6, the implementations of the method are also applicable to the device and are not described in detail again in the embodiment of the present application.

Fig. 7 is a schematic structural diagram of a device for determining the value of a super parameter in a target prediction model associated with a target service according to an embodiment of the present application.

As shown in fig. 7, the apparatus 700 for determining the value of the super parameter in the target prediction model associated with the target service may include: the first determination module 710, the first execution module 720, the second execution module 730, and the second determination unit 740.

The first determining module 710 is configured to determine a hyper-parameter value subspace from the set hyper-parameter value space; the hyper-parameter value space comprises a plurality of value combinations, and each value combination comprises a candidate value corresponding to each hyper-parameter in the target prediction model.

A first execution module 720, configured to execute a first search process for a plurality of times on the super parameter value subspace; the first search process comprises the following steps: searching from the super-parameter value-taking subspace to obtain a first value-taking combination, obtaining a first evaluation index corresponding to the first value-taking combination, generating a first search record according to the first value-taking combination and the first evaluation index, and adding the first search record into a search record set.

The second execution module 730 is configured to execute a second search process on the hyper-parameter value space to obtain a second value combination obtained by searching in the second search process, obtain a second evaluation index corresponding to the second value combination, generate a second search record according to the second value combination and the second evaluation index, and update the search record set according to the second search record.

A second determining unit 740, configured to determine a target value of each super parameter in the target prediction model according to a target value combination in the target search record in the last updated search record set; wherein the evaluation index in the target search record is better than other search records.

As one possible implementation, the second search process includes at least one first sub-search process, any of which includes: generating an index prediction model by adopting each search record in the latest updated search record set; randomly searching a plurality of first candidate value combinations from the super-parameter value space, and respectively carrying out index prediction on the plurality of first candidate value combinations by adopting an index prediction model to obtain first prediction indexes of the plurality of first candidate value combinations; determining a second value combination obtained by searching in the first sub-searching process from the plurality of first candidate value combinations according to the first prediction indexes of the plurality of first candidate value combinations; generating a second search record corresponding to the first sub-search process according to the second value combination obtained by searching in the first sub-search process and the corresponding second evaluation index, and adding the second search record into a search record set; judging whether the set stop search condition is met, and if so, ending the execution of the second search process.

As one possible implementation, generating an index prediction model using each search record in the most recently updated search record set includes: sequentially inputting the value combination in each search record in the most recently updated search record set into an initial index prediction model to perform index prediction, so as to obtain a prediction evaluation index corresponding to the value combination in each search record; generating a loss value according to the difference between the evaluation index in each search record and the prediction evaluation index corresponding to each search record; and training the initial index prediction model according to the loss value to obtain a trained index prediction model.
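The loss-driven training described above can be illustrated with a small sketch. Since the application does not limit the modeling algorithm, the sketch assumes a linear index prediction model trained by gradient descent on the mean squared error between the recorded and predicted evaluation indexes; `train_index_model` and its defaults are illustrative, and value combinations are assumed to be numeric tuples scaled to comparable ranges.

```python
def train_index_model(records, epochs=2000, lr=0.1):
    """Fit s ~ w . hc + b by gradient descent on the mean squared error.

    records: list of (value combination as numeric tuple, evaluation index).
    Returns a callable that predicts the evaluation index of a combination.
    """
    n_features = len(records[0][0])
    n = len(records)
    w = [0.0] * n_features  # initial index prediction model
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for hc, s in records:
            # Difference between predicted and recorded evaluation index.
            err = sum(wi * xi for wi, xi in zip(w, hc)) + b - s
            for j, xj in enumerate(hc):
                grad_w[j] += 2 * err * xj / n
            grad_b += 2 * err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b

    def predict(hc):
        return sum(wi * xi for wi, xi in zip(w, hc)) + b

    return predict
```

A linear model is only a stand-in here; any regressor that maps a value combination to a predicted evaluation index could take its place.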

As a possible implementation manner, the second search process further includes: if the set search stopping condition is not met, counting the execution times of the first sub-search process; in response to the execution times reaching the first set times, judging whether a second value combination searched by a first sub-search process of the first set times meets a set re-search condition; if the set re-searching condition is met, re-executing the first sub-searching process; and if the set re-search condition is not met, executing at least one second sub-search process in the second search process.

As one possible implementation, setting the re-search condition includes: the first distance is smaller than a distance threshold, wherein the distance threshold is determined according to the product of a set coefficient and the second distance; the first distance is determined according to the distance between the second value combinations searched by the first sub-search process of the first set times; the distance is determined according to the difference between the elements at the same position in any two second value combinations; wherein the second distance is determined based on the distance between the value combinations in each search record in the newly updated set of search records.

As one possible implementation, any of the second sub-search processes includes: randomly searching a second candidate value combination from the super-parameter value space, and taking the second candidate value combination as a second value combination obtained by searching in a second sub-searching process; generating a second search record corresponding to the second sub-search process according to the second value combination obtained by searching in the second sub-search process and the corresponding second evaluation index, and adding the second search record into a search record set; judging whether the set search stopping condition is met, if yes, ending the execution of the second search process, and if not, judging whether the arrangement order of the second search records in the latest updated search record set in the second sub-search process is larger than a set value; if the arrangement order is smaller than or equal to the set value, the first sub-search process is re-executed, and if the arrangement order is larger than the set value, the next second sub-search process is executed.

As one possible implementation, setting the search stopping condition includes any one of the following: the searched total times are larger than the second set times; the searched total time length is larger than a set time length threshold value; the improvement rate of the optimal evaluation index of the value combination obtained by q continuous searches relative to the optimal evaluation index of each value combination searched before q searches is smaller than a set improvement rate threshold; wherein q is a set positive integer.
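The three stop search conditions can be combined in a single check, sketched below. The sketch assumes that `history` holds the evaluation indexes in search order, that a larger index is better and positive (for example an accuracy), and that the improvement rate is measured relative to the best index found before the last q searches; the parameter names and default thresholds are illustrative.

```python
import time


def should_stop(history, started_at, max_searches=100, max_seconds=3600.0,
                q=10, min_improvement=0.001, now=None):
    """Return True when any of the three stop search conditions holds."""
    now = time.monotonic() if now is None else now
    # Condition 1: total number of searches exceeds the set number.
    if len(history) > max_searches:
        return True
    # Condition 2: total search duration exceeds the set duration threshold.
    if now - started_at > max_seconds:
        return True
    # Condition 3: the last q searches barely improved on the earlier optimum.
    if len(history) > q:
        best_before = max(history[:-q])
        best_recent = max(history[-q:])
        if best_before > 0:
            rate = (best_recent - best_before) / best_before
            if rate < min_improvement:
                return True
    return False
```

The optional `now` argument exists only so that the time-based condition can be exercised deterministically; in normal use it is left unset and `time.monotonic()` is read.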

As a possible implementation manner, the first execution module is specifically configured to: assigning values to each super parameter in the target prediction model according to the first value combination to obtain an updated target prediction model; grouping the marked sample data in the sample set to obtain k sample subsets; wherein k is a positive integer; and training the updated target prediction model by adopting a k-fold cross validation algorithm according to the k sample subsets to obtain a first evaluation index corresponding to the first value combination.

As a possible implementation manner, the first determining module 710 is specifically configured to: determining a core superparameter and an auxiliary superparameter from a plurality of superparameters in a target prediction model; the influence degree of the core super-parameters on the prediction result of the target prediction model is higher than that of the auxiliary super-parameters on the prediction result of the target prediction model; counting a first number of core super-parameters and a second number of auxiliary super-parameters; acquiring a first set value number corresponding to the core super-parameters and a second set value number corresponding to the auxiliary super-parameters; wherein the first set number of values is greater than the second set number of values; determining a target number of value combinations contained in the super-parameter value subspace according to the first number, the first set value number, the second number and the second set value number; selecting a target number of value combinations from the super-parameter value space, and generating a super-parameter value subspace.

As a possible implementation manner, the target service is a loan service, the input data of the target prediction model is first service data associated with the loan service, the target prediction model is used for performing credit prediction on the first service data and outputting a credit score of the first service data, wherein the credit score is used for indicating the credit condition of a first object associated with the first service data, or the target prediction model is used for performing loan interest rate prediction on the first service data to obtain the loan interest rate of the first service data;

the target business is an account opening business, the input data of the target prediction model is second business data associated with the account opening business, the target prediction model is used for carrying out account opening prediction on the second business data and outputting account opening probability of the second business data, and the account opening probability is used for indicating the probability of handling the account opening business by a second object associated with the second business data;

the target business is a deposit business, the input data of the target prediction model is third business data related to the deposit business, and the target prediction model is used for predicting the deposit interest rate of the third business data and outputting the deposit interest rate of the third business data;

the target business is a fund supervision business, input data of the target prediction model are fourth business data related to the fund supervision business and at least one fifth business data related to the fourth business data, the target prediction model is used for predicting the illegal fund transfer probability of the fourth business data based on the fifth business data and outputting the prediction probability of the fourth business data, and the prediction probability is used for indicating the illegal fund transfer probability of a third object related to the fourth business data;

the target business is a financial business, the input data of the target prediction model is sixth business data related to the financial business, and the target prediction model is used for performing pricing prediction on the financial product related to the sixth business data, to obtain the predicted pricing at which a fourth object associated with the sixth business data purchases the financial product.

According to the device for determining the value of the super-parameters in the target prediction model associated with the target service, the target value combination is determined from the value combination obtained by multiple searches, wherein the evaluation index corresponding to the target value combination is optimal, and the target value of each super-parameter in the target prediction model is determined according to the target value combination, so that the accuracy of super-parameter setting can be improved, and the prediction precision of the target prediction model in a target service scene is improved.

In order to implement the above embodiment, the present application further proposes an electronic device, where the electronic device may be any device with computing capability, and the electronic device includes: the system comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the value determination method of the super-parameters in the target prediction model related to the target service according to any one of the embodiments of the application when executing the program.

As an example, fig. 8 is a schematic structural diagram of an electronic device 800 according to an exemplary embodiment of the present application, where, as shown in fig. 8, the electronic device 800 may further include:

a memory 810, a processor 820, and a bus 830 connecting the different components (including the memory 810 and the processor 820), wherein the memory 810 stores a computer program, and when the processor 820 executes the program, the method for determining the value of the super parameter in the target prediction model associated with the target service according to the embodiment of the application is implemented.

Bus 830 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.

Electronic device 800 typically includes a variety of electronic device readable media. Such media can be any available media that is accessible by electronic device 800 and includes both volatile and nonvolatile media, removable and non-removable media.

Memory 810 may also include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 840 and/or cache memory 850. Electronic device 800 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 860 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in FIG. 8, commonly referred to as a "hard disk drive"). Although not shown in fig. 8, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In such cases, each drive may be coupled to bus 830 through one or more data medium interfaces. Memory 810 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of embodiments of the application.

A program/utility 880 having a set (at least one) of program modules 870 may be stored, for example, in memory 810, such program modules 870 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 870 typically carry out the functions and/or methods of the embodiments described herein.

The electronic device 800 may also communicate with one or more external devices 890 (e.g., keyboard, pointing device, display 891, etc.), one or more devices that enable a user to interact with the electronic device 800, and/or any device (e.g., network card, modem, etc.) that enables the electronic device 800 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 892. Also, electronic device 800 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through network adapter 893. As shown, network adapter 893 communicates with other modules of electronic device 800 over bus 830. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 800, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.

Processor 820 executes various functional applications and data processing by executing programs stored in memory 810.

It should be noted that, the implementation process and the technical principle of the electronic device in this embodiment refer to the explanation of the above-mentioned method for determining the value of the super parameter in the target prediction model associated with the target service in this embodiment of the present application, which is not repeated here.

In order to achieve the foregoing embodiments, the present application further proposes a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method for determining a value of a super parameter in a target prediction model associated with a target service according to any one of the foregoing embodiments of the present application.

In order to implement the above embodiments, the present application further proposes a computer program product, which when executed by a processor, executes a method for determining the value of a super parameter in a target prediction model associated with a target service according to any of the foregoing embodiments of the present application.

In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined by those skilled in the art without contradiction.

Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise.

Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and additional implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order from that shown or discussed, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present application.

Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer readable medium may even be paper or other suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.

It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, the steps or methods may be implemented using any one or a combination of the following techniques, which are well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application specific integrated circuits having suitable combinational logic gates, Programmable Gate Arrays (PGAs), Field Programmable Gate Arrays (FPGAs), and the like.

Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the methods of the above-described embodiments may be implemented by a program that instructs related hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.

In addition, each functional unit in the embodiments of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.

The above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like. While embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and are not to be construed as limiting the application, and that changes, modifications, substitutions, and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.

Claims

1. A method for determining the value of a super-parameter in a target prediction model associated with a target service, characterized by comprising the following steps:

2. The method of claim 1, wherein the second search process comprises at least one first sub-search process, any of the first sub-search processes comprising:

generating an index prediction model using each search record in the most recently updated search record set;

randomly searching a plurality of first candidate value combinations from the super-parameter value space, and respectively carrying out index prediction on the plurality of first candidate value combinations by adopting the index prediction model to obtain first prediction indexes of the plurality of first candidate value combinations;

determining, from the plurality of first candidate value combinations and according to the first prediction indexes of the plurality of first candidate value combinations, a second value combination obtained by searching in the first sub-search process;

generating a second search record corresponding to the first sub-search process according to a second value combination obtained by searching in the first sub-search process and a corresponding second evaluation index, and adding the second search record into the search record set;

judging whether the set search stopping condition is met, and if so, ending executing the second search process.
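
As a non-limiting illustration, one first sub-search process of claim 2 might be sketched as follows. The sketch makes several assumptions that the claim does not fix: the super-parameter value space is given as numeric `(low, high)` bounds, a larger evaluation index is better, and a simple nearest-neighbour lookup stands in for the index prediction model. The names `fit_index_predictor`, `first_sub_search`, and `evaluate` are hypothetical, and the stop-condition check is left to the caller.

```python
import random

def fit_index_predictor(records):
    """Build a predictor from (value_combination, evaluation_index) records.

    Assumption: a 1-nearest-neighbour lookup stands in for the index
    prediction model; the claim does not prescribe a model type.
    """
    def predict(combo):
        nearest = min(
            records,
            key=lambda r: sum((a - b) ** 2 for a, b in zip(r[0], combo)),
        )
        return nearest[1]
    return predict

def first_sub_search(records, space, evaluate, n_candidates=20, rng=None):
    """Run one first sub-search process and append its record to `records`."""
    if rng is None:
        rng = random.Random(0)
    predict = fit_index_predictor(records)
    # Randomly search several first candidate value combinations.
    candidates = [
        tuple(rng.uniform(lo, hi) for lo, hi in space)
        for _ in range(n_candidates)
    ]
    # Keep the candidate with the best predicted index (assumed: higher is better).
    best = max(candidates, key=predict)
    # Obtain the real evaluation index and store the second search record.
    record = (best, evaluate(best))
    records.append(record)
    return record
```

Here `evaluate` would be the routine that actually trains and scores the target prediction model for a value combination; only the selected candidate is evaluated, which is what makes the predicted index useful as a cheap filter.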

3. The method of claim 2, wherein generating an index prediction model using each search record in the set of search records that was last updated comprises:

sequentially inputting the value combination in each search record of the most recently updated search record set into an initial index prediction model to perform index prediction, so as to obtain a prediction evaluation index corresponding to the value combination in each search record;

generating a loss value according to the difference between the evaluation index in each search record and the prediction evaluation index corresponding to each search record;

and training the initial index prediction model according to the loss value to obtain the trained index prediction model.
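
The training steps of claim 3 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the initial index prediction model is a linear model, that the loss value is the mean squared difference between recorded and predicted evaluation indexes, and that training uses plain gradient descent; none of these choices is stated in the claim. The names `train_index_predictor` and `mse_loss` are hypothetical.

```python
def train_index_predictor(records, epochs=1000, lr=0.1):
    """Fit a linear index predictor to (value_combination, index) records."""
    dim = len(records[0][0])
    weights, bias = [0.0] * dim, 0.0
    n = len(records)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for combo, index in records:
            # Prediction evaluation index for this record's value combination.
            predicted = bias + sum(w * x for w, x in zip(weights, combo))
            # Difference between predicted and recorded evaluation index.
            error = predicted - index
            for j, x in enumerate(combo):
                grad_w[j] += 2 * error * x / n
            grad_b += 2 * error / n
        # Gradient-descent update driven by the mean squared error loss.
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b

    def predict(combo):
        return bias + sum(w * x for w, x in zip(weights, combo))
    return predict

def mse_loss(predict, records):
    """Mean squared difference between recorded and predicted indexes."""
    return sum((predict(c) - y) ** 2 for c, y in records) / len(records)
```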

4. The method of claim 2, wherein the second search process further comprises:

if the set search stopping condition is not met, counting the number of executions of the first sub-search process;

in response to the number of executions reaching a first set number, judging whether the second value combinations searched in the first set number of first sub-search processes meet a set re-search condition;

if the set re-search condition is met, re-executing the first sub-search process;

and if the set re-search condition is not met, executing at least one second sub-search process in the second search process.
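
The branching of claim 4 can be summarised in a small decision function. This is only a sketch: the function name `next_action` and its string return values are hypothetical, and it assumes that first sub-search processes simply continue until the number of executions reaches the first set number, which the claim implies but does not state.

```python
def next_action(stop_met, executions, first_set_number, re_search_met):
    """Return the step to run after a first sub-search process finishes."""
    if stop_met:
        return "stop"
    if executions < first_set_number:
        # Assumption: keep running first sub-searches until the count is reached.
        return "first_sub_search"
    # Count reached: the re-search condition decides which sub-search runs next.
    return "first_sub_search" if re_search_met else "second_sub_search"
```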

5. The method of claim 4, wherein the set re-search condition comprises:

the first distance is smaller than a distance threshold, wherein the distance threshold is determined according to the product of a set coefficient and the second distance;

wherein the first distance is determined according to the distances between the second value combinations searched in the first set number of first sub-search processes, the distance between any two second value combinations being determined according to the differences between the elements at the same positions in the two second value combinations;

wherein the second distance is determined according to the distances between the value combinations in the search records of the most recently updated search record set.
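
One possible reading of the re-search condition of claim 5 is sketched below. The claim only says that the distances are determined from element-wise differences; the Euclidean distance, the use of the mean pairwise distance for both the first and second distances, and the default coefficient of 0.5 are assumptions made for illustration. All function names are hypothetical.

```python
from itertools import combinations
from math import sqrt

def distance(a, b):
    """Euclidean distance built from differences of same-position elements."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_pairwise_distance(combos):
    """Average distance over all pairs; 0.0 when fewer than two combinations."""
    pairs = list(combinations(combos, 2))
    if not pairs:
        return 0.0
    return sum(distance(a, b) for a, b in pairs) / len(pairs)

def meets_re_search_condition(recent_combos, all_combos, coefficient=0.5):
    """True when the first distance is below coefficient * second distance."""
    first_distance = mean_pairwise_distance(recent_combos)
    second_distance = mean_pairwise_distance(all_combos)
    return first_distance < coefficient * second_distance
```

Under this reading, the condition holds when the recently searched value combinations are tightly clustered compared with the spread of all recorded value combinations.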

6. The method of claim 4, wherein any of the second sub-search processes comprises:

randomly searching a second candidate value combination from the super-parameter value space, and taking the second candidate value combination as a second value combination obtained by searching in the second sub-searching process;

generating a second search record corresponding to the second sub-search process according to a second value combination obtained by searching in the second sub-search process and a corresponding second evaluation index, and adding the second search record into the search record set;

judging whether the set search stopping condition is met; if so, ending the execution of the second search process; if not, judging whether the rank, within the most recently updated search record set, of the second search record generated in the second sub-search process is greater than a set value;

and if the rank is smaller than or equal to the set value, re-executing the first sub-search process, and if the rank is greater than the set value, executing the next second sub-search process.
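
A second sub-search process as in claim 6 might look like the following sketch. It assumes numeric `(low, high)` bounds for the value space and assumes that search records are ranked by evaluation index in descending order, so that rank 1 is the best record; the claim does not specify the ordering. The stop-condition check is omitted here, and the names `second_sub_search` and `rank_limit` are hypothetical.

```python
import random

def second_sub_search(records, space, evaluate, rank_limit, rng):
    """Run one random sub-search and decide which sub-search follows."""
    # Randomly search one second candidate value combination.
    combo = tuple(rng.uniform(lo, hi) for lo, hi in space)
    record = (combo, evaluate(combo))
    records.append(record)
    # Rank of the new record in the updated set (assumed: best index first).
    ranked = sorted(records, key=lambda r: r[1], reverse=True)
    rank = ranked.index(record) + 1
    if rank <= rank_limit:
        next_step = "first_sub_search"
    else:
        next_step = "second_sub_search"
    return record, rank, next_step
```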

7. The method according to any one of claims 2-6, wherein the set search stopping condition includes any one of:

the total number of searches is greater than or equal to a second set number;

the total search duration is greater than or equal to a set duration threshold;

the improvement rate of the best evaluation index among the value combinations obtained in q consecutive searches, relative to the best evaluation index among all value combinations searched before those q searches, is smaller than a set improvement rate threshold; wherein q is a set positive integer.
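
The three alternatives of claim 7 can be combined into a single check, sketched below. The sketch assumes that a larger evaluation index is better, that the improvement rate is the relative gain `(best_recent - best_before) / |best_before|`, and that the rate test is skipped when the earlier best index is zero; the claim does not define the rate formula. The function and parameter names are hypothetical.

```python
def should_stop(indices, elapsed_seconds, max_searches, max_seconds,
                q, min_improvement):
    """Check the stop conditions on evaluation indexes listed in search order."""
    # Condition 1: total number of searches reached the second set number.
    if len(indices) >= max_searches:
        return True
    # Condition 2: total search duration reached the set duration threshold.
    if elapsed_seconds >= max_seconds:
        return True
    # Condition 3: the last q searches improved the best index too little.
    if q > 0 and len(indices) > q:
        best_before = max(indices[:-q])
        best_recent = max(indices[-q:])
        if best_before != 0:
            rate = (best_recent - best_before) / abs(best_before)
            if rate < min_improvement:
                return True
    return False
```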

8. The method according to any one of claims 1-6, wherein obtaining the first evaluation index corresponding to the first value combination comprises:

assigning values to each super parameter in the target prediction model according to the first value combination to obtain an updated target prediction model;

grouping the labeled sample data in the sample set to obtain k sample subsets; wherein k is a positive integer;

and training the updated target prediction model by adopting a k-fold cross validation algorithm according to the k sample subsets to obtain a first evaluation index corresponding to the first value combination.
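
The k-fold evaluation of claim 8 can be sketched generically. The sketch assumes that the sample subsets are formed by striding through the sample list and that the first evaluation index is the mean of the k per-fold scores; the claim does not fix either choice. `train_and_score` is a hypothetical callable that trains the target prediction model, already assigned the first value combination, on the training samples and returns its score on the validation samples.

```python
def k_fold_index(samples, k, train_and_score):
    """Return the mean validation score over k folds."""
    # Group the labeled samples into k subsets (every k-th sample per subset).
    folds = [samples[i::k] for i in range(k)]
    scores = []
    for i, validation in enumerate(folds):
        # Train on the other k - 1 subsets, validate on the held-out subset.
        training = [s for j, fold in enumerate(folds) if j != i for s in fold]
        scores.append(train_and_score(training, validation))
    return sum(scores) / k
```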

9. The method according to any one of claims 1-6, wherein determining a super-parameter value subspace from a set super-parameter value space comprises:

determining core super-parameters and auxiliary super-parameters from a plurality of super-parameters in the target prediction model, wherein the core super-parameters influence the prediction result of the target prediction model to a greater degree than the auxiliary super-parameters do;

counting the first number of the core super-parameters and the second number of the auxiliary super-parameters;

acquiring a first set value number corresponding to the core super-parameter and a second set value number corresponding to the auxiliary super-parameter; wherein the first set number of values is greater than the second set number of values;

determining a target number of value combinations contained in the super-parameter value subspace according to the first number, the first set value number, the second number and the second set value number;

selecting the target number of value combinations from the super-parameter value space to generate the super-parameter value subspace.
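
The claim does not give the formula for the target number, so the following sketch rests on an explicit assumption: each core super-parameter takes the first set number of evenly spaced values, each auxiliary super-parameter takes the second set number of evenly spaced values, and the subspace is their Cartesian product, giving `core_count ** n_core * aux_count ** n_aux` value combinations. The names `pick_values` and `build_subspace` and the numeric `(low, high)` ranges are hypothetical.

```python
from itertools import product

def pick_values(lo, hi, count):
    """Return `count` evenly spaced values in [lo, hi] (midpoint if count is 1)."""
    if count == 1:
        return [(lo + hi) / 2]
    step = (hi - lo) / (count - 1)
    return [lo + i * step for i in range(count)]

def build_subspace(core_ranges, aux_ranges, core_count, aux_count):
    """Build a grid subspace with more values per core super-parameter."""
    # Assumption: core_count > aux_count, matching the claimed relationship.
    axes = [pick_values(lo, hi, core_count) for lo, hi in core_ranges]
    axes += [pick_values(lo, hi, aux_count) for lo, hi in aux_ranges]
    # Each tuple is one value combination: core values first, then auxiliary.
    return list(product(*axes))
```

With two core super-parameters at three values each and one auxiliary super-parameter at two values, this grid contains 3 × 3 × 2 = 18 value combinations.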

10. The method according to any one of claims 1 to 6, wherein,

the target business is a loan business, the input data of the target prediction model is first business data associated with the loan business, the target prediction model is used for performing credit prediction on the first business data and outputting a credit score of the first business data, wherein the credit score is used for indicating the credit condition of a first object associated with the first business data, or the target prediction model is used for performing loan interest rate prediction on the first business data to obtain the loan interest rate of the first business data;

the target service is an account opening service, the input data of the target prediction model is second service data related to the account opening service, the target prediction model is used for carrying out account opening prediction on the second service data and outputting account opening probability of the second service data, and the account opening probability is used for indicating the probability of handling the account opening service by a second object related to the second service data;

the target business is financial business, the input data of the target prediction model is sixth business data related to the financial business, and the target prediction model is used for pricing prediction of financial products related to the sixth business data, so that predicted pricing of purchasing the financial products by a fourth object related to the sixth business data is obtained.