CN115526266A - Model training method and device, and business prediction method and device - Google Patents

Model training method and device, and business prediction method and device

Info

Publication number
CN115526266A
CN115526266A
Authority
CN
China
Prior art keywords
parameter
training
current
parameter value
neural network
Prior art date
Legal status
Granted
Application number
CN202211272252.0A
Other languages
Chinese (zh)
Other versions
CN115526266B (en)
Inventor
易灿
张天翼
Current Assignee
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202211272252.0A
Publication of CN115526266A
Application granted
Publication of CN115526266B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiments of this specification provide a training method and apparatus for a neural network model, and a business prediction method and apparatus. When training the neural network model, training sample data is obtained from historical service data. In each training round, the following steps are performed: the training sample data is input into the neural network model to adjust the parameter value of each parameter in the model; whether the current round of training satisfies a parameter acquisition condition is detected, and if so, the current parameter value of each parameter obtained in this round of training is recorded. After all training rounds are finished, for each parameter of the neural network model, a final parameter value is obtained from the at least one recorded current parameter value of that parameter, and the parameter value of each parameter in the model is set to the final parameter value corresponding to that parameter. The embodiments of this specification make better use of a neural network model for business prediction while reducing the consumption of system resources.

Description

Model training method and device, and business prediction method and device
Technical Field
One or more embodiments of the present specification relate to computer technology, and more particularly, to a model training method and apparatus, and a service prediction method and apparatus.
Background
An artificial neural network is an information-processing model whose structure resembles the synaptic connections of the brain. A neural network model is an operational model formed by a large number of interconnected nodes (also called neurons). Each node represents a particular output function, called an excitation function. Every connection between two nodes carries a weighted value, called a weight, for the signal passing through it, which serves as the memory of the artificial neural network. The output of the network depends on its connection structure, its weight values, and its excitation functions. A neural network model can realize functions such as function approximation, data clustering, pattern classification, and optimization. Neural network models are therefore widely used for information processing in fields such as artificial intelligence, automatic control, robotics, and statistics, for example for risk control in payment services.
A neural network model contains various parameters, whose values can be determined by training the model. Different parameter values greatly influence the performance of the neural network model.
Therefore, how to make better use of a neural network model for service prediction, so as to obtain more accurate prediction results, is a problem that urgently needs to be solved.
Disclosure of Invention
One or more embodiments of the present specification describe a model training method and apparatus, a business prediction method and apparatus, which can better utilize a neural network model for business prediction.
According to a first aspect, a method for training a neural network model is provided, wherein the method comprises the following steps:
acquiring training sample data according to historical service data;
in each training round:
inputting training sample data into the neural network model to adjust the parameter value of each parameter in the neural network model; and
detecting whether the training of the current round meets the parameter obtaining condition, and if so, recording the current parameter value of each parameter in the neural network model obtained by the training of the current round;
after each training turn is finished, aiming at each parameter of the neural network model, obtaining a final parameter value corresponding to the parameter according to at least one recorded current parameter value of the parameter;
and setting the parameter value of each parameter in the neural network model as the final parameter value corresponding to the parameter.
Wherein, whether the detection of the current round of training meets the parameter acquisition condition comprises:
detecting whether the current round reaches the sampling period or not according to the preset sampling period, and if so, determining that the current round of training meets the parameter acquisition condition; wherein the sampling period represents that a parameter value is obtained every N rounds of training; n is a positive integer not less than 1;
or,
and judging whether the current parameter value of each parameter obtained in the current round is superior to the current parameter value of each parameter recorded last time, and if so, determining that the training in the current round meets the parameter obtaining condition.
The step of detecting whether the training of the current round meets the parameter acquisition condition is executed in the preset L-th round and each round after the preset L-th round, and the step of detecting whether the training of the current round meets the parameter acquisition condition is not executed in each round before the preset L-th round; wherein L is a positive integer greater than 1.
Wherein, the obtaining of the final parameter value corresponding to the parameter according to the recorded at least one current parameter value of the parameter includes:
and selecting the maximum parameter value from at least one recorded current parameter value of the parameter, and taking the maximum parameter value as a final parameter value corresponding to the parameter.
Wherein, the obtaining of the final parameter value corresponding to the parameter according to the recorded at least one current parameter value of the parameter includes:
determining a weight value corresponding to each current parameter value of the recorded parameters;
and carrying out weighted average on each current parameter value of the parameter to obtain the final parameter value corresponding to the parameter.
Wherein, the determining the weight value corresponding to each current parameter value of the recorded parameter includes:
setting the weight value corresponding to each current parameter value to 1/M, wherein M is the number of current parameter values recorded for the parameter;
or,
and setting a weight value corresponding to each current parameter value according to the round in which each current parameter value was obtained, wherein the earlier the round in which a current parameter value was obtained, the smaller the weight value corresponding to that current parameter value, and the later the round, the larger the weight value corresponding to that current parameter value.
According to a second aspect, a service prediction method is provided, which includes:
acquiring service data to be predicted;
inputting business data to be predicted into a neural network model; the neural network model is trained by using the method of the embodiment of the specification;
obtaining the recognition result of the business data to be predicted, which is output by the neural network model.
according to a third aspect, there is provided a model training apparatus comprising:
the sample acquisition module is configured to acquire training sample data according to historical service data;
a training execution module configured to execute, in each round of training: inputting training sample data into the neural network model to adjust the parameter value of each parameter in the neural network model; detecting whether the training of the current round meets the parameter acquisition condition, and if so, recording the current parameter value of each parameter in the neural network model obtained by the training of the current round;
the parameter value determining module is configured to obtain a final parameter value corresponding to each parameter of the neural network model according to at least one recorded current parameter value of the parameter after each round of training is finished;
and the model generation module is configured to set the parameter value of each parameter in the neural network model as the final parameter value corresponding to the parameter.
According to a fourth aspect, a service prediction apparatus is provided, which includes:
the service data acquisition module is configured to acquire service data to be predicted;
the input module is configured to input the service data to be predicted into the neural network model; the neural network model is trained by the model training device in the embodiment of the specification;
and the output module is configured to obtain the recognition result of the business data to be predicted, which is output by the neural network model.
According to a fifth aspect, there is provided a computing device comprising a memory having stored therein executable code and a processor that, when executing the executable code, implements a method as described in any of the embodiments of the present specification.
The model training method and device, and the service prediction method and device provided by the embodiments of the present specification have at least the following beneficial effects:
1. Only one neural network model needs to be trained, rather than a plurality of neural network models as in the prior art, so only this one trained neural network model needs to be used in subsequent deployment and service prediction, which greatly saves system resources.
2. The parameter values are not simply taken from the model obtained in the final stage of training. Instead, each time the parameter acquisition condition is satisfied in a training round, the current parameter values of the parameters are recorded, and the final parameter values of the neural network model are obtained not from a single training round but by combining the current parameter values recorded over multiple training rounds. In other words, in the neural network model finally trained by the embodiments of this specification, the parameter value of each parameter reflects the characteristics of the model at different stages of the training process, much as a student is assessed not by a single end-of-term examination but by combining the results of multiple examinations over the term. The trained neural network model therefore better meets the training requirements, learns richer content, and can subsequently perform service prediction, such as risk assessment, more accurately.
Drawings
In order to more clearly illustrate the embodiments of the present specification or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the description below are some embodiments of the present specification, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flow chart of a method for training a neural network model in one embodiment of the present disclosure.
Fig. 2 is a flow chart of a service prediction method in one embodiment of the present specification.
Fig. 3 is a schematic structural diagram of a training apparatus for a neural network model in one embodiment of the present disclosure.
Fig. 4 is a schematic structural diagram of a service prediction apparatus in one embodiment of the present specification.
Detailed Description
As previously mentioned, the values of the parameters can be determined by training the neural network model. The performance of the neural network model is greatly influenced by the difference of the parameter values. In the prior art, in order to better perform service prediction and obtain an accurate service prediction result, a plurality of neural network models are usually deployed, and the neural network models have the same structure but have different parameter values for parameters. The plurality of neural network models are used to obtain respective service prediction results, and then the obtained plurality of service prediction results are used to comprehensively obtain (for example, by using a voting method or an averaging method) a final service prediction result. Because a plurality of neural network models with different parameter values are used for comprehensively predicting one service, the influence of different parameter values on the neural network models can be more fully considered, and a better service prediction result is obtained.
However, a neural network model occupies considerable resources in both deployment and operation. Deploying a plurality of neural network models online at the same time in order to perform service prediction therefore consumes a great amount of resources and can sometimes prevent the system from operating normally.
The scheme provided by the specification is described in the following with reference to the attached drawings.
FIG. 1 is a flow chart of a method for training a neural network model in one embodiment of the present disclosure. The execution subject of the method is a training device of the neural network model. It is to be understood that the method may also be performed by any apparatus, device, platform, cluster of devices having computing, processing capabilities. Referring to fig. 1, the method includes:
step 101: and acquiring training sample data according to the historical service data.
Step 103, step 105, step 107 and step 109 are executed in each round of training:
step 103: inputting training sample data into a neural network model to adjust the parameter value of each parameter in the neural network model;
step 105: detecting whether the training of the current round meets parameter acquisition conditions, if so, executing step 107, otherwise, executing step 109;
step 107: and recording the current parameter value of each parameter in the neural network model obtained by the training of the current round.
Step 109: and judging whether all the training rounds are finished, if so, executing the step 111, and otherwise, returning to the step 103.
Step 111: and after all rounds of training are finished, aiming at each parameter of the neural network model, and obtaining a final parameter value corresponding to the parameter according to at least one recorded current parameter value of the parameter.
Step 113: and setting the parameter value of each parameter in the neural network model as the final parameter value corresponding to the parameter.
As can be seen from the flow shown in fig. 1, in the embodiment of the present specification, only one neural network model needs to be trained, instead of training a plurality of neural network models in the prior art, so that only the trained neural network model needs to be used in subsequent deployment and service prediction, thereby greatly saving system resources.
Meanwhile, in the training process shown in fig. 1, the parameter values of the parameters are not simply taken from the neural network model obtained in a certain round at the final stage. Instead, each time the parameter acquisition condition is satisfied in a training round, the current parameter values of the parameters are recorded, and the final parameter values of the neural network model are obtained not from a single training round but by combining the current parameter values recorded over multiple training rounds. In other words, in the neural network model finally trained by the embodiments of this specification, the parameter value of each parameter reflects the characteristics of the model at different stages of the training process, much as a student is assessed not by a single end-of-term examination but by combining the results of multiple examinations over the term. The trained neural network model therefore better meets the training requirements, learns richer content, and can subsequently perform service prediction, such as risk assessment, more accurately.
Each step in fig. 1 is described below with reference to a specific example.
First for step 101: and acquiring training sample data according to the historical service data.
The method of the embodiments of this specification can be applied to various service scenarios, for example a face recognition scenario, in which a neural network model for face recognition is trained, or a risk-control scenario, in which a neural network model is trained for risk recognition of transactions and the like.
Accordingly, the training sample data is sample data in the corresponding service scenario, for example a face image with a label (real or fake), or transaction data with a label (risky or not risky).
When a neural network model needs to be trained, the structure of the model, such as the structure of an input layer, an intermediate layer and an output layer, can be set first.
Then, the parameter value of each parameter in the neural network model needs to be determined through multiple rounds of training, for example 10,000 rounds.
Next for step 103: in each training round, training sample data is input into the currently trained neural network model so as to adjust the parameter value of each parameter in the neural network model.
The training process adjusts the parameter values of the parameters in the neural network model, with the aim of making the parameter values optimal and improving the performance of the model.
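As an illustration of step 103, the following is a minimal sketch of one training round, assuming a PyTorch model, an optimizer, a loss function, and a data loader of labeled training samples; the embodiments of this specification do not prescribe any particular framework, loss, or optimizer, so these are assumptions for illustration only.

```python
# Minimal sketch of one training round (step 103), assuming PyTorch.
# model, loader, optimizer and loss_fn are supplied by the caller.
def run_one_round(model, loader, optimizer, loss_fn):
    model.train()
    for features, labels in loader:
        optimizer.zero_grad()
        outputs = model(features)        # forward pass on training sample data
        loss = loss_fn(outputs, labels)  # compare predictions with labels
        loss.backward()                  # compute gradients
        optimizer.step()                 # adjust the value of every parameter
```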
Next for step 105: in each training round, whether the training round meets the parameter acquisition condition is detected.
As described above, in the embodiment of the present specification, the parameter values of the parameters of the finally obtained neural network model are not obtained in one training round, but are obtained by being integrated according to a plurality of current parameter values obtained in a plurality of training rounds. Therefore, in this step 105, it is detected whether the parameter obtaining condition is satisfied in the current round of training, and if so, the current parameter value of each parameter in the current neural network model trained in the current round of training is recorded in step 107, so as to embody the characteristics of the neural network model in this training stage.
In this step 105, the implementation manner of detecting whether the current round of training satisfies the parameter obtaining condition at least includes:
in the first mode, the parameter obtaining condition corresponds to a period, and is recorded as a sampling period. The parameter acquisition condition is considered to be satisfied once every N rounds of training.
In the first mode, the step of detecting whether the current round of training satisfies the parameter obtaining condition may include:
step 1051: detecting whether the current round reaches the sampling period or not according to the preset sampling period, and if so, determining that the current round of training meets the parameter acquisition condition; wherein N is a positive integer not less than 1, and the sampling period represents that parameter values are obtained every N times; if not, the current round of training is considered not to meet the parameter acquisition condition.
For example, suppose the preset sampling period indicates that parameter values are acquired every 10 rounds. Then the parameter acquisition condition is met in rounds 10, 20, 30, 40, and so on; that is, every 10 rounds the current parameter values of the parameters obtained in that round need to be recorded. If, in step 1051, the current round is detected to be the 9th round, the sampling period of every 10 rounds has not been reached, so the parameter acquisition condition is not satisfied. If the current round is detected to be the 30th round, the sampling period of every 10 rounds has been reached, so the parameter acquisition condition is satisfied.
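A minimal sketch of this mode-1 check follows, assuming rounds are numbered starting from 1; the helper name is illustrative only.

```python
# Mode 1 (step 1051): the parameter acquisition condition is met once every n rounds.
def meets_sampling_period(current_round: int, n: int) -> bool:
    return current_round % n == 0

# With n = 10: rounds 10, 20, 30, ... satisfy the condition; round 9 does not.
```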
In the second mode, the parameter acquisition condition corresponds to the parameter values becoming better; that is, the parameter acquisition condition is considered to be satisfied whenever the parameter values of the parameters obtained in a round are better than the parameter values recorded last time.
In the second mode, the step of detecting whether the current round of training satisfies the parameter obtaining condition may include:
step 1053: after the training of the current round is finished, judging whether the current parameter value of each parameter in the neural network model obtained in the current round is superior to the current parameter value of each parameter recorded last time, and if so, determining that the training of the current round meets the parameter obtaining condition.
Next for step 107: and recording the current parameter value of each parameter in the neural network model obtained by the training of the current round.
The processing of steps 105 and 107 is illustrated as follows. Suppose that, for a neural network model, the current parameter values of the parameters are recorded in the 80th round of training. The parameter values recorded in the 80th round then serve as the comparison reference: if the current parameter values obtained in the 81st round are not better than those recorded in the 80th round, the values obtained in the 81st round are not recorded; likewise, if the current parameter values obtained in the 82nd, 83rd, and 84th rounds are not better than those recorded in the 80th round, they are not recorded either. If the current parameter values obtained in the 85th round are better than the values recorded last time (i.e., in the 80th round), the current parameter values obtained in the 85th round are recorded. The values recorded in the 85th round then become the new comparison reference: if the current parameter values obtained in the 86th round are better than those recorded in the 85th round, they are recorded; otherwise they are not.
In the embodiments of this specification, whether the current parameter values are better than those recorded last time may be determined, for the training sample data, according to the degree of similarity between the recognition result output by the neural network model and the label. For example, suppose that in the 80th round of training, the similarity between the recognition result output by the neural network model for a training sample (for example, a score of 0.8) and the label of that sample (for example, 1) is 0.8. If, in the 81st, 82nd, 83rd, and 84th rounds, the similarity between the recognition results and the labels is less than 0.8, the current parameter values obtained in those rounds are not better than the values recorded last time (i.e., in the 80th round). If, in the 85th round, the similarity (or average similarity) between the recognition results and the labels is greater than 0.8, the current parameter values obtained in the 85th round are better than the values recorded last time, and the current values of the parameters of the neural network model obtained in the 85th round are recorded.
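The following sketch illustrates this mode-2 check under the assumptions of the example above: the quality of a round is measured by the average similarity between the model's outputs and the labels, and a snapshot is recorded only when this score beats the last recorded score. The metric and all names are assumptions; the embodiments only require comparing the current round against the last recorded one.

```python
import torch

def round_score(model, loader) -> float:
    """Average similarity between model outputs and labels (assumed metric)."""
    model.eval()
    total, count = 0.0, 0
    with torch.no_grad():
        for features, labels in loader:
            outputs = model(features).squeeze(-1)  # e.g. risk scores in [0, 1]
            total += (1.0 - (outputs - labels.float()).abs()).sum().item()
            count += labels.numel()
    return total / count

# Inside the training loop (hypothetical bookkeeping variables):
#   score = round_score(model, loader)
#   if last_recorded_score is None or score > last_recorded_score:
#       snapshots.append({k: v.detach().clone() for k, v in model.state_dict().items()})
#       last_recorded_score = score
```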
Next for step 109: and judging whether all the training rounds are finished, if so, executing the step 111, otherwise, returning to the step 103.
In step 109, whether all rounds of training are finished may be determined by checking whether the number of rounds trained so far reaches a preset threshold; for example, if 100 rounds of training are preset and the current round is the 100th round, all rounds of training are finished.
In step 109, the method of determining whether all rounds of training are finished may also be to determine whether the neural network model converges, for example, whether the loss function meets a preset requirement, and if so, it indicates that all rounds of training are finished.
Generally, in the training of a neural network model, the more rounds of training, the better the performance of the model, and the performance usually cannot meet the requirements at the beginning of training. For example, if 10,000 rounds of training are performed in total, the performance of the model generally cannot meet the requirements in the first 8,000 rounds, while in the last 2,000 rounds the model becomes better and better. Therefore, in this specification, in the preset L-th round and in each round after it, the processing of steps 105 and 107 is performed after step 103; in each round before the preset L-th round, the processing of steps 105 and 107 is not performed after step 103, and step 109 is executed directly, where L is a positive integer greater than 1. For example, if L is 8,000, then in each training round before the 8,000th round only step 103 is executed and steps 105 and 107 are skipped, while in each round from the 8,000th round to the 10,000th round, steps 103, 105, and 107 are all executed.
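A one-line sketch of this rule follows: steps 105 and 107 are only considered from the preset L-th round onward. The function name and the round numbering (starting at 1) are assumptions.

```python
# Only check the parameter acquisition condition from round l_threshold onward.
def should_check_condition(current_round: int, l_threshold: int) -> bool:
    return current_round >= l_threshold

# e.g. 10,000 rounds in total with l_threshold = 8000: rounds 1-7999 only run
# step 103; from round 8000 on, steps 105 and 107 are executed as well.
```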
Next for step 111: and after all rounds of training are finished, aiming at each parameter of the neural network model, and obtaining a final parameter value corresponding to the parameter according to at least one recorded current parameter value of the parameter.
The implementation manner of this step 111 includes:
and B, selecting the maximum value.
In mode A, the implementation of step 111 includes the following step 1111: for each parameter of the neural network model, selecting the maximum parameter value from the recorded at least one current parameter value of the parameter, and taking the maximum parameter value as the final parameter value corresponding to the parameter.
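A sketch of mode A follows, under the assumption that each recorded snapshot is a PyTorch state_dict-style mapping from parameter name to tensor; the maximum is taken element-wise over the recorded snapshots.

```python
import torch

# Mode A (step 1111): per-parameter, element-wise maximum over all snapshots.
def combine_by_max(snapshots):
    final = {}
    for name in snapshots[0]:
        stacked = torch.stack([snap[name] for snap in snapshots])  # shape [M, ...]
        final[name] = stacked.max(dim=0).values
    return final
```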
Mode B: taking a weighted average.
In this mode B, the implementation process of step 111 includes the following steps:
step 11121: aiming at each parameter of the neural network model, determining a weight value corresponding to each current parameter value of the recorded parameter;
step 11123: and carrying out weighted average on each current parameter value of the parameter to obtain a final parameter value.
The implementation manner of the step 11121 may include:
Mode 11: setting the weight value corresponding to each current parameter value to 1/M, wherein M is the number of current parameter values recorded for the parameter.
Mode 12: setting the weight value corresponding to each current parameter value according to the round in which it was obtained, wherein the earlier the round in which a current parameter value was obtained, the smaller the weight value corresponding to that current parameter value, and the later the round, the larger the weight value corresponding to that current parameter value.
The implementation of step 111 is described taking mode 11 as an example. Suppose that, for a certain parameter of the neural network model, 100 current parameter values have been recorded in total. Then the weight value corresponding to each current parameter value is 1/100, and the final parameter value of this parameter is obtained by multiplying each of the 100 current parameter values by 1/100 and summing the products.
The implementation of step 111 is described taking mode 12 as an example. Suppose that, for a certain parameter of the neural network model, 100 current parameter values have been recorded in total. Since the later the training round, the better the performance, the weight value corresponding to a current parameter value obtained in a later round is larger. For instance, the weight value corresponding to the current parameter value obtained in the 85th round is greater than that corresponding to the current parameter value obtained in the 80th round, and the weight value corresponding to the current parameter value obtained in the 91st round is greater than that corresponding to the current parameter value obtained in the 85th round.
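A sketch of mode B covering both weighting schemes follows, assuming the snapshots are state_dict-style mappings of floating-point tensors. Mode 11 uses uniform weights 1/M; for mode 12 the weights below are made proportional to the round number in which each snapshot was recorded, which is one possible choice under the rule above, not the only one.

```python
import torch

def combine_by_weighted_average(snapshots, rounds=None):
    m = len(snapshots)
    if rounds is None:                              # mode 11: uniform weights 1/M
        weights = [1.0 / m] * m
    else:                                           # mode 12: later rounds weigh more
        total = float(sum(rounds))
        weights = [r / total for r in rounds]
    final = {}
    for name in snapshots[0]:
        acc = torch.zeros_like(snapshots[0][name], dtype=torch.float32)
        for w, snap in zip(weights, snapshots):
            acc += w * snap[name].float()
        final[name] = acc.to(snapshots[0][name].dtype)
    return final

# Mode 11: combine_by_weighted_average(snapshots)
# Mode 12: combine_by_weighted_average(snapshots, rounds=[80, 85, 91, ...])
```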
And step 113: and setting the parameter value of each parameter in the neural network model as the final parameter value corresponding to the parameter.
After step 113 is executed, each parameter in the neural network model is assigned a final parameter value that reflects the characteristics of the model at the various stages of the training process.
It should be noted that, in the embodiments of this specification, the parameters of the neural network model referred to in fig. 1 may be all of the parameters in the model, or only a specified subset of important parameters.
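Tying steps 101 through 113 together, the sketch below trains a single model, records snapshots of its parameter values when the acquisition condition holds, combines them, and writes the result back into the model. It reuses run_one_round and combine_by_weighted_average from the sketches above; the framework (PyTorch), the sampling-period condition, and the default round counts are all illustrative assumptions.

```python
def train_with_snapshots(model, loader, optimizer, loss_fn,
                         total_rounds=10_000, l_threshold=8_000, n=10):
    snapshots = []
    for current_round in range(1, total_rounds + 1):
        run_one_round(model, loader, optimizer, loss_fn)              # step 103
        if current_round >= l_threshold and current_round % n == 0:   # step 105
            snapshots.append({k: v.detach().clone()                   # step 107
                              for k, v in model.state_dict().items()})
    final_values = combine_by_weighted_average(snapshots)             # step 111
    model.load_state_dict(final_values)                               # step 113
    return model
```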
After the neural network model is trained by using the method of the embodiment of the present specification, the model can be used for business prediction. Referring to fig. 2, in an embodiment of the present specification, a traffic prediction method is provided, including the following steps:
step 201: acquiring service data to be predicted;
step 203: inputting service data to be predicted into a neural network model;
step 205: and obtaining the recognition result of the business data to be predicted, which is output by the neural network model.
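A minimal sketch of the prediction flow of fig. 2 follows, assuming the model trained as above and a single PyTorch tensor of features standing in for the service data to be predicted; names are illustrative.

```python
import torch

def predict(model, features: torch.Tensor):
    model.eval()
    with torch.no_grad():
        result = model(features)   # step 203: input service data into the model
    return result                  # step 205: recognition result

# e.g. a risk score for one transaction:
#   score = predict(trained_model, transaction_features.unsqueeze(0))
```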
The embodiment of the present specification further provides a training apparatus for a neural network model, and referring to fig. 3, the apparatus includes:
the sample acquisition module 301 is configured to acquire training sample data according to historical service data;
a training execution module 302 configured to execute, in each training round: inputting training sample data into the neural network model to adjust the parameter value of each parameter in the neural network model; detecting whether the training of the current round meets the parameter acquisition condition, and if so, recording the current parameter value of each parameter in the neural network model obtained by the training of the current round;
a parameter value determining module 303, configured to obtain, for each parameter of the neural network model, a final parameter value corresponding to the parameter according to at least one recorded current parameter value of the parameter after each round of training is finished;
the model generation module 304 is configured to set a parameter value of each parameter in the neural network model as a final parameter value corresponding to the parameter.
In the embodiment of the present specification apparatus illustrated in fig. 3, the training execution module 302 is configured to perform: detecting whether the sampling period is reached in the current round according to a preset sampling period, and if so, determining that the current round of training satisfies the parameter acquisition condition, wherein N is a positive integer not less than 1 and the sampling period indicates that parameter values are acquired every N rounds of training.
In the embodiment of the present specification apparatus illustrated in fig. 3, the training execution module 302 is configured to perform: judging whether the current parameter value of each parameter obtained in the current round is better than the current parameter value of each parameter recorded last time, and if so, determining that the current round of training satisfies the parameter acquisition condition.
In the embodiment of the present specification apparatus illustrated in fig. 3, the training performing module 302 is configured to perform: in the preset L-th round and each round after the L-th round, the step of detecting whether the training of the current round meets the parameter acquisition condition is executed, otherwise, the step is not executed; wherein L is a positive integer greater than 1.
In the embodiment of the present specification apparatus illustrated in fig. 3, the parameter value determination module 303 is configured to perform: and selecting the maximum parameter value from at least one recorded current parameter value of the parameter, and taking the maximum parameter value as a final parameter value corresponding to the parameter.
In the embodiment of the present specification apparatus illustrated in fig. 3, the parameter value determination module 303 is configured to perform: determining a weight value corresponding to each current parameter value of the recorded parameters;
and carrying out weighted average on each current parameter value of the parameter to obtain the final parameter value.
In the embodiment of the present specification apparatus illustrated in fig. 3, the parameter value determination module 303 is configured to perform: setting the weight value corresponding to each current parameter value to 1/M, wherein M is the number of current parameter values recorded for the parameter.
In the embodiment of the present specification apparatus illustrated in fig. 3, the parameter value determination module 303 is configured to perform:
and setting a weight value corresponding to each current parameter value according to the round in which it was obtained, wherein the earlier the round in which a current parameter value was obtained, the smaller the weight value corresponding to that current parameter value, and the later the round, the larger the weight value corresponding to that current parameter value.
An embodiment of the present specification provides a service prediction apparatus, and referring to fig. 4, the apparatus includes:
a service data obtaining module 401 configured to obtain service data to be predicted;
an input module 402 configured to input the service data to be predicted into the neural network model; the neural network model is trained by using a model training device in the embodiment of the specification;
and an output module 403 configured to obtain an identification result of the to-be-predicted service data output by the neural network model.
It should be noted that the above apparatuses are usually implemented on the server side. They may be deployed on separate servers, or some or all of them may be deployed together on the same server. The server may be a single server or a server cluster composed of a plurality of servers, and may be a cloud server (also called a cloud computing server or cloud host), a host product in a cloud computing service system that overcomes the drawbacks of difficult management and weak service scalability in traditional physical hosts and Virtual Private Server (VPS) services. The above apparatuses may also be implemented on computer terminals with strong computing power.
An embodiment of the present specification provides a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of any of the embodiments of the specification.
One embodiment of the present specification provides a computing device comprising a memory and a processor, the memory having stored therein executable code, the processor implementing a method in accordance with any one of the embodiments of the specification when executing the executable code.
It should be understood that the structures illustrated in the embodiments of this specification do not constitute a specific limitation on the apparatuses of the embodiments. In other embodiments of the specification, the apparatus may include more or fewer components than illustrated, some components may be combined or split, or a different arrangement of components may be used. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and reference may be made to the partial description of the method embodiment for relevant points.
Those skilled in the art will recognize that the functionality described in this disclosure may be implemented in hardware, software, firmware, or any combination thereof, in one or more of the examples described above. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The above-mentioned embodiments, objects, technical solutions and advantages of the present invention are further described in detail, it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made on the basis of the technical solutions of the present invention should be included in the scope of the present invention.

Claims (10)

1. A training method of a neural network model, the method comprising the following steps:
acquiring training sample data according to historical service data;
in each training round, the following steps are performed:
inputting training sample data into the neural network model to adjust the parameter value of each parameter in the neural network model; and
detecting whether the training of the current round meets the parameter obtaining condition, and if so, recording the current parameter value of each parameter in the neural network model obtained by the training of the current round;
after each training turn is finished, aiming at each parameter of the neural network model, obtaining a final parameter value corresponding to the parameter according to at least one recorded current parameter value of the parameter;
and setting the parameter value of each parameter in the neural network model as the final parameter value corresponding to the parameter.
2. The method of claim 1, wherein the detecting whether the parameter acquisition condition is satisfied by the current round of training comprises:
detecting whether the current round reaches the sampling period or not according to the preset sampling period, and if so, determining that the current round of training meets the parameter acquisition condition; wherein the sampling period represents that a parameter value is obtained every N rounds of training; n is a positive integer not less than 1;
or,
and judging whether the current parameter value of each parameter obtained in the current round is better than the current parameter value of each parameter recorded last time, and if so, determining that the training in the current round meets the parameter obtaining condition.
3. The method according to claim 1, wherein the step of detecting whether the training of the current round satisfies the parameter acquisition condition is performed in the preset L-th round and in each round after the preset L-th round, and the step of detecting whether the training of the current round satisfies the parameter acquisition condition is not performed in each round before the preset L-th round; wherein L is a positive integer greater than 1.
4. The method of claim 1, wherein obtaining the final parameter value corresponding to the parameter according to the recorded at least one current parameter value of the parameter comprises:
and selecting the maximum parameter value from at least one recorded current parameter value of the parameter, and taking the maximum parameter value as a final parameter value corresponding to the parameter.
5. The method of claim 1, wherein obtaining a final parameter value corresponding to the parameter according to the recorded at least one current parameter value of the parameter comprises:
determining a weight value corresponding to each current parameter value of the recorded parameters;
and carrying out weighted average on each current parameter value of the parameter to obtain the final parameter value corresponding to the parameter.
6. The method of claim 5, wherein the determining the weight value corresponding to each current recorded parameter value of the parameter comprises:
setting the weight value corresponding to each current parameter value to 1/M, wherein M is the number of current parameter values recorded for the parameter;
or,
and setting a weight value corresponding to each current parameter value according to the round in which each current parameter value was obtained, wherein the earlier the round in which a current parameter value was obtained, the smaller the weight value corresponding to that current parameter value, and the later the round in which a current parameter value was obtained, the larger the weight value corresponding to that current parameter value.
7. A service prediction method, comprising the following steps:
acquiring service data to be predicted;
inputting business data to be predicted into a neural network model; the neural network model is trained using the method of any one of claims 1 to 6;
and obtaining the recognition result of the business data to be predicted, which is output by the neural network model.
8. A model training apparatus, the apparatus comprising:
the sample acquisition module is configured to acquire training sample data according to historical service data;
a training execution module configured to execute, in each round of training: inputting training sample data into the neural network model to adjust the parameter value of each parameter in the neural network model; detecting whether the training of the current round meets the parameter obtaining condition, and if so, recording the current parameter value of each parameter in the neural network model obtained by the training of the current round;
the parameter value determining module is configured to obtain a final parameter value corresponding to each parameter of the neural network model according to at least one recorded current parameter value of the parameter after each round of training is finished;
and the model generation module is configured to set the parameter value of each parameter in the neural network model as the final parameter value corresponding to the parameter.
9. A service prediction device, comprising:
the service data acquisition module is configured to acquire service data to be predicted;
the input module is configured to input the business data to be predicted into the neural network model; the neural network model is trained using the apparatus of claim 8;
and the output module is configured to obtain the identification result of the business data to be predicted, which is output by the neural network model.
10. A computing device comprising a memory having executable code stored therein and a processor that, when executing the executable code, implements the method of any of claims 1-7.
CN202211272252.0A 2022-10-18 2022-10-18 Model Training Method and Device, Service Prediction Method and Device Active CN115526266B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211272252.0A CN115526266B (en) 2022-10-18 2022-10-18 Model Training Method and Device, Service Prediction Method and Device


Publications (2)

Publication Number Publication Date
CN115526266A (en) 2022-12-27
CN115526266B (en) 2023-08-29

Family

ID=84703949

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211272252.0A Active CN115526266B (en) 2022-10-18 2022-10-18 Model Training Method and Device, Service Prediction Method and Device

Country Status (1)

Country Link
CN (1) CN115526266B (en)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108510083A (en) * 2018-03-29 2018-09-07 国信优易数据有限公司 A kind of neural network model compression method and device
CN109118013A (en) * 2018-08-29 2019-01-01 黑龙江工业学院 A kind of management data prediction technique, readable storage medium storing program for executing and forecasting system neural network based
CN111291605A (en) * 2018-12-07 2020-06-16 财团法人交大思源基金会 People flow analysis system and people flow analysis method
US20200210840A1 (en) * 2018-12-31 2020-07-02 Microsoft Technology Licensing, Llc Adjusting precision and topology parameters for neural network training based on a performance metric
US20210241097A1 (en) * 2019-11-07 2021-08-05 Canon Kabushiki Kaisha Method and Apparatus for training an object recognition model
CN111539479A (en) * 2020-04-27 2020-08-14 北京百度网讯科技有限公司 Method and device for generating sample data
CN111582450A (en) * 2020-05-08 2020-08-25 广东电网有限责任公司 Neural network model training method based on parameter evaluation and related device
CN113762061A (en) * 2021-05-26 2021-12-07 腾讯科技(深圳)有限公司 Quantitative perception training method and device for neural network and electronic equipment
CN113888475A (en) * 2021-09-10 2022-01-04 上海商汤智能科技有限公司 Image detection method, training method of related model, related device and equipment
CN113836804A (en) * 2021-09-15 2021-12-24 向前 Animal identification model establishing method based on convolutional neural network and application system thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Liu Wanjun et al., "Image Recognition Algorithm of Convolutional Neural Network Based on Dual Optimization", Pattern Recognition and Artificial Intelligence *

Also Published As

Publication number Publication date
CN115526266B (en) 2023-08-29

Similar Documents

Publication Publication Date Title
KR102263397B1 (en) Method for acquiring sample images for inspecting label among auto-labeled images to be used for learning of neural network and sample image acquiring device using the same
CN110532932B (en) Method for identifying multi-component radar signal intra-pulse modulation mode
CN110956255B (en) Difficult sample mining method and device, electronic equipment and computer readable storage medium
CN112149524B (en) Radar signal sorting and identifying method, device, detector and storage medium
CN111291817A (en) Image recognition method and device, electronic equipment and computer readable medium
CN114821066A (en) Model training method and device, electronic equipment and computer readable storage medium
CN111027591B (en) Node fault prediction method for large-scale cluster system
CN112200772A (en) Pox check out test set
CN110135428B (en) Image segmentation processing method and device
CN111507396B (en) Method and device for relieving error classification of unknown class samples by neural network
CN113902944A (en) Model training and scene recognition method, device, equipment and medium
CN115526266A (en) Model training method and device, and business prediction method and device
CN116912483A (en) Target detection method, electronic device and storage medium
CN114120180B (en) Time sequence nomination generation method, device, equipment and medium
CN114742644A (en) Method and device for training multi-scene wind control system and predicting business object risk
CN117523218A (en) Label generation, training of image classification model and image classification method and device
CN114722942A (en) Equipment fault diagnosis method and device, electronic equipment and storage medium
CN113569957A (en) Object type identification method and device of business object and storage medium
CN115471717B (en) Semi-supervised training and classifying method device, equipment, medium and product of model
CN116415137B (en) Emotion quantification method, device, equipment and storage medium based on multi-modal characteristics
CN117197592B (en) Target detection model training method and device, electronic equipment and medium
CN116416456B (en) Self-distillation-based image classification method, system, storage medium and electronic device
CN114978616A (en) Method and device for constructing risk assessment system and method and device for risk assessment
CN115630289A (en) Target identification method and device based on evidence theory
CN114863240A (en) Automatic testing method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant