CN116975753A - Data category based prediction method, device, equipment and medium - Google Patents

Data category based prediction method, device, equipment and medium Download PDF

Info

Publication number
CN116975753A
Authority
CN
China
Prior art keywords
classification
classification model
sample data
initial
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310231183.7A
Other languages
Chinese (zh)
Inventor
高思哲
陈波
欧阳天雄
严君刚
曾庆然
赵雪尧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202310231183.7A priority Critical patent/CN116975753A/en
Publication of CN116975753A publication Critical patent/CN116975753A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02Banking, e.g. interest calculation or account maintenance

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a data category based prediction method, a device, equipment and a medium, which are used for improving the accuracy of data category prediction. The method comprises the following steps: acquiring initial sample data in a training sample set and first classification weights of the initial sample data in the t-th training, and weighting the initial sample data according to the first classification weights to obtain first weighted sample data of the t-th training; outputting a first class prediction result corresponding to the first weighted sample data through the initial classification model, and determining a first loss of the initial classification model according to the first class prediction result, the first classification weight and class label information corresponding to the initial sample data; correcting the first classification weight and the initial classification model according to the first loss to obtain a second classification weight and a candidate classification model for the t-th training, and obtaining a target classification model according to the second classification weight and the candidate classification model for the t-th training; the target classification model is used for predicting the category to which the target data belongs.

Description

Data category based prediction method, device, equipment and medium
Technical Field
The application relates to the field of artificial intelligence, in particular to a data category prediction method, a device, equipment and a medium.
Background
In some specific prediction scenarios, for example, predicting whether a bank loan will default or whether an entity issuing a bond will default on its debt, the data amount of samples of a first class (for example, defaulted samples) is often much smaller than the data amount of samples of a second class (for example, non-defaulted samples), which results in the problem of sample data imbalance. If such imbalanced sample data is used to construct a classification model for predicting data categories, the prediction accuracy of the trained classification model on the first-class data may, in practical applications, be far lower than its accuracy on the second-class data; that is, the prediction accuracy on the first-class data is too low.
Disclosure of Invention
The embodiment of the application provides a data category based prediction method, device, equipment and medium, which are used for improving the accuracy of data category prediction.
In one aspect, an embodiment of the present application provides a data class-based prediction method, including:
acquiring initial sample data in a training sample set, acquiring first classification weights of the initial sample data in the t-th training, and weighting the initial sample data according to the first classification weights to obtain first weighted sample data of the t-th training; the first classification weights are used for balancing the proportion of the initial sample data of different categories in the training sample set, and the initial sample data of different categories in the training sample set correspond to different first classification weights; t is a positive integer;
Inputting the first weighted sample data into an initial classification model, performing category prediction processing on the first weighted sample data through the initial classification model to obtain a first category prediction result corresponding to the first weighted sample data, and determining a first loss corresponding to the initial classification model according to the first category prediction result, the first classification weight and category label information corresponding to the initial sample data;
carrying out weight correction on the first classification weight according to the first loss to obtain a second classification weight for the t-th training, carrying out network parameter correction on the initial classification model according to the first loss to obtain a candidate classification model for the t-th training, and obtaining a target classification model according to the second classification weight and the candidate classification model for the t-th training; the target classification model is used for predicting the category to which the target data belong.
In one aspect, an embodiment of the present application provides a data class-based prediction apparatus, including:
the first weighting processing module is used for acquiring initial sample data in the training sample set, acquiring first classification weights of the initial sample data during the t-th training, and carrying out weighting processing on the initial sample data according to the first classification weights to obtain first weighted sample data of the t-th training; the first classification weights are used for balancing the proportion of the initial sample data of different categories in the training sample set, and the initial sample data of different categories in the training sample set correspond to different first classification weights; t is a positive integer;
The first loss determination module is used for inputting the first weighted sample data into the initial classification model, carrying out category prediction processing on the first weighted sample data through the initial classification model to obtain a first category prediction result corresponding to the first weighted sample data, and determining the first loss corresponding to the initial classification model according to the first category prediction result, the first classification weight and category label information corresponding to the initial sample data;
the first parameter correction module is used for carrying out weight correction on the first classification weight according to the first loss to obtain a second classification weight for the t training, carrying out network parameter correction on the initial classification model according to the first loss to obtain a candidate classification model for the t training, and obtaining a target classification model according to the second classification weight and the candidate classification model for the t training; the target classification model is used for predicting the category to which the target data belong.
Wherein the training sample set comprises sample data of a first category and sample data of a second category; the data class prediction device further includes:
the sample number acquisition module is used for acquiring a first sample number corresponding to sample data of a first category in the training sample set and acquiring a second sample number corresponding to sample data of a second category in the training sample set;
The classification weight determining module is used for determining initial classification weights corresponding to the first category and the second category respectively according to the first sample number and the second sample number; wherein the product between the first sample number and the initial classification weight corresponding to the first category is the same as the product between the second sample number and the initial classification weight corresponding to the second category; when t=1, the first classification weight in the t-th training is the initial classification weight corresponding to the category to which the initial sample data belongs.
Wherein the data class-based prediction apparatus further comprises:
the first class prediction module is used for acquiring a source classification model, inputting first weighted sample data into the source classification model, and performing class prediction processing on the first weighted sample data through the source classification model to obtain an initial class prediction result corresponding to the first weighted sample data;
the cross entropy loss determining module is used for determining cross entropy loss corresponding to the source classification model according to the initial category prediction result and category label information corresponding to the initial sample data;
and the second parameter correction module is used for correcting the network parameters of the source classification model according to the cross entropy loss, and determining the source classification model containing the corrected network parameters as an initial classification model.
Wherein the initial classification model comprises M classification sub-models; m is an integer greater than 1; the first loss determination module includes:
the class prediction unit is used for inputting the first weighted sample data into the initial classification model, and carrying out class prediction processing on the first weighted sample data through M classification sub-models of the initial classification model to obtain M sub-class prediction results;
and the first determining unit is used for determining the sum of the M subcategory prediction results as a first category prediction result corresponding to the first weighted sample data.
Wherein the first loss determination module includes:
the logarithmic processing unit is used for carrying out logarithmic processing on the first type of predicted result to obtain a logarithmic result corresponding to the first type of predicted result;
the second determining unit is used for determining a correction coefficient corresponding to the logarithmic result according to the first type predicting result and the type label information corresponding to the initial sample data;
and a third determining unit, configured to determine a product among the logarithmic result, the correction coefficient and the first classification weight as a first loss corresponding to the initial classification model.
Wherein the second determining unit includes:
a first determining subunit, configured to determine, as a first candidate value, a product between the first class prediction result and class label information corresponding to the initial sample data;
And the second determining subunit is used for acquiring the constant parameter and the index parameter, determining the difference value between the constant parameter and the first candidate value as a second candidate value, and performing index operation on the second candidate value and the index parameter to obtain a correction coefficient corresponding to the logarithmic result.
Wherein, the first parameter correction module includes:
the weighting processing unit is used for carrying out weighting processing on the initial sample data according to the second classification weight to obtain second weighted sample data of the t+1st training;
the loss determination unit is used for outputting a second class prediction result corresponding to the second weighted sample data through the candidate classification model trained for the t time, and determining a second loss corresponding to the candidate classification model trained for the t time according to the second class prediction result, the second class weight and the class label information;
the parameter correction unit is used for carrying out parameter correction on the candidate classification model trained for the t time according to the second loss to obtain a candidate classification model trained for the t+1th time;
and the model determining unit is used for determining the t+1st training candidate classification model as a target classification model if the t+1st training candidate classification model meets the training stopping condition.
Wherein, the data category predicting device further includes:
The second class prediction module is used for acquiring target data, inputting the target data into the target classification model, and performing class prediction processing on the target data through the target classification model to obtain a third class prediction result corresponding to the target data;
the third category prediction module is used for acquiring a business category strategy associated with the target data and obtaining a fourth category prediction result corresponding to the target data according to the hit result of the target data in the business category strategy;
and the weighted summation module is used for carrying out weighted summation processing on the third category prediction result and the fourth category prediction result and determining the category to which the target data belong.
In one aspect, the embodiment of the present application provides a computer device, including a memory and a processor, where the memory stores a computer program, and the computer program when executed by the processor causes the processor to perform the steps of the method in one aspect of the embodiment of the present application.
In one aspect, an embodiment of the present application provides a computer readable storage medium, where a computer program is stored, where the computer program includes program instructions, and when the program instructions are executed by a processor, the steps of a method in one aspect of an embodiment of the present application are performed.
According to one aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions are read from a computer-readable storage medium by a processor of a computer device, and executed by the processor, cause the computer device to perform the methods provided in the various alternatives of the above aspect.
In the embodiment of the application, after initial sample data in a training sample set is obtained and a first classification weight of the initial sample data in the t-th training is obtained, the initial sample data can be weighted according to the first classification weight to obtain first weighted sample data of the t-th training; the first classification weights are used for balancing the proportion of the initial sample data of different categories in the training sample set, and the initial sample data of different categories in the training sample set correspond to different first classification weights; t is a positive integer; further, the first weighted sample data can be input into an initial classification model, category prediction processing is carried out on the first weighted sample data through the initial classification model, a first category prediction result corresponding to the first weighted sample data is obtained, and a first loss corresponding to the initial classification model is determined according to the first category prediction result, the first classification weight and category label information corresponding to the initial sample data; finally, the first classification weight can be subjected to weight correction according to the first loss to obtain a second classification weight for the t training, the initial classification model is subjected to network parameter correction according to the first loss to obtain a candidate classification model for the t training, and the target classification model is obtained according to the second classification weight and the candidate classification model for the t training; the target classification model is used for predicting the category to which the target data belong. Therefore, as the first weighted sample data are weighted sample data obtained after weighting processing based on the first classification weight, compared with the initial sample data, the target classification model constructed based on the first weighted sample data can improve the accuracy of data category prediction when predicting the target data; further, since the first loss corresponding to the initial classification model takes the first classification weight into consideration, when the parameter correction is performed on the initial classification model based on the first loss, the model performance of the candidate classification model trained for the t-th time can be improved; in addition, the target classification model is obtained based on the corrected second classification weight and the t-th trained candidate classification model, so that the target classification model is adopted to predict the target data, and the prediction accuracy in the application scene of unbalanced sample data can be further improved.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic structural diagram of a network architecture according to an embodiment of the present application;
FIG. 2 is a schematic view of a scenario of data class prediction provided by an embodiment of the present application;
FIG. 3 is a schematic flow chart of a data class-based prediction method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of obtaining a classification model of a target according to an embodiment of the present application;
FIG. 5 is a flowchart of another data class-based prediction method according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an initial classification model according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a training process of an initial classification model according to an embodiment of the present application;
FIG. 8 is a schematic diagram of prediction distributions obtained by different data class prediction methods according to an embodiment of the present application;
FIG. 9 is a schematic diagram of predicted proportions obtained by different data class prediction methods according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of a data class-based prediction device according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Embodiments of the application relate to artificial intelligence (artificial intelligence, AI). Artificial intelligence uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that reacts in a way similar to human intelligence, so that the class to which target data belongs can be predicted and the classification prediction result is as close as possible to the result that human intelligence would give for the target data. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
The embodiment of the application mainly relates to machine learning/deep learning and related directions. A target classification model for predicting the category to which target data belongs can be obtained through learning: the target classification model is obtained by training with the weighted first weighted sample data, the second classification weight of the t-th training and the candidate classification model of the t-th training, and using the target classification model for class prediction on the target data can improve the accuracy of data category prediction.
Artificial intelligence cloud services, also commonly referred to as AIaaS (AI as a Service), are currently the mainstream service mode of artificial intelligence platforms. Specifically, an AIaaS platform splits several common AI services and provides independent or packaged services in the cloud. This service mode is similar to an AI theme mall: all developers can access one or more artificial intelligence services provided by the platform through an API interface, and some senior developers can also use the AI framework and AI infrastructure provided by the platform to deploy, operate and maintain their own dedicated cloud artificial intelligence services. In the embodiment of the application, the initial classification model for data category prediction provided by the platform can be accessed through an API interface, the first classification weight is introduced into the first loss, the initial classification model is then trained based on the first loss to obtain the target classification model, and class prediction is performed on the target data, which can improve the accuracy of data category prediction.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a network architecture according to an embodiment of the present application. As shown in fig. 1, the network architecture may include a server 101 and terminal devices such as terminal device 102a, terminal device 102b and terminal device 102c; the network architecture may include one or more servers and one or more terminal devices, and the number of servers and terminal devices is not limited here. As shown in fig. 1, the server 101 may be connected to each terminal device through a network, so that the server 101 may perform data interaction with each terminal device through the network connection.
The server 101 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud database, cloud services, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a content delivery network (content delivery network, CDN), big data and an artificial intelligence platform. The terminal cluster (including terminal device 102a, terminal device 102b, terminal device 102c, etc.) may be smart phones, tablet computers, notebook computers, desktop computers, palm computers, mobile internet devices (mobile internet device, MID), wearable devices (e.g., smart watches, smart bracelets, etc.), smart computers, smart vehicles, etc. The server 101 may establish a communication connection with each terminal device in the terminal cluster, and communication connections may also be established between the terminal devices in the terminal cluster. In other words, the server 101 may establish a communication connection with each of the terminal devices 102a, 102b, 102c, and the like; for example, a communication connection may be established between the terminal device 102a and the server 101. A communication connection may be established between the terminal device 102a and the terminal device 102b, and a communication connection may also be established between the terminal device 102a and the terminal device 102c. The manner of the communication connection is not limited: the connection may be made directly or indirectly through wired communication, or directly or indirectly through wireless communication, etc., which may be determined according to the actual application scenario and is not limited in the embodiment of the present application.
In the embodiment of the present application, both the server 101 and the terminal device (e.g., the terminal device 102a, etc.) may be used independently for performing data class prediction, and the server 101 and the terminal device (e.g., the terminal device 102a, etc.) may also be used cooperatively for performing data class prediction. For example, the terminal device 102a may send a data class prediction request to the server 101, the data class prediction request including target data, which may be understood as an object of class prediction. The server 101 may parse the data type prediction request, obtain target data from the data type prediction request, construct a target classification model based on the target data, determine a type to which the target data belongs based on the target classification model, and finally return the type to which the target data belongs to the terminal device 102a.
Specifically, referring to fig. 2, fig. 2 is a schematic view of a scenario of data category prediction according to an embodiment of the present application. As shown in FIG. 2, the data class prediction process of embodiments of the present application may be implemented based on a target classification model. The business side (e.g., terminal device 102a, etc., shown in fig. 1) first needs to provide the demand. For example, the business side may generate a data class prediction request, send the data class prediction request to the service side (e.g., the server 101 shown in fig. 1, etc.), and provide the demand to the service side through the data class prediction request; the data class prediction request may carry target data, which can be understood as the object of the data class prediction. The service side receives the data class prediction request and parses it to obtain the target data; the service side can then determine the specific prediction scenario of the classification model according to the target data. Alternatively, the data class prediction request may also carry indication information of the specific prediction scenario to which the classification model is applied. For example, the prediction scenario may be a scenario of predicting whether a bank loan will default, a scenario of predicting whether an entity issuing a bond will default on its debt, a scenario of predicting asset data of a business object, or the like.
After receiving the demand provided by the business side, the service side can preliminarily determine the preference of the classification model to be built. It will be appreciated that different requirements provided by the business side lead to different preferences for the classification model, so the requirements provided by the business side need to be taken into account when building the classification model. Taking the prediction of asset data of business objects as an example, if the business emphasis is placed on rating business objects with high asset data, a simpler classification model with better performance may be preferred when building the classification model; if policy adjustments also need to be made according to the asset data predictions for business objects of medium asset data and business objects of low asset data, in addition to business objects of high asset data, then a multi-classification model may be preferred. After the requirements on the classification model are preliminarily determined, the classification model needs to be built, as shown in fig. 2. It will be appreciated that sample preparation and feature screening are performed first, before the classification model is built. Taking the scenario of predicting whether a bank loan will default as an example, the computer device selects asset data corresponding to a plurality of business objects (objects applying for a loan from the bank) from a business database as sample data, assigns corresponding category label information to the asset data, and then adds the sample data containing the category label information to a training sample set, thereby completing sample preparation. Optionally, the sample data may also be preprocessed before adding the sample data containing the category label information to the training sample set, for example, sample data containing abnormal data may be removed; or, when missing values occur in the sample data, the missing values of the sample data may be filled, and the preprocessed sample data containing the label information is then added to the training sample set, thereby completing feature screening. Furthermore, a first classification weight can be introduced into the training sample set to balance the proportion of initial sample data of different categories in the training sample set, so that the finally obtained target classification model can balance the learning of each category and avoid biasing the model output toward the category with the larger sample proportion. Then, the initial sample data is weighted according to the first classification weight to obtain the first weighted sample data of the t-th training; the initial classification model is trained and its parameters adjusted with the first weighted sample data, the model effect is analyzed, and the network parameters of the built initial classification model are further corrected according to actual business requirements.
Further, a first loss corresponding to the initial classification model can be determined based on the first classification weight and the focal loss; the first classification weight is corrected according to the first loss to obtain a second classification weight, and the initial classification model is optimized according to the first loss to obtain a candidate classification model. In addition, when an update of the business data is detected, the initial classification model can be further optimized with the updated business data as its sample data, so that the candidate classification model has higher accuracy. Finally, a target classification model is obtained according to the second classification weight and the candidate classification model. After the target classification model is obtained, class prediction can be performed on the target data based on the target classification model and the business class strategy associated with the target data; through the joint decision of the target classification model and the business class strategy, the prediction accuracy in application scenarios with imbalanced sample data can be improved.
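The joint decision between the model output (third class prediction result) and the business-rule hit result (fourth class prediction result) can be illustrated with a minimal sketch; the fusion weights, the threshold and the rule representation below are assumptions made for illustration only, not values specified by the application.

```python
# Hypothetical sketch of the joint decision described above: fuse the target
# classification model's predicted probability with a business-rule hit score.
# The weights, threshold and rule field are illustrative assumptions.

def policy_hit_score(target_data: dict, policy_rules: list) -> float:
    """Fraction of business category rules hit by the target data."""
    if not policy_rules:
        return 0.0
    hits = sum(1 for rule in policy_rules if rule(target_data))
    return hits / len(policy_rules)

def joint_decision(model_prob: float, rule_score: float,
                   model_weight: float = 0.7, policy_weight: float = 0.3,
                   threshold: float = 0.5) -> int:
    """Weighted sum of the two prediction results, then thresholded."""
    fused = model_weight * model_prob + policy_weight * rule_score
    return 1 if fused >= threshold else 0  # 1 = first category (e.g. default)

# Example: one rule flagging objects whose debt ratio exceeds 0.8 (assumed field).
rules = [lambda d: d.get("debt_ratio", 0.0) > 0.8]
print(joint_decision(0.62, policy_hit_score({"debt_ratio": 0.9}, rules)))
```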
Further, referring to fig. 3, fig. 3 is a flow chart of a data category prediction method according to an embodiment of the present application. It will be appreciated that the data class based prediction method is performed by a computer device, which may be a terminal device (e.g., terminal device 102a, terminal device 102b, or terminal device 102c in the corresponding embodiment of fig. 1), or a server (e.g., server 101 in the corresponding embodiment of fig. 1). As shown in fig. 3, the data category based prediction method may include the following steps S101 to S103:
Step S101: the method comprises the steps of obtaining initial sample data in a training sample set, obtaining first classification weights of the initial sample data in the t-th training, and carrying out weighting processing on the initial sample data according to the first classification weights to obtain first weighted sample data of the t-th training.
Wherein, the training sample set can be understood as a sample data set used for constructing the target classification model; the target classification model may be used to make data category predictions; the initial sample data is sample data of any category in the training sample set; the training sample set may include sample data of a plurality of different categories. Taking the scenario of predicting whether a bank loan will default as an example, the computer device may select, from the business database, asset data corresponding to a plurality of business objects (objects applying for a loan from the bank) as sample data, assign corresponding category label information to the asset data, and then add the sample data containing the category label information to the training sample set. Optionally, the sample data may also be preprocessed before adding the sample data containing the category label information to the training sample set, for example, sample data containing abnormal data may be removed; optionally, when removing abnormal data, random removal according to the distribution of the sample data can be considered, so that the original distribution of the sample data is not changed and the reliability of the sample data is improved; or, when missing values occur in the sample data, the missing values may be filled and the preprocessed sample data containing the label information is then added to the training sample set, thereby improving the accuracy of the model's data category prediction. Further, the computer device may, in combination with the business experience of bank lending, divide the sample data in the training sample set into different categories according to information such as the average amount of money, for example, into sample data of a first category (defaulted samples) and sample data of a second category (non-defaulted samples).
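The preprocessing described above (removing samples that contain abnormal data while roughly preserving the original distribution, and filling missing values) might look like the following sketch; the random-drop strategy and the median fill are assumptions, not choices prescribed by the application.

```python
# Illustrative preprocessing sketch: randomly remove anomalous rows so that the
# original distribution of the sample data is roughly preserved, then fill
# missing values. The median-fill choice is an assumption.
import numpy as np
import pandas as pd

def preprocess_samples(df: pd.DataFrame, anomaly_mask: pd.Series,
                       drop_frac: float = 1.0, seed: int = 0) -> pd.DataFrame:
    rng = np.random.default_rng(seed)
    anomalies = df.index[anomaly_mask].to_numpy()
    n_drop = int(len(anomalies) * drop_frac)
    to_drop = rng.choice(anomalies, size=n_drop, replace=False)
    cleaned = df.drop(index=to_drop)
    # Fill missing numeric values with the column median.
    return cleaned.fillna(cleaned.median(numeric_only=True))
```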
The training sample set may be pre-stored in a local business database or cloud database from which the computer device may obtain the training sample set. For example, the computer device may obtain a set of training samples from a local business database at intervals. The local service database may be configured with at least one type of interface, such as an Oracle (a relational database management system) interface, an SQL Server (a relational database management system) interface, and the like, and the training sample set is obtained from the local service database by calling the corresponding interface through a programming language adopted by the computer device, where the local service database may be updated based on service data published in real time.
It will be appreciated that in some specific prediction scenarios, the amount of sample data in one category is often much smaller than that in other categories, resulting in the problem of sample data imbalance. For example, when the prediction scenario is a scenario of predicting whether a bank loan will default, or whether an entity issuing a bond will default on its debt, etc., the amount of defaulted sample data is often much smaller than that of non-defaulted sample data; for example, the non-defaulted sample data may account for 95% of the training sample set while the defaulted sample data accounts for only 5%, so that the training sample set suffers from sample data imbalance. If such imbalanced sample data is used to construct a classification model for predicting data categories, the prediction accuracy of the trained classification model on the first-class data may, in practical applications, be far lower than its accuracy on the second-class data. That is, using a classification model trained on imbalanced sample data for data category prediction may cause the prediction accuracy on the first-class data to be too low.
In order to solve the problem of sample data imbalance and improve the prediction accuracy on the first-class data, a first classification weight can be introduced to balance the proportion of initial sample data of different categories in the training sample set, so that the finally obtained target classification model can balance the learning of each category and avoid biasing the model output toward the category with the larger sample proportion. Specifically, a smaller first classification weight is assigned to the initial sample data of the category that accounts for a larger proportion of the training sample set, and a larger first classification weight is assigned to the initial sample data of the category that accounts for a smaller proportion, so that the proportions of the weighted first weighted sample data of different categories in the training sample set tend to be consistent, thereby alleviating the sample data imbalance.
In the embodiment of the application, the first classification weight can be used for balancing the proportion of the initial sample data of different categories in the training sample set; in the training sample set, initial sample data of different categories correspond to different first classification weights; accordingly, initial sample data of the same category correspond to the same first classification weight. Illustratively, the training sample set includes sample data of a first category and sample data of a second category; the first category and the second category are different categories. Taking the case where the initial sample data comprises sample data a and sample data b as an example: when the category to which sample data a belongs is the first category and the category to which sample data b belongs is the second category, sample data a and sample data b correspond to different first classification weights; when sample data a and sample data b both belong to the same first category, they correspond to the same first classification weight. Wherein t is a positive integer, and the value of t can be 1, 2, 3, …; the specific value can be set according to the practical situation, which is not limited in the embodiment of the application. The first weighted sample data of the t-th training can be understood as the weighted sample data obtained by weighting the initial sample data according to the first classification weight of the t-th training. It can be understood that, because the first weighted sample data is the weighted sample data obtained after weighting based on the first classification weight, the problem of imbalanced sample data proportion in the first weighted sample data is alleviated compared with the initial sample data, so that the accuracy of data category prediction can be improved when the target data is predicted based on the target classification model constructed from the first weighted sample data.
The illustration continues with the training sample set comprising a first class of sample data and a second class of sample data. In one possible implementation, the computer device may obtain a first number of samples corresponding to a first class of sample data in the training sample set and obtain a second number of samples corresponding to a second class of sample data in the training sample set; and further, according to the first sample number and the second sample number, initial classification weights corresponding to the first category and the second category respectively can be determined.
In the embodiment of the application, the initial classification weight can also be used for balancing the proportion of the initial sample data of different categories in the training sample set. It can be appreciated that in the training sample set, initial sample data of different categories correspond to different initial classification weights; accordingly, initial sample data of the same category correspond to the same initial classification weight. Further, the product between the first sample number and the initial classification weight corresponding to the first category is the same as the product between the second sample number and the initial classification weight corresponding to the second category. For example, in the training sample set, the first sample number corresponding to the sample data of the first category is 100, and the second sample number corresponding to the sample data of the second category is 900; the initial classification weight of the first category may then be 9 and the initial classification weight of the second category may be 1, in which case the product between the first sample number and the initial classification weight corresponding to the first category and the product between the second sample number and the initial classification weight corresponding to the second category are the same, both being 900; alternatively, the initial classification weight of the first category may be 18 and the initial classification weight of the second category may be 2, in which case both products are 1800; of course, the specific values of the initial classification weight of the first category and the initial classification weight of the second category may also be other values, which are not limited in the embodiment of the present application.
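A minimal sketch of the balancing rule described above, where the weights are chosen so that (first sample number) × (first weight) equals (second sample number) × (second weight); normalizing the majority-class weight to 1 is an assumption, since the application allows any pair of values with equal products.

```python
# Sketch: pick initial classification weights with equal products, e.g.
# n_first = 100, n_second = 900 gives weights (9.0, 1.0), so 100*9 == 900*1.
def initial_class_weights(n_first: int, n_second: int) -> tuple[float, float]:
    w_second = 1.0                    # majority (second) category, assumed baseline
    w_first = n_second / n_first      # minority (first) category
    return w_first, w_second

assert initial_class_weights(100, 900) == (9.0, 1.0)
```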
It can be understood that, in the embodiment of the present application, when t=1, the first classification weight at the time of the t-th training is the initial classification weight corresponding to the category to which the initial sample data belongs. That is, the first classification weight at the time of the 1 st training is the initial classification weight corresponding to the category to which the initial sample data belongs. For example, the category to which the initial sample data belongs is a first category, and the first classification weight in the 1 st training is an initial classification weight corresponding to the first category. When t >1, the first classification weight at the t-th training may be understood as the second classification weight of the t-1 th training; the second classification weight of the t-1 training may be understood as a classification weight obtained by correcting the first classification weight of the t-1 training, and a specific weight correction manner of the classification weight may refer to the following description, which is not repeated herein.
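The bookkeeping across training rounds, in which the second classification weight produced in round t becomes the first classification weight of round t+1, can be sketched as follows; the weight-correction and model-correction rules are only named here as placeholder callables, since their concrete form follows from the first loss described later.

```python
# Schematic training loop (assumptions: numpy arrays, placeholder correction rules).
import numpy as np

def train_rounds(model, samples: np.ndarray, labels: np.ndarray,
                 init_weights: np.ndarray, num_rounds: int,
                 first_loss_fn, correct_weights, correct_model):
    weights = init_weights                     # first classification weight, t = 1
    for t in range(1, num_rounds + 1):
        weighted = samples * weights[:, None]  # first weighted sample data of round t
        preds = model.predict(weighted)        # first class prediction result
        loss = first_loss_fn(preds, labels, weights)
        weights = correct_weights(weights, loss)   # second classification weight of round t
        model = correct_model(model, loss)         # candidate classification model of round t
    return model                                   # target classification model
```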
Step S102: and inputting the first weighted sample data into an initial classification model, performing category prediction processing on the first weighted sample data through the initial classification model to obtain a first category prediction result corresponding to the first weighted sample data, and determining a first loss corresponding to the initial classification model according to the first category prediction result, the first classification weight and category label information corresponding to the initial sample data.
The initial classification model can be understood as an intermediate model of the target classification model at a stage where training has not yet been completed; the first weighted sample data is the input of the initial classification model, and the first class prediction result is the output of the initial classification model. The class label information corresponding to the initial sample data represents the true value of the initial sample data and can be used, together with the first class prediction result, to characterize the degree of difference between the prediction and the true value. In the embodiment of the application, the initial classification model may be a binary classification model or a multi-classification model, and may specifically be a decision tree model, for example, one or more of a gradient boosting decision tree model (gradient boosting decision tree, GBDT) and an extreme gradient boosting tree model (extreme gradient boosting, XGBoost), or a classification model constructed by a support vector machine (support vector machine, SVM), or the like.
In one possible implementation, the initial classification model may be obtained by: after the computer equipment acquires the source classification model, inputting the first weighted sample data into the source classification model, and performing category prediction processing on the first weighted sample data through the source classification model to obtain an initial category prediction result corresponding to the first weighted sample data; furthermore, the cross entropy loss corresponding to the source classification model can be determined according to the initial category prediction result and the category label information corresponding to the initial sample data; and finally, carrying out network parameter correction on the source classification model according to the cross entropy loss, and determining the source classification model containing the corrected network parameters as an initial classification model.
Similarly, the source classification model may be understood as the model corresponding to the initial classification model at a stage where training has not yet been completed; the first weighted sample data is the input of the source classification model, and the initial class prediction result is the output of the source classification model. In the embodiment of the application, similar to the initial classification model, the source classification model may be a binary classification model or a multi-classification model, and may specifically be a decision tree model, for example, a gradient boosting decision tree model (gradient boosting decision tree, GBDT) or an extreme gradient boosting tree model (extreme gradient boosting, XGBoost), or may be a classification model constructed by one or more support vector machine (support vector machine, SVM) classification models.
The cross entropy loss can be used for evaluating the difference degree of the initial category prediction result of the source classification model and the category label information corresponding to the initial sample data; it can be understood that the smaller the cross entropy loss is, the smaller the degree of difference between the initial category prediction result and the category label information corresponding to the initial sample data is, the higher the prediction accuracy of the source classification model is; therefore, the computer device may correct network parameters (e.g., parameters such as learning rate, regularization coefficient, etc.) of the source classification model by minimizing cross entropy loss to target that the initial class prediction result is closer to the class label information corresponding to the initial sample data, so as to optimize the model parameters of the source classification model. In the embodiment of the application, the following relationship can be satisfied between the cross entropy loss corresponding to the source classification model, the initial category prediction result, the category label information corresponding to the initial sample data and other parameters:
L_s = -Σ_{j=1}^{T} y_j · log( e^{a_j} / Σ_k e^{a_k} )    (1)
wherein L_s represents the cross entropy loss; T represents the number of samples contained in the first weighted sample data; y_j represents the class label information corresponding to the j-th sample data in the first weighted sample data; a_k represents the k-th value of the fully connected layer; e^{a_j} / Σ_k e^{a_k} represents the initial category prediction result corresponding to the j-th sample data in the first weighted sample data, i.e. the softmax function, whose value range is [0, 1]; it can also be understood as the j-th value of the output vector prob[T, 1].
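A small numerical sketch of formula (1), assuming a fully connected layer output of shape [T, K] and one-hot class labels; the variable names are illustrative.

```python
import numpy as np

def cross_entropy_loss(a: np.ndarray, y: np.ndarray) -> float:
    """Formula (1): L_s = -sum_j y_j * log(softmax(a)_j), summed over the T samples.
    a: [T, K] fully connected layer values; y: [T, K] one-hot class label information."""
    a_shift = a - a.max(axis=1, keepdims=True)                       # numerical stability
    prob = np.exp(a_shift) / np.exp(a_shift).sum(axis=1, keepdims=True)
    return float(-(y * np.log(prob + 1e-12)).sum())

# Example: two samples, two classes.
print(cross_entropy_loss(np.array([[2.0, 0.5], [0.1, 1.5]]),
                         np.array([[1.0, 0.0], [0.0, 1.0]])))
```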
After determining the cross entropy loss of the source classification model, the computer device may take minimizing the cross entropy loss as the goal, so that the initial class prediction result becomes closer to the class label information corresponding to the initial sample data, thereby correcting the network parameters (e.g., parameters such as the learning rate and regularization coefficient) of the source classification model to optimize its model parameters until the source classification model meets the training stop condition; the source classification model meeting the training stop condition is determined as the initial classification model. The training stop condition may specifically be judging whether the value of the cross entropy loss is smaller than a first loss threshold, or judging whether the number of training iterations of the source classification model reaches an iteration threshold. For example, if the value of the cross entropy loss is greater than or equal to the first loss threshold, the source classification model may be considered not to have converged; in this case the network parameters of the source classification model need to be corrected and the iterative training of the source classification model is continued until the value of the cross entropy loss is smaller than the first loss threshold, at which point the source classification model is determined to have converged and the source classification model containing the corrected network parameters is determined as the initial classification model. Alternatively, if the number of training iterations of the source classification model is smaller than the iteration threshold, the source classification model may be considered not to have converged; in this case the network parameters of the source classification model need to be corrected and the iterative training is continued until the number of training iterations is greater than or equal to the iteration threshold, at which point the source classification model is determined to have converged and the source classification model containing the corrected network parameters is determined as the initial classification model. The initial classification model obtained in this way has higher accuracy. The first loss threshold and the iteration threshold are preset parameters whose specific values can be determined according to the actual application scenario, which is not limited in the embodiment of the present application.
In the embodiment of the present application, the initial classification model obtained by performing this preliminary parameter correction on the source classification model may, in practical applications, still show a relatively large difference between the predicted value and the actual value for categories with a relatively small amount of sample data, for example, the initial sample data of the first category mentioned above. That is, compared with a classification model constructed directly from the initial sample data, the accuracy of the initial classification model constructed from the first weighted sample data weighted by the first classification weight is improved, but its predictions still differ considerably from the true values. Therefore, it is necessary to further improve the prediction accuracy of the classification model in application scenarios with imbalanced sample data.
In the embodiment of the application, the first loss can be used to evaluate the degree of difference between the first class prediction result of the initial classification model and the class label information corresponding to the initial sample data; it can be understood that the smaller the first loss, the smaller the degree of difference between the first class prediction result and the class label information corresponding to the initial sample data, and the higher the prediction accuracy of the initial classification model. Accordingly, the computer device may take minimizing the first loss as the goal, so that the first class prediction result becomes closer to the class label information corresponding to the initial sample data, thereby correcting the network parameters (e.g., parameters such as the learning rate and regularization coefficient) of the initial classification model to optimize its model parameters. In the embodiment of the application, the first loss is associated with the first class prediction result, the class label information corresponding to the initial sample data and the first classification weight, so that when parameter correction is performed on the initial classification model based on the first loss, the prediction accuracy of the candidate classification model of the t-th training can be improved.
In one possible implementation manner, the computer device may perform a logarithmic processing on the first type of prediction result to obtain a logarithmic result corresponding to the first type of prediction result; determining a correction coefficient corresponding to the logarithmic result according to the first type prediction result and type label information corresponding to the initial sample data; and finally, determining the product among the logarithmic result, the correction coefficient and the first classification weight as a first loss corresponding to the initial classification model. In the embodiment of the application, the following relationship can be satisfied between the first loss corresponding to the initial classification model and parameters such as the first class prediction result, the first classification weight, class label information corresponding to the initial sample data, and the like:
L_1 = -α_1 · C · log(y_pred1)    (2)
wherein L_1 represents the first loss; α_1 represents the first classification weight; y_pred1 represents the first class prediction result; and C represents the correction coefficient associated with the first class prediction result and the class label information corresponding to the initial sample data.
As can be seen from the above formula (2), compared with an ordinary cross entropy loss, the first loss mainly differs in the effect of the first classification weight α_1 and the correction coefficient. The first classification weight α_1 can be used to adjust the imbalance in the proportion of sample data; the correction coefficient can be used to adjust the imbalance in the loss distribution between simple samples (samples with a small error between the predicted value and the true-value label) and complex samples (samples with a large error between the predicted value and the true-value label), so that simple samples and complex samples are pulled apart in their effect on the loss function. Therefore, the first loss can well address the problem of sample imbalance, so that the finally obtained target classification model has higher accuracy when predicting complex samples and in application scenarios with unbalanced sample data.
Further, the computer device may obtain the correction coefficient by performing the following steps: determining the product between the first class prediction result and the class label information corresponding to the initial sample data as a first candidate value; acquiring a constant parameter and an index parameter, determining the difference between the constant parameter and the first candidate value as a second candidate value, and raising the second candidate value to the power of the index parameter to obtain the correction coefficient corresponding to the logarithmic result. Therefore, in the embodiment of the present application, the first loss corresponding to the initial classification model and parameters such as the first class prediction result, the first classification weight and the class label information corresponding to the initial sample data may further satisfy the following relationship:
L_1 = -α_1 · (1 - y_true · y_pred1)^γ · log(y_pred1)    (3)
where L_1 represents the first loss; α_1 represents the first classification weight; 1 represents the constant parameter; y_true represents the class label information corresponding to the initial sample data; y_pred1 represents the first class prediction result; γ represents the index parameter, whose specific value can be set according to the practical application; y_true · y_pred1 represents the first candidate value; 1 - y_true · y_pred1 represents the second candidate value; (1 - y_true · y_pred1)^γ represents the correction coefficient; log(y_pred1) represents the logarithmic result corresponding to the first class prediction result; and (1 - y_true · y_pred1)^γ · log(y_pred1) corresponds to the focal loss (Focal loss) term.
As can be seen from the above formula (3), compared with an ordinary cross entropy loss, the first loss mainly differs in the first classification weight α_1 and the correction coefficient. The first classification weight α_1 can be used to adjust the imbalance in the proportion of sample data; the correction coefficient can be used to adjust the imbalance in the loss distribution between simple samples (samples with a small error between the predicted value and the true-value label) and complex samples (samples with a large error between the predicted value and the true-value label), so that simple samples and complex samples are pulled apart in their effect on the loss function, and the correction coefficient performs this adjustment mainly through the index parameter γ. Take first class prediction results of 0.9 and 0.6 for two pieces of first weighted sample data as an example, and assume the corresponding class label information y_true is 1. When the index parameter γ takes the value 2, for the first weighted sample data whose first class prediction result is 0.9, the coefficient (the third candidate value) in front of the logarithmic result log(y_pred1) is (1 - 0.9)^2 = 0.01; for the first weighted sample data whose first class prediction result is 0.6, the coefficient (the third candidate value) in front of the logarithmic result log(y_pred1) is (1 - 0.6)^2 = 0.16. Therefore, when gradient descent is used to fit the residual, the first weighted sample data whose first class prediction result is 0.9 is easy to predict, i.e., it is a simple sample relative to the first weighted sample data whose first class prediction result is 0.6. The contribution of simple samples to the first loss is thereby suppressed, so the initial classification model biases its training toward the complex samples, which optimizes the model training direction and improves model performance.
It can be seen that the first loss constructed in the embodiment of the present application can relieve, through the first classification weight α_1, the problem of an unbalanced proportion of sample data within sample imbalance, and can relieve, through the index parameter γ, the problem of an unbalanced loss distribution between simple samples and complex samples within sample imbalance. Therefore, the first loss can well address the problem of sample imbalance, so that the finally obtained target classification model has higher accuracy when predicting complex samples and in application scenarios with unbalanced sample data.
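For illustration only, the weighted focal-style first loss of formula (3) can be sketched as below. This is a minimal NumPy sketch under the assumption of binary labels y_true in {0, 1} and probability-like predictions; the function name and default parameter values are hypothetical rather than taken from the embodiment.

import numpy as np

def first_loss(y_true, y_pred, alpha1, gamma=2.0, eps=1e-12):
    # formula (3): L1 = -alpha1 * (1 - y_true * y_pred)^gamma * log(y_pred)
    y_pred = np.clip(y_pred, eps, 1.0 - eps)         # avoid log(0)
    correction = (1.0 - y_true * y_pred) ** gamma    # correction coefficient
    return -alpha1 * correction * np.log(y_pred)

Reproducing the worked example above (y_true = 1, γ = 2): the coefficient before log(y_pred1) is 0.01 for a prediction of 0.9 and 0.16 for a prediction of 0.6, so the simple sample contributes far less to the loss than the complex sample.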
Step S103: carrying out weight correction on the first classification weight according to the first loss to obtain a second classification weight for the t-th training, carrying out network parameter correction on the initial classification model according to the first loss to obtain a candidate classification model for the t-th training, and obtaining a target classification model according to the second classification weight and the candidate classification model for the t-th training; the target classification model is used for predicting the category to which the target data belong.
Referring to fig. 4, fig. 4 is a schematic diagram of an object classification model according to an embodiment of the application. As shown in fig. 4, sample preparation and feature screening are performed before the object classification model is constructed. Taking a scenario of predicting whether a bank loan will default as an example, the computer device selects asset data corresponding to a plurality of business objects (objects applying for the bank loan) from a business database as sample data, assigns corresponding category label information to the asset data, and then adds the sample data containing the category label information to the training sample set, thereby completing sample preparation. The sample data may also be preprocessed before being added to the training sample set: for example, sample data containing outlier data may be culled, or, when missing values occur in the sample data, the missing values may be filled; the preprocessed sample data containing the label information is then added to the training sample set, thereby completing feature screening. Furthermore, a first classification weight can be introduced to balance the proportion of initial sample data of different categories in the training sample set, so that the finally obtained target classification model can balance the learning of each category, preventing the model output from being biased toward the category with a larger sample share. Then, the initial sample data is weighted according to the first classification weight to obtain the first weighted sample data of the t-th training, and the initial classification model is trained and its parameters tuned with the first weighted sample data.
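As an illustrative sketch only, the sample preparation steps above — filling missing values, culling outliers, and deriving inverse-frequency first classification weights so that sample count multiplied by weight is equal across categories — could look like the following with pandas. The column names, the percentile rule and the median-fill choice are assumptions, not the embodiment's exact preprocessing.

import pandas as pd

def prepare_samples(df, feature_cols, label_col="label"):
    df = df.copy()
    # Fill missing feature values with the column median (one possible preprocessing choice).
    df[feature_cols] = df[feature_cols].fillna(df[feature_cols].median())
    # Cull rows containing outlier data (hypothetical 1st/99th percentile rule).
    for col in feature_cols:
        lo, hi = df[col].quantile(0.01), df[col].quantile(0.99)
        df = df[df[col].between(lo, hi)]
    # Inverse-frequency first classification weights: sample count x weight is equal across classes.
    counts = df[label_col].value_counts()
    class_weight = {c: len(df) / (len(counts) * n) for c, n in counts.items()}
    sample_weight = df[label_col].map(class_weight)
    return df, class_weight, sample_weight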
Further, a first loss corresponding to the initial classification model can be determined based on the first classification weight and the focal loss; the first classification weight is weight-corrected according to the first loss to obtain the second classification weight for the t-th training, and the network parameters of the initial classification model are corrected according to the first loss to obtain the candidate classification model of the t-th training. When the network parameters of the initial classification model are corrected based on the first loss, Bayesian optimization can be used for the parameter correction, with the f1 score of the class accounting for the smallest proportion of the first class prediction results used as the parameter-search index to prevent the model from overfitting, so as to obtain the candidate classification model of the t-th training. When the weight correction is performed on the first classification weight, the class whose real result is least consistent with the predicted result can be corrected preferentially, so that the second classification weight corresponding to that class is larger than its first classification weight; the correction threshold step size is set over several orders of magnitude (for example 0.1, 0.01, 0.001, and the like), the optimal correction threshold step size is found, and automatic update iterations are performed, thereby completing the weight correction of the first classification weight until the difference between the predicted result and the real result falls within a preset range. Finally, the target classification model is obtained according to the second classification weight and the candidate classification model of the t-th training.
Specifically, the computer device may take minimizing the first loss as the objective and correct the network parameters of the initial classification model (e.g., the learning rate, regularization coefficient and other parameters), for example by reducing the learning rate of the initial classification model, so that the first class prediction result moves closer to the class label information corresponding to the initial sample data, thereby obtaining the candidate classification model of the t-th training, and finally obtain the target classification model according to the second classification weight and the candidate classification model of the t-th training. In the embodiment of the application, when the network parameters of the initial classification model are corrected based on the first loss, Bayesian optimization can be used for the parameter correction; specifically, the f1 score corresponding to the class with the smallest proportion among the first class prediction results is used as the parameter-search index, so as to obtain the candidate classification model of the t-th training. The f1 score is a metric for classification problems that considers both the precision and the recall of the classification model; it can be regarded as the harmonic mean of the model's precision and recall, with a maximum of 1 and a minimum of 0: f1 score = 2 × (precision × recall) / (precision + recall), where precision = number of correctly predicted samples / number of samples assigned to that class in the first class prediction result, and recall = number of correctly predicted samples / number of samples of that class in the real labels. It can be seen that the model parameters of the candidate classification model of the t-th training can be determined rapidly by correcting the parameters through Bayesian optimization, which is beneficial for improving model training efficiency.
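As a minimal sketch, not the embodiment's actual search code, the parameter-search index described above — the f1 score of the minority class — can be computed with scikit-learn and plugged into a hyperparameter search routine. The candidate hyperparameters and the simple random-search loop below are assumptions standing in for the Bayesian optimization step.

import random
from sklearn.metrics import f1_score

def minority_f1(y_true, y_pred, minority_class):
    # f1 score of the class with the smallest share, used as the parameter-search index
    return f1_score(y_true, y_pred, labels=[minority_class], average="macro")

def search_params(train_fn, X, y, minority_class, n_trials=30):
    best_score, best_params = -1.0, None
    for _ in range(n_trials):
        params = {"learning_rate": random.choice([0.3, 0.1, 0.05]),
                  "reg_lambda": random.choice([0.0, 1.0, 5.0])}
        model = train_fn(X, y, **params)          # trains a candidate classification model
        score = minority_f1(y, model.predict(X), minority_class)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score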
The second classification weight is obtained by correcting the first classification weight; the second classification weight of the t-th training may be understood as the first classification weight used in the (t+1)-th training. In the embodiment of the application, the computer device may take minimizing the first loss as the objective so that the first class prediction result moves closer to the class label information corresponding to the initial sample data, and on this basis perform weight correction on the first classification weight to obtain the second classification weight of the t-th training, thereby improving the prediction accuracy of the model. Specifically, the computer device may select for weight correction the first classification weight corresponding to the class with the lowest f1 score, i.e., the class with the largest difference between the first class prediction result and the class label information corresponding to the initial sample data, so that the second classification weight corresponding to that class is larger than its first classification weight; the correction threshold step size is set over several orders of magnitude (for example 0.1, 0.01, 0.001, etc.), the optimal correction threshold step size is found, and automatic update iterations are performed, thereby completing the weight correction of the first classification weight until the difference between the first class prediction result and the class label information corresponding to the initial sample data falls within a preset range. Meanwhile, each time the first classification weight is updated to obtain the second classification weight, the training and parameter search of the candidate classification model can be carried out again for the current second classification weight, so that the finally obtained target classification model meets higher accuracy requirements.
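For illustration only, a minimal sketch of this weight-correction rule — raise the weight of the worst-predicted class, trying step sizes over several orders of magnitude — is given below. The per-class f1 values are assumed to come from the search step above, and the step-size list and evaluation callback are hypothetical.

def correct_classification_weight(class_weight, per_class_f1, evaluate_fn,
                                  step_sizes=(0.1, 0.01, 0.001)):
    # Pick the class whose predictions deviate most from the labels (lowest f1 score).
    worst_class = min(per_class_f1, key=per_class_f1.get)
    best_weight, best_score = class_weight, -1.0
    for step in step_sizes:                       # try several orders of magnitude
        trial = dict(class_weight)
        trial[worst_class] = class_weight[worst_class] + step   # second weight > first weight
        score = evaluate_fn(trial)                # e.g. retrain and return the minority-class f1
        if score > best_score:
            best_weight, best_score = trial, score
    return best_weight                            # second classification weight of the t-th training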
Further, the computer device may perform a weighting process on the initial sample data according to the second classification weight to obtain second weighted sample data of the t+1st training; outputting a second class prediction result corresponding to the second weighted sample data through the candidate classification model trained for the t time, and determining a second loss corresponding to the candidate classification model trained for the t time according to the second class prediction result, the second class weight and the class label information; network parameter correction is carried out on the candidate classification model trained for the t time according to the second loss, and the candidate classification model trained for the t+1th time is obtained; and if the t+1st training candidate classification model meets the training stopping condition, determining the t+1st training candidate classification model as a target classification model.
In the embodiment of the application, the second weighted sample data is sample data obtained by weighting the first weighted sample data based on the second classification weight; the second class of predictors are predictors of the t-th trained candidate classification model for the second weighted sample data. Similar to the first penalty, the second penalty may be used to evaluate a degree of difference in class label information corresponding to the second class prediction result of the t-th trained candidate classification model and the initial sample data; it can be understood that the smaller the second loss is, the smaller the degree of difference between the second class prediction result and the class label information corresponding to the initial sample data is, the higher the prediction accuracy of the candidate classification model trained for the t-th time is; therefore, the computer device may correct the network parameters (e.g., parameters such as learning rate, regularization coefficient, etc.) of the candidate classification model trained for the t-th time by minimizing the second loss to target that the second class prediction result is closer to the class label information corresponding to the initial sample data, so as to optimize the model parameters of the candidate classification model trained for the t-th time. In the embodiment of the application, the following relationship can be satisfied between the second loss corresponding to the candidate classification model trained for the t time and parameters such as the second class prediction result, the second class weight, class label information corresponding to the initial sample data, and the like:
L_2 = -α_2 · (1 - y_true · y_pred2)^γ · log(y_pred2)    (4)
where L_2 represents the second loss; α_2 represents the second classification weight; 1 represents the constant parameter; y_true represents the class label information corresponding to the initial sample data; y_pred2 represents the second class prediction result; and γ represents the index parameter, whose specific value can be set according to the practical application.
In the embodiment of the present application, when the network parameters of the candidate classification model of the t-th training are corrected according to the second loss to obtain the candidate classification model of the (t+1)-th training, Bayesian optimization may likewise be adopted for the parameter correction; the specific parameter-search manner may refer to the foregoing description and is not repeated here. The training stop condition may specifically be judging whether the value of the second loss is smaller than a second loss threshold, or whether the training count of the candidate classification model of the (t+1)-th training reaches a training count threshold. For example, if the value of the second loss is greater than or equal to the second loss threshold, the candidate classification model of the (t+1)-th training may be considered not to have converged; network parameter correction is then performed on it and iterative training is continued until the value of the second loss is smaller than the second loss threshold, at which point the candidate classification model of the (t+1)-th training is determined to have converged and is determined as the target classification model. Alternatively, if the training count of the candidate classification model of the (t+1)-th training is smaller than the training count threshold, it may be considered not to have converged; network parameter correction is performed and iterative training is continued until the training count is greater than or equal to the training count threshold, at which point the candidate classification model of the (t+1)-th training is determined to have converged and is determined as the target classification model. The second loss threshold and the training count threshold are preset parameters whose specific values can be determined according to the actual application scenario; the embodiment of the application does not limit them. It can be understood that the target classification model is obtained based on the second classification weight and the candidate classification model of the t-th training, so that predicting the target data with the target classification model can further improve the prediction accuracy in application scenarios with unbalanced sample data.
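Purely as an illustration of the alternation just described (weight the samples, fit the model, score it, then raise the weight of the worst class and repeat until the stop condition holds), a schematic outer loop might be written as below. The helper function correct_classification_weight is the hypothetical sketch introduced earlier, and the stop thresholds and callbacks are placeholders.

def train_target_classification_model(X, y, initial_weight, train_fn, score_fn,
                                       max_rounds=10, f1_target=0.8):
    class_weight = dict(initial_weight)               # first classification weight (t = 1)
    model = None
    for t in range(1, max_rounds + 1):
        sample_weight = [class_weight[c] for c in y]  # weighted sample data of round t
        model = train_fn(X, y, sample_weight)         # candidate classification model of round t
        per_class_f1 = score_fn(model, X, y)          # dict: class -> f1 score
        if min(per_class_f1.values()) >= f1_target:   # training stop condition (placeholder)
            break
        # weight correction: the second classification weight becomes round t+1's first weight
        class_weight = correct_classification_weight(
            class_weight, per_class_f1,
            evaluate_fn=lambda w: min(
                score_fn(train_fn(X, y, [w[c] for c in y]), X, y).values()))
    return model, class_weight   # target classification model and final classification weights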
After the target classification model is obtained, the target classification model may be used to conduct class prediction on the target data. Specifically, the computer equipment acquires target data, inputs the target data into a target classification model, performs category prediction processing on the target data through the target classification model, so as to obtain a third category prediction result corresponding to the target data, and then determines the category to which the target data belongs according to the third category prediction result. Taking the asset class to which the predicted target data belongs as an example, the target data may include information such as basic information, asset information, etc. of the target object, and the third class prediction result may be represented by a numerical value or a text, for example, the third class prediction result is "high asset", and then the asset class to which the target data belongs is the high asset class. It can be seen that the target classification model is adopted to predict the data category, so that the prediction accuracy in the application scene of unbalanced sample data can be improved.
In the embodiment of the application, after initial sample data in a training sample set is obtained and a first classification weight of the initial sample data in the t-th training is obtained, the initial sample data can be weighted according to the first classification weight to obtain first weighted sample data of the t-th training; the first classification weights are used for balancing the proportion of the initial sample data of different categories in the training sample set, and the initial sample data of different categories in the training sample set correspond to different first classification weights; t is a positive integer; further, the first weighted sample data can be input into an initial classification model, category prediction processing is carried out on the first weighted sample data through the initial classification model, a first category prediction result corresponding to the first weighted sample data is obtained, and a first loss corresponding to the initial classification model is determined according to the first category prediction result, the first classification weight and category label information corresponding to the initial sample data; finally, the first classification weight can be subjected to weight correction according to the first loss to obtain a second classification weight for the t training, the initial classification model is subjected to network parameter correction according to the first loss to obtain a candidate classification model for the t training, and the target classification model is obtained according to the second classification weight and the candidate classification model for the t training; the target classification model is used for predicting the category to which the target data belong. Therefore, the initial sample data is weighted through the first classification weight, so that the prediction accuracy is improved when the target data is predicted by the target classification model constructed based on the weighted first weighted sample data; further, since the first loss corresponding to the initial classification model takes the first classification weight into consideration, when the network parameter of the initial classification model is corrected based on the first loss, the model performance of the candidate classification model trained for the t-th time can be improved; in addition, the target classification model is obtained based on the second classification weight and the t-th trained candidate classification model, so that the target classification model is adopted to predict the target data, and the prediction accuracy in the application scene of unbalanced sample data can be further improved.
Further, referring to fig. 5, fig. 5 is a flow chart of another data category prediction method according to an embodiment of the present application. It will be appreciated that the data class based prediction method is performed by a computer device, which may be a terminal device (e.g., terminal device 102a, terminal device 102b, or terminal device 102c in the corresponding embodiment of fig. 1), or a server (e.g., server 101 in the corresponding embodiment of fig. 1). As shown in fig. 5, the data category based prediction method may include the following steps S201 to S208:
step S201: the method comprises the steps of obtaining initial sample data in a training sample set, obtaining first classification weights of the initial sample data in the t-th training, and carrying out weighting processing on the initial sample data according to the first classification weights to obtain first weighted sample data of the t-th training.
The specific implementation manner of step S201 may refer to the specific implementation manner of step S101 in fig. 3, and will not be described herein.
Step S202: and inputting the first weighted sample data into an initial classification model, and performing category prediction processing on the first weighted sample data through M classification sub-models of the initial classification model to obtain M sub-category prediction results.
Step S203: and determining the sum of the M subcategory prediction results as a first category prediction result corresponding to the first weighted sample data.
Step S204: and determining a first loss corresponding to the initial classification model according to the first class prediction result, the first classification weight and class label information corresponding to the initial sample data.
In one possible implementation, the initial classification model may include M classification sub-models, where M is an integer greater than 1; M may take the value 2, 3, 4, and so on, and the embodiment of the present application does not limit the specific value of M. Specifically, the computer device may input the first weighted sample data into the initial classification model, and perform class prediction processing on the first weighted sample data through the M classification sub-models of the initial classification model to obtain M sub-class prediction results; the sum of the M sub-class prediction results can then be determined as the first class prediction result corresponding to the first weighted sample data.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an initial classification model according to an embodiment of the application. Taking the example that the initial classification model includes 2 sub-classification models (classification sub-model 1 and classification sub-model 2), as shown in fig. 6, the sub-classification model 1 is assumed to include node 1, node 3, node 4, node 5, and node 6, where node 1, node 3 are non-leaf nodes in the sub-classification model 1, and node 4, node 5, and node 6 are leaf nodes in the sub-classification model 1. Node 1 is the root node of nodes 3 and 4, and node 3 is the root node of leaf nodes 5 and 6. Sub-classification model 2 is assumed to include node 2, node 7, and node 8, where node 2 is a non-leaf node in sub-classification model 2 and node 7 and node 8 are leaf nodes in sub-classification model 2. Node 2 is the root node of leaf node 7 and leaf node 8. Taking the sum of the results corresponding to the leaf node 4, the leaf node 5 and the leaf node 6 as a sub-category prediction result 1 corresponding to the sub-category model 1; taking the sum of the results corresponding to the leaf nodes 7 and 8 as a sub-category prediction result 2 corresponding to the sub-category model 2; and summing the sub-category prediction result 1 and the sub-category prediction result 2 to obtain a first category prediction result corresponding to the first weighted sample data. That is, the output results (first class prediction results) of the original classification model may be obtained by accumulating the output results corresponding to the M sub-classification models. In the embodiment of the present application, the following relationship may be satisfied between parameters such as the first weighted sample data and the first class prediction result:
y_pred1 = Σ_{k=1}^{M} f_k(w)    (5)

where y_pred1 represents the first class prediction result, and f_k(w) represents the weight of the leaf node corresponding to the first weighted sample data w in the k-th classification sub-model, with k being a positive integer less than or equal to M.
It can be seen that, in the initial classification model of the embodiment of the application, through the common decision of the plurality of classification sub-models, the output results of the classification sub-models are accumulated to obtain the first class prediction result corresponding to the initial classification model, so that the effect of the whole model is improved.
Referring to fig. 7 together, fig. 7 is a schematic diagram of a training process of an initial classification model according to an embodiment of the application. Taking the example that the initial classification model includes 3 classification sub-models, as shown in fig. 7, the initial classification model may employ a greedy algorithm and learn sub-model by sub-model, with each classification sub-model fitting the residual of the previous classification sub-model. Specifically, the first weighted sample data may be input into the initial classification model, and the residual of classification sub-model 1 is calculated to obtain residual E_1; residual E_1 is used as the class label information of classification sub-model 2 to train classification sub-model 2. Then, the residual of classification sub-model 2 is calculated to obtain residual E_2, and residual E_2 is used as the class label information of classification sub-model 3 to train classification sub-model 3. Finally, classification sub-model 1, classification sub-model 2 and classification sub-model 3 are added together to obtain the initial classification model. The class label information of classification sub-model 1 is the class label information corresponding to the initial sample data; residual E_1 is the difference between sub-class prediction result 1 corresponding to classification sub-model 1 and the class label information corresponding to the initial sample data; and residual E_2 is the difference between sub-class prediction result 2 corresponding to classification sub-model 2 and residual E_1.
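The sub-model-by-sub-model residual fitting of fig. 7 can be sketched as below. This is a generic gradient-boosting-style illustration under the assumption that each classification sub-model is a small regression tree from scikit-learn; it is not the embodiment's exact model.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_initial_classification_model(X, y, n_sub_models=3, max_depth=3):
    sub_models, target = [], np.asarray(y, dtype=float)
    for _ in range(n_sub_models):
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, target)                  # train on the residual of the previous sub-model
        target = target - tree.predict(X)    # residual E_k becomes the next label information
        sub_models.append(tree)
    return sub_models

def predict(sub_models, X):
    # the outputs of all sub-models are accumulated, as in formula (5)
    return sum(m.predict(X) for m in sub_models)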
As can be seen from fig. 6 and fig. 7, the training process of the initial classification model can be understood as a process of continually adding classification sub-models, one at a time, where the newly added sub-model learns a new function to fit the residual of the previous prediction. When training is completed and M classification sub-models are obtained, predicting the first class prediction result corresponding to the first weighted sample data can be understood as follows: according to the features of the first weighted sample data, the data falls to a corresponding leaf node in each classification sub-model, each leaf node corresponds to a predicted value, and finally the predicted values of the individual sub-models only need to be added up to obtain the first class prediction result corresponding to the first weighted sample data. Therefore, the first class prediction result of an initial classification model containing M classification sub-models is numerically equal to the accumulated prediction result of the first M-1 classification sub-models plus the sub-class prediction result of the M-th classification sub-model. For the initial classification model, the target loss (used to optimize the model parameters of the initial classification model) and parameters such as the class label information can then satisfy the following relationship:
Obj^t = Σ_{i=1}^{T} l(y^(i), ŷ_{M-1}^(i) + f_M(x^(i)))    (6)

where Obj^t represents the target loss corresponding to the initial classification model; T represents the number of samples corresponding to the first weighted sample data; y^(i) represents the class label information corresponding to the i-th initial sample data; ŷ_{M-1}^(i) represents the accumulated prediction result of the first M-1 classification sub-models for the i-th initial sample data; the residual between the accumulated prediction result of the first M-1 classification sub-models and the class label information is the quantity the M-th classification sub-model is trained to fit; and f_M(x^(i)) represents the sub-class prediction result of the M-th classification sub-model for the i-th initial sample data.
Performing a second-order Taylor expansion on the target loss in formula (6), the target loss and parameters such as the class label information can satisfy the following relationship:

Obj^t ≈ Σ_{i=1}^{T} [ l(y^(i), ŷ_{M-1}^(i)) + g_i·f_M(x^(i)) + (1/2)·h_i·f_M(x^(i))^2 ]    (7)

where Obj^t represents the target loss corresponding to the initial classification model; T represents the number of samples corresponding to the first weighted sample data; y^(i) represents the class label information corresponding to the i-th initial sample data; ŷ_{M-1}^(i) represents the accumulated prediction result of the first M-1 classification sub-models for the i-th initial sample data; l(y^(i), ŷ_{M-1}^(i)) represents the residual term between the accumulated prediction result of the first M-1 classification sub-models and the class label information; f_M(x^(i)) represents the sub-class prediction result of the M-th classification sub-model for the i-th initial sample data; and g_i and h_i represent the first-order and second-order derivatives of the loss with respect to the accumulated prediction ŷ_{M-1}^(i), i.e., the coefficients of f_M(x^(i)) in the expansion.
For the initial classification model, when the M-th classification sub-model is trained, the structure of the first M-1 classification sub-models is already known, that is, those M-1 classification sub-models have already been trained. The residual term between the accumulated prediction result of the first M-1 classification sub-models and the class label information, l(y^(i), ŷ_{M-1}^(i)), is therefore a constant that does not affect the optimization of the target loss, so the target loss in formula (7) can be further simplified, and the target loss and parameters such as the sub-class prediction result of the M-th classification sub-model can satisfy the following relationship:

Obj^t = Σ_{i=1}^{T} [ g_i·f_M(x^(i)) + (1/2)·h_i·f_M(x^(i))^2 ]    (8)

where Obj^t represents the target loss corresponding to the initial classification model; T represents the number of samples corresponding to the first weighted sample data; f_M(x^(i)) represents the sub-class prediction result of the M-th classification sub-model for the i-th initial sample data; and g_i and h_i represent the first-order and second-order derivative coefficients introduced in formula (7).
As can be seen from formula (8), the target loss is a univariate quadratic function with respect to the M-th classification sub-model; more specifically, it is a univariate quadratic function of the leaf-node predicted value w of the M-th classification sub-model, so the vertex formula can be used to solve for the optimal value of the target loss. That is, once the concrete target loss is determined, the optimal value of the target loss can be obtained by computing its first-order and second-order derivatives, and model parameter correction is performed on the M-th classification sub-model according to the target loss so that the model becomes optimal.
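Written out, the vertex-formula step reads as follows; this is a standard derivation consistent with formula (8), with G and H introduced here only as shorthand for the summed derivative coefficients of the samples on one leaf:

Obj^t(w) = G·w + (1/2)·H·w^2, where G = Σ_i g_i and H = Σ_i h_i;
dObj^t/dw = G + H·w = 0  ⇒  w* = -G/H,  and  Obj^t(w*) = -G^2/(2H).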
From formula (8), the target loss corresponding to the initial classification model may be any differentiable loss, such as a cross entropy loss or a mean square error loss, or may be a custom loss. In the embodiment of the application, the first loss can be used as the target loss of the initial classification model: by calculating the first-order and second-order derivatives of the first loss, the optimal value of the target loss is determined, and the network parameters of the M-th classification sub-model are corrected so that the model becomes optimal.
The first loss is associated with the first classification weight, and the following relationship can be satisfied between parameters such as a sub-category prediction result of the mth classification sub-model, the first classification weight, and category label information of the mth classification sub-model:
L_1 = -α_1 · (1 - y_m · y_predm)^γ · log(y_predm)    (9)
where L_1 represents the first loss; α_1 represents the first classification weight; 1 represents the constant parameter; y_m represents the class label information of the m-th classification sub-model; y_predm represents the sub-class prediction result of the m-th classification sub-model; and γ represents the index parameter.
It can be seen that the first loss constructed in the embodiment of the present application can adjust, through the first classification weight α_1, the problem of an unbalanced proportion of sample data within sample imbalance, and can adjust, through the index parameter γ, the problem of an unbalanced loss distribution between simple samples and complex samples within sample imbalance. When the initial classification model comprises M classification sub-models, the classification sub-models are trained one by one; therefore, the classification sub-models in the initial classification model can each be trained with the first loss associated with the first classification weight, so that the finally obtained target classification model has higher prediction accuracy in application scenarios with unbalanced sample data.
Further, the computer device also needs to determine the specific structure of the M-th classification sub-model; that is, it also needs to perform feature splitting according to the feature values contained in the first weighted sample data, so as to construct the specific structure of the M-th classification sub-model. Specifically, a greedy algorithm may be used to traverse all candidate split points of all features in the first weighted sample data, with the value of the target loss used as the evaluation index: for a node to be split, the split is performed if the gain of the target-loss value after splitting the node is larger than the gain before splitting. The specific splitting index may be chosen from the information entropy, the information gain, the gini coefficient (gini index), and the like. Meanwhile, in order to prevent the structure of the M-th classification sub-model from becoming too complex, a gain threshold can be set: when the gain is larger than the gain threshold, the node to be split is split; otherwise it is not split. The gain threshold is a preset parameter whose specific value can be determined according to the actual situation.
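As an illustrative sketch of the split search just described (not the embodiment's exact criterion), one can enumerate candidate thresholds for one feature and accept a split only when its gain exceeds the gain threshold. The gain function below uses the summed g/h statistics of formula (8) and the leaf optimum -G^2/(2H) from the vertex formula as the evaluation value, which is one common instantiation.

def leaf_score(g_sum, h_sum):
    # optimal objective value on a leaf, -G^2 / (2H), from the vertex formula above
    return -(g_sum ** 2) / (2.0 * h_sum) if h_sum > 0 else 0.0

def best_split(feature_values, g, h, gain_threshold=0.0):
    order = sorted(range(len(feature_values)), key=lambda i: feature_values[i])
    total_g, total_h = sum(g), sum(h)
    parent = leaf_score(total_g, total_h)
    best = None
    left_g = left_h = 0.0
    for pos in range(1, len(order)):                 # greedy scan over candidate split points
        i = order[pos - 1]
        left_g, left_h = left_g + g[i], left_h + h[i]
        gain = parent - leaf_score(left_g, left_h) \
                      - leaf_score(total_g - left_g, total_h - left_h)
        if gain > gain_threshold and (best is None or gain > best[0]):
            best = (gain, feature_values[i])         # (gain, split threshold)
    return best                                      # None means the node is not split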
Step S205: carrying out weight correction on the first classification weight according to the first loss to obtain a second classification weight for the t-th training, carrying out network parameter correction on the initial classification model according to the first loss to obtain a candidate classification model for the t-th training, and obtaining a target classification model according to the second classification weight and the candidate classification model for the t-th training; the target classification model is used for predicting the category to which the target data belong.
When the initial classification model contains M classification sub-models, the training process of each classification sub-model is similar, so this embodiment is described by taking the M-th classification sub-model as an example. For the initial classification model, when the M-th classification sub-model is trained, the structure of the first M-1 classification sub-models is already known, that is, those M-1 classification sub-models have already been trained; thus, when training the M-th classification sub-model, the first loss may be understood as the first loss corresponding to the M-th classification sub-model. The computer device may take minimizing the first loss as the objective and correct the network parameters of the M-th classification sub-model (e.g., the learning rate, regularization coefficient and other parameters), for example by reducing the learning rate of the initial classification model, so that the sub-class prediction result corresponding to the M-th classification sub-model moves closer to the class label information corresponding to the M-th classification sub-model, thereby obtaining the M-th classification sub-model in the candidate classification model of the t-th training. After the training of the M-th classification sub-model is completed, the previously trained M-1 classification sub-models are combined with the M-th classification sub-model to obtain the candidate classification model of the t-th training, and finally the target classification model is obtained according to the second classification weight and the candidate classification model of the t-th training. In the embodiment of the application, when the network parameters of the M-th classification sub-model are corrected based on the first loss, Bayesian optimization can be used for the parameter correction; specifically, the f1 score corresponding to the class with the smallest proportion among the first class prediction results may be used as the parameter-search index.
In the embodiment of the present application, the first classification weight may be understood as a classification weight associated with a first loss corresponding to the mth classification sub-model. When the weight correction is performed on the first classification weight, the computer equipment can enable the sub-class prediction result corresponding to the Mth classification sub-model to be closer to the class label information corresponding to the Mth classification sub-model by taking the first loss of the Mth classification sub-model as a target, so that the weight correction is performed on the first classification weight, the second classification weight of the t training is obtained, and the prediction accuracy of the model is improved.
Specifically, the computer device may select for weight correction the first classification weight corresponding to the class with the lowest f1 score, i.e., the class with the largest difference between the sub-class prediction result corresponding to the M-th classification sub-model and the class label information corresponding to the M-th classification sub-model, so that the second classification weight corresponding to that class is larger than its first classification weight; the correction threshold step size is set over several orders of magnitude (for example 0.1, 0.01, 0.001, etc.), the optimal correction threshold step size is found, and automatic update iterations are performed, thereby completing the weight correction of the first classification weight until the difference between the sub-class prediction result corresponding to the M-th classification sub-model and the class label information corresponding to the M-th classification sub-model falls within a preset range. Meanwhile, each time the first classification weight is updated to obtain the second classification weight, the training and parameter search of the model can be carried out again for the current second classification weight, so that the finally obtained target classification model meets higher accuracy requirements.
Step S206: and obtaining target data, inputting the target data into a target classification model, and performing category prediction processing on the target data through the target classification model to obtain a third category prediction result corresponding to the target data.
Step S207: and acquiring a business class strategy associated with the target data, and acquiring a fourth class prediction result corresponding to the target data according to the hit result of the target data in the business class strategy.
Step S208: and carrying out weighted summation processing on the third category prediction result and the fourth category prediction result, and determining the category to which the target data belong.
After the target classification model is obtained, class prediction may be performed on the target data based on the target classification model and a traffic class policy associated with the target data. The class policy may be a preset policy, and the service class policy may be used as a reference service for class prediction. The hit results may include results corresponding to hit success and hit failure, and when the target data is successfully matched with the traffic class policy, then the target data is determined to successfully hit the traffic class policy, otherwise the hit fails.
For example, taking the asset class to which the predicted target data belongs as an example, the target data may include information such as basic information and asset information of the target object; after the computer equipment acquires the target data, the target data can be input into a target classification model, and the target classification model is used for carrying out class prediction processing on the target data, so that a third class prediction result corresponding to the target data is obtained. The third category prediction result may be a numerical value, for example, the third category prediction result corresponding to the target data output by the target classification model is 0.8, etc. And then acquiring a business class strategy associated with the target data, and obtaining a fourth class prediction result corresponding to the target data according to the hit result of the target data in the business class strategy.
The business class policy may be set, for example, as "the hit succeeds if the asset data exceeds 500,000, otherwise the hit fails"; if the target data indicates that the asset data exceeds 500,000, the hit result of the target data against the business class policy is determined to be a successful hit. Optionally, the business class policy may include a plurality of policy rules, and the computer device may obtain the number of successful hits of the target data in the business class policy and determine the fourth class prediction result corresponding to the target data. For example, the business class policy may include 10 policy rules; when the hit result corresponding to a policy rule is a successful hit, the prediction score may be set to 0.1, and when the hit result corresponding to a policy rule is a failed hit, the prediction score may be set to 0; the number of successful hits of the target data in the business class policy is then counted, and the fourth class prediction result of the target data is determined according to the number of hits. For example, if the number of successful hits of the target data in the business class policy is 6, the fourth class prediction result of the target data may be 0.6.
Further, the third category prediction result and the fourth category prediction result can be weighted and summed to obtain a target category prediction result corresponding to the target data. Alternatively, the calculation formula of the target class prediction result may be expressed as: target class predictor = weighting factor 1 x third class predictor + weighting factor 2 x fourth class predictor; the weighting coefficients 1 and 2 are preset parameters, and specific values can be set according to actual requirements. For example, the weighting coefficient 1 may be set to 0.7, the weighting coefficient 2 may be set to 0.3, and the target class prediction result may be 0.74 when the third class prediction result is 0.8 and the fourth class prediction result is 0.6.
After the target class prediction result is obtained, the asset class to which the target data belongs can be determined by setting the corresponding relation between the target class prediction result and the asset class. For example, the correspondence of the target category prediction result and the asset category may be set as: when the target class prediction result corresponding to the target data is smaller than 0.3, the asset class to which the target data belongs is a low asset class; when the target class prediction result corresponding to the target data is between 0.3 and 0.8, the asset class to which the target data belongs is a medium asset class; when the target class prediction result corresponding to the target data is greater than 0.8, the asset class to which the target data belongs is a high asset class. When the target category prediction result is 0.74, it may be determined that the asset category to which the target data belongs is a medium asset category.
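A minimal end-to-end sketch of this combination step (model score, policy hit score, weighted summation, and the asset-class thresholds above) might read as below; the rule list and the way the deployed target classification model exposes its score are placeholders.

def predict_asset_class(model_score, policy_rules, target_data,
                        w_model=0.7, w_policy=0.3):
    # fourth class prediction result: 0.1 per successfully hit policy rule
    hits = sum(1 for rule in policy_rules if rule(target_data))
    policy_score = 0.1 * hits
    # weighted summation of the third and fourth class prediction results
    target_score = w_model * model_score + w_policy * policy_score
    if target_score < 0.3:
        return "low asset class", target_score
    elif target_score <= 0.8:
        return "medium asset class", target_score
    return "high asset class", target_score

# Example consistent with the text: third result 0.8 and 6 of 10 rules hit (fourth result 0.6)
# give 0.7*0.8 + 0.3*0.6 = 0.74, i.e., the medium asset class.
example = predict_asset_class(0.8, [lambda d: True] * 6 + [lambda d: False] * 4, {})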
In the embodiment of the application, the prediction accuracy under the application scene of unbalanced sample data can be improved by carrying out weighted summation on the third category classification result output by the target classification model and the fourth category prediction result obtained by the service category strategy.
In the embodiment of the application, after initial sample data in a training sample set is obtained and a first classification weight of the initial sample data in the t-th training is obtained, the initial sample data can be weighted according to the first classification weight to obtain first weighted sample data of the t-th training; the first classification weights are used for balancing the proportion of the initial sample data of different categories in the training sample set, and the initial sample data of different categories in the training sample set correspond to different first classification weights; t is a positive integer; further, the first weighted sample data may be input to an initial classification model, category prediction processing is performed on the first weighted sample data through M classification sub-models of the initial classification model to obtain M sub-category prediction results, a sum of the M sub-category prediction results is determined to be a first category prediction result corresponding to the first weighted sample data, and a first loss corresponding to the initial classification model is determined according to the first category prediction result, the first classification weight and category label information corresponding to the initial sample data; finally, the first classification weight can be subjected to weight correction according to the first loss to obtain a second classification weight for the t training, the initial classification model is subjected to network parameter correction according to the first loss to obtain a candidate classification model for the t training, and the target classification model is obtained according to the second classification weight and the candidate classification model for the t training; acquiring target data, inputting the target data into a target classification model, and performing category prediction processing on the target data through the target classification model to obtain a third category prediction result corresponding to the target data; acquiring a business class strategy associated with the target data, and acquiring a fourth class prediction result corresponding to the target data according to a hit result of the target data in the business class strategy; and finally, carrying out weighted summation processing on the third category prediction result and the fourth category prediction result, and determining the category to which the target data belong.
Therefore, the initial sample data is weighted through the first classification weight, so that the prediction accuracy is improved when the target data is predicted by the target classification model constructed based on the weighted first weighted sample data; further, since the first loss corresponding to the initial classification model takes the first classification weight into consideration, when the network parameter of the initial classification model is corrected based on the first loss, the model performance of the candidate classification model trained for the t-th time can be improved; in addition, the target classification model is obtained based on the second classification weight and the t-th trained candidate classification model, so that the target classification model is adopted to predict the target data, and the accuracy of data class prediction can be further improved. Finally, through the common decision of the target classification model and the business class strategy, the prediction accuracy under the application scene of unbalanced sample data can be further improved.
To verify the performance of the target classification model in the present solution, taking the predicted asset data as an example, the target classification model in the present solution is compared with the classification model in the prior art, and the comparison result shown in fig. 8 and fig. 9 is obtained. The classification model adopted in the prior art 1 is a classification model obtained by training by using cross entropy loss; the classification model used in the prior art 2 is a classification model obtained by training sample data weighted by classification weights and cross entropy loss.
As can be seen from fig. 8, compared with the prior art 1 and the prior art 2, the curve fitting between the prediction curve obtained by the scheme and the original data is better, that is, the difference between the prediction result obtained by the target classification model of the scheme and the original data is smaller, so that the prediction distribution of the scheme is closer to the original distribution.
In addition, as can be seen from fig. 8 and fig. 9, because no classification weight is introduced in prior art 1 and the first four categories, especially the third category, account for a relatively large share of the original data, the classification model of prior art 1 classifies most of the prediction results into the third category: the proportion of prediction results corresponding to the third category reaches 32.53%, which is 9.46% higher than that of the original data, while the predictions corresponding to the sixth, seventh and eighth categories all differ significantly from the original data, e.g., the eighth category accounts for only 0.75% of the predictions, 5.16% lower than the original data. In the present scheme, by contrast, the differences between the prediction results for the sixth, seventh and eighth categories and the original data are smaller. Taking the prediction of the asset data of business objects as an example, if the business emphasis is placed on rating business objects with high asset data, the classification model of prior art 1 yields prediction results with low accuracy. Therefore, although the overall prediction accuracy (52.25%) of the classification model of prior art 1 is high, its prediction distribution differs greatly from the original distribution, and it is not suitable for classification prediction in scenarios with unbalanced sample data.
The classification model adopted in prior art 2 is obtained by training with sample data weighted by classification weights and with a cross entropy loss. Compared with prior art 1, after the classification weights are applied the distribution predicted by the model is closer to the original distribution, and the prediction accuracy for the categories with few samples is improved to a certain extent; however, the overall prediction accuracy of the model drops considerably, to only 48.77%, so it is difficult to guarantee the prediction accuracy when the model of prior art 2 is used for classification prediction.
In summary, compared with prior art 1 and prior art 2, the present scheme achieves higher prediction accuracy for the categories with few samples, and the overall prediction accuracy of the model (50.66%) is also good. Therefore, the present application can improve the accuracy of data category prediction and is particularly suitable for category prediction in scenarios with unbalanced sample data.
Referring to fig. 10, fig. 10 is a schematic structural diagram of a data class-based prediction apparatus according to an embodiment of the present application. As shown in fig. 10, the data class-based prediction apparatus 1 may include: a first weighting processing module 11, a first loss determination module 12 and a first parameter correction module 13. The detailed description of the individual modules follows:
The first weighting processing module 11 is configured to obtain initial sample data in the training sample set, obtain a first classification weight of the initial sample data during the t-th training, and perform weighting processing on the initial sample data according to the first classification weight to obtain first weighted sample data of the t-th training; the first classification weights are used for balancing the proportion of the initial sample data of different categories in the training sample set, and the initial sample data of different categories in the training sample set correspond to different first classification weights; t is a positive integer;
the first loss determining module 12 is configured to input the first weighted sample data into an initial classification model, perform a class prediction process on the first weighted sample data through the initial classification model, obtain a first class prediction result corresponding to the first weighted sample data, and determine a first loss corresponding to the initial classification model according to the first class prediction result, the first classification weight, and class label information corresponding to the initial sample data;
the first parameter correction module 13 is configured to perform weight correction on the first classification weight according to the first loss to obtain a second classification weight for the t-th training, perform network parameter correction on the initial classification model according to the first loss to obtain a candidate classification model for the t-th training, and obtain a target classification model according to the second classification weight and the candidate classification model for the t-th training; the target classification model is used for predicting the category to which the target data belong.
The specific functional implementation manners of the first weighting processing module 11, the first loss determining module 12, and the first parameter correcting module 13 may refer to step S101-step S103 in the embodiment corresponding to fig. 3, and are not described herein.
In one or more embodiments, the training sample set includes a first category of sample data and a second category of sample data; the data class-based prediction apparatus 1 may further include: a sample number acquisition module 14 and a classification weight determination module 15, wherein:
a sample number obtaining module 14, configured to obtain a first sample number corresponding to sample data of a first category in the training sample set, and a second sample number corresponding to sample data of a second category in the training sample set;
a classification weight determining module 15, configured to determine initial classification weights corresponding to the first category and the second category respectively according to the first sample number and the second sample number; wherein the product between the first sample number and the initial classification weight corresponding to the first class is the same as the product between the second sample number and the initial classification weight corresponding to the second class; when t=1, the first classification weight in the t training is the initial classification weight corresponding to the category to which the initial sample data belongs.
The specific functional implementation manner of the sample number obtaining module 14 and the classification weight determining module 15 may refer to step S101 in the embodiment corresponding to fig. 3, and will not be described herein.
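As a purely illustrative aid (not part of the claimed embodiment), the following Python sketch shows one way the initial classification weights handled by modules 14 and 15 could be derived from the per-category sample numbers so that the product of each category's sample number and its initial classification weight is the same constant; the function name and the particular normalization constant are assumptions.

```python
from collections import Counter

def initial_classification_weights(labels):
    """Compute per-category weights inversely proportional to sample counts,
    so that (sample count) * (weight) is the same constant for every category."""
    counts = Counter(labels)                      # e.g. {0: 900, 1: 100}
    total = sum(counts.values())
    num_classes = len(counts)
    # weight_c = total / (num_classes * count_c); then count_c * weight_c is constant
    return {c: total / (num_classes * n) for c, n in counts.items()}

# Usage: weights for an imbalanced two-category training sample set
labels = [0] * 900 + [1] * 100
weights = initial_classification_weights(labels)
print(weights)  # {0: 0.555..., 1: 5.0}; 900 * 0.555... == 100 * 5.0 == 500
```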
In one or more embodiments, the data class-based prediction apparatus 1 may further include: a first class prediction module 16, a cross entropy loss determination module 17, and a second parameter correction module 18, wherein:
the first class prediction module 16 is configured to obtain a source classification model, input first weighted sample data to the source classification model, and perform class prediction processing on the first weighted sample data through the source classification model to obtain an initial class prediction result corresponding to the first weighted sample data;
the cross entropy loss determining module 17 is configured to determine a cross entropy loss corresponding to the source classification model according to the initial category prediction result and category label information corresponding to the initial sample data;
a second parameter correction module 18, configured to correct the network parameters of the source classification model according to the cross entropy loss, and determine the source classification model including the corrected network parameters as an initial classification model.
The specific functional implementation manner of the first class prediction module 16, the cross entropy loss determination module 17 and the second parameter correction module 18 may refer to step S102 in the embodiment corresponding to fig. 3, and will not be described herein.
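For readers who prefer code, the following is a minimal, hypothetical PyTorch-style sketch of how a source classification model might be pre-trained with a cross entropy loss to obtain the initial classification model handled by modules 16-18; the function name, optimizer choice, learning rate and epoch count are assumptions and not prescribed by the embodiment.

```python
import torch
import torch.nn as nn

def pretrain_source_model(source_model, weighted_samples, labels, lr=1e-3, epochs=5):
    """Pre-train a source classification model on the weighted sample data with a
    cross entropy loss; the resulting model serves as the initial classification model."""
    optimizer = torch.optim.Adam(source_model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        logits = source_model(weighted_samples)   # initial category prediction results
        loss = criterion(logits, labels)          # cross entropy against category labels
        loss.backward()                           # correct the network parameters
        optimizer.step()
    return source_model                           # used as the initial classification model
```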
In one or more embodiments, the initial classification model includes M classification sub-models; m is an integer greater than 1; the first loss determination module 12 may include: a category prediction unit 121 and a first determination unit 122, wherein:
a class prediction unit 121, configured to input the first weighted sample data into an initial classification model, and perform class prediction processing on the first weighted sample data through M classification sub-models of the initial classification model, to obtain M sub-class prediction results;
the first determining unit 122 is configured to determine a sum of the M subcategory prediction results as a first category prediction result corresponding to the first weighted sample data.
The specific functional implementation manner of the category prediction unit 121 and the first determination unit 122 may refer to step S202-step S203 in the embodiment corresponding to fig. 5, and will not be described herein.
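A minimal sketch of the ensemble structure described by units 121 and 122 is given below, assuming PyTorch modules; summing the M sub-category prediction results is the only behaviour taken from the embodiment, while the class name and constructor signature are assumptions.

```python
import torch
import torch.nn as nn

class InitialClassificationModel(nn.Module):
    """Toy ensemble of M classification sub-models; the first category prediction
    result is the sum of the M sub-category prediction results."""
    def __init__(self, sub_models):
        super().__init__()
        self.sub_models = nn.ModuleList(sub_models)   # the M classification sub-models

    def forward(self, weighted_samples):
        sub_predictions = [m(weighted_samples) for m in self.sub_models]
        return torch.stack(sub_predictions, dim=0).sum(dim=0)  # sum of the M sub-results
```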
In one or more embodiments, the first loss determination module 12 may include: a log processing unit 123, a second determination unit 124, and a third determination unit 125, wherein:
a log processing unit 123, configured to perform log processing on the first type of prediction result, so as to obtain a log result corresponding to the first type of prediction result;
A second determining unit 124, configured to determine a correction coefficient corresponding to the logarithmic result according to the first type prediction result and the type label information corresponding to the initial sample data;
the third determining unit 125 is configured to determine a product among the logarithmic result, the correction coefficient, and the first classification weight as a first loss corresponding to the initial classification model.
The specific functional implementation manner of the log processing unit 123, the second determining unit 124, and the third determining unit 125 may refer to step S102 in the embodiment corresponding to fig. 3, and will not be described herein.
In one or more embodiments, the second determining unit 124 may include: a first determination subunit 1241 and a second determination subunit 1242, wherein:
a first determining subunit 1241, configured to determine a product between the first class prediction result and class label information corresponding to the initial sample data as a first candidate value;
the second determining subunit 1242 is configured to obtain the constant parameter and the exponent parameter, determine a difference value between the constant parameter and the first candidate value as a second candidate value, and perform an exponent operation on the second candidate value and the exponent parameter to obtain a correction coefficient corresponding to the logarithmic result.
The specific functional implementation manner of the first determining subunit 1241 and the second determining subunit 1242 may refer to step S102 in the embodiment corresponding to fig. 3, which is not described herein again.
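Taken together, units 123-125 and subunits 1241-1242 describe a loss that is structurally similar to a class-weighted focal loss. The sketch below is one possible, non-authoritative reading of that computation: the negative sign, the small epsilon for numerical stability and the clamp are assumptions added so the example runs, and the default values of the constant and exponent parameters are illustrative only.

```python
import torch

def first_loss(pred_probs, onehot_labels, class_weights, constant=1.0, exponent=2.0):
    """Illustrative first loss: product of (i) the logarithm of the first category
    prediction result, (ii) a correction coefficient (constant - p*y)**exponent, and
    (iii) the first classification weight of each sample."""
    p_true = (pred_probs * onehot_labels).sum(dim=1)              # first candidate value p*y
    log_result = torch.log(p_true + 1e-12)                        # logarithmic result
    correction = (constant - p_true).clamp(min=0.0) ** exponent   # (second candidate value)**exponent
    loss = -(class_weights * correction * log_result)             # per-sample first loss
    return loss.mean()
```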
In one or more embodiments, the first parameter modification module 13 may include: a weighting processing unit 131, a second loss determination unit 132, a parameter correction unit 133, and a model determination unit 134, wherein:
a weighting processing unit 131, configured to perform weighting processing on the initial sample data according to the second classification weight, so as to obtain second weighted sample data of the t+1st training;
a second loss determination unit 132, configured to output a second class prediction result corresponding to the second weighted sample data through the candidate classification model trained for the t-th time, and determine a second loss corresponding to the candidate classification model trained for the t-th time according to the second class prediction result, the second classification weight, and the class label information;
a parameter correction unit 133, configured to perform network parameter correction on the candidate classification model trained for the t-th time according to the second loss, so as to obtain a candidate classification model trained for the t+1th time;
the model determining unit 134 is configured to determine the t+1st training candidate classification model as the target classification model if the t+1st training candidate classification model satisfies the training stop condition.
The specific functional implementation manners of the weighting processing unit 131, the second loss determination unit 132, the parameter correction unit 133 and the model determination unit 134 may refer to step S103 in the embodiment corresponding to fig. 3, and will not be described herein.
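To make the alternation between weight correction and network parameter correction performed by units 131-134 concrete, here is a hypothetical outer training loop; the helper callables update_weights and compute_loss, the element-wise weighting of samples, and the loss-threshold stop condition are all assumptions introduced for illustration and are not dictated by the embodiment.

```python
def train_target_model(model, samples, labels, weights, update_weights, compute_loss,
                       optimizer, max_rounds=100, target_loss=1e-3):
    """Hypothetical outer loop: in round t the weighted samples are fed to the current
    candidate model, the loss corrects both the classification weights and the network
    parameters, and training stops once a stop condition is met."""
    for t in range(1, max_rounds + 1):
        weighted = samples * weights.unsqueeze(1)         # weight the initial sample data
        preds = model(weighted)                           # class prediction of round t
        loss = compute_loss(preds, labels, weights)       # loss depends on current weights
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()                                  # candidate model of round t
        weights = update_weights(weights, loss.detach())  # second classification weights
        if loss.item() < target_loss:                     # assumed training stop condition
            break
    return model                                          # target classification model
```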
In one or more embodiments, the data class-based prediction apparatus 1 may further include: a second class prediction module 19, a third class prediction module 20 and a weighted summation module 21, wherein:
the second class prediction module 19 is configured to obtain target data, input the target data into a target classification model, and perform class prediction processing on the target data through the target classification model to obtain a third class prediction result corresponding to the target data;
the third class prediction module 20 is configured to obtain a traffic class policy associated with the target data, and obtain a fourth class prediction result corresponding to the target data according to a hit result of the target data in the traffic class policy;
the weighted summation module 21 is configured to perform weighted summation processing on the third category prediction result and the fourth category prediction result, and determine a category to which the target data belongs.
The specific functional implementation manner of the second class prediction module 19, the third class prediction module 20 and the weighted summation module 21 may refer to step S206-step S208 in the embodiment corresponding to fig. 5, and will not be described herein.
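As an illustration of how the third and fourth category prediction results handled by modules 19-21 could be fused, consider the following sketch; the fusion coefficients model_weight and policy_weight, and the assumption that both the model and the business class policy return array-like per-category scores, are hypothetical choices rather than values given in the embodiment.

```python
import numpy as np

def predict_category(target_data, target_model, business_policy,
                     model_weight=0.7, policy_weight=0.3):
    """Illustrative fusion of the model's third category prediction result with a
    fourth result derived from a business class policy, via weighted summation."""
    third_result = target_model(target_data)       # per-category scores from the target model
    fourth_result = business_policy(target_data)   # per-category hit scores from the policy
    fused = model_weight * np.asarray(third_result) + policy_weight * np.asarray(fourth_result)
    return int(np.argmax(fused))                   # category to which the target data belongs
```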
In the embodiment of the application, after initial sample data in a training sample set is obtained and a first classification weight of the initial sample data in the t-th training is obtained, the initial sample data can be weighted according to the first classification weight to obtain first weighted sample data of the t-th training; the first classification weights are used for balancing the proportion of the initial sample data of different categories in the training sample set, and the initial sample data of different categories in the training sample set correspond to different first classification weights; t is a positive integer; further, the first weighted sample data can be input into an initial classification model, category prediction processing is carried out on the first weighted sample data through the initial classification model, a first category prediction result corresponding to the first weighted sample data is obtained, and a first loss corresponding to the initial classification model is determined according to the first category prediction result, the first classification weight and category label information corresponding to the initial sample data; finally, the first classification weight can be subjected to weight correction according to the first loss to obtain a second classification weight for the t training, the initial classification model is subjected to network parameter correction according to the first loss to obtain a candidate classification model for the t training, and the target classification model is obtained according to the second classification weight and the candidate classification model for the t training; the target classification model is used for predicting the category to which the target data belong. Therefore, the initial sample data is weighted through the first classification weight, so that the prediction accuracy is improved when the target data is predicted by the target classification model constructed based on the weighted first weighted sample data; further, since the first loss corresponding to the initial classification model takes the first classification weight into consideration, when the network parameter of the initial classification model is corrected based on the first loss, the model performance of the candidate classification model trained for the t-th time can be improved; in addition, the target classification model is obtained based on the second classification weight and the t-th trained candidate classification model, so that the target classification model is adopted to predict the target data, and the prediction accuracy in the application scene of unbalanced sample data can be further improved.
Referring to fig. 11, fig. 11 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 11, the computer device 1000 may include: a processor 1001, a network interface 1004 and a memory 1005; in addition, the computer device 1000 may further include: a user interface 1003 and one or more communication buses 1002. The communication bus 1002 is used to implement connection and communication among these components. The user interface 1003 may include a display (Display) and a keyboard (Keyboard); optionally, the user interface 1003 may further include a standard wired interface and a wireless interface. Optionally, the network interface 1004 may include a standard wired interface or a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory, or may be a non-volatile memory, for example, one or more disk memories. Optionally, the memory 1005 may also be one or more storage devices located remotely from the aforementioned processor 1001. As shown in fig. 11, the memory 1005, which is a computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a device control application.
In the computer device 1000 shown in fig. 11, the network interface 1004 may provide a network communication function, the user interface 1003 is mainly used to provide an input interface for the user, and the processor 1001 may be used to invoke the device control application stored in the memory 1005 to implement:
acquiring initial sample data in a training sample set, acquiring first classification weights of the initial sample data in the t-th training, and carrying out weighting treatment on the initial sample data according to the first classification weights to obtain first weighted sample data of the t-th training; the first classification weights are used for balancing the proportion of the initial sample data of different categories in the training sample set, and the initial sample data of different categories in the training sample set correspond to different first classification weights; t is a positive integer;
inputting the first weighted sample data into an initial classification model, performing category prediction processing on the first weighted sample data through the initial classification model to obtain a first category prediction result corresponding to the first weighted sample data, and determining a first loss corresponding to the initial classification model according to the first category prediction result, the first classification weight and category label information corresponding to the initial sample data;
Carrying out weight correction on the first classification weight according to the first loss to obtain a second classification weight for the t-th training, carrying out network parameter correction on the initial classification model according to the first loss to obtain a candidate classification model for the t-th training, and obtaining a target classification model according to the second classification weight and the candidate classification model for the t-th training; the target classification model is used for predicting the category to which the target data belong.
It should be understood that the computer device 1000 described in the embodiment of the present application may perform the description of the data-class-based prediction method in the embodiment corresponding to any one of fig. 3 and fig. 5, and may also perform the description of the data-class-based prediction apparatus 1 in the embodiment corresponding to fig. 10, which is not repeated herein. In addition, the description of the beneficial effects of the same method is omitted.
Furthermore, it should be noted here that: the embodiment of the present application further provides a computer readable storage medium, in which the aforementioned computer program executed by the data class-based prediction device 1 is stored, and the computer program includes program instructions, when executed by a processor, can execute the description of the data class-based prediction method in any of the foregoing embodiments shown in fig. 3 and 5, and therefore, a detailed description thereof will not be provided herein. In addition, the description of the beneficial effects of the same method is omitted. For technical details not disclosed in the embodiments of the computer-readable storage medium according to the present application, please refer to the description of the method embodiments of the present application. As an example, program instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or, alternatively, across multiple computing devices distributed across multiple sites and interconnected by a communication network, where the multiple computing devices distributed across multiple sites and interconnected by the communication network may constitute a blockchain system.
In addition, it should be noted that: embodiments of the present application also provide a computer program product or computer program that may include computer instructions that may be stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor may execute the computer instructions, so that the computer device performs the description of the data class based prediction method in the embodiment corresponding to fig. 3 and 5, which will not be described in detail herein. In addition, the description of the beneficial effects of the same method is omitted. For technical details not disclosed in the computer program product or the computer program embodiments according to the present application, reference is made to the description of the method embodiments according to the present application.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but those skilled in the art should understand that the present application is not limited by the order of the actions described, as some steps may be performed in another order or simultaneously according to the present application. Further, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required for the present application.
The steps in the method of the embodiment of the application can be sequentially adjusted, combined and deleted according to actual needs.
The modules in the device of the embodiment of the application can be combined, divided and deleted according to actual needs.
Those skilled in the art will appreciate that implementing all or part of the above-described methods may be accomplished by way of a computer program stored in a computer-readable storage medium, which when executed may comprise the steps of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), or the like.
The foregoing disclosure is illustrative of the present application and is not to be construed as limiting the scope of the application, which is defined by the appended claims.

Claims (11)

1. A data class-based prediction method, comprising:
acquiring initial sample data in a training sample set, acquiring first classification weights of the initial sample data during the t-th training, and carrying out weighting processing on the initial sample data according to the first classification weights to obtain first weighted sample data of the t-th training; the first classification weights are used for balancing the proportion of the initial sample data of different categories in the training sample set, and the initial sample data of different categories in the training sample set correspond to different first classification weights; t is a positive integer;
Inputting the first weighted sample data into an initial classification model, performing category prediction processing on the first weighted sample data through the initial classification model to obtain a first category prediction result corresponding to the first weighted sample data, and determining a first loss corresponding to the initial classification model according to the first category prediction result, the first classification weight and category label information corresponding to the initial sample data;
performing weight correction on the first classification weight according to the first loss to obtain a second classification weight for the t training, performing network parameter correction on the initial classification model according to the first loss to obtain a candidate classification model for the t training, and obtaining a target classification model according to the second classification weight and the candidate classification model for the t training; the target classification model is used for predicting the category to which the target data belong.
2. The method of claim 1, wherein the training sample set comprises a first category of sample data and a second category of sample data;
after the initial sample data in the training sample set is obtained, the method further comprises:
Acquiring a first sample number corresponding to sample data of a first category in the training sample set, and acquiring a second sample number corresponding to sample data of a second category in the training sample set;
determining initial classification weights corresponding to the first category and the second category respectively according to the first sample number and the second sample number; wherein the product between the first sample number and the initial classification weight corresponding to the first class is the same as the product between the second sample number and the initial classification weight corresponding to the second class; when t=1, the first classification weight in the t training is the initial classification weight corresponding to the category to which the initial sample data belongs.
3. The method of claim 1, further comprising, prior to said inputting said first weighted sample data into an initial classification model:
acquiring a source classification model, inputting the first weighted sample data into the source classification model, and performing category prediction processing on the first weighted sample data through the source classification model to obtain an initial category prediction result corresponding to the first weighted sample data;
Determining cross entropy loss corresponding to the source classification model according to the initial category prediction result and category label information corresponding to the initial sample data;
and correcting network parameters of the source classification model according to the cross entropy loss, and determining the source classification model containing the corrected network parameters as the initial classification model.
4. The method of claim 1, wherein the initial classification model comprises M classification sub-models; m is an integer greater than 1;
and performing category prediction processing on the first weighted sample data through the initial classification model to obtain a first category prediction result corresponding to the first weighted sample data, wherein the method comprises the following steps:
performing category prediction processing on the first weighted sample data through M classification sub-models of the initial classification model to obtain M sub-category prediction results;
and determining the sum of the M subcategory prediction results as a first category prediction result corresponding to the first weighted sample data.
5. The method of claim 1, wherein determining the first loss corresponding to the initial classification model based on the first class prediction result, the first classification weight, and class label information corresponding to the initial sample data comprises:
Carrying out logarithmic processing on the first type prediction result to obtain a logarithmic result corresponding to the first type prediction result;
determining a correction coefficient corresponding to the logarithmic result according to the first category prediction result and category label information corresponding to the initial sample data;
and determining the product among the logarithmic result, the correction coefficient and the first classification weight as a first loss corresponding to the initial classification model.
6. The method of claim 5, wherein determining the correction coefficient corresponding to the logarithmic result based on the first class prediction result and class label information corresponding to the initial sample data comprises:
determining a product between the first class prediction result and class label information corresponding to the initial sample data as a first candidate value;
and acquiring a constant parameter and an exponent parameter, determining a difference value between the constant parameter and the first candidate value as a second candidate value, and performing an exponent operation on the second candidate value and the exponent parameter to obtain a correction coefficient corresponding to the logarithmic result.
7. The method of claim 1, wherein the deriving the target classification model from the second classification weight and the t-th trained candidate classification model comprises:
Weighting the initial sample data according to the second classification weight to obtain second weighted sample data of the t+1st training;
outputting a second class prediction result corresponding to the second weighted sample data through a t-th trained candidate classification model, and determining a second loss corresponding to the t-th trained candidate classification model according to the second class prediction result, the second classification weight and the class label information;
network parameter correction is carried out on the candidate classification model trained for the t time according to the second loss, and a candidate classification model trained for the t+1th time is obtained;
and if the t+1st training candidate classification model meets the training stopping condition, determining the t+1st training candidate classification model as the target classification model.
8. The method according to any one of claims 1-7, further comprising, after the deriving a target classification model from the second classification weight and the t-th trained candidate classification model:
acquiring target data, inputting the target data into the target classification model, and performing category prediction processing on the target data through the target classification model to obtain a third category prediction result corresponding to the target data;
Acquiring a business class strategy associated with the target data, and acquiring a fourth class prediction result corresponding to the target data according to a hit result of the target data in the business class strategy;
and carrying out weighted summation processing on the third category prediction result and the fourth category prediction result to obtain the category to which the target data belong.
9. A data class-based prediction apparatus, comprising:
the first weighting processing module is used for acquiring initial sample data in a training sample set, acquiring first classification weights of the initial sample data in the t-th training, and carrying out weighting processing on the initial sample data according to the first classification weights to obtain first weighted sample data of the t-th training; the first classification weights are used for balancing the proportion of the initial sample data of different categories in the training sample set, and the initial sample data of different categories in the training sample set correspond to different first classification weights; t is a positive integer;
the first loss determination module is used for inputting the first weighted sample data into an initial classification model, carrying out category prediction processing on the first weighted sample data through the initial classification model to obtain a first category prediction result corresponding to the first weighted sample data, and determining a first loss corresponding to the initial classification model according to the first category prediction result, the first classification weight and category label information corresponding to the initial sample data;
The first parameter correction module is used for carrying out weight correction on the first classification weight according to the first loss to obtain a second classification weight for the t training, carrying out network parameter correction on the initial classification model according to the first loss to obtain a candidate classification model for the t training, and obtaining a target classification model according to the second classification weight and the candidate classification model for the t training; the target classification model is used for predicting the category to which the target data belong.
10. A computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, performs the steps of the method of any of claims 1 to 8.
11. A computer readable storage medium, characterized in that it stores a computer program comprising program instructions which, when executed by a processor, perform the steps of the method according to any of claims 1 to 8.
CN202310231183.7A 2023-03-01 2023-03-01 Data category based prediction method, device, equipment and medium Pending CN116975753A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310231183.7A CN116975753A (en) 2023-03-01 2023-03-01 Data category based prediction method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310231183.7A CN116975753A (en) 2023-03-01 2023-03-01 Data category based prediction method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN116975753A true CN116975753A (en) 2023-10-31

Family

ID=88481990

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310231183.7A Pending CN116975753A (en) 2023-03-01 2023-03-01 Data category based prediction method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN116975753A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117292266A (en) * 2023-11-24 2023-12-26 河海大学 Method and device for detecting concrete cracks of main canal of irrigation area and storage medium
CN117292266B (en) * 2023-11-24 2024-03-22 河海大学 Method and device for detecting concrete cracks of main canal of irrigation area and storage medium
CN118171158A (en) * 2024-03-07 2024-06-11 北京中卓时代消防装备科技有限公司 Fire disaster mode identification method and system based on big data

Similar Documents

Publication Publication Date Title
CN116975753A (en) Data category based prediction method, device, equipment and medium
CN111563706A (en) Multivariable logistics freight volume prediction method based on LSTM network
CN112508580A (en) Model construction method and device based on rejection inference method and electronic equipment
CN112990423A (en) Artificial intelligence AI model generation method, system and equipment
CN111797320A (en) Data processing method, device, equipment and storage medium
CN112529683A (en) Method and system for evaluating credit risk of customer based on CS-PNN
CN111695084A (en) Model generation method, credit score generation method, device, equipment and storage medium
CN110796485A (en) Method and device for improving prediction precision of prediction model
JP2022033695A (en) Method, device for generating model, electronic apparatus, storage medium and computer program product
CN112215269A (en) Model construction method and device for target detection and neural network architecture
CN114202065B (en) Stream data prediction method and device based on incremental evolution LSTM
CN113342474A (en) Method, device and storage medium for forecasting customer flow and training model
CN117851909B (en) Multi-cycle decision intention recognition system and method based on jump connection
CN117060408B (en) New energy power generation prediction method and system
CN112801231B (en) Decision model training method and device for business object classification
CN117436929A (en) Prediction method and device for user repurchase behavior
CN115412401B (en) Method and device for training virtual network embedding model and virtual network embedding
CN110322055B (en) Method and system for improving grading stability of data risk model
CN114862092A (en) Evaluation method and device based on neural network
CN114781699A (en) Reservoir water level prediction early warning method based on improved particle swarm Conv1D-Attention optimization model
CN114444606A (en) Model training and data classification method and device
CN113723593A (en) Load shedding prediction method and system based on neural network
CN112767128A (en) Weight determination model training method, risk prediction method and device
KR102497543B1 (en) Military demand prediction model and practical system using machine learning
CN117834630B (en) Method, apparatus and medium for sensing edge node health status in a network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication