CN108665175A - Processing method, apparatus, and processing device for insurance business risk prediction

Processing method, apparatus, and processing device for insurance business risk prediction

Info

Publication number
CN108665175A
Authority
CN
China
Prior art keywords
risk
data
forecast model
decision tree
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810469782.1A
Other languages
Chinese (zh)
Inventor
吴龙凤
陈*
石秋慧
张泰玮
陈诗奕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201810469782.1A (CN108665175A)
Publication of CN108665175A
Priority to TW108105613A (TW201947470A)
Priority to PCT/CN2019/076524 (WO2019218751A1)
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Data Mining & Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Educational Administration (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Technology Law (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The embodiments of this specification disclose a processing method, apparatus, and processing device for insurance business risk prediction. With the method provided by the embodiments, a risk prediction model can be built in advance using a gradient boosting decision tree, and the model can be trained with labeled risk-associated data associated with the insurance business. Once training of the risk prediction model meets the prediction requirements, the model can be used for online risk prediction: it predicts the insurance business risk of a user to be predicted and outputs a prediction result. The method can reasonably and effectively use the multi-dimensional non-linear variables found in insurance business. A risk prediction model based on the non-linear relationships of a gradient boosting decision tree is compatible with both linear and non-linear variables. Compared with a traditional linear model, the accuracy of the prediction result is markedly improved, which makes up for the shortcomings of the traditional linear model and improves the insurance service experience.

Description

Processing method, apparatus, and processing device for insurance business risk prediction
Technical field
The embodiments of this specification belong to the field of computer data processing for insurance business risk prediction, and in particular relate to a processing method, apparatus, and processing device for insurance business risk prediction.
Background
Motor vehicle insurance (vehicle insurance for short) is a type of commercial insurance that covers liability for personal injury or property loss caused to motor vehicles by natural disasters or accidents. As the economy develops, the number of motor vehicles keeps increasing, and vehicle insurance has become one of the largest lines of property insurance in China.
When a user takes out vehicle insurance, the insurance company usually carries out a risk assessment of the user. The result of the assessment directly affects the user's insured amount, discounts, and so on. By assessing user risk, the insurance company can process insurance business more accurately and reasonably, and can effectively avoid or reduce business risk. At present, in the field of vehicle insurance risk prediction, risk prediction based on the generalized linear model (GLM) has become the mainstream technique in the industry. A generalized linear model mainly handles linearly related data. For example, if each 1-percentage-point decrease in time spent online corresponds to a 1-year increase in age, a GLM can be built on the linear relationship between the online-duration data and the age data.
However, as vehicle insurance business keeps growing, internet data has become more varied and far larger in volume, and the traditional GLM approach is increasingly constrained. For example, "age" may vary not only with time spent online but also with the shopping behavior and habits of a population, and different consumption habits affect the age distribution in a non-linear way. A GLM can summarize non-linear variables in segments by binning, but this loses much of the precision of the variables and makes it hard to meet the requirements of risk prediction on large, multi-dimensional data. The industry therefore urgently needs a more efficient way of predicting vehicle insurance business risk from multi-dimensional data.
Summary of the invention
The embodiments of this specification aim to provide a processing method, apparatus, and processing device for insurance business risk prediction that introduce a gradient boosting decision tree into insurance business risk prediction, so that risk can be predicted from insurance business data with non-linear relationships and the accuracy of insurance business risk prediction is effectively improved.
The processing method, apparatus, and processing device for insurance business risk prediction provided by the embodiments of this specification are implemented as follows:
A processing method for insurance business risk prediction, the method comprising:
obtaining target risk-associated data of a user to be predicted;
processing the target risk-associated data using a constructed risk prediction model and outputting a risk prediction result for the user to be predicted, wherein the risk prediction model comprises a prediction model determined by training a gradient boosting decision tree with labeled risk-associated data.
An insurance business risk prediction processing apparatus, comprising:
a prediction data acquisition module, configured to obtain target risk-associated data of a user to be predicted;
a risk prediction module, configured to process the target risk-associated data using a constructed risk prediction model and output a risk prediction result for the user to be predicted, wherein the risk prediction model comprises a prediction model determined by training a gradient boosting decision tree with labeled risk-associated data.
An insurance business risk prediction processing device, comprising a processor and a memory for storing processor-executable instructions, wherein the processor, when executing the instructions, implements:
obtaining target risk-associated data of a user to be predicted;
processing the target risk-associated data using a constructed risk prediction model and outputting a risk prediction result for the user to be predicted, wherein the risk prediction model comprises a prediction model determined by training a gradient boosting decision tree with labeled risk-associated data.
With the processing method, apparatus, and processing device for insurance business risk prediction provided by the embodiments of this specification, a risk prediction model can be built in advance using a gradient boosting decision tree, and the model can be trained with labeled risk-associated data associated with the insurance business. Once training of the risk prediction model meets the prediction requirements, the model can be used for online risk prediction: it predicts the insurance business risk of a user to be predicted and outputs a prediction result. The method can reasonably and effectively use the multi-dimensional non-linear variables found in insurance business. A risk prediction model based on the non-linear relationships of a gradient boosting decision tree is compatible with both linear and non-linear variables. Compared with a traditional linear model, the accuracy of the prediction result is markedly improved, which makes up for the shortcomings of the traditional linear model and improves the insurance service experience.
Brief description of the drawings
To describe the technical solutions in the embodiments of this specification or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some of the embodiments recorded in this specification; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow diagram of an embodiment of an insurance business risk prediction processing method provided by this specification;
Fig. 2 is a schematic diagram of a process for building the risk prediction model in the method provided by this specification;
Fig. 3 is a schematic diagram of the decision tree learning and training process in the method provided by this specification;
Fig. 4 is a hardware structure block diagram of a server applying the insurance business risk prediction processing method provided by this specification;
Fig. 5 is a module structure diagram of an insurance business risk prediction processing apparatus provided by this specification.
Detailed description
To help those skilled in the art better understand the technical solutions in this specification, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of this specification, not all of them. All other embodiments obtained by a person of ordinary skill in the art from one or more embodiments of this specification without creative effort shall fall within the scope of protection of the embodiments of this specification.
With the development of computer and internet technology, data volumes are growing rapidly. The categories of data features used in insurance business risk prediction are also becoming more multi-dimensional and fine-grained. Many variables affect screening and classification in a non-linear way. For example, time spent online and age are correlated, but the correlation can take many forms. It may be a simple linear relationship, for example each 1-percentage-point decrease in online duration corresponding to a 1-year increase in age. It may also be a more complex relationship such as an exponential one, for example a 4-percentage-point decrease in online duration corresponding to a 2-year increase in age; in that case a mathematical transformation can convert the relationship into a linear one that a generalized linear model can solve. In real life, besides variables with a basically linear relationship, there are also many non-linear variables. For example, when predicting age, "age" may vary not only with online duration but also with the shopping behavior and habits of a population, and different consumption habits affect the age distribution in a non-linear way. Because the goal is to predict the user's age, a GLM that cannot identify non-linear relationships will predict far less well. Existing solutions can summarize variables in segments by binning, but this loses the precision of many variables and degrades the prediction result. The embodiments of this specification provide an implementation of insurance business risk prediction that differs from existing conventional implementations. It introduces GBDT (Gradient Boosting Decision Tree), which can reasonably and effectively use non-linear variables to build a risk prediction model. The model is compatible with both linear and non-linear variables, and the accuracy of the prediction result is markedly improved compared with a traditional linear model.
GBDT (Gradient Boosting Decision Tree) is an iterative decision tree algorithm. It consists of multiple decision trees, and the conclusions of all the trees are summed to give the final result. The trees in GBDT are all regression trees and can be used for regression prediction. In the processing method for insurance business risk prediction provided by this specification, a decision tree model can be built in advance from labeled risk-associated data, and the parameters in the decision trees are gradually adjusted and optimized through regression-based machine learning (step-wise iteration). When the prediction results of the model meet the accuracy requirement of insurance business risk prediction, the model can be used online to predict the risk value, loss ratio, or similar measure of a user to be predicted.
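The additive structure described above (the final result is the sum of the conclusions of all trees) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: `gbdt_predict`, `tree_age`, and `tree_income` are hypothetical names, the two "trees" are hand-written stand-ins for trained regression trees, and the thresholds and scores are invented.

```python
from typing import Callable, List, Sequence

# A "tree" here is any function that maps a feature vector to a number.
Tree = Callable[[Sequence[float]], float]


def gbdt_predict(trees: List[Tree], features: Sequence[float]) -> float:
    """Return the ensemble output: the sum of the outputs of all trees."""
    return sum(tree(features) for tree in trees)


# Hypothetical stand-ins for trained regression trees.
# features[0] is age, features[1] is annual income (both assumptions).
def tree_age(features: Sequence[float]) -> float:
    return 60.0 if features[0] > 24 else 70.0


def tree_income(features: Sequence[float]) -> float:
    return 18.0 if features[1] >= 100_000 else 5.0


risk_score = gbdt_predict([tree_age, tree_income], [30, 120_000])
print(risk_score)  # 60.0 from the first tree plus 18.0 from the second
```

Because every tree outputs a real number, adding the outputs is meaningful, which is why the trees in the ensemble are regression trees rather than classification trees.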
The embodiments of this specification are described below using a specific vehicle insurance business risk prediction scenario as an example. Specifically, Fig. 1 is a flow diagram of an embodiment of the insurance business risk prediction processing method provided by this specification. Although this specification provides method steps or apparatus structures as shown in the following embodiments or drawings, the method or apparatus may, based on conventional practice or without creative effort, include more steps or module units, or fewer after some are merged. Where steps or structures have no logically necessary causal relationship, the execution order of the steps or the module structure of the apparatus is not limited to the order or structure shown in the embodiments or drawings. When the method or module structure is applied in an actual apparatus, server, or end product, it may be executed sequentially or in parallel according to the embodiments or drawings (for example in an environment with parallel processors or multi-threaded processing, or even in an implementation environment that includes distributed processing or server clusters).
Of course, the following description of the vehicle insurance embodiment does not limit other technical solutions that can be extended from this specification. For example, the embodiments provided by this specification are equally applicable to fund risk prediction, medical insurance risk prediction, and similar scenarios. Applications in other scenarios can refer to the description of the vehicle insurance embodiment and are not described again. In one specific embodiment, as shown in Fig. 1, the insurance business risk prediction processing method provided by this specification may include:
S0: Obtain the target risk-associated data of the user to be predicted;
S2: Process the target risk-associated data using the constructed risk prediction model and output the risk prediction result for the user to be predicted, wherein the risk prediction model is determined by training a gradient boosting decision tree with labeled risk-associated data.
In one or more embodiments of this specification, a GBDT-based risk prediction model can be built in advance. When the specific GBDT model is trained and built, its structure and parameters can be set according to the actual business scenario and data requirements. For example, a single tree can be trained on its own, with the training residual used as the input for training the next tree; or several trees can be cascaded in multiple stages and trained together, with the training residual then used as the input of another multi-stage cascade of trees. Of course, other embodiments may also use processing algorithms that modify, transform, or improve on the GBDT algorithm to predict risk from insurance business data with non-linear relationships. This specification does not describe every way of constructing the GBDT model.
In this embodiment, the training data for the risk prediction model can be collected in advance from historical vehicle insurance policy data and labeled according to risk divisions or set requirements. In the insurance business risk prediction scenario of this embodiment, the training data can be called risk-associated data. This data is generally associated with the insurance business and is used as training samples for the risk prediction model. For example, the risk-associated data may comprise user feature data in multiple dimensions: the feature data associated with one user forms one group of training data, and each group of risk-associated data can be labeled with a corresponding risk score. Specifically, in one embodiment of the method, the risk-associated data may include user feature data of at least one category, and the user feature data includes data that has a non-linear relationship with the insurance business. In one example, the risk-associated data of user A may include user feature data in 9 dimensions (A1, A2, A3, ..., A9). The dimensions can be chosen according to the needs of the vehicle insurance prediction; the 9 dimensions in this example may be age, gender, occupation, annual income, number of past claims, average monthly spending, credit grade, marital status, and debts and assets. Alternatively, user feature data in 10 or more dimensions can be collected in advance, and the feature data needed for model training is selected from those dimensions when the risk-associated data is determined. For example, specific risk-associated data may be as shown in Table 1 below:
Table 1. Example risk-associated data for model training
Of course, in other embodiments the risk-associated data may also include artificial data generated according to predetermined rules. For example, operators may define risk-associated data for model training according to the situations that the expected risk may cover. Alternatively, the required risk-associated data can be generated automatically by a computer once data generation rules have been set. Artificial data generated in this way matches the expected risk prediction situation more closely, while historical vehicle insurance case data is closer to real risk conditions. In some implementation scenarios, either one can be used, or artificial data and historical vehicle insurance case data can be combined to train the risk prediction model, to improve the accuracy of the prediction result.
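One way to picture a group of labeled risk-associated data, and the combination of historical and artificial data described above, is the following sketch. All names here (`RiskSample`, its fields, `build_training_set`) are hypothetical, the nine fields simply mirror the nine example dimensions listed earlier, and the values are made up for illustration; none of this reproduces Table 1.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class RiskSample:
    """One group of training data: a user's features plus a labeled score."""
    age: int
    gender: str
    occupation: str
    annual_income: float
    past_claims: int
    monthly_spend: float
    credit_grade: str
    marital_status: str
    debt_to_assets: float
    risk_score: float            # the label attached to this group
    source: str = "historical"   # "historical" or "synthetic"


def build_training_set(historical: Iterable[RiskSample],
                       synthetic: Iterable[RiskSample]) -> List[RiskSample]:
    """Combine historical case data with rule-generated artificial data."""
    return list(historical) + list(synthetic)


# Made-up records, for illustration only.
historical = [RiskSample(35, "F", "teacher", 90_000, 0, 3_000,
                         "A", "married", 0.2, risk_score=80.0)]
synthetic = [RiskSample(19, "M", "student", 0, 2, 800,
                        "C", "single", 0.0, risk_score=60.0,
                        source="synthetic")]
training_set = build_training_set(historical, synthetic)
print(len(training_set), [s.source for s in training_set])
```

Keeping a `source` field is a design choice of this sketch: it makes it possible to train on either kind of data alone or on both together, as the paragraph above allows.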
The obtained risk-associated data can be used as training data to train the GBDT model. After training, the thresholds of the decision features at the branches of the decision trees in the risk prediction model (all of the thresholds or some of them) allow the final output of the model to meet the accuracy requirement (usually the output may also be required to meet the accuracy requirement consistently and stably). The GBDT used in the embodiments of this specification is an iterative decision tree algorithm and can be divided into two parts: the regression decision tree (Regression Decision Tree, DT) and gradient boosting (Gradient Boosting, GB). Decision trees fall into two main classes: classification trees and regression trees. Classification trees are commonly used for classification problems, such as a user's gender, whether a web page is spam, or whether a user is cheating. Regression trees are generally used to predict real values, such as a user's age, the probability that a user clicks, or the relevance of a web page. The former predict class labels; the latter predict real numbers. It should be emphasized that adding and subtracting the results of regression trees is meaningful, for example 10 years + 5 years - 3 years = 12 years. The results of classification trees cannot be accumulated, or accumulating them is meaningless: for example, is male + male + female male or female? The embodiments of this specification can use regression trees to predict the vehicle insurance risk score, for example by summing the results of all trees to give the final risk prediction result.
The overall flow of a regression tree is similar to that of a classification tree. The difference is that each node of a regression tree yields a predicted value; taking age as an example, the predicted value equals the average age of all the people belonging to that node. At each branch, every feature is tried exhaustively to find the best splitting variable and the best split point. The criterion in this embodiment is no longer the Gini coefficient used in classification trees but the minimization of squared error. That is, the more the predictions are wrong, the larger the squared error, and minimizing the squared error finds the most reliable basis for branching. Branching continues until the ages of the people on each leaf node are all the same or a preset termination condition (such as an upper limit on the number of leaves) is reached. If the ages on a leaf node are still not all the same, the average age of all the people on that node is used as the prediction of the leaf node.
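The split search described above (try every feature and every candidate cut point, keep the one with the smallest squared error) can be sketched as follows. This is a minimal illustration, not the patent's implementation: `squared_error` and `best_split` are hypothetical names, candidate cut points are taken as midpoints between consecutive distinct values (an assumption), and the two features in the demo (monthly spend in hundreds, and a flag for "often asks questions") are made-up values.

```python
from typing import List, Tuple


def squared_error(values: List[float]) -> float:
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows: List[List[float]],
               targets: List[float]) -> Tuple[int, float, float]:
    """Return (feature index, threshold, total squared error) of the split
    with the lowest squared error over all features and cut points."""
    best = (-1, 0.0, float("inf"))
    for j in range(len(rows[0])):
        values = sorted({row[j] for row in rows})
        # Candidate cut points: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [t for row, t in zip(rows, targets) if row[j] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[j] > threshold]
            error = squared_error(left) + squared_error(right)
            if error < best[2]:
                best = (j, threshold, error)
    return best


# Made-up data: [monthly spend (hundreds), asks-questions flag] -> age.
rows = [[5, 1], [7, 0], [30, 1], [40, 0]]
targets = [14, 16, 24, 26]
feature, threshold, error = best_split(rows, targets)
print(feature, threshold, error)
```

On this data the search picks the spend feature, because splitting on it groups the two younger and the two older people together and leaves a much smaller squared error than splitting on the flag.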
Gradient boosting is a machine learning technique for regression, classification, and ranking tasks, and it belongs to the boosting family of algorithms. Boosting is a family of algorithms that can turn weak learners into a strong learner, and it falls within the scope of ensemble learning. Boosting methods are based on the following idea: for a complex task, a judgment obtained by appropriately combining the judgments of multiple experts is better than the judgment of any one of those experts alone. Put plainly, it is the principle behind the saying "three cobblers with their wits combined equal Zhuge Liang". Like other boosting methods, gradient boosting builds the final prediction model by combining (ensembling) multiple weak learners, typically decision trees. Boosting methods build the model in a stage-wise, iterative manner, and the weak learner built at each iteration is meant to make up for the shortcomings of the existing model.
For example, in one specific process, the number of trees can be set for training, and training can stop once the number of trees reaches the specified value (such as 80); alternatively, training can stop when the residual is very small (that is, it meets the stop condition). Training stops as soon as either of these two conditions is met.
If the residuals of the N-th tree are not all 0, or the stop condition is not met, the residuals at the nodes of the N-th tree replace the corresponding initial values and are fed into the (N+1)-th tree for learning;
this continues until the residual at the leaf nodes of the (N+K)-th tree equals the predicted value or falls below a threshold, at which point the risk prediction result (risk value or loss ratio) corresponding to the current leaf node is output. Specifically, the values learned by all trees (the prediction of the first tree plus the residuals learned by the later trees) can be summed to give the final risk prediction value.
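The two stopping rules described above (the tree count reaching a set value, or the residuals becoming small enough) can be sketched as a training loop. This is a minimal illustration under stated assumptions, not the patent's implementation: `fit_stump`, `train_gbdt`, `max_trees`, and `tolerance` are hypothetical names, each tree is limited to a single split on one numeric feature, and the data is made up.

```python
from typing import Callable, List, Tuple

Stump = Callable[[float], float]


def fit_stump(xs: List[float], residuals: List[float]) -> Stump:
    """Fit a one-split regression tree on a single numeric feature."""
    best: Tuple[float, float, float, float] = (float("inf"), 0.0, 0.0, 0.0)
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        cut = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= cut]
        right = [r for x, r in zip(xs, residuals) if x > cut]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if err < best[0]:
            best = (err, cut, lmean, rmean)
    _, cut, lmean, rmean = best
    return lambda x: lmean if x <= cut else rmean


def train_gbdt(xs: List[float], ys: List[float],
               max_trees: int = 80, tolerance: float = 1e-6):
    """Add trees until max_trees is reached or all residuals are tiny."""
    trees: List[Stump] = []
    residuals = list(ys)  # the first tree learns the labels themselves
    while len(trees) < max_trees:
        if max(abs(r) for r in residuals) <= tolerance:
            break  # stop condition 2: residuals are small enough
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        # The next tree learns what the trees so far still get wrong.
        residuals = [r - tree(x) for x, r in zip(xs, residuals)]
    return trees, residuals


# Stops at the tree limit (stop condition 1).
trees, residuals = train_gbdt([1, 2, 3, 4], [10, 20, 30, 40], max_trees=5)
# Stops early: one split already fits both points exactly.
quick_trees, quick_residuals = train_gbdt([1, 2], [3, 7])
print(len(trees), len(quick_trees))
```

Whichever condition is met first ends the loop, matching the rule that satisfying either condition is enough to stop training.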
Fig. 2 is a schematic diagram of a process for building the risk prediction model provided by this specification. As shown in Fig. 2, in another embodiment of the method provided by this specification, the risk prediction model is trained in the following way:
S20: Determine the total number of decision trees to be used and the threshold of the decision feature at each branch of the decision trees, where the decision feature is one of the categories of the user feature data;
S22: When training on a group of risk-associated data, if the number of trained decision trees reaches a preset value or the residual of the decision trees meets the stop condition, stop training on this group of risk-associated data, where the preset value is less than or equal to the total number;
S24: Adjust the thresholds of the decision features of the corresponding decision trees according to the training results on the risk-associated data, and determine the risk prediction model once the adjusted thresholds satisfy the prediction output requirement of the risk prediction model.
In this embodiment, the number of decision trees used for training can be determined in advance, and the thresholds of the decision features at the branches of the decision trees are determined by gradual optimization through gradient iteration. For example, 80 decision trees can be used, and each tree learns the residual of the sum of the conclusions of all the trees before it. The initial thresholds can be set based on experience. Suppose the true score of A (the labeled score) is 80 points, but the first tree, using age as its decision feature, predicts 60 points; the difference is 20 points, so the residual is 20. In the second tree (whose decision feature is the user's occupation), the score of A is then set to 20 points for learning. If the second tree does assign A to a leaf node with 20 points, the sum of the conclusions of the two trees is the true score of A (predicted score of 60 points + residual of 20 points). If the conclusion of the second tree is 18 points, A still has a residual of 2 points, so in the third tree (whose decision feature is annual income) the score of A becomes 2 points and learning continues. The residual computation at each step in effect increases the weight of wrongly predicted instances, while instances that are already predicted correctly tend toward 0. For example, suppose risk is higher when age is too high or too low, and lower when income is higher. If a user is quite old, past 60, but has been placed in the lower-risk branch L1, whose average age is between 20 and 40, the resulting residual increases accordingly, and later features such as income, marital status, and driving experience can gradually move the user to a leaf node close to the actual risk.
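The score example above (label 80, first tree 60, second tree 18) can be traced with a few lines of arithmetic. This is only an illustrative sketch: `residual_chain` is a hypothetical helper, and the third tree's output of 2 is an assumption made so that the chain closes; the text only says that the third tree learns a target of 2.

```python
from typing import List, Tuple


def residual_chain(label: float,
                   tree_outputs: List[float]) -> Tuple[List[float], float]:
    """Return the target each tree learns and the residual left at the end."""
    remaining = label
    targets = []
    for output in tree_outputs:
        targets.append(remaining)   # what this tree is asked to learn
        remaining -= output         # what is still unexplained afterwards
    return targets, remaining


# Label 80; tree 1 predicts 60, tree 2 predicts 18, tree 3 (assumed) predicts 2.
targets, final_residual = residual_chain(80, [60, 18, 2])
print(targets)          # targets learned by trees 1, 2, 3
print(final_residual)   # residual left after the last tree
print(60 + 18 + 2)      # the summed tree outputs recover the label
```

Each tree only has to explain what the earlier trees left over, which is why the targets shrink from 80 to 20 to 2.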
Training on a group of data can stop once the number of trained decision trees reaches the predetermined value (for example, all 10 trees from the root node to the leaf nodes have been trained), or once the parameters of the current training round meet the stop condition (for example, the residual is 0 or reaches some other residual stop threshold). When each threshold has found its best split point, or a split point that satisfies the training requirement, the thresholds of the decision features of the decision trees can be determined; once the adjusted thresholds satisfy the prediction output requirement of the risk prediction model, the risk prediction model is determined. For example, the initial threshold separating risk scores of 60 and 80 may be whether the user's age is over 20. After training and optimization on a large amount of data, the decision feature used to assess risk along the age dimension may finally be adjusted to whether the age is over 24, so that it matches the true result in most cases.
The following simple age prediction example illustrates how the embodiments of this specification can train decision trees with GBDT. When the example is applied to the insurance business risk prediction of this specification, the age is replaced by the vehicle insurance risk score or loss ratio, the monthly shopping spend and whether the person often asks questions are replaced by the corresponding categories of user feature data, and the thresholds are set accordingly. The specific process may include:
Suppose the training set (the risk association data) contains only four people, A, B, C and D, whose ages are 14, 16, 24 and 26 respectively. A and B are first-year and third-year high-school students; C and D are a fresh graduate and an employee with two years of work experience. One option would be to train a single traditional regression tree. Here GBDT is used for age prediction instead. Because the data set is so small, each tree is limited to at most two leaf nodes, that is, each tree has only one branch, and only two trees are learned, giving the result shown in Fig. 3. In the first tree, because the ages of A and B are close and the ages of C and D are close, the four are split into two groups, and the average age of each group is used as its predicted value. The residuals are then calculated (residual of A = actual value of A - predicted value of A), so the residual of A is 14 - 15 = -1. (Note that the predicted value of A is the cumulative sum of all preceding trees; here there is only one preceding tree, so it is simply 15, and if there were more trees their outputs would all be added to form A's predicted value.) The residuals of A, B, C and D are therefore -1, 1, -1 and 1 respectively. These residuals then replace the original values of A, B, C and D as the targets for the second tree. If the second tree's predicted values equal the residuals, adding the second tree's conclusion to the first tree's gives the real ages. The second tree has only the two values 1 and -1 to fit, so it splits directly into two nodes. Everyone's residual is now 0, that is, everyone obtains the true value as the prediction.
After processing by the two trees, the predicted values of A, B, C and D all agree with their real ages:
A: 14-year-old first-year high-school student; shops little and often asks senior schoolmates questions; predicted age A = 15 - 1 = 14;
B: 16-year-old third-year high-school student; shops little and is often asked questions by junior schoolmates; predicted age B = 15 + 1 = 16;
C: 24-year-old fresh graduate; shops more and often asks senior colleagues questions; predicted age C = 25 - 1 = 24;
D: 26-year-old employee with two years of work experience; shops more and is often asked questions by junior colleagues; predicted age D = 25 + 1 = 26.
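The two-tree age example can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the helper `fit_one_split` and the boolean encodings `shops_more` and `asks_questions` are illustrative names chosen here, standing in for the "monthly shopping amount" and "question-asking" features of the example.

```python
from statistics import mean


def fit_one_split(feature, targets):
    """One-branch tree: group samples by a binary feature and predict each group's mean."""
    groups = {}
    for key, target in zip(feature, targets):
        groups.setdefault(key, []).append(target)
    return {key: mean(values) for key, values in groups.items()}


# Training set from the example; the boolean encodings are assumptions.
ages = {"A": 14, "B": 16, "C": 24, "D": 26}
shops_more = {"A": False, "B": False, "C": True, "D": True}
asks_questions = {"A": True, "B": False, "C": True, "D": False}

names = list(ages)
targets = [ages[n] for n in names]

# First tree: split on shopping behaviour, predict the mean age of each group.
tree1 = fit_one_split([shops_more[n] for n in names], targets)
pred1 = [tree1[shops_more[n]] for n in names]

# Residual = actual value - predicted value of all preceding trees.
residuals = [t - p for t, p in zip(targets, pred1)]

# Second tree: fit the residuals, split on who asks and who answers questions.
tree2 = fit_one_split([asks_questions[n] for n in names], residuals)
pred2 = [tree2[asks_questions[n]] for n in names]

# The final prediction is the sum of the conclusions of both trees.
final = [p1 + p2 for p1, p2 in zip(pred1, pred2)]
```

The first tree predicts 15 for A and B and 25 for C and D, the residuals are -1, 1, -1, 1, and the summed prediction recovers 14, 16, 24 and 26.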
In another embodiment, the number of decision trees used by the risk prediction model may be determined from the number of categories of the user feature data. For example, if user feature data of 80 dimensions is selected, each dimension may serve as the decision feature of one tree, so a nonlinear risk prediction model can be built from 80 decision trees. In general, one dimension may also correspond to multiple trees, which can be set according to the data volume handled by the prediction model and the processing requirements of the application scenario. Of course, in other embodiments of this specification, the total number of decision trees may be determined from the collected data, the number of branches per tree, the parent-child connections between trees, and so on.
As mentioned above, the embodiments provided in this specification can be used not only for vehicle insurance business risk prediction but also for scenarios such as fund risk prediction and medical insurance risk prediction. Specifically, in the application scenario of vehicle insurance business risk prediction, the risk prediction model includes a vehicle insurance risk prediction model obtained by training on risk association data associated with the vehicle insurance business;
S26: The risk prediction result includes either the loss ratio or the vehicle insurance risk score of the user to be predicted.
Of course, the loss ratio and vehicle insurance risk score described above are only one way of characterizing the output of the nonlinear risk prediction model in one or more embodiments. This specification does not exclude other characterizations in other embodiments, or characterizations obtained by deforming or transforming the loss ratio or vehicle insurance risk score. For example, a vehicle insurance score can be obtained from the loss ratio by a linear transformation; the larger the vehicle insurance score, the smaller the risk (the opposite of the vehicle insurance risk score, where a larger score means higher risk).
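As a small illustration of such a linear transformation, the sketch below maps a predicted loss ratio to a score that rises as risk falls. The function name `vehicle_insurance_score` and the coefficients -100 and 100 are purely hypothetical; the specification does not give concrete values.

```python
def vehicle_insurance_score(loss_ratio, slope=-100.0, intercept=100.0):
    """Linear map y = a*x + b from loss ratio to a score where higher means lower risk.

    The default coefficients are illustrative assumptions, not values from the source.
    """
    if slope >= 0:
        raise ValueError("slope must be negative so that a higher score means lower risk")
    return slope * loss_ratio + intercept


low_risk_score = vehicle_insurance_score(0.2)   # user with a low predicted loss ratio
high_risk_score = vehicle_insurance_score(0.8)  # user with a high predicted loss ratio
```

Any negative slope preserves the ordering described in the text; the intercept only shifts the scale.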
It should be noted that a linear relationship usually means that two variables are related by a first-degree function. In the insurance or vehicle insurance context of the embodiments of this specification, a linear relationship between variables may take the form y = ax + b, where x is the independent variable and y is the dependent variable. In a specific insurance or vehicle insurance application scenario, a linear relationship may be understood broadly as a relationship between two variables that is definite and fixed and that, in some cases, can be expressed as a straight line or converted into a linear relationship by a certain mathematical transformation (with the information loss of the conversion kept within a certain range). A nonlinear relationship mainly means that the relationship between variables keeps changing and cannot be described by a single formula; in some cases it can only be represented by a curve, a curved surface or an irregular line, as with risk score and occupation, or risk score and gender.
In one or more embodiments of this specification, the risk prediction model may be built in advance offline: training data containing nonlinear relationships can be selected beforehand to train the GBDT decision trees, and the model is put into online use after training is complete. This specification does not exclude building, updating or maintaining the risk prediction model online. For example, where computing capacity is sufficient, the risk prediction model can be built online, and the model so built can be used online at the same time to process the target risk association data to be predicted.
The processing method for insurance business risk prediction provided in the embodiments of this specification can build a risk prediction model in advance using a gradient boosting decision tree, and the model can be trained with labeled risk association data associated with the insurance business. When the trained risk prediction model meets the prediction requirements, it can be used for online risk prediction, performing insurance business risk prediction for the user to be predicted and outputting the prediction result. The method provided in the embodiments of this specification can make reasonable and effective use of multi-dimensional nonlinear variables in the insurance business. A nonlinear risk prediction model based on a gradient boosting decision tree is compatible with both linear and nonlinear variables; compared with a traditional linear model, the accuracy of its prediction results is significantly improved, which effectively makes up for the deficiencies of traditional linear models and improves the insurance service experience.
The method described above can be used for risk identification on the client side, for example insurance business risk assessment provided in a payment application on a mobile terminal. The client may be a PC (personal computer), a server, an industrial control computer, a smartphone, a tablet, a portable computer (such as a laptop), a personal digital assistant (PDA), a desktop computer, a smart wearable device, or the like, as well as a mobile communication terminal, handheld device, in-vehicle device, wearable device, television device or other computing device. The method can also be applied in a system server of an insurance company or a third-party insurance service organization. The system server may include a single server, a server cluster, a distributed system server, or a combination of a server that handles data requests from processing devices and system servers that process other associated data. For example, one implementation may be built on Alibaba Cloud's Open Data Processing Service (ODPS) platform, which can provide a unified programming interface and user interface for data processing tasks arising from different user requirements. With system performance guaranteed by ODPS, a system implementing the method of the embodiments of this specification can process massive data in parallel and achieve optimal computing performance.
As mentioned above, the method embodiments provided in this specification can be executed in a mobile terminal, a computer terminal, a server or a similar computing device. Taking execution on a server as an example, Fig. 4 is a hardware block diagram of a server applying the insurance business risk prediction processing method provided in this specification. As shown in Fig. 4, the server 10 may include one or more (only one is shown in the figure) processors 102 (the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission module 106 for communication functions. A person of ordinary skill in the art will understand that the structure shown in Fig. 4 is only illustrative and does not limit the structure of the above electronic device. For example, the server 10 may include more or fewer components than shown in Fig. 4, such as other processing hardware like a database or multi-level cache, or may have a configuration different from that shown in Fig. 4.
The memory 104 can be used to store software programs and modules of application software, such as the program instructions/modules corresponding to the insurance business risk prediction processing method in the embodiments of this specification. By running the software programs and modules stored in the memory 104, the processor 102 executes various functional applications and data processing, that is, implements the insurance business risk prediction processing method described above. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory may be connected to the server 10 through a network. Examples of the above network include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The transmission module 106 is used to receive or send data via a network. A specific example of the above network may include a wireless network provided by the communication provider of the server 10. In one example, the transmission module 106 includes a network adapter (Network Interface Controller, NIC), which can be connected to other network devices through a base station so as to communicate with the Internet. In one example, the transmission module 106 may be a radio frequency (RF) module, which is used to communicate with the Internet wirelessly.
Based on the insurance business risk prediction processing method described above, this specification also provides an insurance business risk prediction processing device. The device may include systems (including distributed systems), software (applications), modules, components, servers, clients and the like that use the methods described in the embodiments of this specification, combined with the necessary implementing hardware. Based on the same inventive concept, the processing device in one embodiment provided in this specification is described in the following embodiments. Because the way the device solves the problem is similar to that of the method, the implementation of the specific processing device in the embodiments of this specification may refer to the implementation of the foregoing method, and repeated details are omitted. Although the device described in the following embodiments is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated. Specifically, Fig. 5 is a schematic diagram of the module structure of an embodiment of the insurance business risk prediction processing device provided in this specification. As shown in Fig. 5, the device may include:
a prediction data acquisition module 201, which may be used to obtain the target risk association data of the user to be predicted;
a risk prediction module 202, which may be used to process the target risk association data with the constructed risk prediction model and output the risk prediction result of the user to be predicted, where the risk prediction model includes a prediction model determined by training a gradient boosting decision tree with labeled risk association data.
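The division into modules 201 and 202 can be pictured with the following sketch. It is an assumption-laden illustration, not the device itself: the class names, the in-memory `store`, the feature names `age` and `annual_income`, and `toy_model` (a hand-written stand-in for a trained gradient boosting model) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Mapping

RiskData = Mapping[str, float]


@dataclass
class PredictionDataAcquisitionModule:
    """Counterpart of module 201: fetches a user's target risk association data."""
    store: Dict[str, RiskData]  # hypothetical in-memory data source

    def acquire(self, user_id: str) -> RiskData:
        return self.store[user_id]


@dataclass
class RiskPredictionModule:
    """Counterpart of module 202: applies an already-built risk prediction model."""
    model: Callable[[RiskData], float]

    def predict(self, data: RiskData) -> float:
        return self.model(data)


def toy_model(data: RiskData) -> float:
    """Stand-in for a trained model: the sum of two hand-written one-split trees."""
    score = 60.0 if data["age"] > 24 else 80.0                # tree 1: age threshold
    score += 5.0 if data["annual_income"] < 50000 else -5.0   # tree 2: income threshold
    return score


acquisition = PredictionDataAcquisitionModule({"u1": {"age": 30, "annual_income": 80000}})
prediction = RiskPredictionModule(toy_model)
result = prediction.predict(acquisition.acquire("u1"))
```

Keeping the model behind a plain callable means an offline-trained model can be swapped in without changing either module.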
It should be noted that the device described above in the embodiments of this specification may also include other implementations according to the description of the related method embodiments. For the specific implementation, refer to the description of the method embodiments, which is not repeated here.
The server or client provided in the embodiments of this specification can be implemented in a computer by a processor executing the corresponding program instructions, for example implemented in C++ on a Windows operating system at the PC or server end, implemented with a corresponding application design language and the necessary hardware on another system such as Linux, or implemented with processing logic based on a quantum computer. The above processing equipment may specifically be a server with which an insurance server or a third-party service organization provides risk prediction. The server may be a single server, a server cluster, a distributed system server, or a combination of a server that handles data requests from processing devices and system servers that process other associated data. This specification also provides insurance business risk prediction processing equipment, which may specifically include a processor and a memory for storing processor-executable instructions. When executing the instructions, the processor implements the following:
obtaining the target risk association data of the user to be predicted;
processing the target risk association data with the constructed risk prediction model and outputting the risk prediction result of the user to be predicted, where the risk prediction model includes a prediction model determined by training a gradient boosting decision tree with labeled risk association data.
As described in the foregoing method embodiments, in another embodiment of the processing equipment provided in this specification, the risk association data includes user feature data of at least one category, and the user feature data includes data that has a nonlinear relationship with the insurance business.
As described in the foregoing method embodiments, in another embodiment of the processing equipment provided in this specification, the processor trains the risk prediction model in the following way:
determining the total number of decision trees to be used and the threshold of the decision feature at each branch of the decision trees, where the decision feature is one of the categories of the user feature data;
when training on a group of risk association data, stopping the training on that group if the number of trained decision trees reaches a preset value or the residual of the decision trees meets the stop condition, where the preset value is less than or equal to the total number;
adjusting the threshold of the decision feature of the corresponding decision tree according to the training results on the risk association data, and determining the risk prediction model when the adjusted thresholds meet the output requirements for the prediction results of the risk prediction model.
As described in the foregoing method embodiments, in another embodiment of the processing equipment provided in this specification, the total number of decision trees is determined from the number of categories of the user feature data.
As described in the foregoing method embodiments, in another embodiment of the processing equipment provided in this specification, the risk prediction model includes a vehicle insurance risk prediction model obtained by training on risk association data associated with the vehicle insurance business;
the risk prediction result includes either the loss ratio or the vehicle insurance risk score of the user to be predicted.
The above instructions can be stored in a variety of computer-readable storage media. A computer-readable storage medium may include a physical device for storing information, in which information is digitized and then stored in a medium by electrical, magnetic or optical means. The computer-readable storage media described in this embodiment may include: devices that store information using electrical energy, such as various kinds of memory (RAM, ROM) and USB flash drives; devices that store information using magnetic energy, such as hard disks, floppy disks, magnetic tape, magnetic core memory and magnetic bubble memory; and devices that store information optically, such as CDs or DVDs. Of course, there are also other kinds of readable storage media, such as quantum memory and graphene memory. The same description applies to the instructions involved in the device, server, client or processing equipment described above.
It should be noted that the device and processing equipment described above in the embodiments of this specification may also include other implementations according to the description of the related method embodiments. For the specific implementation, refer to the description of the method embodiments, which is not repeated here.
The embodiments in this specification are described in a progressive manner. The same or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the hardware-plus-program embodiments are substantially similar to the method embodiments, they are described relatively briefly; for the relevant parts, refer to the description of the method embodiments.
Specific embodiments of this specification have been described above. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.
The processing method, device and processing equipment for insurance business risk prediction provided in the embodiments of this specification can build a risk prediction model in advance using a gradient boosting decision tree, and the model can be trained with labeled risk association data associated with the insurance business. When the trained risk prediction model meets the prediction requirements, it can be used for online risk prediction, performing insurance business risk prediction for the user to be predicted and outputting the prediction result. The method provided in the embodiments of this specification can make reasonable and effective use of multi-dimensional nonlinear variables in the insurance business. A nonlinear risk prediction model based on a gradient boosting decision tree is compatible with both linear and nonlinear variables; compared with a traditional linear model, the accuracy of its prediction results is significantly improved, which effectively makes up for the deficiencies of traditional linear models and improves the insurance service experience.
Although this application provides the method operation steps described in the embodiments or flowcharts, more or fewer operation steps may be included on the basis of conventional or non-inventive effort. The order of steps listed in the embodiments is only one of many possible execution orders and does not represent the only execution order. When an actual device or system server product executes the method, the steps can be executed sequentially in the order shown in the embodiments or drawings, or in parallel (for example, in an environment with parallel processors or multi-threaded processing).
Although the embodiments of this specification mention the definition of linear and nonlinear relationships, the construction of GBDT decision trees, the processing procedure of the GBDT model algorithm, and similar descriptions of data acquisition, storage, interaction, calculation and judgment, the embodiments of this specification are not limited to cases that conform to industry communication standards, standard GBDT model algorithm processing, communication protocols, standard data models/templates, or the cases described in the embodiments of this specification. Implementations slightly modified from certain industry standards, or from custom approaches or the implementations described in the embodiments, can also achieve effects that are the same as, equivalent to, similar to, or foreseeable variations of those of the above embodiments. Embodiments that apply such modified or varied ways of data acquisition, storage, judgment and processing may still fall within the scope of the optional implementations of this specification.
In the 1990s, an improvement to a technology could be clearly distinguished as either a hardware improvement (for example, an improvement to a circuit structure such as a diode, transistor or switch) or a software improvement (an improvement to a method flow). With the development of technology, however, improvements to many of today's method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. It therefore cannot be said that an improvement to a method flow cannot be implemented by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD), such as a field programmable gate array (Field Programmable Gate Array, FPGA), is an integrated circuit whose logic function is determined by the user programming the device. Designers program a digital system onto a single PLD themselves, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of making integrated circuit chips by hand, this programming is nowadays mostly done with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must be written in a specific programming language called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM and RHDL (Ruby Hardware Description Language), with VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog being the most commonly used at present. Those skilled in the art should also understand that a hardware circuit implementing a logical method flow can easily be obtained simply by describing the method flow in one of the above hardware description languages and programming it into an integrated circuit.
A controller can be implemented in any suitable manner. For example, a controller can take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20 and Silicon Labs C8051F320. A memory controller can also be implemented as part of the control logic of a memory. Those skilled in the art also know that, in addition to implementing a controller purely as computer-readable program code, it is entirely possible to program the method steps in logic so that the controller achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller can therefore be regarded as a hardware component, and the devices included in it for implementing various functions can also be regarded as structures within the hardware component. Indeed, the devices for implementing various functions can be regarded both as software modules implementing the method and as structures within the hardware component.
The processing equipment, devices, modules or units described in the above embodiments can be implemented by a computer chip or an entity, or by a product with a certain function. A typical implementing device is a computer. Specifically, the computer may be, for example, a personal computer, a laptop computer, an in-vehicle human-computer interaction device, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Although the embodiments of this specification provide the method operation steps described in the embodiments or flowcharts, more or fewer operation steps may be included on the basis of conventional or non-inventive means. The order of steps listed in the embodiments is only one of many possible execution orders and does not represent the only execution order. When an actual device or end product executes the method, the steps can be executed sequentially in the order shown in the embodiments or drawings, or in parallel (for example, in an environment with parallel processors or multi-threaded processing, or even a distributed data processing environment). The terms "include", "comprise" and any other variants are intended to cover non-exclusive inclusion, so that a process, method, product or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to the process, method, product or device. Without further limitation, the presence of other identical or equivalent elements in the process, method, product or device that includes the stated elements is not excluded.
For convenience of description, the above device is described with its functions divided into various modules. Of course, when implementing the embodiments of this specification, the functions of the modules can be implemented in one or more pieces of software and/or hardware, and a module implementing a given function can also be implemented by a combination of multiple sub-modules or sub-units. The device embodiments described above are only illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a particular way, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, where the instruction device implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
In a typical configuration, computing device includes one or more processors (CPU), input/output interface, net Network interface and memory.
Memory may include, among computer-readable media, volatile memory, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable medium includes permanent and non-permanent, removable and non-removable media can be by any method Or technology realizes information storage.Information can be computer-readable instruction, data structure, the module of program or other data. The example of the storage medium of computer includes, but are not limited to phase transition internal memory (PRAM), static RAM (SRAM), moves State random access memory (DRAM), other kinds of random access memory (RAM), read-only memory (ROM), electric erasable Programmable read only memory (EEPROM), fast flash memory bank or other memory techniques, read-only disc read only memory (CD-ROM) (CD-ROM), Digital versatile disc (DVD) or other optical storages, magnetic tape cassette, tape magnetic disk storage or other magnetic storage apparatus Or any other non-transmission medium, it can be used for storage and can be accessed by a computing device information.As defined in this article, it calculates Machine readable medium does not include temporary computer readable media (transitory media), such as data-signal and carrier wave of modulation.
Those skilled in the art will understand that embodiments of this specification may be provided as a method, a system, or a computer program product. Accordingly, the embodiments of this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the embodiments of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The embodiments of this specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The embodiments of this specification may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
The embodiments in this specification are described in a progressive manner. Identical or similar parts of the embodiments may be understood by reference to one another, and each embodiment focuses on its differences from the others. The system embodiments in particular are described relatively briefly because they are substantially similar to the method embodiments; for the relevant details, refer to the corresponding description of the method embodiments. In this specification, a description referring to "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of this specification. Such schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of those embodiments or examples.
The foregoing describes merely embodiments of this specification and is not intended to limit them. For those skilled in the art, the embodiments of this specification may be modified and varied in many ways. Any modification, equivalent replacement, or improvement made within the spirit and principle of the embodiments of this specification shall fall within the scope of the claims.

Claims (11)

1. A processing method for insurance business risk prediction, the method comprising:
obtaining target risk-associated data of a user to be predicted;
processing the target risk-associated data using a constructed risk prediction model, and outputting a risk prediction result for the user to be predicted, wherein the risk prediction model comprises a prediction model determined by training a gradient boosting decision tree with labeled risk-associated data.
2. The method of claim 1, wherein the risk-associated data comprises user feature data of at least one category, and the user feature data comprises data information having a non-linear relationship associated with the insurance business.
3. The method of claim 2, wherein the risk prediction model is obtained by training in the following manner:
determining the total number of decision trees and the threshold of the decision feature used by each decision tree at each branch, the decision feature being one of the categories of the user feature data;
when training on a group of risk-associated data, if the number of trained decision trees reaches a preset value or the residual of the decision trees satisfies a training stop condition, stopping the training on that group of risk-associated data and outputting the risk prediction value corresponding to the risk-associated data, the preset value being less than or equal to the total number;
adjusting the thresholds of the decision features of the corresponding decision trees according to the training results of the risk-associated data, and determining the risk prediction model once the adjusted thresholds satisfy the prediction-result output requirement of the risk prediction model.
4. The method of claim 3, wherein the total number of decision trees is determined based on the number of categories corresponding to the user feature data.
5. The method of any one of claims 1 to 4, wherein the risk prediction model is a vehicle insurance risk prediction model trained on risk-associated data associated with vehicle insurance business;
the risk prediction result comprises either of the following: the loss ratio of the user to be predicted, or a vehicle insurance risk score.
6. An insurance business risk prediction processing apparatus, comprising:
a prediction data acquisition module, configured to obtain target risk-associated data of a user to be predicted;
a risk prediction module, configured to process the target risk-associated data using a constructed risk prediction model and output a risk prediction result for the user to be predicted, wherein the risk prediction model comprises a prediction model determined by training a gradient boosting decision tree with labeled risk-associated data.
7. An insurance business risk prediction processing device, comprising a processor and a memory for storing processor-executable instructions, wherein the processor, when executing the instructions, implements:
obtaining target risk-associated data of a user to be predicted;
processing the target risk-associated data using a constructed risk prediction model, and outputting a risk prediction result for the user to be predicted, wherein the risk prediction model comprises a prediction model determined by training a gradient boosting decision tree with labeled risk-associated data.
8. The processing device of claim 7, wherein the risk-associated data comprises user feature data of at least one category, and the user feature data comprises data information having a non-linear relationship associated with the insurance business.
9. The processing device of claim 8, wherein the processor obtains the risk prediction model by training in the following manner:
determining the total number of decision trees and the threshold of the decision feature used by each decision tree at each branch, the decision feature being one of the categories of the user feature data;
when training on a group of risk-associated data, if the number of trained decision trees reaches a preset value or the residual of the decision trees satisfies a training stop condition, stopping the training on that group of risk-associated data, the preset value being less than or equal to the total number;
adjusting the thresholds of the decision features of the corresponding decision trees according to the training results of the risk-associated data, and determining the risk prediction model once the adjusted thresholds satisfy the prediction-result output requirement of the risk prediction model.
10. The processing device of claim 9, wherein the total number of decision trees is determined based on the number of categories corresponding to the user feature data.
11. The processing device of any one of claims 7 to 10, wherein the risk prediction model comprises a vehicle insurance risk prediction model trained on risk-associated data associated with vehicle insurance business;
the risk prediction result comprises either of the following: the loss ratio of the user to be predicted, or a vehicle insurance risk score.
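
The training procedure recited in claims 3 and 9 can be illustrated with a minimal sketch. It is not the applicant's implementation. The pure-Python code below assumes squared-error loss, depth-one trees (stumps), a fixed learning rate, and made-up toy data, and every name in it (`Stump`, `fit_stump`, `train_gbdt`, `predict`, `total_trees`, `max_trees`, `residual_tol`) is hypothetical. It shows only the two stopping rules: boosting halts when the number of trees reaches a preset value no greater than the total, or when the residual satisfies a stop condition.

```python
from dataclasses import dataclass


@dataclass
class Stump:
    feature: int      # index of the decision feature used at the branch
    threshold: float  # threshold of that decision feature
    left: float       # value returned when row[feature] <= threshold
    right: float      # value returned otherwise

    def predict(self, row):
        return self.left if row[self.feature] <= self.threshold else self.right


def fit_stump(X, residuals):
    """Choose the (feature, threshold) split with the lowest squared error."""
    best, best_err = None, float("inf")
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2  # midpoint between consecutive distinct values
            left = [r for row, r in zip(X, residuals) if row[f] <= thr]
            right = [r for row, r in zip(X, residuals) if row[f] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if err < best_err:
                best, best_err = Stump(f, thr, lm, rm), err
    return best  # None when no feature has two distinct values


def train_gbdt(X, y, total_trees=50, max_trees=20, learning_rate=0.5, residual_tol=1e-3):
    """Boost stumps on residuals; max_trees plays the role of the preset value."""
    assert max_trees <= total_trees  # preset value must not exceed the total
    base = sum(y) / len(y)           # initial prediction: mean of the labels
    preds = [base] * len(y)
    trees = []
    while len(trees) < max_trees:    # stop rule 1: tree count reaches preset value
        residuals = [t - p for t, p in zip(y, preds)]
        if max(abs(r) for r in residuals) < residual_tol:
            break                    # stop rule 2: residual meets the stop condition
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        trees.append(stump)
        preds = [p + learning_rate * stump.predict(row) for p, row in zip(preds, X)]
    return base, trees, learning_rate


def predict(model, row):
    base, trees, learning_rate = model
    return base + learning_rate * sum(t.predict(row) for t in trees)


# Toy labeled data (invented for illustration): [driver age, prior claims] -> loss ratio
X = [[22, 0], [25, 1], [40, 0], [45, 2], [60, 0], [30, 3]]
y = [0.8, 0.9, 0.4, 0.7, 0.3, 1.1]

model = train_gbdt(X, y)
score = predict(model, [28, 1])  # predicted loss ratio for a new user
```

In this sketch `total_trees` is simply a parameter; claims 4 and 10 tie the total to the number of feature categories, which the code does not attempt to model.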
CN201810469782.1A 2018-05-16 2018-05-16 Processing method, apparatus and device for risk prediction of insurance service Pending CN108665175A (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201810469782.1A CN108665175A (en) 2018-05-16 2018-05-16 Processing method, apparatus and device for risk prediction of insurance service
TW108105613A TW201947470A (en) 2018-05-16 2019-02-20 Processing method, apparatus and device for risk prediction of insurance service
PCT/CN2019/076524 WO2019218751A1 (en) 2018-05-16 2019-02-28 Processing method, apparatus and device for risk prediction of insurance service

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810469782.1A CN108665175A (en) 2018-05-16 2018-05-16 Processing method, apparatus and device for risk prediction of insurance service

Publications (1)

Publication Number Publication Date
CN108665175A (en) 2018-10-16

Family

ID=63779877

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810469782.1A Pending CN108665175A (en) 2018-05-16 2018-05-16 Processing method, apparatus and device for risk prediction of insurance service

Country Status (3)

Country Link
CN (1) CN108665175A (en)
TW (1) TW201947470A (en)
WO (1) WO2019218751A1 (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109377399A (en) * 2018-12-17 2019-02-22 泰康保险集团股份有限公司 Risk analysis method, medium and electronic equipment for insurance product risk control
CN109543909A (en) * 2018-11-27 2019-03-29 平安科技(深圳)有限公司 Prediction technique, device and the computer equipment of vehicle caseload
CN109657696A (en) * 2018-11-05 2019-04-19 阿里巴巴集团控股有限公司 Multitask supervised learning model training, prediction technique and device
CN109657852A (en) * 2018-12-12 2019-04-19 四川易小保网络科技有限公司 A kind of insurance business processing method and system based on big data
CN109784586A (en) * 2019-03-07 2019-05-21 上海赢科信息技术有限公司 The prediction technique and system of the situation of being in danger of vehicle insurance
CN109919783A (en) * 2019-01-31 2019-06-21 德联易控科技(北京)有限公司 Risk Identification Method, device, equipment and the storage medium of vehicle insurance Claims Resolution case
CN110163481A (en) * 2019-04-19 2019-08-23 深圳壹账通智能科技有限公司 Electronic device, user risk control auditing system test method and storage medium
CN110289098A (en) * 2019-05-17 2019-09-27 天津科技大学 A kind of Risk Forecast Method for intervening data based on clinical examination and medication
CN110348684A (en) * 2019-06-06 2019-10-18 阿里巴巴集团控股有限公司 Service call risk model generation method, prediction technique and respective device
CN110428137A (en) * 2019-07-04 2019-11-08 阿里巴巴集团控股有限公司 A kind of update method and device of risk prevention system strategy
CN110442712A (en) * 2019-07-05 2019-11-12 阿里巴巴集团控股有限公司 Determination method, apparatus, server and the text of risk try system
WO2019218751A1 (en) * 2018-05-16 2019-11-21 阿里巴巴集团控股有限公司 Processing method, apparatus and device for risk prediction of insurance service
CN111612640A (en) * 2020-05-27 2020-09-01 上海海事大学 Data-driven vehicle insurance fraud identification method
CN112330432A (en) * 2020-11-10 2021-02-05 中国平安人寿保险股份有限公司 Risk level recognition model training method, recognition method, terminal and storage medium
CN112785090A (en) * 2021-03-03 2021-05-11 中国工商银行股份有限公司 Model training method, type prediction method, device and computing equipment
CN113393331A (en) * 2021-06-10 2021-09-14 罗忠明 Database and algorithm based big data insurance precise risk control, management, intelligent customer service and marketing system
CN113449753A (en) * 2020-03-26 2021-09-28 中国电信股份有限公司 Service risk prediction method, device and system
CN113469584A (en) * 2021-09-02 2021-10-01 云账户技术(天津)有限公司 Risk management method and device for business service operation
CN116051296A (en) * 2022-12-28 2023-05-02 中国银行保险信息技术管理有限公司 Customer evaluation analysis method and system based on standardized insurance data

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111242793B (en) * 2020-01-16 2024-02-06 上海金仕达卫宁软件科技有限公司 Medical insurance data abnormality detection method and device
CN111813823A (en) * 2020-05-25 2020-10-23 泰康保险集团股份有限公司 Insurance service policy adjustment system, vehicle-mounted recording device and server
CN113822435B (en) * 2020-06-19 2023-11-03 腾讯科技(深圳)有限公司 Prediction method of user conversion rate and related equipment
CN111652717A (en) * 2020-07-07 2020-09-11 中国银行股份有限公司 Animal husbandry credit risk assessment method and device
CN112800071A (en) * 2020-08-24 2021-05-14 支付宝(杭州)信息技术有限公司 Service processing method, device, equipment and storage medium based on block chain
CN112330476A (en) * 2020-11-27 2021-02-05 中国人寿保险股份有限公司 Method and device for predicting group insurance business
CN112487475B (en) * 2020-11-30 2023-06-09 北京京航计算通讯研究所 Secret-related carrier risk analysis method and system
CN112581259B (en) * 2020-12-16 2023-09-19 同盾控股有限公司 Account risk identification method and device, storage medium and electronic equipment
CN112818389B (en) * 2021-01-26 2023-12-22 支付宝(杭州)信息技术有限公司 Data processing method, device and equipment based on privacy protection
CN112884215A (en) * 2021-02-02 2021-06-01 国网甘肃省电力公司信息通信公司 Parameter optimization method based on gradient enhancement tree population prediction model
CN113673844B (en) * 2021-08-04 2024-02-23 支付宝(杭州)信息技术有限公司 Information feedback method, device and equipment
CN113592606B (en) * 2021-08-10 2023-08-22 平安银行股份有限公司 Product recommendation method, device, equipment and storage medium based on multiple decisions
CN113723522B (en) * 2021-08-31 2023-06-16 平安科技(深圳)有限公司 Abnormal user identification method and device, electronic equipment and storage medium
CN113762621A (en) * 2021-09-09 2021-12-07 南京领行科技股份有限公司 Network taxi appointment driver departure prediction method and system
CN114154696A (en) * 2021-11-19 2022-03-08 中国建设银行股份有限公司 Method, system, computer device and storage medium for predicting fund flow
CN114860905A (en) * 2022-04-24 2022-08-05 支付宝(杭州)信息技术有限公司 Intention identification method, device and equipment
CN114969293A (en) * 2022-05-31 2022-08-30 支付宝(杭州)信息技术有限公司 Data processing method, device and equipment
CN115034888A (en) * 2022-06-16 2022-09-09 支付宝(杭州)信息技术有限公司 Credit service providing method and device
CN115146725B (en) * 2022-06-30 2023-05-30 北京百度网讯科技有限公司 Method for determining object classification mode, object classification method, device and equipment
CN115688130B (en) * 2022-10-17 2023-10-20 支付宝(杭州)信息技术有限公司 Data processing method, device and equipment
CN115953559B (en) * 2023-01-09 2024-04-12 支付宝(杭州)信息技术有限公司 Virtual object processing method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106126215A (en) * 2016-06-17 2016-11-16 深圳市麦斯杰网络有限公司 Business rule scenario generation method and device
CN106971343A (en) * 2016-01-13 2017-07-21 平安科技(深圳)有限公司 The risk analysis method and system of insurance data
CN107292528A (en) * 2017-06-30 2017-10-24 阿里巴巴集团控股有限公司 Vehicle insurance Risk Forecast Method, device and server

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106447383A (en) * 2016-08-30 2017-02-22 杭州启冠网络技术有限公司 Cross-time multi-dimensional abnormal data monitoring method and system
CN106503863A (en) * 2016-11-10 2017-03-15 北京红马传媒文化发展有限公司 Based on the Forecasting Methodology of the age characteristicss of decision-tree model, system and terminal
CN108009914A (en) * 2017-12-19 2018-05-08 马上消费金融股份有限公司 A kind of assessing credit risks method, system, equipment and computer-readable storage medium
CN108665175A (en) * 2018-05-16 2018-10-16 阿里巴巴集团控股有限公司 Processing method, apparatus and device for risk prediction of insurance service

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106971343A (en) * 2016-01-13 2017-07-21 平安科技(深圳)有限公司 The risk analysis method and system of insurance data
CN106126215A (en) * 2016-06-17 2016-11-16 深圳市麦斯杰网络有限公司 Business rule scenario generation method and device
CN107292528A (en) * 2017-06-30 2017-10-24 阿里巴巴集团控股有限公司 Vehicle insurance Risk Forecast Method, device and server

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019218751A1 (en) * 2018-05-16 2019-11-21 阿里巴巴集团控股有限公司 Processing method, apparatus and device for risk prediction of insurance service
CN109657696B (en) * 2018-11-05 2023-06-30 创新先进技术有限公司 Multi-task supervised learning model training and predicting method and device
CN109657696A (en) * 2018-11-05 2019-04-19 阿里巴巴集团控股有限公司 Multitask supervised learning model training, prediction technique and device
CN109543909A (en) * 2018-11-27 2019-03-29 平安科技(深圳)有限公司 Prediction technique, device and the computer equipment of vehicle caseload
CN109543909B (en) * 2018-11-27 2023-04-18 平安科技(深圳)有限公司 Method and device for predicting number of vehicle cases and computer equipment
CN109657852A (en) * 2018-12-12 2019-04-19 四川易小保网络科技有限公司 A kind of insurance business processing method and system based on big data
CN109657852B (en) * 2018-12-12 2023-09-12 上海豹云网络信息服务有限公司 Insurance business processing method and system based on big data
CN109377399A (en) * 2018-12-17 2019-02-22 泰康保险集团股份有限公司 Risk analysis method, medium and electronic equipment for insurance products air control
CN109919783A (en) * 2019-01-31 2019-06-21 德联易控科技(北京)有限公司 Risk Identification Method, device, equipment and the storage medium of vehicle insurance Claims Resolution case
CN109784586B (en) * 2019-03-07 2023-08-29 上海赢科信息技术有限公司 Prediction method and system for danger emergence condition of vehicle danger
CN109784586A (en) * 2019-03-07 2019-05-21 上海赢科信息技术有限公司 The prediction technique and system of the situation of being in danger of vehicle insurance
CN110163481A (en) * 2019-04-19 2019-08-23 深圳壹账通智能科技有限公司 Electronic device, user's air control auditing system test method and storage medium
CN110289098A (en) * 2019-05-17 2019-09-27 天津科技大学 A kind of Risk Forecast Method for intervening data based on clinical examination and medication
CN110289098B (en) * 2019-05-17 2022-11-25 天津科技大学 Risk prediction method based on clinical examination and medication intervention data
CN110348684A (en) * 2019-06-06 2019-10-18 阿里巴巴集团控股有限公司 Service call risk model generation method, prediction technique and respective device
CN110428137A (en) * 2019-07-04 2019-11-08 阿里巴巴集团控股有限公司 A kind of update method and device of risk prevention system strategy
CN110442712A (en) * 2019-07-05 2019-11-12 阿里巴巴集团控股有限公司 Determination method, apparatus, server and the text of risk try system
CN110442712B (en) * 2019-07-05 2023-08-22 创新先进技术有限公司 Risk determination method, risk determination device, server and text examination system
CN113449753B (en) * 2020-03-26 2024-01-02 天翼云科技有限公司 Service risk prediction method, device and system
CN113449753A (en) * 2020-03-26 2021-09-28 中国电信股份有限公司 Service risk prediction method, device and system
CN111612640A (en) * 2020-05-27 2020-09-01 上海海事大学 Data-driven vehicle insurance fraud identification method
CN112330432B (en) * 2020-11-10 2024-03-15 中国平安人寿保险股份有限公司 Risk level identification model training method, risk level identification method, terminal and storage medium
CN112330432A (en) * 2020-11-10 2021-02-05 中国平安人寿保险股份有限公司 Risk level recognition model training method, recognition method, terminal and storage medium
CN112785090A (en) * 2021-03-03 2021-05-11 中国工商银行股份有限公司 Model training method, type prediction method, device and computing equipment
CN113393331A (en) * 2021-06-10 2021-09-14 罗忠明 Database and algorithm based big data insurance accurate wind control, management, intelligent customer service and marketing system
CN113469584B (en) * 2021-09-02 2021-11-16 云账户技术(天津)有限公司 Risk management method and device for business service operation
CN113469584A (en) * 2021-09-02 2021-10-01 云账户技术(天津)有限公司 Risk management method and device for business service operation
CN116051296A (en) * 2022-12-28 2023-05-02 中国银行保险信息技术管理有限公司 Customer evaluation analysis method and system based on standardized insurance data
CN116051296B (en) * 2022-12-28 2023-09-29 中国银行保险信息技术管理有限公司 Customer evaluation analysis method and system based on standardized insurance data

Also Published As

Publication number Publication date
TW201947470A (en) 2019-12-16
WO2019218751A1 (en) 2019-11-21

Similar Documents

Publication Publication Date Title
CN108665175A (en) Processing method, apparatus and device for risk prediction of insurance service
CN108694673A (en) Processing method, apparatus and device for risk prediction of insurance service
TWI746814B (en) Computer readable medium, car insurance risk prediction device and server
CN110363449B (en) Risk identification method, device and system
US20210065058A1 (en) Method, apparatus, device and readable medium for transfer learning in machine learning
CN110287477A (en) Entity emotion analysis method and relevant apparatus
CN108334647A (en) Data processing method, device, equipment and the server of Insurance Fraud identification
CN110033382B (en) Insurance service processing method, device and equipment
CN110347971A (en) Particle filter method, device and storage medium based on TSK fuzzy model
CN109871809A (en) A kind of machine learning process intelligence assemble method based on semantic net
CN109582774A (en) Natural language classification method, device, equipment and storage medium
CN109255629A (en) A kind of customer grouping method and device, electronic equipment, readable storage medium storing program for executing
Berenguer et al. Models of artificial neural networks applied to demand forecasting in nonconsolidated tourist destinations
CN109583473A (en) A kind of generation method and device of characteristic
Xue et al. Research and prediction of Shanghai-Shenzhen 20 index based on the support vector machine model and gradient boosting regression tree
US20150134306A1 (en) Creating understandable models for numerous modeling tasks
Chaturvedi Soft computing techniques and their applications
CN108898227A (en) Learning rate calculation method and device, disaggregated model calculation method and device
Priore et al. Dynamic scheduling of flexible manufacturing systems using neural networks and inductive learning
CN103198357A (en) Optimized and improved fuzzy classification model construction method based on nondominated sorting genetic algorithm II (NSGA- II)
Guzika et al. Organisation of Fuzzy Cognitive Maps Considering Real Parameters of Simulated Systems
Thu et al. Multi-step Ahead Wind Speed Forecasting Based on a Bi-LSTM Network Combined with Decomposition Technique
Kravchenko et al. Information and knowledge integration based on simulation modeling
Wang et al. A Novel Grey Residual Modification Model Using Neural Networks.
CN112036641B (en) Artificial intelligence-based retention prediction method, apparatus, computer device and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20201022

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20201022

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: Greater Cayman, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20181016