Detailed Description of Embodiments
To enable those skilled in the art to better understand the technical solutions in this specification, the technical solutions in the embodiments of this specification are described clearly and completely below with reference to the accompanying drawings of those embodiments. The described embodiments are evidently only some, not all, of the embodiments of this specification. All other embodiments obtained by a person of ordinary skill in the art on the basis of one or more embodiments of this specification, without creative effort, shall fall within the scope of protection of the embodiments of this specification.
With the development of computer and internet technology, data volumes have grown rapidly. The data features used in insurance business risk prediction have also become increasingly multi-dimensional and fine-grained. Many variables influence screening and classification in a non-linear way. For example, online duration and age are correlated, but the correlation can take many forms. It may be a simple linear relationship, e.g. each 1 percentage point decrease in online duration corresponds to a 1 year increase in age. It may also be a more complex relationship, such as an exponential one, e.g. a 4 percentage point decrease in online duration corresponds to a 2 year increase in age; such a relationship can be converted into a linear one by a suitable mathematical transformation and then solved with a generalized linear model. In practice, besides variables with basically linear relationships, there are also a large number of non-linear variables. For example, when predicting age, "age" does not vary with online duration alone but is also related to the shopping and other habits of the population, and different consumption habits, as they themselves change, alter the age distribution through non-linear effects. If predicting "user age" is the primary goal and a GLM model cannot identify these non-linear relationships, the predictive performance of the model is greatly reduced. In existing solutions, variables can be segmented and summarized by binning, but this loses much of the precision of the variables and degrades the prediction result. The embodiments of this specification provide an implementation of insurance business risk prediction that differs from existing conventional implementations. It introduces GBDT (Gradient Boosting Decision Tree), so that non-linear variables can be used reasonably and effectively to build the risk prediction model. The model is compatible with both linear and non-linear variables, and the accuracy of its prediction results is significantly improved relative to a traditional linear model.
GBDT (Gradient Boosting Decision Tree) is an iterative decision tree algorithm. It consists of multiple decision trees, and the conclusions of all the trees are summed to give the final result. The trees in GBDT are all regression trees and can be used for regression prediction. In the processing method for insurance business risk prediction provided by this specification, a decision tree model can be built in advance from labeled risk-associated data, and the parameters of the decision trees are gradually adjusted and optimized through regression-based machine learning (step-wise iteration). When the prediction results of the model meet the accuracy requirements of insurance business risk prediction, the model can be used online to predict the risk value, loss ratio, or the like of a user to be predicted.
The embodiments of this specification are described below taking a specific vehicle insurance business risk prediction scenario as an example. Specifically, Fig. 1 is a schematic flowchart of an embodiment of the processing method for insurance business risk prediction provided by this specification. Although this specification provides method operation steps or apparatus structures as shown in the following embodiments or drawings, the method or apparatus may include more operation steps or module units, or fewer after partial merging, based on routine practice and without creative effort. For steps or structures that have no logically necessary causal relationship, the execution order of the steps or the module structure of the apparatus is not limited to the execution order or module structure shown in the embodiments or drawings of this specification. When the method or module structure is applied in an actual apparatus, server, or terminal product, it may be executed sequentially or in parallel according to the method or module structure shown in the embodiments or drawings (for example, in an environment with parallel processors or multi-threaded processing, or even an implementation environment including distributed processing or a server cluster).
Of course, the following description of the vehicle insurance business risk prediction embodiment does not limit other technical solutions that can be extended from this specification. For example, the embodiments provided by this specification are equally applicable to other implementation scenarios such as fund risk prediction and medical insurance risk prediction. For applications in those scenarios, refer to the description of the vehicle insurance embodiment in this specification, which is not repeated for each alternative. A specific embodiment is shown in Fig. 1. A processing method for insurance business risk prediction provided by this specification may include:
S0: Acquire target risk-associated data of a user to be predicted;
S2: Process the target risk-associated data using a constructed risk identification algorithm, and output a risk prediction result for the user to be predicted, where the risk identification algorithm includes a risk prediction model determined by training a gradient boosting decision tree with labeled risk-associated data.
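Steps S0 and S2 can be sketched roughly as follows. This is a minimal illustration rather than the implementation of this specification: the names `acquire_target_data`, `predict_risk`, `store`, and the feature keys are hypothetical, and the two hand-written trees merely stand in for a trained model with made-up thresholds and leaf values.

```python
from typing import Callable, Dict, List

Features = Dict[str, float]
Tree = Callable[[Features], float]


def acquire_target_data(user_id: str, store: Dict[str, Features]) -> Features:
    """S0: fetch the target risk-associated data of the user to be predicted."""
    return store[user_id]


def predict_risk(features: Features, trees: List[Tree]) -> float:
    """S2: a GBDT-style prediction is the sum of the outputs of all trees."""
    return sum(tree(features) for tree in trees)


# Two hand-written one-split trees standing in for a trained model.
# The thresholds and leaf values are illustrative assumptions only.
trees: List[Tree] = [
    lambda f: 60.0 if f["age"] > 24 else 80.0,
    lambda f: -5.0 if f["annual_income"] > 100_000 else 5.0,
]

store = {"user_a": {"age": 30.0, "annual_income": 120_000.0}}
score = predict_risk(acquire_target_data("user_a", store), trees)
print(score)  # 55.0
```

Representing each tree as a plain function keeps the sketch independent of any particular GBDT library.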
In one or more embodiments of this specification, the GBDT-based risk prediction model can be built in advance. The model structure and parameters can be set according to the actual business scenario and the training and data requirements of the GBDT model. For example, single trees can be trained individually, with the residual of one trained tree serving as the input for training the next tree; or multiple trees connected in multiple stages can be trained together, with the training residual then serving as the input of another multi-stage group of trees. Of course, other embodiments may also use processing algorithms derived from GBDT through variation, transformation, or improvement to perform non-linear risk prediction on insurance business data. This specification does not describe every way of constructing a GBDT model.
In this embodiment, the training data for determining the risk prediction model can be collected in advance from historical vehicle insurance policy data and labeled according to risk classification or specified requirements. In the insurance business risk prediction scenario of this embodiment, the training data may be called risk-associated data. Such data is usually associated with the insurance business and is used as training samples for the risk prediction model. For example, the risk-associated data may be user feature data with multiple dimensions, where the user feature data associated with one user forms one group of training data, and each group of risk-associated data can be labeled with a corresponding risk score. Specifically, in one embodiment of the method, the risk-associated data may include user feature data of at least one category, and the user feature data includes data that has a non-linear relationship with the insurance business. In one example, the risk-associated data of user A may include user feature data in 9 dimensions (A1, A2, A3, ..., A9). The dimensions can be chosen according to the needs of vehicle insurance prediction. The 9 dimensions in this example may include age, gender, occupation, annual income, number of past claims, average monthly spending, credit rating, marital status, and debts and assets. Alternatively, user feature data in 10 or more dimensions can be collected in advance, and the user feature data needed for model training is selected from these dimensions when the risk-associated data is determined. For example, specific risk-associated data may be as shown in Table 1 below:
Table 1: Example of risk-associated data for model training
Of course, in other embodiments the risk-associated data may also include artificial data generated according to predetermined rules. For example, operators may define custom risk-associated data for model training that covers expected risk situations. Alternatively, the required risk-associated data may be generated automatically by a computer according to configured data generation rules. Artificial data generated in this way fits expected risk prediction situations more closely, while historical vehicle insurance case data is closer to real risk situations. In some implementation scenarios either kind may be used alone, or artificial data and historical vehicle insurance case data may be combined to train the risk prediction model and improve the accuracy of the prediction results.
The acquired risk-associated data can be used as training data to train the GBDT model. After training, the thresholds (all or some of them) of the decision features at the branches of the decision trees in the risk prediction model enable the final model output to meet the accuracy requirement (usually it is also required that the output meets the accuracy requirement continuously and stably). The GBDT used in the embodiments of this specification is an iterative decision tree algorithm that can be divided into two parts: the decision tree (Regression Decision Tree, DT) and gradient boosting (Gradient Boosting, GB). Decision trees fall into two main classes, classification trees and regression trees. Classification trees are commonly used for classification problems, such as a user's gender, whether a web page is spam, or whether a user is cheating. Regression trees are generally used to predict real values, such as a user's age, the probability that a user clicks, or the relevance of a web page. The former predicts class labels and the latter predicts real numbers. It should be emphasized that adding and subtracting regression tree results is meaningful, e.g. 10 years + 5 years - 3 years = 12 years, whereas classification tree results cannot be accumulated, or the accumulated result is meaningless, e.g. male + male + female: is that male or female? The embodiments of this specification can use regression trees to predict the vehicle insurance risk score, for example by summing the results of all trees as the final risk prediction result.
The overall flow of a regression tree is similar to that of a classification tree. The difference is that every node of a regression tree yields a predicted value. Taking age as an example, the predicted value equals the average age of all people belonging to that node. When branching, every threshold of every feature is enumerated exhaustively to find the best split variable and best split point. The criterion in this embodiment is no longer the Gini coefficient used in classification trees but minimization of the squared error: the more people whose prediction is wrong, the larger the squared error, so minimizing the squared error finds the most reliable basis for branching. Branching continues until the ages of the people on each leaf node are all identical or a preset termination condition is reached (such as an upper limit on the number of leaves). If the ages on a leaf node are still not identical at that point, the average age of all people on that node is used as the prediction of that leaf node.
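The split search described above can be sketched as follows. This is a minimal illustration under simplifying assumptions: it searches a single numeric feature only, tries midpoints between adjacent values as candidate thresholds, and uses hypothetical names (`squared_error`, `best_split`) and made-up monthly spend figures.

```python
from typing import List, Tuple


def squared_error(values: List[float]) -> float:
    """Sum of squared deviations from the mean, i.e. the error of a leaf
    that predicts the average of the samples it contains."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Return (threshold, total squared error) of the best split x <= threshold."""
    best = (float("nan"), float("inf"))
    candidates = sorted(set(xs))
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        err = squared_error(left) + squared_error(right)
        if err < best[1]:
            best = (threshold, err)
    return best


# Hypothetical monthly shopping spend versus age for four people.
spend = [200.0, 300.0, 1500.0, 2000.0]
ages = [14.0, 16.0, 24.0, 26.0]
threshold, err = best_split(spend, ages)
print(threshold, err)  # 900.0 4.0
```

A full regression tree would repeat this search over every feature and apply it recursively to each resulting node until a termination condition is met.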
Gradient boosting is a machine learning technique for regression, classification, and ranking tasks, and is part of the Boosting family of algorithms. Boosting is a family of algorithms that can promote weak learners into a strong learner and belongs to the field of ensemble learning. Boosting methods are based on the following idea: for a complex task, a judgment obtained by appropriately combining the judgments of multiple experts is better than the judgment of any single one of them. Put plainly, it is the principle of "three cobblers with their wits combined equal Zhuge Liang". Like other boosting methods, gradient boosting builds the final prediction model by integrating (ensembling) multiple weak learners, typically decision trees. Boosting methods build the model through stage-wise iteration, and the weak learner built at each iteration step is meant to make up for the shortcomings of the existing model.
For example, in a specific processing procedure, the number of trees can be set for training, and training can stop once the number of trees reaches a specified value (such as 80), or once the residual is very small (meeting the stop condition). Training can stop as soon as either of these two conditions is met.
If the residuals of the N-th tree are not all 0 or do not meet the stop condition, the residuals at the nodes of the N-th tree replace the corresponding initial values and are fed into the (N+1)-th tree for learning;
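This continues until the predicted values at the leaf nodes of the (N+K)-th tree equal the residuals, or the remaining residual is less than a threshold, at which point the risk prediction result (risk value or loss ratio) corresponding to the current leaf node is output. Specifically, the outputs of all trees (the prediction of the first tree plus the residuals fitted by the subsequent trees) can be accumulated as the final risk prediction value.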
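The residual iteration and the two stop conditions above can be sketched as follows. This is a minimal illustration, not the training procedure of this specification: each tree is restricted to a single split (a "stump"), no learning rate is applied, the names `fit_stump`, `train`, and `predict` are hypothetical, and the four samples (age, annual income in thousands, labeled risk score) are made up so that the arithmetic is easy to follow.

```python
from typing import List, Optional, Sequence, Tuple

# One-split tree: (feature index, threshold, left leaf value, right leaf value).
Stump = Tuple[int, float, float, float]


def fit_stump(rows: Sequence[Sequence[float]], targets: Sequence[float]) -> Stump:
    """Search every feature and midpoint threshold for the split that
    minimizes the total squared error of the two leaf means."""
    best: Optional[Stump] = None
    best_err = float("inf")
    for j in range(len(rows[0])):
        values = sorted({row[j] for row in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, targets) if row[j] <= t]
            right = [y for row, y in zip(rows, targets) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
            if err < best_err:
                best, best_err = (j, t, lm, rm), err
    if best is None:
        raise ValueError("need at least two distinct values in some feature")
    return best


def stump_output(stump: Stump, row: Sequence[float]) -> float:
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value


def train(rows, labels, max_trees: int = 80, tol: float = 1e-9) -> List[Stump]:
    """Stop when the tree count reaches max_trees or all residuals are within tol."""
    stumps: List[Stump] = []
    residuals = list(labels)  # the first tree learns the labels themselves
    while len(stumps) < max_trees and max(abs(r) for r in residuals) > tol:
        stump = fit_stump(rows, residuals)
        stumps.append(stump)
        # The residuals replace the targets for the next tree.
        residuals = [r - stump_output(stump, row) for row, r in zip(rows, residuals)]
    return stumps


def predict(stumps: List[Stump], row: Sequence[float]) -> float:
    """The final prediction is the accumulated output of all trees."""
    return sum(stump_output(s, row) for s in stumps)


rows = [(22.0, 50.0), (23.0, 150.0), (35.0, 60.0), (40.0, 200.0)]
labels = [85.0, 75.0, 65.0, 55.0]
model = train(rows, labels)
print(len(model), [predict(model, r) for r in rows])
# 2 [85.0, 75.0, 65.0, 55.0]
```

On this data the first stump splits on age and the second on income, after which every residual is 0 and training stops well before the 80-tree cap.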
Fig. 2 is a schematic diagram of a process for building the risk prediction model provided by this specification. As shown in Fig. 2, in another embodiment of the method provided by this specification, the risk prediction model is obtained by training in the following manner:
S20: Determine the total number of decision trees to be used and the threshold of the decision feature at each branch of the decision trees, where each decision feature is one of the categories of the user feature data;
S22: When training on a group of risk-associated data, if the number of trained decision trees reaches a preset value or the residual of the decision trees meets the stop condition, stop training on that group of risk-associated data, where the preset value is less than or equal to the total number;
S24: Adjust the thresholds of the decision features of the corresponding decision trees according to the training results on the risk-associated data, and determine the risk prediction model once the adjusted thresholds meet the prediction output requirements of the risk prediction model.
In this embodiment, the number of decision trees used for training can be determined in advance, and the thresholds of the decision features at the branches of the decision trees are gradually optimized through gradient iteration. For example, 80 decision trees can be used, and each tree learns the residual of the sum of the conclusions of all preceding trees. The initial values of the thresholds can be set from experience. Suppose the true score of A (the labeled score) is 80 points, but the first tree, whose decision feature is age, predicts 60 points; the difference is 20 points, so the residual is 20. In the second tree (whose decision feature is the user's occupation), the target for A is then set to 20 points for learning. If the second tree really assigns A to a leaf node of 20 points, then the sum of the conclusions of the two trees is the true score of A (predicted score 60 + residual 20). If the conclusion of the second tree is 18 points, A still has a residual of 2 points, and in the third tree (whose decision feature is annual income) the target for A becomes 2 points and learning continues. The residual computation at each step in effect increases the weight of wrongly predicted samples, while samples that are already predicted correctly tend toward 0. For example, suppose that the more extreme the age (too high or too low) the greater the risk, and the higher the income the smaller the risk. If a user is past 60 years old but has been placed in the lower-risk branch L1, whose average age is between 20 and 40, then the resulting residual increases accordingly, and the user can subsequently be moved step by step, through income, marital status, driving experience, and so on, to a leaf node close to the actual risk.
If the number of trained decision trees reaches the predetermined value, for example all 10 trees have been trained from root node to leaf nodes, or the parameters of the current training round meet the stop condition, for example the residual is 0 or reaches another residual stop threshold, training on this group of data can stop. When each threshold has found its best split point, or a split point that meets the training requirements, the thresholds of the decision features of the decision trees can be determined. Once the adjusted thresholds meet the prediction output requirements of the risk prediction model, the risk prediction model is determined. For example, the threshold that initially divides the risk score into 60 and 80 may be whether the age is greater than 20. After training and optimization on a large amount of data, this decision feature for risk assessment along the age dimension may finally be adjusted to whether the age is greater than 24, so as to match the true results in most cases.
A simple age prediction example is used below to illustrate how decision tree training with GBDT is implemented in the embodiments of this specification. When the example is applied to insurance business risk prediction in this specification, age is replaced by the vehicle insurance risk score or loss ratio, monthly shopping spend and whether the person often asks questions are replaced by the corresponding categories of user feature data, and the thresholds are set accordingly. The specific implementation process may include:
Suppose the training set (risk-associated data) contains only 4 people, A, B, C, and D, whose ages are 14, 16, 24, and 26 respectively. A and B are a first-year and a third-year senior high school student; C and D are a fresh graduate and an employee who has worked for two years. Rather than training a single traditional regression tree, GBDT is used here for age prediction. Because there is so little data, each tree is limited to at most two leaf nodes, i.e. one branch per tree, and only two trees are learned. This gives the result shown in Fig. 3. At the branch of the first tree, since the ages of A and B are close and the ages of C and D are close, they are divided into two groups, and the average age of each group is used as the predicted value. The residuals are then calculated (residual of A = actual value of A - predicted value of A), so the residual of A is 14 - 15 = -1 (note that the predicted value of A is the accumulated sum of all preceding trees; here there is only one preceding tree, so it is simply 15, and if there were more trees their outputs would all have to be summed to give the predicted value of A). The residuals of A, B, C, and D are thus -1, 1, -1, and 1 respectively. The residuals then replace the original values of A, B, C, and D and are passed to the second tree for learning. If the predicted values of the second tree equal these residuals, adding the conclusion of the second tree to that of the first tree gives the true ages. The second tree only has to fit the two values 1 and -1, so it splits directly into two nodes. At this point everyone's residual is 0, i.e. everyone obtains the true value as the prediction.
After processing by the two trees, the predicted values of A, B, C, and D are all consistent with their true ages:
A: 14-year-old first-year senior high school student; shops less and often asks schoolmates questions; predicted age A = 15 - 1 = 14;
B: 16-year-old third-year senior high school student; shops less and is often asked questions by younger schoolmates; predicted age B = 15 + 1 = 16;
C: 24-year-old fresh graduate; shops more and often asks senior colleagues questions; predicted age C = 25 - 1 = 24;
D: 26-year-old employee who has worked for two years; shops more and is often asked questions by junior colleagues; predicted age D = 25 + 1 = 26.
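The arithmetic of this four-person example can be checked with a short script. The two trees are written out by hand rather than learned, and the field names `shops_more` and `asks_questions` are illustrative encodings of the two features in the example, not names used by this specification.

```python
# The four people of the example, with the two features encoded as booleans.
people = {
    "A": {"age": 14, "shops_more": False, "asks_questions": True},
    "B": {"age": 16, "shops_more": False, "asks_questions": False},
    "C": {"age": 24, "shops_more": True, "asks_questions": True},
    "D": {"age": 26, "shops_more": True, "asks_questions": False},
}


def tree1(person: dict) -> int:
    """First tree: split on shopping; each leaf predicts its group's mean age."""
    return 25 if person["shops_more"] else 15


def tree2(person: dict) -> int:
    """Second tree: split on question behavior; each leaf predicts a residual."""
    return -1 if person["asks_questions"] else 1


# Residual = actual value - predicted value of the first tree.
residuals = {name: p["age"] - tree1(p) for name, p in people.items()}

# The final prediction is the sum of the conclusions of both trees.
predictions = {name: tree1(p) + tree2(p) for name, p in people.items()}

print(residuals)    # {'A': -1, 'B': 1, 'C': -1, 'D': 1}
print(predictions)  # {'A': 14, 'B': 16, 'C': 24, 'D': 26}
```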
In another embodiment, the number of decision trees used by the risk prediction model can be determined from the number of categories of the user feature data. For example, if user feature data in 80 dimensions is selected, each dimension can serve as the decision feature of one tree, so a non-linear risk prediction model can be built with 80 decision trees. In general, one dimension can also be set to correspond to multiple trees, depending on the amount of data the prediction model handles and the processing requirements of the application scenario. Of course, in other embodiments of this specification the total number of decision trees can be determined from the collected data, the number of branches of the trees, the connection relationships between upper-level and lower-level trees, and so on.
As mentioned above, the embodiments provided by this specification can be used not only in the vehicle insurance business risk prediction scenario but also in scenarios such as fund risk prediction and medical insurance risk prediction. Specifically, in the application scenario of vehicle insurance business risk prediction, the risk prediction model includes a vehicle insurance risk prediction model obtained by training on risk-associated data associated with the vehicle insurance business;
S26: The risk prediction result includes either the loss ratio or the vehicle insurance risk score of the user to be predicted.
Of course, the loss ratio and vehicle insurance risk score described above are only one way in which the non-linear risk prediction model of one or more embodiments may characterize its output. This specification does not exclude other characterizations in other embodiments, or characterizations obtained by varying or transforming the loss ratio or vehicle insurance risk score. For example, a vehicle insurance score can be obtained from the loss ratio by a linear transformation; the larger the vehicle insurance score, the smaller the risk (the opposite of the vehicle insurance risk score, for which a larger score means higher risk).
It should be noted that a linear relationship usually means that two variables are related by a first-degree function. The linear relationship between variables in insurance or vehicle insurance described in the embodiments of this specification may take the form y = ax + b, where x is the independent variable and y is the dependent variable. In specific insurance or vehicle insurance application scenarios, the linear relationship can be understood broadly as a relationship between two variables that is definite and fixed, and that in some cases can be expressed by a straight line or converted into a linear relationship after a certain mathematical transformation (with the information loss of the conversion kept within a certain range). A non-linear relationship mainly refers to a relationship between variables that keeps changing and cannot be described by a single formula. In some cases it can only be represented by a curve, a curved surface, or an irregular line, such as the relationship between risk score and occupation or between risk score and gender.
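As one illustration of the mathematical transformation mentioned above, assume purely for the sake of example an exponential relationship with constants c > 0 and k; neither the form nor the constants are taken from this specification. Taking logarithms gives a relationship of the form y' = ax + b:

```latex
y = c\,e^{kx}
\quad\Longrightarrow\quad
\ln y = \ln c + kx
\quad\Longrightarrow\quad
y' = ax + b, \qquad y' = \ln y,\; a = k,\; b = \ln c .
```

A relationship such as risk score versus occupation has no transformation of this kind, which is why it is treated as non-linear.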
In one or more embodiments of this specification, the risk prediction model may be built offline in advance: training data containing non-linear relationships is selected beforehand to train the GBDT decision trees, and the model is used online after training is complete. This specification does not exclude building, updating, or maintaining the risk prediction model online. For example, where computing capacity is sufficient, the risk prediction model can be built online, and the model so built can be used online at the same time to process the target risk-associated data to be predicted.
In the processing method for insurance business risk prediction provided by the embodiments of this specification, a gradient boosting decision tree can be used in advance to build the risk prediction model, which can be trained with labeled risk-associated data associated with the insurance business. Once training of the risk prediction model meets the prediction requirements, it can be used for online risk prediction, performing insurance business risk prediction for the user to be predicted and outputting a prediction result. The method provided by the embodiments of this specification makes reasonable and effective use of multi-dimensional non-linear variables in the insurance business. The non-linear risk prediction model based on the gradient boosting decision tree is compatible with both linear and non-linear variables. Relative to a traditional linear model, the accuracy of the prediction results is significantly improved, which effectively makes up for the shortcomings of traditional linear models and improves the insurance service experience.
The method described above can be used for risk identification on the client side, for example insurance business risk assessment provided in a payment application on a mobile terminal. The client may be a PC (personal computer), server, industrial control computer, smartphone, tablet, portable computer (such as a laptop), personal digital assistant (PDA), desktop computer, smart wearable device, handheld device, mobile communication terminal, in-vehicle device, television device, computing device, or the like. The method can also be applied in a system server of an insurance company or a third-party insurance service provider. The system server may include a single server, a server cluster, a distributed system server, or a combination of a server that handles device request data and system servers that process other associated data. For example, one implementation may be built on the Alibaba Cloud Open Data Processing Service (ODPS) platform, which can provide unified programming interfaces and user interfaces for data processing tasks arising from different user needs. With system performance backed by ODPS, a system implementing the method of the embodiments of this specification can process massive data in parallel and achieve optimal computing performance.
As mentioned above, the method embodiments provided by this specification can be executed on a mobile terminal, a computer terminal, a server, or a similar computing apparatus. Taking execution on a server as an example, Fig. 4 is a hardware block diagram of a server to which the insurance business risk prediction processing method provided by this specification is applied. As shown in Fig. 4, the server 10 may include one or more (only one is shown) processors 102 (the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission module 106 for communication. A person of ordinary skill in the art will understand that the structure shown in Fig. 4 is only illustrative and does not limit the structure of the above electronic apparatus. For example, the server 10 may include more or fewer components than shown in Fig. 4, may include other processing hardware such as a database or multi-level cache, or may have a configuration different from that shown in Fig. 4.
The memory 104 can be used to store software programs and modules of application software, such as the program instructions/modules corresponding to the insurance business risk prediction processing method in the embodiments of this specification. By running the software programs and modules stored in the memory 104, the processor 102 executes various functional applications and data processing, i.e. implements the insurance business risk prediction processing method described above. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which can be connected to the server 10 through a network. Examples of such networks include, but are not limited to, the internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The transmission module 106 is used to receive or send data via a network. A specific example of the network may include a wireless network provided by the communication provider of the server 10. In one example, the transmission module 106 includes a network adapter (Network Interface Controller, NIC) that can connect to other network devices through a base station and thereby communicate with the internet. In one example, the transmission module 106 may be a radio frequency (RF) module used to communicate with the internet wirelessly.
Based on the insurance business risk prediction processing method described above, this specification also provides an insurance business risk prediction processing apparatus. The apparatus may include systems (including distributed systems), software (applications), modules, components, servers, clients, and the like that use the method of the embodiments of this specification, combined with the necessary hardware. Based on the same inventive concept, the processing apparatus in an embodiment provided by this specification is described in the following embodiments. Because the apparatus solves the problem in a way similar to the method, the implementation of the specific processing apparatus can refer to the implementation of the foregoing method, and repeated details are omitted. Although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware or in a combination of software and hardware is also possible and contemplated. Specifically, Fig. 5 is a schematic diagram of the module structure of an embodiment of an insurance business risk prediction processing apparatus provided by this specification. As shown in Fig. 5, the apparatus may include:
A prediction data acquisition module 201, which can be used to acquire target risk-associated data of a user to be predicted;
Risk profile module 202 can be used for the risk forecast model using structure to the target risk associated data
It is handled, exports the risk profile of the user to be predicted as a result, the risk forecast model method includes:Utilize mark
Risk association data promote decision tree to gradient and are trained determining prediction model.
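The two-module structure above can be illustrated with a minimal sketch. It assumes an in-memory data source and a placeholder scoring function standing in for the trained gradient boosting decision tree; the class names, the `store` field, the feature names, and `toy_model` are all hypothetical and are not taken from this specification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Mapping

Features = Mapping[str, float]


@dataclass
class PredictionDataAcquisitionModule:
    """Counterpart of module 201: obtains a user's target risk-associated data."""
    store: Dict[str, Features]  # hypothetical in-memory source keyed by user id

    def acquire(self, user_id: str) -> Features:
        return self.store[user_id]


@dataclass
class RiskPredictionModule:
    """Counterpart of module 202: applies an already constructed risk model."""
    model: Callable[[Features], float]  # stands in for a trained GBDT

    def predict(self, features: Features) -> float:
        return self.model(features)


def toy_model(features: Features) -> float:
    # Placeholder scoring rule for illustration only, not a trained model.
    return 0.1 + (0.3 if features["online_hours"] > 5 else 0.0)


acquisition = PredictionDataAcquisitionModule(
    {"u1": {"online_hours": 7.0, "age": 30.0}}
)
prediction = RiskPredictionModule(toy_model)
result = prediction.predict(acquisition.acquire("u1"))
```

Keeping acquisition and prediction behind separate interfaces mirrors the module division in FIG. 5 and would allow either side to be replaced independently.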
It should be noted that the apparatus described above in the embodiments of this specification may also include other implementations according to the description of the related method embodiments. For specific implementations, refer to the description of the method embodiments; they are not repeated here one by one.
The server or client provided by the embodiments of this specification may be implemented by a processor executing corresponding program instructions in a computer, for example, implemented on a PC or server using the C++ language on a Windows operating system, implemented using the necessary hardware together with the application design language corresponding to another system such as Linux, or implemented based on the processing logic of a quantum computer. The processing device may specifically be a server with which an insurance server or a third-party service organization provides risk prediction services. The server may be a single server, a server cluster, a distributed system server, or a combination of a server that handles processing device requests for data and a system server that handles other associated data. This specification also provides an insurance business risk prediction processing device, which may specifically include a processor and a memory for storing processor-executable instructions, where the processor, when executing the instructions, implements:
obtaining target risk-associated data of a user to be predicted; and
processing the target risk-associated data using a constructed risk prediction model and outputting a risk prediction result for the user to be predicted, where the risk prediction model includes a prediction model determined by training a gradient boosting decision tree with labeled risk-associated data.
As described in the foregoing method embodiments, in another embodiment of the processing device provided by this specification, the risk-associated data includes user feature data of at least one category, and the user feature data includes data information that has a non-linear relationship with the insurance business.
As described in the foregoing method embodiments, in another embodiment of the processing device provided by this specification, the processor trains the risk prediction model in the following manner:
determining the total number of decision trees and the threshold of the decision feature used on each branch of a decision tree, where the decision feature is one of the categories of the user feature data;
when training on a group of risk-associated data, if the number of trained decision trees reaches a preset value or the residual of the decision trees satisfies a stop-training condition, stopping the training on this group of risk-associated data, where the preset value is less than or equal to the total number; and
adjusting the thresholds of the decision features of the corresponding decision trees according to the training results on the risk-associated data, until the adjusted thresholds satisfy the prediction result output requirement of the risk prediction model, at which point the risk prediction model is determined.
As described in the foregoing method embodiments, in another embodiment of the processing device provided by this specification, the total number of decision trees is determined based on the number of categories of the user feature data.
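The specification does not state how the total number is derived from the number of categories. As one purely hypothetical rule, the total could be a fixed multiple of the number of distinct feature categories; the function name and the `trees_per_category` parameter below are assumptions for illustration only.

```python
from typing import Sequence


def total_tree_count(feature_categories: Sequence[str],
                     trees_per_category: int = 1) -> int:
    """Hypothetical rule: scale the tree budget with the number of categories."""
    distinct = set(feature_categories)
    if not distinct:
        raise ValueError("at least one feature category is required")
    if trees_per_category < 1:
        raise ValueError("trees_per_category must be positive")
    return len(distinct) * trees_per_category


categories = ["age", "online_duration", "shopping_habit"]
budget = total_tree_count(categories, trees_per_category=2)
```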
As described in the foregoing method embodiments, in another embodiment of the processing device provided by this specification, the risk prediction model includes a vehicle insurance risk prediction model obtained by training on risk-associated data associated with the vehicle insurance business; and
the risk prediction result includes either of the loss ratio and the vehicle insurance risk score of the user to be predicted.
The above instructions can be stored in a variety of computer-readable storage media. A computer-readable storage medium may include a physical device for storing information, in which the information is digitized and then stored in a medium using electrical, magnetic, optical, or other means. The computer-readable storage medium of this embodiment may include: devices that store information using electrical energy, such as various memories, for example RAM and ROM; devices that store information using magnetic energy, such as hard disks, floppy disks, magnetic tapes, magnetic core memories, bubble memories, and USB flash drives; and devices that store information optically, such as CDs or DVDs. Of course, there are also other kinds of readable storage media, such as quantum memories and graphene memories. The instructions involved in the apparatus, server, client, or processing device described above are as described previously.
It should be noted that the apparatus and processing device described above in the embodiments of this specification may also include other implementations according to the description of the related method embodiments. For specific implementations, refer to the description of the method embodiments; they are not repeated here one by one.
The embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments may refer to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the hardware-plus-program embodiments are substantially similar to the method embodiments, they are described relatively briefly; for related details, refer to the corresponding description of the method embodiments.
Specific embodiments of this specification have been described above. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the accompanying drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.
With the insurance business risk prediction processing method, apparatus, and processing device provided by the embodiments of this specification, a risk prediction model can be built in advance using a gradient boosting decision tree, and the risk prediction model can be trained using labeled risk-associated data associated with the insurance business. When training of the risk prediction model meets the prediction requirements, the model can be used for online risk prediction: it performs insurance business risk prediction for a user to be predicted and outputs a prediction result. With the method provided by the embodiments of this specification, multi-dimensional non-linear variables in the insurance business can be applied reasonably and effectively. A risk prediction model based on a gradient boosting decision tree can accommodate both linear and non-linear variables. Compared with a traditional linear model, the accuracy of the prediction result is significantly improved, which effectively compensates for the shortcomings of the traditional linear model and improves the insurance service experience.
Although this application provides the method operating procedure as described in embodiment or flow chart, based on conventional or noninvasive
The labour for the property made may include more or less operating procedure.The step of being enumerated in embodiment sequence is only numerous steps
A kind of mode in execution sequence does not represent and unique executes sequence.Device or system server product in practice executes
When, it can either method shown in the drawings sequence executes or parallel executes (such as parallel processor or more according to embodiment
The environment of thread process).
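The point that the same steps may run sequentially or in parallel can be illustrated with a minimal sketch using a thread pool from the Python standard library. The function `predict_risk` and the feature name `online_hours` are hypothetical stand-ins for a trained risk prediction model and its input.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def predict_risk(features: Dict[str, float]) -> float:
    # Hypothetical stand-in for a trained risk prediction model.
    return 0.8 if features["online_hours"] > 5 else 0.2


users: List[Dict[str, float]] = [
    {"online_hours": 1.0},
    {"online_hours": 7.0},
    {"online_hours": 3.0},
]

# Sequential execution: one user after another.
sequential_results = [predict_risk(u) for u in users]

# Parallel execution: Executor.map returns results in input order.
with ThreadPoolExecutor(max_workers=2) as pool:
    parallel_results = list(pool.map(predict_risk, users))
```

Because each prediction is independent of the others, both execution orders yield the same results.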
Although the content of the embodiments of this specification mentions descriptions of data acquisition, storage, interaction, calculation, judgment, and similar operations, such as the definition of linear and non-linear relationships, the construction of GBDT decision trees, and the processing of the GBDT model algorithm, the embodiments of this specification are not limited to situations that must conform to industry communication standards, standard GBDT model algorithm processing, communication protocols, standard data models or templates, or the situations described in the embodiments of this specification. Implementations slightly modified on the basis of certain industry standards, custom approaches, or the implementations described in the embodiments can also achieve effects that are the same as, equivalent to, or similar to those of the above embodiments, or that are foreseeable after variation. Embodiments that apply such modified or varied ways of acquiring, storing, judging, and processing data may still fall within the scope of the optional implementations of this specification.
In the 1990s, an improvement to a technology could be clearly distinguished as either a hardware improvement (for example, an improvement to a circuit structure such as a diode, a transistor, or a switch) or a software improvement (an improvement to a method flow). However, with the development of technology, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain a corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented with a hardware entity module. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is such an integrated circuit, whose logic function is determined by the user programming the device. Designers program a digital system to "integrate" it onto a single PLD themselves, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, nowadays, instead of manually fabricating integrated circuit chips, this programming is mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development. The source code to be compiled must also be written in a specific programming language, called a hardware description language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are the most commonly used at present. Those skilled in the art should also understand that a hardware circuit implementing a logical method flow can easily be obtained simply by logically programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.
A controller may be implemented in any suitable manner. For example, a controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art also know that, in addition to implementing a controller purely as computer-readable program code, the method steps can be logically programmed so that the controller implements the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for implementing various functions may also be regarded as structures within the hardware component. Alternatively, the means for implementing various functions may even be regarded as both software modules implementing a method and structures within a hardware component.
The processing devices, apparatuses, modules, or units described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. A typical implementation device is a computer. Specifically, the computer may be, for example, a personal computer, a laptop computer, an in-vehicle human-computer interaction device, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an e-mail device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Although the embodiments of this specification provide the method operation steps described in the embodiments or flowcharts, more or fewer operation steps may be included based on conventional or non-creative means. The order of steps listed in the embodiments is only one of many possible execution orders and does not represent the only execution order. When an actual apparatus or end product executes the method, the steps may be executed sequentially or in parallel according to the method shown in the embodiments or drawings (for example, in an environment with parallel processors or multi-threaded processing, or even in a distributed data processing environment). The terms "include", "comprise", and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, product, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, product, or device. Without further limitation, the presence of additional identical or equivalent elements in a process, method, product, or device that includes the stated element is not excluded.
For convenience of description, the above apparatus is described with its functions divided into various modules. Of course, when implementing the embodiments of this specification, the functions of the modules may be implemented in one or more pieces of software and/or hardware, or a module implementing a given function may be implemented by a combination of multiple sub-modules or sub-units. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, where the instruction means implement the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface, and a memory.
The memory may include non-permanent memory, random access memory (RAM), and/or non-volatile memory in a computer-readable medium, such as read-only memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
Those skilled in the art should understand that the embodiments of this specification may be provided as a method, a system, or a computer program product. Accordingly, the embodiments of this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the embodiments of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The embodiments of this specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform specific tasks or implement specific abstract data types. The embodiments of this specification may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
The embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments may refer to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for related details, refer to the corresponding description of the method embodiments. In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of this specification. In this specification, schematic expressions of the above terms are not necessarily directed to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine different embodiments or examples described in this specification, as well as the features of different embodiments or examples.
The foregoing describes merely embodiments of this specification and is not intended to limit the embodiments of this specification. For those skilled in the art, various modifications and variations of the embodiments of this specification are possible. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the embodiments of this specification shall fall within the scope of the claims of the embodiments of this specification.