CN113570469B - Intelligent vehicle change prediction method for vehicle insurance user - Google Patents

Intelligent vehicle change prediction method for vehicle insurance user

Info

Publication number
CN113570469B
Authority
CN
China
Prior art keywords
vehicle
data
insurance
user
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110851738.9A
Other languages
Chinese (zh)
Other versions
CN113570469A (en
Inventor
邱卫东
黄征
崔海名
来春蕾
代德发
鲁静文
唐鹏
徐源
李昕朋
陆尔东
徐春雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
China Pacific Insurance Group Co Ltd CPIC
Original Assignee
Shanghai Jiaotong University
China Pacific Insurance Group Co Ltd CPIC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University, China Pacific Insurance Group Co Ltd CPIC filed Critical Shanghai Jiaotong University
Priority to CN202110851738.9A priority Critical patent/CN113570469B/en
Publication of CN113570469A publication Critical patent/CN113570469A/en
Application granted granted Critical
Publication of CN113570469B publication Critical patent/CN113570469B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08Insurance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Data Mining & Analysis (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Human Resources & Organizations (AREA)
  • Operations Research (AREA)
  • Biophysics (AREA)
  • Tourism & Hospitality (AREA)
  • Technology Law (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Quality & Reliability (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

An intelligent vehicle-change prediction system and method for vehicle insurance users. The system comprises a data processing module, an offline training module, and an online prediction module. The data processing module screens and labels data according to user policy information and outputs whether each user changed vehicles and, if so, the vehicle model changed to. The offline training module trains machine learning models on the user policies and the label information and outputs a prediction model. The online prediction module uses new user policy information and the prediction model to predict, and output, whether the user will change vehicles and whether the user will change to a specified vehicle model. Whether a user changed vehicles, and the vehicle model changed to, is labeled according to whether the insured vehicle is the same in the policies of two consecutive years in the historical policy data; a relevant feature set of the user is then screened to train machine learning and deep learning models that predict whether the user will change vehicles and whether the user will change to the specified vehicle model.

Description

Intelligent vehicle change prediction method for vehicle insurance user
Technical Field
The invention relates to the field of neural network applications, and in particular to a machine-learning-based vehicle-change prediction method for vehicle insurance users.
Background
A survey of the prior art shows that the industry has made some progress in precision marketing. User profiling is the most commonly used technique in this field: it uses modern computing to collect and analyze user information, classifies and screens user features with techniques such as machine learning and deep learning, and builds user profiles that support functions such as mining potential user value, segmenting users by value, and user management. Based on these profiles, enterprises pursue their business objectives and profit growth through personalized marketing strategies.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention provides an intelligent vehicle-change prediction system and method for vehicle insurance users.
The invention is implemented by the following technical solution:
The invention relates to an intelligent vehicle-change prediction method for vehicle insurance users. The method labels whether a user changed vehicles, and the vehicle model changed to, according to whether the insured vehicle is the same in the policies of two consecutive years in the historical vehicle insurance policy data. It then screens a relevant feature set of the user to train machine learning and deep learning models that predict whether the user will change vehicles and whether the user will change to a specified vehicle model.
The invention also relates to a machine-learning-based vehicle-change prediction system for vehicle insurance users that implements the above method, comprising: a data processing module, an offline training module, and an online prediction module, wherein: the data processing module screens and labels data according to user policy information and outputs whether the user changed vehicles and the vehicle model changed to; the offline training module trains machine learning models on the user policies and the label information and outputs a prediction model; and the online prediction module uses new user policy information and the prediction model to predict, and output, whether the user will change vehicles and whether the user will change to the specified vehicle model.
The data processing module comprises a data screening unit and a data labeling unit, wherein: the data screening unit screens valid samples from the user policy data, cleans the data according to the policy's insurance date and the applicant's certificate number field, obtains the policy data of the same user in different years, and extracts relevant features from the policy data such as insured user information, insured vehicle information, and insurance information; the data labeling unit, working on the screened policy data of the same user in different years, judges whether the vehicles insured in different years are the same according to whether the VINs of the insured vehicles in the policies match, and thereby labels whether the user changed vehicles and the vehicle model changed to.
The offline training module comprises a feature engineering unit and a model training unit, wherein: the feature engineering unit cleans, sorts, and normalizes the relevant features extracted by the data processing module, converts the policy information of a single user into a set of feature values by digitizing character data, and screens important features with XGBoost; the model training unit divides the data into a training set and a test set, trains the machine learning models on the training set, evaluates them on the test set, and stores the best-performing model for the online prediction module to use.
The online prediction module comprises a feature extraction unit and a vehicle-change prediction unit, wherein: the feature extraction unit takes the user's policy information to be predicted as the initial feature input for online prediction, extracts and screens the relevant fields, and standardizes them with the same method used by the feature engineering unit in the offline training module; the vehicle-change prediction unit feeds the processed features into the corresponding trained model, and the model outputs the vehicle-change prediction result.
Technical effects
Overall, the invention solves the problem of predicting from a user's vehicle insurance policy whether the user will change vehicles. Compared with the prior art, the invention predicts both the vehicle change and the vehicle model from vehicle insurance policy data alone; it does not require large amounts of personal user information, which makes it more practical. The trained models can predict vehicle changes for all vehicle insurance users, and the method can be extended to predicting changes to a variety of vehicle models, which makes it highly flexible.
Drawings
FIG. 1 is a block diagram of a system according to the present invention.
Detailed Description
As shown in FIG. 1, the intelligent vehicle-change prediction system for vehicle insurance users according to this embodiment comprises: a data processing module, an offline training module, and an online prediction module, wherein: the data processing module screens and labels data according to user policy information and outputs whether the user changed vehicles and the vehicle model changed to; the offline training module trains machine learning models on the user policies and the label information and outputs a prediction model; and the online prediction module uses new user policy information and the prediction model to predict, and output, whether the user will change vehicles and whether the user will change to the specified vehicle model.
The data processing module comprises a data screening unit and a data labeling unit, wherein: the data screening unit collects and screens the data, and the data labeling unit uses the applicant's certificate number field to find the same user's policy data for the following year in the screened data and labels the data.
Data collection refers to: collecting user policy data provided by an insurance company, standardizing the data format of each field, extracting 50 relevant fields from the policy data (user information, vehicle information, insurance information, and so on) as features, and building a user policy database.
Data screening refers to: searching the user policy database for the vehicles insured by the same user in different years according to the policy's insurance date and the applicant's certificate number field; then, according to the VIN field of the policy, deleting the policy data in which the user insured more than one vehicle (more than one distinct VIN) and retaining the data in which the user insured exactly one vehicle in each year. In other words, the records of the vehicle insured by the same user in different years are screened out of the user policy database.
Data labeling specifically comprises: when the VIN field value of the vehicle insured in the current year differs from that of the vehicle insured in the following year, the record is labeled as a vehicle change, and the vehicle model the user changed to is labeled with the insured vehicle model in the following year's policy data; when the VIN field value of the vehicle insured in the current year is the same as that of the vehicle insured in the following year, the record is labeled as no vehicle change.
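The screening and labeling rules above can be sketched in plain Python. This is only an illustration under stated assumptions: the record keys (`applicant_id`, `year`, `vin`, `model`) and the function name `label_vehicle_changes` are hypothetical and are not field names from the policy database described here.

```python
from collections import defaultdict

def label_vehicle_changes(policies):
    """Label each (user, year) pair as changed / not changed by comparing VINs.

    `policies` is an iterable of dicts with hypothetical keys:
    applicant_id, year, vin, model.
    """
    # Group the vehicles (VIN -> model) each user insured in each year.
    by_user_year = defaultdict(dict)
    for p in policies:
        by_user_year[p["applicant_id"]].setdefault(p["year"], {})[p["vin"]] = p["model"]

    labels = []
    for user, years in by_user_year.items():
        # Screening: keep only users who insured exactly one vehicle in every year.
        if any(len(vins) != 1 for vins in years.values()):
            continue
        for year in sorted(years):
            if year + 1 not in years:
                continue  # no following-year policy to compare against
            (vin_now, _), = years[year].items()
            (vin_next, model_next), = years[year + 1].items()
            changed = vin_now != vin_next
            labels.append({
                "applicant_id": user,
                "year": year,
                "changed": int(changed),
                # The new model is read from the following year's policy.
                "new_model": model_next if changed else None,
            })
    return labels

sample = [
    {"applicant_id": "A", "year": 2018, "vin": "VIN1", "model": "Volkswagen"},
    {"applicant_id": "A", "year": 2019, "vin": "VIN2", "model": "BMW"},
    {"applicant_id": "B", "year": 2018, "vin": "VIN3", "model": "Lexus"},
    {"applicant_id": "B", "year": 2019, "vin": "VIN3", "model": "Lexus"},
    {"applicant_id": "C", "year": 2018, "vin": "VIN4", "model": "BMW"},
    {"applicant_id": "C", "year": 2018, "vin": "VIN5", "model": "BMW"},
]
labels = label_vehicle_changes(sample)
```

In this toy data, user A is labeled as a vehicle change to BMW, user B as no change, and user C is screened out for insuring two vehicles in one year.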
The offline training module comprises a feature engineering unit and a model training unit, wherein: the feature engineering unit performs outlier processing, data standardization, and feature screening on the data obtained by the data processing module, and the model training unit trains the MLP model and the GBDT model on the screened features.
Outlier processing refers to: handling missing values and outliers in the features obtained by the data processing module. The processed features include: region, third-party liability insurance coverage amount, policy premium, traffic violation coefficient, expected loss ratio, vehicle series, negotiated actual value, vehicle age, vehicle model classification, vehicle model risk level, vehicle category, displacement, number of insured instances returned by the platform, NCD coefficient returned by the platform, total number of vehicle claims, vehicle claim amount, sex of the insured, whether the applicant is a life insurance customer, whether the applicant is a customer with an in-force long-term life insurance policy, total number of life insurance policies purchased by the applicant, age, vehicle model, new-vehicle purchase price, fuel type, and the like. Because the volume of no-change data is large, records of no-change users that contain missing values are removed directly; records of vehicle-change users that contain abnormal values are instead repaired by mean filling, hot-deck imputation, manual filling, and similar methods.
Manual filling applies to missing values that can be inferred from the rest of the data; for example, sex can be inferred from the identity card number.
Hot-deck imputation refers to: for an object that contains a null value, finding the most similar object in the complete data and filling the null with the value of that similar object.
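A minimal sketch of hot-deck imputation as described above. The source does not name a similarity measure, so squared Euclidean distance over a chosen set of columns is used here as an assumption; the function name `hot_deck_fill` and the column names are hypothetical.

```python
def hot_deck_fill(records, target, match_on):
    """Fill missing `target` values from the most similar complete record."""
    # Donors are the records in which the target value is present.
    donors = [r for r in records if r[target] is not None]
    filled = []
    for r in records:
        if r[target] is not None or not donors:
            filled.append(dict(r))
            continue
        # Assumed similarity: squared Euclidean distance over `match_on` columns.
        nearest = min(
            donors,
            key=lambda d: sum((d[k] - r[k]) ** 2 for k in match_on),
        )
        filled.append({**r, target: nearest[target]})
    return filled

rows = [
    {"vehicle_age": 2.0, "price": 10.0, "displacement": 1.6},
    {"vehicle_age": 8.0, "price": 30.0, "displacement": 3.0},
    {"vehicle_age": 7.5, "price": 28.0, "displacement": None},
]
filled = hot_deck_fill(rows, target="displacement", match_on=["vehicle_age", "price"])
```

The third record is closest to the second one, so it borrows that record's displacement. In practice the matching columns would usually be scaled first so that no single feature dominates the distance.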
Data standardization refers to: numerical features whose values follow a normal distribution, such as vehicle age, new-vehicle purchase price, and negotiated actual value, are standardized directly to a standard normal distribution. The remaining numerical features are standardized by interval scaling, a dimensionless method, with the formula $x' = \frac{x - \min}{\max - \min}$, where $x$ is the original value of the feature, $\min$ is the minimum of all values of the feature, $\max$ is the maximum of all values of the feature, and $x'$ is the standardized value. Features of string type are converted into numerical values by one-hot encoding.
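The three transformations above can be sketched with the Python standard library. The function names are illustrative, and the sketch assumes each numerical column has at least two distinct values, since a constant column would cause a division by zero.

```python
from statistics import mean, pstdev

def z_score(values):
    """Standardize to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def min_max(values):
    """Interval scaling: x' = (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(values):
    """One-hot encode strings; columns follow the sorted category order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

scaled = min_max([10, 20, 40])
standardized = z_score([1.0, 2.0, 3.0])
encoded = one_hot(["gasoline", "diesel", "gasoline"])
```

For online prediction, the statistics (mean, standard deviation, min, max, category list) would need to be computed once on the training data and reused, rather than recomputed on each new batch.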
Feature screening refers to: screening the 50 standardized features with XGBoost to select the features of higher importance for the classification model. The main XGBoost parameters are set as follows: the input data length is 50, the booster is the tree type (gbtree), the objective function is multi:softmax, the maximum tree depth is 6, and the gamma value is 0.1. Training runs for 100 rounds. Using ten-fold cross-validation, 5000 vehicle-change records and 5000 no-change records are selected from the data set and fed into the XGBoost model for learning; the feature importance of the 50 features is output, and the features that rank high in importance across the ten folds are counted. According to the statistics, the features of high importance are screened from the 50 features, yielding 28 features, including: region, third-party liability insurance coverage amount, policy premium, traffic violation coefficient, expected loss ratio, ultimate loss ratio, vehicle series, negotiated actual value, vehicle age, vehicle model classification, vehicle model risk level, vehicle category, displacement, number of insured instances returned by the platform, NCD coefficient returned by the platform, total number of vehicle claims, vehicle claim amount, sex of the insured, whether the applicant is a life insurance customer, whether the applicant is a customer with an in-force long-term life insurance policy, total number of life insurance policies purchased by the applicant, age, vehicle model, insurance type, new-vehicle purchase price, fuel type, total premium paid by the applicant, and the like.
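The cross-fold counting step can be sketched as follows. The source does not state the exact selection rule, so the `top_k` and `min_folds` thresholds, the function name `stable_top_features`, and the toy importance scores are all assumptions. The `xgb_params` dictionary records the stated settings using XGBoost's documented parameter names; `num_class` is required by the multi:softmax objective, and the value 2 is an assumption for a binary label.

```python
from collections import Counter

# Stated settings, expressed with XGBoost parameter names (not executed here).
xgb_params = {
    "booster": "gbtree",
    "objective": "multi:softmax",
    "num_class": 2,  # assumption: binary change / no-change label
    "max_depth": 6,
    "gamma": 0.1,
}
num_boost_round = 100

def stable_top_features(fold_importances, top_k, min_folds):
    """Return features ranked in the top `top_k` in at least `min_folds` folds."""
    counts = Counter()
    for importances in fold_importances:
        ranked = sorted(importances, key=importances.get, reverse=True)
        counts.update(ranked[:top_k])
    return sorted(f for f, n in counts.items() if n >= min_folds)

# Toy per-fold importance scores (three folds instead of ten, for brevity).
folds = [
    {"vehicle_age": 0.40, "displacement": 0.35, "region": 0.25},
    {"vehicle_age": 0.50, "displacement": 0.20, "region": 0.30},
    {"vehicle_age": 0.45, "displacement": 0.30, "region": 0.25},
]
selected = stable_top_features(folds, top_k=2, min_folds=2)
```

Counting across folds favors features whose importance is stable, rather than features that rank high in a single split by chance.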
The MLP model refers to: a multi-layer perceptron (MLP) selected as the classification model, whose network structure comprises an input layer, three hidden layers, and an output layer. The three hidden layers have 128, 256, and 64 nodes respectively; the hidden layers use the LeakyReLU activation function, with the corresponding dropout set to 0.2. The activation function of the output layer is Sigmoid.
The data obtained by the feature engineering unit are used to train separate models one by one. The models fall into two kinds of binary classifiers: a model of whether the user changes vehicles, and models of whether the user, after changing vehicles, changes to one of several target vehicle models such as BMW, gekko Swinhonis, Lexus, Volkswagen, and Mercedes-Benz. For the vehicle-change model, the label is 1 when the user changes vehicles and 0 when the user does not. For a target-vehicle-model classifier, records in which the user changes to the target vehicle model are labeled 1, and records in which the user changes to any other vehicle model are labeled 0. All data are split 4:1, with 75% of the data used for training and 25% used as the test set.
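A minimal sketch of the target-model labeling and the train/test split. The text gives both a 4:1 ratio and 75%/25% proportions, so the test fraction is left as a parameter here; the value 0.25 in the example, the function names, and the fixed random seed are assumptions.

```python
import random

def one_vs_rest_labels(new_models, target):
    """Label 1 if the user changed to `target`, 0 for any other model."""
    return [1 if m == target else 0 for m in new_models]

def train_test_split(rows, test_fraction, seed=0):
    """Shuffle a copy of `rows` and split off `test_fraction` as the test set."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]

models = ["BMW", "Volkswagen", "BMW", "Lexus"]
bmw_labels = one_vs_rest_labels(models, "BMW")
train, test = train_test_split(range(100), test_fraction=0.25)
```

One classifier is trained per target vehicle model, so the labeling function is called once for each brand of interest.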
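The LeakyReLU formula is $f(x) = \max(\alpha x, x)$, where $\alpha$ is a small positive slope applied to negative inputs. The Sigmoid formula is $\sigma(x) = \frac{1}{1 + e^{-x}}$.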
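The described network shape and activations can be illustrated with a plain-Python forward pass. This is a sketch only: the weights are random rather than trained, dropout is omitted because it applies only during training, the slope `alpha=0.01` and the input width of 28 features are assumptions, and a real implementation would more likely use a deep learning framework.

```python
import math
import random

def leaky_relu(x, alpha=0.01):
    # alpha is an assumed slope; the source does not state its value.
    return x if x >= 0 else alpha * x

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W @ inputs + b)."""
    return [
        activation(sum(w * v for w, v in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def init_layer(n_in, n_out, rng):
    """Random (untrained) weights, for illustration only."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(features, layers):
    h = features
    for weights, biases in layers[:-1]:
        h = dense(h, weights, biases, leaky_relu)  # hidden layers
    weights, biases = layers[-1]
    return dense(h, weights, biases, sigmoid)[0]   # single output probability

rng = random.Random(0)
sizes = [28, 128, 256, 64, 1]  # input, three hidden layers, output
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
prob = forward([0.5] * 28, layers)
```

The Sigmoid output lies strictly between 0 and 1 and can be read as the predicted probability of the positive label.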
The model effect of the MLP is shown in the following table.
The GBDT model refers to: the GradientBoostingClassifier model from the sklearn library. In the experiment, the number of trees is set to 500, the maximum tree depth to 4, the learning rate to 0.1, and the minimum number of samples required to split an internal node to 100. The loss function is the logarithmic loss $L(Y, P(Y|X)) = -\log P(Y|X)$.
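The stated settings map onto scikit-learn's `GradientBoostingClassifier` keyword arguments as shown below; the dictionary is not passed to the library here, so the snippet runs without scikit-learn installed. The `log_loss` helper is a hand-written illustration of the loss formula, averaged over samples, and the clipping constant `eps` is an assumption added to avoid taking the logarithm of zero.

```python
import math

# Keyword arguments for sklearn.ensemble.GradientBoostingClassifier(**gbdt_params).
gbdt_params = {
    "n_estimators": 500,       # number of trees
    "max_depth": 4,            # maximum tree depth
    "learning_rate": 0.1,
    "min_samples_split": 100,  # minimum samples to split an internal node
}

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean of -log P(Y|X), where p_pred is the predicted P(Y=1|X)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to keep log() finite
        total += -math.log(p if y == 1 else 1 - p)
    return total / len(y_true)

loss = log_loss([1, 0], [0.8, 0.1])
```

The logarithmic loss is the classifier's default loss for binary classification, so no explicit loss argument is included in the dictionary.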
The data obtained by the feature engineering unit are used to train separate models one by one. The training data are the same as for the MLP model and are split 4:1, with 75% of the data used for training and 25% used as the test set.
The GBDT model effects are shown in the following table.
Accuracy is the proportion of all samples whose prediction is correct.
Precision is the proportion of truly positive samples among the samples predicted as positive.
Recall is the proportion of truly positive samples that are predicted as positive.
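The three metrics above can be computed directly from the true and predicted labels. This is a minimal sketch for binary labels; the function name and the convention of returning 0.0 when a denominator is zero are assumptions.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

acc, prec, rec = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

Because vehicle changes are much rarer than non-changes, precision and recall are more informative here than accuracy alone.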
The best-performing trained model is stored locally for use by the online prediction module.
The online prediction module specifically comprises a feature extraction unit and a vehicle-change prediction unit, wherein: the feature extraction unit extracts the user's multi-dimensional feature information required for prediction using the method of the feature engineering unit in the offline training module, and the vehicle-change prediction unit feeds the resulting multi-dimensional user features into the stored model in batches to obtain the predicted values of whether the user will change vehicles and whether the user will change to a target vehicle model.
The multi-dimensional feature information comprises: region, third-party liability insurance coverage amount, policy premium, traffic violation coefficient, expected loss ratio, ultimate loss ratio, vehicle series, negotiated actual value, vehicle age, vehicle model classification, vehicle model risk level, vehicle category, displacement, number of insured instances returned by the platform, NCD coefficient returned by the platform, total number of vehicle claims, vehicle claim amount, sex of the insured, whether the applicant is a life insurance customer, whether the applicant is a customer with an in-force long-term life insurance policy, total number of life insurance policies purchased by the applicant, age, vehicle model, insurance type, new-vehicle purchase price, fuel type, total premium paid by the applicant, and the like.
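The online step can be sketched as reusing the scaling statistics fitted offline and then scoring records in a batch. Everything named here is hypothetical: `fit_min_max`, `apply_min_max`, `predict_batch`, the 0.5 decision threshold, and `toy_model`, which stands in for the stored trained model.

```python
def fit_min_max(column):
    """Offline: record the min and max of a training column."""
    return {"min": min(column), "max": max(column)}

def apply_min_max(value, params):
    """Online: scale a new value with the statistics fitted offline."""
    return (value - params["min"]) / (params["max"] - params["min"])

def predict_batch(rows, scaler_params, model, threshold=0.5):
    """Scale each record with the stored statistics, then apply the model."""
    predictions = []
    for row in rows:
        features = [
            apply_min_max(row[name], scaler_params[name])
            for name in sorted(scaler_params)  # fixed feature order
        ]
        predictions.append(int(model(features) >= threshold))
    return predictions

scaler_params = {
    "vehicle_age": fit_min_max([0, 5, 10]),
    "displacement": fit_min_max([1.0, 2.0, 3.0]),
}

def toy_model(features):
    # Placeholder for the stored model: returns a score in [0, 1].
    return sum(features) / len(features)

preds = predict_batch(
    [
        {"vehicle_age": 10, "displacement": 3.0},
        {"vehicle_age": 0, "displacement": 1.0},
    ],
    scaler_params,
    toy_model,
)
```

Keeping the scaling statistics alongside the stored model ensures that online features are transformed in the same way as the training features.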
In practical experiments, the models were run on a Linux operating system with a Python environment configured and were started with shell commands. On the test set, the vehicle-change model reached an accuracy of up to 70.2%, and the vehicle-model-change model reached an accuracy of up to 74.8%. The experimental results show that the method has a certain degree of effectiveness and practicality in predicting vehicle changes and vehicle-model changes from policy data.
Those skilled in the art may partially modify the foregoing embodiments in various ways without departing from the principles and spirit of the invention. The scope of the invention is defined by the claims and not by the foregoing embodiments, and every implementation within that scope is bound by the invention.

Claims (1)

1. An intelligent vehicle-change prediction system for vehicle insurance users, characterized by comprising: a data processing module, an offline training module, and an online prediction module, wherein: the data processing module screens and labels data according to user policy information and outputs whether the user changed vehicles and the vehicle model changed to; the offline training module trains machine learning models on the user policies and the label information and outputs a prediction model; and the online prediction module uses new user policy information and the prediction model to predict, and output, whether the user will change vehicles and whether the user will change to the specified vehicle model;
the data processing module comprises a data screening unit and a data labeling unit, wherein: the data screening unit collects and screens the data, and the data labeling unit uses the applicant's certificate number field to find the same user's policy data for the following year in the screened data and labels the data;
data collection refers to: collecting user policy data provided by an insurance company, standardizing the data format of each field, extracting user information, vehicle information, and insurance information from the policy data as features, and building a user policy database;
data screening refers to: searching the user policy database for the vehicles insured by the same user in different years according to the policy's insurance date and the applicant's certificate number field; according to the VIN field of the policy, deleting the policy data in which the user insured more than one vehicle (more than one distinct VIN) and retaining the data in which the user insured exactly one vehicle in each year, that is, screening out of the user policy database the records of the vehicle insured by the same user in different years;
data labeling specifically comprises: when the VIN field value of the vehicle insured in the current year differs from that of the vehicle insured in the following year, labeling the record as a vehicle change and labeling the vehicle model the user changed to with the insured vehicle model in the following year's policy data; when the VIN field value of the vehicle insured in the current year is the same as that of the vehicle insured in the following year, labeling the record as no vehicle change;
the offline training module comprises a feature engineering unit and a model training unit, wherein: the feature engineering unit performs outlier processing, data standardization, and feature screening on the data obtained by the data processing module, and the model training unit trains the MLP model and the GBDT model on the screened features;
outlier processing refers to: handling missing values and outliers in the features obtained by the data processing module, wherein the processed features comprise: region, third-party liability insurance coverage amount, policy premium, traffic violation coefficient, expected loss ratio, vehicle series, negotiated actual value, vehicle age, vehicle model classification, vehicle model risk level, vehicle category, displacement, number of insured instances returned by the platform, NCD coefficient returned by the platform, total number of vehicle claims, vehicle claim amount, sex of the insured, whether the applicant is a life insurance customer, whether the applicant is a customer with an in-force long-term life insurance policy, total number of life insurance policies purchased by the applicant, age, vehicle model, new-vehicle purchase price, and fuel type;
if missing values exist in the data of a no-change user, the abnormal data are removed directly, and if abnormal values exist in the data of a vehicle-change user, the data are repaired by mean filling, hot-deck imputation, and manual filling;
manual filling applies to missing values that can be inferred from the rest of the data;
hot-deck imputation refers to: for an object that contains a null value, finding the most similar object in the complete data and filling the null with the value of that similar object;
data standardization refers to: numerical features whose values follow a normal distribution are standardized to a standard normal distribution; the remaining numerical features are standardized by interval scaling, a dimensionless method, with the formula $x' = \frac{x - \min}{\max - \min}$, where $x$ is the original value of the feature, $\min$ is the minimum of all values of the feature, $\max$ is the maximum of all values of the feature, and $x'$ is the standardized value; features of string type are converted into numerical values by one-hot encoding;
feature screening refers to: screening the 50 standardized features with XGBoost to select the features of higher importance for the classification model;
the main XGBoost parameters are set as follows: the input data length is 50, the booster is the tree type, the objective function is multi:softmax, the maximum tree depth is 6, the gamma value is 0.1, and training runs for 100 rounds; using ten-fold cross-validation, 5000 vehicle-change records and 5000 no-change records are selected from the data set and fed into the XGBoost model for learning, the feature importance of the 50 features is output, the features that rank high in importance across the ten folds are counted, and the features of high importance are screened from the 50 features according to the statistics, yielding 28 features, including: region, third-party liability insurance coverage amount, policy premium, traffic violation coefficient, expected loss ratio, ultimate loss ratio, vehicle series, negotiated actual value, vehicle age, vehicle model classification, vehicle model risk level, vehicle category, displacement, number of insured instances returned by the platform, NCD coefficient returned by the platform, total number of vehicle claims, vehicle claim amount, sex of the insured, whether the applicant is a life insurance customer, whether the applicant is a customer with an in-force long-term life insurance policy, total number of life insurance policies purchased by the applicant, age, vehicle model, insurance type, new-vehicle purchase price, fuel type, and total premium paid by the applicant;
The MLP model refers to: selecting a multi-layer perceptron (MLP) as the classification model, wherein the network structure and parameters of the MLP comprise: an input layer, three hidden layers and an output layer, wherein the three hidden layers have 128, 256 and 64 nodes respectively, the hidden layers adopt the LeakyReLU activation function with the corresponding dropout set to 0.2, and the activation function of the output layer is Sigmoid;
The data obtained by the feature engineering unit are used to train different models one by one; the models are two binary classification models: whether the user changes vehicle, and whether the user changes to the target vehicle model after changing vehicle; for whether the vehicle is changed, the label is 1 when the user changes vehicle and 0 otherwise; for changing to the target vehicle model, data of changing to the target vehicle model are labeled 1 and data of changing to other vehicle models are labeled 0; all the data are divided such that 75% of the data are used for training and 25% of the data are used as the test set;
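The two labeling schemes and the 75%/25% division can be illustrated with a short sketch. The record fields (`changed_vehicle`, `new_model`), the target model name "SUV" and the fixed random seed are hypothetical and chosen only for the example.

```python
import random

def change_dataset(records):
    """Label 1 if the user changed vehicle, otherwise 0."""
    return [(r, 1 if r["changed_vehicle"] else 0) for r in records]

def target_dataset(records, target_model):
    """Among users who changed vehicle, label 1 if they moved to the
    target model and 0 if they moved to any other model."""
    return [(r, 1 if r["new_model"] == target_model else 0)
            for r in records if r["changed_vehicle"]]

def split_train_test(rows, train_fraction=0.75, seed=0):
    """Shuffle a copy of rows and split it into training and test subsets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

# Hypothetical records: every second user changed vehicle,
# and every fourth user changed to the target model "SUV".
records = [
    {
        "user_id": i,
        "changed_vehicle": i % 2 == 0,
        "new_model": ("SUV" if i % 4 == 0 else "sedan") if i % 2 == 0 else None,
    }
    for i in range(100)
]

change_data = change_dataset(records)
target_data = target_dataset(records, "SUV")
train, test = split_train_test(change_data)
```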
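The LeakyReLU formula is $\mathrm{LeakyReLU}(x) = \max(0, x) + \alpha \min(0, x)$, wherein $\alpha$ is the slope coefficient applied to negative inputs; the Sigmoid formula is $\mathrm{Sigmoid}(x) = \frac{1}{1 + e^{-x}}$;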
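The two activation functions named above can be written directly in Python. This is a minimal sketch; the negative-side slope `alpha=0.01` is an assumed default, as the source does not give the value it uses.

```python
import math

def leaky_relu(x, alpha=0.01):
    """LeakyReLU: identity for positive inputs, slope alpha otherwise."""
    return x if x > 0 else alpha * x

def sigmoid(x):
    """Logistic function mapping x to (0, 1), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

hidden_out = [leaky_relu(v) for v in (-2.0, 0.0, 3.0)]
probability = sigmoid(0.0)
```

The sigmoid output can be read as the predicted probability of the positive class, which is why it is used on the output layer of a binary classifier.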
The GBDT model refers to: selecting the GradientBoostingClassifier model in the sklearn library, with the number of trees set to 500, the maximum tree depth set to 4, the learning rate set to 0.1, the minimum number of samples required to split an internal node set to 100, and the loss function set to the logarithmic loss function L(Y, P(Y|X)) = -log P(Y|X);
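The logarithmic loss above can be computed for binary labels as in the following sketch. The helper name `log_loss`, the clipping constant `eps` and the sample probabilities are assumptions for illustration. In scikit-learn's `GradientBoostingClassifier`, the listed settings would presumably correspond to the parameters `n_estimators`, `max_depth`, `learning_rate` and `min_samples_split`, although the source does not name them.

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean of -log P(Y|X), where P(Y|X) is the probability
    the model assigns to the true label of each sample."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)    # clip to avoid log(0)
        p_true = p if y == 1 else 1.0 - p
        total += -math.log(p_true)
    return total / len(y_true)

# p_pred holds the predicted probability of class 1 for each sample.
loss = log_loss([1, 0, 1], [0.9, 0.2, 0.5])
```

Confident correct predictions contribute almost nothing to the loss, while confident wrong predictions are penalized heavily.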
The data obtained by the feature engineering unit are used to train different models one by one, wherein the training data are the same as those of the MLP model and are divided 4:1, wherein 75% of the data are used for training and 25% of the data are used as the test set;
The online prediction module specifically comprises a feature extraction unit and a vehicle change prediction unit, wherein: the feature extraction unit extracts the multi-dimensional feature information of the user required for prediction by using the method of the feature engineering unit in the offline training module, and the vehicle change prediction unit inputs the obtained multi-dimensional features of the user into the stored models in batches to obtain predicted values of whether the user changes vehicle and whether the user changes to the target vehicle type;
The multi-dimensional characteristic information comprises: regional, three-responsibility insurance policy, ticket policy, traffic violation coefficients, expected odds, final odds, train, negotiated actual value, age, classification of models, risk class of models, type of vehicle, displacement of models, number of platform returns to insurance, number of platform returns to NCD coefficients, total number of vehicle cases, amount of vehicle odds, sex of insured, whether the applicant has a life insurance client, whether the customer is a life insurance long insurance effective policy client, the applicant has purchased the life insurance total policy, age, model, risk, new vehicle purchase price, fuel type, and applicant has paid total policy.
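Feeding user features to a stored model in batches, as the vehicle change prediction unit does, might look like the sketch below. The `toy_model` function is a hypothetical stand-in for a trained MLP or GBDT model, and the batch size and 0.5 decision threshold are assumptions not given in the source.

```python
def batches(rows, batch_size):
    """Yield consecutive slices of rows with at most batch_size items."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]

def predict_in_batches(model, feature_rows, batch_size=2, threshold=0.5):
    """Apply a stored model batch by batch and threshold its scores to 0/1."""
    predictions = []
    for batch in batches(feature_rows, batch_size):
        scores = model(batch)
        predictions.extend(1 if s >= threshold else 0 for s in scores)
    return predictions

def toy_model(batch):
    """Hypothetical stored model: uses the first feature as the score."""
    return [row[0] for row in batch]

# Hypothetical, already normalized feature vectors for five users.
users = [[0.9, 0.1], [0.2, 0.4], [0.5, 0.5], [0.1, 0.8], [0.7, 0.3]]
will_change = predict_in_batches(toy_model, users)
```

The same loop would be run a second time with the target-model classifier to obtain the second predicted value.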
CN202110851738.9A 2021-07-27 2021-07-27 Intelligent vehicle change prediction method for vehicle insurance user Active CN113570469B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110851738.9A CN113570469B (en) 2021-07-27 2021-07-27 Intelligent vehicle change prediction method for vehicle insurance user

Publications (2)

Publication Number Publication Date
CN113570469A CN113570469A (en) 2021-10-29
CN113570469B CN113570469B (en) 2024-05-28

Family

ID=78168026

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110851738.9A Active CN113570469B (en) 2021-07-27 2021-07-27 Intelligent vehicle change prediction method for vehicle insurance user

Country Status (1)

Country Link
CN (1) CN113570469B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108053075A (en) * 2017-12-27 2018-05-18 北京中交兴路车联网科技有限公司 A kind of scrap-car Forecasting Methodology and system
WO2020077871A1 (en) * 2018-10-15 2020-04-23 平安科技(深圳)有限公司 Event prediction method and apparatus based on big data, computer device, and storage medium
CN112579900A (en) * 2020-12-22 2021-03-30 优必爱信息技术(北京)有限公司 Method, system and equipment for recommending second-hand vehicle replacement information

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100332292A1 (en) * 2009-06-30 2010-12-30 Experian Information Solutions, Inc. System and method for evaluating vehicle purchase loyalty
US20150254719A1 (en) * 2014-03-05 2015-09-10 Hti, Ip, L.L.C. Prediction of Vehicle Transactions and Targeted Advertising Using Vehicle Telematics

Also Published As

Publication number Publication date
CN113570469A (en) 2021-10-29

Similar Documents

Publication Publication Date Title
US11373249B1 (en) Automobile monitoring systems and methods for detecting damage and other conditions
CN103294592B (en) User instrument is utilized to automatically analyze the method and system of the defect in its service offering alternately
CN109739844B (en) Data classification method based on attenuation weight
CN110706039A (en) Electric vehicle residual value rate evaluation system, method, equipment and medium
CN111079941B (en) Credit information processing method, credit information processing system, terminal and storage medium
CN112990386B (en) User value clustering method and device, computer equipment and storage medium
CN112434829A (en) Vehicle maintenance project determination method, system, device and storage medium
CN115147155A (en) Railway freight customer loss prediction method based on ensemble learning
CN114078050A (en) Loan overdue prediction method and device, electronic equipment and computer readable medium
CN113570469B (en) Intelligent vehicle change prediction method for vehicle insurance user
CN109766440B (en) Method and system for determining default classification information for object text description
US20230058076A1 (en) Method and system for auto generating automotive data quality marker
CN113421154B (en) Credit risk assessment method and system based on control chart
CN114331728A (en) Security analysis management system
JP2022082525A (en) Method and apparatus for providing information based on machine learning
CN112818215A (en) Product data processing method, device, equipment and storage medium
CN114443803A (en) Text information mining method and device, electronic equipment and storage medium
CN112905713A (en) Case-related news overlapping entity relation extraction method based on joint criminal name prediction
CN110119464A (en) The intelligent recommendation method and device of numerical value in a kind of contract
CN116913460B (en) Marketing business compliance judgment and analysis method for pharmaceutical instruments and inspection reagents
CN115953166B (en) Customer information management method and system based on big data intelligent matching
CN113191595B (en) Vehicle operation full life cycle cost associated data analysis method and system
CN117520994B (en) Method and system for identifying abnormal air ticket searching user based on user portrait and clustering technology
CN117764692A (en) Method for predicting credit risk default probability
CN118333738A (en) Method for constructing retail credit risk prediction model and credit card service Scorealpha model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant