CN118052582A - Customer churn probability prediction method, apparatus, computer device and storage medium


Info

Publication number
CN118052582A
CN118052582A
Authority
CN
China
Prior art keywords
training
branch
sample
layer
service data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311837475.1A
Other languages
Chinese (zh)
Inventor
郭佳灵 (Guo Jialing)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202311837475.1A
Publication of CN118052582A
Legal status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application relates to a customer churn probability prediction method, an apparatus, a computer device and a storage medium. The method comprises the following steps: acquiring target business data of a target user in a preset time period; the target service data is time sequence data; processing the target service data through a first branch of the loss probability prediction model to obtain first characteristic information; the first branch comprises a convolution layer and a first branch full-connection layer; processing the target service data through a second branch of the loss probability prediction model to obtain second characteristic information; the second branch comprises a long short-term memory layer, an attention mechanism layer and a second branch full-connection layer; processing the first characteristic information and the second characteristic information through a trunk of the loss probability prediction model to obtain a loss probability prediction result of the target user; the trunk comprises a batch normalization processing layer and at least one trunk full connection layer. By adopting the method, the accuracy of judging customer churn can be improved.

Description

Customer churn probability prediction method, apparatus, computer device and storage medium
Technical Field
The present application relates to the field of artificial intelligence technology, and in particular, to a method, an apparatus, a computer device, a storage medium, and a computer program product for predicting customer churn probability.
Background
In today's competitive business environment, banks are increasingly aware of the importance of customers and the value of customer retention. The cost of attracting new customers is often much higher than the cost of maintaining existing ones. Banks have therefore begun to focus on gaining insight into customer behavior and needs in advance and taking timely measures to improve customer satisfaction and retention. However, customer churn prediction usually involves a large amount of customer data and is influenced by many factors; it is also a dynamic process, since customer behavior and preferences may change over time, and a massive customer base is difficult to maintain through account managers' point-to-point marketing alone. A customer churn prediction model has therefore become an important tool for banks to formulate personalized marketing strategies and improve product and service quality.
In the traditional approach, customer churn is usually judged by manual experience. However, the amount of data that can be processed manually is small, manual judgment standards are inconsistent, the results are often influenced by subjective human factors, and both large-scale data support and objective judgment criteria are lacking, so customer churn is judged inaccurately.
At present, the judgment of customer churn lacks objective criteria, and its accuracy is not high.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a customer churn probability prediction method, apparatus, computer device, computer readable storage medium, and computer program product that can improve the accuracy of determining customer churn conditions.
In a first aspect, the present application provides a method for predicting customer churn probability, including:
acquiring target business data of a target user in a preset time period; the target service data is time sequence data;
processing the target service data through a first branch of the loss probability prediction model to obtain first characteristic information; the first branch comprises a convolution layer and a first branch full-connection layer;
processing the target service data through a second branch of the loss probability prediction model to obtain second characteristic information; the second branch comprises a long short-term memory layer, an attention mechanism layer and a second branch full-connection layer;
processing the first characteristic information and the second characteristic information through a trunk of the loss probability prediction model to obtain a loss probability prediction result of the target user; the trunk comprises a batch normalization processing layer and at least one trunk full connection layer.
In one embodiment, the obtaining manner of the loss probability prediction model includes:
Acquiring a training set; the training set comprises a plurality of training samples, each training sample comprises sample service data of a sample user in a sampling time period and a loss classification label corresponding to the sample service data, and the loss classification label is used for representing whether the corresponding sample user is a lost user or a non-lost user;
Acquiring a neural network model, wherein the neural network model comprises a first branch, a second branch, a batch normalization processing layer and at least one trunk full-connection layer; the first branch and the second branch are connected to the batch normalization processing layer in parallel, and the batch normalization processing layer is connected with at least one trunk full-connection layer in series; the first branch comprises a plurality of convolution layers and a first branch full-connection layer which are connected in series; the second branch comprises a plurality of long-short-term memory layers connected in parallel, and an attention mechanism layer and a second branch full-connection layer which are connected with the long-short-term memory layers in series;
And carrying out repeated iterative training on the neural network model according to the training set, and adjusting model parameters of the neural network model according to the training result of each iterative training to obtain the loss probability prediction model.
In one embodiment, obtaining the training set includes:
Acquiring original service data of a sample user in a sampling time period;
Performing time sequence analysis on the original service data to obtain time sequence data corresponding to the original service data, wherein the time sequence data is used as sample service data of a sample user in a sampling time period;
adding a corresponding loss classification label to the sample service data according to the data characteristics of the original service data, and obtaining a training sample according to the sample service data and the corresponding loss classification label;
a training set is obtained based on the plurality of training samples.
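As a purely illustrative aid (not part of the claimed method), the following Python sketch shows one way such a labeled training set could be assembled. The 90-day sampling window, the field names, and the churn-labeling rule (no activity in the final 30 days of the window) are assumptions introduced here for illustration, not the patent's rule.

import numpy as np

def build_training_set(raw_records):
    # raw_records: {user_id: list of (day_index, amount) transactions} -- assumed layout
    training_set = []
    for user, txns in raw_records.items():
        # Time-series analysis step: aggregate raw transactions into a daily
        # series over a 90-day sampling window (window length assumed).
        series = np.zeros(90)
        for day, amount in txns:
            if 0 <= day < 90:
                series[day] += amount
        # Churn label derived from the data characteristics: label 1 (churned)
        # if the user has no activity in the final 30 days -- an assumed rule.
        label = 1 if series[-30:].sum() == 0 else 0
        training_set.append((series, label))
    return training_set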
In one embodiment, performing multiple rounds of iterative training on the neural network model according to the training set and adjusting the model parameters of the neural network model according to the training result of each round to obtain the loss probability prediction model includes:
Acquiring a verification set corresponding to the training set; the verification set comprises a plurality of verification samples, each verification sample comprises sample service data of a sample user in a sampling time period and a loss classification label corresponding to the sample service data, and the loss classification label is used for representing whether the corresponding sample user is a lost user or a non-lost user;
Performing iterative training on the neural network model according to the training set, and adjusting model parameters of the neural network model according to a training result of the iterative training;
According to the verification set, acquiring a first result error of the neural network model before the model parameters are adjusted and a second result error of the neural network model after the model parameters are adjusted; the first result error and the second result error are used for representing the degree of difference between the processing result obtained by processing the verification sample by the neural network model and the corresponding loss classification label;
Storing the adjusted model parameters when the second result error is smaller than the first result error, and storing the model parameters before adjustment when the second result error is not smaller than the first result error;
returning to execute the step of performing one-time iterative training on the neural network model according to the training set under the condition that the iterative training times are smaller than the preset iterative times, and adjusting model parameters of the neural network model according to the training result of one-time iterative training;
And under the condition that the number of iterative training rounds is not less than the preset number of iterations, obtaining a loss probability prediction model according to the most recently saved model parameters.
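As a purely illustrative aid, the following Python (PyTorch) sketch shows one possible implementation of the validation-gated iterative training described above; the function names and the use of mean validation loss as the "result error" are assumptions.

import copy
import torch

def train_with_validation(model, loss_fn, optimizer, train_loader, val_loader, n_iters):
    # Mean validation loss stands in for the "result error" -- an assumption.
    def val_error(m):
        m.eval()
        total, n = 0.0, 0
        with torch.no_grad():
            for x, y in val_loader:
                total += loss_fn(m(x), y).item() * len(y)
                n += len(y)
        return total / n

    best_state = copy.deepcopy(model.state_dict())   # latest saved parameters
    for _ in range(n_iters):                         # preset number of iterations
        err_before = val_error(model)                # first result error
        model.train()
        for x, y in train_loader:                    # one iterative training pass
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
        err_after = val_error(model)                 # second result error
        if err_after < err_before:
            best_state = copy.deepcopy(model.state_dict())  # keep adjusted parameters
        else:
            model.load_state_dict(best_state)        # restore parameters before adjustment
    model.load_state_dict(best_state)
    return model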
In one embodiment, performing an iterative training on the neural network model according to the training set, and adjusting model parameters of the neural network model according to a training result of the iterative training, including:
the training samples in the training set are simultaneously input into a first branch and a second branch of the neural network model for processing, and first characteristic information and second characteristic information corresponding to the training samples are respectively obtained;
inputting the first characteristic information and the second characteristic information corresponding to the training samples into a batch normalization processing layer of the neural network model to obtain sample prediction results corresponding to the training samples output by a last trunk full-connection layer;
comparing the sample prediction result corresponding to the training sample with the loss classification label in the training sample, and adjusting model parameters of the neural network model according to the comparison result.
In one embodiment, the method further comprises:
obtaining loss probability prediction results of a plurality of target users by adopting a loss probability prediction model;
arranging the plurality of target users in descending order of the loss probability prediction results to obtain a user sequence;
determining a target loss level corresponding to each target user in the user sequence according to a preset dividing proportion;
and determining the target processing strategy corresponding to each target user according to the mapping relation between the loss level and the processing strategy.
In a second aspect, the present application further provides a customer churn probability prediction apparatus, including:
The acquisition module is used for acquiring target business data of a target user in a preset time period; the target service data is time sequence data;
the processing module is used for processing the target service data through a first branch of the loss probability prediction model to obtain first characteristic information; the first branch comprises a convolution layer and a first branch full-connection layer;
The processing module is also used for processing the target service data through a second branch of the loss probability prediction model to obtain second characteristic information; the second branch comprises a long short-term memory layer, an attention mechanism layer and a second branch full-connection layer;
The processing module is also used for processing the first characteristic information and the second characteristic information through the trunk of the loss probability prediction model to obtain a loss probability prediction result of the target user; the trunk comprises a batch normalization processing layer and at least one trunk full connection layer.
In a third aspect, the present application also provides a computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring target business data of a target user in a preset time period; the target service data is time sequence data;
processing the target service data through a first branch of the loss probability prediction model to obtain first characteristic information; the first branch comprises a convolution layer and a first branch full-connection layer;
processing the target service data through a second branch of the loss probability prediction model to obtain second characteristic information; the second branch comprises a long short-term memory layer, an attention mechanism layer and a second branch full-connection layer;
processing the first characteristic information and the second characteristic information through a trunk of the loss probability prediction model to obtain a loss probability prediction result of the target user; the trunk comprises a batch normalization processing layer and at least one trunk full connection layer.
In a fourth aspect, the present application also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
acquiring target business data of a target user in a preset time period; the target service data is time sequence data;
processing the target service data through a first branch of the loss probability prediction model to obtain first characteristic information; the first branch comprises a convolution layer and a first branch full-connection layer;
processing the target service data through a second branch of the loss probability prediction model to obtain second characteristic information; the second branch comprises a long short-term memory layer, an attention mechanism layer and a second branch full-connection layer;
processing the first characteristic information and the second characteristic information through a trunk of the loss probability prediction model to obtain a loss probability prediction result of the target user; the trunk comprises a batch normalization processing layer and at least one trunk full connection layer.
In a fifth aspect, the application also provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of:
acquiring target business data of a target user in a preset time period; the target service data is time sequence data;
processing the target service data through a first branch of the loss probability prediction model to obtain first characteristic information; the first branch comprises a convolution layer and a first branch full-connection layer;
processing the target service data through a second branch of the loss probability prediction model to obtain second characteristic information; the second branch comprises a long short-term memory layer, an attention mechanism layer and a second branch full-connection layer;
processing the first characteristic information and the second characteristic information through a trunk of the loss probability prediction model to obtain a loss probability prediction result of the target user; the trunk comprises a batch normalization processing layer and at least one trunk full connection layer.
The customer churn probability prediction method, apparatus, computer device, storage medium and computer program product acquire target business data of a target user in a preset time period; the target service data is time sequence data; the target service data is processed through a first branch of the loss probability prediction model to obtain first characteristic information; the first branch comprises a convolution layer and a first branch full-connection layer; the target service data is processed through a second branch of the loss probability prediction model to obtain second characteristic information; the second branch comprises a long short-term memory layer, an attention mechanism layer and a second branch full-connection layer; and the first characteristic information and the second characteristic information are processed through a trunk of the loss probability prediction model to obtain a loss probability prediction result of the target user; the trunk comprises a batch normalization processing layer and at least one trunk full connection layer. The target business data of the target user are thus processed by a deep neural network model with a parallel convolution path and a deep long short-term memory path to predict the churn probability of the target user, which improves the accuracy of judging customer churn.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the related art, the drawings that are required to be used in the embodiments or the related technical descriptions will be briefly described, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to the drawings without inventive effort for those skilled in the art.
FIG. 1 is a diagram of an application environment for a customer churn probability prediction method in one embodiment;
FIG. 2 is a flow chart of a method for predicting customer churn probability in one embodiment;
FIG. 3 is a schematic diagram of a loss probability prediction model structure in one embodiment;
FIG. 4 is a schematic diagram of a CNN structure according to an embodiment;
FIG. 5 is a logic diagram of a client churn probability prediction method in one embodiment;
FIG. 6 is a block diagram of a customer churn probability prediction apparatus in one embodiment;
Fig. 7 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The client churn probability prediction method provided by the embodiment of the application can be applied to an application environment shown in figure 1. Wherein the terminal 102 communicates with the server 104 via a network. The data storage system may store data that the server 104 needs to process. The data storage system may be integrated on the server 104 or may be located on a cloud or other network server. The terminal 102 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, internet of things devices, and portable wearable devices, where the internet of things devices may be smart speakers, smart televisions, smart air conditioners, smart vehicle devices, and the like. The portable wearable device may be a smart watch, smart bracelet, headset, or the like. The server 104 may be implemented as a stand-alone server or as a server cluster of multiple servers.
In an exemplary embodiment, as shown in fig. 2, a method for predicting a customer loss probability is provided, and the method is applied to the terminal 102 in fig. 1, for example, and includes the following steps 202 to 208. Wherein:
step 202, obtaining target business data of a target user in a preset time period; the target service data is time series data.
The target business data are time series data obtained by adopting a time series analysis method based on information such as transaction behaviors, purchase histories, consumption habits and the like of a target user in a preset time period. The preset time period may be, but is not limited to, set to the last three months.
Optionally, for a target user to be subjected to the attrition probability prediction, acquiring target service data in a preset time period as input data of the attrition probability prediction model.
Step 204, processing the target service data through a first branch of the loss probability prediction model to obtain first characteristic information; the first branch comprises a convolution layer and a first branch full-connection layer.
The first branch is a CNN path, namely a convolutional neural network path, and comprises a plurality of CNN blocks, wherein each CNN Block comprises a convolutional layer.
Optionally, the attrition probability prediction model adopts a parallel hybrid deep neural network structure, as shown in fig. 3, comprising a deep CNN path and a deep LSTM path, followed by a fusion portion, with the customer attrition probability finally predicted through full connection. The CNN path uses deep one-dimensional convolution layers to extract spatial features from historical transaction data; the LSTM path uses LSTM layers to extract multi-dimensional time-series features from the historical transaction data, while an attention mechanism is introduced to extract salient fine-grained features and reinforce important information. The deep feature information extracted by the two parts is then combined to predict customer churn.
The main architecture of the CNN path consists of a time window, batch normalization, and a deep one-dimensional convolution structure. For example, five 1DCNN blocks may be stacked in the CNN path for feature extraction. Compared with a 2DCNN, a 1DCNN moves the convolution kernel only along the time axis rather than along both the dimension axis and the time axis, so general features are easier to learn from time-series data; 1DCNN is therefore chosen to process the historical transaction data. Each 1DCNN block consists of a convolution layer and a max pooling layer, and the max pooling operation reduces the size of the feature map and markedly reduces computational complexity. The five 1DCNN layers are designed with reference to mainstream architectural ideas such as VGGNet: as the depth of the CNN increases, the convolution kernel size should become smaller and the number of feature channels larger, so that the receptive field of the feature map finally generated by the convolution layers is large enough to cover the sequence length and capture continuous information, and no information from any frame in the sequence is lost.
Considering that the internal covariate shift problem of the 1DCNN becomes more serious as the depth gradually increases, a batch normalization (Batch Normalization, BN) layer is adopted and placed before the convolution layer, so that the convolution layer is more stable and extracts better feature representations. The structure of the CNN part is shown in fig. 4.
Finally, so that the result can be combined with the tensor generated by the LSTM path, the feature map is converted to a specific dimension using two fully-connected layers: the convolved feature map is flattened as the input of the first fully-connected layer, and dropout is applied between the two fully-connected layers, which effectively prevents overfitting. All layers use ReLU as the activation function, and Kaiming initializers are used for weight initialization.
The structural parameters of the 1DCNN path are shown in table 1 below.
TABLE 1
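Since the body of Table 1 is not reproduced in this text, the following PyTorch sketch of the 1DCNN path is illustrative only: the channel counts, kernel sizes, input sequence length and output dimension are assumed values rather than the patent's parameters, while the overall structure (five 1DCNN blocks, batch normalization before each convolution, max pooling, two branch fully-connected layers with dropout, ReLU activations and Kaiming initialization) follows the description above.

import torch
import torch.nn as nn

class CNNBranch(nn.Module):
    def __init__(self, in_channels=8, seq_len=90, out_dim=64):
        super().__init__()
        layers = []
        # Five 1DCNN blocks; kernel size shrinks and channel count grows with
        # depth, and batch normalization precedes each convolution, as described.
        channels = [in_channels, 16, 32, 64, 128, 256]   # assumed values
        kernels = [7, 7, 5, 5, 3]                        # assumed values
        for i in range(5):
            layers += [
                nn.BatchNorm1d(channels[i]),
                nn.Conv1d(channels[i], channels[i + 1], kernels[i], padding=kernels[i] // 2),
                nn.ReLU(),
                nn.MaxPool1d(2),   # max pooling shrinks the feature map
            ]
        self.conv = nn.Sequential(*layers)
        flat = channels[-1] * (seq_len // 2 ** 5)
        # Two branch fully-connected layers with dropout in between.
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(flat, 128), nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(128, out_dim), nn.ReLU(),
        )
        for m in self.modules():   # Kaiming initialization for weights
            if isinstance(m, (nn.Conv1d, nn.Linear)):
                nn.init.kaiming_normal_(m.weight)

    def forward(self, x):
        # x: (batch, channels, time) time-series business data
        return self.fc(self.conv(x))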
Step 206, processing the target service data through a second branch of the loss probability prediction model to obtain second characteristic information; the second branch comprises a long short-term memory layer, an attention mechanism layer and a second branch full-connection layer.
The second branch is an LSTM path, namely a long-short-term memory network path, and comprises a plurality of LSTM layers which are connected in parallel, and an attention mechanism layer is added.
Optionally, the loss probability prediction model adopts a parallel hybrid deep neural network structure, as shown in fig. 3; the LSTM path is formed by a long short-term memory network based on an attention mechanism, with three LSTM layers stacked together, the number of units in each layer being 32, 16 and 16 respectively. At present, most LSTM models for time-series prediction use only the information in the last time step and transform its dimension to make the prediction, but the features of the last time step cannot fully cover all of the information. Therefore, to improve long-term memory capacity in the absence of suitable prior information, the invention proposes to improve the LSTM network with a self-attention mechanism. The attention mechanism originates from the human visual system and was first applied in the image field: when recognizing an image, people focus on certain regions, which means the brain assigns different weights to different regions of the image according to their importance. Attending to the essential information in this way improves computational efficiency and the effectiveness of information extraction.
The time series is processed with the attention mechanism as follows. The original data sequence is input into the LSTM layers for learning, and the features learned by the LSTM for one sample can be expressed as H = {h_1, h_2, ..., h_d}. The learned sequence features are used as the input of the attention mechanism to obtain attention weights, which represent the importance of the sequence features and time steps. The importance of the i-th input h_i is computed as:
s_i = Φ(W·h_i + b)
where W and b are the weight and bias, and Φ(·) is the activation function. After the importance of the i-th feature vector is obtained, the importance weights are normalized by a softmax function:
a_i = exp(s_i) / Σ_j exp(s_j)
The sequence features and the attention weights are then combined, and the final output feature o is expressed as:
o = A ⊙ H
where A = {a_1, a_2, ..., a_d} and ⊙ denotes element-wise multiplication. In this way, more important feature information and time steps are attended to and assigned large weights, which increases their contribution and improves prediction performance.
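For illustration, a minimal PyTorch sketch of this attention-based LSTM path follows. The three stacked LSTM layers with 32, 16 and 16 units come from the description above; the choice of tanh for Φ, the input feature count, the sequence length and the output dimension are assumptions.

import torch
import torch.nn as nn

class LSTMAttentionBranch(nn.Module):
    def __init__(self, in_features=8, seq_len=90, out_dim=64):
        super().__init__()
        # Three stacked LSTM layers with 32, 16 and 16 units respectively.
        self.lstm1 = nn.LSTM(in_features, 32, batch_first=True)
        self.lstm2 = nn.LSTM(32, 16, batch_first=True)
        self.lstm3 = nn.LSTM(16, 16, batch_first=True)
        # s_i = Φ(W·h_i + b); tanh is assumed for the activation Φ.
        self.score = nn.Sequential(nn.Linear(16, 1), nn.Tanh())
        # Second-branch fully-connected layer converting features to out_dim.
        self.fc = nn.Sequential(nn.Flatten(), nn.Linear(seq_len * 16, out_dim), nn.ReLU())

    def forward(self, x):
        # x: (batch, time, features) time-series business data
        h, _ = self.lstm1(x)
        h, _ = self.lstm2(h)
        h, _ = self.lstm3(h)            # H = {h_1, ..., h_d}
        s = self.score(h)               # importance s_i of each time step
        a = torch.softmax(s, dim=1)     # a_i = exp(s_i) / Σ_j exp(s_j)
        o = a * h                       # o = A ⊙ H (weights broadcast element-wise)
        return self.fc(o)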
Step 208, processing the first characteristic information and the second characteristic information through the trunk of the loss probability prediction model to obtain a loss probability prediction result of the target user; the trunk comprises a batch normalization processing layer and at least one trunk full connection layer.
Optionally, the loss probability prediction model adopts a parallel hybrid deep neural network structure, as shown in fig. 3, and finally uses two full-connection layers and a ReLU activation function to perform nonlinear transformation on characteristic information, and converts the characteristic information into a specific dimension to be fused.
Most current data-driven customer churn prediction methods are based on a single model. Under complex scenarios, however, it is difficult for a single model to comprehensively extract feature information while maintaining good generalization. Some hybrid models fuse CNN and LSTM in series, but when the CNN serves as the feature extractor, the information it learns strongly influences LSTM training, making parameter tuning difficult. The present invention therefore merges the two paths in parallel. The fusion layer is constructed as follows: the outputs of the CNN path and the LSTM path are flattened to the same dimension, the two tensors are concatenated (Concatenate), and the information in the two parallel paths is merged and propagated forward; the merged data are processed by the BN layer and then input into two full-connection layers, and the output node of the last layer is the estimated customer churn probability.
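A minimal PyTorch sketch of this fusion trunk follows, purely for illustration; the hidden width and feature dimensions are assumptions, and a sigmoid is assumed on the last node to map the output to a probability between 0 and 1.

import torch
import torch.nn as nn

class FusionTrunk(nn.Module):
    def __init__(self, cnn_dim=64, lstm_dim=64):
        super().__init__()
        # Batch normalization over the concatenated branch features.
        self.bn = nn.BatchNorm1d(cnn_dim + lstm_dim)
        # Two trunk fully-connected layers; the last node is the churn probability.
        self.fc = nn.Sequential(
            nn.Linear(cnn_dim + lstm_dim, 32), nn.ReLU(),
            nn.Linear(32, 1), nn.Sigmoid(),   # sigmoid assumed to map to (0, 1)
        )

    def forward(self, feat_cnn, feat_lstm):
        # Flattened branch outputs are concatenated and passed forward.
        fused = torch.cat([feat_cnn, feat_lstm], dim=1)
        return self.fc(self.bn(fused)).squeeze(1)

In use, the outputs of the two branch sketches above would be fed in parallel into this trunk, e.g. prob = trunk(cnn_branch(x_cnn), lstm_branch(x_lstm)).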
In the customer churn probability prediction method, target business data of a target user in a preset time period are acquired; the target service data is time sequence data; the target service data is processed through a first branch of the loss probability prediction model to obtain first characteristic information; the first branch comprises a convolution layer and a first branch full-connection layer; the target service data is processed through a second branch of the loss probability prediction model to obtain second characteristic information; the second branch comprises a long short-term memory layer, an attention mechanism layer and a second branch full-connection layer; and the first characteristic information and the second characteristic information are processed through a trunk of the loss probability prediction model to obtain a loss probability prediction result of the target user; the trunk comprises a batch normalization processing layer and at least one trunk full connection layer. The target business data of the target user are processed by a deep neural network model with a parallel convolution path and a deep long short-term memory path to predict the churn probability of the target user, improving the accuracy of judging customer churn.
In one embodiment, a client churn probability prediction method, as shown in fig. 5, includes:
Acquiring original service data of a sample user in a sampling time period; performing time sequence analysis on the original service data to obtain time sequence data corresponding to the original service data, wherein the time sequence data is used as sample service data of a sample user in a sampling time period; adding a corresponding loss classification label to the sample service data according to the data characteristics of the original service data, and obtaining a training sample according to the sample service data and the corresponding loss classification label; a training set is obtained based on the plurality of training samples. The training set comprises a plurality of training samples, each training sample comprises sample service data of a sample user in a sampling time period and a loss classification label corresponding to the sample service data, and the loss classification label is used for representing whether the corresponding sample user is a lost user or a non-lost user.
Acquiring a neural network model, wherein the neural network model comprises a first branch, a second branch, a batch normalization processing layer and at least one trunk full-connection layer; the first branch and the second branch are connected to the batch normalization processing layer in parallel, and the batch normalization processing layer is connected with at least one trunk full-connection layer in series; the first branch comprises a plurality of convolution layers and a first branch full-connection layer which are connected in series; the second branch comprises a plurality of long-short-term memory layers connected in parallel, and an attention mechanism layer and a second branch full-connection layer which are connected with the long-short-term memory layers in series.
Acquiring a verification set corresponding to the training set; the verification set comprises a plurality of verification samples, each verification sample comprises sample service data of a sample user in a sampling time period and a loss classification label corresponding to the sample service data, and the loss classification label is used for representing whether the corresponding sample user is a lost user or a non-lost user.
Performing one iteration training on the neural network model: the training samples in the training set are simultaneously input into a first branch and a second branch of the neural network model for processing, and first characteristic information and second characteristic information corresponding to the training samples are respectively obtained; inputting the first characteristic information and the second characteristic information corresponding to the training samples into a batch normalization processing layer of the neural network model to obtain sample prediction results corresponding to the training samples output by a last trunk full-connection layer; comparing the sample prediction result corresponding to the training sample with the loss classification label in the training sample, and adjusting model parameters of the neural network model according to the comparison result.
According to the verification set, acquiring a first result error of the neural network model before the model parameters are adjusted and a second result error of the neural network model after the model parameters are adjusted; the first result error and the second result error are used for representing the degree of difference between the processing result obtained by processing the verification sample by the neural network model and the corresponding loss classification label.
And storing the adjusted model parameters when the second result error is smaller than the first result error, and storing the model parameters before adjustment when the second result error is not smaller than the first result error.
And returning to execute the step of performing one-time iterative training on the neural network model according to the training set under the condition that the iterative training times are smaller than the preset iterative times, and adjusting the model parameters of the neural network model according to the training result of one-time iterative training.
And under the condition that the number of iterative training rounds is not less than the preset number of iterations, obtaining a loss probability prediction model according to the most recently saved model parameters.
Acquiring target business data of a target user in a preset time period; the target service data is time sequence data; and inputting the target service data into the loss probability prediction model to obtain a loss probability prediction result of the target user output by the loss probability prediction model.
Specifically, the historical customer transaction data are first preprocessed and then divided into a training set, a verification set and a test set. A model is built according to the parallel hybrid deep neural network structure, and the time-series data are input directly into the parallel hybrid deep neural network without manual feature extraction, predicting customer churn end to end.
When the training set and the verification set are acquired, the data used are the transaction flows of historical bank customers, which record information such as the customers' transaction behaviors, purchase history and consumption habits. These data reflect the customers' usage of products or services, and by analyzing these behavior patterns and trends it can be inferred whether a customer is likely to churn. Since transaction flows are time-sequential, the historical change trends must be mined and extracted with time-series analysis, and sufficient data preprocessing must be performed after the historical transaction data are collected to enable end-to-end modeling. A training set, a verification set and a test set are established from the transaction data of historical bank customers using the hold-out method with a 7:2:1 split: 70% of customers are randomly drawn as the training set to train the machine learning model; 20% of customers serve as the verification set for parameter selection and tuning; and the remaining 10% of customers serve as the test set to evaluate the prediction accuracy of the machine learning model. To eliminate dimensional differences in the time-series data, provide a consistent data scale, and improve the convergence speed and stability of the model, Normalization is performed on each sample with the following formula:
X_norm = (X − X_min) / (X_max − X_min)
where X_norm is the normalized data sequence, X is the original data sequence, X_max is the maximum value in the data, and X_min is the minimum value in the data.
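Purely as an illustration of this preprocessing, the sketch below applies the per-sample min-max normalization above and the 7:2:1 hold-out split; the function names are assumptions.

import numpy as np

def normalize(x):
    # X_norm = (X - X_min) / (X_max - X_min), applied per sample.
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)

def holdout_split(samples, seed=0):
    # Random 7:2:1 hold-out split into training, validation and test sets.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(samples))
    n_train = int(0.7 * len(samples))   # 70% of customers: training set
    n_val = int(0.2 * len(samples))     # 20% of customers: validation set
    train = [samples[i] for i in idx[:n_train]]
    val = [samples[i] for i in idx[n_train:n_train + n_val]]
    test = [samples[i] for i in idx[n_train + n_val:]]   # remaining ~10%: test set
    return train, val, test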
In this embodiment, the target service data of the target user is processed through the deep neural network model with parallel convolution paths and deep long-short-term memory paths, so that the loss probability of the target user is predicted, and the accuracy of judging the loss condition of the client can be improved.
In one embodiment, the method further comprises: obtaining loss probability prediction results of a plurality of target users by adopting the loss probability prediction model; arranging the plurality of target users in descending order of the loss probability prediction results to obtain a user sequence; determining a target loss level corresponding to each target user in the user sequence according to a preset dividing proportion; and determining the target processing strategy corresponding to each target user according to the mapping relation between the loss level and the processing strategy.
Optionally, in practical application, the customer transaction flow data of the past three months are used as the input for customer churn prediction. The three months of customer flow data are first normalized and then input directly into the model in time-step order. The model predicts customer churn according to the correspondence it has learned between change trends and churn probability; its output is a probability between 0 and 1 representing the likelihood of churn, with larger values indicating a higher probability of churn. After the churn probabilities are obtained, corresponding strategies can be applied to different kinds of customers: for customers with high churn probability (top 30%), the account manager is notified in time to follow up and win them back; for customers with medium churn probability (top 30%-60%), their activity is tracked regularly, and if further churn appears likely, they are followed up in advance; for customers with low churn probability (bottom 40%), normal marketing strategies are adopted.
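The tiering rule above can be made concrete with the following illustrative Python sketch; the function name and the exact wording of the strategies are assumptions, while the 30% / 30%-60% / 40% cut-offs and the level-to-strategy mapping follow the text.

def assign_strategies(user_probs):
    # user_probs: {user_id: predicted churn probability}
    ranked = sorted(user_probs, key=user_probs.get, reverse=True)   # descending order
    n = len(ranked)
    hi, mid = int(0.3 * n), int(0.6 * n)   # preset dividing proportions
    strategy = {}
    for rank, user in enumerate(ranked):
        if rank < hi:        # top 30%: high churn probability
            strategy[user] = "notify the account manager to follow up and retain"
        elif rank < mid:     # top 30%-60%: medium churn probability
            strategy[user] = "track activity regularly; follow up early if risk grows"
        else:                # bottom 40%: low churn probability
            strategy[user] = "apply the normal marketing strategy"
    return strategy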
In this embodiment, the user's transaction data are analyzed by the deep neural network model with parallel convolution and deep long short-term memory paths to predict the probability that a customer may churn and to identify potentially churning customers, helping the bank take timely measures to retain them and optimize its marketing strategy, which can effectively reduce customer churn and resource loss.
In one embodiment, a training method of a loss probability prediction model includes:
Acquiring original service data of a sample user in a sampling time period; performing time sequence analysis on the original service data to obtain time sequence data corresponding to the original service data, wherein the time sequence data is used as sample service data of a sample user in a sampling time period; adding a corresponding loss classification label to the sample service data according to the data characteristics of the original service data, and obtaining a training sample according to the sample service data and the corresponding loss classification label; a training set is obtained based on the plurality of training samples. The training set comprises a plurality of training samples, each training sample comprises sample service data of a sample user in a sampling time period and a loss classification label corresponding to the sample service data, and the loss classification label is used for representing whether the corresponding sample user is a lost user or a non-lost user.
Acquiring a neural network model, wherein the neural network model comprises a first branch, a second branch, a batch normalization processing layer and at least one trunk full-connection layer; the first branch and the second branch are connected to the batch normalization processing layer in parallel, and the batch normalization processing layer is connected with at least one trunk full-connection layer in series; the first branch comprises a plurality of convolution layers and a first branch full-connection layer which are connected in series; the second branch comprises a plurality of long-short-term memory layers connected in parallel, and an attention mechanism layer and a second branch full-connection layer which are connected with the long-short-term memory layers in series.
Acquiring a verification set corresponding to the training set; the verification set comprises a plurality of verification samples, each verification sample comprises sample service data of a sample user in a sampling time period and a loss classification label corresponding to the sample service data, and the loss classification label is used for representing whether the corresponding sample user is a lost user or a non-lost user.
Performing one iteration training on the neural network model: the training samples in the training set are simultaneously input into a first branch and a second branch of the neural network model for processing, and first characteristic information and second characteristic information corresponding to the training samples are respectively obtained; inputting the first characteristic information and the second characteristic information corresponding to the training samples into a batch normalization processing layer of the neural network model to obtain sample prediction results corresponding to the training samples output by a last trunk full-connection layer; comparing the sample prediction result corresponding to the training sample with the loss classification label in the training sample, and adjusting model parameters of the neural network model according to the comparison result.
According to the verification set, acquiring a first result error of the neural network model before the model parameters are adjusted and a second result error of the neural network model after the model parameters are adjusted; the first result error and the second result error are used for representing the degree of difference between the processing result obtained by processing the verification sample by the neural network model and the corresponding loss classification label.
And storing the adjusted model parameters when the second result error is smaller than the first result error, and storing the model parameters before adjustment when the second result error is not smaller than the first result error.
And returning to execute the step of performing one-time iterative training on the neural network model according to the training set under the condition that the iterative training times are smaller than the preset iterative times, and adjusting the model parameters of the neural network model according to the training result of one-time iterative training.
And under the condition that the number of iterative training rounds is not less than the preset number of iterations, obtaining a loss probability prediction model according to the most recently saved model parameters.
In one embodiment, a method for predicting and processing customer churn probability includes:
Acquiring target business data of a target user in a preset time period; the target service data is time series data.
Inputting the target business data into the loss probability prediction model; processing the target service data through a first branch of the loss probability prediction model to obtain first characteristic information; the first branch comprises a convolution layer and a first branch full-connection layer; processing the target service data through a second branch of the loss probability prediction model to obtain second characteristic information; the second branch comprises a long short-term memory layer, an attention mechanism layer and a second branch full-connection layer; and processing the first characteristic information and the second characteristic information through a trunk of the loss probability prediction model to obtain a loss probability prediction result of the target user; the trunk comprises a batch normalization processing layer and at least one trunk full connection layer.
The same method is adopted to obtain loss probability prediction results of a plurality of users; the plurality of users are arranged in descending order of the loss probability prediction results to obtain a user sequence; a target loss level corresponding to each user in the user sequence is determined according to a preset dividing proportion; and a target processing strategy corresponding to each user is determined according to the mapping relation between the loss level and the processing strategy.
It should be understood that, although the steps in the flowcharts related to the embodiments described above are shown sequentially as indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include multiple steps or stages, which are not necessarily performed at the same moment but may be performed at different moments; nor is their order necessarily sequential, as they may be performed in turn or alternately with at least some of the other steps or stages.
Based on the same inventive concept, the embodiment of the application also provides a customer loss probability prediction device for realizing the above-mentioned customer loss probability prediction method. The implementation of the solution provided by the apparatus is similar to the implementation described in the above method, so the specific limitation in the embodiments of the apparatus for predicting customer loss probability provided below may be referred to the limitation of the method for predicting customer loss probability hereinabove, and will not be described herein.
In an exemplary embodiment, as shown in fig. 6, there is provided a customer churn probability prediction apparatus 600, including: an acquisition module 601 and a processing module 602, wherein:
An obtaining module 601, configured to obtain target service data of a target user in a preset time period; the target service data is time series data.
The processing module 602 is configured to process the target service data through a first branch of the loss probability prediction model to obtain first feature information; the first branch comprises a convolution layer and a first branch full-connection layer.
The processing module 602 is further configured to process the target service data through a second branch of the loss probability prediction model to obtain second feature information; the second branch comprises a long short-term memory layer, an attention mechanism layer and a second branch full-connection layer.
The processing module 602 is further configured to process the first feature information and the second feature information through a trunk of the loss probability prediction model, so as to obtain a loss probability prediction result of the target user; the trunk comprises a batch normalization processing layer and at least one trunk full connection layer.
In one embodiment, the apparatus further comprises:
A training module 603, configured to obtain a training set; the training set comprises a plurality of training samples, each training sample comprises sample service data of a sample user in a sampling time period and a loss classification label corresponding to the sample service data, and the loss classification label is used for representing whether the corresponding sample user is a lost user or a non-lost user; acquiring a neural network model, wherein the neural network model comprises a first branch, a second branch, a batch normalization processing layer and at least one trunk full-connection layer; the first branch and the second branch are connected to the batch normalization processing layer in parallel, and the batch normalization processing layer is connected with at least one trunk full-connection layer in series; the first branch comprises a plurality of convolution layers and a first branch full-connection layer which are connected in series; the second branch comprises a plurality of long-short-term memory layers connected in parallel, and an attention mechanism layer and a second branch full-connection layer which are connected with the long-short-term memory layers in series; and carrying out repeated iterative training on the neural network model according to the training set, and adjusting model parameters of the neural network model according to the training result of each iterative training to obtain the loss probability prediction model.
In one embodiment, the training module 603 is further configured to obtain raw service data of the sample user during the sampling period; performing time sequence analysis on the original service data to obtain time sequence data corresponding to the original service data, wherein the time sequence data is used as sample service data of a sample user in a sampling time period; adding a corresponding loss classification label to the sample service data according to the data characteristics of the original service data, and obtaining a training sample according to the sample service data and the corresponding loss classification label; a training set is obtained based on the plurality of training samples.
In one embodiment, the training module 603 is further configured to obtain a verification set corresponding to the training set; the verification set comprises a plurality of verification samples, each verification sample comprises sample service data of a sample user in a sampling time period and a loss classification label corresponding to the sample service data, and the loss classification label is used for representing whether the corresponding sample user is a lost user or a non-lost user; performing iterative training on the neural network model according to the training set, and adjusting model parameters of the neural network model according to a training result of the iterative training; according to the verification set, acquiring a first result error of the neural network model before the model parameters are adjusted and a second result error of the neural network model after the model parameters are adjusted; the first result error and the second result error are used for representing the degree of difference between the processing result obtained by processing the verification sample by the neural network model and the corresponding loss classification label; storing the adjusted model parameters when the second result error is smaller than the first result error, and storing the model parameters before adjustment when the second result error is not smaller than the first result error; returning to execute the step of performing one-time iterative training on the neural network model according to the training set under the condition that the iterative training times are smaller than the preset iterative times, and adjusting model parameters of the neural network model according to the training result of one-time iterative training; and under the condition that the number of iterative training rounds is not less than the preset number of iterations, obtaining a loss probability prediction model according to the most recently saved model parameters.
In one embodiment, the training module 603 is further configured to input training samples in the training set into the first branch and the second branch of the neural network model for processing, so as to obtain first feature information and second feature information corresponding to the training samples respectively; inputting the first characteristic information and the second characteristic information corresponding to the training samples into a batch normalization processing layer of the neural network model to obtain sample prediction results corresponding to the training samples output by a last trunk full-connection layer; comparing the sample prediction result corresponding to the training sample with the loss classification label in the training sample, and adjusting model parameters of the neural network model according to the comparison result.
In one embodiment, the processing module 602 is further configured to obtain the attrition probability prediction results of the plurality of target users using the attrition probability prediction model; arranging the plurality of target users in descending order of the loss probability prediction results to obtain a user sequence; determining a target loss level corresponding to each target user in the user sequence according to a preset dividing proportion; and determining the target processing strategy corresponding to each target user according to the mapping relation between the loss level and the processing strategy.
The modules in the customer churn probability prediction apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.
In one exemplary embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 7. The computer device includes a processor, a memory, an Input/Output interface (I/O) and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing business data. The input/output interface of the computer device is used to exchange information between the processor and the external device. The communication interface of the computer device is used for communicating with an external terminal through a network connection. The computer program when executed by a processor implements a customer churn probability prediction method.
Those skilled in the art will appreciate that the structure shown in FIG. 7 is merely a block diagram of a portion of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one exemplary embodiment, a computer device is provided, comprising a memory and a processor, the memory storing a computer program, and the processor, when executing the computer program, implementing the following steps: acquiring target business data of a target user in a preset time period, the target business data being time-series data; processing the target business data through a first branch of a churn probability prediction model to obtain first feature information, the first branch comprising a convolution layer and a first branch fully connected layer; processing the target business data through a second branch of the churn probability prediction model to obtain second feature information, the second branch comprising a long short-term memory layer, an attention mechanism layer, and a second branch fully connected layer; and processing the first feature information and the second feature information through a trunk of the churn probability prediction model to obtain a churn probability prediction result for the target user, the trunk comprising a batch normalization layer and at least one trunk fully connected layer.
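To make the branch-and-trunk topology concrete, the following is a minimal sketch of such a network, assuming PyTorch; the layer counts, widths, kernel size, and sigmoid output are illustrative assumptions rather than details fixed by the application.

```python
import torch
import torch.nn as nn

class ChurnNet(nn.Module):
    """Illustrative two-branch churn model: a convolutional branch, an
    LSTM-plus-attention branch, and a batch-normalized trunk."""
    def __init__(self, n_features, seq_len, hidden=64):
        super().__init__()
        # first branch: stacked 1-D convolutions over time, then an FC layer
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, hidden, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.fc1 = nn.Linear(hidden * seq_len, hidden)
        # second branch: LSTM, attention weighting, then an FC layer
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)
        self.fc2 = nn.Linear(hidden, hidden)
        # trunk: batch normalization over the fused features, then FC head
        self.bn = nn.BatchNorm1d(2 * hidden)
        self.head = nn.Sequential(
            nn.Linear(2 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1),
        )

    def forward(self, x):                       # x: (batch, seq_len, n_features)
        f1 = self.conv(x.transpose(1, 2))       # first feature information
        f1 = self.fc1(f1.flatten(1))
        out, _ = self.lstm(x)
        weights = torch.softmax(self.attn(out), dim=1)
        f2 = self.fc2((weights * out).sum(dim=1))   # second feature information
        fused = self.bn(torch.cat([f1, f2], dim=1))
        return torch.sigmoid(self.head(fused)).squeeze(1)  # churn probability
```

Placing batch normalization at the fusion point keeps the convolutional and recurrent features on comparable scales before they enter the trunk's fully connected layers.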
In one embodiment, the processor, when executing the computer program, further implements the following steps: acquiring a training set, the training set comprising a plurality of training samples, each training sample comprising sample business data of a sample user in a sampling time period and a churn classification label corresponding to the sample business data, the churn classification label indicating whether the corresponding sample user is a churned user or a non-churned user; acquiring a neural network model comprising a first branch, a second branch, a batch normalization layer, and at least one trunk fully connected layer, wherein the first branch and the second branch are connected in parallel to the batch normalization layer, the batch normalization layer is connected in series with the at least one trunk fully connected layer, the first branch comprises a plurality of convolution layers connected in series and a first branch fully connected layer, and the second branch comprises a plurality of long short-term memory layers connected in parallel together with an attention mechanism layer and a second branch fully connected layer connected in series with the long short-term memory layers; and performing multiple rounds of iterative training on the neural network model according to the training set, adjusting the model parameters of the neural network model according to the result of each round, to obtain the churn probability prediction model.
In one embodiment, the processor, when executing the computer program, further implements the following steps: acquiring raw business data of a sample user in a sampling time period; performing time-series analysis on the raw business data to obtain time-series data corresponding to the raw business data as the sample business data of the sample user in the sampling time period; adding a corresponding churn classification label to the sample business data according to the data characteristics of the raw business data, and obtaining a training sample from the sample business data and the corresponding churn classification label; and obtaining the training set based on a plurality of such training samples.
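As a sketch of this sample-construction step, the helper below assumes each user's raw records have already been aggregated into a per-period feature matrix of at least `seq_len` rows; the fixed window length, the labeling rule passed in as `is_churned`, and the batch size are all illustrative assumptions.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

def build_training_set(raw_records, seq_len, is_churned, batch_size=64):
    """raw_records: user id -> array of shape (periods, n_features).
    Truncates each series to the last `seq_len` periods and attaches a
    churn classification label derived from the raw data."""
    xs, ys = [], []
    for user_id, series in raw_records.items():
        series = np.asarray(series, dtype=np.float32)[-seq_len:]
        xs.append(series)
        ys.append(1.0 if is_churned(user_id, series) else 0.0)
    dataset = TensorDataset(torch.tensor(np.stack(xs)), torch.tensor(ys))
    return DataLoader(dataset, batch_size=batch_size, shuffle=True)
```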
In one embodiment, the processor, when executing the computer program, further implements the following steps: acquiring a validation set corresponding to the training set, the validation set comprising a plurality of validation samples, each validation sample comprising sample business data of a sample user in a sampling time period and a churn classification label corresponding to the sample business data, the churn classification label indicating whether the corresponding sample user is a churned user or a non-churned user; performing one round of iterative training on the neural network model according to the training set, and adjusting the model parameters of the neural network model according to the training result of that round; acquiring, according to the validation set, a first result error of the neural network model before the model parameters are adjusted and a second result error after the model parameters are adjusted, the first and second result errors characterizing the degree of difference between the processing results obtained by the neural network model on the validation samples and the corresponding churn classification labels; saving the adjusted model parameters when the second result error is smaller than the first result error, and retaining the model parameters before adjustment when it is not; returning to the step of performing one round of iterative training when the number of completed training rounds is less than a preset number of iterations; and obtaining the churn probability prediction model from the most recently saved model parameters when the number of completed training rounds is not less than the preset number of iterations.
In one embodiment, the processor, when executing the computer program, further implements the following steps: inputting the training samples in the training set simultaneously into the first branch and the second branch of the neural network model for processing, to obtain respectively the first feature information and the second feature information corresponding to the training samples; inputting the first and second feature information into the batch normalization layer of the neural network model to obtain, from the last trunk fully connected layer, the sample prediction results corresponding to the training samples; and comparing the sample prediction results with the churn classification labels in the training samples, and adjusting the model parameters of the neural network model according to the comparison results.
In one embodiment, the processor, when executing the computer program, further implements the following steps: obtaining churn probability prediction results of a plurality of target users using the churn probability prediction model; arranging the target users in descending order of predicted churn probability to obtain a user sequence; determining a target churn level corresponding to each target user in the user sequence according to a preset division proportion; and determining a target processing strategy corresponding to each target user according to a mapping between churn levels and processing strategies.
In one embodiment, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, implements the following steps: acquiring target business data of a target user in a preset time period, the target business data being time-series data; processing the target business data through a first branch of a churn probability prediction model to obtain first feature information, the first branch comprising a convolution layer and a first branch fully connected layer; processing the target business data through a second branch of the churn probability prediction model to obtain second feature information, the second branch comprising a long short-term memory layer, an attention mechanism layer, and a second branch fully connected layer; and processing the first feature information and the second feature information through a trunk of the churn probability prediction model to obtain a churn probability prediction result for the target user, the trunk comprising a batch normalization layer and at least one trunk fully connected layer.
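For completeness, invoking a trained model of the kind sketched earlier on one target user's time-series business data might look as follows; `target_series` is assumed to be the preprocessed (seq_len, n_features) feature matrix for that user.

```python
# hypothetical single-user inference with the ChurnNet sketch above
model.eval()
with torch.no_grad():
    x = torch.tensor(target_series, dtype=torch.float32).unsqueeze(0)
    churn_probability = model(x).item()   # churn probability prediction result
```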
In one embodiment, the computer program, when executed by the processor, further implements the following steps: acquiring a training set, the training set comprising a plurality of training samples, each training sample comprising sample business data of a sample user in a sampling time period and a churn classification label corresponding to the sample business data, the churn classification label indicating whether the corresponding sample user is a churned user or a non-churned user; acquiring a neural network model comprising a first branch, a second branch, a batch normalization layer, and at least one trunk fully connected layer, wherein the first branch and the second branch are connected in parallel to the batch normalization layer, the batch normalization layer is connected in series with the at least one trunk fully connected layer, the first branch comprises a plurality of convolution layers connected in series and a first branch fully connected layer, and the second branch comprises a plurality of long short-term memory layers connected in parallel together with an attention mechanism layer and a second branch fully connected layer connected in series with the long short-term memory layers; and performing multiple rounds of iterative training on the neural network model according to the training set, adjusting the model parameters of the neural network model according to the result of each round, to obtain the churn probability prediction model.
In one embodiment, the computer program, when executed by the processor, further implements the following steps: acquiring raw business data of a sample user in a sampling time period; performing time-series analysis on the raw business data to obtain time-series data corresponding to the raw business data as the sample business data of the sample user in the sampling time period; adding a corresponding churn classification label to the sample business data according to the data characteristics of the raw business data, and obtaining a training sample from the sample business data and the corresponding churn classification label; and obtaining the training set based on a plurality of such training samples.
In one embodiment, the computer program, when executed by the processor, further implements the following steps: acquiring a validation set corresponding to the training set, the validation set comprising a plurality of validation samples, each validation sample comprising sample business data of a sample user in a sampling time period and a churn classification label corresponding to the sample business data, the churn classification label indicating whether the corresponding sample user is a churned user or a non-churned user; performing one round of iterative training on the neural network model according to the training set, and adjusting the model parameters of the neural network model according to the training result of that round; acquiring, according to the validation set, a first result error of the neural network model before the model parameters are adjusted and a second result error after the model parameters are adjusted, the first and second result errors characterizing the degree of difference between the processing results obtained by the neural network model on the validation samples and the corresponding churn classification labels; saving the adjusted model parameters when the second result error is smaller than the first result error, and retaining the model parameters before adjustment when it is not; returning to the step of performing one round of iterative training when the number of completed training rounds is less than a preset number of iterations; and obtaining the churn probability prediction model from the most recently saved model parameters when the number of completed training rounds is not less than the preset number of iterations.
In one embodiment, the computer program, when executed by the processor, further implements the following steps: inputting the training samples in the training set simultaneously into the first branch and the second branch of the neural network model for processing, to obtain respectively the first feature information and the second feature information corresponding to the training samples; inputting the first and second feature information into the batch normalization layer of the neural network model to obtain, from the last trunk fully connected layer, the sample prediction results corresponding to the training samples; and comparing the sample prediction results with the churn classification labels in the training samples, and adjusting the model parameters of the neural network model according to the comparison results.
In one embodiment, the computer program, when executed by the processor, further implements the following steps: obtaining churn probability prediction results of a plurality of target users using the churn probability prediction model; arranging the target users in descending order of predicted churn probability to obtain a user sequence; determining a target churn level corresponding to each target user in the user sequence according to a preset division proportion; and determining a target processing strategy corresponding to each target user according to a mapping between churn levels and processing strategies.
In one embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the following steps: acquiring target business data of a target user in a preset time period, the target business data being time-series data; processing the target business data through a first branch of a churn probability prediction model to obtain first feature information, the first branch comprising a convolution layer and a first branch fully connected layer; processing the target business data through a second branch of the churn probability prediction model to obtain second feature information, the second branch comprising a long short-term memory layer, an attention mechanism layer, and a second branch fully connected layer; and processing the first feature information and the second feature information through a trunk of the churn probability prediction model to obtain a churn probability prediction result for the target user, the trunk comprising a batch normalization layer and at least one trunk fully connected layer.
In one embodiment, the computer program, when executed by the processor, further implements the following steps: acquiring a training set, the training set comprising a plurality of training samples, each training sample comprising sample business data of a sample user in a sampling time period and a churn classification label corresponding to the sample business data, the churn classification label indicating whether the corresponding sample user is a churned user or a non-churned user; acquiring a neural network model comprising a first branch, a second branch, a batch normalization layer, and at least one trunk fully connected layer, wherein the first branch and the second branch are connected in parallel to the batch normalization layer, the batch normalization layer is connected in series with the at least one trunk fully connected layer, the first branch comprises a plurality of convolution layers connected in series and a first branch fully connected layer, and the second branch comprises a plurality of long short-term memory layers connected in parallel together with an attention mechanism layer and a second branch fully connected layer connected in series with the long short-term memory layers; and performing multiple rounds of iterative training on the neural network model according to the training set, adjusting the model parameters of the neural network model according to the result of each round, to obtain the churn probability prediction model.
In one embodiment, the computer program, when executed by the processor, further implements the following steps: acquiring raw business data of a sample user in a sampling time period; performing time-series analysis on the raw business data to obtain time-series data corresponding to the raw business data as the sample business data of the sample user in the sampling time period; adding a corresponding churn classification label to the sample business data according to the data characteristics of the raw business data, and obtaining a training sample from the sample business data and the corresponding churn classification label; and obtaining the training set based on a plurality of such training samples.
In one embodiment, the computer program, when executed by the processor, further implements the following steps: acquiring a validation set corresponding to the training set, the validation set comprising a plurality of validation samples, each validation sample comprising sample business data of a sample user in a sampling time period and a churn classification label corresponding to the sample business data, the churn classification label indicating whether the corresponding sample user is a churned user or a non-churned user; performing one round of iterative training on the neural network model according to the training set, and adjusting the model parameters of the neural network model according to the training result of that round; acquiring, according to the validation set, a first result error of the neural network model before the model parameters are adjusted and a second result error after the model parameters are adjusted, the first and second result errors characterizing the degree of difference between the processing results obtained by the neural network model on the validation samples and the corresponding churn classification labels; saving the adjusted model parameters when the second result error is smaller than the first result error, and retaining the model parameters before adjustment when it is not; returning to the step of performing one round of iterative training when the number of completed training rounds is less than a preset number of iterations; and obtaining the churn probability prediction model from the most recently saved model parameters when the number of completed training rounds is not less than the preset number of iterations.
In one embodiment, the computer program, when executed by the processor, further implements the following steps: inputting the training samples in the training set simultaneously into the first branch and the second branch of the neural network model for processing, to obtain respectively the first feature information and the second feature information corresponding to the training samples; inputting the first and second feature information into the batch normalization layer of the neural network model to obtain, from the last trunk fully connected layer, the sample prediction results corresponding to the training samples; and comparing the sample prediction results with the churn classification labels in the training samples, and adjusting the model parameters of the neural network model according to the comparison results.
In one embodiment, the computer program, when executed by the processor, further implements the following steps: obtaining churn probability prediction results of a plurality of target users using the churn probability prediction model; arranging the target users in descending order of predicted churn probability to obtain a user sequence; determining a target churn level corresponding to each target user in the user sequence according to a preset division proportion; and determining a target processing strategy corresponding to each target user according to a mapping between churn levels and processing strategies.
It should be noted that the user information (including but not limited to user device information and user personal information) and the data (including but not limited to data used for analysis, stored data, and displayed data) involved in the present application are information and data authorized by the user or fully authorized by all parties, and the collection, use, and processing of the related data must comply with the applicable regulations.
Those skilled in the art will appreciate that all or part of the processes of the methods described above may be implemented by a computer program stored on a non-volatile computer-readable storage medium, the computer program, when executed, performing the processes of the method embodiments described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase-change memory (PCM), graphene memory, and the like. Volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM may take various forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases involved in the embodiments provided herein may include at least one of a relational database and a non-relational database; non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors involved in the embodiments provided herein may be, but are not limited to, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, and data processing logic devices based on quantum computing.
The technical features of the above embodiments may be combined in any manner. For brevity, not all possible combinations of these technical features are described; however, any combination of these technical features that involves no contradiction should be considered within the scope of this specification.
The above embodiments express only a few implementations of the present application, and while they are described specifically and in detail, they should not be construed as limiting the scope of the application. It should be noted that those of ordinary skill in the art may make several variations and improvements without departing from the concept of the present application, all of which fall within the protection scope of the application. Accordingly, the protection scope of the application shall be subject to the appended claims.

Claims (10)

1. A method for predicting customer churn probability, said method comprising:
acquiring target business data of a target user in a preset time period, wherein the target business data is time-series data;
processing the target business data through a first branch of a churn probability prediction model to obtain first feature information, wherein the first branch comprises a convolution layer and a first branch fully connected layer;
processing the target business data through a second branch of the churn probability prediction model to obtain second feature information, wherein the second branch comprises a long short-term memory layer, an attention mechanism layer, and a second branch fully connected layer; and
processing the first feature information and the second feature information through a trunk of the churn probability prediction model to obtain a churn probability prediction result of the target user, wherein the trunk comprises a batch normalization layer and at least one trunk fully connected layer.
2. The method of claim 1, wherein obtaining the churn probability prediction model comprises:
acquiring a training set, wherein the training set comprises a plurality of training samples, each training sample comprises sample business data of a sample user in a sampling time period and a churn classification label corresponding to the sample business data, and the churn classification label indicates whether the corresponding sample user is a churned user or a non-churned user;
acquiring a neural network model, wherein the neural network model comprises the first branch, the second branch, the batch normalization layer, and the at least one trunk fully connected layer; the first branch and the second branch are connected in parallel to the batch normalization layer, and the batch normalization layer is connected in series with the at least one trunk fully connected layer; the first branch comprises a plurality of convolution layers connected in series and the first branch fully connected layer; and the second branch comprises a plurality of long short-term memory layers connected in parallel, together with the attention mechanism layer and the second branch fully connected layer connected in series with the long short-term memory layers; and
performing multiple rounds of iterative training on the neural network model according to the training set, and adjusting model parameters of the neural network model according to the training result of each round, to obtain the churn probability prediction model.
3. The method of claim 2, wherein acquiring the training set comprises:
acquiring raw business data of the sample user in the sampling time period;
performing time-series analysis on the raw business data to obtain time-series data corresponding to the raw business data as the sample business data of the sample user in the sampling time period;
adding a corresponding churn classification label to the sample business data according to data characteristics of the raw business data, and obtaining a training sample from the sample business data and the corresponding churn classification label; and
obtaining the training set based on a plurality of the training samples.
4. The method of claim 2, wherein performing multiple rounds of iterative training on the neural network model according to the training set and adjusting the model parameters of the neural network model according to the training result of each round to obtain the churn probability prediction model comprises:
acquiring a validation set corresponding to the training set, wherein the validation set comprises a plurality of validation samples, each validation sample comprises sample business data of a sample user in a sampling time period and a churn classification label corresponding to the sample business data, and the churn classification label indicates whether the corresponding sample user is a churned user or a non-churned user;
performing one round of iterative training on the neural network model according to the training set, and adjusting the model parameters of the neural network model according to the training result of that round;
acquiring, according to the validation set, a first result error of the neural network model before the model parameters are adjusted and a second result error of the neural network model after the model parameters are adjusted, wherein the first result error and the second result error characterize the degree of difference between processing results obtained by the neural network model on the validation samples and the corresponding churn classification labels;
saving the adjusted model parameters when the second result error is smaller than the first result error, and retaining the model parameters before adjustment when the second result error is not smaller than the first result error;
returning to the step of performing one round of iterative training on the neural network model according to the training set and adjusting the model parameters according to the training result of that round, when the number of completed training rounds is less than a preset number of iterations; and
obtaining the churn probability prediction model from the most recently saved model parameters when the number of completed training rounds is not less than the preset number of iterations.
5. The method of claim 4, wherein performing one round of iterative training on the neural network model according to the training set and adjusting the model parameters of the neural network model according to the training result of that round comprises:
inputting training samples in the training set simultaneously into the first branch and the second branch of the neural network model for processing, to obtain first feature information and second feature information corresponding to the training samples;
inputting the first feature information and the second feature information corresponding to the training samples into the batch normalization layer of the neural network model, to obtain sample prediction results corresponding to the training samples output by the last trunk fully connected layer; and
comparing the sample prediction results corresponding to the training samples with the churn classification labels in the training samples, and adjusting the model parameters of the neural network model according to the comparison results.
6. The method of claim 1, further comprising:
obtaining churn probability prediction results of a plurality of target users by using the churn probability prediction model;
arranging the plurality of target users in descending order of the churn probability prediction results to obtain a user sequence;
determining a target churn level corresponding to each target user in the user sequence according to a preset division proportion; and
determining a target processing strategy corresponding to each target user according to a mapping relationship between churn levels and processing strategies.
7. A customer churn probability prediction apparatus, said apparatus comprising:
an acquisition module, configured to acquire target business data of a target user in a preset time period, wherein the target business data is time-series data; and
a processing module, configured to process the target business data through a first branch of a churn probability prediction model to obtain first feature information, wherein the first branch comprises a convolution layer and a first branch fully connected layer;
the processing module being further configured to process the target business data through a second branch of the churn probability prediction model to obtain second feature information, wherein the second branch comprises a long short-term memory layer, an attention mechanism layer, and a second branch fully connected layer;
the processing module being further configured to process the first feature information and the second feature information through a trunk of the churn probability prediction model to obtain a churn probability prediction result of the target user, wherein the trunk comprises a batch normalization layer and at least one trunk fully connected layer.
8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 6 when the computer program is executed.
9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.
10. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination