CN113244629A - Lost account recall method and device, storage medium and electronic equipment - Google Patents

Lost account recall method and device, storage medium and electronic equipment

Info

Publication number
CN113244629A
Authority
CN
China
Prior art keywords
account
attrition
lost
subset
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110693747.XA
Other languages
Chinese (zh)
Other versions
CN113244629B (en
Inventor
陶冶
刘阳
徐广根
刘妍
万志远
叶沐芊
邹丰富
江鑫
李鹏飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202110693747.XA priority Critical patent/CN113244629B/en
Publication of CN113244629A publication Critical patent/CN113244629A/en
Application granted granted Critical
Publication of CN113244629B publication Critical patent/CN113244629B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/70 Game security or game management aspects
    • A63F13/79 Game security or game management aspects involving player-related data, e.g. identities, accounts, preferences or play histories
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Computer Security & Cryptography (AREA)
  • General Business, Economics & Management (AREA)
  • Business, Economics & Management (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention discloses a lost account recall method and device, a storage medium, and electronic equipment. The method comprises: acquiring a lost account set to be identified and a corresponding lost account feature set; performing, according to the lost account feature set, a clustering operation on the lost account set over a plurality of predetermined clusters to obtain a first lost account subset of the lost account set; inputting the lost account feature set into a target neural network model to obtain a second lost account subset of the lost account set; and determining a third lost account subset to be recalled according to the first lost account subset and the second lost account subset. The invention addresses the technical problem of a low recall rate for lost accounts.

Description

Lost account recall method and device, storage medium and electronic equipment
Technical Field
The invention relates to the field of computers, and in particular to a lost account recall method and device, a storage medium, and electronic equipment.
Background
In the related art, application clients and applets suffer from account loss; for example, game clients, short-video clients, and shopping clients all lose accounts over time. Recalling lost accounts is therefore important for the continued development of an application client or applet.
Most existing lost account recall schemes apply one uniform recall strategy to all lost accounts, supplemented by some operational strategies. Such schemes lack personalization: they cannot accurately identify the accounts that are likely to be recalled, and they also perform recall operations on accounts that are unlikely to return, so the recall rate of lost accounts is low and operating cost is wasted.
No effective solution currently exists for the low recall rate of lost accounts in the related art.
Disclosure of Invention
The embodiment of the invention provides a lost account recall method and device, a storage medium and electronic equipment, which are used for at least solving the technical problem of low lost account recall rate.
According to an aspect of an embodiment of the present invention, a method for recalling an attrition account is provided, including: acquiring a lost account set to be identified and a corresponding lost account feature set, wherein the lost account set comprises accounts currently in a lost state in a target application, and the lost account feature set comprises a plurality of features of each lost account in the lost account set; according to the lost account feature set, performing clustering operation on the lost account set in a plurality of predetermined clustering clusters to obtain a first lost account subset in the lost account set, wherein the clustering cluster to which the first lost account subset belongs is a target clustering cluster of which the recall probability meets a first preset condition, and a plurality of features of lost accounts in the first lost account subset meet the clustering condition corresponding to the target clustering cluster; inputting the attrition account feature set into a target neural network model to obtain a second attrition account subset in the attrition account set, wherein an attrition account in the second attrition account subset is an attrition account in the recall account category predicted by the target neural network model; and determining a third lost account subset to be recalled according to the first lost account subset and the second lost account subset.
According to another aspect of the embodiments of the present invention, there is also provided an apparatus for recalling an attrition account, including: an acquisition module, configured to acquire a lost account set to be identified and a corresponding lost account feature set, where the lost account set comprises accounts currently in a lost state in a target application, and the lost account feature set comprises a plurality of features of each lost account in the lost account set; an execution module, configured to perform a clustering operation on the attrition account set over a plurality of predetermined clusters according to the attrition account feature set, so as to obtain a first attrition account subset in the attrition account set, where the cluster to which the first attrition account subset belongs is a target cluster, among the plurality of clusters, whose recall probability meets a first preset condition, and a plurality of features of the attrition accounts in the first attrition account subset meet the clustering condition corresponding to the target cluster; an input module, configured to input the attrition account feature set into a target neural network model to obtain a second attrition account subset in the attrition account set, where an attrition account in the second attrition account subset is an attrition account in the recall account category predicted by the target neural network model; and a determining module, configured to determine a third lost account subset to be recalled according to the first lost account subset and the second lost account subset.
According to another aspect of the embodiment of the present invention, there is also provided a computer-readable storage medium, in which a computer program is stored, where the computer program is configured to execute the above-mentioned method for recalling attrition accounts when running.
According to another aspect of the embodiments of the present invention, there is also provided an electronic device, including a memory and a processor, where the memory stores a computer program, and the processor is configured to execute the above method for recalling an attrition account through the computer program.
In the embodiment of the invention, a clustering mode and a neural network model processing mode are adopted, a clustering operation is performed on a lost account set, a target neural network model is used for processing a lost account feature set, and a third lost account subset to be recalled is determined according to a first lost account subset obtained by the clustering operation and a second lost account subset obtained by the target neural network model. The aim of accurately recalling the lost account is achieved, so that the technical effect of improving the recall rate of the lost account is achieved, and the technical problem of low recall rate of the lost account is solved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a diagram illustrating an application environment of an alternative attrition account recall method according to an embodiment of the present invention;
FIG. 2 is a flowchart of an alternative attrition account recall method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating an alternative LightGBM model training according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an alternative DeepFM model architecture according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an alternative Stacking fusion according to embodiments of the present invention;
FIG. 6 is a diagram illustrating an alternative plurality of cluster clusters, according to an embodiment of the present invention;
FIG. 7 is a pictorial illustration of an alternative portrait radar, in accordance with an embodiment of the present invention;
FIG. 8 is a schematic illustration of an alternative recall flow according to an embodiment of the present invention;
FIG. 9 is a block diagram illustrating an alternative attrition account recall mechanism, according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of an alternative electronic device according to an embodiment of the invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to an aspect of the embodiments of the present invention, a method for recalling an attrition account is provided. As an optional implementation, the method may be applied to, but is not limited to, the system environment shown in FIG. 1, which includes a terminal device 102, a network 110, and a server 112.
Optionally, in this embodiment, the terminal device may be configured with a target application, and the terminal device may include, but is not limited to, at least one of the following: mobile phones (such as Android phones, iOS phones, etc.), notebook computers, tablet computers, palm computers, MID (Mobile Internet Devices), PAD, desktop computers, smart televisions, etc. The target application may be a game application client, a video application client, an instant messaging application client, a browser application client, an educational application client, and the like. The terminal device includes, but is not limited to, a memory 104 for storing the attrition account set and the corresponding attrition account feature set, a processor 106, and a display 108. The processor is used for processing the lost account set and the corresponding lost account feature set. The display can be used for displaying the lost accounts in the lost account set.
The network 110 may include, but is not limited to: a wired network, a wireless network, wherein the wired network comprises: a local area network, a metropolitan area network, and a wide area network, the wireless network comprising: bluetooth, WIFI, and other networks that enable wireless communication.
The server 112 may be a single server, a server cluster composed of a plurality of servers, or a cloud server. The server includes a database 114 for storing data including, but not limited to, a set of attrition accounts and a corresponding set of attrition account characteristics, and a processing engine 116. The processing engine is configured to process data, including but not limited to performing a clustering operation on the attrition account set in a plurality of predetermined clustering clusters according to the attrition account feature set, to obtain a first attrition account subset in the attrition account set; inputting the lost account feature set into a target neural network model to obtain a second lost account subset in the lost account set; and determining a third lost account subset to be recalled according to the first lost account subset and the second lost account subset.
The above is merely an example, and this is not limited in this embodiment.
As an optional implementation, as shown in FIG. 2, the method for recalling an attrition account includes:
Step S202, acquiring a lost account set to be identified and a corresponding lost account feature set, wherein the lost account set comprises accounts currently in a lost state in a target application, and the lost account feature set comprises a plurality of features of each lost account in the lost account set;
The attrition account set includes a plurality of attrition accounts, for example 100, 1000, or 5000 accounts. An attrition account is an account currently in an attrition state in the target application. The attrition state includes, but is not limited to, not having logged in to the target application for more than a predetermined period, which may be set according to actual conditions, for example one week, one month, 45 days, or 5 hours. Taking a game client as the target application, an attrition account may be a player who has not logged in to the game client for more than one week. The attrition account feature set may include a plurality of feature groups, each of which may include a plurality of features. The specific features may be chosen according to actual conditions: because account characteristics differ between application clients, the feature set may be tailored to the accounts of the client in question. Taking a game client as the target application, the attrition account feature set may include a basic attribute feature group, an activity feature group, a payment feature group, a play feature group, an in-game social feature group, and the like. The basic attribute feature group may include combat power, account level, VIP level, and the like. The activity feature group may include the number of login days, the online time, the average daily online time, and the daytime/nighttime online-time ratio of the player in the last active month. The payment feature group may include the historical payment amount, the payment amount in the last active month, and the like. The play feature group may include the historical proportion of each play mode. The in-game social feature group may include the number of chats with other players, the number of friend requests, the number of group battles joined, and the like.
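The attrition criterion described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `select_attrition_accounts`, the account identifiers, and the dates are hypothetical, and the one-week default is just one of the example periods mentioned in the text.

```python
from datetime import datetime, timedelta

def select_attrition_accounts(last_login, now, max_idle=timedelta(days=7)):
    """Return the ids of accounts whose last login is more than max_idle before now."""
    return {acct for acct, ts in last_login.items() if now - ts > max_idle}

# Hypothetical login records: account id -> time of last login.
now = datetime(2021, 6, 22)
last_login = {
    "player_a": datetime(2021, 6, 21),  # idle for 1 day: still active
    "player_b": datetime(2021, 6, 1),   # idle for 21 days: attrition
    "player_c": datetime(2021, 5, 1),   # idle for 52 days: attrition
}
attrition_set = select_attrition_accounts(last_login, now)
```

A different predetermined period (for example 45 days) would simply be passed as `max_idle`.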
Step S204, according to the lost account feature set, performing clustering operation on the lost account set in a plurality of predetermined clustering clusters to obtain a first lost account subset in the lost account set, wherein the clustering cluster to which the first lost account subset belongs is a target clustering cluster of which the recall probability meets a first preset condition in the clustering clusters, and a plurality of features of lost accounts in the first lost account subset meet the clustering condition corresponding to the target clustering cluster;
The plurality of predetermined clusters may be obtained with a clustering algorithm, including but not limited to the k-means and k-means++ clustering algorithms. Each cluster has a corresponding clustering condition, which may involve one or several features. For example, the attrition accounts in one of the clusters may satisfy the clustering condition that the online time exceeds a preset duration and the historical payment amount exceeds a preset amount. The target cluster may be a cluster with a higher recall probability. For example, an account with a shorter attrition time (shorter than a preset duration) and a higher historical payment amount (larger than a preset amount) is easier to recall, in which case the attrition accounts in the first attrition account subset satisfy the clustering condition of a shorter attrition time and a higher historical payment amount.
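A minimal sketch of step S204 follows. It assumes that the "clustering condition" is nearest-centroid membership, as in k-means, and that a recall probability has already been estimated per cluster; the text does not specify either detail. All names, centroids, and numbers are hypothetical, and feature scaling is omitted for brevity.

```python
import math

def nearest_cluster(features, centroids):
    """Index of the centroid closest (Euclidean distance) to the feature vector."""
    return min(range(len(centroids)), key=lambda k: math.dist(features, centroids[k]))

def first_subset(account_features, centroids, cluster_recall_prob, min_recall_prob):
    """Accounts that fall into a cluster whose recall probability meets the threshold."""
    target = {k for k, p in enumerate(cluster_recall_prob) if p >= min_recall_prob}
    return {acct for acct, f in account_features.items()
            if nearest_cluster(f, centroids) in target}

# Hypothetical features: (days since last login, historical payment amount).
centroids = [(5.0, 500.0), (60.0, 10.0), (30.0, 100.0)]
cluster_recall_prob = [0.4, 0.02, 0.1]  # assumed to be estimated from past recall campaigns
account_features = {
    "a": (7.0, 450.0),   # recently lost, high payer
    "b": (55.0, 20.0),   # long lost, low payer
    "c": (28.0, 120.0),  # in between
}
subset_1 = first_subset(account_features, centroids, cluster_recall_prob, min_recall_prob=0.3)
```

Here the "first preset condition" is modelled as a simple lower bound on the cluster recall probability; other conditions, such as taking the top-ranked clusters, would fit the description equally well.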
Step S206, inputting the attrition account feature set into a target neural network model, to obtain a second attrition account subset in the attrition account set, where an attrition account in the second attrition account subset is an attrition account in the recall account category predicted by the target neural network model;
The target neural network model may be a Stacking fusion model that includes a LightGBM model and a DeepFM model. The initial models may be trained with training samples, and the trained LightGBM and DeepFM models may then be used to identify the attrition accounts that belong to the recall account category. The second attrition account subset comprises the attrition accounts in the attrition account set that belong to the recall account category.
Step S208, determining a third lost account subset to be recalled according to the first lost account subset and the second lost account subset.
The attrition accounts in the first attrition account subset and in the second attrition account subset are those with a higher recall probability. The two subsets may contain the same accounts, so the third attrition account subset can be obtained by merging the first and second attrition account subsets and removing duplicates.
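The merge-and-deduplicate step can be sketched as below. The function name `merge_subsets` and the account identifiers are hypothetical; keeping first-seen order is a design choice of this sketch rather than something the text requires.

```python
def merge_subsets(first_subset, second_subset):
    """Merge two account lists, dropping duplicates and keeping first-seen order."""
    # dict.fromkeys keeps insertion order, so each account appears exactly once.
    return list(dict.fromkeys([*first_subset, *second_subset]))

subset_3 = merge_subsets(["a", "b"], ["b", "c"])
```

If ordering does not matter, a plain set union of the two subsets gives the same accounts.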
Through the steps, a clustering mode and a neural network model processing mode are adopted, clustering operation is carried out on the lost account number set, the lost account number feature set is processed by using the target neural network model, and a third lost account number subset to be recalled is determined according to a first lost account number subset obtained through clustering operation and a second lost account number subset obtained through the target neural network model. The aim of accurately recalling the lost account is achieved, so that the technical effect of improving the recall rate of the lost account is achieved, and the technical problem of low recall rate of the lost account is solved.
Optionally, the inputting the attrition account feature set into a target neural network model to obtain a second attrition account subset in the attrition account set includes: inputting the feature set of attrition accounts into a first predictive neural network model in the target neural network model, obtaining a first set of probabilities predicted by the first predictive neural network model, wherein the first set of probabilities includes a probability that each attrition account in the set of attrition accounts belongs to the recall account category, and the first predictive neural network model is configured to determine the probability that each attrition account belongs to the recall account category according to the feature set of attrition accounts; inputting the attrition account feature set and the first probability set into a second predictive neural network model in the target neural network model, obtaining a second probability set predicted by the second predictive neural network model, wherein the second probability set includes a probability that each attrition account in the attrition account set belongs to the recall account category, and the second predictive neural network model is configured to determine a cross feature of order 2 and a cross feature higher than order 2 according to the attrition account feature set and the first probability set, and determine a probability that each attrition account belongs to the recall account category according to the cross feature of order 2 and the cross feature higher than order 2; determining the second subset of attrition accounts in the set of attrition accounts according to the first set of probabilities and the second set of probabilities.
As an alternative embodiment, the first predictive neural network model may be a LightGBM classification model. LightGBM (Light Gradient Boosting Machine) is a framework that implements the GBDT (Gradient Boosting Decision Tree) algorithm. The main idea of GBDT is to train weak classifiers (decision trees) iteratively to obtain an optimal model; such models train well and are not prone to overfitting. LightGBM is a distributed gradient boosting framework based on decision tree algorithms. Its design rests on two main ideas: reducing memory usage, so that a single machine can use as much data as possible without sacrificing speed; and reducing communication cost, so that multi-machine parallelism is more efficient and computation achieves linear speed-up. LightGBM was thus designed from the outset as a fast, efficient, low-memory, high-accuracy data science tool that supports parallel and large-scale data processing.
As an optional implementation, the initial LightGBM model may be trained with training sample data so that the trained model can identify attrition accounts belonging to the recall account category. Inputting the attrition account set to be identified into the trained LightGBM model yields an output that includes the probability that each attrition account in the set belongs to the recall account category; the larger the probability, the more likely the corresponding attrition account can be recalled. FIG. 3 is a schematic diagram of training a LightGBM model according to an alternative embodiment of the present invention. Taking an attrition sample account set as the training sample data, the account features of the attrition sample accounts are input, including but not limited to the basic attribute features, activity features, payment features, play features, and social features shown in the figure. The LightGBM model may output the probability that each attrition sample account is a positive sample, where a positive sample is a recalled sample; that is, the model outputs the probability that the attrition sample account is recalled. During training, when the positive-sample probability output by the LightGBM model and the known positive-sample probability of the attrition sample account satisfy a preset convergence condition, training stops and the trained LightGBM model is obtained.
As an alternative embodiment, the second predictive neural network model may be a DeepFM model, which can learn cross features as dot products of latent vectors. Because of complexity constraints, FM (factorization machine) is typically applied only to order-2 (pairwise) cross features, whereas a deep model is good at capturing high-order, complex features. DeepFM is an algorithm derived from FM that combines a Deep part with an FM part: the FM part models low-order combinations of features, the Deep part models high-order combinations, and the two run in parallel within DeepFM. FIG. 4 is a schematic diagram of the DeepFM model structure according to an alternative embodiment of the present invention, where a Field may represent a feature group, or the probability, contained in the first probability set output by the first predictive neural network model, that each attrition account belongs to the recall account category. For example, Field i may represent the basic attribute feature group, with each node in Field i representing one of its features: combat power, account level, VIP level, and the like. Field j may represent the activity feature group, with each node in Field j representing one of its features: the number of login days in the last active month, the online time, the average daily online time, the day/night online-time ratio, and the like. Field m may represent the probability that the attrition account belongs to the recall account category. Through a lookup operation, each feature obtains its corresponding dense embedding.
Figure BDA0003127211670000091
In the above formula, $w$, $V_i$ and $V_j$ are the parameters to be learned by the model, and $x$ is the model input feature vector, which comprises the attrition account feature set and the first probability set. $x_{j_1}$ and $x_{j_2}$ are the $j_1$-th and $j_2$-th feature values respectively, $\langle w, x \rangle$ extracts the first-order features, and
Figure BDA0003127211670000092
extracts the second-order cross features.
$$a^{(0)} = \{e_1, e_2, e_3, \ldots, e_m\}$$
$$a^{(l+1)} = \sigma\left(W^{(l)} a^{(l)} + b^{(l)}\right)$$
$$y_{DNN} = \sigma\left(W^{|H|+1} a^{H} + b^{|H|+1}\right)$$
In the above formulas, $e_i$ is the embedding vector of the i-th Field, $a^{(0)}$ is the input of the deep neural network, $W^{(l)}$, $a^{(l)}$ and $b^{(l)}$ are the model weight, output and bias term of the l-th layer respectively, $|H|$ is the number of hidden layers of the network, and $\sigma$ is the activation function.
Figure BDA0003127211670000101
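The formula images referenced above (the "Figure BDA..." lines) are not reproduced in this text. As a hedged reconstruction only, assuming the embodiment follows the commonly published DeepFM formulation, the FM component and the combined model output would take the following form. The symbol $d$ (the number of input features) and the symbol $\hat{y}$ are introduced here for illustration, and the latent vectors that the text calls $V_i$ and $V_j$ are indexed by $j_1$ and $j_2$ so that they agree with $x_{j_1}$ and $x_{j_2}$.

```latex
% Assumed FM component: first-order term plus pairwise (second-order) cross terms.
y_{FM} = \langle w, x \rangle
       + \sum_{j_1=1}^{d} \sum_{j_2=j_1+1}^{d}
         \langle V_{j_1}, V_{j_2} \rangle \, x_{j_1} x_{j_2}

% Assumed combined output: the FM and DNN outputs are summed and squashed to a probability.
\hat{y} = \mathrm{sigmoid}\left( y_{FM} + y_{DNN} \right)
```

Under this reading, the first term of $y_{FM}$ is the first-order feature extraction and the double sum is the second-order cross feature extraction described in the text.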
The DeepFM model is an end-to-end model that extracts features of varying complexity directly from the original features, and it has the following two advantages. First, the DeepFM model comprises FM and DNN parts: the FM part extracts low-order features and performs feature crossing automatically, while the DNN part extracts high-order features. Second, the DeepFM model trains quickly, because its only input is the original features and the FM and DNN parts share the input feature vectors.
As an optional implementation, the FM layer applies order-2 feature crossing to determine the order-2 cross features, and the Hidden layers combine features at higher order to determine the cross features above order 2. From the order-2 cross features determined by the FM layer and the higher-order cross features determined by the Hidden layers, the Output Units obtain the probability that each attrition account in the attrition account set belongs to the recall account category; the larger the probability, the more likely the corresponding attrition account can be recalled.
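To make the order-2 crossing concrete, the sketch below computes the pairwise FM term in two ways: directly over all feature pairs, and with the well-known linear-time reformulation from the factorization machine literature. The reformulation, the function names, and the toy values of `x` and `V` are assumptions for illustration and are not taken from the patent text.

```python
from itertools import combinations

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def fm_second_order_pairwise(x, V):
    """Sum over all feature pairs (i < j) of <V[i], V[j]> * x[i] * x[j]."""
    return sum(dot(V[i], V[j]) * x[i] * x[j]
               for i, j in combinations(range(len(x)), 2))

def fm_second_order_fast(x, V):
    """Same quantity via 0.5 * sum_f [(sum_j V[j][f] x[j])^2 - sum_j (V[j][f] x[j])^2]."""
    total = 0.0
    for f in range(len(V[0])):
        terms = [V[j][f] * x[j] for j in range(len(x))]
        total += sum(terms) ** 2 - sum(t * t for t in terms)
    return 0.5 * total

# Toy input: 3 feature values and one 2-dimensional latent vector per feature.
x = [1.0, 0.5, 2.0]
V = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4]]
pairwise = fm_second_order_pairwise(x, V)
fast = fm_second_order_fast(x, V)
```

Both functions return the same value up to floating-point rounding; the second avoids the quadratic loop over feature pairs, which matters when the feature vector is large.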
As an optional implementation, the LightGBM and DeepFM models are fused by Stacking. FIG. 5 is a schematic diagram of Stacking fusion according to an optional embodiment of the present invention: the attrition account feature set is input into the LightGBM model to obtain the first probability set output by the LightGBM model, and the attrition account feature set and the first probability set are then input into DeepFM to obtain the second probability set output by DeepFM. The second probability set includes a plurality of probability values, each indicating the probability that one attrition account belongs to the recall account category.
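The data flow of the Stacking fusion can be sketched with stand-in models. The two `toy_*` functions below are hypothetical placeholders for the trained LightGBM and DeepFM models (they are simple logistic formulas, not real models); the point of the sketch is only that the first model's probability is appended to each feature row before the second model is called.

```python
import math

def stack_predict(features, first_model, second_model):
    """Run the first model, append its probability to each row, then run the second model."""
    first_probs = [first_model(row) for row in features]
    stacked = [row + [p] for row, p in zip(features, first_probs)]  # new rows; input untouched
    second_probs = [second_model(row) for row in stacked]
    return first_probs, second_probs

def toy_first_model(row):
    """Hypothetical stand-in for the trained LightGBM model."""
    days_idle, paid = row
    return 1.0 / (1.0 + math.exp(0.1 * days_idle - 0.01 * paid))

def toy_second_model(row):
    """Hypothetical stand-in for the trained DeepFM model; it sees the extra probability column."""
    days_idle, paid, first_prob = row
    return 1.0 / (1.0 + math.exp(-(4.0 * first_prob - 2.0)))

# Hypothetical feature rows: [days since last login, historical payment amount].
features = [[7.0, 450.0], [55.0, 20.0]]
first_probs, second_probs = stack_predict(features, toy_first_model, toy_second_model)
```

In a real pipeline the first-stage probabilities used to train the second model would normally come from out-of-fold predictions to avoid leakage; the text does not say whether the embodiment does this.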
As an optional implementation manner, taking the average of the LightGBM output probability and the DeepFM output probability as the probability that each attrition account belongs to the category of the recall account, setting a suitable threshold to obtain a final second attrition account subset, where the prediction result in the graph includes the second attrition account subset. The average of the probabilities for each attrition account in the second subset of attrition accounts is greater than the threshold. The threshold may be determined according to actual conditions, and assuming that the threshold is 0.5, the attrition account with the average value of the LightGBM output probability and the DeepFM output probability being greater than 0.5 belongs to the recall account category and is an account in the second subset of attrition accounts.
In the above embodiment, by Stacking and fusing the LightGBM model and the deep fm model, the probability that each attrition account output by the LightGBM model belongs to the recall account category is used as the input of the deep fm model, so that the fitting capability of the model can be enhanced, and the accuracy of model identification can be improved.
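The data flow of the Stacking fusion can be sketched as follows. This is only an illustration of how the first-stage probabilities are appended to the features before the second stage; the two stand-in models are simple logistic functions with made-up weights, not LightGBM or DeepFM, and all names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def stack_predict(features, first_model, second_model):
    """Return (first probability set, second probability set).
    first_model and second_model are callables mapping a 2-D feature
    array to one probability per row."""
    features = np.asarray(features, dtype=float)
    p1 = np.asarray(first_model(features), dtype=float)
    # Second stage sees the original features plus the first-stage output.
    stacked = np.column_stack([features, p1])
    p2 = np.asarray(second_model(stacked), dtype=float)
    return p1, p2

# Stand-in models (assumptions): fixed logistic scorers.
w_first = np.array([0.8, -0.5])
w_second = np.array([0.3, -0.2, 1.5])
toy_features = np.array([[1.0, 0.2], [0.1, 1.4], [0.7, 0.7]])
toy_p1, toy_p2 = stack_predict(
    toy_features,
    lambda x: sigmoid(x @ w_first),
    lambda x: sigmoid(x @ w_second),
)
```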
Optionally, said determining said second subset of attrition accounts from said set of attrition accounts according to said first set of probabilities and said second set of probabilities comprises: determining a final predicted probability that each attrition account in the set of attrition accounts belongs to the category of recall accounts according to the first set of probabilities and the second set of probabilities, wherein the final predicted probability of each attrition account is an average of the probabilities corresponding to each attrition account in the first set of probabilities and the second set of probabilities; searching the attrition account number with the final prediction probability larger than a preset threshold value in the attrition account number set to obtain the second attrition account number subset.
As an optional implementation manner, the final prediction probability is the mean of the probabilities corresponding to each attrition account in the first probability set and the second probability set. Assume that the probability that attrition account A belongs to the recall account category is 0.6 in the first probability set and 0.5 in the second probability set. The mean of 0.6 and 0.5, namely 0.55, is taken as the final prediction probability that attrition account A belongs to the recall account category. Assume that the probability that attrition account B belongs to the recall account category is 0.3 in the first probability set and 0.4 in the second probability set. The mean of 0.3 and 0.4, namely 0.35, is taken as the final prediction probability that attrition account B belongs to the recall account category. The preset threshold may be determined according to actual conditions. Assuming the preset threshold is 0.5, account A, whose final prediction probability is 0.55, belongs to the second attrition account subset, while account B, whose final prediction probability is 0.35, does not.
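The averaging and thresholding in this example can be sketched in a few lines. The function name and the dictionary layout are assumptions chosen for illustration; the numbers are those of accounts A and B above.

```python
def second_subset(first_probs, second_probs, threshold=0.5):
    """Average the two probability sets per account and keep the
    accounts whose mean exceeds the threshold."""
    final = {
        account: (first_probs[account] + second_probs[account]) / 2.0
        for account in first_probs
    }
    selected = [account for account, p in final.items() if p > threshold]
    return final, selected

final_probs, selected_accounts = second_subset(
    {"A": 0.6, "B": 0.3},  # first probability set
    {"A": 0.5, "B": 0.4},  # second probability set
)
```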
Optionally, the method further comprises: acquiring a loss sample account feature set corresponding to a loss sample account set and a corresponding actual labeling result set, wherein the loss sample account set includes accounts in a loss state in a preset first time period in the target application, the loss sample account feature set includes a plurality of features of each loss account in the loss sample account set, each labeling result in the actual labeling result set indicates whether the corresponding loss account in the loss sample account set actually becomes a recall account in a preset second time period, and the second time period is later than the first time period; and training a first to-be-trained neural network model by using the loss sample account feature set and the actual labeling result set until a loss function between a first prediction labeling result set output by the first to-be-trained neural network model and the actual labeling result set meets a second preset condition to obtain the first prediction neural network model, wherein each labeling result in the first prediction labeling result set represents the probability that a corresponding loss account in the predicted loss sample account set becomes a recall account in the second time period.
As an optional implementation manner, taking the first predictive neural network model being the LightGBM model as an example, the LightGBM model is obtained by training with an attrition sample account feature set and an actual labeling result set. The attrition sample account feature set comprises a plurality of features of each attrition sample account in the attrition sample account set. The attrition sample account set serves as the training sample and consists of accounts that were lost in a historical period; for example, an account that has not logged in to the game client for one week is an attrition sample account. Each attrition sample account is labeled according to whether it was recalled, which gives the actual labeling result.
As an alternative embodiment, assume that 100 accounts did not log in to the target application during 20201101-20201107 (first time period) and are therefore attrition sample accounts. An attrition sample account that logs in to the target application again becomes a recall account. Recalled and unrecalled accounts may be labeled with 1 and 0 respectively. For example, an account that logs in to the target application again during 20201108-20201208 (second time period) is labeled 1, and an attrition sample account that does not log in again is labeled 0. The LightGBM model is trained using the attrition sample accounts in the attrition sample account set and the actual labeling result set. During training, if the loss function between the predicted labeling result output by the LightGBM model for the attrition sample accounts and their actual labeling result meets a convergence condition, training is stopped and the trained LightGBM model is obtained. The predicted labeling result represents the probability that an attrition sample account becomes a recall account during 20201108-20201208 (second time period).
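The labeling rule can be sketched as below, assuming login dates are available per account. The function name, the data layout, and the toy login dates are hypothetical; the two periods follow the example above.

```python
from datetime import date

def label_samples(logins, first_period, second_period):
    """logins: dict mapping account id to a list of login dates.
    Accounts with no login in the first period are attrition samples;
    they are labeled 1 if they log in during the second period, else 0."""
    def in_period(day, period):
        start, end = period
        return start <= day <= end

    labels = {}
    for account, days in logins.items():
        if any(in_period(d, first_period) for d in days):
            continue  # active in the first period: not an attrition sample
        labels[account] = int(any(in_period(d, second_period) for d in days))
    return labels

FIRST_PERIOD = (date(2020, 11, 1), date(2020, 11, 7))
SECOND_PERIOD = (date(2020, 11, 8), date(2020, 12, 8))
toy_logins = {
    "a": [date(2020, 10, 20), date(2020, 11, 15)],  # lost, then recalled
    "b": [date(2020, 10, 25)],                      # lost, not recalled
    "c": [date(2020, 11, 3)],                       # not lost
}
toy_labels = label_samples(toy_logins, FIRST_PERIOD, SECOND_PERIOD)
```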
Optionally, the method further comprises: and training a second to-be-trained neural network model by using the loss sample account feature set, the actual annotation result set and the first prediction annotation result set until a loss function between a second prediction annotation result set output by the second to-be-trained neural network model and the actual annotation result set meets a third preset condition to obtain the second prediction neural network model, wherein each annotation result in the second prediction annotation result set represents the probability that a corresponding loss account in the predicted loss sample account set becomes a recall account in the second time period.
As an optional implementation manner, taking the second predictive neural network model being the DeepFM model as an example, the DeepFM model is obtained by training with the attrition sample account feature set, the actual labeling result set, and the predicted labeling result output by the LightGBM model. During training, if the loss function between the predicted labeling result output by the DeepFM model for the attrition sample accounts and their actual labeling result meets the convergence condition, training is stopped and the trained DeepFM model is obtained. The predicted labeling result represents the probability that an attrition sample account becomes a recall account during 20201108-20201208 (second time period).
Optionally, the method further comprises: acquiring a recall sample account feature set corresponding to a recall sample account set, wherein each account in the recall sample account set is in a loss state in a preset first time period and becomes a recall account in a preset second time period, the recall sample account feature set comprises a plurality of features of each account in the recall sample account set, and the second time period is later than the first time period; determining a group of key features and feature values of the group of key features according to the recall sample account feature set; clustering lost accounts in a lost sample account set by using the group of key features and feature values of the group of key features to obtain a plurality of cluster clusters and a cluster condition corresponding to each cluster, wherein the cluster condition corresponding to each cluster comprises one or more key features in the group of key features and corresponding feature values, the lost sample account set comprises a recall sample account set and an unrecalled sample account set, and each account in the unrecalled sample account set is in a lost state in the first time period and does not become a recall account in the second time period.
As an alternative embodiment, each recall sample account in the recall sample account set is in an attrition state in the first time period and becomes a recall account in the second time period. Assume that 100 accounts did not log in to the target application during 20201101-20201107 (first time period). If 30 of these accounts log in to the target application again during 20201108-20201208 (second time period), these 30 accounts are determined to be recall accounts. The recall sample account feature set includes a plurality of features of the recall accounts, such as the basic attribute features, active features, payment features, play features, and in-game social features described above. The key features may be features that recall accounts have in common, that is, features possessed by most recall accounts. Assume that most recalled accounts have an attrition duration less than a preset duration and a historical payment greater than a preset threshold. The preset duration and the preset threshold may be determined according to actual conditions; for example, the preset duration may be 10 hours, 20 hours, or 30 hours, and the preset threshold may be 500, 600, or 1000. The attrition duration and the historical payment are then determined to be a group of key features, and their feature values are obtained for each recalled account; for example, a certain recalled account has an attrition duration of 5 hours and a historical payment of 3000 yuan.
As an alternative embodiment, because an attrition account may have many features, this embodiment clusters the attrition sample accounts using the determined group of key features so that the clustering effect is more pronounced. Assume that the determined group of key features includes: online duration, historical payment amount, attrition duration, and level. The attrition accounts in the attrition sample account set may be clustered according to this group of key features. Fig. 6 is a schematic diagram of a plurality of clusters according to an alternative embodiment of the present invention. Each cluster may include one or more attrition accounts from the attrition sample account set. The attrition accounts include recall accounts and unrecalled accounts; in the figure, recall accounts are marked white and unrecalled accounts gray. Assume that 100 accounts did not log in to the target application during 20201101-20201107; these are attrition sample accounts and are in an attrition state during 20201101-20201107. If 30 accounts log in to the target application again during 20201108-20201208, these 30 accounts are determined to be recall accounts and the other 70 accounts are determined to be unrecalled sample accounts. The clustering condition corresponding to each cluster may be one or more key features in the group of key features together with their feature values. For example, assume that in Fig. 6 the historical payment amount and the attrition duration are the clustering conditions of the first cluster, the online duration and the level are the clustering conditions of the second cluster, the historical payment amount and the online duration are the clustering conditions of the third cluster, and the attrition duration and the level are the clustering conditions of the fourth cluster.
Because each cluster has a corresponding clustering condition, clustering the attrition sample accounts by a group of key features makes it possible to determine intuitively which clusters the recalled and unrecalled accounts belong to and what features the accounts in each cluster have. A profile of the attrition sample accounts can thus be determined. Fig. 7 is a radar chart according to an alternative embodiment of the present invention, from which the features of different types of accounts can be clearly distinguished. In this embodiment, assuming that the first cluster contains a large number of recalled accounts, the clustering conditions corresponding to the first cluster (historical payment amount and attrition duration) are determined to be features possessed by recalled accounts. The preset duration and the preset threshold may be determined by collecting statistics on the historical payment amounts and attrition durations of the recalled accounts. Assuming that the historical payment amounts of the recalled accounts are greater than 1000 yuan and their attrition durations are less than 5 hours, the preset duration is determined to be 5 hours and the preset threshold 1000 yuan. Attrition accounts with a historical payment amount greater than 1000 yuan and an attrition duration less than 5 hours are therefore easy to recall. In this embodiment, clustering the attrition accounts in the attrition sample account set by the determined group of key features distinguishes recalled accounts from unrecalled accounts more clearly.
As an alternative embodiment, the clustering operation may be performed using Kmeans or Kmeans++. The Kmeans clustering algorithm is an iterative cluster analysis algorithm that divides the attrition accounts in the attrition account set to be identified into K groups. K attrition accounts are randomly selected from the set as initial cluster centers; the distance between each attrition account in the set and each cluster center is then calculated, and each attrition account is assigned to its nearest cluster center. A cluster center together with the attrition accounts assigned to it represents one cluster. Each time an attrition account is assigned, the cluster center is recalculated from the attrition accounts currently in the cluster. This process is repeated until a preset termination condition is satisfied. The preset termination condition may be that the cluster centers no longer change, that the sum of squared errors reaches a local minimum, or that no attrition account is reassigned to a different cluster.
The Kmeans clustering algorithm may include the following steps:
Step S11, the attrition sample account set includes M attrition sample accounts, and K points are randomly selected from the attrition sample account set as the initial cluster centers.
Step S12, calculating the distance, such as the Euclidean distance or the cosine distance, between each attrition sample account in the attrition sample account set and each cluster center, and assigning each attrition sample account to the cluster of its nearest cluster center.
Step S13, once all attrition sample accounts in the attrition sample account set have been assigned to their clusters, the M attrition sample accounts are divided into K clusters.
Step S14, recalculating the cluster center of each cluster as the mean of the accounts in that cluster, and repeating steps S12 and S13 until the termination condition is met to obtain the final clustering result. The termination condition may be that the cluster centers no longer change, that the sum of squared errors reaches a local minimum, or that no attrition account is reassigned to a different cluster.
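Steps S11 to S14 can be sketched as a short NumPy routine. This is a minimal illustration assuming Euclidean distance and the "no account is reassigned" termination condition; the function name, parameters, and toy data are hypothetical.

```python
import numpy as np

def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means following steps S11-S14 (Euclidean distance)."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # S11: pick K distinct samples at random as the initial cluster centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.full(len(points), -1)
    for _ in range(max_iter):
        # S12: distance from every sample to every center, shape (M, K).
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Termination: no sample was reassigned to a different cluster.
        if np.array_equal(new_labels, labels):
            break
        # S13: all M samples now belong to one of the K clusters.
        labels = new_labels
        # S14: recompute each center as the mean of its members.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return labels, centers

# Toy data: two well-separated groups of key-feature vectors.
toy_points = [[0, 0], [1, 1], [2, 2], [100, 100], [101, 101], [102, 102]]
toy_labels, toy_centers = kmeans(toy_points, k=2)
```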
As an alternative embodiment, taking the above target clustering model being the Kmeans++ clustering algorithm as an example, the Kmeans++ clustering algorithm is an improvement of the Kmeans algorithm. The Kmeans++ clustering algorithm may include the following steps:
Step S21, randomly selecting one attrition sample account in the attrition sample account set as the first cluster center μ1;
Step S22, calculating the distances from the other attrition sample accounts to the first cluster center μ1, recorded as [d1, d2, d3, ...], and letting D = d1 + d2 + d3 + ...;
Step S23, the probability that each attrition sample account is selected as the second cluster center μ2 is respectively
Figure BDA0003127211670000161
Step S24, randomly selecting one attrition sample account as the second cluster center μ2 according to the probabilities. Specifically, the attrition sample account with the highest probability may be selected as the second cluster center;
Step S25, calculating the distances from each of the other samples to the first and second cluster centers, and taking the shorter of the two as that sample's distance. Steps S23 and S24 are then repeated to determine the remaining cluster centers.
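The seeding procedure of steps S21 to S25 can be sketched as follows. The selection probability in step S23 is given by a formula image that is not reproduced in this text, so the sketch assumes the probability d_i / D suggested by the definition of D in step S22; the commonly published Kmeans++ variant uses squared distances instead. Function and variable names are hypothetical.

```python
import numpy as np

def kmeans_pp_init(points, k, seed=0):
    """Choose k initial cluster centers following steps S21-S25."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # S21: the first center is one sample chosen uniformly at random.
    centers = [points[rng.integers(len(points))]]
    while len(centers) < k:
        chosen = np.array(centers)
        # S22 / S25: distance from each sample to its nearest chosen center.
        dists = np.linalg.norm(points[:, None, :] - chosen[None, :, :], axis=2)
        d = dists.min(axis=1)
        # S23: assumed selection probability d_i / D with D = sum of d_i.
        probs = d / d.sum()
        # S24: draw the next center at random according to the probabilities.
        centers.append(points[rng.choice(len(points), p=probs)])
    return np.array(centers)

toy_points = [[0, 0], [0, 1], [10, 10], [10, 11]]
toy_centers = kmeans_pp_init(toy_points, k=2)
```

A sample that is already a center has distance 0 and therefore probability 0, so it cannot be drawn again as long as the samples are distinct.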
Optionally, the determining a group of key features and feature values of the group of key features according to the recall sample account feature set includes: when the number of features meeting the current feature condition in the recall sample account feature set is greater than a preset threshold value, determining the current feature corresponding to the current feature condition as a key feature, and determining the feature value of the current feature corresponding to the current feature condition as the feature value of the key feature, wherein the current feature condition is a logic judgment condition executed on the current feature and the feature value of the current feature.
As an optional implementation manner, the recall sample account feature set includes a plurality of features of each recall sample account, and a group of key features is determined from these features. The current feature is a feature possessed by the recall samples, for example the historical payment amount or the attrition duration, and the feature value is the value a recall sample takes for the current feature, for example a historical payment amount of 100 yuan or an attrition duration of 5 hours. The logical judgment condition may be, for example, being greater than or less than a preset value. For example, the current feature condition may be that the historical payment amount is greater than 100 yuan and the attrition duration is less than 5 hours; the logical judgment conditions are then "greater than 100 yuan" and "less than 5 hours". If the number of features in the recall sample feature set that meet the current feature condition is greater than a preset threshold, the corresponding features are determined to be key features. For example, if the number of features in the recall sample feature set with a historical payment amount greater than 100 yuan and an attrition duration less than 5 hours is greater than the preset threshold, the historical payment amount and the attrition duration are determined to be key features. The preset threshold may be determined according to actual conditions and may be, for example, 100, 200, or 1000. In this embodiment, the importance of the features may also be ranked by the LightGBM model. As shown in Fig. 3, in the process of training the LightGBM model, the LightGBM model can output a ranking of feature importance, and the top-ranked features, for example the top 5 or 10, are selected as the key features.
Because the key features are determined according to the number of features in the recall sample account feature set that meet the current feature condition, they can be selected from the many features of the recall accounts. Clustering the attrition sample accounts by these key features clearly distinguishes recall accounts from unrecalled accounts.
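The counting rule for key features can be sketched as below. The feature names, the per-feature predicates, and the toy threshold are assumptions chosen to mirror the example of 100 yuan and 5 hours; each feature is checked separately here, which is one possible reading of the current feature condition.

```python
def select_key_features(recall_accounts, conditions, min_count):
    """recall_accounts: list of dicts mapping feature name to value.
    conditions: dict mapping feature name to a predicate on its value.
    A feature is a key feature if more than min_count recall accounts
    satisfy its condition."""
    key_features = []
    for name, predicate in conditions.items():
        count = sum(1 for account in recall_accounts if predicate(account[name]))
        if count > min_count:
            key_features.append(name)
    return key_features

toy_recall_accounts = [
    {"historical_payment": 3000, "attrition_hours": 5, "level": 12},
    {"historical_payment": 1500, "attrition_hours": 3, "level": 40},
    {"historical_payment": 800, "attrition_hours": 4, "level": 7},
    {"historical_payment": 50, "attrition_hours": 2, "level": 55},
]
toy_conditions = {
    "historical_payment": lambda v: v > 100,  # greater than 100 yuan
    "attrition_hours": lambda v: v < 5,       # less than 5 hours
    "level": lambda v: v > 30,
}
toy_key_features = select_key_features(toy_recall_accounts, toy_conditions, min_count=2)
```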
Optionally, the performing, according to the attrition account feature set, a clustering operation on the attrition account set in a plurality of predetermined clustering clusters to obtain a first attrition account subset in the attrition account set includes: acquiring the group of key features and corresponding feature values of each lost account in the lost account feature set; and clustering each lost account in the lost account set into the plurality of cluster clusters according to the obtained group of key features and corresponding feature values of each lost account.
As an optional implementation manner, a group of key features is obtained from the attrition sample accounts, and a plurality of clusters is obtained by clustering the attrition sample accounts by this group of key features. For the attrition account set to be identified, the feature value corresponding to each key feature of each attrition account is obtained from the corresponding attrition account feature set, and each attrition account is clustered, according to these feature values, into the plurality of clusters obtained from the attrition sample accounts. For example, the clustering condition of the first cluster in Fig. 6 is the historical payment amount and the attrition duration; if an attrition account to be identified meets the clustering condition of the first cluster, it is clustered into the first cluster. In this embodiment, the attrition account to be identified is clustered into the predetermined clusters according to the feature values of its key features, and whether it is an account that is easy to recall is determined from the clustering result. If a large number of the attrition accounts in the first cluster were recalled, and the attrition account to be identified is clustered into the first cluster, the attrition account to be identified is determined to be an account that is easy to recall.
Optionally, performing a clustering operation on the attrition account set in a plurality of predetermined clustering clusters according to the attrition account feature set to obtain a first attrition account subset in the attrition account set, including: determining the distance from each lost account to the cluster core of the target cluster according to a plurality of characteristics of each lost account, wherein the target cluster is the cluster with the highest recall probability in the plurality of clusters; and determining the first attrition account subset in the attrition account set according to the distance from each attrition account to the cluster core of the target cluster, wherein the distance corresponding to the attrition account in the first attrition account subset meets a preset distance condition.
As an optional implementation manner, assuming that the first cluster contains the largest number of recall accounts, or has the largest ratio of recall accounts to attrition accounts, the first cluster is determined to be the target cluster with the highest recall probability. The cluster core is the cluster center of the target cluster. For the attrition accounts to be identified, the distance between each attrition account and the cluster center of the target cluster is determined, and the attrition accounts that are easy to recall are determined from the attrition account set to be identified according to this distance; the accounts in the first attrition account subset are these easy-to-recall attrition accounts. The preset distance condition may be that the distance is smaller than a preset distance, and the preset distance may be determined according to actual conditions, for example 5, 10, or 15. In this embodiment, whether an attrition account to be identified is easy to recall is determined from its distance to the cluster center of the target cluster, and the set of easy-to-recall accounts is the first attrition account subset.
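The distance test against the target cluster center can be sketched as follows, assuming Euclidean distance over the key-feature vectors. The function name, the account identifiers, and the toy values are hypothetical.

```python
import numpy as np

def first_subset(account_ids, features, target_center, max_distance):
    """Keep the accounts whose distance to the target cluster center
    is smaller than the preset distance."""
    features = np.asarray(features, dtype=float)
    center = np.asarray(target_center, dtype=float)
    distances = np.linalg.norm(features - center, axis=1)
    return [a for a, d in zip(account_ids, distances) if d < max_distance]

toy_first_subset = first_subset(
    ["u1", "u2", "u3"],
    [[1, 1], [4, 5], [20, 20]],
    target_center=[1, 1],
    max_distance=10,
)
```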
Optionally, the determining, according to the first attrition account subset and the second attrition account subset, a third attrition account subset to be recalled includes: and determining the union of the first lost account subset and the second lost account subset as the third lost account subset.
As an optional implementation manner, the first attrition account subset consists of the easy-to-recall accounts determined by the clustering algorithm, and the second attrition account subset consists of the easy-to-recall accounts determined by the neural network algorithm. The first attrition account subset and the second attrition account subset are merged to obtain the easy-to-recall attrition accounts in the attrition account set to be identified, that is, the third attrition account subset.
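Merging the two subsets is a set union, which also removes accounts found by both methods. A minimal sketch with made-up account identifiers:

```python
def third_subset(first_subset_ids, second_subset_ids):
    """Union of the clustering result and the model prediction result."""
    return set(first_subset_ids) | set(second_subset_ids)

toy_third_subset = third_subset(["u1", "u2"], ["u2", "u7"])
```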
Optionally, after determining the third subset of lost account numbers to be recalled, the method further includes: searching an active account related to the third lost account subset in the account set of the target application to obtain an active account set, wherein the active account set comprises accounts currently in an active state in the target application, and the correlation between the active accounts in the active account set and the lost accounts corresponding to the third lost account subset meets a fourth preset condition; and sending preset recall information to the third lost account subset through the active account set.
As an optional embodiment, the active account may be an account currently logged in to the target application, an account that has recently logged in to the target application, or an account that has recently logged in to the target application frequently. What counts as recent may be determined according to actual conditions, for example within the last day or the last hour. What counts as frequent may also be determined according to actual conditions, for example 10 logins within 1 day. The correlation may be the intimacy between friends, which may be determined from the number of communications or the number of times of participating in the same game. The fourth preset condition may be determined according to actual conditions, for example the number of communications being greater than 100 and the number of times of participating in the same game exceeding 50. An active account meeting the fourth preset condition is taken as an intimate friend of the attrition account to be identified. When the attrition account is determined to be an account that is easy to recall, a recall message is sent to the attrition account through the intimate friend to recall it.
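The search for an intimate active friend can be sketched as below. The intimacy score (communications plus shared games) and the data layout are assumptions made for illustration; the thresholds follow the example of 100 communications and 50 shared games.

```python
def pick_intimate_friend(lost_account, active_accounts, interactions,
                         min_messages=100, min_shared_games=50):
    """interactions: dict mapping (lost account, active account) to a
    (communication count, shared game count) pair.
    Returns the qualifying active account with the highest score, or None."""
    best, best_score = None, -1
    for friend in active_accounts:
        messages, games = interactions.get((lost_account, friend), (0, 0))
        # Fourth preset condition (example thresholds from the text).
        if messages > min_messages and games > min_shared_games:
            score = messages + games  # assumed intimacy score
            if score > best_score:
                best, best_score = friend, score
    return best

toy_interactions = {
    ("u1", "f1"): (150, 60),
    ("u1", "f2"): (400, 20),   # too few shared games
    ("u1", "f3"): (120, 200),
}
toy_friend = pick_intimate_friend("u1", ["f1", "f2", "f3"], toy_interactions)
```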
Optionally, the sending, by the active account number set, preset recall information to the third lost account number subset includes: according to the characteristics of each active account in the active account set, performing clustering operation on the active account set to obtain a target clustering result, wherein the target clustering result comprises an account category to which each active account in the active account set belongs; setting recall information corresponding to the account category to which each active account belongs for each active account in the active account set; and sending preset recall information corresponding to the account type of each active account to a loss account corresponding to the third loss account subset through each active account in the active account set.
As an optional implementation manner, the target clustering model may be a Kmeans++ clustering model, and the clustering operation is performed on each active account in the active account set by the Kmeans++ clustering model. The clustering result includes the account categories of the active accounts, including but not limited to core players and lost players. Personalized recall information is set according to the account category of each active account. Fig. 8 is a schematic diagram of a recall flow according to an optional embodiment of the present invention: a first attrition account subset is obtained by clustering, a second attrition account subset is obtained using the Stacking model (a fusion model of LightGBM and DeepFM), and a third attrition account subset of accounts likely to be recalled is obtained by merging the two and removing duplicates. For each attrition account in the third attrition account subset, the most intimate active account is found and its category is looked up; recall information is set according to the category of the active account and is sent to the recallable accounts in the third attrition account subset. For example, the recall information set for a core player is "team with XX, together with the team with XX, in the world of the drawing", and the recall information set for a lost player is "XX has a tragic moment and is urgently needed to help". Setting personalized recall information and having it sent to the attrition account by a close friend can improve the recall rate.
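Choosing recall information by the category of the intimate active account can be sketched with a template lookup. The category keys and message templates below are hypothetical placeholders, not the wording used by the embodiment.

```python
# Hypothetical templates keyed by the active friend's account category.
RECALL_TEMPLATES = {
    "core_player": "{friend} invites you to team up again.",
    "lost_player": "{friend} is struggling and needs your help.",
    "default": "{friend} is waiting for you to come back.",
}

def build_recall_message(friend_name, friend_category, templates=RECALL_TEMPLATES):
    """Pick the template for the friend's category and fill in the name."""
    template = templates.get(friend_category, templates["default"])
    return template.format(friend=friend_name)

toy_message = build_recall_message("XX", "core_player")
```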
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
According to another aspect of the embodiment of the present invention, there is also provided an attrition account recalling apparatus for implementing the above attrition account recalling method. As shown in fig. 9, the apparatus includes: an obtaining module 902, configured to obtain a lost account set to be identified and a corresponding lost account feature set, where the lost account set includes accounts currently in a lost state in a target application, and the lost account feature set includes multiple features of each lost account in the lost account set; an executing module 904, configured to perform a clustering operation on the attrition account set in a plurality of predetermined clustering clusters according to the attrition account feature set, so as to obtain a first attrition account subset in the attrition account set, where a clustering cluster to which the first attrition account subset belongs is a target clustering cluster in the plurality of clustering clusters, where a recall probability of the target clustering cluster meets a first preset condition, and a plurality of features of attrition accounts in the first attrition account subset meet a clustering condition corresponding to the target clustering cluster; an input module 906, configured to input the attrition account feature set into a target neural network model, to obtain a second attrition account subset in the attrition account set, where an attrition account in the second attrition account subset is an attrition account belonging to a recall account category predicted by the target neural network model; a determining module 908, configured to determine a third lost account subset to be recalled according to the first lost account subset and the second lost account subset.
Optionally, the apparatus is further configured to input the feature set of attrition accounts into a first predictive neural network model in the target neural network model, to obtain a first set of probabilities predicted by the first predictive neural network model, wherein the first set of probabilities includes a probability that each attrition account in the set of attrition accounts belongs to the recall account category, and the first predictive neural network model is configured to determine a probability that each attrition account belongs to the recall account category according to the feature set of attrition accounts; inputting the attrition account feature set and the first probability set into a second predictive neural network model in the target neural network model, obtaining a second probability set predicted by the second predictive neural network model, wherein the second probability set includes a probability that each attrition account in the attrition account set belongs to the recall account category, and the second predictive neural network model is configured to determine a cross feature of order 2 and a cross feature higher than order 2 according to the attrition account feature set and the first probability set, and determine a probability that each attrition account belongs to the recall account category according to the cross feature of order 2 and the cross feature higher than order 2; determining the second subset of attrition accounts in the set of attrition accounts according to the first set of probabilities and the second set of probabilities.
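As an illustration of the two-stage arrangement described above, the following minimal Python sketch shows how the input of the second model could be assembled from an account's features and its first-stage probability, with explicit pairwise products standing in for order-2 cross features. The function name `second_stage_input` and the use of explicit products are assumptions made for illustration only; the embodiment does not specify how the second predictive neural network model computes its cross features.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def second_stage_input(features: Sequence[float],
                       first_prob: float) -> Tuple[List[float], List[float]]:
    """Append the first-stage probability to the features and form order-2 crosses.

    Hypothetical helper: a real model would typically learn cross features
    rather than enumerate raw products.
    """
    augmented = list(features) + [first_prob]
    # Every unordered pair (i < j) contributes one order-2 cross term.
    crosses = [a * b for a, b in combinations(augmented, 2)]
    return augmented, crosses
```

Higher-order crosses could be enumerated the same way with `combinations(augmented, 3)` and so on, although this grows quickly with the number of features.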
Optionally, the apparatus is further configured to determine, according to the first probability set and the second probability set, a final predicted probability that each attrition account in the attrition account set belongs to the recall account category, where the final predicted probability of each attrition account is the mean of the probabilities corresponding to that attrition account in the first probability set and the second probability set; and to search the attrition account set for attrition accounts whose final predicted probability is greater than a preset threshold, to obtain the second attrition account subset.
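The averaging and thresholding step can be sketched as follows. This is a minimal Python illustration, assuming both probability sets are dictionaries keyed by account identifier; the function name `select_second_subset` and the default threshold of 0.5 are hypothetical and not taken from the embodiment.

```python
from typing import Dict, Set


def select_second_subset(first_probs: Dict[str, float],
                         second_probs: Dict[str, float],
                         threshold: float = 0.5) -> Set[str]:
    """Return accounts whose mean predicted recall probability exceeds the threshold."""
    selected = set()
    for account_id, p1 in first_probs.items():
        p2 = second_probs[account_id]
        # Final predicted probability is the mean of the two model outputs.
        final_prob = (p1 + p2) / 2.0
        if final_prob > threshold:
            selected.add(account_id)
    return selected
```

Because the comparison is strict, an account whose mean probability equals the threshold is not selected, which matches the "greater than a preset threshold" wording.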
Optionally, the apparatus is further configured to obtain an attrition sample account feature set corresponding to an attrition sample account set and a corresponding actual labeling result set, where the attrition sample account set includes accounts in an attrition state in the target application within a preset first time period, the attrition sample account feature set includes multiple features of each attrition account in the attrition sample account set, each labeling result in the actual labeling result set indicates whether the corresponding attrition account in the attrition sample account set actually becomes a recall account within a preset second time period, and the second time period is later than the first time period; and to train a first to-be-trained neural network model by using the attrition sample account feature set and the actual labeling result set until a loss function between a first predicted labeling result set output by the first to-be-trained neural network model and the actual labeling result set meets a second preset condition, to obtain the first predictive neural network model, where each labeling result in the first predicted labeling result set represents the predicted probability that the corresponding attrition account in the attrition sample account set becomes a recall account in the second time period.
Optionally, the apparatus is further configured to train a second to-be-trained neural network model by using the attrition sample account feature set, the actual labeling result set, and the first predicted labeling result set, until a loss function between a second predicted labeling result set output by the second to-be-trained neural network model and the actual labeling result set satisfies a third preset condition, to obtain the second predictive neural network model, where each labeling result in the second predicted labeling result set represents the predicted probability that the corresponding attrition account in the attrition sample account set becomes a recall account in the second time period.
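The construction of the actual labeling result set can be illustrated with a short Python sketch. It assumes that "becomes a recall account in the second time period" can be read off from a set of accounts observed as active again in that period; the function name and argument names are hypothetical.

```python
from typing import Dict, Iterable, Set


def label_attrition_samples(lost_in_first_period: Iterable[str],
                            returned_in_second_period: Set[str]) -> Dict[str, int]:
    """Label 1 if an account lost in the first period returned in the second, else 0."""
    return {account_id: int(account_id in returned_in_second_period)
            for account_id in lost_in_first_period}
```

Only accounts that were lost in the first period receive a label; accounts that appear solely in the second-period set are ignored.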
Optionally, the apparatus is further configured to obtain a recall sample account feature set corresponding to a recall sample account set, where each account in the recall sample account set is in an attrition state within a preset first time period and becomes a recall account within a preset second time period, the recall sample account feature set includes a plurality of features of each account in the recall sample account set, and the second time period is later than the first time period; determine a group of key features and feature values of the group of key features according to the recall sample account feature set; and cluster the attrition accounts in an attrition sample account set by using the group of key features and the feature values of the group of key features, to obtain a plurality of clusters and a clustering condition corresponding to each cluster, where the clustering condition corresponding to each cluster includes one or more key features in the group of key features and the corresponding feature values, the attrition sample account set includes the recall sample account set and an unrecalled sample account set, and each account in the unrecalled sample account set is in an attrition state in the first time period and does not become a recall account in the second time period.
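Once sample accounts have been assigned to clusters, a recall probability per cluster is needed to pick the target cluster. The Python sketch below estimates it as the empirical fraction of recalled sample accounts in each cluster and takes the cluster with the highest value. Both the empirical-rate estimate and the helper names are assumptions for illustration; the embodiment does not state how the recall probability is computed.

```python
from collections import defaultdict
from typing import Dict, Hashable, Set


def cluster_recall_rates(cluster_of: Dict[str, Hashable],
                         recalled: Set[str]) -> Dict[Hashable, float]:
    """Fraction of sample accounts in each cluster that were actually recalled."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for account_id, cluster in cluster_of.items():
        totals[cluster] += 1
        if account_id in recalled:
            hits[cluster] += 1
    return {cluster: hits[cluster] / totals[cluster] for cluster in totals}


def target_cluster(rates: Dict[Hashable, float]) -> Hashable:
    """Pick the cluster with the highest estimated recall probability."""
    return max(rates, key=rates.get)
```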
Optionally, the apparatus is further configured to determine, when the number of features that satisfy a current feature condition in the recall sample account feature set is greater than a preset threshold, a current feature corresponding to the current feature condition as a key feature, and determine a feature value of the current feature corresponding to the current feature condition as a feature value of the key feature, where the current feature condition is a logical judgment condition executed on the current feature and the feature value of the current feature.
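A minimal Python sketch of this counting rule follows. It assumes the simplest kind of feature condition, namely "feature equals value", and counts how many recalled sample accounts satisfy each such condition; the function name `find_key_features` and the equality-only condition are illustrative assumptions, since the embodiment allows any logical judgment on a feature and its value.

```python
from collections import Counter
from typing import Dict, Hashable, List, Tuple


def find_key_features(recall_samples: List[Dict[str, Hashable]],
                      min_count: int) -> List[Tuple[str, Hashable]]:
    """Return (feature, value) pairs matched by more than min_count recalled accounts."""
    counts: Counter = Counter()
    for features in recall_samples:
        for name, value in features.items():
            # Each account contributes one match to the condition "name == value".
            counts[(name, value)] += 1
    # Sort by repr so the output order is deterministic across mixed value types.
    return sorted((cond for cond, n in counts.items() if n > min_count), key=repr)
```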
Optionally, the apparatus is further configured to obtain the group of key features and corresponding feature values of each attrition account in the attrition account feature set; and clustering each lost account in the lost account set into the plurality of cluster clusters according to the obtained group of key features and corresponding feature values of each lost account.
Optionally, the apparatus is further configured to determine, according to a plurality of characteristics of each attrition account, a distance from each attrition account to a cluster core of the target cluster, where the target cluster is a cluster with a highest recall probability in the plurality of clusters; and determining the first attrition account subset in the attrition account set according to the distance from each attrition account to the cluster core of the target cluster, wherein the distance corresponding to the attrition account in the first attrition account subset meets a preset distance condition.
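The distance-based selection can be sketched in Python as below. The sketch assumes numeric feature vectors, Euclidean distance, and a preset distance condition of the form "distance is at most `max_distance`"; none of these choices is fixed by the embodiment, and the function name is hypothetical.

```python
import math
from typing import Dict, List, Sequence


def first_subset_by_distance(features: Dict[str, Sequence[float]],
                             core: Sequence[float],
                             max_distance: float) -> List[str]:
    """Select accounts whose feature vector lies within max_distance of the cluster core."""
    selected = []
    for account_id, vector in features.items():
        # math.dist computes the Euclidean distance between two points.
        if math.dist(vector, core) <= max_distance:
            selected.append(account_id)
    return selected
```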
Optionally, the apparatus is further configured to determine a union of the first attrition account subset and the second attrition account subset as the third attrition account subset.
Optionally, the apparatus is further configured to, after determining a third lost account subset to be recalled, find an active account related to the third lost account subset in the account set of the target application, to obtain an active account set, where the active account set includes an account currently in an active state in the target application, and a correlation between an active account in the active account set and a lost account corresponding to the third lost account subset satisfies a fourth preset condition; and sending preset recall information to the third lost account subset through the active account set.
Optionally, the apparatus is further configured to perform a clustering operation on the active account set according to characteristics of each active account in the active account set, so as to obtain a target clustering result, where the target clustering result includes the account category to which each active account in the active account set belongs; set, for each active account in the active account set, recall information corresponding to the account category to which that active account belongs; and send, through each active account in the active account set, the preset recall information corresponding to the account category of that active account to the lost accounts corresponding to the third lost account subset.
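The category-specific delivery described above can be sketched as follows. The Python example assumes that the clustering of active accounts has already produced a category per active account and that the related lost accounts of each active account are known; the function name, the data layout, and the message texts in the usage are all hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def build_recall_messages(category_of: Dict[str, str],
                          templates: Dict[str, str],
                          related_lost: Dict[str, Iterable[str]],
                          to_recall: Set[str]) -> List[Tuple[str, str, str]]:
    """Return (sender, recipient, text) triples, one per active/lost account pair."""
    messages = []
    for active_id in sorted(category_of):
        # The recall text depends on the sender's account category.
        text = templates[category_of[active_id]]
        for lost_id in related_lost.get(active_id, ()):
            # Only lost accounts in the third subset are contacted.
            if lost_id in to_recall:
                messages.append((active_id, lost_id, text))
    return messages
```

With this layout a lost account related to several active accounts receives one message per sender; deduplicating per recipient would be a separate design decision.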
According to another aspect of the embodiment of the present invention, there is further provided an electronic device for implementing the method for recalling an attrition account, where the electronic device may be a terminal device or a server shown in fig. 1. The present embodiment takes the electronic device as a server as an example for explanation. As shown in fig. 10, the electronic device comprises a memory 1002 and a processor 1004, the memory 1002 having stored therein a computer program, the processor 1004 being arranged to execute the steps of any of the method embodiments described above by means of the computer program.
Optionally, in this embodiment, the electronic device may be located in at least one network device of a plurality of network devices of a computer network.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
S1, acquiring a lost account set to be identified and a corresponding lost account feature set, where the lost account set includes accounts currently in a lost state in a target application, and the lost account feature set includes a plurality of features of each lost account in the lost account set;
S2, according to the attrition account feature set, performing a clustering operation on the attrition account set in a plurality of predetermined clustering clusters to obtain a first attrition account subset in the attrition account set, where a clustering cluster to which the first attrition account subset belongs is a target clustering cluster in the plurality of clustering clusters, where recall probability of the target clustering cluster meets a first preset condition, and a plurality of features of attrition accounts in the first attrition account subset meet a clustering condition corresponding to the target clustering cluster;
S3, inputting the attrition account feature set into a target neural network model, to obtain a second attrition account subset in the attrition account set, where an attrition account in the second attrition account subset is an attrition account belonging to the recall account category predicted by the target neural network model;
S4, determining a third lost account subset to be recalled according to the first lost account subset and the second lost account subset.
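Steps S1 to S4 can be summarized in a short Python sketch. The clustering branch and the model branch are passed in as callables so that the sketch stays independent of any particular clustering algorithm or network; combining the two subsets by union follows the optional embodiment described earlier. The function name `recall_candidates` and the type alias are assumptions for illustration.

```python
from typing import Callable, Dict, Sequence, Set

Features = Dict[str, Sequence[float]]


def recall_candidates(features: Features,
                      cluster_selector: Callable[[Features], Set[str]],
                      model_selector: Callable[[Features], Set[str]]) -> Set[str]:
    """Combine the clustering branch and the model branch into the third subset."""
    first_subset = cluster_selector(features)   # S2: accounts in the target cluster
    second_subset = model_selector(features)    # S3: accounts the model predicts as recallable
    return first_subset | second_subset         # S4: union of the two subsets
```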
Optionally, those skilled in the art can understand that the structure shown in fig. 10 is only illustrative, and the electronic device may also be a terminal device such as a smart phone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), a PAD, and the like. Fig. 10 does not limit the structure of the electronic device. For example, the electronic device may also include more or fewer components (e.g., network interfaces, etc.) than shown in fig. 10, or have a different configuration than shown in fig. 10.
The memory 1002 may be configured to store software programs and modules, such as program instructions/modules corresponding to the attrition account recalling method and apparatus in the embodiments of the present invention, and the processor 1004 executes various functional applications and data processing by running the software programs and modules stored in the memory 1002, that is, implements the above attrition account recalling method. The memory 1002 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 1002 may further include memory located remotely from the processor 1004, which may be connected to the terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 1002 may be specifically, but not limited to, used for storing information such as sample characteristics of an item and a target virtual resource account number. As an example, as shown in fig. 10, the memory 1002 may include, but is not limited to, the obtaining module 902, the executing module 904, the input module 906, and the determining module 908 of the attrition account recalling apparatus. In addition, it may further include, but is not limited to, other module units in the foregoing attrition account recalling apparatus, which are not described again in this example.
Optionally, the above-mentioned transmission device 1006 is used for receiving or sending data via a network. Examples of the network may include a wired network and a wireless network. In one example, the transmission device 1006 includes a Network Interface Controller (NIC), which can be connected to other network devices and a router via a network cable so as to communicate with the internet or a local area network. In one example, the transmission device 1006 is a Radio Frequency (RF) module, which is used for communicating with the internet wirelessly.
In addition, the electronic device further includes: a display 1008 for displaying the information of the order to be processed; and a connection bus 1010 for connecting the respective module parts in the above-described electronic apparatus.
In other embodiments, the terminal device or the server may be a node in a distributed system, where the distributed system may be a blockchain system, and the blockchain system may be a distributed system formed by connecting a plurality of nodes through network communication. The nodes can form a peer-to-peer (P2P) network, and any type of computing device, such as a server, a terminal, or another electronic device, can become a node in the blockchain system by joining the peer-to-peer network.
According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided in the various alternative implementations described above. Wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
Optionally, in this embodiment, the above-mentioned computer-readable storage medium may be configured to store a computer program for executing the following steps:
S1, acquiring a lost account set to be identified and a corresponding lost account feature set, where the lost account set includes accounts currently in a lost state in a target application, and the lost account feature set includes a plurality of features of each lost account in the lost account set;
S2, according to the attrition account feature set, performing a clustering operation on the attrition account set in a plurality of predetermined clustering clusters to obtain a first attrition account subset in the attrition account set, where a clustering cluster to which the first attrition account subset belongs is a target clustering cluster in the plurality of clustering clusters, where recall probability of the target clustering cluster meets a first preset condition, and a plurality of features of attrition accounts in the first attrition account subset meet a clustering condition corresponding to the target clustering cluster;
S3, inputting the attrition account feature set into a target neural network model, to obtain a second attrition account subset in the attrition account set, where an attrition account in the second attrition account subset is an attrition account belonging to the recall account category predicted by the target neural network model;
S4, determining a third lost account subset to be recalled according to the first lost account subset and the second lost account subset.
Optionally, in this embodiment, those skilled in the art can understand that all or part of the steps in the methods of the foregoing embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disks, Read-Only Memories (ROMs), Random Access Memories (RAMs), magnetic disks, optical disks, and the like.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
The integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, etc.) to execute all or part of the steps of the method according to the embodiments of the present invention.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (15)

1. A lost account recall method is characterized by comprising the following steps:
acquiring a lost account set to be identified and a corresponding lost account feature set, wherein the lost account set comprises accounts currently in a lost state in a target application, and the lost account feature set comprises a plurality of features of each lost account in the lost account set;
according to the lost account feature set, performing clustering operation on the lost account set in a plurality of predetermined clustering clusters to obtain a first lost account subset in the lost account set, wherein the clustering cluster to which the first lost account subset belongs is a target clustering cluster of which the recall probability meets a first preset condition, and a plurality of features of lost accounts in the first lost account subset meet the clustering condition corresponding to the target clustering cluster;
inputting the attrition account feature set into a target neural network model to obtain a second attrition account subset in the attrition account set, wherein an attrition account in the second attrition account subset is an attrition account in the recall account category predicted by the target neural network model;
and determining a third lost account subset to be recalled according to the first lost account subset and the second lost account subset.
2. The method of claim 1, wherein inputting the attrition account feature set into a target neural network model, resulting in a second subset of attrition accounts in the attrition account set, comprises:
inputting the feature set of attrition accounts into a first predictive neural network model in the target neural network model, obtaining a first set of probabilities predicted by the first predictive neural network model, wherein the first set of probabilities includes a probability that each attrition account in the set of attrition accounts belongs to the recall account category, and the first predictive neural network model is configured to determine the probability that each attrition account belongs to the recall account category according to the feature set of attrition accounts;
inputting the attrition account feature set and the first probability set into a second predictive neural network model in the target neural network model, obtaining a second probability set predicted by the second predictive neural network model, wherein the second probability set includes a probability that each attrition account in the attrition account set belongs to the recall account category, and the second predictive neural network model is configured to determine a cross feature of order 2 and a cross feature higher than order 2 according to the attrition account feature set and the first probability set, and determine a probability that each attrition account belongs to the recall account category according to the cross feature of order 2 and the cross feature higher than order 2;
determining the second subset of attrition accounts in the set of attrition accounts according to the first set of probabilities and the second set of probabilities.
3. The method of claim 2, wherein said determining the second subset of attrition accounts from the set of attrition accounts based on the first set of probabilities and the second set of probabilities comprises:
determining a final predicted probability that each attrition account in the set of attrition accounts belongs to the category of recall accounts according to the first set of probabilities and the second set of probabilities, wherein the final predicted probability of each attrition account is an average of the probabilities corresponding to each attrition account in the first set of probabilities and the second set of probabilities;
searching the attrition account set for attrition accounts whose final predicted probability is greater than a preset threshold, to obtain the second attrition account subset.
4. The method of claim 2, further comprising:
acquiring a loss sample account feature set corresponding to a loss sample account set and a corresponding actual labeling result set, wherein the loss sample account set includes accounts in a loss state in a preset first time period in the target application, the loss sample account feature set includes a plurality of features of each loss account in the loss sample account set, each labeling result in the actual labeling result set indicates whether the corresponding loss account in the loss sample account set actually becomes a recall account in a preset second time period, and the second time period is later than the first time period;
and training a first to-be-trained neural network model by using the loss sample account feature set and the actual labeling result set until a loss function between a first prediction labeling result set output by the first to-be-trained neural network model and the actual labeling result set meets a second preset condition to obtain the first prediction neural network model, wherein each labeling result in the first prediction labeling result set represents the probability that a corresponding loss account in the predicted loss sample account set becomes a recall account in the second time period.
5. The method of claim 4, further comprising:
and training a second to-be-trained neural network model by using the loss sample account feature set, the actual labeling result set and the first prediction labeling result set until a loss function between a second prediction labeling result set output by the second to-be-trained neural network model and the actual labeling result set meets a third preset condition to obtain the second prediction neural network model, wherein each labeling result in the second prediction labeling result set represents the predicted probability that the corresponding loss account in the loss sample account set becomes a recall account in the second time period.
6. The method of claim 1, further comprising:
acquiring a recall sample account feature set corresponding to a recall sample account set, wherein each account in the recall sample account set is in a loss state in a preset first time period and becomes a recall account in a preset second time period, the recall sample account feature set comprises a plurality of features of each account in the recall sample account set, and the second time period is later than the first time period;
determining a group of key features and feature values of the group of key features according to the recall sample account feature set;
clustering lost accounts in a lost sample account set by using the group of key features and feature values of the group of key features to obtain a plurality of cluster clusters and a cluster condition corresponding to each cluster, wherein the cluster condition corresponding to each cluster comprises one or more key features in the group of key features and corresponding feature values, the lost sample account set comprises a recall sample account set and an unrecalled sample account set, and each account in the unrecalled sample account set is in a lost state in the first time period and does not become a recall account in the second time period.
7. The method of claim 6, wherein the determining a set of key features and feature values of the set of key features according to the recall sample account feature set comprises:
when the number of features meeting the current feature condition in the recall sample account feature set is greater than a preset threshold value, determining the current feature corresponding to the current feature condition as a key feature, and determining the feature value of the current feature corresponding to the current feature condition as the feature value of the key feature, wherein the current feature condition is a logic judgment condition executed on the current feature and the feature value of the current feature.
8. The method of claim 6, wherein said clustering said attrition account set in a predetermined plurality of cluster clusters according to said attrition account feature set to obtain a first attrition account subset of said attrition account set comprises:
acquiring the group of key features and corresponding feature values of each lost account in the lost account feature set;
and clustering each lost account in the lost account set into the plurality of cluster clusters according to the obtained group of key features and corresponding feature values of each lost account.
9. The method of claim 6, wherein clustering the attrition account set in a predetermined plurality of cluster clusters according to the attrition account feature set to obtain a first attrition account subset of the attrition account set comprises:
determining the distance from each lost account to the cluster core of the target cluster according to a plurality of characteristics of each lost account, wherein the target cluster is the cluster with the highest recall probability in the plurality of clusters;
and determining the first attrition account subset in the attrition account set according to the distance from each attrition account to the cluster core of the target cluster, wherein the distance corresponding to the attrition account in the first attrition account subset meets a preset distance condition.
10. The method of any of claims 1 to 9, wherein determining a third subset of attrition accounts to recall from the first subset of attrition accounts and the second subset of attrition accounts comprises:
and determining the union of the first lost account subset and the second lost account subset as the third lost account subset.
11. The method of any one of claims 1 to 9, wherein after determining the third subset of attrition accounts to recall, the method further comprises:
searching an active account related to the third lost account subset in the account set of the target application to obtain an active account set, wherein the active account set comprises accounts currently in an active state in the target application, and the correlation between the active accounts in the active account set and the lost accounts corresponding to the third lost account subset meets a fourth preset condition;
and sending preset recall information to the third lost account subset through the active account set.
12. The method of claim 11, wherein the sending of the preset recall information to the third subset of lost accounts through the set of active accounts comprises:
according to the characteristics of each active account in the active account set, performing clustering operation on the active account set to obtain a target clustering result, wherein the target clustering result comprises an account category to which each active account in the active account set belongs;
setting recall information corresponding to the account category to which each active account belongs for each active account in the active account set;
and sending, through each active account in the active account set, the preset recall information corresponding to the account category of that active account to the lost accounts corresponding to the third lost account subset.
13. A lost account recall device, comprising:
an acquisition module, configured to acquire a lost account set to be identified and a corresponding lost account feature set, where the lost account set includes accounts currently in a lost state in a target application, and the lost account feature set includes a plurality of features of each lost account in the lost account set;
an execution module, configured to perform a clustering operation on the lost account set over a plurality of predetermined clusters according to the lost account feature set, so as to obtain a first lost account subset in the lost account set, wherein the cluster to which the first lost account subset belongs is a target cluster among the plurality of clusters, a recall probability of the target cluster meets a first preset condition, and the plurality of features of the lost accounts in the first lost account subset meet a clustering condition corresponding to the target cluster;
an input module, configured to input the lost account feature set into a target neural network model to obtain a second lost account subset in the lost account set, wherein each lost account in the second lost account subset is a lost account that the target neural network model predicts to belong to the recall account category;
and a determining module, configured to determine a third lost account subset to be recalled according to the first lost account subset and the second lost account subset.
14. A computer-readable storage medium, comprising a stored program, wherein the program, when executed, performs the method of any one of claims 1 to 12.
15. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method of any of claims 1 to 12 by means of the computer program.
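
The module flow of claim 13, together with the per-category recall messages of claim 12, can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the `Cluster` layout, the nearest-centroid assignment, the probability threshold standing in for the "first preset condition", the plain callable standing in for the target neural network model, and the message templates are all hypothetical. The claims shown here do not say how the first and second subsets are combined, so the sketch exposes both intersection and union as options.

```python
from dataclasses import dataclass
from math import dist
from typing import Callable, Dict, List, Sequence, Set

Features = Sequence[float]


@dataclass(frozen=True)
class Cluster:
    """A predetermined cluster (hypothetical layout, not taken from the claims)."""
    centroid: Features
    recall_probability: float  # assumed to be estimated from past recall results


def nearest(features: Features, centroids: List[Features]) -> int:
    """Index of the centroid closest to `features` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(features, centroids[i]))


def first_subset(lost: Dict[str, Features], clusters: List[Cluster],
                 min_recall_probability: float) -> Set[str]:
    """Execution module: keep accounts assigned to a high-recall target cluster."""
    centroids = [c.centroid for c in clusters]
    return {
        account for account, features in lost.items()
        if clusters[nearest(features, centroids)].recall_probability
        >= min_recall_probability
    }


def second_subset(lost: Dict[str, Features],
                  predict_recallable: Callable[[Features], bool]) -> Set[str]:
    """Input module: keep accounts the model places in the recall category."""
    return {account for account, features in lost.items()
            if predict_recallable(features)}


def third_subset(first: Set[str], second: Set[str],
                 mode: str = "intersection") -> Set[str]:
    """Determining module: combine both subsets (combination rule is assumed)."""
    if mode == "intersection":
        return first & second
    if mode == "union":
        return first | second
    raise ValueError(f"unknown mode: {mode}")


def recall_messages(active: Dict[str, Features],
                    category_centroids: List[Features],
                    templates: List[str]) -> Dict[str, str]:
    """Claim 12: pick a recall message per active account by its category."""
    return {account: templates[nearest(features, category_centroids)]
            for account, features in active.items()}


# Toy data: two features per account, two predetermined clusters.
lost = {"a": (1.0, 1.0), "b": (9.0, 9.0), "c": (2.0, 0.0), "d": (8.0, 10.0)}
clusters = [Cluster((0.0, 0.0), 0.7), Cluster((10.0, 10.0), 0.1)]

first = first_subset(lost, clusters, min_recall_probability=0.5)
# Stand-in for the target neural network model: a simple rule on one feature.
second = second_subset(lost, lambda f: f[1] >= 1.0)
to_recall = third_subset(first, second, mode="intersection")

active = {"x": (0.5, 0.5), "y": (9.5, 9.0)}
messages = recall_messages(active, [(0.0, 0.0), (10.0, 10.0)],
                           ["Your team misses you", "A new season is live"])
print(sorted(to_recall), messages)
```

Taking the intersection favors precision (both signals must agree), while the union favors coverage; which one applies depends on the earlier claims and the embodiment.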
CN202110693747.XA 2021-06-22 2021-06-22 Recall method and device for lost account, storage medium and electronic equipment Active CN113244629B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110693747.XA CN113244629B (en) 2021-06-22 2021-06-22 Recall method and device for lost account, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110693747.XA CN113244629B (en) 2021-06-22 2021-06-22 Recall method and device for lost account, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN113244629A (en) 2021-08-13
CN113244629B (en) 2023-05-12

Family

ID=77189239

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110693747.XA Active CN113244629B (en) 2021-06-22 2021-06-22 Recall method and device for lost account, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN113244629B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115624755B (en) * 2022-12-08 2023-03-14 腾讯科技(深圳)有限公司 Data processing method and device, computer equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110585726A (en) * 2019-09-16 2019-12-20 腾讯科技(深圳)有限公司 User recall method, device, server and computer readable storage medium
CN111275503A (en) * 2020-03-20 2020-06-12 京东数字科技控股有限公司 Data processing method and device for acquiring lost user recall success rate

Also Published As

Publication number Publication date
CN113244629B (en) 2023-05-12

Similar Documents

Publication Publication Date Title
CN110245301A (en) A kind of recommended method, device and storage medium
CN111353092B (en) Service pushing method, device, server and readable storage medium
CN109508426A (en) A kind of intelligent recommendation method and its system and storage medium based on physical environment
CN103703466A (en) Social network powered query suggestions
CN103581270A (en) User recommendation method and system
CN110995569A (en) Intelligent interaction method and device, computer equipment and storage medium
CN113836318A (en) Dynamic knowledge graph completion method and device and electronic equipment
CN111428127A (en) Personalized event recommendation method and system integrating topic matching and two-way preference
CN111414842A (en) Video comparison method and device, computer equipment and storage medium
CN110457601B (en) Social account identification method and device, storage medium and electronic device
CN113244629A (en) Lost account recall method and device, storage medium and electronic equipment
CN113590898A (en) Data retrieval method and device, electronic equipment, storage medium and computer product
CN110826867B (en) Vehicle management method, device, computer equipment and storage medium
CN113962417A (en) Video processing method and device, electronic equipment and storage medium
CN111353093B (en) Problem recommendation method, device, server and readable storage medium
KR102553169B1 (en) Method and apparatus for providing solutions for brand improvement
CN113672816B (en) Account feature information generation method and device, storage medium and electronic equipment
CN110852338A (en) User portrait construction method and device
CN115203568A (en) Content recommendation method based on deep learning model, related device and equipment
CN111353090B (en) Service distribution method, device, server and readable storage medium
CN110942345B (en) Seed user selection method, device, equipment and storage medium
CN111935259A (en) Method and device for determining target account set, storage medium and electronic equipment
CN112016004A (en) Multi-granularity information fusion-based job crime screening system and method
CN118051782B (en) Model training method, business processing method and related device
Özer et al. A machine learning-based framework for predicting game server load

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40052201; Country of ref document: HK)

GR01 Patent grant