WO2021174699A1 - User screening method, apparatus, device, and storage medium - Google Patents

User screening method, apparatus, device, and storage medium

Info

Publication number
WO2021174699A1
WO2021174699A1 · PCT/CN2020/093424
Authority
WO
WIPO (PCT)
Prior art keywords
data
user
identified
feature
offline
Prior art date
Application number
PCT/CN2020/093424
Other languages
English (en)
French (fr)
Inventor
余雯
黄承伟
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2021174699A1 publication Critical patent/WO2021174699A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/24323Tree-organised classifiers

Definitions

  • This application relates to the field of feature recognition in artificial intelligence, and in particular to a user screening method, device, equipment, and storage medium.
  • At present, to identify target users within a user group, a machine model is typically used to perform statistical analysis on users' historical purchase information and thereby filter target users out of the group.
  • However, new users have little historical purchase information and thus little usable data, so the machine model cannot accurately screen out target users when identifying new users; the accuracy of user screening is low. How to improve the accuracy of screening target users from new users has therefore become an urgent problem to be solved.
  • This application provides a user screening method, device, equipment, and storage medium to improve the screening accuracy of screening target users from new users.
  • In a first aspect, this application provides a user screening method, including: acquiring offline image data and online data of a user to be identified, the offline image data including a visit record image; performing image processing on the visit record image to obtain offline visit data of the user to be identified; using the offline visit data and the online data as basic data of the user to be identified, and performing feature derivation on the basic data to obtain feature data of the user to be identified; performing feature factor matching on the feature data of the user to be identified according to a pre-trained user identification model, the pre-trained user identification model being used to identify the feature factors of the target user; and
  • when the degree of matching between the feature data of the user to be identified and the feature factors in the user identification model meets a preset condition, determining that the user to be identified is the target user.
  • In a second aspect, this application also provides a user screening device, the device including:
  • a data acquisition module for acquiring offline image data and online data of the user to be identified, the offline image data including visit record images;
  • An image processing module configured to perform image processing on the visit record image to obtain offline visit data of the user to be identified
  • a feature derivation module configured to use the offline visit data and online data as the basic data of the user to be identified, and perform feature derivation on the basic data to obtain the feature data of the user to be identified;
  • the feature matching module is configured to perform feature factor matching on feature data of the user to be identified according to a pre-trained user identification model, and the pre-trained user identification model is used to identify feature factors of the target user;
  • the user determination module is configured to determine that the user to be identified is a target user when the degree of matching between the characteristic data of the user to be identified and the characteristic factors in the user identification model meets a preset condition.
  • In a third aspect, the present application also provides a computer device, the computer device including a memory and a processor; the memory is used to store a computer program; the processor is used to execute the computer program and, when executing it, to implement a user screening method, wherein the user screening method includes: acquiring offline image data and online data of a user to be identified, the offline image data including a visit record image; performing image processing on the visit record image to obtain offline visit data of the user to be identified; using the offline visit data and the online data as basic data of the user to be identified, and performing feature derivation on the basic data to obtain feature data of the user to be identified; performing feature factor matching on the feature data of the user to be identified according to a pre-trained user identification model, the pre-trained user identification model being used to identify the feature factors of the target user; and
  • when the degree of matching between the feature data of the user to be identified and the feature factors in the user identification model meets a preset condition, determining that the user to be identified is the target user.
  • In a fourth aspect, the present application also provides a computer-readable storage medium that stores a computer program which, when executed by a processor, causes the processor to implement a user screening method,
  • wherein the user screening method includes the following steps: acquiring offline image data and online data of a user to be identified, the offline image data including a visit record image; performing image processing on the visit record image to obtain offline visit data of the user to be identified; using the offline visit data and the online data as basic data of the user to be identified, and performing feature derivation on the basic data to obtain feature data of the user to be identified; performing feature factor matching on the feature data of the user to be identified according to a pre-trained user identification model, the pre-trained user identification model being used to identify the feature factors of the target user; and
  • when the degree of matching between the feature data of the user to be identified and the feature factors in the user identification model meets a preset condition, determining that the user to be identified is the target user.
  • This application discloses a user screening method, device, equipment, and storage medium. Performing image processing on the offline image data improves the accuracy and speed of acquiring the offline visit data.
  • Feature derivation enriches the feature data of new users, and feature factor matching through the pre-trained user recognition model determines whether the user to be identified is the target user, improving the accuracy of target user screening.
  • FIG. 1 is a schematic flowchart of a user recognition model training method provided by an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a user screening method provided by an embodiment of the present application
  • FIG. 3 is a schematic flowchart of sub-steps of a user screening method provided in FIG. 2;
  • FIG. 4 is a schematic flowchart of a user screening method provided by an embodiment of the present application.
  • FIG. 5 is a schematic block diagram of a user recognition model training device provided by an embodiment of the present application.
  • FIG. 6 is a schematic block diagram of a user screening device according to an embodiment of the present application.
  • FIG. 7 is a schematic block diagram of a user screening device provided in an embodiment of the present application.
  • FIG. 8 is a schematic block diagram of the structure of a computer device according to an embodiment of the application.
  • the embodiments of the present application provide a user screening method, device, computer equipment, and storage medium.
  • The user screening method can be used to screen new users so as to identify target users among them, improving the accuracy of screening target users from new users.
  • The target user may be a new user who has purchasing potential for one specific product or for multiple products.
  • In this embodiment, for ease of description and understanding, purchasing potential for a specific product is taken as an example for the detailed description.
  • FIG. 1 is a schematic flowchart of a user recognition model training method provided by an embodiment of the present application.
  • The user recognition model training method obtains sample feature data by performing feature derivation on sample data and then trains with a random forest algorithm, improving the accuracy with which the trained model screens target users.
  • the user recognition model training method specifically includes: step S101 to step S104.
  • The sample data includes the sample user's non-insurance sales data, online interaction data, offline interaction data, basic information, asset information, liability information, and third-party user portrait over a historical period, as well as the sample user's current purchase status for the specific product.
  • the specific product refers to a product that needs to be promoted.
  • the third-party user portrait may be a user portrait generated by the sample user on other platforms, and the third-party user portrait may include information such as user hobbies and purchasing preferences.
  • The sample feature data includes sample features and sample feature values and is used to construct the user recognition model. Feature extraction is performed on the sample data to obtain each sample feature and its corresponding feature value, and the obtained sample features and feature values are used as the sample feature data.
  • For example, user basic information may include sample features such as gender, age, and occupation, and user asset information may include features such as annual income, movable property, and real estate.
  • When a user's basic information is gender male and age 35 and the user has purchased the specific product, the sample features and values are: feature gender with value male; feature age with value 35; and feature specific product with value purchased.
  • Performing feature derivation on sample data refers to performing feature learning on the sample data to obtain new sample user data, and using the new sample user data together with the original sample data as the sample feature data.
  • Feature derivation can be performed by basic transformation of a single feature, by adding a time dimension to a feature and performing time slicing, or by operations such as adding or multiplying multiple features. For example, two features can be added to obtain a new feature: adding the user's asset information and liability information yields the user's actual asset information, which is used as the derived new feature.
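  • As an illustration only, the following Python sketch shows the three derivation styles just described; the column names (assets, liabilities, event_time) and the data are hypothetical placeholders, not fields defined by this application.

```python
import numpy as np
import pandas as pd

# Hypothetical basic data; all column names are placeholders.
df = pd.DataFrame({
    "user_id": [1, 2],
    "assets": [500_000.0, 120_000.0],
    "liabilities": [200_000.0, 30_000.0],
})

# 1. Basic transformation of a single feature (e.g., log transform).
df["log_assets"] = np.log1p(df["assets"])

# 2. Multi-feature operation: combine assets and liabilities into an
#    "actual assets" feature, following the example in the text.
df["actual_assets"] = df["assets"] + df["liabilities"]

# 3. Time slicing: count online interactions in trailing windows.
events = pd.DataFrame({
    "user_id": [1, 1, 2],
    "event_time": pd.to_datetime(["2020-01-10", "2020-02-20", "2020-02-25"]),
})
now = pd.Timestamp("2020-03-01")
for months in (1, 3, 6, 12):
    cutoff = now - pd.DateOffset(months=months)
    counts = (events[events["event_time"] >= cutoff]
              .groupby("user_id").size().rename(f"interactions_{months}m"))
    df = df.join(counts, on="user_id")
df = df.fillna(0)  # users with no events in a window get 0
```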
  • S102 Perform data cleaning on the sample feature data to obtain cleaned sample feature data.
  • Performing data cleaning on the sample feature data refers to cleaning the feature values in the sample feature data, which specifically includes null value checking, outlier checking, and data exploration.
  • When null value checking is performed on the sample feature data, the data includes many different types of features, so a general global-constant fill cannot be used; different types of feature data must be considered separately. For example, some interaction behavior features can be filled with zero when they cannot be collected or are missing; the user's income and assets can be filled with the mean; and for gender, occupation, and the like, the null value can be kept as unknown.
  • Data exploration on the sample feature data is performed by computing, for each sample feature, the missing rate, maximum, minimum, mean, variance, percentile values, and so on; if the missing rate is too high, or certain values make the variance so small that the feature has no practical meaning, the sample feature and its values are removed. When outliers exist, for example an abnormal maximum, the 95th percentile or another quantile can be used as a replacement.
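  • A minimal pandas sketch of this type-aware cleaning follows; the column groupings (clicks_30d, annual_income, gender, …) are assumed placeholders for a real feature dictionary, not names from the application.

```python
import pandas as pd

def clean_features(df: pd.DataFrame) -> pd.DataFrame:
    """Type-aware null filling and simple data exploration, per the text above."""
    df = df.copy()
    behavior_cols = ["clicks_30d", "visits_30d"]   # interaction behaviors -> fill 0
    monetary_cols = ["annual_income", "assets"]    # income/assets -> fill mean
    category_cols = ["gender", "occupation"]       # keep nulls as 'unknown'

    df[behavior_cols] = df[behavior_cols].fillna(0)
    df[monetary_cols] = df[monetary_cols].fillna(df[monetary_cols].mean())
    df[category_cols] = df[category_cols].fillna("unknown")

    # Data exploration: drop features with too high a missing rate or
    # (numeric) features whose variance is zero and thus carry no signal.
    for col in list(df.columns):
        if df[col].isna().mean() > 0.9:
            df = df.drop(columns=col)
        elif pd.api.types.is_numeric_dtype(df[col]) and df[col].var() == 0:
            df = df.drop(columns=col)
    return df
```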
  • In some embodiments, after the cleaned sample feature data is obtained, the method may further include: performing variable processing on the cleaned sample feature data to obtain processed sample feature data.
  • Specifically, performing variable processing on the cleaned sample feature data includes removing repetitive sample features and calculating the IV (information value) of each sample feature in the cleaned sample data.
  • Removing repetitive sample features refers to judging whether highly correlated sample features exist; for example, if the sample features contain both a work city and a living city, which are highly correlated, either one of them can be removed.
  • The IV value is used to measure the predictive ability of a sample feature. When the IV value is calculated for the sample features in the cleaned sample data, the feature values of a sample feature are usually binned, the weight of evidence (WOE) of each bin is calculated, and the IV value is then computed from the WOE values:
  • WOE_i = ln( (g_i / g) / (b_i / b) ),  IV = Σ_i ( g_i / g − b_i / b ) × WOE_i
  • where WOE_i is the WOE value of the i-th attribute (bin) of a sample feature, g_i is the number of users who purchased the product in the i-th bin, g is the total number of purchasers in the sample, b_i is the number of users who did not purchase the product in the i-th bin, and b is the total number of non-purchasers in the sample.
  • The higher the WOE value, the lower the probability that users in the bin did not purchase the product.
  • The larger the IV value, the greater the difference in the feature's distribution between sample users who purchased the specific product and those who did not, that is, the better the feature's discriminating ability.
  • If the calculated IV value of a sample feature is too high (for example, greater than 0.5), it is checked whether the sample feature contains post-hoc factors.
  • Post-hoc factors are factors that occur after the product purchase and may bias the model, for example a sample user's transaction records after purchasing the specific product. Although such transaction-record features are strongly associated with whether the product was purchased, they occur after the purchase, so the sample feature must be removed.
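  • For illustration, a generic Python implementation of these standard WOE/IV formulas (not code from the application) might look like this:

```python
import numpy as np
import pandas as pd

def woe_iv(feature: pd.Series, purchased: pd.Series, bins: int = 5):
    """Bin a feature, compute WOE per bin, and sum to the feature's IV.

    purchased: 1 = bought the specific product, 0 = did not.
    """
    binned = pd.qcut(feature, q=bins, duplicates="drop")
    g = purchased.sum()               # total purchasers in the sample
    b = len(purchased) - g            # total non-purchasers
    grouped = pd.DataFrame({"bin": binned, "y": purchased}).groupby("bin")["y"]
    gi = grouped.sum()                # purchasers per bin
    bi = grouped.count() - gi         # non-purchasers per bin
    eps = 1e-9                        # smoothing to avoid division by zero / log(0)
    woe = np.log((gi / g + eps) / (bi / b + eps))
    iv = ((gi / g - bi / b) * woe).sum()
    return woe, iv

# Hypothetical usage: features with very high IV (e.g., > 0.5) would then
# be checked for post-hoc factors, as described above.
# woe, iv = woe_iv(df["annual_income"], df["purchased"])
```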
  • S103 Classify the cleaned sample feature data according to product purchase rules to obtain a positive sample set and a negative sample set respectively.
  • The product purchase rule refers to the rule for determining whether a sample user purchased the specific product. In a specific implementation, the product purchase rules can be preset by an engineer.
  • Whether a sample user purchased the specific product is determined from the purchase-related sample features and feature values in the cleaned sample feature data. If the sample user has purchased the specific product, the user is classified as purchased, the corresponding sample feature data is recorded as a positive sample, and the cleaned sample feature data matching this classification forms the positive sample set; if the sample user has not purchased the specific product, the user is classified as not purchased, the sample feature data is recorded as a negative sample, and the matching data forms the negative sample set. In a specific implementation, the purchased class can be recorded as 1 and the not-purchased class as 0.
  • In some embodiments, to avoid an imbalance between the sample counts of the positive and negative sample sets, which causes overfitting when training the user recognition model, the following may further be included after step S103: judging whether the difference between the sample counts of the negative sample set and the positive sample set is greater than a preset threshold; and if so, analyzing the samples in the positive sample set to synthesize new samples and adding them to the positive sample set to construct a new positive sample set.
  • The preset threshold may be preset by an engineer; in one implementation it may be 40% of the total number of samples in the positive and negative sample sets. For example, when the two sets together contain 100 samples, with 20 positive samples and 80 negative samples, the difference between the two sets is 60% of the total, indicating that the positive sample set is too small; the positive samples are therefore analyzed to synthesize new samples and construct a new positive sample set.
  • In a specific implementation, when the samples in the positive sample set are analyzed, the SMOTE algorithm can be used to up-sample the positive sample set and synthesize new samples, which are added to the positive sample set to construct a new positive sample set.
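  • A hedged sketch of this rebalancing step, using the SMOTE implementation from the imbalanced-learn package (the 40% threshold below mirrors the example above; it is not mandated by the application):

```python
from collections import Counter

from imblearn.over_sampling import SMOTE  # pip install imbalanced-learn

def rebalance(X, y, gap_threshold=0.4):
    """Up-sample the positive (minority) class when the count gap is too large."""
    counts = Counter(y)
    gap = abs(counts[1] - counts[0]) / len(y)
    if gap <= gap_threshold:
        return X, y                          # balanced enough, no synthesis
    X_new, y_new = SMOTE(random_state=42).fit_resample(X, y)
    return X_new, y_new                      # synthesized positives added
```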
  • S104 Train a user recognition model by using a random forest algorithm based on the positive sample set and the negative sample set to obtain the pre-trained user recognition model.
  • The positive and negative sample sets are merged into one sample set, which is split into a training set and a test set with the same ratio of positive to negative samples in each; the training set and the random forest algorithm are used to train the user recognition model, and the test set is used to validate it.
  • In a specific implementation, the training process using the training set and the random forest algorithm is as follows:
  • 1. A sample set S' of the same size as the training set S is drawn from S at random with replacement; the samples not drawn form the out-of-bag (OOB) data. Sampling is performed n times in total, generating n training subsets.
  • 2. On the n training subsets, n CART decision tree models G_m(x), m ∈ {1, 2, …, n}, are trained respectively.
  • 3. For the t-th decision tree model G_t(x), assuming the training samples have W feature dimensions, a subset of w features (w < W) is randomly selected at each node, and the best feature among them is chosen for splitting according to the information gain index; for a class x_i, its information is defined as I(X = x_i) = −log₂ P(x_i), where I(x) is the information of the random variable and P(x_i) is the probability that x_i occurs.
  • 4. Each tree is split in the manner of step 3 until all training examples at a node belong to the same class; no pruning is needed during splitting.
  • 5. The generated decision trees are combined into a random forest F, completing the training of the user recognition model.
  • After the user recognition model is built, the importance of each sample feature in the sample set can be output; the calculation is as follows: for each decision tree in the random forest F, the out-of-bag error is computed on its OOB data and recorded as errOOB1; noise is then randomly added to sample feature X across all OOB samples, and the OOB error is computed again and recorded as errOOB2; if the random forest contains N trees, the importance of feature X is P = (1/N) Σ (errOOB2 − errOOB1).
  • After the importance of each sample feature is computed, the features are sorted by importance, and the top-ranked sample features are used as the features for identifying target users.
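  • The following scikit-learn sketch approximates this training and importance scheme; it is an illustration under stated assumptions (X is a pandas DataFrame, y the 1/0 purchase labels), and scikit-learn's permutation importance on a held-out set stands in for the per-tree errOOB2 − errOOB1 computation described above.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# X: cleaned sample feature data (DataFrame), y: labels (1 = purchased).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)  # same +/- ratio per split

forest = RandomForestClassifier(
    n_estimators=100,       # n trees, each fit on a bootstrap sample of S
    criterion="entropy",    # split by information gain, as in step 3
    max_features="sqrt",    # consider w < W features at each node
    oob_score=True,         # track out-of-bag accuracy
    random_state=42,
).fit(X_train, y_train)

print("OOB score:", forest.oob_score_)
print("test score:", forest.score(X_test, y_test))

# Permute each feature and measure the drop in score (errOOB2 - errOOB1 analogue).
result = permutation_importance(
    forest, X_test, y_test, n_repeats=10, random_state=42)
ranked = sorted(zip(X.columns, result.importances_mean),
                key=lambda t: t[1], reverse=True)
print(ranked[:10])  # top-ranked features identify target users
```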
  • In the user recognition model training method provided by the above embodiment, sample feature data is obtained by feature extraction and feature derivation on the sample data, and the sample feature data is then cleaned to obtain cleaned sample feature data.
  • Feature derivation on the sample data enriches the resulting sample feature data, and data cleaning reduces the interference of dirty data with the user recognition model.
  • Training the user recognition model on the cleaned sample feature data with the random forest algorithm improves the accuracy of the model's feature factor recognition.
  • FIG. 2 is a schematic flowchart of a user screening method provided by an embodiment of the present application.
  • The user screening method performs image processing on the offline image data of the user to be identified and, based on the processed user data, uses a pre-trained user recognition model to match the feature factors in the user's feature data, thereby screening out target users.
  • the user screening method specifically includes: step S201 to step S205.
  • the offline image data includes a visit record image and a face recognition image.
  • The visit record image includes a scanned or photographed image of the offline visit record that business personnel make, based on their communication with the user to be identified, during an offline visit to that user.
  • the online data includes non-insurance sales data, online interactive data, basic information, asset information, liability information, and third-party user portraits of the user to be identified.
  • the third-party user portrait may be a user portrait generated by the user to be identified on other platforms, and the third-party user portrait may include information such as user hobbies, purchase preferences, and so on.
  • S202 Perform image processing on the visit record image to obtain offline visit data of the user to be identified.
  • The offline visit data may include the user's name, ID number, degree of familiarity with the product, names and prices of products the user has purchased, user needs, and so on.
  • Image processing is performed on the offline visit records made by business personnel to extract the offline visit data of the user to be identified.
  • In some embodiments, referring to FIG. 3, to improve the accuracy of image processing, step S202 may include:
  • The preprocessing may include binarization, noise removal, tilt correction, and the like; when the visit record image is a color image, it can be binarized to obtain a black-and-white binary image. Preprocessing the visit record image improves its recognition accuracy.
  • Layout analysis refers to segmenting the visit record image into paragraphs and lines according to the text content it contains; after segmentation, character cutting and character recognition are performed on the text in each paragraph and line, so that the text content of the visit record image is recognized.
  • Determining the offline visit data from the recognition result may refer to post-processing the recognition result and using the post-processed data as the offline visit data. Post-processing means correcting the recognition result according to the specific linguistic context, which improves the accuracy of image processing; the post-processed recognition result is saved as the offline visit data.
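  • As a rough illustration of this pipeline (binarization, noise removal, tilt correction, then recognition), the sketch below delegates layout analysis and character recognition to the Tesseract engine; the application describes its own layout analysis and context-based post-processing, so this is only an assumed stand-in, not the method itself.

```python
import cv2
import numpy as np
import pytesseract  # requires the Tesseract OCR engine to be installed

def ocr_visit_record(path: str) -> str:
    """Preprocess a visit record image and recognize its text content."""
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)

    # Binarization (Otsu) and noise removal.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    binary = cv2.medianBlur(binary, 3)

    # Tilt correction: estimate skew from the minimum-area rectangle
    # enclosing the dark (text) pixels, then rotate to compensate.
    coords = np.column_stack(np.where(binary < 255)).astype(np.float32)
    angle = cv2.minAreaRect(coords)[-1]
    angle = -(90 + angle) if angle < -45 else -angle
    h, w = binary.shape
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    deskewed = cv2.warpAffine(binary, M, (w, h), flags=cv2.INTER_CUBIC,
                              borderMode=cv2.BORDER_REPLICATE)

    # Layout analysis and character recognition, delegated to Tesseract;
    # 'chi_sim' assumes Simplified Chinese visit records.
    return pytesseract.image_to_string(deskewed, lang="chi_sim")
```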
  • S203 Use the offline visit data and online data as the basic data of the user to be identified, and perform feature derivation on the basic data to obtain the feature data of the user to be identified.
  • the processed offline visit data and online data are used together as the basic data of the user to be identified, and feature derivation is performed on the basic data to obtain the characteristic data of the user to be identified.
  • the feature data of the user to be identified includes features and feature values.
  • user basic information may include gender characteristics, age characteristics, occupation characteristics, and other characteristics
  • user asset information may include annual income characteristics, movable property characteristics, real estate characteristics, and the like.
  • Performing feature derivation on the basic data of the user to be identified refers to performing feature learning on that basic data to obtain new user data, which is used together with the basic data as the feature data of the user to be identified.
  • Specifically, feature derivation can be performed in the following three ways:
  • 1. Basic transformation of a single feature, such as squaring it, taking its square root, or applying a log transform.
  • 2. Derivation by adding a time dimension: for example, time slices of the basic data can be taken to obtain the online interaction data of the user to be identified within 1 month, 3 months, 6 months, and 12 months.
  • 3. Multi-feature operations, such as adding two features, multiplying them, or computing a ratio between features to obtain a new feature. For example, user asset information and user liability information can be aggregated, or added, multiplied, and so on, to obtain new data for the user to be identified.
  • S204 Perform feature factor matching on the feature data of the user to be identified according to the pre-trained user identification model.
  • The pre-trained user identification model is used to identify the feature factors of the target user. Since the output of the trained user identification model is feature importances, the model is used to perform feature matching on the feature data of the user to be identified and to judge whether that feature data includes the important features the model outputs.
  • When the matching degree between the feature data of the user to be identified and the feature factors in the user identification model meets a preset condition, the user to be identified is a user with purchasing potential and is determined to be the target user.
  • The preset condition may be that the matching degree between the feature data of the user to be identified and the feature factors in the user identification model reaches a preset threshold, or that the matching degree lies within a numerical range.
  • The preset threshold may be a percentage or a specific value preset by an engineer. When the preset threshold is a percentage, the percentage refers to the proportion of the important features output by the user identification model that are matched by the features of the user to be identified.
  • In other embodiments, other models can be trained to calculate and adjust the specific value of the threshold, and models can also be trained to adjust the weight coefficients of the important features.
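  • One plausible (assumed) reading of a percentage-style matching degree is the fraction of the model's important features that appear in the user's feature data; the feature names and threshold below are hypothetical.

```python
def matching_degree(user_features: dict, important_features: list) -> float:
    """Share of the model's important features present in the user's data."""
    hits = sum(1 for f in important_features if user_features.get(f) is not None)
    return hits / len(important_features)

# Hypothetical feature names and threshold:
important = ["actual_assets", "interactions_3m", "product_awareness"]
user = {"actual_assets": 300_000, "interactions_3m": 7}
if matching_degree(user, important) >= 0.66:  # preset threshold (assumed)
    print("user to be identified is determined to be a target user")
```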
  • The user screening method disclosed in the above embodiment performs image processing on the acquired offline image data of the user to be identified to obtain the offline visit data, and uses the offline visit data together with the online data as the basic data of the user to be identified, increasing the amount of basic data available for that user.
  • Feature derivation on the basic data yields the feature data, increasing the amount of feature data and enriching the data available for new users.
  • According to the user identification model, the feature data is matched against feature factors; when the matching degree meets the preset condition, the user to be identified is determined to be the target user, and using the trained user identification model improves the accuracy of target user screening.
  • FIG. 4 is a schematic flowchart of a user screening method provided by an embodiment of the present application.
  • The user screening method performs image processing on the offline image data of the user to be identified, uses face data to associate the recognized offline data with the online data to obtain user data, and then screens target users from the users to be identified with a pre-trained user recognition model.
  • the user screening method specifically includes: step S301 to step S306.
  • the offline image data includes a visit record image and a face recognition image.
  • The visit record image includes a scanned or photographed image of the offline visit record that business personnel make, based on their communication with the user to be identified, during an offline visit to that user.
  • the face recognition image includes the face image collected when the user to be identified conducts an offline communication visit.
  • S302 Perform image processing on the visit record image to obtain offline visit data of the user to be identified.
  • The offline visit data may include the user's name, ID number, degree of familiarity with the product, names and prices of products the user has purchased, user needs, and so on.
  • Image processing is performed on the offline visit records made by business personnel to extract the offline visit data of the user to be identified.
  • In some embodiments, the face recognition image may also include a face image collected when the user to be identified performs face recognition sign-in while participating in an offline activity.
  • Specifically, when the user to be identified participates in an offline activity, a face capture device is set up at the sign-in location in place of manual sign-in; it collects the face information of users who sign in, and the collected face information is stored in the database together with the sign-in time and sign-in location.
  • In this case, when the visit record is processed to obtain the user's offline visit data, the face image collected during the user's participation in offline activities can also be matched against the user's offline visit data, so that the sign-in time and location corresponding to the user's offline activity participation are also used as offline visit data.
  • Since the online data of the user to be identified includes the user's basic information, that is, name, gender, ID number, face image, and the like, after the offline visit data is obtained through image processing it must be saved together with the user's online data as the user's basic data.
  • Therefore, when data association is performed, the face recognition image corresponding to the offline visit data is matched against the face image in the online data; when the face recognition image is matched successfully, an association between the offline visit data and the online data is established, and the two are integrated to obtain the basic data of the user to be identified.
  • Using face recognition image matching instead of manual data entry by business personnel improves the efficiency of data entry; moreover, because offline visit data is mostly entered by filtering on names, manual entry easily confuses the online data and offline visit data of users who share a name.
  • Face recognition image matching therefore also avoids the confusion of user information caused by offline visit data entry.
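  • A minimal sketch of this association step, assuming face embeddings have already been extracted by some face recognition model; cosine similarity and the 0.6 threshold are placeholders, not values from the application.

```python
import numpy as np

def associate_offline_online(offline_records, online_users, threshold=0.6):
    """Link offline visit data to online data via face-embedding similarity.

    offline_records: dicts with "embedding" (np.ndarray) and "visit_data".
    online_users:    dicts with "embedding" and "online_data".
    """
    basic_data = []
    for rec in offline_records:
        best, best_sim = None, threshold
        for user in online_users:
            a, b = rec["embedding"], user["embedding"]
            sim = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
            if sim > best_sim:                 # keep the closest face match
                best, best_sim = user, sim
        if best is not None:                   # match succeeded: integrate
            basic_data.append({**best["online_data"], **rec["visit_data"]})
    return basic_data
```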
  • Performing feature derivation on the basic data of the user to be identified refers to performing feature learning on that basic data to obtain new user data, which is used together with the basic data as the feature data of the user to be identified.
  • S304 Perform data cleaning on the characteristic data of the user to be identified to obtain the cleaned characteristic data of the user to be identified.
  • Performing data cleaning on the feature data of the user to be identified includes null value checking, outlier checking, and the like.
  • When null value checking is performed on the feature data, the data includes many different types of features, so a general global-constant fill cannot be used and different types of feature data must be considered separately. For example, some interaction behavior features can be filled with zero when they cannot be collected or are missing; the user's income and assets can be filled with the mean; and for gender, occupation, and the like, the null value can be kept as unknown.
  • S305 Perform feature factor matching on the feature data of the user to be identified according to the pre-trained user identification model.
  • Since the output of the trained user identification model is the importance of feature factors, the model is used to perform feature factor matching on the cleaned feature data of the user to be identified and to judge whether the cleaned feature data includes the important feature factors output by the model.
  • When the matching degree between the cleaned feature data of the user to be identified and the feature factors in the user identification model meets the preset condition, the user to be identified is a user with purchasing potential and is determined to be the target user.
  • The foregoing embodiment discloses a user screening method that uses face recognition images to associate the offline visit data with the online data so that the two together serve as the basic data, improving the accuracy and efficiency of offline visit data entry.
  • Feature derivation is performed on basic data to obtain feature data, which increases the data volume of feature data and enriches data for new users.
  • Performing data cleaning on the feature data reduces the interference of dirty data on the user identification model and improves the accuracy of feature factor matching.
  • The feature data is matched against feature factors according to the user identification model; when the matching degree meets the preset condition, the user to be identified is determined to be the target user, and using the trained user identification model improves the accuracy of target user identification.
  • FIG. 5 is a schematic block diagram of a user recognition model training device provided by an embodiment of the present application.
  • the user recognition model training device may be configured in a server to execute the aforementioned user recognition model training method.
  • the user recognition model training device 400 includes: a data acquisition module 401, a data cleaning module 402, a data classification module 403, and a model training module 404.
  • the data acquisition module 401 is configured to acquire sample data, and sequentially perform feature extraction and feature derivation on the sample data to obtain sample feature data.
  • the data cleaning module 402 is configured to perform data cleaning on the sample feature data to obtain cleaned sample feature data.
  • the data classification module 403 is configured to classify the cleaned sample feature data according to product purchase rules to obtain a positive sample set and a negative sample set respectively.
  • the model training module 404 is configured to train a user recognition model by using a random forest algorithm based on the positive sample set and the negative sample set to obtain the pre-trained user recognition model.
  • FIG. 6 is a schematic block diagram of a user screening device according to an embodiment of the present application, and the user screening device is configured to execute the aforementioned user screening method.
  • the user screening device can be configured in a server or a terminal.
  • the user screening device 500 includes: a data acquisition module 501, an image processing module 502, a feature derivation module 503, a feature matching module 504, and a user determination module 505.
  • the data acquisition module 501 is configured to acquire offline image data and online data of the user to be identified, where the offline image data includes a visit record image.
  • the image processing module 502 is configured to perform image processing on the visit record image to obtain offline visit data of the user to be identified.
  • the image processing module 502 includes a preprocessing submodule 5021, a recognition result submodule 5022, and a data determination submodule 5023.
  • the preprocessing submodule 5021 is used to preprocess the visit record image.
  • the recognition result sub-module 5022 is used to perform layout analysis and character recognition on the preprocessed visit record image to obtain a recognition result.
  • the data determining sub-module 5023 is configured to determine offline visit data according to the recognition result.
  • the feature derivation module 503 is configured to use the offline visit data and online data as the basic data of the user to be identified, and perform feature derivation on the basic data to obtain the feature data of the user to be identified.
  • the feature matching module 504 is configured to perform feature factor matching on feature data of the user to be identified according to a pre-trained user identification model, and the pre-trained user identification model is used to identify feature factors of the target user.
  • the user determination module 505 is configured to determine that the user to be identified is a target user when the degree of matching between the characteristic data of the user to be identified and the characteristic factors in the user identification model meets a preset condition.
  • FIG. 7 is a schematic block diagram of a user screening device provided in an embodiment of the present application, and the user screening device is configured to execute the aforementioned user screening method.
  • the user screening device can be configured in a server or a terminal.
  • the user screening device 600 includes: a data acquisition module 601, an image processing module 602, a feature derivation module 603, a data cleaning module 604, a feature matching module 605 and a user determination module 606.
  • the data acquisition module 601 is configured to acquire offline image data and online data of a user to be identified, where the offline image data includes a visit record image and a face recognition image.
  • the image processing module 602 is configured to perform image processing on the visit record image to obtain offline visit data of the user to be identified.
  • the feature derivation module 603 is configured to associate the offline visit data with online data according to the face recognition image to obtain basic data of the user to be identified, and perform feature derivation on the basic data to obtain The characteristic data of the user to be identified.
  • the data cleaning module 604 is configured to perform data cleaning on the characteristic data of the user to be identified to obtain the cleaned characteristic data of the user to be identified.
  • the feature matching module 605 is configured to perform feature factor matching on feature data of the user to be identified according to a pre-trained user identification model, and the pre-trained user identification model is used to identify feature factors of the target user.
  • the user determination module 606 is configured to determine that the user to be identified is a target user when the degree of matching between the characteristic data of the user to be identified and the characteristic factors in the user identification model meets a preset condition.
  • FIG. 8 is a schematic block diagram of a structure of a computer device provided by an embodiment of the present application.
  • the computer equipment can be a server or a terminal.
  • the computer device includes a processor, a memory, and a network interface connected through a system bus, where the memory may include a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium can store an operating system and a computer program.
  • the computer program includes program instructions, and when the program instructions are executed, the processor can execute the user screening method shown in any of the foregoing embodiments.
  • the processor is used to provide computing and control capabilities and support the operation of the entire computer equipment.
  • the internal memory provides an environment for the operation of the computer program in the non-volatile storage medium.
  • When the computer program is executed by the processor, the processor is caused to execute the user screening method shown in any of the above embodiments, wherein the user screening method includes:
  • acquiring offline image data and online data of the user to be identified, the offline image data including a visit record image; performing image processing on the visit record image to obtain the offline visit data of the user to be identified;
  • using the offline visit data and the online data as the basic data of the user to be identified, and performing feature derivation on the basic data to obtain the feature data of the user to be identified;
  • performing feature factor matching on the feature data of the user to be identified according to the pre-trained user recognition model, the pre-trained user recognition model being used to identify the feature factors of the target user; and
  • when the matching degree between the feature data of the user to be identified and the feature factors in the user recognition model meets the preset condition, determining that the user to be identified is the target user.
  • the network interface is used for network communication, such as sending assigned tasks.
  • Those skilled in the art can understand that FIG. 8 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • the embodiments of the present application also provide a computer-readable storage medium, the computer-readable storage medium stores a computer program, and the computer-readable storage medium may be non-volatile or volatile.
  • The computer program includes program instructions, and the processor executes the program instructions to implement the user screening method shown in any of the foregoing embodiments of the present application, wherein the user screening method includes the following steps: acquiring offline image data and online data of the user to be identified, the offline image data including a visit record image; performing image processing on the visit record image to obtain the offline visit data of the user to be identified; using the offline visit data and the online data as the basic data of the user to be identified, and performing feature derivation on the basic data to obtain the feature data of the user to be identified;
  • performing feature factor matching on the feature data of the user to be identified according to the pre-trained user recognition model, the pre-trained user recognition model being used to identify the feature factors of the target user; and when the matching degree between the feature data of the user to be identified and the feature factors in the user recognition model meets the preset condition, determining that the user to be identified is the target user.
  • The computer-readable storage medium may be an internal storage unit of the computer device described in the foregoing embodiments, such as the hard disk or memory of the computer device.
  • The computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, smart media card (SMC), secure digital (SD) card, or flash card equipped on the computer device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

This application relates to the field of data analysis in artificial intelligence and discloses a user screening method, apparatus, device, and storage medium. The method includes: acquiring offline image data and online data of a user to be identified, the offline image data including a visit record image; performing image processing on the visit record image to obtain offline visit data of the user to be identified; using the offline visit data and the online data as basic data of the user to be identified, and performing feature derivation on the basic data to obtain feature data of the user to be identified; performing feature factor matching on the feature data of the user to be identified according to a pre-trained user identification model; and when the matching degree between the feature data of the user to be identified and the feature factors in the user identification model meets a preset condition, determining that the user to be identified is a target user. This improves the accuracy of screening target users from new users.

Description

User screening method, apparatus, device, and storage medium
This application claims priority to the Chinese patent application filed with the China National Intellectual Property Administration on March 4, 2020, with application number 202010144416.6 and invention title "User screening method, apparatus, device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of feature recognition in artificial intelligence, and in particular to a user screening method, apparatus, device, and storage medium.
Background
At present, to identify target users from a user group, a machine model is usually used to perform statistical analysis on users' historical purchase information so as to screen target users out of the group. However, the inventors realized that new users have little historical purchase information and little usable data, so the machine model cannot accurately screen out target users when identifying new users, and the accuracy of target user screening is low. How to improve the accuracy of screening target users from new users has therefore become an urgent problem to be solved.
Technical Problem
This application provides a user screening method, apparatus, device, and storage medium to improve the accuracy of screening target users from new users.
Technical Solution
To achieve the above object, in a first aspect, this application provides a user screening method, including:
acquiring offline image data and online data of a user to be identified, the offline image data including a visit record image;
performing image processing on the visit record image to obtain offline visit data of the user to be identified;
using the offline visit data and the online data as basic data of the user to be identified, and performing feature derivation on the basic data to obtain feature data of the user to be identified;
performing feature factor matching on the feature data of the user to be identified according to a pre-trained user identification model, the pre-trained user identification model being used to identify the feature factors of the target user;
when the matching degree between the feature data of the user to be identified and the feature factors in the user identification model meets a preset condition, determining that the user to be identified is the target user.
In a second aspect, this application also provides a user screening apparatus, the apparatus including:
a data acquisition module, configured to acquire offline image data and online data of a user to be identified, the offline image data including a visit record image;
an image processing module, configured to perform image processing on the visit record image to obtain offline visit data of the user to be identified;
a feature derivation module, configured to use the offline visit data and the online data as basic data of the user to be identified and to perform feature derivation on the basic data to obtain feature data of the user to be identified;
a feature matching module, configured to perform feature factor matching on the feature data of the user to be identified according to a pre-trained user identification model, the pre-trained user identification model being used to identify the feature factors of the target user;
a user determination module, configured to determine that the user to be identified is the target user when the matching degree between the feature data of the user to be identified and the feature factors in the user identification model meets a preset condition.
In a third aspect, this application also provides a computer device, the computer device including a memory and a processor; the memory is configured to store a computer program; the processor is configured to execute the computer program and, when executing the computer program, to implement a user screening method, wherein the user screening method includes:
acquiring offline image data and online data of a user to be identified, the offline image data including a visit record image;
performing image processing on the visit record image to obtain offline visit data of the user to be identified;
using the offline visit data and the online data as basic data of the user to be identified, and performing feature derivation on the basic data to obtain feature data of the user to be identified;
performing feature factor matching on the feature data of the user to be identified according to a pre-trained user identification model, the pre-trained user identification model being used to identify the feature factors of the target user;
when the matching degree between the feature data of the user to be identified and the feature factors in the user identification model meets a preset condition, determining that the user to be identified is the target user.
In a fourth aspect, this application also provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to implement a user screening method, wherein the user screening method includes the following steps:
acquiring offline image data and online data of a user to be identified, the offline image data including a visit record image;
performing image processing on the visit record image to obtain offline visit data of the user to be identified;
using the offline visit data and the online data as basic data of the user to be identified, and performing feature derivation on the basic data to obtain feature data of the user to be identified;
performing feature factor matching on the feature data of the user to be identified according to a pre-trained user identification model, the pre-trained user identification model being used to identify the feature factors of the target user;
when the matching degree between the feature data of the user to be identified and the feature factors in the user identification model meets a preset condition, determining that the user to be identified is the target user.
Beneficial Effects
This application discloses a user screening method, apparatus, device, and storage medium. Performing image processing on the offline image data improves the accuracy and speed of acquiring the offline visit data. Feature derivation on the basic data of the user to be identified yields the feature data; feature derivation enriches the feature data of new users, and feature factor matching with the pre-trained user identification model determines whether the user to be identified is the target user, improving the accuracy of target user screening.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a user recognition model training method provided by an embodiment of this application;
FIG. 2 is a schematic flowchart of a user screening method provided by an embodiment of this application;
FIG. 3 is a schematic flowchart of sub-steps of the user screening method of FIG. 2;
FIG. 4 is a schematic flowchart of a user screening method provided by an embodiment of this application;
FIG. 5 is a schematic block diagram of a user recognition model training apparatus provided by an embodiment of this application;
FIG. 6 is a schematic block diagram of a user screening apparatus provided by an embodiment of this application;
FIG. 7 is a schematic block diagram of a user screening apparatus provided by an embodiment of this application;
FIG. 8 is a schematic block diagram of the structure of a computer device provided by an embodiment of this application.
Best Mode for Carrying Out the Invention
To solve the above problems, embodiments of this application provide a user screening method, apparatus, computer device, and storage medium. The user screening method can be used to screen new users so as to identify target users among them, improving the accuracy of screening target users from new users. The target user may be a new user with purchasing potential for one specific product or for multiple products. In this embodiment, for ease of description and understanding, purchasing potential for a specific product is taken as an example for detailed description.
Some embodiments of this application are described in detail below with reference to the drawings. The embodiments described below and the features in them can be combined with each other without conflict.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a user recognition model training method provided by an embodiment of this application. The training method performs feature derivation on sample data to obtain sample feature data and then trains with the random forest algorithm, improving the accuracy with which the trained model screens target users.
As shown in FIG. 1, the user recognition model training method specifically includes steps S101 to S104.
S101. Acquire sample data, and sequentially perform feature extraction and feature derivation on the sample data to obtain sample feature data.
Specifically, the sample data includes the sample user's non-insurance sales data, online interaction data, offline interaction data, basic information, asset information, liability information, third-party user portrait, and the like over a historical period, as well as the sample user's current purchase status for a specific product; the specific product is a product that needs to be promoted. The third-party user portrait may be a user portrait generated by the sample user on other platforms and may include information such as user hobbies and purchase preferences.
The sample feature data includes sample features and sample feature values and is used to construct the user recognition model. Feature extraction is performed on the sample data to obtain each sample feature and its corresponding feature value, and the obtained sample features and feature values are used as the sample feature data.
For example, user basic information may include sample features such as gender, age, and occupation, and user asset information may include features such as annual income, movable property, and real estate. When a user's basic information is gender male and age 35 and the specific product has been purchased, the sample features and values are: feature gender with value male; feature age with value 35; feature specific product with value purchased.
Performing feature derivation on the sample data refers to performing feature learning on the sample data to obtain new sample user data and using the new sample user data together with the sample data as the sample feature data. Feature derivation can be performed by basic transformation of a single feature, by adding a time dimension to a feature and performing time slicing, or by operations such as adding or multiplying multiple features. For example, two features can be added to obtain a new feature; for instance, the user's asset information and liability information can be added to obtain the user's actual asset information, which is used as the derived new feature.
S102. Perform data cleaning on the sample feature data to obtain cleaned sample feature data.
Specifically, performing data cleaning on the sample feature data refers to cleaning the feature values in the sample feature data and specifically includes null value checking, outlier checking, and data exploration. When null value checking is performed, the sample feature data includes many different types of data, so a general global-constant fill cannot be used and different types of feature data must be considered separately: some interaction behavior features can be filled with zero when they cannot be collected or are missing; the user's income and assets can be filled with the mean; for gender, occupation, and the like, the null value can be kept as unknown. During data exploration, the missing rate, maximum, minimum, mean, variance, percentile values, and so on of each sample feature's values are computed; if the missing rate is too high, or certain values make the variance so small that the feature has no practical meaning, those sample features and their values are removed. When outliers exist, for example an abnormal maximum, the 95th percentile or another quantile can be used as a replacement.
In some embodiments, after the cleaned sample feature data is obtained, the method may further include: performing variable processing on the cleaned sample feature data to obtain processed sample feature data.
Specifically, variable processing includes removing repetitive sample features and calculating the IV value of each sample feature in the cleaned sample data. Removing repetitive sample features refers to judging whether highly correlated sample features exist; for example, if the sample features contain both a work city and a living city, which are highly correlated, either one of them can be removed. The IV value measures the predictive ability of a sample feature; when it is calculated, the feature values of a sample feature are usually binned, the weight of evidence (WOE) of each bin is calculated, and the IV value is then computed from the WOE values.
The WOE value is calculated as:
WOE_i = ln( (g_i / g) / (b_i / b) )
The IV value is calculated as:
IV = Σ_i ( g_i / g − b_i / b ) × WOE_i
where WOE_i is the WOE value of the i-th attribute (bin) of a sample feature, g_i is the number of purchasers in the i-th bin, g is the total number of purchasers in the sample, b_i is the number of non-purchasers in the i-th bin, and b is the total number of non-purchasers in the sample. The higher the WOE value, the lower the probability that users in the bin did not purchase the product; the larger the IV value, the greater the difference in the feature's distribution between sample users who purchased the specific product and those who did not, that is, the better the feature's discriminating ability.
If the calculated IV value of a sample feature is too high (for example, greater than 0.5), it is checked whether the feature contains post-hoc factors, that is, factors occurring after the product purchase that may affect the model. For example, a sample user's transaction records after purchasing the specific product are strongly associated with whether the product was purchased, but because they occur after the purchase, the sample feature must be removed.
S103. Classify the cleaned sample feature data according to product purchase rules to obtain a positive sample set and a negative sample set respectively.
Specifically, the product purchase rule is the rule for determining whether a sample user purchased the specific product; in a specific implementation, it can be preset by an engineer.
Whether a sample user purchased the specific product is determined from the purchase-related sample features and feature values in the cleaned sample feature data. If the sample user has purchased the specific product, the user is classified as purchased, the corresponding sample feature data is recorded as a positive sample, and the cleaned sample feature data matching this classification forms the positive sample set; if the sample user has not purchased the specific product, the user is classified as not purchased, the sample feature data is recorded as a negative sample, and the matching data forms the negative sample set. In a specific implementation, the purchased class can be recorded as 1 and the not-purchased class as 0.
In some embodiments, to avoid an imbalance between the sample counts of the positive and negative sample sets, which causes overfitting when training the user recognition model, the following may further be included after step S103:
judging whether the difference between the sample counts of the negative sample set and the positive sample set is greater than a preset threshold; if so, analyzing the samples in the positive sample set to synthesize new samples and adding them to the positive sample set to construct a new positive sample set.
Specifically, the preset threshold can be preset by an engineer; in one implementation it may be 40% of the total number of samples in the two sets. For example, when the two sets together contain 100 samples, with 20 positive and 80 negative samples, the difference is 60% of the total, indicating that the positive sample set is too small; the positive samples are therefore analyzed to synthesize new samples and construct a new positive sample set. In a specific implementation, the SMOTE algorithm can be used to up-sample the positive sample set, and the synthesized new samples are added to the positive sample set to construct the new positive sample set.
S104. Train a user recognition model with the random forest algorithm based on the positive sample set and the negative sample set to obtain the pre-trained user recognition model.
The positive and negative sample sets are merged into a sample set, which is split into a training set and a test set with the same ratio of positive to negative samples in each. The training set and the random forest algorithm are used to train the user recognition model, and the test set is used to validate it. In a specific implementation, the training process is as follows:
1. A sample set S' of the same size as the training set S is drawn from S at random with replacement; the samples not drawn form the out-of-bag (OOB) data. Sampling is performed n times in total, generating n training subsets.
2. On the n training subsets, n CART decision tree models G_m(x), m ∈ {1, 2, …, n}, are trained respectively.
3. For the t-th decision tree model G_t(x), assuming the training samples have W feature dimensions, a subset of w features (w < W) is randomly selected at each node, and the best feature among them is chosen for splitting according to the information gain index. For a class x_i, its information is defined as I(X = x_i) = −log₂ P(x_i), where I(x) is the information of the random variable and P(x_i) is the probability that x_i occurs.
4. Each tree is split in the manner of step 3 until all training examples at a node belong to the same class; no pruning is needed during splitting.
5. The generated decision trees are combined into a random forest F, completing the training of the user recognition model.
After the user recognition model is built, the importance of each sample feature in the sample set can be output; the calculation is as follows:
1. For each decision tree in the random forest F, the out-of-bag error is computed on its OOB data and recorded as errOOB1.
2. Noise is randomly added to sample feature X across all OOB samples, and the OOB error is computed again and recorded as errOOB2.
3. If the random forest contains N trees, the importance of sample feature X is
P = (1/N) Σ (errOOB2 − errOOB1)
where P is the importance of sample feature X.
After the importance of each sample feature is computed, the features are sorted by importance, and the top-ranked sample features are used as the features for identifying target users.
In the user recognition model training method provided by the above embodiment, sample feature data is obtained by feature extraction and feature derivation on the sample data, and the sample feature data is then cleaned. Feature derivation enriches the resulting sample feature data, and data cleaning reduces the interference of dirty data with the user recognition model. Training the user recognition model on the cleaned sample feature data with the random forest algorithm improves the accuracy of the model's feature factor recognition.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of a user screening method provided by an embodiment of this application. The method performs image processing on the offline image data of the user to be identified and, based on the processed user data, uses the pre-trained user recognition model to match the feature factors in the user's feature data, thereby screening out target users.
As shown in FIG. 2, the user screening method specifically includes steps S201 to S205.
S201. Acquire offline image data and online data of the user to be identified, the offline image data including a visit record image.
Specifically, the offline image data includes a visit record image and a face recognition image. The visit record image includes a scanned or photographed image of the offline visit record that business personnel make, based on their communication with the user to be identified, during an offline visit to that user.
The online data includes the user's non-insurance sales data, online interaction data, basic information, asset information, liability information, third-party user portrait, and the like. The third-party user portrait may be a user portrait generated by the user on other platforms and may include information such as user hobbies and purchase preferences.
S202. Perform image processing on the visit record image to obtain the offline visit data of the user to be identified.
Specifically, the offline visit data may include the user's name, ID number, degree of familiarity with the product, names and prices of products the user has purchased, user needs, and so on. Image processing is performed on the offline visit records made by business personnel to extract the offline visit data of the user to be identified.
In some embodiments, referring to FIG. 3, to improve the accuracy of image processing, step S202 may include:
S2021. Preprocess the visit record image.
Specifically, the preprocessing may include binarization, noise removal, tilt correction, and the like; when the visit record image is a color image, it can be binarized to obtain a black-and-white binary image. Preprocessing improves the recognition accuracy of the visit record image.
S2022. Perform layout analysis and character recognition on the preprocessed visit record image to obtain a recognition result.
Specifically, layout analysis refers to segmenting the visit record image into paragraphs and lines according to the text content it contains; after segmentation, character cutting and character recognition are performed on the text in each paragraph and line, so that the text content of the visit record image is recognized.
S2023. Determine the offline visit data according to the recognition result.
Specifically, determining the offline visit data from the recognition result may refer to post-processing the recognition result and using the post-processed data as the offline visit data. Post-processing means correcting the recognition result according to the specific linguistic context, which improves the accuracy of image processing; the post-processed recognition result is saved as the offline visit data.
S203. Use the offline visit data and the online data as the basic data of the user to be identified, and perform feature derivation on the basic data to obtain the feature data of the user to be identified.
Specifically, the processed offline visit data and the online data together serve as the basic data of the user to be identified, and feature derivation is performed on the basic data to obtain the feature data.
The feature data of the user to be identified includes features and feature values. For example, user basic information may include features such as gender, age, and occupation, and user asset information may include features such as annual income, movable property, and real estate. When the user's basic information is gender male and age 35, the features and values are: feature gender with value male; feature age with value 35.
Performing feature derivation on the basic data refers to performing feature learning on it to obtain new user data, which is used together with the basic data as the feature data of the user to be identified. In a specific implementation, feature derivation can be performed in the following three ways: 1. basic transformation of a single feature, such as squaring, taking the square root, or log transformation; 2. derivation by adding a time dimension, for example taking time slices of the basic data to obtain the user's online interaction data, offline interaction data, and non-insurance sales data within 1 month, 3 months, 6 months, and 12 months; 3. multi-feature operations, such as adding two features, multiplying them, or computing a ratio between features to obtain a new feature, for example aggregating user asset information and user liability information, or adding and multiplying them, to obtain new data for the user to be identified.
S204. Perform feature factor matching on the feature data of the user to be identified according to the pre-trained user recognition model.
Specifically, the pre-trained user recognition model is used to identify the feature factors of the target user. Since the output of the trained model is feature importances, the model is used to match the user's feature data and judge whether it includes the important features output by the model.
S205. When the matching degree between the feature data of the user to be identified and the feature factors in the user recognition model meets a preset condition, determine that the user to be identified is the target user.
Specifically, when the matching degree meets the preset condition, the user to be identified is a user with purchasing potential and is determined to be the target user. The preset condition may be that the matching degree reaches a preset threshold, or that it lies within a numerical range.
When the preset condition is that the matching degree reaches a preset threshold, the threshold may be a percentage or a specific value preset by an engineer. When the threshold is a percentage, the percentage refers to the proportion of the important features output by the user recognition model that are matched by the features of the user to be identified. In other embodiments, other models can be trained to calculate and adjust the specific value of the threshold, and models can also be trained to adjust the weight coefficients of the important features.
The user screening method disclosed in the above embodiment performs image processing on the acquired offline image data to obtain the offline visit data, uses the offline visit data and the online data together as the basic data of the user to be identified, and thereby increases the amount of basic data. Feature derivation on the basic data yields the feature data, increasing its volume and enriching the data of new users. The feature data is matched against feature factors according to the user recognition model; when the matching degree meets the preset condition, the user to be identified is determined to be the target user, and using the trained user recognition model improves the accuracy of target user screening.
Referring to FIG. 4, FIG. 4 is a schematic flowchart of a user screening method provided by an embodiment of this application. The method performs image processing on the offline image data of the user to be identified, associates the recognized offline data with the online data using face data to obtain the user data, and then screens target users with the pre-trained user recognition model.
As shown in FIG. 4, the user screening method specifically includes steps S301 to S306.
S301. Acquire offline image data and online data of the user to be identified, the offline image data including a visit record image and a face recognition image.
Specifically, the visit record image includes a scanned or photographed image of the offline visit record that business personnel make, based on their communication with the user to be identified, during an offline visit to that user; the face recognition image includes the face image collected during the offline visit.
S302. Perform image processing on the visit record image to obtain the offline visit data of the user to be identified.
Specifically, the offline visit data may include the user's name, ID number, degree of familiarity with the product, names and prices of products the user has purchased, user needs, and so on. Image processing is performed on the offline visit records made by business personnel to extract the offline visit data of the user to be identified.
In some embodiments, the face recognition image may also include a face image collected when the user to be identified performs face recognition sign-in while participating in an offline activity.
Specifically, when the user participates in an offline activity, a face capture device is set up at the sign-in location in place of manual sign-in; it collects the face information of users who sign in, and the collected face information is stored in the database together with the sign-in time and location. In this case, when the visit record is processed to obtain the user's offline visit data, the face image collected during the offline activity can also be matched against the user's offline visit data, so that the sign-in time and location of the user's participation in offline activities are also used as the user's offline visit data.
S303. Associate the offline visit data with the online data according to the face recognition image to obtain the basic data of the user to be identified, and perform feature derivation on the basic data to obtain the feature data of the user to be identified.
Specifically, since the online data of the user to be identified includes the user's basic information, that is, name, gender, ID number, face image, and the like, after the offline visit data is obtained through image processing it must be saved together with the user's online data as the user's basic data. Therefore, when data association is performed, the face recognition image corresponding to the offline visit data is matched against the face image in the online data; when the match succeeds, an association between the offline visit data and the online data is established, and the two are integrated to obtain the basic data of the user to be identified. Using face recognition image matching instead of manual data entry by business personnel improves data entry efficiency; moreover, because offline visit data is mostly entered by filtering on names, manual entry easily confuses the online data and offline visit data of users with the same name, and face recognition image matching also avoids this confusion of user information.
Performing feature derivation on the basic data refers to performing feature learning on it to obtain new user data, which is used together with the basic data as the feature data of the user to be identified.
S304. Perform data cleaning on the feature data of the user to be identified to obtain cleaned feature data.
Specifically, data cleaning of the feature data includes null value checking, outlier checking, and the like. When null value checking is performed, the feature data includes many different types of data, so a general global-constant fill cannot be used and different types of feature data must be considered separately: some interaction behavior features can be filled with zero when they cannot be collected or are missing; the user's income and assets can be filled with the mean; for gender, occupation, and the like, the null value can be kept as unknown.
S305. Perform feature factor matching on the feature data of the user to be identified according to the pre-trained user recognition model.
Specifically, since the output of the trained user recognition model is the importance of feature factors, the model is used to perform feature factor matching on the cleaned feature data and judge whether it includes the important feature factors output by the model.
S306. When the matching degree between the feature data of the user to be identified and the feature factors in the user recognition model meets the preset condition, determine that the user to be identified is the target user.
Specifically, when the matching degree between the cleaned feature data and the feature factors in the user recognition model meets the preset condition, the user to be identified is a user with purchasing potential and is determined to be the target user.
The above embodiment discloses a user screening method that uses face recognition images to associate the offline visit data with the online data so that the two together serve as the basic data, improving the accuracy and efficiency of offline visit data entry. Feature derivation on the basic data yields the feature data, increasing its volume and enriching the data of new users. Data cleaning of the feature data reduces the interference of dirty data with the user recognition model and improves the accuracy of feature factor matching. The feature data is matched against feature factors according to the user recognition model; when the matching degree meets the preset condition, the user to be identified is determined to be the target user, and using the trained user recognition model improves the accuracy of target user identification.
Referring to FIG. 5, FIG. 5 is a schematic block diagram of a user recognition model training apparatus provided by an embodiment of this application; the apparatus can be configured in a server to execute the aforementioned user recognition model training method. As shown in FIG. 5, the user recognition model training apparatus 400 includes: a data acquisition module 401, a data cleaning module 402, a data classification module 403, and a model training module 404.
The data acquisition module 401 is configured to acquire sample data and sequentially perform feature extraction and feature derivation on the sample data to obtain sample feature data.
The data cleaning module 402 is configured to perform data cleaning on the sample feature data to obtain cleaned sample feature data.
The data classification module 403 is configured to classify the cleaned sample feature data according to product purchase rules to obtain a positive sample set and a negative sample set respectively.
The model training module 404 is configured to train a user recognition model with the random forest algorithm based on the positive and negative sample sets to obtain the pre-trained user recognition model.
Referring to FIG. 6, FIG. 6 is a schematic block diagram of a user screening apparatus provided by an embodiment of this application; the apparatus is configured to execute the aforementioned user screening method and can be configured in a server or a terminal. As shown in FIG. 6, the user screening apparatus 500 includes: a data acquisition module 501, an image processing module 502, a feature derivation module 503, a feature matching module 504, and a user determination module 505.
The data acquisition module 501 is configured to acquire offline image data and online data of the user to be identified, the offline image data including a visit record image.
The image processing module 502 is configured to perform image processing on the visit record image to obtain the offline visit data of the user to be identified. The image processing module 502 includes a preprocessing submodule 5021, a recognition result submodule 5022, and a data determination submodule 5023.
Specifically, the preprocessing submodule 5021 is configured to preprocess the visit record image; the recognition result submodule 5022 is configured to perform layout analysis and character recognition on the preprocessed visit record image to obtain a recognition result; and the data determination submodule 5023 is configured to determine the offline visit data according to the recognition result.
The feature derivation module 503 is configured to use the offline visit data and the online data as the basic data of the user to be identified and perform feature derivation on the basic data to obtain the feature data of the user to be identified.
The feature matching module 504 is configured to perform feature factor matching on the feature data of the user to be identified according to the pre-trained user identification model, the pre-trained user identification model being used to identify the feature factors of the target user.
The user determination module 505 is configured to determine that the user to be identified is the target user when the matching degree between the feature data of the user to be identified and the feature factors in the user identification model meets a preset condition.
Referring to FIG. 7, FIG. 7 is a schematic block diagram of a user screening apparatus provided by an embodiment of this application; the apparatus is configured to execute the aforementioned user screening method and can be configured in a server or a terminal. As shown in FIG. 7, the user screening apparatus 600 includes: a data acquisition module 601, an image processing module 602, a feature derivation module 603, a data cleaning module 604, a feature matching module 605, and a user determination module 606.
The data acquisition module 601 is configured to acquire offline image data and online data of the user to be identified, the offline image data including a visit record image and a face recognition image.
The image processing module 602 is configured to perform image processing on the visit record image to obtain the offline visit data of the user to be identified.
The feature derivation module 603 is configured to associate the offline visit data with the online data according to the face recognition image to obtain the basic data of the user to be identified and perform feature derivation on the basic data to obtain the feature data of the user to be identified.
The data cleaning module 604 is configured to perform data cleaning on the feature data of the user to be identified to obtain cleaned feature data.
The feature matching module 605 is configured to perform feature factor matching on the feature data of the user to be identified according to the pre-trained user identification model, the pre-trained user identification model being used to identify the feature factors of the target user.
The user determination module 606 is configured to determine that the user to be identified is the target user when the matching degree between the feature data of the user to be identified and the feature factors in the user identification model meets a preset condition.
It should be noted that those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the user recognition model training apparatus and its modules and of the user screening apparatus and its modules described above may refer to the corresponding processes in the foregoing embodiments of the user recognition model training method and the user screening method, and are not repeated here.
The user screening apparatus described above can be implemented in the form of a computer program that can run on a computer device as shown in FIG. 8. Referring to FIG. 8, FIG. 8 is a schematic block diagram of the structure of a computer device provided by an embodiment of this application; the computer device may be a server or a terminal. Referring to FIG. 8, the computer device includes a processor, a memory, and a network interface connected through a system bus, where the memory may include a non-volatile storage medium and an internal memory. The non-volatile storage medium can store an operating system and a computer program. The computer program includes program instructions which, when executed, cause the processor to execute the user screening method shown in any of the above embodiments. The processor is used to provide computing and control capabilities and to support the operation of the entire computer device. The internal memory provides an environment for running the computer program in the non-volatile storage medium; when the computer program is executed by the processor, the processor is caused to execute the user screening method shown in any of the above embodiments, wherein the user screening method includes:
acquiring offline image data and online data of the user to be identified, the offline image data including a visit record image; performing image processing on the visit record image to obtain the offline visit data of the user to be identified; using the offline visit data and the online data as the basic data of the user to be identified, and performing feature derivation on the basic data to obtain the feature data of the user to be identified; performing feature factor matching on the feature data of the user to be identified according to the pre-trained user recognition model, the pre-trained user recognition model being used to identify the feature factors of the target user; and when the matching degree between the feature data of the user to be identified and the feature factors in the user recognition model meets a preset condition, determining that the user to be identified is the target user.
The network interface is used for network communication, such as sending assigned tasks. Those skilled in the art can understand that the structure shown in FIG. 8 is only a block diagram of part of the structure related to the solution of this application and does not constitute a limitation on the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
Embodiments of this application also provide a computer-readable storage medium storing a computer program; the computer-readable storage medium may be non-volatile or volatile. The computer program includes program instructions, and the processor executes the program instructions to implement the user screening method shown in any of the foregoing embodiments of this application, wherein the user screening method includes the following steps: acquiring offline image data and online data of the user to be identified, the offline image data including a visit record image; performing image processing on the visit record image to obtain the offline visit data of the user to be identified; using the offline visit data and the online data as the basic data of the user to be identified, and performing feature derivation on the basic data to obtain the feature data of the user to be identified; performing feature factor matching on the feature data of the user to be identified according to the pre-trained user recognition model, the pre-trained user recognition model being used to identify the feature factors of the target user; and when the matching degree between the feature data of the user to be identified and the feature factors in the user recognition model meets a preset condition, determining that the user to be identified is the target user.
The computer-readable storage medium may be an internal storage unit of the computer device described in the foregoing embodiments, such as the hard disk or memory of the computer device. It may also be an external storage device of the computer device, such as a plug-in hard disk, smart media card (SMC), secure digital (SD) card, or flash card equipped on the computer device.

Claims (20)

  1. A user screening method, comprising:
    acquiring offline image data and online data of a user to be identified, the offline image data including a visit record image;
    performing image processing on the visit record image to obtain offline visit data of the user to be identified;
    using the offline visit data and the online data as basic data of the user to be identified, and performing feature derivation on the basic data to obtain feature data of the user to be identified;
    performing feature factor matching on the feature data of the user to be identified according to a pre-trained user identification model, the pre-trained user identification model being used to identify feature factors of a target user;
    when a matching degree between the feature data of the user to be identified and the feature factors in the user identification model meets a preset condition, determining that the user to be identified is the target user.
  2. The user screening method according to claim 1, wherein performing image processing on the visit record image to obtain the offline visit data of the user to be identified comprises:
    preprocessing the visit record image, the preprocessing including binarization, noise removal, and tilt correction;
    performing layout analysis and character recognition on the preprocessed visit record image to obtain a recognition result;
    determining the offline visit data according to the recognition result.
  3. The user screening method according to claim 1, wherein the offline image data further includes a face recognition image;
    using the offline visit data and the online data as the basic data of the user to be identified comprises:
    associating the offline visit data with the online data according to the face recognition image to obtain the basic data of the user to be identified.
  4. The user screening method according to claim 1, wherein before performing feature factor matching on the feature data of the user to be identified according to the pre-trained user identification model, the method further comprises:
    performing data cleaning on the feature data of the user to be identified to obtain cleaned feature data of the user to be identified.
  5. The user screening method according to claim 1, further comprising:
    acquiring sample data, and sequentially performing feature extraction and feature derivation on the sample data to obtain sample feature data;
    performing data cleaning on the sample feature data to obtain cleaned sample feature data;
    classifying the cleaned sample feature data according to product purchase rules to obtain a positive sample set and a negative sample set respectively;
    training a user identification model with a random forest algorithm based on the positive sample set and the negative sample set to obtain the pre-trained user identification model.
  6. The user screening method according to claim 5, further comprising:
    performing variable processing on the cleaned sample feature data to obtain processed sample feature data;
    wherein classifying the cleaned sample feature data according to the product purchase rules to obtain the positive sample set and the negative sample set respectively comprises:
    classifying the processed sample data according to the product purchase rules to obtain the positive sample set and the negative sample set respectively.
  7. The user screening method according to claim 5, wherein before training the user identification model with the random forest algorithm based on the positive sample set and the negative sample set, the method further comprises:
    judging whether a difference in sample counts between the negative sample set and the positive sample set is greater than a preset threshold;
    if the difference in sample counts between the negative sample set and the positive sample set is greater than the preset threshold, analyzing the samples in the positive sample set to synthesize new samples, and adding the new samples to the positive sample set to construct a new positive sample set.
  8. A user screening apparatus, comprising:
    a data acquisition module, configured to acquire offline image data and online data of a user to be identified, the offline image data including a visit record image;
    an image processing module, configured to perform image processing on the visit record image to obtain offline visit data of the user to be identified;
    a feature derivation module, configured to use the offline visit data and the online data as basic data of the user to be identified and to perform feature derivation on the basic data to obtain feature data of the user to be identified;
    a feature matching module, configured to perform feature factor matching on the feature data of the user to be identified according to a pre-trained user identification model, the pre-trained user identification model being used to identify feature factors of a target user;
    a user determination module, configured to determine that the user to be identified is the target user when a matching degree between the feature data of the user to be identified and the feature factors in the user identification model meets a preset condition.
  9. The user screening apparatus according to claim 8, wherein the image processing module comprises:
    a preprocessing submodule, configured to preprocess the visit record image;
    a recognition result submodule, configured to perform layout analysis and character recognition on the preprocessed visit record image to obtain a recognition result;
    a data determination submodule, configured to determine the offline visit data according to the recognition result.
  10. The user screening apparatus according to claim 8, wherein the offline image data further includes a face recognition image, and the feature derivation module is
    further configured to associate the offline visit data with the online data according to the face recognition image to obtain the basic data of the user to be identified, and to perform feature derivation on the basic data to obtain the feature data of the user to be identified.
  11. A computer device, comprising a memory and a processor;
    the memory is configured to store a computer program;
    the processor is configured to execute the computer program and, when executing the computer program, to implement a user screening method:
    wherein the user screening method includes:
    acquiring offline image data and online data of a user to be identified, the offline image data including a visit record image;
    performing image processing on the visit record image to obtain offline visit data of the user to be identified;
    using the offline visit data and the online data as basic data of the user to be identified, and performing feature derivation on the basic data to obtain feature data of the user to be identified;
    performing feature factor matching on the feature data of the user to be identified according to a pre-trained user identification model, the pre-trained user identification model being used to identify feature factors of a target user;
    when a matching degree between the feature data of the user to be identified and the feature factors in the user identification model meets a preset condition, determining that the user to be identified is the target user.
  12. The computer device according to claim 11, wherein performing image processing on the visit record image to obtain the offline visit data of the user to be identified comprises:
    preprocessing the visit record image, the preprocessing including binarization, noise removal, and tilt correction;
    performing layout analysis and character recognition on the preprocessed visit record image to obtain a recognition result;
    determining the offline visit data according to the recognition result.
  13. The computer device according to claim 11, wherein the offline image data further includes a face recognition image;
    using the offline visit data and the online data as the basic data of the user to be identified comprises:
    associating the offline visit data with the online data according to the face recognition image to obtain the basic data of the user to be identified.
  14. The computer device according to claim 11, wherein before performing feature factor matching on the feature data of the user to be identified according to the pre-trained user identification model, the method further comprises:
    performing data cleaning on the feature data of the user to be identified to obtain cleaned feature data of the user to be identified.
  15. The computer device according to claim 11, wherein the method further comprises:
    acquiring sample data, and sequentially performing feature extraction and feature derivation on the sample data to obtain sample feature data;
    performing data cleaning on the sample feature data to obtain cleaned sample feature data;
    classifying the cleaned sample feature data according to product purchase rules to obtain a positive sample set and a negative sample set respectively;
    training a user identification model with a random forest algorithm based on the positive sample set and the negative sample set to obtain the pre-trained user identification model.
  16. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to implement a user screening method, wherein the user screening method includes the following steps:
    acquiring offline image data and online data of a user to be identified, the offline image data including a visit record image;
    performing image processing on the visit record image to obtain offline visit data of the user to be identified;
    using the offline visit data and the online data as basic data of the user to be identified, and performing feature derivation on the basic data to obtain feature data of the user to be identified;
    performing feature factor matching on the feature data of the user to be identified according to a pre-trained user identification model, the pre-trained user identification model being used to identify feature factors of a target user;
    when a matching degree between the feature data of the user to be identified and the feature factors in the user identification model meets a preset condition, determining that the user to be identified is the target user.
  17. The computer-readable storage medium according to claim 16, wherein performing image processing on the visit record image to obtain the offline visit data of the user to be identified comprises:
    preprocessing the visit record image, the preprocessing including binarization, noise removal, and tilt correction;
    performing layout analysis and character recognition on the preprocessed visit record image to obtain a recognition result;
    determining the offline visit data according to the recognition result.
  18. The computer-readable storage medium according to claim 16, wherein the offline image data further includes a face recognition image;
    using the offline visit data and the online data as the basic data of the user to be identified comprises:
    associating the offline visit data with the online data according to the face recognition image to obtain the basic data of the user to be identified.
  19. The computer-readable storage medium according to claim 16, wherein before performing feature factor matching on the feature data of the user to be identified according to the pre-trained user identification model, the method further comprises:
    performing data cleaning on the feature data of the user to be identified to obtain cleaned feature data of the user to be identified.
  20. The computer-readable storage medium according to claim 16, wherein the method further comprises:
    acquiring sample data, and sequentially performing feature extraction and feature derivation on the sample data to obtain sample feature data;
    performing data cleaning on the sample feature data to obtain cleaned sample feature data;
    classifying the cleaned sample feature data according to product purchase rules to obtain a positive sample set and a negative sample set respectively;
    training a user identification model with a random forest algorithm based on the positive sample set and the negative sample set to obtain the pre-trained user identification model.
PCT/CN2020/093424 2020-03-04 2020-05-29 User screening method, apparatus, device, and storage medium WO2021174699A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010144416.6A CN111506798A (zh) 2020-03-04 2020-03-04 User screening method, apparatus, device, and storage medium
CN202010144416.6 2020-03-04

Publications (1)

Publication Number Publication Date
WO2021174699A1 (zh) 2021-09-10 User screening method, apparatus, device, and storage medium

Family

ID=71863921

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/093424 WO2021174699A1 (zh) 2020-03-04 2020-05-29 User screening method, apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN111506798A (zh)
WO (1) WO2021174699A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111459922A (zh) * 2020-02-17 2020-07-28 平安科技(深圳)有限公司 User identification method, apparatus, device, and storage medium
CN113743752A (zh) * 2021-08-23 2021-12-03 南京星云数字技术有限公司 Data processing method and apparatus
CN117250521B (zh) * 2023-11-17 2024-02-20 江西驴充充物联网科技有限公司 Charging pile battery capacity monitoring system and method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107657048A (zh) * 2017-09-21 2018-02-02 北京麒麟合盛网络技术有限公司 User identification method and apparatus
CN109218390A (zh) * 2018-07-12 2019-01-15 北京比特智学科技有限公司 User screening method and apparatus
CN110049094A (zh) * 2019-02-28 2019-07-23 阿里巴巴集团控股有限公司 Information pushing method and offline display terminal
CN110175298A (zh) * 2019-04-12 2019-08-27 腾讯科技(深圳)有限公司 User matching method
US20190324621A1 (en) * 2018-04-23 2019-10-24 Qualcomm Incorporated System and Methods for Utilizing Multi-Finger Touch Capability to Efficiently Perform Content Editing on a Computing Device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106789844B (zh) * 2015-11-23 2020-06-16 阿里巴巴集团控股有限公司 Malicious user identification method and apparatus
CN105426857B (zh) * 2015-11-25 2019-04-12 小米科技有限责任公司 Face recognition model training method and apparatus
CN106022317A (zh) * 2016-06-27 2016-10-12 北京小米移动软件有限公司 Face recognition method and apparatus
CN109784351B (zh) * 2017-11-10 2023-03-24 财付通支付科技有限公司 Behavior data classification method, classification model training method, and apparatus
HK1250200A2 (zh) * 2018-04-28 2018-11-30 K11 Group Ltd User data acquisition apparatus and information pushing method


Also Published As

Publication number Publication date
CN111506798A (zh) 2020-08-07

Similar Documents

Publication Publication Date Title
WO2019214248A1 (zh) Risk assessment method and apparatus, terminal device, and storage medium
CN107633265B (zh) Data processing method and apparatus for optimizing a credit evaluation model
CN107025596B (zh) Risk assessment method and system
WO2021174699A1 (zh) User screening method, apparatus, device, and storage medium
WO2021164232A1 (zh) User identification method, apparatus, device, and storage medium
CN112150298B (zh) Data processing method, system, device, and readable medium
CN110991474A (zh) Machine learning platform
US20190080352A1 (en) Segment Extension Based on Lookalike Selection
CN112990386A (zh) User value clustering method and apparatus, computer device, and storage medium
CN114118816A (zh) Risk assessment method, apparatus, device, and computer storage medium
CN112990989A (zh) Method, apparatus, device, and medium for generating input data for a value prediction model
CN114743048A (zh) Method and apparatus for detecting abnormal straw images
CN112732908B (zh) Test question novelty evaluation method and apparatus, electronic device, and storage medium
CN110570301B (zh) Risk identification method, apparatus, device, and medium
CN114170000A (zh) User risk category identification method and apparatus, computer device, and medium
CN113888265A (zh) Product recommendation method, apparatus, device, and computer-readable storage medium
Mahalle et al. Data Acquisition and Preparation
CN114548620A (zh) On-time logistics guarantee service recommendation method and apparatus, computer device, and storage medium
CN113822309B (zh) User classification method and apparatus, and non-volatile computer-readable storage medium
KR100686466B1 (ko) Method and system for providing asset evaluation, and system for providing stability analysis of profitability
CN112926998B (zh) Cheating identification method and apparatus
CN112508074B (zh) Visual display method, system, and readable storage medium
US20240211498A1 (en) Evaluation system and evaluation method
CN117874117A (zh) Membership value-added service platform for data information management
CN114022284A (zh) Abnormal transaction detection method and apparatus, electronic device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20922561

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20922561

Country of ref document: EP

Kind code of ref document: A1