CN110533085A - Same-person identification method and device, storage medium, and computer equipment - Google Patents
- Publication number
- CN110533085A (CN201910740557.1A)
- Authority
- CN
- China
- Prior art keywords
- user
- sample
- cluster
- users
- people
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/211—Selection of the most significant subset of features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/04—Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
This application discloses a same-person identification method and device, a storage medium, and computer equipment. The method comprises: clustering sample users based on the feature information of the sample users to obtain at least one sample user cluster; extracting at least one group of training sample users from each sample user cluster, and obtaining same-person annotation information for the training sample users; training a same-person identification model using the training sample users and the corresponding same-person annotation information; and performing same-person identification on users to be identified according to the trained same-person identification model. By clustering the sample users, the application reduces the amount of training required for the same-person identification model, optimises training, and improves training efficiency.
Description
Technical field
This application relates to the technical field of data analysis, and in particular to a same-person identification method and device, a storage medium, and computer equipment.
Background
The internet is currently booming and has given rise to a number of e-commerce and online financial service companies. Because e-commerce companies offer various new-user subsidies and financial service companies lend directly to users, many users change their mobile phone numbers and re-register in order to obtain benefits. How to determine whether registered website users or service recipients are the same person has therefore become key for e-commerce and internet financial service companies seeking to reduce operating costs and risk.
In the field of same-person identification, the construction of training samples is critical to training a same-person identification model. How to quickly determine, from a large number of sample users, which two users are the same user, and thereby build a training sample set, is an important problem in the field.
Summary of the invention
In view of this, the application provides a same-person identification method and device, a storage medium, and computer equipment which, by clustering sample users, reduce the amount of training required for the same-person identification model, optimise training, and improve training efficiency.
According to one aspect of the application, a same-person identification method is provided, comprising:
Clustering sample users based on the feature information of the sample users to obtain at least one sample user cluster;
Extracting at least one group of training sample users from each sample user cluster, and obtaining same-person annotation information for the training sample users;
Training a same-person identification model using the training sample users and the corresponding same-person annotation information;
Performing same-person identification on users to be identified according to the trained same-person identification model.
Specifically, before clustering the sample users based on the feature information of the sample users to obtain at least one sample user cluster, the method further comprises:
Obtaining basic data of the sample users;
Compiling, based on the basic data of the sample users, the feature information of the sample users according to preset feature categories.
Specifically, clustering the sample users based on the feature information of the sample users to obtain at least one sample user cluster comprises:
Determining the number of clusters according to the number of sample users, and generating that number of initial cluster centres;
Performing K-means clustering on the feature information of the sample users according to the initial cluster centres, to obtain that number of sample user clusters and the cluster centre corresponding to each sample user cluster.
Specifically, any group of training sample users comprises the sample user corresponding to the cluster centre of any sample user cluster and any other sample user in the same sample user cluster, and the same-person annotation information comprises a same-person label or a non-same-person label.
Specifically, performing same-person identification on the users to be identified according to the trained same-person identification model comprises:
Compiling the feature information of the users to be identified from their basic data according to the preset feature categories;
Clustering the users to be identified based on their feature information, to obtain at least one to-be-identified user cluster and the cluster centre corresponding to each such cluster;
Obtaining the central user and the comparison users of any to-be-identified user cluster, wherein the central user is the user to be identified that corresponds to the cluster centre of that cluster, and the comparison users are all users to be identified in that cluster other than the one at the cluster centre;
Inputting the feature information of the central user of any to-be-identified user cluster and the feature information of any comparison user into the trained same-person identification model, to obtain a result indicating whether the central user and that comparison user are the same user.
Specifically, the method further comprises:
If the central user and any comparison user are the same user, establishing a same-person set corresponding to the central user from that comparison user.
Specifically, the basic feature data include, but are not limited to, at least one of, or a combination of, the communication data, operator service data, and e-commerce operation data of the sample users.
According to another aspect of the application, a same-person identification device is provided, comprising:
A sample clustering module, configured to cluster sample users based on the feature information of the sample users to obtain at least one sample user cluster;
A training sample acquisition module, configured to extract at least one group of training sample users from each sample user cluster, and to obtain same-person annotation information for the training sample users;
A training module, configured to train a same-person identification model using the training sample users and the corresponding same-person annotation information;
An identification module, configured to perform same-person identification on users to be identified according to the trained same-person identification model.
Specifically, the device further comprises:
A basic data acquisition module, configured to obtain basic data of the sample users before the sample users are clustered based on their feature information to obtain at least one sample user cluster;
A feature information statistics module, configured to compile, based on the basic data of the sample users, the feature information of the sample users according to preset feature categories.
Specifically, the sample clustering module comprises:
A cluster centre generation unit, configured to determine the number of clusters according to the number of sample users and to generate that number of initial cluster centres;
A clustering unit, configured to perform K-means clustering on the feature information of the sample users according to the initial cluster centres, to obtain that number of sample user clusters and the cluster centre corresponding to each sample user cluster.
Specifically, any group of training sample users comprises the sample user corresponding to the cluster centre of any sample user cluster and any other sample user in the same sample user cluster, and the same-person annotation information comprises a same-person label or a non-same-person label.
Specifically, the identification module comprises:
A feature information statistics unit, configured to compile the feature information of the users to be identified from their basic data according to the preset feature categories;
A clustering unit, configured to cluster the users to be identified based on their feature information, to obtain at least one to-be-identified user cluster and the cluster centre corresponding to each such cluster;
An identification user acquisition unit, configured to obtain the central user and the comparison users of any to-be-identified user cluster, wherein the central user is the user to be identified that corresponds to the cluster centre of that cluster, and the comparison users are all users to be identified in that cluster other than the one at the cluster centre;
An identification unit, configured to input the feature information of the central user of any to-be-identified user cluster and the feature information of any comparison user into the trained same-person identification model, to obtain a result indicating whether the central user and that comparison user are the same user.
Specifically, the device further comprises:
A result output module, configured to establish, if the central user and any comparison user are the same user, a same-person set corresponding to the central user from that comparison user.
Specifically, the basic feature data include, but are not limited to, at least one of, or a combination of, the communication data, operator service data, and e-commerce operation data of the sample users.
According to yet another aspect of the application, a storage medium is provided, on which a computer program is stored, the program implementing the above same-person identification method when executed by a processor.
According to yet another aspect of the application, computer equipment is provided, comprising a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, the processor implementing the above same-person identification method when executing the program.
With the above technical solution, the same-person identification method and device, storage medium, and computer equipment provided by the application first cluster the sample users using their feature information, so that sample users who are more likely to be the same person are placed in the same cluster. Training sample users are then selected within each cluster and annotated with same-person information, the same-person identification model is trained using the feature information of the training sample users and the same-person annotations, and finally the model is used to perform same-person identification on the users to be identified. By clustering the sample users, the application reduces the amount of training required for the same-person identification model, optimises training, and improves training efficiency.
The above is only an overview of the technical solution of the application. So that the technical means of the application can be understood more clearly and implemented in accordance with the specification, and so that the above and other objects, features, and advantages of the application are more readily apparent, specific embodiments of the application are set out below.
Brief description of the drawings
The drawings described here are provided for a further understanding of the application and form part of the application. The illustrative embodiments of the application and their descriptions are used to explain the application and do not unduly limit it. In the drawings:
Fig. 1 shows a flow diagram of a same-person identification method provided by an embodiment of the application;
Fig. 2 shows a flow diagram of another same-person identification method provided by an embodiment of the application;
Fig. 3 shows a structural diagram of a same-person identification device provided by an embodiment of the application;
Fig. 4 shows a structural diagram of another same-person identification device provided by an embodiment of the application.
Detailed description
The application is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, where no conflict arises, the embodiments of the application and the features in the embodiments may be combined with one another.
This embodiment provides a same-person identification method. As shown in Fig. 1, the method comprises:
Step 101, cluster the sample users based on the feature information of the sample users, to obtain at least one sample user cluster.
In the embodiments of the application, a same-person identification model should be trained before same-person identification is performed. Ordinarily, the samples needed for training, namely samples belonging to the same user, have to be picked out of the sample data by manual screening. However, picking out the sample data that belongs to the same user from a large volume of sample data is very labour-intensive. The embodiments of the application therefore cluster the sample users so that sample users who may share an identity are first grouped into one cluster, which greatly reduces the annotation workload.
It should be noted that the embodiments of the application cluster the sample users based on their feature information. The feature information of a sample user may include the user's communication data, operator service data, e-commerce operation data, and so on. The communication data may include incoming and outgoing call numbers; the operator service data may include SMS messages sent by various operating institutions, such as credit card transaction notifications sent by a bank; and the e-commerce operation data may include the sample user's purchases, favourites, traffic data, and so on, on e-commerce platforms.
Step 102, extract at least one group of training sample users from each sample user cluster, and obtain same-person annotation information for the training sample users.
After the sample users have been divided into one or more sample user clusters, that is, after sample users who are more likely to be the same person have been grouped into one cluster, the sample users in each sample user cluster are divided into multiple groups of training sample users, each group comprising two sample users. Once each sample user cluster has been divided into its groups of training sample users, each group can be judged, manually or by other means, to be the same user or not, and annotated accordingly.
Step 103, train the same-person identification model using the training sample users and the corresponding same-person annotation information.
After the training sample users have been annotated, the same-person identification model can be trained on the feature information of the training samples and the corresponding same-person annotation information; the same-person identification model is a binary classification model. The embodiments of the application place no limit on the same-person identification model or the training method. For example, a basic logistic regression model may be used: a loss function is set, and the model parameters are trained by stochastic gradient descent to obtain the final same-person identification model, so that the trained model can then be used to perform same-person identification on the users to be identified.
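The logistic regression example mentioned above can be illustrated with a minimal sketch. It is not the implementation of the application: the function names, the learning rate, the number of epochs, and the way a pair of users is turned into one feature vector (here, hypothetically, the element-wise absolute difference of the two users' 0/1 feature vectors) are all assumptions made for illustration. The loss is the usual log loss, minimised by stochastic gradient descent one training pair at a time.

```python
import math
import random
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_sgd(samples: Sequence[Sequence[float]], labels: Sequence[int],
                       epochs: int = 200, learning_rate: float = 0.5,
                       seed: int = 0) -> Tuple[List[float], float]:
    """Fit weights and bias of a binary classifier with per-sample SGD on log loss."""
    rng = random.Random(seed)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the training pairs in random order
        for i in order:
            logit = sum(w * x for w, x in zip(weights, samples[i])) + bias
            error = sigmoid(logit) - labels[i]  # gradient of log loss w.r.t. the logit
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, samples[i])]
            bias -= learning_rate * error
    return weights, bias


def predict_same_person(weights: Sequence[float], bias: float,
                        features: Sequence[float]) -> bool:
    """Return True when the model scores the pair as 'same person'."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias) >= 0.5


# Hypothetical annotated pairs: each vector marks the features on which two
# users differ; label 1 = same person, label 0 = not the same person.
pair_features = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1],
                 [1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
pair_labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = train_logistic_sgd(pair_features, pair_labels)
```

Any other binary classifier could take the place of this model, since the text places no limit on the model or the training method.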
Step 104, perform same-person identification on the users to be identified according to the trained same-person identification model.
When the same-person identification model is used to identify the users to be identified, the users to be identified can likewise first be clustered into several clusters using the clustering method applied to the sample users, and the feature information of two users in the same cluster then input into the same-person identification model for identification. This avoids pairing up all users to be identified and running same-person identification on every pair, which wastes identification time, and so improves identification efficiency.
Of course, to improve identification accuracy, all users to be identified may also be paired up and every pair input into the same-person identification model for same-person identification.
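The trade-off between the two options can be made concrete by counting model calls. The sketch below is illustrative only; the figure of 1,000 users split evenly into 10 clusters is a hypothetical example, not data from the application.

```python
from typing import Sequence


def all_pairs_count(n_users: int) -> int:
    """Number of model calls when every pair of users is compared."""
    return n_users * (n_users - 1) // 2


def within_cluster_pairs_count(cluster_sizes: Sequence[int]) -> int:
    """Number of model calls when only users in the same cluster are paired."""
    return sum(size * (size - 1) // 2 for size in cluster_sizes)


sizes = [100] * 10  # hypothetical: 1,000 users in 10 equal clusters
print(all_pairs_count(sum(sizes)))        # 499500
print(within_cluster_pairs_count(sizes))  # 49500
```

Under these assumed sizes, clustering first cuts the number of comparisons by a factor of about ten, at the cost of never comparing users that fall into different clusters.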
By applying the technical solution of this embodiment, the sample users are first clustered using their feature information, so that sample users who are more likely to be the same person are placed in the same cluster. Training sample users are then selected within each cluster and annotated with same-person information, the same-person identification model is trained using the feature information of the training sample users and the same-person annotations, and finally the model is used to perform same-person identification on the users to be identified. By clustering the sample users, the application reduces the amount of training required for the same-person identification model, optimises training, and improves training efficiency.
Further, as a refinement and extension of the specific implementation of the above embodiment, and in order to fully illustrate the specific implementation process of this embodiment, another same-person identification method is provided. As shown in Fig. 2, the method comprises:
Step 201, obtain the basic data of the sample users.
The basic feature data include, but are not limited to, at least one of, or a combination of, the communication data, operator service data, and e-commerce operation data of the sample users.
Step 202, based on the basic data of the sample users, compile the feature information of the sample users according to preset feature categories.
According to the preset feature categories, the basic data of a sample user are summarised into the feature information of that sample user. For example, suppose the preset feature categories include incoming telephone numbers, the incoming telephone numbers in a sample user's basic data are the three numbers A, B, and D, and the complete telephone number database contains the five numbers A, B, C, D, and E. The incoming telephone number feature extracted for that sample user is then (1, 1, 0, 1, 0). Other feature information is not illustrated here.
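The incoming-number example above amounts to a 0/1 encoding over a fixed vocabulary. A minimal sketch follows; the function name `encode_multi_hot` and the variable names are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Sequence


def encode_multi_hot(observed: Iterable[str], vocabulary: Sequence[str]) -> List[int]:
    """Return a 0/1 vector with one position per vocabulary entry."""
    seen = set(observed)
    return [1 if item in seen else 0 for item in vocabulary]


# Data mirroring the example in the text.
all_numbers = ["A", "B", "C", "D", "E"]  # the complete telephone number database
incoming = ["A", "B", "D"]               # incoming numbers of one sample user

feature = encode_multi_hot(incoming, all_numbers)
print(feature)  # [1, 1, 0, 1, 0]
```

Vectors built this way for each preset feature category could be concatenated into one feature vector per user, although the text does not specify how the categories are combined.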
Step 203, determine the number of clusters according to the number of sample users, and generate that number of initial cluster centres.
Step 204, perform K-means clustering on the feature information of the sample users according to the initial cluster centres, to obtain that number of sample user clusters and the cluster centre corresponding to each sample user cluster.
In steps 203 and 204, once the feature information of the sample users has been extracted, cluster analysis is performed on it to divide the sample users into sample user clusters. K-means clustering may be used for this; other clustering methods may of course also be used. The embodiments of the application are explained using K-means clustering. First, the number of clusters K is determined from the number of sample users, for example one cluster per 100 sample users. Next, K initial cluster centres are generated at random, or in some other agreed manner. Finally, based on the K initial cluster centres, the distance between each sample and each cluster centre is computed and each sample is assigned to its nearest cluster centre, ultimately yielding K sample user clusters and K cluster centres.
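The clustering steps described above can be sketched in plain Python as follows. This is a standard K-means loop written under stated assumptions rather than the application's own code: the initial centres are drawn at random from the samples, squared Euclidean distance is used, and the assignment step is repeated with recomputed centres until the centres stop moving, which is the usual K-means procedure even though the text only spells out the assignment step. The names `choose_k` and `k_means` are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def choose_k(n_users: int, users_per_cluster: int = 100) -> int:
    """One cluster per `users_per_cluster` users, and at least one cluster."""
    return max(1, math.ceil(n_users / users_per_cluster))


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points: List[Vector], k: int, max_iterations: int = 50,
            seed: int = 0) -> Tuple[List[int], List[List[float]]]:
    """Return (cluster index of each point, list of cluster centres)."""
    rng = random.Random(seed)
    # Assumption: the initial centres are k distinct samples chosen at random.
    centres = [list(p) for p in rng.sample(points, k)]
    labels: List[int] = []
    for _ in range(max_iterations):
        # Assignment step: each sample goes to its nearest centre.
        labels = [min(range(k), key=lambda c: squared_distance(p, centres[c]))
                  for p in points]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centres.append([sum(dim) / len(members) for dim in zip(*members)])
            else:
                new_centres.append(centres[c])  # keep the centre of an empty cluster
        if new_centres == centres:
            break  # converged: assignments can no longer change
        centres = new_centres
    return labels, centres


# Hypothetical feature vectors forming two well-separated groups.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centres = k_means(points, k=2)
```

With real data, `k` would come from `choose_k(len(points))`; it is fixed at 2 here only because the toy data set is far smaller than 100 users.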
Step 205, extract at least one group of training sample users from each sample user cluster, and obtain same-person annotation information for the training sample users.
Specifically, any group of training sample users comprises the sample user corresponding to the cluster centre of any sample user cluster and any other sample user in the same sample user cluster, and the same-person annotation information comprises a same-person label or a non-same-person label.
In the above embodiment, the aim when extracting training sample users is to reduce the number of training samples. Because clustering groups samples according to their distance from the cluster centre, the cluster centre sample of each cluster and the other samples in that cluster are more likely to be the same person. The cluster centre sample of each cluster can therefore be paired with each of the other samples to form training samples, and each group of training sample users can then be annotated with same-person information, for example manually; specifically, each group is labelled according to whether its two users are the same person.
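The pairing described above can be sketched as follows. One point is an assumption rather than something the text states: a K-means centre is a mean and need not coincide with any sample, so the sketch takes "the sample user corresponding to the cluster centre" to be the cluster member nearest to the centre. The function name `build_training_pairs` and the example data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def build_training_pairs(points: Sequence[Sequence[float]],
                         labels: Sequence[int],
                         centres: Sequence[Sequence[float]]) -> List[Tuple[int, int]]:
    """Pair each cluster's central sample with every other sample in that cluster.

    Returns (central_index, other_index) tuples that index into `points`.
    """
    members: Dict[int, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        members[label].append(index)

    pairs: List[Tuple[int, int]] = []
    for label, indices in sorted(members.items()):
        centre = centres[label]
        # Assumption: the central sample is the member closest to the centre.
        central = min(indices, key=lambda i: sum((x - y) ** 2
                                                 for x, y in zip(points[i], centre)))
        pairs.extend((central, other) for other in indices if other != central)
    return pairs


# Hypothetical clustering result: three users in cluster 0, two in cluster 1.
points = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.1], [10.0, 10.0], [10.0, 12.0]]
labels = [0, 0, 0, 1, 1]
centres = [[1.0, 0.03], [10.0, 10.5]]

pairs = build_training_pairs(points, labels, centres)
print(pairs)  # [(2, 0), (2, 1), (3, 4)]
```

A cluster of n users yields n - 1 pairs to annotate instead of n(n - 1)/2, which is where the reduction in training samples comes from.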
Step 206, train the same-person identification model using the training sample users and the corresponding same-person annotation information.
For the way the same-person identification model is trained, refer to the description of step 103 above; it is not repeated here.
Step 207, compile the feature information of the users to be identified from their basic data according to the preset feature categories.
Step 208, cluster the users to be identified based on their feature information, to obtain at least one to-be-identified user cluster and the cluster centre corresponding to each such cluster.
Step 209, obtain the central user and the comparison users of any to-be-identified user cluster, wherein the central user is the user to be identified that corresponds to the cluster centre of that cluster, and the comparison users are all users to be identified in that cluster other than the one at the cluster centre.
Step 210, input the feature information of the central user of any to-be-identified user cluster and the feature information of any comparison user into the trained same-person identification model, to obtain a result indicating whether the central user and that comparison user are the same user.
In steps 207 to 210 above, to improve identification efficiency when performing same-person identification, the users to be identified may first be clustered into to-be-identified user clusters, and the same-person identification model trained in step 206 then used to determine, for each to-be-identified user cluster, whether the cluster centre user and each other user are the same user. The clustering method may be the same as that used for the sample users. Because clustering groups samples according to their distance from the cluster centre, the cluster centre sample of each cluster and the other samples are more likely to be the same person; during identification, therefore, the feature information of the central user corresponding to the cluster centre and of each other user in the same cluster is input into the same-person identification model.
Step 211, if the central user and any comparison user are the same user, establish a same-person set corresponding to the central user from that comparison user.
Once it has been determined whether each comparison user in a to-be-identified user cluster is the same user as the central user, the comparison users that are the same user as the central user are gathered into a same-person set; the users in that set and the central user are then the same person.
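Steps 209 to 211 can be sketched as a loop over clusters. In the sketch below, `is_same_person` stands in for the trained same-person identification model; it is a hypothetical stub driven by a lookup table, and the user identifiers and the name `build_same_person_sets` are likewise assumptions made for illustration.

```python
from typing import Callable, Dict, List, Set


def build_same_person_sets(clusters: Dict[str, List[str]],
                           is_same_person: Callable[[str, str], bool]
                           ) -> Dict[str, Set[str]]:
    """Map each central user to the comparison users judged to be the same person.

    `clusters` maps a central user id to the comparison user ids of its cluster.
    """
    return {central: {user for user in comparison if is_same_person(central, user)}
            for central, comparison in clusters.items()}


# Hypothetical stand-in for the trained model: a table of true identities.
owner = {"u1": "p1", "u2": "p1", "u3": "p2", "u4": "p3", "u5": "p3"}


def stub_model(a: str, b: str) -> bool:
    return owner[a] == owner[b]


clusters = {"u1": ["u2", "u3"], "u4": ["u5"]}
same_person_sets = build_same_person_sets(clusters, stub_model)
print(same_person_sets)  # {'u1': {'u2'}, 'u4': {'u5'}}
```

In practice the predicate would look up both users' feature information and call the trained model, as in step 210.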
Applying the technical solution of this embodiment has three effects. First, user feature information is derived by fusing multiple data sources, such as users' communication relationships, operator service data, and e-commerce operation data; compared with the prior art, which judges whether users are the same person by simple rule matching, this gives a more accurate same-person judgment. Second, cluster analysis of the users serves as a pre-processing step for the same-person identification model, reducing the data volume and optimising computational performance. Third, the same-person identification model serves as a post-processing step after cluster analysis, improving the accuracy of same-person determination.
Further, as a specific implementation of the method of Fig. 1, an embodiment of the application provides a same-person identification device. As shown in Fig. 3, the device comprises: a sample clustering module 31, a training sample acquisition module 32, a training module 33, and an identification module 34.
The sample clustering module 31 is configured to cluster the sample users based on the feature information of the sample users, to obtain at least one sample user cluster;
The training sample acquisition module 32 is configured to extract at least one group of training sample users from each sample user cluster, and to obtain same-person annotation information for the training sample users;
The training module 33 is configured to train the same-person identification model using the training sample users and the corresponding same-person annotation information;
The identification module 34 is configured to perform same-person identification on the users to be identified according to the trained same-person identification model.
In a specific application scenario, the device further comprises: a basic data acquisition module 35 and a feature information statistics module 36.
The basic data acquisition module 35 is configured to obtain the basic data of the sample users before the sample users are clustered based on their feature information to obtain at least one sample user cluster;
The feature information statistics module 36 is configured to compile, based on the basic data of the sample users, the feature information of the sample users according to preset feature categories.
In a specific application scenario, the sample clustering module 31 comprises: a cluster centre generation unit 311 and a clustering unit 312.
The cluster centre generation unit 311 is configured to determine the number of clusters according to the number of sample users, and to generate that number of initial cluster centres;
The clustering unit 312 is configured to perform K-means clustering on the feature information of the sample users according to the initial cluster centres, to obtain that number of sample user clusters and the cluster centre corresponding to each sample user cluster.
In a specific application scenario, any group of training sample users comprises the sample user corresponding to the cluster centre of any sample user cluster and any other sample user in the same sample user cluster, and the same-person annotation information comprises a same-person label or a non-same-person label.
In a specific application scenario, the identification module 34 specifically includes a feature information statistics unit 341, a clustering unit 342, an identification user acquisition unit 343, and a recognition unit 344.
The feature information statistics unit 341 is configured to compute the feature information of the users to be identified according to the preset feature categories, based on the basic data of the users to be identified;
The clustering unit 342 is configured to cluster the users to be identified based on their feature information, obtaining at least one to-be-identified user cluster and the cluster center corresponding to each to-be-identified user cluster;
The identification user acquisition unit 343 is configured to obtain the central user and the comparison users of any to-be-identified user cluster, where the central user is the user to be identified that corresponds to the cluster center of the cluster, and the comparison users are all users to be identified in the cluster other than the one at the cluster center;
The recognition unit 344 is configured to input the feature information of the central user of any to-be-identified user cluster and the feature information of any comparison user into the trained same-person identification model, obtaining a result indicating whether the central user and that comparison user are the same user.
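The recognition step can be illustrated with the sketch below. The trained model is represented by a placeholder callable, and the pair encoding (absolute per-feature differences) is an assumption; the source does not say how the two users' feature information is combined before being fed to the model.

```python
import math


def pair_vector(a, b):
    """Assumed joint encoding of two users: absolute per-feature differences."""
    return [abs(x - y) for x, y in zip(a, b)]


def identify_cluster(center_id, comparison_ids, features, model):
    """Ask the model, for each comparison user, whether it matches the central user."""
    return {
        other: bool(model(pair_vector(features[center_id], features[other])))
        for other in comparison_ids
    }


def stand_in_model(vec, threshold=1.5):
    # Placeholder for the trained same-person identification model.
    return math.hypot(*vec) < threshold


# Made-up feature vectors for one to-be-identified user cluster.
features = {"c": [1.0, 1.0], "x": [1.0, 2.0], "y": [5.0, 5.0]}
results = identify_cluster("c", ["x", "y"], features, stand_in_model)
```

Each cluster needs only one model call per comparison user, since every comparison is made against the central user rather than against every other member.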
In a specific application scenario, the device further includes a result output module 37.
The result output module 37 is configured to, if the central user and any comparison user are the same user, establish a same-person set corresponding to the central user according to that comparison user.
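A minimal sketch of the result output step follows. It assumes the per-cluster verdicts are available as a mapping from each central user to `{comparison user: is_same_user}`, and it includes the central user in its own set; both choices are illustrative rather than taken from the source.

```python
def build_same_person_sets(cluster_results):
    """Map each central user to the set of users judged to be the same person."""
    sets = {}
    for center_id, verdicts in cluster_results.items():
        matched = {other for other, same in verdicts.items() if same}
        if matched:
            # Assumption: the central user is a member of its own set.
            sets[center_id] = {center_id} | matched
    return sets


# Made-up verdicts for two to-be-identified user clusters.
cluster_results = {
    "c": {"x": True, "y": False},
    "d": {"z": False},
}
same_person_sets = build_same_person_sets(cluster_results)
```

Central users with no matching comparison user produce no set in this sketch.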
Specifically, the basic feature data includes, but is not limited to, at least one of the communication data, carrier service data, and e-commerce operation data of the sample users, or a combination thereof.
It should be noted that, for further description of the functional units of the same-person identification device provided by the embodiments of the present application, reference may be made to the corresponding descriptions of Fig. 1 and Fig. 2; details are not repeated here.
Based on the methods shown in Fig. 1 and Fig. 2 above, the embodiments of the present application correspondingly also provide a storage medium on which a computer program is stored; when executed by a processor, the program implements the same-person identification method shown in Fig. 1 and Fig. 2.
Based on this understanding, the technical solution of the present application may be embodied in the form of a software product. The software product may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) and includes several instructions that cause a computer device (such as a personal computer, a server, or a network device) to execute the methods described in the implementation scenarios of the present application.
Based on the methods shown in Fig. 1 and Fig. 2 and the virtual device embodiments shown in Fig. 3 and Fig. 4, and in order to achieve the above purpose, the embodiments of the present application further provide a computer device, which may specifically be a personal computer, a server, a network device, or the like. The computer device includes a storage medium and a processor: the storage medium is configured to store a computer program, and the processor is configured to execute the computer program to implement the same-person identification method shown in Fig. 1 and Fig. 2.
Optionally, the computer device may further include a user interface, a network interface, a camera, a radio frequency (RF) circuit, sensors, an audio circuit, a Wi-Fi module, and the like. The user interface may include a display and an input unit such as a keyboard; optionally, it may also include a USB interface, a card reader interface, and the like. The network interface may optionally include a standard wired interface and a wireless interface (such as a Bluetooth interface or a Wi-Fi interface).
Those skilled in the art will understand that the computer device structure provided in this embodiment does not limit the computer device, which may include more or fewer components, combine certain components, or use a different arrangement of components.
The storage medium may further include an operating system and a network communication module. The operating system is a program that manages and maintains the hardware and software resources of the computer device and supports the running of the information processing program and other software and/or programs. The network communication module implements communication among the components inside the storage medium, as well as communication with other hardware and software in the physical device.
From the above description of the embodiments, those skilled in the art can clearly understand that the present application may be implemented by software plus a necessary general-purpose hardware platform, or by hardware. The sample users are first clustered using their feature information, so that sample users who are more likely to be the same person fall into the same cluster. Training sample users are then selected within each cluster and annotated with same-person information, and the same-person identification model is trained using the feature information and same-person annotations of the training sample users. Finally, the trained same-person identification model is used to perform same-person identification on the users to be identified. By clustering the sample users, the present application reduces the amount of training required for the same-person identification model, which optimizes training and improves training efficiency.
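To give a feel for the scale of this reduction, the following back-of-the-envelope sketch compares two pair counts. It assumes, purely for illustration, that the alternative to clustering would be to annotate every pair of sample users, and that within each cluster every non-center member is paired with the center user; the source states neither figure, and the user and cluster counts are invented.

```python
def all_pairs(n_users):
    """Number of unordered user pairs without any clustering."""
    return n_users * (n_users - 1) // 2


def center_pairs(cluster_sizes):
    """Number of (center user, other member) pairs across all clusters."""
    return sum(size - 1 for size in cluster_sizes)


# Invented example: 1000 sample users split into 100 clusters of 10.
n_users = 1000
cluster_sizes = [10] * 100

baseline = all_pairs(n_users)
reduced = center_pairs(cluster_sizes)
ratio = baseline / reduced
```

Under these assumptions the number of candidate training pairs grows linearly with the number of sample users instead of quadratically.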
Those skilled in the art will understand that the accompanying drawings are only schematic diagrams of a preferred implementation scenario, and that the modules or processes in the drawings are not necessarily required to implement the present application. Those skilled in the art will also understand that the modules of a device in an implementation scenario may be distributed in the device as described for that scenario, or may be relocated, with corresponding changes, to one or more devices different from that scenario. The modules of the above implementation scenarios may be combined into one module or further split into multiple sub-modules.
The serial numbers used in the present application are for description only and do not indicate the relative merits of the implementation scenarios. The above discloses only several specific implementation scenarios of the present application; however, the present application is not limited to them, and any variation that a person skilled in the art can conceive of shall fall within the protection scope of the present application.
Claims (10)
1. A same-person identification method, characterized by comprising:
clustering sample users based on feature information of the sample users to obtain at least one sample user cluster;
extracting at least one group of training sample users from each sample user cluster, and obtaining same-person annotation information of the training sample users;
training a same-person identification model using the training sample users and the corresponding same-person annotation information;
performing same-person identification on users to be identified according to the trained same-person identification model.
2. The method according to claim 1, characterized in that, before the clustering of the sample users based on the feature information of the sample users to obtain at least one sample user cluster, the method further comprises:
obtaining basic data of the sample users;
computing the feature information of the sample users according to preset feature categories, based on the basic data of the sample users.
3. The method according to claim 2, characterized in that the clustering of the sample users based on the feature information of the sample users to obtain at least one sample user cluster specifically comprises:
determining a number of clusters according to the number of the sample users, and generating that number of initial cluster centers;
performing K-means clustering on the feature information of the sample users according to the initial cluster centers, to obtain the corresponding number of sample user clusters and a cluster center corresponding to each sample user cluster.
4. The method according to claim 3, characterized in that any group of the training sample users comprises the sample user corresponding to the cluster center of any sample user cluster and any other sample user in the same sample user cluster, and the same-person annotation information comprises a same-person label or a non-same-person label.
5. The method according to claim 4, characterized in that the performing of same-person identification on the users to be identified according to the trained same-person identification model specifically comprises:
computing feature information of the users to be identified according to the preset feature categories, based on basic data of the users to be identified;
clustering the users to be identified based on the feature information of the users to be identified, to obtain at least one to-be-identified user cluster and a cluster center corresponding to the to-be-identified user cluster;
obtaining a central user and comparison users corresponding to any to-be-identified user cluster, wherein the central user is the user to be identified that corresponds to the cluster center of the to-be-identified user cluster, and the comparison users are all users to be identified in the to-be-identified user cluster other than the one at the cluster center;
inputting the feature information corresponding to the central user of any to-be-identified user cluster and the feature information corresponding to any comparison user into the trained same-person identification model, to obtain a result indicating whether the central user and that comparison user are the same user.
6. The method according to claim 5, characterized in that the method further comprises:
if the central user and any comparison user are the same user, establishing a same-person set corresponding to the central user according to that comparison user.
7. The method according to any one of claims 2 to 6, characterized in that the basic feature data includes, but is not limited to, at least one of communication data, carrier service data, and e-commerce operation data of the sample users, or a combination thereof.
8. A same-person identification device, characterized by comprising:
a sample clustering module, configured to cluster sample users based on feature information of the sample users to obtain at least one sample user cluster;
a training sample acquisition module, configured to extract at least one group of training sample users from each sample user cluster, and to obtain same-person annotation information of the training sample users;
a training module, configured to train a same-person identification model using the training sample users and the corresponding same-person annotation information;
an identification module, configured to perform same-person identification on users to be identified according to the trained same-person identification model.
9. A storage medium on which a computer program is stored, characterized in that, when executed by a processor, the program implements the same-person identification method according to any one of claims 1 to 7.
10. A computer device, comprising a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, characterized in that the processor, when executing the program, implements the same-person identification method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910740557.1A CN110533085B (en) | 2019-08-12 | 2019-08-12 | Same-person identification method and device, storage medium and computer equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910740557.1A CN110533085B (en) | 2019-08-12 | 2019-08-12 | Same-person identification method and device, storage medium and computer equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110533085A (en) | 2019-12-03 |
CN110533085B (en) | 2022-04-01 |
Family
ID=68663021
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910740557.1A Active CN110533085B (en) | 2019-08-12 | 2019-08-12 | Same-person identification method and device, storage medium and computer equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110533085B (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111159243A (en) * | 2019-12-30 | 2020-05-15 | ***通信集团江苏有限公司 | User type identification method, device, equipment and storage medium |
CN111598360A (en) * | 2020-07-24 | 2020-08-28 | 北京淇瑀信息科技有限公司 | Service policy determination method and device and electronic equipment |
CN111625817A (en) * | 2020-05-12 | 2020-09-04 | 咪咕文化科技有限公司 | Abnormal user identification method and device, electronic equipment and storage medium |
CN112085114A (en) * | 2020-09-14 | 2020-12-15 | 杭州中奥科技有限公司 | Online and offline identity matching method, device, equipment and storage medium |
CN112148981A (en) * | 2020-09-29 | 2020-12-29 | 广州小鹏自动驾驶科技有限公司 | Method, device, equipment and storage medium for identifying same |
CN112819106A (en) * | 2021-04-16 | 2021-05-18 | 江西博微新技术有限公司 | IFC component type identification method, device, storage medium and equipment |
CN113139005A (en) * | 2021-04-22 | 2021-07-20 | 康键信息技术(深圳)有限公司 | Same-person identification method based on same-person identification model and related equipment |
CN113361603A (en) * | 2021-06-04 | 2021-09-07 | 北京百度网讯科技有限公司 | Training method, class recognition device, electronic device and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130142423A1 (en) * | 2010-06-01 | 2013-06-06 | Tong Zhang | Image clustering using a personal clothing model |
CN106355170A (en) * | 2016-11-22 | 2017-01-25 | Tcl集团股份有限公司 | Photo classifying method and device |
CN107358945A (en) * | 2017-07-26 | 2017-11-17 | 谢兵 | A kind of more people's conversation audio recognition methods and system based on machine learning |
US20180137395A1 (en) * | 2016-11-17 | 2018-05-17 | Samsung Electronics Co., Ltd. | Recognition and training method and apparatus |
CN108229321A (en) * | 2017-11-30 | 2018-06-29 | 北京市商汤科技开发有限公司 | Human face recognition model and its training method and device, equipment, program and medium |
CN109816043A (en) * | 2019-02-02 | 2019-05-28 | 拉扎斯网络科技(上海)有限公司 | Determination method, apparatus, electronic equipment and the storage medium of user's identification model |
-
2019
- 2019-08-12 CN CN201910740557.1A patent/CN110533085B/en active Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130142423A1 (en) * | 2010-06-01 | 2013-06-06 | Tong Zhang | Image clustering using a personal clothing model |
US20180137395A1 (en) * | 2016-11-17 | 2018-05-17 | Samsung Electronics Co., Ltd. | Recognition and training method and apparatus |
CN106355170A (en) * | 2016-11-22 | 2017-01-25 | Tcl集团股份有限公司 | Photo classifying method and device |
CN107358945A (en) * | 2017-07-26 | 2017-11-17 | 谢兵 | A kind of more people's conversation audio recognition methods and system based on machine learning |
CN108229321A (en) * | 2017-11-30 | 2018-06-29 | 北京市商汤科技开发有限公司 | Human face recognition model and its training method and device, equipment, program and medium |
CN109816043A (en) * | 2019-02-02 | 2019-05-28 | 拉扎斯网络科技(上海)有限公司 | Determination method, apparatus, electronic equipment and the storage medium of user's identification model |
Non-Patent Citations (2)
Title |
---|
JIAN LU 等: ""Centralized and Clustered Features for Person Re-Identification"", 《 IEEE SIGNAL PROCESSING LETTERS》 * |
胡易: ""视频中的人脸聚类***的设计与实现"", 《中国优秀博硕士学位论文全文数据库(硕士)-信息科技辑》 * |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111159243A (en) * | 2019-12-30 | 2020-05-15 | ***通信集团江苏有限公司 | User type identification method, device, equipment and storage medium |
CN111159243B (en) * | 2019-12-30 | 2023-08-04 | ***通信集团江苏有限公司 | User type identification method, device, equipment and storage medium |
CN111625817A (en) * | 2020-05-12 | 2020-09-04 | 咪咕文化科技有限公司 | Abnormal user identification method and device, electronic equipment and storage medium |
CN111625817B (en) * | 2020-05-12 | 2023-05-02 | 咪咕文化科技有限公司 | Abnormal user identification method, device, electronic equipment and storage medium |
CN111598360A (en) * | 2020-07-24 | 2020-08-28 | 北京淇瑀信息科技有限公司 | Service policy determination method and device and electronic equipment |
CN112085114A (en) * | 2020-09-14 | 2020-12-15 | 杭州中奥科技有限公司 | Online and offline identity matching method, device, equipment and storage medium |
CN112148981A (en) * | 2020-09-29 | 2020-12-29 | 广州小鹏自动驾驶科技有限公司 | Method, device, equipment and storage medium for identifying same |
CN112819106A (en) * | 2021-04-16 | 2021-05-18 | 江西博微新技术有限公司 | IFC component type identification method, device, storage medium and equipment |
CN112819106B (en) * | 2021-04-16 | 2021-07-13 | 江西博微新技术有限公司 | IFC component type identification method, device, storage medium and equipment |
CN113139005A (en) * | 2021-04-22 | 2021-07-20 | 康键信息技术(深圳)有限公司 | Same-person identification method based on same-person identification model and related equipment |
CN113361603A (en) * | 2021-06-04 | 2021-09-07 | 北京百度网讯科技有限公司 | Training method, class recognition device, electronic device and storage medium |
CN113361603B (en) * | 2021-06-04 | 2024-05-10 | 北京百度网讯科技有限公司 | Training method, category identification device, electronic device, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110533085B (en) | 2022-04-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110533085A (en) | Same-person identification method and device, storage medium and computer equipment | |
CN110401779A (en) | A kind of method, apparatus and computer readable storage medium identifying telephone number | |
CN109634996A (en) | Customer information table generating method, device, equipment and computer readable storage medium | |
CN109858960A (en) | Commodity method for pushing, device, subscriber information management server and storage medium | |
CN107423613A (en) | The method, apparatus and server of device-fingerprint are determined according to similarity | |
CN109600336A (en) | Store equipment, identifying code application method and device | |
CN110445939B (en) | Capacity resource prediction method and device | |
CN113412607B (en) | Content pushing method and device, mobile terminal and storage medium | |
CN111815169A (en) | Business approval parameter configuration method and device | |
CN111931809A (en) | Data processing method and device, storage medium and electronic equipment | |
CN113206909A (en) | Crank call interception method and device | |
CN103390021A (en) | Method and apparatus for extracting social relations from calling time data | |
CN113609409A (en) | Method and system for recommending browsing information, electronic device and storage medium | |
CN111401478B (en) | Data anomaly identification method and device | |
CN113221005A (en) | Customer service pushing method, server and related products | |
CN113011966A (en) | Credit scoring method and device based on deep learning | |
CN110046233A (en) | Problem distributing method and device | |
CN110232148A (en) | Item recommendation system, method and device | |
CN113011503B (en) | Data evidence obtaining method of electronic equipment, storage medium and terminal | |
CN111163237B (en) | Call service flow control method and related device | |
CN115099934A (en) | High-latency customer identification method, electronic equipment and storage medium | |
CN114529143A (en) | Method and device for recommending outbound clues and electronic equipment | |
CN110245775B (en) | User collection and payment data analysis method and device and computer equipment | |
CN110163761B (en) | Suspicious item member identification method and device based on image processing | |
CN113890948A (en) | Resource allocation method based on voice outbound robot dialogue data and related equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||