CN110533085A

CN110533085A - Same-person identification method and apparatus, storage medium, and computer device

Info

Publication number: CN110533085A
Application number: CN201910740557.1A
Authority: CN
Inventors: 刘逸哲
Original assignee: Dazhu (hangzhou) Technology Co Ltd
Current assignee: Dazhu (hangzhou) Technology Co Ltd
Priority date: 2019-08-12
Filing date: 2019-08-12
Publication date: 2019-12-03
Anticipated expiration: 2039-08-12
Also published as: CN110533085B

Abstract

This application discloses a same-person identification method and apparatus, a storage medium, and a computer device. The method comprises: clustering sample users based on their feature information to obtain at least one sample user cluster; extracting at least one group of training sample users from each sample user cluster and obtaining same-person annotation information for the training sample users; training a same-person identification model using the training sample users and the corresponding same-person annotation information; and performing same-person identification on users to be identified according to the trained model. By clustering the sample users, the application reduces the amount of training required for the same-person identification model, which optimizes training and improves training efficiency.

Description

Same-person identification method and apparatus, storage medium, and computer device

Technical field

This application relates to the technical field of data analysis, and in particular to a same-person identification method and apparatus, a storage medium, and a computer device.

Background

The internet is currently flourishing and has given rise to a number of e-commerce and online financial services companies. Because e-commerce companies offer various new-user subsidies and financial services companies lend directly to users, many users change mobile phone numbers and re-register in order to obtain benefits. Determining whether website registrants or service recipients are the same person has therefore become key to reducing operating costs and risk for e-commerce and internet financial services companies.

In the field of same-person identification, the construction of training samples is crucial to training a same-person identification model. How to quickly determine, among a large number of sample users, which two users are the same person, and thereby build a training sample set, is an important problem in this field.

Summary of the invention

In view of this, the application provides a same-person identification method and apparatus, a storage medium, and a computer device. By clustering the sample users, it reduces the amount of training required for the same-person identification model, which optimizes training and improves training efficiency.

According to one aspect of the application, a same-person identification method is provided, comprising:

clustering sample users based on feature information of the sample users to obtain at least one sample user cluster;

extracting at least one group of training sample users from each sample user cluster, and obtaining same-person annotation information of the training sample users;

training a same-person identification model using the training sample users and the corresponding same-person annotation information;

performing same-person identification on users to be identified according to the trained same-person identification model.

Specifically, before clustering the sample users based on the feature information of the sample users to obtain at least one sample user cluster, the method further comprises:

obtaining basic data of the sample users;

computing the feature information of the sample users from the basic data of the sample users according to preset feature categories.

Specifically, clustering the sample users based on the feature information of the sample users to obtain at least one sample user cluster comprises:

determining a number of clusters according to the number of sample users, and generating that number of initial cluster centers;

performing K-means clustering on the feature information of the sample users according to the initial cluster centers, to obtain the corresponding number of sample user clusters and a cluster center for each sample user cluster.

Specifically, any group of training sample users comprises the sample user corresponding to the cluster center of a sample user cluster and any other sample user in the same sample user cluster, and the same-person annotation information comprises a same-person label or a non-same-person label.

Specifically, performing same-person identification on the users to be identified according to the trained same-person identification model comprises:

computing the feature information of the users to be identified from their basic data according to the preset feature categories;

clustering the users to be identified based on their feature information, to obtain at least one cluster of users to be identified and a cluster center for each such cluster;

obtaining the central user and the comparison users of any cluster of users to be identified, wherein the central user is the user to be identified that corresponds to the cluster center of the cluster, and the comparison users are all users to be identified in the cluster other than the one at the cluster center;

inputting the feature information of the central user of any cluster of users to be identified, together with the feature information of any comparison user, into the trained same-person identification model, to obtain a result indicating whether the central user and that comparison user are the same user.

Specifically, the method further comprises:

if the central user and any comparison user are the same user, establishing a same-person set for the central user from that comparison user.

Specifically, the basic data includes, but is not limited to, at least one of, or a combination of, the communication data, carrier service data, and e-commerce operation data of the sample users.

According to another aspect of the application, a same-person identification apparatus is provided, comprising:

a sample clustering module, configured to cluster sample users based on feature information of the sample users to obtain at least one sample user cluster;

a training sample acquisition module, configured to extract at least one group of training sample users from each sample user cluster and obtain same-person annotation information of the training sample users;

a training module, configured to train a same-person identification model using the training sample users and the corresponding same-person annotation information;

an identification module, configured to perform same-person identification on users to be identified according to the trained same-person identification model.

Specifically, the apparatus further comprises:

a basic data acquisition module, configured to obtain the basic data of the sample users before the sample users are clustered based on their feature information to obtain at least one sample user cluster;

a feature information statistics module, configured to compute the feature information of the sample users from their basic data according to preset feature categories.

Specifically, the sample clustering module comprises:

a cluster center generation unit, configured to determine a number of clusters according to the number of sample users and generate that number of initial cluster centers;

a clustering unit, configured to perform K-means clustering on the feature information of the sample users according to the initial cluster centers, to obtain the corresponding number of sample user clusters and a cluster center for each sample user cluster.

Specifically, the identification module comprises:

a feature information statistics unit, configured to compute the feature information of the users to be identified from their basic data according to the preset feature categories;

a clustering unit, configured to cluster the users to be identified based on their feature information, to obtain at least one cluster of users to be identified and a cluster center for each such cluster;

an identification user acquisition unit, configured to obtain the central user and the comparison users of any cluster of users to be identified, wherein the central user is the user to be identified that corresponds to the cluster center of the cluster, and the comparison users are all users to be identified in the cluster other than the one at the cluster center;

an identification unit, configured to input the feature information of the central user of any cluster of users to be identified, together with the feature information of any comparison user, into the trained same-person identification model, to obtain a result indicating whether the central user and that comparison user are the same user.

Specifically, the apparatus further comprises:

a result output module, configured to establish, if the central user and any comparison user are the same user, a same-person set for the central user from that comparison user.

According to a further aspect of the application, a storage medium is provided on which a computer program is stored, the program implementing the above same-person identification method when executed by a processor.

According to a further aspect of the application, a computer device is provided, comprising a storage medium, a processor, and a computer program stored on the storage medium and runnable on the processor, the processor implementing the above same-person identification method when executing the program.

With the above technical solution, the same-person identification method and apparatus, storage medium, and computer device provided by the application first cluster the sample users using their feature information, so that sample users with a higher likelihood of being the same person fall into the same cluster. Training sample users are then selected within each cluster and annotated with same-person information, the same-person identification model is trained on the feature information of the training sample users and the same-person annotations, and finally the model is used to perform same-person identification on the users to be identified. By clustering the sample users, the application reduces the amount of training required for the same-person identification model, which optimizes training and improves training efficiency.

The above is only an overview of the technical solution of the application. Specific embodiments of the application are set out below so that its technical means can be understood more clearly and implemented according to the contents of the specification, and so that the above and other objects, features, and advantages of the application are easier to understand.

Brief description of the drawings

The drawings described here provide a further understanding of the application and form part of it. The illustrative embodiments of the application and their descriptions serve to explain the application and do not unduly limit it. In the drawings:

Fig. 1 shows a flow diagram of a same-person identification method provided by an embodiment of the application;

Fig. 2 shows a flow diagram of another same-person identification method provided by an embodiment of the application;

Fig. 3 shows a structural diagram of a same-person identification apparatus provided by an embodiment of the application;

Fig. 4 shows a structural diagram of another same-person identification apparatus provided by an embodiment of the application.

Detailed description

The application is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, where no conflict arises, the embodiments of the application and the features in them may be combined with one another.

This embodiment provides a same-person identification method. As shown in Fig. 1, the method comprises:

Step 101: cluster the sample users based on their feature information to obtain at least one sample user cluster.

In the embodiments of the application, a same-person identification model must be trained before same-person identification can be performed. Typically, the samples needed for training, that is, samples belonging to the same user, have to be picked out of the sample data by manual screening. Picking samples that belong to the same user out of a large volume of sample data is very laborious. The embodiment therefore clusters the sample users first, placing sample users who may share an identity into one cluster, which greatly reduces the annotation workload.

It should be noted that the feature information used for clustering may include the sample users' communication data, carrier service data, e-commerce operation data, and so on. Communication data may include incoming and outgoing call numbers. Carrier service data may include short messages sent by various operating institutions, such as credit card transaction notices sent by a bank. E-commerce operation data may include a sample user's purchases, favorites, and traffic data on an e-commerce platform.

Step 102: extract at least one group of training sample users from each sample user cluster, and obtain the same-person annotation information of the training sample users.

After the sample users have been divided into one or more sample user clusters, that is, after sample users with a higher likelihood of being the same person have been placed in one cluster, the sample users in each cluster are divided into multiple groups of training sample users, each group containing two sample users. Once each sample user cluster has been divided into training sample user groups, whether each group is the same user can be judged and annotated manually or by other means.

Step 103: train the same-person identification model using the training sample users and the corresponding same-person annotation information.

After the training sample users have been annotated, the same-person identification model can be trained on the feature information of the training samples and the corresponding same-person annotation information. The same-person identification model is a binary classification model. The embodiments of the application do not limit the model or the training method. For example, a basic logistic regression model may be used: a loss function is set and the model parameters are trained by stochastic gradient descent to obtain the final same-person identification model. The trained model is then used to perform same-person identification on the users to be identified.
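The logistic regression option mentioned above can be sketched as follows. This is a minimal illustration, not the application's implementation: the text does not say how two users' feature vectors are combined into one model input, so `pair_vector` (element-wise absolute difference) is an assumption, as are the function names, the toy data, and the hyperparameters.

```python
import math
import random

def pair_vector(a, b):
    """Assumed pair encoding: element-wise absolute difference of two feature vectors."""
    return [abs(x - y) for x, y in zip(a, b)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_sgd(samples, labels, epochs=200, lr=0.5, seed=0):
    """Fit weights and bias by stochastic gradient descent on the log loss."""
    rng = random.Random(seed)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a fresh random order each epoch
        for i in order:
            z = bias + sum(w * x for w, x in zip(weights, samples[i]))
            error = sigmoid(z) - labels[i]  # gradient of the log loss w.r.t. z
            weights = [w - lr * error * x for w, x in zip(weights, samples[i])]
            bias -= lr * error
    return weights, bias

def predict_same_person(weights, bias, a, b, threshold=0.5):
    """Binary decision: True if the model scores the pair as the same person."""
    z = bias + sum(w * x for w, x in zip(weights, pair_vector(a, b)))
    return sigmoid(z) >= threshold

# Toy annotated pairs (label 1 = same person, 0 = not the same person).
annotated_pairs = [
    ([1, 1, 0, 1, 0], [1, 1, 0, 1, 0], 1),
    ([1, 1, 0, 1, 0], [1, 1, 0, 0, 0], 1),
    ([1, 1, 0, 1, 0], [0, 0, 1, 0, 1], 0),
    ([1, 0, 0, 0, 0], [0, 1, 1, 0, 1], 0),
]
samples = [pair_vector(a, b) for a, b, _ in annotated_pairs]
labels = [label for _, _, label in annotated_pairs]
weights, bias = train_logistic_sgd(samples, labels)
```

Any other binary classifier could take the place of `train_logistic_sgd`, since the embodiment leaves the model and training method open.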

Step 104: perform same-person identification on the users to be identified according to the trained same-person identification model.

When the same-person identification model is applied to the users to be identified, the users can first be divided into several clusters using the same clustering method as for the sample users. The feature information of two users in the same cluster is then input into the model for same-person identification. This avoids pairing every user to be identified with every other one, which would waste identification time, and so improves identification efficiency.
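The saving can be made concrete by counting model evaluations. The sketch below is an illustration only: the figures of 1,000 users and 10 equal clusters are assumed, and the function names are hypothetical. It compares pairing all users, pairing only users within the same cluster, and pairing only each cluster's central user with the other members of that cluster.

```python
def all_pairs_count(n):
    """Number of unordered pairs among n users."""
    return n * (n - 1) // 2

def within_cluster_pairs_count(cluster_sizes):
    """Pairs formed only between users that share a cluster."""
    return sum(all_pairs_count(size) for size in cluster_sizes)

def center_pairs_count(cluster_sizes):
    """Pairs formed between each cluster's central user and its other members."""
    return sum(size - 1 for size in cluster_sizes if size > 0)

cluster_sizes = [100] * 10  # assumed: 1,000 users in 10 clusters of 100
print(all_pairs_count(1000))                      # 499500
print(within_cluster_pairs_count(cluster_sizes))  # 49500
print(center_pairs_count(cluster_sizes))          # 990
```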

Alternatively, to improve identification accuracy, all users to be identified may be paired with one another and each pair input into the same-person identification model.

With the technical solution of this embodiment, the sample users are first clustered using their feature information, so that sample users with a higher likelihood of being the same person fall into the same cluster. Training sample users are then selected within each cluster and annotated with same-person information, the same-person identification model is trained on the feature information of the training sample users and the same-person annotations, and finally the model is used to perform same-person identification on the users to be identified. By clustering the sample users, the application reduces the amount of training required for the same-person identification model, which optimizes training and improves training efficiency.

Further, as a refinement and extension of the above embodiment, and to describe the implementation of this embodiment in full, another same-person identification method is provided. As shown in Fig. 2, the method comprises:

Step 201: obtain the basic data of the sample users.

The basic data includes, but is not limited to, at least one of, or a combination of, the communication data, carrier service data, and e-commerce operation data of the sample users.

Step 202: compute the feature information of the sample users from their basic data according to preset feature categories.

According to the preset feature categories, the basic data of a sample user is summarized into that user's feature information. For example, suppose the preset feature categories include incoming call numbers, the incoming call numbers in a sample user's basic data are A, B, and D, and the full phone number database contains the five numbers A, B, C, D, and E. The extracted incoming-call-number feature of that sample user is then (1, 1, 0, 1, 0). Other feature information is not illustrated here.
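The incoming-call example above amounts to a multi-hot encoding against a fixed vocabulary. A minimal sketch follows; the function and variable names are hypothetical, and only the numbers A to E and the resulting vector come from the text.

```python
def multi_hot(observed, vocabulary):
    """Return a 0/1 vector marking which vocabulary entries appear in `observed`."""
    observed = set(observed)
    return [1 if item in observed else 0 for item in vocabulary]

# Worked example from the text: the database holds A..E, the user was called by A, B, D.
phone_database = ["A", "B", "C", "D", "E"]
incoming_numbers = ["A", "B", "D"]
feature = multi_hot(incoming_numbers, phone_database)
print(feature)  # [1, 1, 0, 1, 0]
```

Features for other preset categories could be encoded the same way and concatenated into one vector per user, although the text does not specify how the categories are combined.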

Step 203: determine the number of clusters according to the number of sample users, and generate that number of initial cluster centers.

Step 204: perform K-means clustering on the feature information of the sample users according to the initial cluster centers, to obtain the corresponding number of sample user clusters and a cluster center for each sample user cluster.

In steps 203 and 204, once the feature information of the sample users has been extracted, it is used for cluster analysis to divide the sample users into sample user clusters. K-means clustering may be used for this purpose, although other clustering methods are also possible. The embodiments of the application describe the K-means approach. First, the number of clusters K is determined from the number of sample users, for example one cluster for every 100 sample users. Next, K initial cluster centers are generated at random, or in some other agreed way. Finally, starting from the K initial cluster centers, the distance between each sample and each cluster center is computed and each sample is assigned to its nearest cluster center, yielding K sample user clusters and K cluster centers.
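Steps 203 and 204 can be sketched in plain Python as below. The names and the default of 100 users per cluster (taken from the example in the text) are illustrative assumptions. The text describes the initialization and the nearest-center assignment; the repeated center update in the loop is the standard K-means (Lloyd's) iteration and is an assumption about how the final centers are reached.

```python
import math
import random

def choose_k(num_users, users_per_cluster=100):
    """One cluster per `users_per_cluster` users, with at least one cluster."""
    return max(1, math.ceil(num_users / users_per_cluster))

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(vectors, k, iterations=20, seed=0):
    """Plain K-means; returns (assignments, centers)."""
    rng = random.Random(seed)
    # Random initial centers: k distinct samples drawn from the data.
    centers = [list(v) for v in rng.sample(vectors, k)]
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each sample goes to its nearest center.
        assignments = [
            min(range(k), key=lambda c: squared_distance(v, centers[c]))
            for v in vectors
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centers

# Toy data: two well-separated groups of three users each.
vectors = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
assignments, centers = k_means(vectors, k=2)
```

In practice a library implementation with a better initialization scheme would usually be preferred; the loop is written out here only to mirror the steps in the text.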

Step 205: extract at least one group of training sample users from each sample user cluster, and obtain the same-person annotation information of the training sample users.

Specifically, any group of training sample users comprises the sample user corresponding to the cluster center of a sample user cluster and any other sample user in the same sample user cluster. The same-person annotation information comprises a same-person label or a non-same-person label.

In this embodiment, training sample users are extracted in a way that reduces the number of training samples. Clustering is based on the distance between the cluster center sample and the other samples, so the cluster center sample of each cluster is relatively likely to be the same person as the other samples in that cluster. The cluster center sample of each cluster can therefore be paired with each of the other samples in the cluster to form the training samples. Each group of training sample users is then annotated with same-person information, for example manually, marking whether the two users in the group are the same person.
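The pairing rule can be sketched as follows. One point the text leaves open is how a K-means center, which is a mean vector, maps to an actual sample user; this sketch assumes the member nearest to the center stands in for it. All names and the toy data are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def center_user(member_ids, features, center):
    """Assumed rule: the member closest to the cluster center represents it."""
    return min(member_ids, key=lambda uid: squared_distance(features[uid], center))

def training_pairs(clusters, features, centers):
    """Pair each cluster's center user with every other member of that cluster."""
    pairs = []
    for cluster_id, member_ids in clusters.items():
        hub = center_user(member_ids, features, centers[cluster_id])
        pairs.extend((hub, uid) for uid in member_ids if uid != hub)
    return pairs

features = {
    "u1": [0, 0], "u2": [0, 1], "u3": [1, 0],
    "u4": [10, 10], "u5": [10, 12],
}
clusters = {0: ["u1", "u2", "u3"], 1: ["u4", "u5"]}
centers = {0: [1 / 3, 1 / 3], 1: [10, 10.5]}
pairs = training_pairs(clusters, features, centers)
print(pairs)  # [('u1', 'u2'), ('u1', 'u3'), ('u4', 'u5')]
```

Each resulting pair would then be annotated as same person or not same person before training.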

Step 206: train the same-person identification model using the training sample users and the corresponding same-person annotation information.

For how the same-person identification model is trained, see the description of step 103 above, which is not repeated here.

Step 207: compute the feature information of the users to be identified from their basic data according to the preset feature categories.

Step 208: cluster the users to be identified based on their feature information, to obtain at least one cluster of users to be identified and a cluster center for each such cluster.

Step 209: obtain the central user and the comparison users of any cluster of users to be identified, where the central user is the user to be identified that corresponds to the cluster center of the cluster, and the comparison users are all users to be identified in the cluster other than the one at the cluster center.

Step 210: input the feature information of the central user of any cluster of users to be identified, together with the feature information of any comparison user, into the trained same-person identification model, to obtain a result indicating whether the central user and that comparison user are the same user.

In steps 207 to 210, to improve identification efficiency, the users to be identified can first be clustered into clusters of users to be identified. The same-person identification model trained in step 206 is then used to determine, for each cluster, whether the cluster center user and each of the other users are the same user. The clustering method may be the same as that used for the sample users. Because clustering is based on the distance between the cluster center sample and the other samples, the cluster center sample of each cluster is relatively likely to be the same person as the other samples in that cluster. During identification, therefore, the feature information of the central user corresponding to the cluster center and of each other user in the same cluster is input into the same-person identification model.

Step 211: if the central user and any comparison user are the same user, establish a same-person set for the central user from that comparison user.

Once it has been determined, for each comparison user in a cluster of users to be identified, whether that user and the central user are the same user, the comparison users found to be the same user as the central user are gathered into a same-person set. The users in that set and the central user then belong to the same person.
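Steps 210 and 211 together can be sketched as a filter over the comparison users of one cluster. The `is_same_person` argument stands for the trained model. The overlap-based stand-in used here is purely illustrative and is not the model described in the application; all names and data are hypothetical.

```python
def same_person_set(center_user, comparison_users, features, is_same_person):
    """Collect the comparison users judged to be the same person as the center user."""
    return {
        uid for uid in comparison_users
        if is_same_person(features[center_user], features[uid])
    }

def overlap_classifier(a, b, threshold=0.5):
    """Illustrative stand-in for the trained model: Jaccard overlap of 0/1 vectors."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return union > 0 and shared / union >= threshold

features = {
    "center": [1, 1, 0, 1, 0],
    "u1": [1, 1, 0, 1, 0],
    "u2": [0, 0, 1, 0, 1],
    "u3": [1, 1, 0, 0, 0],
}
result = same_person_set("center", ["u1", "u2", "u3"], features, overlap_classifier)
print(sorted(result))  # ['u1', 'u3']
```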

The technical solution of this embodiment has three benefits. First, user feature information is derived by fusing multiple data sources, such as users' communication contact relationships, carrier service data, and e-commerce operation data, which gives more accurate same-person judgments than the prior art, where simple rule matching decides whether users are the same person. Second, cluster analysis of the users serves as a pre-processing step for the same-person identification model, reducing the data volume and optimizing computational performance. Third, the same-person identification model serves as a post-processing step for the cluster analysis, improving the accuracy of same-person discrimination.

Further, as a concrete implementation of the method of Fig. 1, an embodiment of the application provides a same-person identification apparatus. As shown in Fig. 3, the apparatus comprises a sample clustering module 31, a training sample acquisition module 32, a training module 33, and an identification module 34.

The sample clustering module 31 is configured to cluster the sample users based on their feature information to obtain at least one sample user cluster;

the training sample acquisition module 32 is configured to extract at least one group of training sample users from each sample user cluster and obtain the same-person annotation information of the training sample users;

the training module 33 is configured to train the same-person identification model using the training sample users and the corresponding same-person annotation information;

the identification module 34 is configured to perform same-person identification on the users to be identified according to the trained same-person identification model.

In a specific application scenario, the apparatus further comprises a basic data acquisition module 35 and a feature information statistics module 36.

The basic data acquisition module 35 is configured to obtain the basic data of the sample users before the sample users are clustered based on their feature information to obtain at least one sample user cluster;

the feature information statistics module 36 is configured to compute the feature information of the sample users from their basic data according to preset feature categories.

In a specific application scenario, the sample clustering module 31 comprises a cluster center generation unit 311 and a clustering unit 312.

The cluster center generation unit 311 is configured to determine the number of clusters according to the number of sample users and generate that number of initial cluster centers;

the clustering unit 312 is configured to perform K-means clustering on the feature information of the sample users according to the initial cluster centers, to obtain the corresponding number of sample user clusters and a cluster center for each sample user cluster.

In a specific application scenario, any group of training sample users comprises the sample user corresponding to the cluster center of a sample user cluster and any other sample user in the same sample user cluster, and the same-person annotation information comprises a same-person label or a non-same-person label.

In a specific application scenario, the identification module 34 comprises a feature information statistics unit 341, a clustering unit 342, an identification user acquisition unit 343, and an identification unit 344.

The feature information statistics unit 341 is configured to compute the feature information of the users to be identified from their basic data according to the preset feature categories;

the clustering unit 342 is configured to cluster the users to be identified based on their feature information, to obtain at least one cluster of users to be identified and a cluster center for each such cluster;

the identification user acquisition unit 343 is configured to obtain the central user and the comparison users of any cluster of users to be identified, where the central user is the user to be identified that corresponds to the cluster center of the cluster, and the comparison users are all users to be identified in the cluster other than the one at the cluster center;

the identification unit 344 is configured to input the feature information of the central user of any cluster of users to be identified, together with the feature information of any comparison user, into the trained same-person identification model, to obtain a result indicating whether the central user and that comparison user are the same user.

In a specific application scenario, the apparatus further comprises a result output module 37.

The result output module 37 is configured to establish, if the central user and any comparison user are the same user, a same-person set for the central user from that comparison user.

Specifically, the basic data includes, but is not limited to, at least one of, or a combination of, the communication data, carrier service data, and e-commerce operation data of the sample users.

It should be noted that, for further description of the functional units of the same-person identification apparatus provided by the embodiments of the application, reference may be made to the corresponding descriptions of Fig. 1 and Fig. 2, which are not repeated here.

Based on the methods shown in Fig. 1 and Fig. 2, an embodiment of the application correspondingly provides a storage medium on which a computer program is stored. When executed by a processor, the program implements the same-person identification method shown in Fig. 1 and Fig. 2.

On this understanding, the technical solution of the application may be embodied as a software product. The software product may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) and includes instructions that cause a computer device (such as a personal computer, a server, or a network device) to execute the method described in each implementation scenario of the application.

Based on the methods shown in Fig. 1 and Fig. 2 and the virtual apparatus embodiments shown in Fig. 3 and Fig. 4, and to achieve the above purpose, an embodiment of the application also provides a computer device, which may be a personal computer, a server, a network device, or the like. The computer device comprises a storage medium and a processor. The storage medium stores a computer program, and the processor executes the computer program to implement the same-person identification method shown in Fig. 1 and Fig. 2.

Optionally, which can also include user interface, network interface, camera, radio frequency (Radio Frequency, RF) circuit, sensor, voicefrequency circuit, WI-FI module etc..User interface may include display screen (Display), input unit such as keyboard (Keyboard) etc., optional user interface can also connect including USB interface, card reader Mouthful etc..Network interface optionally may include standard wireline interface and wireless interface (such as blue tooth interface, WI-FI interface).

Those skilled in the art will understand that the computer device structure provided in this embodiment does not limit the computer device, which may include more or fewer components, combine certain components, or use a different arrangement of components.

The storage medium may further include an operating system and a network communication module. The operating system is a program that manages and maintains the hardware and software resources of the computer device and supports the running of the information processing program and other software and/or programs. The network communication module implements communication among the components inside the storage medium, as well as communication with other hardware and software in the physical device.

From the description of the above embodiments, those skilled in the art can clearly understand that the present application may be implemented by software plus a necessary general-purpose hardware platform, or by hardware. The sample users are first clustered using their feature information, so that sample users with a higher likelihood of being the same person are placed in the same cluster. Training sample users are then selected within each cluster and annotated with same-person information, after which the same-person identification model is trained using the feature information of the training sample users and their same-person labels. Finally, the trained same-person identification model is used to perform same-person identification on the users to be identified. By clustering the sample users, the present application reduces the amount of training required for the same-person identification model, optimizes training, and improves training efficiency.
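The clustering and training-sample selection summarized above can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the function names `kmeans` and `training_pairs`, the toy feature vectors, the choice of the member nearest the centroid as the "user corresponding to the cluster center", and the `annotated_person` list standing in for manual same-person annotation are all assumptions made for illustration. Model training itself is omitted; any binary classifier could consume the resulting samples.

```python
import math

def kmeans(points, initial_centers, iters=20):
    """Plain K-means; returns (cluster index per point, final centers)."""
    centers = [list(c) for c in initial_centers]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each user goes to the nearest cluster center.
        labels = [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def training_pairs(points, labels, centers):
    """Pair the user nearest each center with every other user in that cluster."""
    pairs = []
    for c, center in enumerate(centers):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if len(members) < 2:
            continue  # a single-user cluster yields no pair
        central = min(members, key=lambda i: math.dist(points[i], center))
        pairs.extend((central, i) for i in members if i != central)
    return pairs

# Toy feature vectors for six sample users (illustrative values only).
points = [[0.0, 0.0], [0.2, 0.0], [0.1, 0.1],
          [5.0, 5.0], [5.2, 5.0], [5.1, 5.1]]
labels, centers = kmeans(points, initial_centers=[points[0], points[3]])
pairs = training_pairs(points, labels, centers)

# Hypothetical manual annotation: which real person each sample user is.
annotated_person = ["A", "A", "A", "B", "C", "B"]
# 1 = same-person label, 0 = non-same-person label.
same_labels = [int(annotated_person[a] == annotated_person[b]) for a, b in pairs]
samples = [(points[a], points[b], y) for (a, b), y in zip(pairs, same_labels)]
```

Only the pairs formed inside each cluster need to be annotated and fed to the model, which is where the reduction in training volume comes from: users in different clusters are never paired.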

Those skilled in the art will understand that the accompanying drawings are only schematic diagrams of a preferred implementation scenario, and that the modules or processes in the drawings are not necessarily required to implement the present application. Those skilled in the art will also understand that the modules of the apparatus in an implementation scenario may be distributed in the apparatus of that scenario as described, or may be changed accordingly and located in one or more apparatuses different from that scenario. The modules of the above implementation scenarios may be merged into one module or further split into multiple sub-modules.

The above serial numbers of the present application are for description only and do not represent the merits of the implementation scenarios. The above discloses only several specific implementation scenarios of the present application; however, the present application is not limited thereto, and any change that a person skilled in the art can conceive of shall fall within the protection scope of the present application.

Claims

1. A same-person identification method, characterized by comprising:

extracting at least one group of training sample users from each sample user cluster, and obtaining same-person label information of the training sample users;

2. The method according to claim 1, characterized in that, before clustering the sample users based on the feature information of the sample users to obtain at least one sample user cluster, the method further comprises:

obtaining basic data of the sample users;

computing statistics of the feature information of the sample users according to preset feature categories, based on the basic data of the sample users.

3. The method according to claim 2, characterized in that clustering the sample users based on the feature information of the sample users to obtain at least one sample user cluster specifically comprises:

performing K-means clustering on the feature information of the sample users according to the initial cluster centers, to obtain the corresponding number of sample user clusters and a cluster center corresponding to each sample user cluster.

4. The method according to claim 3, characterized in that any group of training sample users comprises the sample user corresponding to the cluster center of any sample user cluster and any other sample user in the same sample user cluster, and the same-person label information comprises a same-person label or a non-same-person label.

5. The method according to claim 4, characterized in that performing same-person identification on the users to be identified according to the trained same-person identification model specifically comprises:

computing statistics of the feature information of the users to be identified according to the preset feature categories, based on the basic data of the users to be identified;

clustering the users to be identified based on the feature information of the users to be identified, to obtain at least one to-be-identified user cluster and a cluster center corresponding to each to-be-identified user cluster;

obtaining the central user and the comparison users corresponding to any to-be-identified user cluster, wherein the central user is the user to be identified that corresponds to the cluster center of the to-be-identified user cluster, and the comparison users are all users to be identified in the to-be-identified user cluster other than the one at the cluster center;

inputting the feature information corresponding to the central user and the feature information corresponding to any comparison user in any to-be-identified user cluster into the trained same-person identification model, to obtain a result indicating whether the central user and that comparison user are the same user.

6. The method according to claim 5, characterized in that the method further comprises:

if the central user and any comparison user are the same user, establishing, according to that comparison user, a same-person set corresponding to the central user.

7. The method according to any one of claims 2 to 6, characterized in that the basic feature data includes, but is not limited to, at least one of communication data, carrier service data, and e-commerce operation data of the sample users, or a combination thereof.

8. A same-person identification apparatus, characterized by comprising:

a sample clustering module, configured to cluster the sample users based on the feature information of the sample users, to obtain at least one sample user cluster;

a training sample obtaining module, configured to extract at least one group of training sample users from each sample user cluster, and to obtain same-person label information of the training sample users;

a training module, configured to train a same-person identification model using the training sample users and the corresponding same-person label information;

9. A storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the same-person identification method according to any one of claims 1 to 7.

10. A computer device, comprising a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, characterized in that the processor implements the same-person identification method according to any one of claims 1 to 7 when executing the program.
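For illustration only, the identification flow recited in claims 5 and 6 can be sketched in Python as follows. The sketch forms no part of the claims. The function name `identify_same_person`, the input layout (precomputed clusters given as a center vector plus member ids), the selection of the member nearest the center as the central user, and `toy_model` (a distance threshold standing in for the trained same-person identification model) are all assumptions.

```python
import math

def identify_same_person(clusters, features, model):
    """Return {central_user: same-person set} for clusters with at least one match.

    clusters: list of (center_vector, [user_id, ...]) for the users to be identified
    features: dict mapping user_id -> feature vector
    model:    callable(features_a, features_b) -> True if judged the same user
    """
    same_person_sets = {}
    for center, members in clusters:
        if not members:
            continue
        # Central user: the member whose features lie nearest the cluster center.
        central = min(members, key=lambda u: math.dist(features[u], center))
        # Comparison users: every other member, each compared with the central user.
        matched = {u for u in members
                   if u != central and model(features[central], features[u])}
        if matched:
            same_person_sets[central] = matched | {central}
    return same_person_sets

def toy_model(a, b):
    """Hypothetical stand-in for the trained model: a simple distance threshold."""
    return math.dist(a, b) < 0.5

# Illustrative feature vectors and precomputed clusters.
features = {"u1": [0.0, 0.0], "u2": [0.1, 0.0], "u3": [2.0, 0.0], "u4": [9.0, 9.0]}
clusters = [([0.7, 0.0], ["u1", "u2", "u3"]), ([9.0, 9.0], ["u4"])]
result = identify_same_person(clusters, features, toy_model)
```

In this toy input, `u2` is nearest the first center and becomes the central user; `u1` is judged the same user and `u3` is not, while the single-user cluster produces no set.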