CN111209512A - User identification method, device and equipment - Google Patents


Info

Publication number
CN111209512A
Authority
CN
China
Prior art keywords
users
user
community
user group
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010004113.4A
Other languages
Chinese (zh)
Inventor
丁茹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Tongbang Zhuoyi Technology Co Ltd
Original Assignee
Beijing Tongbang Zhuoyi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Tongbang Zhuoyi Technology Co Ltd
Priority to CN202010004113.4A
Publication of CN111209512A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The embodiment of the invention provides a user identification method, device and equipment, wherein the method comprises the following steps: determining the correlation among a plurality of users according to the access time of the users to a preset page, wherein the correlation indicates the association among the users; carrying out community discovery among the plurality of users according to the correlation among them to obtain at least one community, wherein each community comprises a plurality of users; and identifying target users in the at least one community according to the user characteristics of the users in the at least one community, wherein the target users are users with a preset behavior. The reliability of user identification is thereby improved.

Description

User identification method, device and equipment
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a user identification method, device and equipment.
Background
Currently, many users exhibit bad behaviors (e.g., fraud) in the network, and these users need to be found according to their behaviors in the network.
In the related art, a user with bad behavior may be identified according to information such as a device number and an emergency contact recorded when the user operates in the network; for example, the device number may be an Internet Protocol (IP) address or a Media Access Control (MAC) address of the device. However, this approach can only identify users whose bad behavior is obvious. If users employ more sophisticated techniques to carry out bad behavior, they cannot be identified in this way, so the reliability of user identification is low.
Disclosure of Invention
The embodiment of the invention provides a user identification method, a device and equipment, which improve the reliability of user identification.
In a first aspect, an embodiment of the present invention provides a user identification method, including:
determining the correlation among a plurality of users according to the access time of the users to a preset page, wherein the correlation indicates the association among the users;
according to the correlation among the plurality of users, carrying out community discovery among the plurality of users to obtain at least one community, wherein each community comprises a plurality of users;
and identifying target users in the at least one community according to the user characteristics of the users in the at least one community, wherein the target users are users with a preset behavior.
In a possible implementation manner, determining the relevance among a plurality of users according to the access time of the plurality of users to a preset page includes:
determining at least one user group in the plurality of users according to the access time of the plurality of users to the preset page, wherein each user group comprises a plurality of users;
respectively acquiring the correlation among the users in each user group according to the access time of the users in each user group to access the preset page;
and respectively carrying out community discovery among a plurality of users in each user group according to the correlation among the users in each user group.
In a possible implementation manner, if the number of the preset pages is 1, the time difference between every two users in one user group accessing the preset pages is smaller than a first threshold;
if the number of the preset pages is greater than 1, M time differences exist in N time differences corresponding to every two users in a user group, wherein the M time differences are smaller than a first threshold value, the N time differences corresponding to the two users are respectively time differences of the two users accessing the N preset pages, M is greater than or equal to 1 and is less than or equal to N, and M is an integer.
In one possible embodiment, for any one user group, carrying out community discovery among a plurality of users in the user group according to the correlation among the users in the user group comprises the following steps:
determining the edge relation among the users in the user group according to the correlation among the users in the user group;
and according to the edge relation among the users in the user group, carrying out community discovery among a plurality of users in the user group.
In one possible embodiment, determining the edge relation between users in the user group according to the correlation between users in the user group includes:
if the correlation between a first user and a second user in the user group is larger than a second threshold value, determining that the first user and the second user have an edge relationship;
and if the correlation between the first user and the second user in the user group is smaller than or equal to a second threshold value, determining that no edge relation exists between the first user and the second user.
In a possible embodiment, for any one of the at least one community, identifying a target user in the community according to the user characteristics of each user in the community includes:
determining a first user in the community according to the user characteristics of each user in the community, wherein the first user is a user marked with the preset behavior;
and identifying target users in the community according to the number of the first users in the community.
In one possible embodiment, identifying a target user in the community according to the number of the first users in the community includes:
and if the number of the first users meets a preset condition, determining the users except the first users in the community as the target users.
In a possible embodiment, the preset conditions comprise one or more of the following conditions:
the number of the first users is greater than a third threshold;
a ratio of the number of the first users to a total number of users included in the community is greater than a fourth threshold.
In a second aspect, an embodiment of the present invention provides a user identification apparatus, including a determining module, a discovering module, and an identifying module, wherein,
the determining module is used for determining the correlation among a plurality of users according to the access time of the users to a preset page, wherein the correlation indicates the association among the users;
the discovery module is used for carrying out community discovery among the plurality of users according to the correlation among the plurality of users to obtain at least one community, wherein each community comprises a plurality of users;
the identification module is used for identifying target users in the at least one community according to the user characteristics of the users in the at least one community, and the target users are users with preset behaviors.
In a possible implementation, the determining module is specifically configured to:
determining at least one user group in the plurality of users according to the access time of the plurality of users to the preset page, wherein each user group comprises a plurality of users;
respectively acquiring the correlation among the users in each user group according to the access time of the users in each user group to access the preset page;
and respectively carrying out community discovery among a plurality of users in each user group according to the correlation among the users in each user group.
In a possible implementation manner, if the number of the preset pages is 1, the time difference between every two users in one user group accessing the preset pages is smaller than a first threshold;
if the number of the preset pages is greater than 1, M time differences exist in N time differences corresponding to every two users in a user group, wherein the M time differences are smaller than a first threshold value, the N time differences corresponding to the two users are respectively time differences of the two users accessing the N preset pages, M is greater than or equal to 1 and is less than or equal to N, and M is an integer.
In a possible implementation, the discovery module is specifically configured to:
aiming at any user group, determining the edge relation among the users in the user group according to the correlation among the users in the user group;
and according to the edge relation among the users in the user group, carrying out community discovery among a plurality of users in the user group.
In a possible implementation, the discovery module is specifically configured to:
if the correlation between a first user and a second user in the user group is larger than a second threshold value, determining that the first user and the second user have an edge relationship;
and if the correlation between the first user and the second user in the user group is smaller than or equal to a second threshold value, determining that no edge relation exists between the first user and the second user.
In a possible implementation, the identification module is specifically configured to:
aiming at any community in the at least one community, determining a first user in the community according to the user characteristics of each user in the community, wherein the first user is a user marked with the preset behavior;
and identifying target users in the community according to the number of the first users in the community.
In a possible implementation, the identification module is specifically configured to:
and if the number of the first users meets a preset condition, determining the users except the first users in the community as the target users.
In a possible embodiment, the preset conditions comprise one or more of the following conditions:
the number of the first users is greater than a third threshold;
a ratio of the number of the first users to a total number of users included in the community is greater than a fourth threshold.
In a third aspect, an embodiment of the present invention provides a user identification apparatus, including: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executing the computer-executable instructions stored by the memory causes the at least one processor to perform the user identification method of any of the first aspects.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where a computer-executable instruction is stored in the computer-readable storage medium, and when a processor executes the computer-executable instruction, the user identification method according to any one of the first aspect is implemented.
According to the user identification method provided by the embodiment of the invention, when a target user (a user with a preset behavior) needs to be identified among a plurality of users, the correlation among the plurality of users can be determined according to the access time of the plurality of users to a preset page; community discovery is carried out among the plurality of users according to the correlation to obtain at least one community, wherein each community comprises a plurality of users; and target users are identified in the at least one community according to the user characteristics of the users in the at least one community. In this process, the preset page is a page that must be accessed when a user executes the preset behavior, and users with the preset behavior generally act at highly consistent times, so community discovery can be performed accurately among the users according to the time at which they access the preset page, users with the preset behavior can be identified accurately, and the reliability of user identification is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a schematic view of an application scenario of a user identification method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a community structure according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a user identification method according to an embodiment of the present invention;
Fig. 4 is a schematic flowchart of another user identification method according to an embodiment of the present invention;
Fig. 5 is a schematic process diagram of a user identification method according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a user identification device according to an embodiment of the present invention;
Fig. 7 is a schematic diagram of a hardware structure of user identification equipment according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a schematic view of an application scenario of a user identification method according to an embodiment of the present invention. Referring to fig. 1, a user may access a preset page, where the preset page is a page accessed when the user performs a preset behavior. The time at which each user accesses the preset page can be obtained, community discovery is carried out according to these times, and user identification is carried out based on the discovered communities, so that users with the preset behavior can be identified. In this process, the preset page is a page that must be accessed when a user executes the preset behavior, and users with the preset behavior generally act at highly consistent times, so community discovery can be performed accurately among the users according to the time at which they access the preset page, users with the preset behavior can be identified accurately, and the reliability of user identification is improved.
For ease of understanding, the concepts related to the present application are explained below.
Community: a community includes a plurality of vertices. Edges exist between some of the vertices in the community, and may not exist between others. A vertex may refer to any object, such as a person, an organization, or an event. In this application, a vertex may refer to a user. Connections inside a community are dense, and connections between communities are sparse.
Community discovery: refers to a process of determining communities among vertices (users, in this application) according to the correlation between the vertices. The correlation between users refers to the degree of association between them; for example, if two users repeatedly visit the same page at the same time, the association between the two users is strong and the correlation between them is high.
The community will be described with reference to fig. 2.
Fig. 2 is a schematic view of a community structure according to an embodiment of the present invention. Referring to fig. 2, the network includes a plurality of vertices, denoted vertex 1, vertex 2, ..., vertex 16. Different vertices may be connected to each other, and a connection line between two vertices is called an edge. In this application, each vertex may refer to a user. Optionally, the edges in the network may represent relationships between vertices: if there is a relationship between two vertices, they are connected by an edge; otherwise, there is no edge between them.
In the practical application process, a plurality of communities can be determined in the network according to the connection relationships among the vertices, such that connections inside each determined community are dense and connections between different communities are sparse. For example, as shown in fig. 2, community A, community B, and community C may be determined, where connections inside each of community A, community B, and community C are dense, and connections between the three communities are sparse.
It should be noted that fig. 2 illustrates the vertices, the connection relationships between vertices, and the determined communities in a network by way of example only, and does not limit them.
The technical means shown in the present application will be described in detail below with reference to specific examples. It should be noted that the following embodiments may be combined with each other, and the description of the same or similar contents in different embodiments is not repeated.
Fig. 3 is a flowchart illustrating a user identification method according to an embodiment of the present invention. Referring to fig. 3, the method may include:
S301, determining the correlation among a plurality of users according to the access time of the plurality of users to the preset page.
The method of the embodiment of the invention may be executed by an electronic device, or by a user identification apparatus arranged in the electronic device. For example, the electronic device may be a terminal device or a device such as a server. The user identification apparatus may be implemented by software or by a combination of software and hardware.
Optionally, the preset page is a page accessed by the user when executing the preset behavior. For example, assuming that the preset behavior is credit fraud, the preset pages may include pages that the user must operate on when applying for a loan, such as a registration page, an information filling page, and a loan product details page. The number of the preset pages can be 1 or more.
The access time of the user to the preset page may be the time at which the user accesses the preset page for the first time. For example, assuming that the preset behavior is credit fraud: to identify pre-loan fraud users, the current state of the user may be set to the pre-loan state (that is, the user's credit application has not yet been approved), and the access time may be the first time the user accesses the preset page within the user's full life cycle; to identify in-loan fraud users, the current state of the user may be set to the in-loan state (that is, the user's credit application has been approved), and the access time may be the first time the user accesses the preset page after the approval.
Optionally, the access time of the user to the preset page may be expressed in the form of a timestamp. A timestamp is the number of seconds or milliseconds elapsed between a fixed reference time and the time being recorded.
Optionally, the correlation between two users may be represented by a Pearson correlation coefficient. For example, assuming that the number of preset pages is 10, user 1 corresponds to 10 access times and user 2 also corresponds to 10 access times. The Pearson correlation coefficient between user 1 and user 2 is computed from these two sets of 10 access times and represents the correlation between user 1 and user 2.
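As an illustration of the Pearson step described above, the following is a minimal sketch rather than the patented implementation. The function name `pearson`, the sample timestamps, and the choice to return 0.0 for a constant sequence are assumptions made for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant sequence is treated as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical first-access timestamps (seconds) for 4 preset pages.
user_1 = [1000, 1060, 1130, 1200]
user_2 = [1005, 1066, 1128, 1207]
r = pearson(user_1, user_2)  # close to 1: the two users move in step
```

Each position in the two lists is assumed to refer to the same preset page, so the coefficient measures whether the two users progress through the pages in step with each other.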
S302, according to the correlation among the plurality of users, community discovery is carried out among the plurality of users to obtain at least one community, and each community comprises a plurality of users.
Optionally, the edge relationship between the users may be determined first, and community discovery may be performed among a plurality of users according to the edge relationship between the users.
For example, if the correlation between two users is greater than a second threshold, it is determined that there is an edge relationship between the two users. And if the correlation between the two users is less than or equal to the second threshold value, determining that the two users do not have the edge relation.
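The edge rule above can be sketched as a simple filter over pairwise correlations. This is an illustrative sketch only; the function name `build_edges`, the dictionary layout, and the threshold value 0.80 are assumptions.

```python
def build_edges(correlations, second_threshold):
    """Keep only the user pairs whose correlation exceeds the threshold.

    correlations maps a (user_a, user_b) pair to its correlation value.
    """
    return {pair for pair, r in correlations.items() if r > second_threshold}


# Hypothetical pairwise correlations between three users.
correlations = {
    ("u1", "u2"): 0.95,
    ("u1", "u3"): 0.40,
    ("u2", "u3"): 0.80,
}
edges = build_edges(correlations, second_threshold=0.80)
```

Note that a pair whose correlation equals the threshold gets no edge, matching the "less than or equal to" branch of the rule.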
After the edge relationships are determined, a preset algorithm may be adopted to perform community discovery among the plurality of users. For example, the preset algorithm may be the Fast-Newman algorithm based on modularity maximization, the Speaker-listener Label Propagation Algorithm (SLPA), or a GraphLab-based algorithm relying on core point dependence, and the like. Of course, other algorithms may also be used to perform community discovery among the plurality of users, which is not specifically limited in this embodiment of the present invention.
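To show only the input and output shape of this step (users and edges in, communities out), the sketch below uses connected components as a deliberately simple stand-in. It is not Fast-Newman, SLPA, or any algorithm named above, and it will merge groups that a modularity-based method would split. The function name and sample data are assumptions.

```python
from collections import defaultdict


def connected_components(users, edges):
    """Group users into communities by graph connectivity (stand-in only)."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    communities = []
    for start in users:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        # Each community is expected to contain a plurality of users,
        # so isolated users are dropped.
        if len(component) > 1:
            communities.append(component)
    return communities


users = ["u1", "u2", "u3", "u4", "u5", "u6"]
edges = {("u1", "u2"), ("u2", "u3"), ("u4", "u5")}
communities = connected_components(users, edges)
```

In practice a graph library's community-detection routine would replace this function; the surrounding steps (edge construction before, target identification after) would be unchanged.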
S303, identifying a target user in the at least one community according to the user characteristics of each user in the at least one community, wherein the target user is a user with a preset behavior.
The process of identifying a target user in each community is the same, and the following description will take the process of identifying a target user in any one community as an example.
The first user can be determined in the community according to the user characteristics of all users in the community, the first user is a user marked with a preset behavior, and the target user is identified in the community according to the number of the first users in the community.
For example, for a plurality of users in the community, a preset behavior query may be performed according to the user identification to determine the first user in the community. Or, the first user may be determined in the community according to whether the user in the community has the mark information, if the user has the mark information, it is determined that the user is the user marked with the preset behavior, and if the user does not have the mark information, it is determined that the user is not the user marked with the preset behavior.
After the first users are determined in the community, if the number of the first users meets a preset condition, the users in the community other than the first users are determined as target users. The preset condition includes one or more of the following: the number of the first users is greater than a third threshold; the ratio of the number of the first users to the total number of users included in the community is greater than a fourth threshold.
For example, assuming that the fourth threshold is 0.7, if the ratio of the number of the first users in a community to the total number of users in the community is greater than 0.7, the users in the community other than the first users are determined as the target users.
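The identification rule can be sketched as follows. This is a minimal sketch under assumptions: the function name `identify_targets` is hypothetical, marked users are supplied as a set, and when both thresholds are configured the sketch treats either one being satisfied as sufficient, which is only one reading of "one or more of the following".

```python
def identify_targets(community, marked_users,
                     third_threshold=None, fourth_threshold=None):
    """Return the unmarked users of a community if it looks suspicious.

    community and marked_users are sets of user ids. A threshold left as
    None is treated as not configured.
    """
    first_users = community & marked_users
    count_ok = (third_threshold is not None
                and len(first_users) > third_threshold)
    ratio_ok = (fourth_threshold is not None
                and len(community) > 0
                and len(first_users) / len(community) > fourth_threshold)
    if count_ok or ratio_ok:
        return community - first_users
    return set()


# Hypothetical community of 10 users, 8 of them already marked.
community = {f"u{i}" for i in range(1, 11)}
marked = {f"u{i}" for i in range(1, 9)}
targets = identify_targets(community, marked, fourth_threshold=0.7)
```

Here the marked ratio is 0.8, which exceeds the fourth threshold of 0.7, so the two unmarked users are returned as target users.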
According to the user identification method provided by the embodiment of the invention, when a target user (a user with a preset behavior) needs to be identified among a plurality of users, the correlation among the plurality of users can be determined according to the access time of the plurality of users to a preset page; community discovery is carried out among the plurality of users according to the correlation to obtain at least one community, wherein each community comprises a plurality of users; and target users are identified in the at least one community according to the user characteristics of the users in the at least one community. In this process, the preset page is a page that must be accessed when a user executes the preset behavior, and users with the preset behavior generally act at highly consistent times, so community discovery can be performed accurately among the users according to the time at which they access the preset page, users with the preset behavior can be identified accurately, and the reliability of user identification is improved.
On the basis of any of the above embodiments, in order to improve the efficiency of user identification and reduce the complexity of user identification, a plurality of user groups may be determined in a plurality of users, and then community discovery may be performed in each user group and target user identification may be performed in a community. Next, the user identification method will be described with reference to fig. 4.
Fig. 4 is a flowchart illustrating another user identification method according to an embodiment of the present invention. Referring to fig. 4, the method may include:
S401, determining at least one user group in a plurality of users according to the access time of the plurality of users to a preset page, wherein each user group comprises a plurality of users.
Optionally, the number of the preset pages may be 1, or may be multiple.
If the number of the preset pages is 1, the time difference of every two users in one user group accessing the preset pages is smaller than a first threshold value.
If the number of the preset pages is N (N is greater than 1), then for every two users in a user group, M of the N time differences corresponding to the two users are smaller than the first threshold, where the N time differences are the differences between the times at which the two users access each of the N preset pages, M is an integer, and M is greater than or equal to 1 and less than or equal to N.
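The pairwise condition above can be checked with a short helper. This is a sketch under assumptions: the function name `may_share_group` is hypothetical, the two lists are assumed to be aligned page by page, and "M time differences are smaller than the first threshold" is read as "at least M".

```python
def may_share_group(times_a, times_b, first_threshold, m):
    """Check whether two users satisfy the M-of-N time-difference rule.

    times_a and times_b hold the access times of the two users to the same
    N preset pages, in the same page order.
    """
    if len(times_a) != len(times_b):
        raise ValueError("both users need one access time per preset page")
    close = sum(1 for a, b in zip(times_a, times_b)
                if abs(a - b) < first_threshold)
    return close >= m


# Single preset page (N = 1, M = 1): the rule reduces to one comparison.
single_page = may_share_group([100], [130], first_threshold=60, m=1)

# Three preset pages, two of which must be close in time.
three_pages = may_share_group([0, 100, 200], [10, 500, 205],
                              first_threshold=60, m=2)
```

With N = 1 and M = 1 the helper covers the single-page case, so one function can serve both rules.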
Optionally, the at least one user group may be determined in either of two ways.
One possible implementation:
A plurality of preset time periods and a user group corresponding to each preset time period are determined, and at least one user group is determined among the plurality of users according to the access time of the plurality of users to the preset page and the plurality of preset time periods. The access time of each user in a user group to the preset page falls within the preset time period corresponding to that user group. In this process, for any user, the user group whose preset time period contains the user's access time to the preset page is selected from the user groups corresponding to the plurality of preset time periods, and the user is added to the selected user group.
For example, assuming that 5 preset time periods are determined, they may be denoted as time period 1, time period 2, ..., time period 5. Each time period corresponds to one user group, and the user groups corresponding to the 5 time periods are denoted as user group 1, user group 2, ..., user group 5. For any user 1, if the access time of user 1 to the preset page falls within time period 1, user 1 is added to user group 1, which corresponds to time period 1.
In this implementation manner, if the number of the preset pages is N (N is greater than 1), and M of the N access times at which a user accesses the N preset pages fall within a preset time period, the user is added to the user group corresponding to that preset time period. For example, assuming that N is 10 and M is 7, if 7 of the 10 access times corresponding to user 1 fall within preset time period 1, user 1 is added to the user group corresponding to preset time period 1.
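The time-period approach can be sketched as a bucketing pass over the users. This is an illustrative sketch: the function name `group_by_period`, the half-open `(start, end)` representation of a time period, the "at least M" reading, and the choice to stop at the first matching period are all assumptions.

```python
from collections import defaultdict


def group_by_period(access_times, periods, m=1):
    """Assign each user to the first period holding at least m access times.

    access_times maps a user id to a list of N timestamps.
    periods is a list of half-open (start, end) intervals.
    Returns a dict mapping period index to the set of users in that group.
    """
    groups = defaultdict(set)
    for user, times in access_times.items():
        for index, (start, end) in enumerate(periods):
            hits = sum(1 for t in times if start <= t < end)
            if hits >= m:
                groups[index].add(user)
                break  # assumption: a user joins only one group
    return dict(groups)


# Hypothetical data: two one-hour periods, three preset pages, M = 2.
periods = [(0, 3600), (3600, 7200)]
access_times = {
    "u1": [100, 200, 4000],
    "u2": [3700, 3800, 3900],
    "u3": [50, 5000, 9000],
}
groups = group_by_period(access_times, periods, m=2)
```

User `u3` has no period containing two of its access times, so it is left out of every group in this sketch.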
Another possible implementation:
Initially, any one user is added to a new user group, yielding the first user group. For each of the other users, if the time difference between that user and the users in an existing user group is smaller than the first threshold, the user is added to that user group; if the time difference between that user and the users in every existing user group is greater than or equal to the first threshold, a new user group is created and the user is added to it. This process is repeated until every user has been added to a user group.
For example, initially, user 1 is added to user group 1, yielding one user group. For user 2, if the time difference between user 2 and the users in the existing user group (user group 1) is smaller than the first threshold, user 2 is added to user group 1; if it is greater than or equal to the first threshold, user group 2 is created and user 2 is added to user group 2. This process is repeated until every user has been added to a user group.
In this implementation, if the number of preset pages is N (N greater than 1), a user is added to a user group if, among the N access times at which the user accessed the N preset pages, M access times each differ from the corresponding access time of that user group by less than the first threshold. For example, assuming N is 10 and M is 7, if 7 of the 10 access times corresponding to user 1 each differ from the corresponding 7 access times in user group 1 by less than the first threshold, user 1 is added to user group 1.
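The incremental grouping described above can be sketched roughly as follows. Several points are assumptions rather than statements of the source: a user is compared against every member already in a group (consistent with the pairwise condition on user groups), M is read as "at least M", and a user joins the first group that accepts it. The names `close_enough` and `greedy_groups` are hypothetical.

```python
def close_enough(times_a, times_b, threshold, min_hits):
    """True if at least min_hits of the per-page time differences are below threshold."""
    hits = sum(1 for a, b in zip(times_a, times_b) if abs(a - b) < threshold)
    return hits >= min_hits


def greedy_groups(access_times, threshold, min_hits=1):
    """Assign users to groups one at a time, creating a new group when none fits.

    access_times: dict mapping user -> list of N access times (one per preset page).
    threshold: the first threshold on the time difference.
    min_hits: M, the number of pages whose time difference must be below threshold.
    """
    groups = []
    for user, times in access_times.items():
        for group in groups:
            # Assumption: the user must be close to every existing member.
            if all(close_enough(times, access_times[member], threshold, min_hits)
                   for member in group):
                group.append(user)
                break
        else:
            # No existing group accepted the user, so start a new one.
            groups.append([user])
    return groups
```

The result depends on the order in which users are processed, which is a property of this kind of greedy scheme rather than something the source discusses.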
S402, obtaining the correlation among the users in each user group according to the access time of the users in each user group to access the preset page.
For the manner of obtaining the correlation between the users in a user group in S402, reference may be made to S301; details are not repeated here.
Because the correlation needs to be obtained only between pairs of users within the same user group, rather than between every pair of users among all users, the computational complexity is reduced.
S403, performing community discovery separately among the users in each user group according to the correlation among the users in that user group.
Optionally, for any one user group, the edge relationships among the users in the user group may be determined according to the correlation among those users, and community discovery may then be performed among the users in the user group according to the edge relationships.
For example, if the correlation between a first user and a second user in the user group is greater than a second threshold, it is determined that an edge relationship exists between the first user and the second user. If the correlation between the first user and the second user is less than or equal to the second threshold, it is determined that no edge relationship exists between them.
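The edge rule above can be sketched as a small helper. How the correlation itself is computed is described under S301 and is not reproduced here, so the sketch takes it as a caller-supplied function; `build_edges`, `correlation`, and `second_threshold` are hypothetical names.

```python
from itertools import combinations


def build_edges(users, correlation, second_threshold):
    """Return the set of user pairs whose correlation exceeds the second threshold.

    users: list of the users in one user group.
    correlation: callable (user_a, user_b) -> number, supplied by the caller.
    second_threshold: pairs with correlation <= this value get no edge.
    """
    edges = set()
    for a, b in combinations(users, 2):
        if correlation(a, b) > second_threshold:
            edges.add((a, b))
    return edges
```

Note that the comparison is strict: a pair whose correlation equals the second threshold has no edge, matching the "less than or equal to" case in the text.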
In S403, community discovery is performed within each user group; in other words, all users in one community belong to the same user group. For the process of community discovery within a user group, reference may be made to S302; details are not repeated here.
Because community discovery is performed only among the users within each user group, rather than among all users, the computational complexity is reduced.
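To illustrate how restricting community discovery to each user group might look, the sketch below runs discovery separately per group. The actual community discovery algorithm is the one referred to in S302 and is not given in this passage; connected components are used here purely as a simple stand-in. The names `connected_components`, `discover_per_group`, and `edges_by_group` are hypothetical.

```python
def connected_components(users, edges):
    """Partition users into connected components of the undirected edge graph."""
    neighbours = {u: set() for u in users}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, components = set(), []
    for start in users:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(sorted(component))
    return components


def discover_per_group(groups, edges_by_group):
    """Run the stand-in community discovery inside each user group separately.

    groups: dict mapping group name -> list of users in that group.
    edges_by_group: dict mapping group name -> iterable of (user, user) edges.
    Components with a single user are dropped, since each community is
    described as containing multiple users.
    """
    result = {}
    for name, members in groups.items():
        components = connected_components(members, edges_by_group.get(name, ()))
        result[name] = [c for c in components if len(c) > 1]
    return result
```

Because every edge connects two users of the same group, no community in this sketch can span two user groups.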
S404, determining a first user in the community according to the user characteristics of each user in the community.
The first user is a user marked with a preset behavior.
Optionally, the user characteristics may include an identifier of the user, marking information of the user, and the like.
For example, when the user characteristic is the user identifier, whether the user has the preset behavior may be queried according to the user identifier. When the user characteristic is the marking information of the user, it may be checked whether the user has marking information: if so, the user is a user marked with the preset behavior; if not, the user is not a user marked with the preset behavior.
S405, identifying target users in the community according to the number of the first users in the community.
It should be noted that the execution process of S405 may refer to the execution process of S303, and is not described herein again.
In the embodiment shown in fig. 4, when a target user (a user with a preset behavior) needs to be identified among a plurality of users, the correlation among the plurality of users may be determined according to their access times to a preset page; community discovery is performed among the plurality of users according to that correlation to obtain at least one community, each community comprising a plurality of users; and target users are identified in the at least one community according to the user characteristics of the users in the at least one community. In this process, the preset page is a page that a user needs to access when executing the preset behavior, and users with the preset behavior generally show high consistency in access time. Community discovery can therefore be performed accurately among the users according to the times at which they access the preset page, so that users with the preset behavior can be accurately identified and the reliability of user identification is improved. Furthermore, a plurality of user groups are determined among the users, community discovery is performed within each user group, and target users are identified within the communities, which reduces the computational complexity.
On the basis of any of the above embodiments, the following describes the user identification method by taking a specific example in conjunction with fig. 5.
Fig. 5 is a process diagram of a user identification method according to an embodiment of the present invention. Referring to fig. 5, each circle in fig. 5 may represent a user.
A plurality of users are determined; these may be users who accessed a preset page within a preset time period.
Three user groups, denoted user group 1, user group 2, and user group 3, are determined among the plurality of users according to their access times to the preset page. The difference between the access times of every two users in each user group to the preset page is smaller than a first threshold.
Community discovery is then performed separately among the users in each user group. For example, community A and community B are found in user group 1, community C is found in user group 2, and community D, community E, and community F are found in user group 3.
The first users (users marked with the preset behavior) are determined in each community; the gray-filled circles in fig. 5 correspond to first users. If the ratio of the number of first users in a community to the total number of users in the community is greater than 0.5, the users in the community other than the first users are determined as target users, that is, newly discovered users with the preset behavior. For example, if community B contains 6 users in total, 4 of whom are first users, the other 2 users in community B may be determined as target users. If community C contains 7 users in total, 5 of whom are first users, the other 2 users in community C may be determined as target users. If community F contains 5 users in total, 4 of whom are first users, the remaining 1 user in community F may be determined as a target user.
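The ratio rule in this example can be sketched as follows. The function name `find_target_users` and the representation of marked users as a set are assumptions made for illustration; the threshold of 0.5 is the value used in the example above.

```python
def find_target_users(community, marked_users, ratio_threshold=0.5):
    """Return the unmarked users of a community dominated by marked users.

    community: list of users in one community.
    marked_users: set of users already marked with the preset behavior
                  (the "first users").
    ratio_threshold: the share of first users that must be exceeded.
    """
    if not community:
        return []
    first_users = [u for u in community if u in marked_users]
    if len(first_users) / len(community) > ratio_threshold:
        return [u for u in community if u not in marked_users]
    return []
```

With a community of 6 users of whom 4 are marked, the ratio is 4/6, which exceeds 0.5, so the 2 unmarked users are returned. With exactly 3 of 6 marked, the ratio equals 0.5 and no user is returned, because the comparison is strict.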
Fig. 6 is a schematic structural diagram of a user identification device according to an embodiment of the present invention. Referring to fig. 6, the user identification device 10 may include a determination module 11, a discovery module 12, and an identification module 13, wherein,
the determining module 11 is configured to determine, according to access time of a plurality of users to a preset page, a correlation between the plurality of users, where the correlation is used to indicate an association relationship between the users;
the discovery module 12 is configured to perform community discovery among the multiple users according to the correlation among the multiple users to obtain at least one community, where each community includes multiple users;
the identification module 13 is configured to identify a target user in the at least one community according to a user characteristic of each user in the at least one community, where the target user is a user with a preset behavior.
The user identification device shown in the embodiment of the present invention may implement the technical solutions shown in the above method embodiments, and the implementation principles and beneficial effects thereof are similar, and are not described herein again.
In a possible implementation, the determining module 11 is specifically configured to:
determining at least one user group in the plurality of users according to the access time of the plurality of users to the preset page, wherein each user group comprises a plurality of users;
respectively acquiring the correlation among the users in each user group according to the access time of the users in each user group to access the preset page;
and respectively carrying out community discovery among a plurality of users in each user group according to the correlation among the users in each user group.
In a possible implementation manner, if the number of the preset pages is 1, the time difference between every two users in one user group accessing the preset pages is smaller than a first threshold;
if the number of the preset pages is greater than 1, M time differences exist in N time differences corresponding to every two users in a user group, wherein the M time differences are smaller than a first threshold value, the N time differences corresponding to the two users are respectively time differences of the two users accessing the N preset pages, M is greater than or equal to 1 and is less than or equal to N, and M is an integer.
In a possible implementation, the discovery module 12 is specifically configured to:
aiming at any user group, determining the edge relation among the users in the user group according to the correlation among the users in the user group;
and according to the edge relation among the users in the user group, carrying out community discovery among a plurality of users in the user group.
In a possible implementation, the discovery module 12 is specifically configured to:
if the correlation between a first user and a second user in the user group is larger than a second threshold value, determining that the first user and the second user have an edge relationship;
and if the correlation between the first user and the second user in the user group is smaller than or equal to a second threshold value, determining that no edge relation exists between the first user and the second user.
In a possible implementation, the identification module 13 is specifically configured to:
aiming at any community in the at least one community, determining a first user in the community according to the user characteristics of each user in the community, wherein the first user is a user marked with the preset behavior;
and identifying target users in the community according to the number of the first users in the community.
In a possible implementation, the identification module 13 is specifically configured to:
and if the number of the first users meets a preset condition, determining the users except the first users in the community as the target users.
In a possible embodiment, the preset conditions comprise one or more of the following conditions:
the number of the first users is greater than a third threshold;
a ratio of the number of the first users to a total number of users included in the community is greater than a fourth threshold.
The user identification device shown in the embodiment of the present invention may implement the technical solutions shown in the above method embodiments, and the implementation principles and beneficial effects thereof are similar, and are not described herein again.
Fig. 7 is a schematic diagram of a hardware structure of a user identification device according to an embodiment of the present invention. As shown in fig. 7, the user identification device 20 includes at least one processor 21 and a memory 22. The processor 21 and the memory 22 are connected by a bus 23.
In a specific implementation, the at least one processor 21 executes computer-executable instructions stored by the memory 22 to cause the at least one processor 21 to perform the user identification method as described above.
For a specific implementation process of the processor 21, reference may be made to the above method embodiments, which implement similar principles and technical effects, and this embodiment is not described herein again.
In the embodiment shown in fig. 7, it should be understood that the processor may be a Central Processing Unit (CPU), another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), or the like. A general purpose processor may be a microprocessor or any conventional processor. The steps of a method disclosed in connection with the present invention may be executed directly by a hardware processor, or by a combination of hardware and software modules within the processor.
The memory may comprise high speed RAM memory and may also include non-volatile storage NVM, such as at least one disk memory.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, the buses in the figures of the present application are not limited to only one bus or one type of bus.
The present application also provides a computer-readable storage medium, in which computer-executable instructions are stored, and when a processor executes the computer-executable instructions, the user identification method as described above is implemented.
The computer-readable storage medium may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk. Readable storage media can be any available media that can be accessed by a general purpose or special purpose computer.
An exemplary readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium. Of course, the readable storage medium may also be an integral part of the processor. The processor and the readable storage medium may reside in an Application Specific Integrated Circuit (ASIC). Of course, the processor and the readable storage medium may also reside as discrete components in the apparatus.
The division of the units is only a logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Those of ordinary skill in the art will understand that: all or a portion of the steps of implementing the above-described method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (15)

1. A method for identifying a user, comprising:
determining the correlation among a plurality of users according to the access time of the users to a preset page, wherein the correlation is used for indicating the association relationship among the users;
performing community discovery among the plurality of users according to the correlation among the plurality of users to obtain at least one community, wherein each community comprises a plurality of users;
and identifying target users in the at least one community according to the user characteristics of the users in the at least one community, wherein the target users are users with preset behaviors.
2. The method of claim 1, wherein determining the correlation among the plurality of users according to the access time of the plurality of users to the preset page comprises:
determining at least one user group in the plurality of users according to the access time of the plurality of users to the preset page, wherein each user group comprises a plurality of users;
respectively acquiring the correlation among the users in each user group according to the access time of the users in each user group to access the preset page;
and respectively carrying out community discovery among a plurality of users in each user group according to the correlation among the users in each user group.
3. The method of claim 2,
if the number of the preset pages is 1, the time difference of every two users in a user group accessing the preset pages is smaller than a first threshold value;
if the number of the preset pages is greater than 1, M time differences exist in N time differences corresponding to every two users in a user group, wherein the M time differences are smaller than a first threshold value, the N time differences corresponding to the two users are respectively time differences of the two users accessing the N preset pages, M is greater than or equal to 1 and is less than or equal to N, and M is an integer.
4. The method according to claim 2 or 3, wherein, for any one user group, performing community discovery among the plurality of users in the user group according to the correlation among the users in the user group comprises:
determining the edge relation among the users in the user group according to the correlation among the users in the user group;
and according to the edge relation among the users in the user group, carrying out community discovery among a plurality of users in the user group.
5. The method of claim 4, wherein determining the edge relationships between users in the user group based on the correlations between users in the user group comprises:
if the correlation between a first user and a second user in the user group is larger than a second threshold value, determining that the first user and the second user have an edge relationship;
and if the correlation between the first user and the second user in the user group is smaller than or equal to a second threshold value, determining that no edge relation exists between the first user and the second user.
6. The method according to any one of claims 1 to 5, wherein, for any one of the at least one community, identifying a target user in the community according to the user characteristics of each user in the community comprises:
determining a first user in the community according to the user characteristics of each user in the community, wherein the first user is a user marked with the preset behavior;
and identifying target users in the community according to the number of the first users in the community.
7. The method of claim 6, wherein identifying a target user in the community based on the number of the first users in the community comprises:
and if the number of the first users meets a preset condition, determining the users except the first users in the community as the target users.
8. The method of claim 7, wherein the preset conditions comprise one or more of the following conditions:
the number of the first users is greater than a third threshold;
a ratio of the number of the first users to a total number of users included in the community is greater than a fourth threshold.
9. A user identification device is characterized by comprising a determining module, a discovering module and an identifying module, wherein,
the determining module is used for determining the correlation among a plurality of users according to the access time of the users to a preset page, wherein the correlation is used for indicating the association relationship among the users;
the discovery module is used for carrying out community discovery among the plurality of users according to the correlation among the plurality of users to obtain at least one community, wherein each community comprises a plurality of users;
the identification module is used for identifying target users in the at least one community according to the user characteristics of the users in the at least one community, and the target users are users with preset behaviors.
10. The apparatus of claim 9, wherein the determining module is specifically configured to:
determining at least one user group in the plurality of users according to the access time of the plurality of users to the preset page, wherein each user group comprises a plurality of users;
respectively acquiring the correlation among the users in each user group according to the access time of the users in each user group to access the preset page;
and respectively carrying out community discovery among a plurality of users in each user group according to the correlation among the users in each user group.
11. The apparatus of claim 10,
if the number of the preset pages is 1, the time difference of every two users in a user group accessing the preset pages is smaller than a first threshold value;
if the number of the preset pages is greater than 1, M time differences exist in N time differences corresponding to every two users in a user group, wherein the M time differences are smaller than a first threshold value, the N time differences corresponding to the two users are respectively time differences of the two users accessing the N preset pages, M is greater than or equal to 1 and is less than or equal to N, and M is an integer.
12. The apparatus according to claim 10 or 11, wherein the discovery module is specifically configured to:
aiming at any user group, determining the edge relation among the users in the user group according to the correlation among the users in the user group;
and according to the edge relation among the users in the user group, carrying out community discovery among a plurality of users in the user group.
13. The apparatus of claim 12, wherein the discovery module is specifically configured to:
if the correlation between a first user and a second user in the user group is larger than a second threshold value, determining that the first user and the second user have an edge relationship;
and if the correlation between the first user and the second user in the user group is smaller than or equal to a second threshold value, determining that no edge relation exists between the first user and the second user.
14. A user identification device, comprising: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executing the computer-executable instructions stored by the memory causes the at least one processor to perform the user identification method of any of claims 1 to 8.
15. A computer-readable storage medium having computer-executable instructions stored thereon, which when executed by a processor, implement the user identification method of any one of claims 1 to 8.
CN202010004113.4A 2020-01-03 2020-01-03 User identification method, device and equipment Pending CN111209512A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010004113.4A CN111209512A (en) 2020-01-03 2020-01-03 User identification method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010004113.4A CN111209512A (en) 2020-01-03 2020-01-03 User identification method, device and equipment

Publications (1)

Publication Number Publication Date
CN111209512A true CN111209512A (en) 2020-05-29

Family

ID=70787112

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010004113.4A Pending CN111209512A (en) 2020-01-03 2020-01-03 User identification method, device and equipment

Country Status (1)

Country Link
CN (1) CN111209512A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105306495A (en) * 2015-11-30 2016-02-03 百度在线网络技术(北京)有限公司 User identification method and device
CN105657659A (en) * 2016-01-29 2016-06-08 北京邮电大学 Method and system for identifying scalping user in taxi service
CN107093090A (en) * 2016-10-25 2017-08-25 北京小度信息科技有限公司 Abnormal user recognition methods and device
CN108304482A (en) * 2017-12-29 2018-07-20 北京城市网邻信息技术有限公司 The recognition methods and device of broker, electronic equipment and readable storage medium storing program for executing
CN109344326A (en) * 2018-09-11 2019-02-15 阿里巴巴集团控股有限公司 A kind of method for digging and device of social circle
CN110222484A (en) * 2019-04-28 2019-09-10 五八有限公司 A kind of method for identifying ID, device, electronic equipment and storage medium
CN110517097A (en) * 2019-09-09 2019-11-29 平安普惠企业管理有限公司 Identify method, apparatus, equipment and the storage medium of abnormal user

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination