CN110399399B - User analysis method, device, electronic equipment and storage medium - Google Patents

User analysis method, device, electronic equipment and storage medium

Publication number
CN110399399B
Authority
CN
China
Prior art keywords
value
population
social network
user
users
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810360821.4A
Other languages
Chinese (zh)
Other versions
CN110399399A (en)
Inventor
莫莉
唐秋香
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Datang Mobile Communications Equipment Co ltd
Original Assignee
Shanghai Datang Mobile Communications Equipment Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Datang Mobile Communications Equipment Co ltd filed Critical Shanghai Datang Mobile Communications Equipment Co ltd
Priority to CN201810360821.4A priority Critical patent/CN110399399B/en
Publication of CN110399399A publication Critical patent/CN110399399A/en
Application granted granted Critical
Publication of CN110399399B publication Critical patent/CN110399399B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01Social networking

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Telephonic Communication Services (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments of the invention provide a user analysis method and device, electronic equipment and a storage medium. The method comprises: obtaining a pre-constructed social network, wherein the social network comprises a plurality of nodes and a plurality of edges; for two users between whom a call record exists, calculating the weight of the edge between the two users according to their total call duration, call frequency and average call duration; determining a first PageRank (PR) value of each user according to the weights of all edges of that user, wherein each first PR value represents the importance of one user in the social network; and analyzing the influence of the social network on society according to the proportion of the floating population, the proportion of the local population, the first PR value of each local-population user and the first PR value of each floating-population user in the social network. By constructing the social network to obtain the first PR value of each user and taking the floating population into account, the method can accurately judge the influence of the social network on society.

Description

User analysis method, device, electronic equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of communication, in particular to a user analysis method, a user analysis device, electronic equipment and a storage medium.
Background
To further promote urban and rural economic development and social stability and to strengthen city management, users can be monitored by analyzing their behavior over the communication network, for example to determine whether a user is suspected of disturbing public order (e.g., through fraud or violence).
In the prior art, user behavior analysis generally comprises acquiring website access records or telephone dialing records, classifying each user according to a pre-constructed decision tree, and thereby identifying specific users.
The website access records and telephone dialing records generated every day are massive, and the behavior characteristics of users change over time; as a result, the accuracy of user analysis and identification in the prior art is low.
Disclosure of Invention
In order to overcome the defects in the prior art, embodiments of the present invention provide a method and apparatus for user analysis, an electronic device, and a storage medium.
In one aspect, an embodiment of the present invention provides a user analysis method, where the method includes:
obtaining a pre-constructed social network, wherein the social network comprises a plurality of nodes and a plurality of edges, each node represents a user, and if a call record exists between two users, an edge exists between the two users;
for two users between whom a call record exists, calculating the weight of the edge between the two users according to their total call duration, call frequency and average call duration;
determining a first PageRank (PR) value of each user according to the weights of all edges of that user, wherein each first PR value represents the importance of one user in the social network;
and analyzing the influence of the social network on society according to the proportion of the floating population, the proportion of the local population, the first PR value of each local-population user and the first PR value of each floating-population user in the social network.
In another aspect, an embodiment of the present invention provides an apparatus for user analysis, where the apparatus includes:
an acquisition module, configured to obtain a pre-constructed social network, wherein the social network comprises a plurality of nodes and a plurality of edges, each node represents a user, and if a call record exists between two users, an edge exists between the two users;
a calculation module, configured to calculate, for two users between whom a call record exists, the weight of the edge between the two users according to their total call duration, call frequency and average call duration;
a determining module, configured to determine a first PageRank (PR) value of each user according to the weights of all edges of that user, wherein each first PR value represents the importance of one user in the social network;
and an analysis module, configured to analyze the influence of the social network on society according to the proportion of the floating population, the proportion of the local population, the first PR value of each local-population user and the first PR value of each floating-population user in the social network.
In another aspect, an embodiment of the present invention further provides an electronic device, which includes a memory, a processor, a bus, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the above method when executing the program.
In another aspect, an embodiment of the present invention further provides a storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the steps of the above method.
According to the above technical solution, in the user analysis method, device, electronic equipment and storage medium provided by the embodiments of the present invention, the first PR value of each user is obtained by constructing the social network, and the influence of the social network on society can be accurately judged by taking the floating population into account.
Drawings
Fig. 1 is a schematic flowchart of a user analysis method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a social network provided by an embodiment of the present invention;
Fig. 3 is a flowchart of a method provided by another embodiment of the present invention;
Fig. 4 is a diagram illustrating an improved PageRank ranking algorithm according to yet another embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a user analysis apparatus according to another embodiment of the present invention;
Fig. 6 is a schematic structural diagram of an electronic device according to yet another embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention.
Fig. 1 is a flowchart illustrating a method for user analysis according to an embodiment of the present invention.
As shown in fig. 1, the method provided in the embodiment of the present invention specifically includes the following steps:
Step 11: obtaining a pre-constructed social network, wherein the social network comprises a plurality of nodes and a plurality of edges, each node represents a user, and if a call record exists between two users, an edge exists between the two users.
Optionally, the method provided in the embodiment of the present invention is implemented on a user analysis device, and the user analysis device may be a computer.
Optionally, the computer collects call ticket data for a period of time, which may be a plurality of call records extracted from S1, where each call record includes the identifiers of the calling and called users, such as the IMSI (International Mobile Subscriber Identity), the call start time, the total call duration, the cities where the calling and called users are located, the home locations of the calling and called users, and the like. The period of time can be adjusted according to the actual situation and may be, for example, one week.
Optionally, the collected call ticket data is preprocessed, and classification statistics yield a plurality of call ticket records, where each call ticket record comprises the identifiers of the calling and called users, the total weekly call duration between them, the call frequency and the average call duration.
Here, average call duration = total call duration / call frequency.
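The preprocessing described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record layout `(caller, callee, duration_seconds)`, the function name `summarize_calls` and the field names `t_total`, `t_times` and `t_average` are assumptions introduced here.

```python
from collections import defaultdict


def summarize_calls(call_records):
    """Aggregate raw calls into one ticket record per (caller, callee) pair.

    call_records: iterable of (caller_id, callee_id, duration_seconds).
    The tuple layout and the output field names are illustrative assumptions.
    """
    totals = defaultdict(lambda: [0.0, 0])  # pair -> [total seconds, call count]
    for caller, callee, duration_s in call_records:
        entry = totals[(caller, callee)]
        entry[0] += duration_s
        entry[1] += 1
    return {
        pair: {
            "t_total": total,              # total call duration
            "t_times": count,              # call frequency
            "t_average": total / count,    # average = total / frequency
        }
        for pair, (total, count) in totals.items()
    }


calls = [("A", "B", 60), ("A", "B", 120), ("B", "A", 30)]
tickets = summarize_calls(calls)
```

Keeping the direction of each call in the key matches the two-edge representation shown in Fig. 2, where A calling B and B calling A are separate edges.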
Optionally, the computer extracts user information from the call ticket data, including the identifier of the calling and called users, the call start time, the location city of the calling and called users, the home of the calling and called users, and the like.
Optionally, a social network is constructed according to the ticket records, and the social network can reflect the call relations among a plurality of users.
Optionally, each user in the ticket record is abstracted to a node in the social network, and a call between the user and the user is abstracted to an edge in the social network.
Fig. 2 is a schematic diagram of a social network according to an embodiment of the present invention.
As shown in Fig. 2, A and B are two different users with two edges between them: the upper edge indicates A calling B, and the lower edge indicates B calling A.
Optionally, the social network may be stored in the form of a matrix, denoted PR = (P, R).
Where P represents the set of all nodes in the social network and R represents the set of all edges in the social network.
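A minimal sketch of the node set P and edge set R, assuming each ticket record is keyed by a directed `(caller, callee)` pair. The function name `build_social_network` is hypothetical, and plain Python sets stand in for the matrix storage mentioned above.

```python
def build_social_network(call_pairs):
    """Return (P, R): the set of users and the set of directed call edges.

    call_pairs: iterable of (caller_id, callee_id); an assumed input layout.
    """
    nodes = set()
    edges = set()
    for caller, callee in call_pairs:
        nodes.update((caller, callee))   # every party to a call is a node
        edges.add((caller, callee))      # one directed edge per calling direction
    return nodes, edges


P, R = build_social_network([("A", "B"), ("B", "A"), ("A", "C")])
```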
Step 12: for two users between whom a call record exists, calculating the weight of the edge between the two users according to their total call duration, call frequency and average call duration.
Optionally, each edge in R of the social network carries a weight. Each user in the social network is assigned a unique number corresponding to a node in the network. For any two users i and j between whom a contact record exists, the closeness of the connection between i and j, i.e. the weight of the edge, is calculated with the total call duration T_total, the call frequency T_times and the average call duration T_average as parameters, and the weight is recorded as the k value.
Optionally, there are several ways to calculate the k value; for example, a logistic regression model may be used.
Step 13: determining a first PageRank (PR) value of each user according to the weights of all edges of that user, wherein each first PR value represents the importance of one user in the social network.
Optionally, let a user be p with n adjacent users; for user p and each adjacent user p_i, the total call duration T_total, the call frequency T_times and the average call duration T_average are counted.
Optionally, a first PR (PageRank) value of each user in the social network is obtained.
In the prior art, the PR value of a page is its "number of votes received", and a hyperlink to a page is equivalent to casting a vote for that page.
In the embodiment of the present invention, the first PR value of a user is the "number of votes received" by that user; it is determined by the users who have call records with that user, and the closeness of the call connection between two users is described by the k value. The first PR value indicates the importance of a user in the social network: the higher the first PR value, the more important and influential the user is in the social network.
Optionally, the first PR value for a user is equal to the sum of the k values of all edges of the user.
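The summation above can be sketched as follows. The text does not say whether "all edges of the user" covers only outgoing calls or both directions; this sketch assumes both, and the function name `first_pr_values` and the sample k values are illustrative.

```python
from collections import defaultdict


def first_pr_values(edge_weights):
    """Sum the k values of every edge touching each user.

    edge_weights: dict mapping (caller_id, callee_id) -> k.
    Assumption: an edge counts toward both of its endpoints.
    """
    pr = defaultdict(float)
    for (caller, callee), k in edge_weights.items():
        pr[caller] += k
        pr[callee] += k
    return dict(pr)


weights = {("A", "B"): 0.5, ("B", "A"): 0.25, ("A", "C"): 0.125}
pr0 = first_pr_values(weights)
```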
Step 14: analyzing the influence of the social network on society according to the proportion of the floating population, the proportion of the local population, the first PR value of each local-population user and the first PR value of each floating-population user in the social network.
Optionally, according to the pre-extracted user information, users whose home location is non-local and whose current city is local are taken as the floating population; the proportion of the floating population is determined and recorded as p_f, and the first PR value of each floating-population user is recorded as r_f.
Optionally, according to the pre-extracted user information, users whose home location is local and whose current city is local are taken as the local population; the proportion of the local population is determined and recorded as p_n, and the first PR value of each local-population user is recorded as r_n.
Optionally, the influence of a social network on society is judged comprehensively according to the proportion p_f of the floating population, the proportion p_n of the local population, the first PR value r_n of each local-population user and the first PR value r_f of each floating-population user.
Optionally, the higher the proportion p_f of the floating population and the higher the first PR values of the floating population, the more the influential users in the social network are floating population and the more the influenced users are also floating population; the social network is then considered to have a negative influence on society, and a relatively large one.
Optionally, the higher the proportion p_n of the local population and the higher the first PR values of the floating population, the more influential the floating population is in the social network while most of the influenced users are local population; the social network is then considered to have a negative influence on society.
Optionally, the higher the proportion p_f of the floating population and the higher the first PR values of the local population, the more influential the local population is in the social network while most of the influenced users are floating population; the social network is then considered to have a positive influence on society.
Optionally, the higher the proportion p_n of the local population and the higher the first PR values of the local population, the more influential the local population is in the social network while most of the influenced users are local population; the social network is then considered to have a positive influence on society.
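The population split and the two proportions used in step 14 can be sketched as follows. The field names `home_city` and `current_city`, the function name `split_population` and the choice of all users in the network as the denominator are assumptions; the text does not state the denominator explicitly.

```python
def split_population(users, local_city):
    """Classify users as floating or local and compute p_f and p_n.

    users: dict mapping user_id -> {"home_city": ..., "current_city": ...}.
    Assumption: proportions are taken over all users in the network.
    """
    if not users:
        raise ValueError("the social network has no users")
    floating, local = [], []
    for user_id, info in users.items():
        if info["current_city"] != local_city:
            continue  # not currently in the local city: belongs to neither group
        if info["home_city"] == local_city:
            local.append(user_id)      # home local, currently local
        else:
            floating.append(user_id)   # home elsewhere, currently local
    total = len(users)
    return {
        "floating": floating,
        "local": local,
        "p_f": len(floating) / total,
        "p_n": len(local) / total,
    }


users = {
    "u1": {"home_city": "X", "current_city": "L"},
    "u2": {"home_city": "L", "current_city": "L"},
    "u3": {"home_city": "L", "current_city": "L"},
    "u4": {"home_city": "L", "current_city": "X"},
}
result = split_population(users, "L")
```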
If the analysis result shows that the social network has a negative influence on society, the users in the social network can be placed under focused monitoring.
According to the user analysis method provided by the embodiment of the invention, the first PR value of each user is obtained by constructing the social network, and the influence of the social network on the society can be accurately judged by combining the factors of the floating population.
On the basis of the foregoing embodiment, in the user analysis method provided in another embodiment of the present invention, the step of calculating, for two users between whom a call record exists, the weight of the edge between the two users according to their total call duration, call frequency and average call duration specifically uses the following formula:
[Formula image BDA0001635926620000051, not reproduced in the text]
where k is the weight; β_0, β_1, β_2 and β_3 are factor weight coefficients; T_total is the total call duration of the two users; T_times is the call frequency of the two users; T_average is the average call duration of the two users; and the value range of k is (0, 1).
There are various ways to determine the weight, and one of the ways is taken as an example for the embodiment of the present invention.
Optionally, the weight k is calculated using the following logistic regression model:
[Formula image BDA0001635926620000061, not reproduced in the text]
Optionally, the factor weight coefficients β_0, β_1, β_2 and β_3 take different values, which are determined according to the actual situation.
For example, the longer the total call duration T_total between two users, the closer their relationship; T_total can therefore be considered important in describing how closely two users are connected, in which case β_1 is relatively large.
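Because the formula image is not available in this text, the following sketch assumes the standard logistic (sigmoid) form, k = 1 / (1 + exp(-(β_0 + β_1·T_total + β_2·T_times + β_3·T_average))), which is consistent with the four coefficients and the stated range (0, 1) but is not confirmed by the source. The function name `edge_weight` and the coefficient values are hypothetical.

```python
import math


def edge_weight(t_total, t_times, t_average, betas):
    """Logistic edge weight k in (0, 1).

    Assumes the standard sigmoid form; the patent's own formula image
    is not reproduced in the text, so this is an illustrative stand-in.
    """
    b0, b1, b2, b3 = betas
    z = b0 + b1 * t_total + b2 * t_times + b3 * t_average
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients chosen only for illustration.
betas = (-2.0, 0.01, 0.1, 0.005)
k_example = edge_weight(180, 2, 90, betas)
k_longer = edge_weight(300, 2, 150, betas)
```

With a positive β_1, a longer total call duration produces a larger k, which matches the example given above for T_total.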
Other steps of the embodiment of the present invention are similar to those of the previous embodiment, and are not described again in the embodiment of the present invention.
According to the user analysis method provided by the embodiment of the invention, the weight of each edge can be accurately calculated by adopting the formula.
On the basis of the above embodiments, in the user analysis method provided in another embodiment of the present invention, the step of analyzing the influence of the social network on society according to the proportion of the floating population, the proportion of the local population, the first PR value of each local-population user and the first PR value of each floating-population user in the social network specifically comprises:
analyzing the influence of the social network on society according to the proportion of the floating population, the proportion of the local population, the second PR value of each local-population user and the second PR value of each floating-population user in the social network, wherein the second PR value is obtained by correcting the first PR value.
There are various ways to determine the social influence of the social network, and one of the ways is taken as an example in the embodiment of the present invention.
In practical applications, the PR value of each user fluctuates. The first PR value is obtained first and used as the initial PR value; correcting the initial PR value yields a more stable second PR value, which better describes the importance of each user in the social network.
Optionally, there are various modifications, and one of the modifications is described as an example in the embodiments of the present invention.
After the step of obtaining the first page rank PR value of each user according to the weight of all the edges of each user, the method further includes:
and correcting the first PR value according to a pre-acquired probability matrix to obtain a second PR value, wherein the probability matrix comprises a plurality of elements, and each element corresponds to the weight of each edge.
Optionally, knowing the weight of each edge, a probability matrix corresponding to the social network may be calculated, where the probability matrix includes a plurality of elements, and the total number of the elements is the same as the total number of the edges.
Optionally, the elements of the probability matrix are derived from the weights of the edges of each user.
Optionally, the weight of each edge of a user is calculated from three parameters: the total call duration T_total, the call frequency T_times and the average call duration T_average. In practical applications these three parameters fluctuate, whereas the elements of the probability matrix are more stable.
Optionally, the first PR value is taken as the initial PR value and denoted PR_0. PR_0 may be represented as a matrix, PR_0 = (k_ID1 … k_IDn)^T, where k_ID1 is the k value of the first edge of a user and n is the number of edges.
Optionally, the initial PR value is corrected according to the calculated probability matrix M to obtain a second PR value, denoted PR_1.
Optionally, it is calculated by the following formula: PR_1 = M^T * PR_0
Optionally, multiplying PR_0 by M^T yields a matrix, which is taken as the final second PR value; compared with the first PR value PR_0, the second PR value is more convergent and more stable.
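The correction PR_1 = M^T * PR_0 can be sketched as follows. The text does not specify how M is built from the edge weights; this sketch assumes M is the user-by-user weight matrix with each row normalized to sum to 1, and treats PR_0 as one value per user. The function names and sample values are hypothetical.

```python
def probability_matrix(users, edge_weights):
    """Build a row-normalized matrix M from directed edge weights.

    Assumption: M[i][j] is the weight of edge i -> j divided by the sum of
    weights leaving i; the source only says the elements come from the weights.
    """
    index = {user: i for i, user in enumerate(users)}
    n = len(users)
    m = [[0.0] * n for _ in range(n)]
    for (src, dst), k in edge_weights.items():
        m[index[src]][index[dst]] = k
    for row in m:
        row_sum = sum(row)
        if row_sum > 0:
            for j in range(n):
                row[j] /= row_sum
    return m


def correct_pr(m, pr0):
    """Return PR_1 = M^T * PR_0, i.e. PR_1[j] = sum_i M[i][j] * PR_0[i]."""
    n = len(pr0)
    return [sum(m[i][j] * pr0[i] for i in range(n)) for j in range(n)]


users = ["A", "B", "C"]
weights = {("A", "B"): 0.5, ("A", "C"): 0.5, ("B", "A"): 0.25, ("C", "A"): 0.25}
M = probability_matrix(users, weights)
pr0 = [1.5, 0.75, 0.75]   # illustrative first PR values for A, B, C
pr1 = correct_pr(M, pr0)
```

Under this row-normalization assumption the total of the PR values is preserved by the correction whenever every user has at least one outgoing edge.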
Other steps of the embodiment of the present invention are similar to those of the previous embodiment, and are not described again in the embodiment of the present invention.
According to the user analysis method provided by the embodiment of the invention, the first PR value is corrected through the probability matrix, so that an accurate second PR value can be obtained for subsequent accurate analysis of the social network.
On the basis of the foregoing embodiment, in the user analysis method provided in another embodiment of the present invention, the step of analyzing the influence of the social network on society according to the proportion of the floating population, the proportion of the local population, the first PR value of each local-population user and the first PR value of each floating-population user in the social network is specifically:
calculating the influence coefficient ρ_i by the following formula:
[Formula image BDA0001635926620000071, not reproduced in the text]
where φ(p_f), χ(r_f), η(p_n) and Δ(r_n) are functions of the floating-population proportion p_f, the floating-population PR value r_f, the local-population proportion p_n and the local-population PR value r_n, respectively;
if the influence coefficient ρ_i is greater than a preset first threshold, the social network will increase social mobility.
There are various ways to analyze the social influence of the social network, and one of the ways is taken as an example in the embodiment of the present invention.
Optionally, φ(p_f), χ(r_f), η(p_n) and Δ(r_n) may be the same function. The function φ(p_f) describes the fluctuation of the proportion of the floating population in a social network. The proportion p_f is a fixed value within one statistical period, but the number of floating-population users in a social network changes in real time across multiple statistical periods, so the proportion of the floating population needs to be tracked and corrected in order to describe it more accurately.
Accordingly, η(p_n) describes the proportion of the local population within a social network more accurately.
Optionally, the function χ(r_f) describes the fluctuation of the ranking r_f of each floating-population user in a social network. The ranking r_f of each user is a fixed value within one statistical period, but rankings within a social network change in real time across multiple statistical periods, so the ranking r_f of the floating population needs to be tracked and corrected in order to describe it more accurately.
Accordingly, Δ(r_n) describes the ranking of the local population within a social network more accurately.
Optionally, the functions φ(p_f), χ(r_f), η(p_n) and Δ(r_n) are of the same type, the type being an exponential decay function, a linear function or a normalization function.
That is, any one of the following treatments is applied to the floating-population proportion p_f: exponential decay simulation, linear transformation or normalization, yielding the function φ(p_f). The same treatment is applied to r_f, p_n and r_n to obtain χ(r_f), η(p_n) and Δ(r_n), respectively.
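The three function types named above can be illustrated with the following sketch. The concrete forms, parameter names and default values are assumptions; the source names only the types (exponential decay, linear, normalization) and does not give their parameters, and min-max scaling is just one possible reading of "normalization".

```python
import math


def exponential_decay(x, rate=1.0):
    """Exponential decay exp(-rate * x); `rate` is a hypothetical parameter."""
    return math.exp(-rate * x)


def linear_transform(x, slope=1.0, intercept=0.0):
    """Linear function slope * x + intercept; both parameters are hypothetical."""
    return slope * x + intercept


def min_max_normalize(x, lower, upper):
    """Min-max normalization of x onto [0, 1] for lower <= x <= upper."""
    if upper == lower:
        raise ValueError("upper and lower bounds must differ")
    return (x - lower) / (upper - lower)


# The same type would be applied to p_f, r_f, p_n and r_n alike.
phi_pf = linear_transform(0.4, slope=2.0, intercept=0.1)
```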
Optionally, the first threshold may be set according to practical situations, and may be, for example, 0.6.
Optionally, multiple thresholds may be set to more accurately analyze the social impact of the social network.
If the influence coefficient ρ_i is greater than the first threshold (0.6), the social network has a large impact on social mobility and increases it considerably.
If the influence coefficient ρ_i is greater than another threshold (0.3) and less than the first threshold, the social network will increase social mobility, but its impact is not significant.
If the influence coefficient ρ_i is less than 0.3, the social network does not increase social mobility.
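The three bands above can be sketched as a simple classifier. The example thresholds 0.6 and 0.3 come from the text; the function name `mobility_band`, the returned labels and the handling of values exactly equal to a threshold (assigned to the lower band here) are assumptions, since the text leaves the boundaries unspecified.

```python
def mobility_band(rho, first_threshold=0.6, lower_threshold=0.3):
    """Map an influence coefficient to one of the three bands in the text.

    Assumption: a value exactly equal to a threshold falls in the lower band.
    """
    if rho > first_threshold:
        return "large increase"
    if rho > lower_threshold:
        return "minor increase"
    return "no increase"


bands = [mobility_band(value) for value in (0.7, 0.45, 0.1)]
```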
Other steps of the embodiment of the present invention are similar to those of the previous embodiment, and are not described again in the embodiment of the present invention.
According to the user analysis method provided by the embodiment of the invention, the influence of the social network on the society can be accurately analyzed by calculating the influence coefficient.
On the basis of the foregoing embodiment, in the user analysis method provided in another embodiment of the present invention, after the step in which an influence coefficient ρ_i greater than the preset first threshold indicates that the social network will increase social mobility, the method further comprises:
calculating the safety factor ω_team by the following formula:
[Formula image BDA0001635926620000081, not reproduced in the text]
where the function of p_f [symbol image BDA0001635926620000082, not reproduced in the text], θ(r_f), γ(p_n) and δ(r_n) are functions of the floating-population proportion p_f, the floating-population PR value r_f, the local-population proportion p_n and the local-population PR value r_n, respectively;
if the safety factor ω_team is greater than a preset second threshold, placing the social network under focused monitoring.
Optionally, if the influence coefficient ρ_i is greater than the preset threshold, it can be preliminarily determined that the social network will increase social mobility, and the safety factor ω_team can be further calculated; the safety factor is used to judge whether the social network threatens public order.
Optionally, the function of p_f [symbol image BDA0001635926620000091, not reproduced in the text], θ(r_f), γ(p_n) and δ(r_n) may be the same function. The function of p_f [symbol image BDA0001635926620000092] describes the fluctuation of the proportion of the floating population in a social network. The proportion p_f is a fixed value within one statistical period, but the number of floating-population users in a social network changes in real time across multiple statistical periods, so the proportion of the floating population needs to be tracked and corrected in order to describe it more accurately.
Accordingly, γ(p_n) describes the proportion of the local population within a social network more accurately.
Optionally, the function θ(r_f) describes the fluctuation of the ranking r_f of each floating-population user in a social network. The ranking r_f of each user is a fixed value within one statistical period, but rankings within a social network change in real time across multiple statistical periods, so the ranking r_f of the floating population needs to be tracked and corrected in order to describe it more accurately.
Accordingly, δ(r_n) describes the ranking of the local population within a social network more accurately.
Optionally, the function of p_f [symbol image BDA0001635926620000093], θ(r_f), γ(p_n) and δ(r_n) are of the same type, the type being an exponential decay function, a linear function or a normalization function.
That is, any one of the following treatments is applied to the floating-population proportion p_f: exponential decay simulation, linear transformation or normalization, yielding the function of p_f [symbol image BDA0001635926620000094]. The same treatment is applied to r_f, p_n and r_n to obtain θ(r_f), γ(p_n) and δ(r_n), respectively.
Optionally, the second threshold may be set according to practical situations, and may be, for example, 0.6.
Optionally, multiple thresholds may be set to more accurately analyze the social impact of the social network.
If the safety factor ω_team is greater than the second threshold, the social network poses a large threat to public order and needs to be placed under focused monitoring.
The floating-population user with the highest first PR value can be identified, and operator information can be queried to obtain the identity information of that user, such as an identification number.
Optionally, the identity information of the user is queried in the public security system; if the user is shown to have a criminal record, the security of the social network is considered extremely low, and the user and the social network should be placed under focused monitoring.
If the safety factor ω_team is greater than another threshold (0.3) and less than the second threshold, the social network does not pose a significant threat to public order.
If the safety factor ω_team is less than 0.3, the social network does not pose a threat to public order.
Other steps of the embodiment of the present invention are similar to those of the previous embodiment, and are not described again in the embodiment of the present invention.
According to the user analysis method provided by the embodiment of the invention, whether the social network should be placed under focused monitoring is judged by calculating the safety factor.
In order to more fully understand the technical content of the present invention, the method for user analysis provided by the embodiment of the present invention is explained in detail on the basis of the above embodiment.
The embodiment of the invention provides a floating population detection system based on social analysis, which comprises three modules: the data acquisition module, the data processing module and the data display module mainly work as follows:
the data acquisition module acquires multi-dimensional user data, including network data, call ticket data, position data, attribute data, scene data and the like;
the data processing module calculates and analyzes the acquired data and builds models from it using methods such as a positioning algorithm and a social analysis algorithm;
the data display module presents the results on the platform according to the specific application scenario and outputs them according to certain rules, mainly in the form of graphics, tables, maps and the like.
Fig. 3 is a flowchart of a method according to another embodiment of the present invention.
As shown in fig. 3, the method comprises the following specific steps:
Step one, collecting and processing call ticket data
1. Collecting user call ticket data;
2. Preprocessing the user call ticket data: extracting the calling IMSI (International Mobile Subscriber Identity), the called IMSI, the total call duration (counted weekly), the call frequency and the average call duration, and generating call ticket records as shown in Table 1 below, where average call duration = total call duration / call frequency:
TABLE 1
Figure BDA0001635926620000101
Figure BDA0001635926620000111
3. Extracting user information from the user call ticket data, including the time, the IMSI, the number of the province where the user is currently located, the number of the city where the user is currently located, the user's home province number, the user's home city number, and the longitude and latitude, as shown in Table 2 below:
TABLE 2
Figure BDA0001635926620000112
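The preprocessing in step one can be sketched as an aggregation over raw call records. The input layout (caller IMSI, called IMSI, duration in seconds) and the output field names are assumptions made for illustration; the actual columns are those of Table 1.

```python
from collections import defaultdict


def aggregate_calls(records):
    """Aggregate raw calls into one record per (caller, callee) pair.

    records: iterable of (caller_imsi, callee_imsi, duration_seconds),
    assumed to be already restricted to one statistics window (e.g. a week).
    """
    totals = defaultdict(lambda: [0.0, 0])  # pair -> [total duration, call count]
    for caller, callee, duration in records:
        entry = totals[(caller, callee)]
        entry[0] += duration
        entry[1] += 1
    return {
        pair: {"total": total, "times": times, "average": total / times}
        for pair, (total, times) in totals.items()
    }
```

The pair key is ordered, so calls from A to B and from B to A are kept apart, which matches the directed edges used later.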
Step two, constructing a social network graph model
1. Abstracting each person in the call ticket records into a node in a network graph, and abstracting the contact between people into an edge in the network graph (see Fig. 2);
2. Storing the social network graph model in matrix form using the call ticket data extracted in step one. The model can be represented as PR(P, R), where P is the set of all nodes in the graph and R is the set of all edges in the graph; each edge in R has a weight and a direction, and the edge weights are stored in a separate vector list.
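A minimal sketch of the storage scheme described in step two: a node set P, an adjacency matrix for the directed edges R, and a separate list holding the edge weights. The function name and the input format (a mapping from directed pairs to weights) are assumptions.

```python
def build_graph(weighted_edges):
    """Build (nodes, adjacency matrix, weight list) from {(src, dst): weight}."""
    nodes = sorted({node for edge in weighted_edges for node in edge})
    index = {node: i for i, node in enumerate(nodes)}
    size = len(nodes)
    adjacency = [[0] * size for _ in range(size)]
    weights = []  # edge weights kept apart from the adjacency matrix
    for (src, dst), weight in weighted_edges.items():
        adjacency[index[src]][index[dst]] = 1  # directed edge src -> dst
        weights.append((src, dst, weight))
    return nodes, adjacency, weights
```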
Step three, grouping communities
1. According to specific application requirements, several key monitored persons can be designated, or the analysis can cover all persons;
2. Starting from any user as the base reference point, the reference vertex and all of its adjacent nodes form one community; this is repeated until the community division of the network is complete;
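The grouping rule in step three (a reference vertex plus all of its adjacent nodes) can be sketched as below. Two assumptions are made: adjacency ignores edge direction for the purpose of grouping, and communities of different reference users may overlap.

```python
def group_communities(edges, reference_users=None):
    """Return {reference user: community}, a community being the user and its neighbors.

    edges: iterable of (src, dst) pairs.
    reference_users: designated key persons, or None to cover every user.
    """
    neighbors = {}
    for src, dst in edges:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)
    if reference_users is None:
        reference_users = sorted(neighbors)
    return {ref: {ref} | neighbors.get(ref, set()) for ref in reference_users}
```

Passing a list of designated persons covers the first option in the list above, and passing nothing covers the second.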
Step four, applying an improved PageRank ranking algorithm
Fig. 4 is a schematic diagram of an improved PageRank algorithm according to yet another embodiment of the present invention.
As shown in FIG. 4, each person in the data is assigned a unique serial number ID, which corresponds to a node in the network. If a contact record exists between any two persons i and j, the link weight between i and j, namely the weight of each edge in R, is calculated with the total call duration T_total, the call frequency T_times and the average call duration T_average as parameters;
using a logistic regression model:
Figure BDA0001635926620000121
the weight value k depends on the total call duration T_total, the call frequency T_times and the average call duration T_average, where the values of the coefficients β0, β1, β2, β3 are determined according to the specific application scenario, and the output value k always lies in (0, 1);
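As an illustration of the edge weight, the sketch below assumes the standard logistic form k = 1 / (1 + exp(-(β0 + β1·T_total + β2·T_times + β3·T_average))). The exact expression appears only in the formula image, so this form and the default coefficient values are assumptions.

```python
import math


def edge_weight(t_total, t_times, t_average, beta=(0.0, 0.01, 0.1, 0.01)):
    """Logistic edge weight; the default beta values are hypothetical placeholders."""
    b0, b1, b2, b3 = beta
    z = b0 + b1 * t_total + b2 * t_times + b3 * t_average
    # Numerically stable sigmoid: avoids overflow for large negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

With all coefficients set to zero the weight is 0.5, and longer or more frequent contact pushes it toward 1 when the coefficients are positive.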
after the weight value of each edge is known, a probability matrix M can be calculated, taking the social network diagram in step 2 as an example:
Figure BDA0001635926620000122
For each user p, extract over all of its adjacent users p_i (n adjacent users in total) the total call duration T_total, the call frequency T_times and the average call duration T_average:
Figure BDA0001635926620000123
Figure BDA0001635926620000124
T_average = T_total(p) / T_times(p)
Calculating the k value of each user p with the logistic regression model as the initial PR value, expressed as the matrix PR_0 = (k_ID1 … k_IDn)^T. From the calculated probability matrix M and the initial PR value, the next iterate is computed as PR_1 = M^T * PR_0; repeatedly multiplying by M^T converges, which gives the final PR matrix and hence the PR value (namely the PR ranking) of each community user.
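The iteration PR_1 = M^T * PR_0, repeated until convergence, can be sketched in plain Python. The probability matrix itself is shown only as an image, so building M by normalizing each row of edge weights to sum to 1 is an assumption, as are the tolerance and the iteration cap. The text mentions no damping factor, and none is used here.

```python
def probability_matrix(weights):
    """Row-normalize a weight matrix; weights[i][j] is the weight of edge i -> j."""
    matrix = []
    for row in weights:
        total = sum(row)
        matrix.append([w / total if total else 0.0 for w in row])
    return matrix


def iterate_pr(matrix, pr0, tol=1e-9, max_iter=1000):
    """Repeatedly compute PR <- M^T * PR until successive iterates agree within tol."""
    n = len(pr0)
    pr = list(pr0)
    for _ in range(max_iter):
        # Component j of M^T * PR is the sum over i of M[i][j] * PR[i].
        nxt = [sum(matrix[i][j] * pr[i] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pr)) < tol:
            return nxt
        pr = nxt
    return pr
```

Without damping, the iteration may fail to settle on graphs whose structure is periodic, which is why the sketch caps the number of iterations.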
Step five, outputting the gathering safety and the population mobility of the floating population
Using the user information extracted in step one together with the PR ranking result output in step four, calculate for each community the proportion p_f and rank r_f of the floating population (home location elsewhere, positioning result local) and the proportion p_n and rank r_n of the local population (home location local, positioning result local).
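The four community indicators can be sketched as follows. The record fields (home_city, current_city, pr) are hypothetical names. Summarizing r_f and r_n as the mean PR value of each group is also an assumption, since the text does not state how the rank of a group is aggregated.

```python
def community_indicators(members, local_city):
    """Compute p_f, r_f, p_n, r_n for one community.

    members: list of dicts with 'home_city', 'current_city' and 'pr' keys.
    """
    floating = [m for m in members
                if m["current_city"] == local_city and m["home_city"] != local_city]
    local = [m for m in members
             if m["current_city"] == local_city and m["home_city"] == local_city]
    total = len(members)

    def mean_pr(group):
        return sum(m["pr"] for m in group) / len(group) if group else 0.0

    return {
        "p_f": len(floating) / total if total else 0.0,
        "r_f": mean_pr(floating),
        "p_n": len(local) / total if total else 0.0,
        "r_n": mean_pr(local),
    }
```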
The specific form of the function model of the above 6 indexes can be determined according to the specific application scenario and user experience, so as to judge the floating population aggregation safety and the population mobility. For example, the floating population aggregation safety of each community is represented by
Figure BDA0001635926620000131
and the population mobility is represented by
Figure BDA0001635926620000132
where the safety factor ω_team ∈ (0, 1) and the flow coefficient ρ_i ∈ (0, 1);
Figure BDA0001635926620000133
θ(r_f), γ(p_n), δ(r_n), φ(p_f), χ(r_f), η(p_n), Δ(r_n) are, respectively, functions of the floating population ratio p_f, the floating population rank r_f, the local population ratio p_n and the local population rank r_n; the specific form of each function is determined according to the specific application scenario and the experience of the user.
The function model of the safety factor is given in Table 3 below:
TABLE 3
Figure BDA0001635926620000134
The functional model of the flow coefficient is given in table 4 below:
TABLE 4
Population mobility | Function model
High | ρ_i > 0.6
Medium | 0.3 < ρ_i ≤ 0.6
Low | ρ_i ≤ 0.3
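The tiers of Table 4 translate directly into a threshold check. The sketch below reads the three row labels as high, medium and low. The function name is hypothetical.

```python
def classify_mobility(rho_i):
    """Classify population mobility from the flow coefficient, per Table 4."""
    if rho_i > 0.6:
        return "high"
    if rho_i > 0.3:
        return "medium"  # 0.3 < rho_i <= 0.6
    return "low"         # rho_i <= 0.3
```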
The improved PageRank ranking algorithm in the embodiment of the invention proceeds as follows: first, each user is regarded as a node, and the weight of the directed connection between nodes expresses the degree of influence indicated by call duration and call frequency; then, any user or a specific user is taken in turn as a reference vertex, the reference vertex and all of its adjacent nodes form the community of the reference user, and so on until all user communities are obtained; finally, by applying the PageRank ranking algorithm and repeatedly multiplying by the probability matrix, the converged result gives the PR value and intra-community ranking of the users in each user community;
Analysis of population mobility: after the user ranking of each user community is obtained, the mobility and aggregation safety of the reference user are judged by combining the home locations and real-time positions of the reference user and of the users in the community.
The embodiment of the invention can find potential related people from a given crowd social circle, can be combined with public security service, is applied to vehicle management, can also provide effective support for management and monitoring of urban floating population, has good performance, considers the requirements of real scenes, analyzes in a range specified by a user, and has good expansibility.
Fig. 5 is a schematic structural diagram of an apparatus for user analysis according to yet another embodiment of the present invention.
Referring to fig. 5, on the basis of the above embodiment, the apparatus for user analysis provided in the embodiment of the present invention includes an obtaining module 51, a calculating module 52, a determining module 53, and an analyzing module 54, where:
the obtaining module 51 is configured to obtain a pre-constructed social network, where the social network includes multiple nodes and multiple edges, each node represents one user, and if a call record exists between two users, there is one edge between the two users; the calculating module 52 is configured to calculate, for two users with call records, weights of edges of the two users according to total call duration, call frequency, and average call duration of the two users; the determining module 53 is configured to determine a first webpage ranking PR value of each user according to the weight of all the edges of each user, where each first PR value indicates the importance degree of one user in the social network; the analysis module 54 is configured to analyze the social network influence on the society according to the percentage of the floating population in the social network, the percentage of the local population, the first PR value of each local population, and the first PR value of each floating population.
The user analysis apparatus provided in the embodiment of the present invention may be used to execute the method in the foregoing method embodiment, and details of this implementation are not described again.
According to the user analysis device provided by the embodiment of the invention, the social network is constructed through the acquisition module, the first PR value of each user is obtained through the calculation module, and the influence of one social network on the society can be accurately judged by combining the factors of the floating population through the analysis module.
Fig. 6 is a schematic structural diagram of an electronic device according to yet another embodiment of the present invention.
Referring to fig. 6, an electronic device provided by the embodiment of the present invention includes a memory (memory)61, a processor (processor)62, a bus 63, and a computer program stored in the memory 61 and running on the processor. The memory 61 and the processor 62 complete communication with each other through the bus 63.
The processor 62 is used to call the program instructions in the memory 61 to implement the method of fig. 1 when executing the program.
In another embodiment, the processor, when executing the program, implements the method of:
for two users with call records, the step of calculating the weight of the edges of the two users according to the total call duration, the call frequency and the average call duration of the two users is specifically as follows:
Figure BDA0001635926620000151
where k is the weight; β0, β1, β2 and β3 are factor weight coefficients; T_total is the total call duration of the two users; T_times is the call frequency of the two users; T_average is the average call duration of the two users; and k takes values in (0, 1).
In another embodiment, the processor, when executing the program, implements the method of:
according to the occupation ratio of the floating population, the occupation ratio of the local population, the first PR value of each local population and the first PR value of each floating population in the social network, the step of analyzing the influence of the social network on the society specifically comprises the following steps:
and analyzing the influence of the social network on the society according to the occupation ratio of the floating population, the occupation ratio of the local population, the second PR value of each local population and the second PR value of each floating population in the social network, wherein the second PR value is obtained by correcting the first PR value.
In another embodiment, the processor, when executing the program, implements the method of:
after the step of obtaining the first page rank PR value of each user according to the weight of all the edges of each user, the method further includes:
and correcting the first PR value according to a pre-acquired probability matrix to obtain a second PR value, wherein the probability matrix comprises a plurality of elements, and each element corresponds to the weight of each edge.
In another embodiment, the processor, when executing the program, implements the method of: according to the occupation ratio of the floating population, the occupation ratio of the local population, the first PR value of each local population and the first PR value of each floating population in the social network, the step of analyzing the influence of the social network on the society specifically comprises the following steps:
calculating the influence coefficient ρ_i by the following formula:
Figure BDA0001635926620000152
wherein φ(p_f), χ(r_f), η(p_n), Δ(r_n) are, respectively, functions of the floating population ratio p_f, the PR value r_f of the floating population, the local population ratio p_n, and the PR value r_n of the local population;
if the influence coefficient ρ_i is greater than a preset first threshold, this indicates that the social network will increase social mobility.
In another embodiment, the processor, when executing the program, implements the method of:
the functions φ(p_f), χ(r_f), η(p_n) and Δ(r_n) are functions of the same type, and the type of the function is an exponential decay function, a linear function or a normalization function.
In another embodiment, the processor, when executing the program, implements the method of:
after the step of determining that the social network will increase social mobility if the influence coefficient ρ_i is greater than the preset first threshold, the method further comprises:
calculating the safety factor ω_team by the following formula:
Figure BDA0001635926620000161
wherein
Figure BDA0001635926620000162
θ(r_f), γ(p_n), δ(r_n) are, respectively, functions of the floating population ratio p_f, the PR value r_f of the floating population, the local population ratio p_n, and the PR value r_n of the local population;
if the safety factor ω_team is greater than a preset second threshold, performing key monitoring on the social network.
The electronic device provided in the embodiment of the present invention may be configured to execute a program corresponding to the method in the foregoing method embodiment, and details of this implementation are not described again.
According to the electronic device provided by the embodiment of the invention, when the processor executes the program, the first PR value of each user is obtained by constructing the social network, and the influence of the social network on the society can be accurately judged by combining the factors of the floating population.
A further embodiment of the invention provides a storage medium having a computer program stored thereon, which when executed by a processor performs the steps of fig. 1.
In another embodiment, the program when executed by a processor implements a method comprising:
for two users with call records, the step of calculating the weight of the edges of the two users according to the total call duration, the call frequency and the average call duration of the two users is specifically as follows:
Figure BDA0001635926620000163
where k is the weight; β0, β1, β2 and β3 are factor weight coefficients; T_total is the total call duration of the two users; T_times is the call frequency of the two users; T_average is the average call duration of the two users; and k takes values in (0, 1).
In another embodiment, the program when executed by a processor implements a method comprising:
according to the occupation ratio of the floating population, the occupation ratio of the local population, the first PR value of each local population and the first PR value of each floating population in the social network, the step of analyzing the influence of the social network on the society specifically comprises the following steps:
and analyzing the influence of the social network on the society according to the occupation ratio of the floating population, the occupation ratio of the local population, the second PR value of each local population and the second PR value of each floating population in the social network, wherein the second PR value is obtained by correcting the first PR value.
In another embodiment, the program when executed by a processor implements a method comprising:
after the step of obtaining the first page rank PR value of each user according to the weight of all the edges of each user, the method further includes:
and correcting the first PR value according to a pre-acquired probability matrix to obtain a second PR value, wherein the probability matrix comprises a plurality of elements, and each element corresponds to the weight of each edge.
In another embodiment, the program when executed by a processor implements a method comprising: according to the occupation ratio of the floating population, the occupation ratio of the local population, the first PR value of each local population and the first PR value of each floating population in the social network, the step of analyzing the influence of the social network on the society specifically comprises the following steps:
calculating the influence coefficient ρ_i by the following formula:
Figure BDA0001635926620000171
wherein φ(p_f), χ(r_f), η(p_n), Δ(r_n) are, respectively, functions of the floating population ratio p_f, the PR value r_f of the floating population, the local population ratio p_n, and the PR value r_n of the local population;
if the influence coefficient ρ_i is greater than a preset first threshold, this indicates that the social network will increase social mobility.
In another embodiment, the program when executed by a processor implements a method comprising: the functions φ(p_f), χ(r_f), η(p_n) and Δ(r_n) are functions of the same type, and the type of the function is an exponential decay function, a linear function or a normalization function.
In another embodiment, the program when executed by a processor implements a method comprising: after the step of determining that the social network will increase social mobility if the influence coefficient ρ_i is greater than the preset first threshold, the method further comprises:
calculating the safety factor ω_team by the following formula:
Figure BDA0001635926620000172
wherein
Figure BDA0001635926620000173
θ(r_f), γ(p_n), δ(r_n) are, respectively, functions of the floating population ratio p_f, the PR value r_f of the floating population, the local population ratio p_n, and the PR value r_n of the local population;
if the safety factor ω_team is greater than a preset second threshold, performing key monitoring on the social network.
In the storage medium provided in the embodiment of the present invention, when the program is executed by the processor, the method in the foregoing method embodiment is implemented, and details of this implementation are not described again.
The storage medium provided by the embodiment of the invention can obtain the first PR value of each user by constructing the social network, and can accurately judge the influence of the social network on the society by combining the factors of the floating population.
Yet another embodiment of the present invention discloses a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the methods provided by the above-mentioned method embodiments, for example, comprising:
obtaining a pre-constructed social network, wherein the social network comprises a plurality of nodes and a plurality of edges, each node represents one user, and if a call record exists between two users, one edge exists between the two users;
for two users with call records, calculating the weight of the edge between the two users according to the total call duration, the call frequency and the average call duration of the two users;
determining a first webpage ranking PR value of each user according to the weight of all edges of each user, wherein each first PR value represents the importance degree of one user in the social network;
and analyzing the influence of the social network on the society according to the occupation ratio of the floating population, the occupation ratio of the local population, the first PR value of each local population and the first PR value of each floating population in the social network.
Those skilled in the art will appreciate that although some embodiments described herein include some features included in other embodiments instead of others, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments.
Those skilled in the art will appreciate that the steps of the embodiments may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functionality of some or all of the components according to embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein.
Although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope defined by the appended claims.

Claims (8)

1. A method of user analytics, the method comprising:
obtaining a pre-constructed social network, wherein the social network comprises a plurality of nodes and a plurality of edges, each node represents one user, and if a call record exists between two users, one edge exists between the two users;
for two users with call records, calculating the weight of the edge between the two users according to the total call duration, the call frequency and the average call duration of the two users;
determining a first webpage ranking PR value of each user according to the weight of all edges of each user, wherein each first PR value represents the importance degree of one user in the social network;
analyzing the influence of the social network on the society according to the occupation ratio of the floating population, the occupation ratio of the local population, the first PR value of each local population and the first PR value of each floating population in the social network;
according to the occupation ratio of the floating population, the occupation ratio of the local population, the first PR value of each local population and the first PR value of each floating population in the social network, the step of analyzing the influence of the social network on the society specifically comprises the following steps:
calculating the influence coefficient ρ_i by the following formula:
Figure FDA0003326508820000011
wherein φ(p_f), χ(r_f), η(p_n), Δ(r_n) are, respectively, functions of the floating population ratio p_f, the PR value r_f of the floating population, the local population ratio p_n, and the PR value r_n of the local population;
if the influence coefficient ρ_i is greater than a preset first threshold, this indicates that the social network will increase social mobility;
calculating the safety factor ω_team by the following formula:
Figure FDA0003326508820000012
wherein
Figure FDA0003326508820000013
θ(r_f), γ(p_n), δ(r_n) are, respectively, functions of the floating population ratio p_f, the PR value r_f of the floating population, the local population ratio p_n, and the PR value r_n of the local population;
if the safety factor ω_team is greater than a preset second threshold, performing key monitoring on the social network.
2. The method of claim 1, wherein: for two users with call records, the step of calculating the weight of the edges of the two users according to the total call duration, the call frequency and the average call duration of the two users is specifically as follows:
Figure FDA0003326508820000014
where k is the weight; β0, β1, β2 and β3 are factor weight coefficients; T_total is the total call duration of the two users; T_times is the call frequency of the two users; T_average is the average call duration of the two users; and k takes values in (0, 1).
3. The method of claim 1, wherein:
according to the occupation ratio of the floating population, the occupation ratio of the local population, the first PR value of each local population and the first PR value of each floating population in the social network, the step of analyzing the influence of the social network on the society specifically comprises the following steps:
and analyzing the influence of the social network on the society according to the occupation ratio of the floating population, the occupation ratio of the local population, the second PR value of each local population and the second PR value of each floating population in the social network, wherein the second PR value is obtained by correcting the first PR value.
4. The method of claim 3, wherein:
after the step of obtaining the first page rank PR value of each user according to the weight of all the edges of each user, the method further includes:
and correcting the first PR value according to a pre-acquired probability matrix to obtain a second PR value, wherein the probability matrix comprises a plurality of elements, and each element corresponds to the weight of each edge.
5. The method of claim 1, wherein: the functions φ(p_f), χ(r_f), η(p_n) and Δ(r_n) are functions of the same type, and the type of the function is an exponential decay function, a linear function or a normalization function.
6. An apparatus for user analysis, the apparatus comprising:
an obtaining module, configured to obtain a pre-constructed social network, wherein the social network comprises a plurality of nodes and a plurality of edges, each node represents one user, and if a call record exists between two users, one edge exists between the two users;
a calculating module, configured to calculate, for two users with call records, the weight of the edge between the two users according to the total call duration, the call frequency and the average call duration of the two users;
the determining module is used for determining a first webpage ranking PR value of each user according to the weight of all sides of each user, and each first PR value represents the importance degree of one user in the social network;
the analysis module is used for analyzing the influence of the social network on the society according to the occupation ratio of the floating population, the occupation ratio of the local population, the first PR value of each local population and the first PR value of each floating population in the social network;
according to the occupation ratio of the floating population, the occupation ratio of the local population, the first PR value of each local population and the first PR value of each floating population in the social network, the step of analyzing the influence of the social network on the society specifically comprises the following steps:
calculating the influence coefficient ρ_i by the following formula:
Figure FDA0003326508820000031
wherein φ(p_f), χ(r_f), η(p_n), Δ(r_n) are, respectively, functions of the floating population ratio p_f, the PR value r_f of the floating population, the local population ratio p_n, and the PR value r_n of the local population;
if the influence coefficient ρ_i is greater than a preset first threshold, this indicates that the social network will increase social mobility;
calculating the safety factor ω_team by the following formula:
Figure FDA0003326508820000032
wherein
Figure FDA0003326508820000033
θ(r_f), γ(p_n), δ(r_n) are, respectively, functions of the floating population ratio p_f, the PR value r_f of the floating population, the local population ratio p_n, and the PR value r_n of the local population;
if the safety factor ω_team is greater than a preset second threshold, performing key monitoring on the social network.
7. An electronic device comprising a memory, a processor, a bus and a computer program stored on the memory and executable on the processor, the processor implementing the steps of any of claims 1-5 when executing the program.
8. A storage medium having a computer program stored thereon, characterized in that: the program when executed by a processor implementing the steps of any of claims 1-5.
CN201810360821.4A 2018-04-20 2018-04-20 User analysis method, device, electronic equipment and storage medium Active CN110399399B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810360821.4A CN110399399B (en) 2018-04-20 2018-04-20 User analysis method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810360821.4A CN110399399B (en) 2018-04-20 2018-04-20 User analysis method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110399399A CN110399399A (en) 2019-11-01
CN110399399B true CN110399399B (en) 2022-04-05

Family

ID=68319488

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810360821.4A Active CN110399399B (en) 2018-04-20 2018-04-20 User analysis method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110399399B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111191146B (en) * 2019-11-27 2023-06-16 重庆特斯联智慧科技股份有限公司 Family member communication method and system based on social network analysis algorithm

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103268584A (en) * 2013-02-28 2013-08-28 中国联合网络通信集团有限公司 Method and device for discriminating native place of floating population
CN105812593A (en) * 2016-03-30 2016-07-27 中国联合网络通信集团有限公司 Method and device for grading users
CN107645740A (en) * 2017-09-01 2018-01-30 深圳市盛路物联通讯技术有限公司 A kind of mobile monitoring method and terminal

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9251177B2 (en) * 2012-06-12 2016-02-02 Empire Technology Development Llc Information removal from a network

Also Published As

Publication number Publication date
CN110399399A (en) 2019-11-01

Similar Documents

Publication Publication Date Title
CN110417721B (en) Security risk assessment method, device, equipment and computer readable storage medium
CN111614690B (en) Abnormal behavior detection method and device
CN104836781A (en) Method distinguishing identities of access users, and device
CN113765881A (en) Method and device for detecting abnormal network security behavior, electronic equipment and storage medium
KR20120040589A (en) Optimum tender price prediction method and system
US20130124448A1 (en) Method and system for selecting a target with respect to a behavior in a population of communicating entities
CN108269087A (en) The processing method and processing device of location information
CN109949154A (en) Customer information classification method, device, computer equipment and storage medium
CN112819611A (en) Fraud identification method, device, electronic equipment and computer-readable storage medium
CN111510368A (en) Family group identification method, device, equipment and computer readable storage medium
CN112995201B (en) Resource value evaluation processing method based on cloud platform and related device
CN110399399B (en) User analysis method, device, electronic equipment and storage medium
CN111105064A (en) Method and device for determining suspected information of fraud event
CN109711984B (en) Pre-loan risk monitoring method and device based on collection urging
CN114841705B (en) Anti-fraud monitoring method based on scene recognition
CN109587248A (en) User identification method, device, server and storage medium
CN110969209B (en) Stranger identification method and device, electronic equipment and storage medium
CN111127059B (en) User quality analysis method and device
CN114091906A (en) Security situation analysis method and device, electronic equipment and computer readable medium
CN114241206A (en) Target object feature extraction method and device, electronic equipment and storage medium
CN112416922A (en) Group partner association data mining method, device, equipment and storage medium
CN113723522B (en) Abnormal user identification method and device, electronic equipment and storage medium
CN111382343A (en) Label system generation method and device
CN109919811A (en) Insurance agent's culture scheme generation method and relevant device based on big data
CN112989374B (en) Data security risk identification method and device based on complex network analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant