CN109104696B - Track privacy protection method and system for mobile user based on differential privacy - Google Patents

Track privacy protection method and system for mobile user based on differential privacy

Info

Publication number
CN109104696B
Authority
CN
China
Prior art keywords
time
user
privacy
base station
data
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810916399.6A
Other languages
Chinese (zh)
Other versions
CN109104696A (en)
Inventor
陈志立
阚晓立
张顺
仲红
崔杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Application filed by Anhui University
Priority to CN201810916399.6A
Publication of CN109104696A
Application granted
Publication of CN109104696B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/029 Location-based management or tracking services
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04K SECRET COMMUNICATION; JAMMING OF COMMUNICATION
    • H04K3/00 Jamming of communication; Counter-measures
    • H04K3/80 Jamming or countermeasure characterized by its function
    • H04K3/82 Jamming or countermeasure characterized by its function related to preventing surveillance, interception or detection
    • H04K3/825 Jamming or countermeasure characterized by its function related to preventing surveillance, interception or detection by jamming
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W12/00 Security arrangements; Authentication; Protecting privacy or anonymity
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/021 Services related to particular areas, e.g. point of interest [POI] services, venue services or geofences
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W64/00 Locating users or terminals or network equipment for network management purposes, e.g. mobility management
    • H04W64/003 Locating users or terminals or network equipment for network management purposes, e.g. mobility management locating network equipment

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a track privacy protection method for mobile users based on differential privacy, which is applied to a server and operates in a scenario in which n users $U = \{u_1, u_2, \ldots, u_n\}$ move among m communication base stations $L = \{l_1, l_2, \ldots, l_m\}$, where $l_i$ denotes the position of the i-th communication base station, $1 \le i \le m$, and $u_j$ denotes the j-th user, $1 \le j \le n$. The method comprises the following steps: S1, initialization stage; S2, data aggregation stage; S3, data perturbation stage; S4, data release stage. The invention also discloses a track privacy protection system for mobile users based on differential privacy. The invention perturbs the number of users covered by each communication base station, whereas other schemes perturb the position of each individual user; this greatly reduces the privacy budget consumed, preserves the usability of the data, and reduces the computation cost.

Description

Track privacy protection method and system for mobile user based on differential privacy
Technical Field
The invention relates to the technical field of network and information security, in particular to a track privacy protection method and system for a mobile user based on differential privacy.
Background
With the rapid development of internet technology, a big data age has come, and the popularization of mobile devices has enabled human mobile data to be widely collected through cellular networks and mobile applications and publicly released for academic research and business purposes. However, one of the main problems with such data distribution is how to protect the privacy of the mobile user.
To protect the privacy of users, the data owner (the operator) publishes only aggregated mobile data rather than a track record for each user, e.g. the number of users within the coverage of each base station at a given timestamp. Such aggregated user mobility statistics have great practical value in applications such as traffic scheduling and business intelligence. More importantly, most data providers believe that publishing only aggregated statistics protects user privacy, since an adversary cannot distinguish a particular user within the aggregated data.
Recent research has shown, however, that publishing aggregated user mobility statistics can leak users' movement trajectories, mainly because of two key features of human movement. First, the movement pattern of a single user is consistent and regular, which makes that user's trajectory highly predictable. Second, the movement pattern of any one user differs significantly from those of other users, which enables an adversary to re-identify the trajectory of a specific user. Although it is difficult to distinguish the record of each user in the aggregated data under a single timestamp, the trajectories of users can be recovered from the aggregated data by exploiting their movement characteristics over consecutive timestamps, and the trajectory of a specific user can then be re-identified. It is therefore crucial to protect the trajectory privacy of mobile device users effectively.
Disclosure of Invention
The invention aims to provide a track privacy protection method and system for mobile users based on differential privacy, which protect the position privacy of users by means of the Laplace mechanism of differential privacy, thereby preventing an adversary with arbitrary background knowledge from recovering a user's movement trajectory.
The invention is realized by the following technical scheme: a track privacy protection method for mobile users based on differential privacy, applied to a server and operating in a scenario in which n users $U = \{u_1, u_2, \ldots, u_n\}$ move among m communication base stations $L = \{l_1, l_2, \ldots, l_m\}$, where $l_i$ denotes the position of the i-th communication base station, $1 \le i \le m$, and $u_j$ denotes the j-th user, $1 \le j \le n$. The method comprises the following steps:
S1, initialization stage: when user $u_j$ accesses communication base station $l_i$, the server of the communication operator records the user's information: the anonymized user ID, the position of the accessed communication base station, and the access time t;
S2, data aggregation stage: the server statistically aggregates the user information collected over a period of time at a fixed time interval, extracts the position of the communication base station that each user accessed most frequently in each time period, and calculates the total number of users at each communication base station in that time period
Figure GDA0002546640590000021
Figure GDA0002546640590000022
which represents the number of users at communication base station $l_i$ at time t, $1 \le i \le m$;
S3, data perturbation stage: time is divided into a daytime period, an evening period and a late-night period according to human movement characteristics; from
Figure GDA0002546640590000023
the distribution of the number of people at each base station in each time period is calculated as
Figure GDA0002546640590000024
and a differential privacy mechanism is applied to the data of these three time periods
Figure GDA0002546640590000025
with a different perturbation treatment for each period;
S4, data release stage: the communication operator takes the processed data
Figure GDA0002546640590000031
and releases it.
As one of the preferable embodiments of the present invention, the data aggregation stage of step S2 specifically includes the following operation flows:
(1) time period division: time is divided into intervals chosen according to the required resolution of the user movement trajectory; the larger the time interval, the lower the resolution of the trajectory, and the smaller the time interval, the more precise the trajectory;
(2) position extraction: after the time intervals are fixed, a user may have one or more access records at different communication base stations within a given interval; the base station that the user accessed most frequently is then extracted as the position visited by the user in that time period;
(3) data statistics: after step (2), each user is assigned to exactly one communication base station under each timestamp, so the number of users in the coverage area of each base station $(l_1, l_2, \ldots, l_m)$ under timestamp t can be counted
Figure GDA0002546640590000032
This yields the distribution of the number of users in the coverage area of each communication base station in a given region, i.e. the data set under real conditions
Figure GDA0002546640590000033
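The three aggregation steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the record layout `(user_id, station_id, timestamp_seconds)`, the function name `aggregate`, and the tie-breaking rule (the first-seen station wins when two stations are equally frequent) are all assumptions introduced here.

```python
from collections import Counter, defaultdict

def aggregate(records, stations, interval_seconds):
    """Build per-slot user counts from (user_id, station_id, timestamp) records."""
    # Steps (1)+(2): per time slot and user, count accesses to each base station.
    visits = defaultdict(Counter)
    for user_id, station_id, timestamp in records:
        slot = int(timestamp // interval_seconds)
        visits[(slot, user_id)][station_id] += 1

    # Step (3): assign each user to the most frequently accessed station of the
    # slot, then count users per station to obtain one data set per slot.
    datasets = defaultdict(lambda: {s: 0 for s in stations})
    for (slot, _user), counter in visits.items():
        top_station, _count = counter.most_common(1)[0]
        datasets[slot][top_station] += 1
    return dict(datasets)

# Hypothetical records: u1 accesses l1 twice and l2 once in the first hour.
records = [
    ("u1", "l1", 10), ("u1", "l1", 200), ("u1", "l2", 900),
    ("u2", "l2", 50),
    ("u1", "l3", 3700),
]
result = aggregate(records, ["l1", "l2", "l3"], 3600)
```

With one-hour slots, `u1` is counted at `l1` in the first slot because that station has the most access records there, which mirrors the position-extraction rule of step (2).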
As one of the preferable embodiments of the present invention, the data perturbation phase of step S3 specifically includes:
(1) in the daytime period, from 7 am to 6 pm, the base station access positions of users change continuously; the user counts in the data sets $D_{t-1}$ and $D_t$ of adjacent moments differ, so for the data set $D_t$ at each moment a Laplace mechanism satisfying differential privacy is applied directly to add perturbation noise following the Laplace distribution
Figure GDA0002546640590000034
(2) in the evening period, set from 6 pm to 11 pm, the base station access positions of users are updated more slowly; the user count distributions of the data sets $D_{t-1}$ and $D_t$ of adjacent moments still differ, but the difference may be large or small;
(3) in the late-night period, from 11 pm to 7 am, the base station access positions of users essentially do not change, and the user distributions of the data sets $D_{t-1}$ and $D_t$ of adjacent moments are almost identical; because a user's track privacy can be exposed only while the user is moving, the original data are published directly in this period.
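The division of the day into three regimes can be sketched as a small lookup. The function name `time_period` and the handling of the boundary hours (for example, that 6 pm itself belongs to the evening period) are assumptions made for illustration; the sketch also assumes that the late-night period ends at 7 am, when the daytime period begins.

```python
def time_period(hour):
    """Map an hour of day (0-23) to the perturbation regime of step S3."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    if 7 <= hour < 18:
        return "daytime"     # every data set is perturbed
    if 18 <= hour < 23:
        return "evening"     # perturbed only when the distribution shifts enough
    return "late_night"      # 23:00 to 07:00, released without perturbation
```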
In a preferred embodiment of the present invention, in order to keep the total number n of users unchanged, only the first m-1 base stations are perturbed independently in step (1); the noise vector introduced for the first m-1 base stations is $Noise_{m-1} = (x_1, x_2, \ldots, x_{m-1})$, and the noise added to the m-th base station is the negative of the sum of the noise added to the first m-1 base stations,
Figure GDA0002546640590000041
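The sum-preserving noise described above can be sketched as follows. This is a hedged illustration only: the helper names `laplace_sample` and `perturb_counts`, the default `sensitivity=1.0`, and the use of real-valued (unrounded) noise are assumptions, because the patent's exact noise scale appears only in an equation image that is not reproduced in this text.

```python
import random

def laplace_sample(scale, rng):
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def perturb_counts(counts, epsilon, sensitivity=1.0, rng=None):
    """Add Laplace noise to the first m-1 counts and the negated sum to the last."""
    rng = rng or random.Random()
    scale = sensitivity / epsilon            # assumed calibration
    noise = [laplace_sample(scale, rng) for _ in counts[:-1]]
    noise.append(-sum(noise))                # x_m = -(x_1 + ... + x_{m-1})
    noisy = [c + x for c, x in zip(counts, noise)]
    return noisy, noise

# Hypothetical counts for 3 base stations covering 7 users.
noisy, noise = perturb_counts([3, 2, 2], epsilon=1.0, rng=random.Random(0))
```

Because the noise terms cancel, the released counts still sum to the true number of users, up to floating-point rounding.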
As one of the preferable embodiments of the present invention, in step (2): perturbation is needed only when the user distributions of the data sets $D_{t-1}$ and $D_t$ of adjacent moments differ greatly; it therefore suffices to calculate the distribution difference between the data set $D_t$ of the current moment and the data set $D_{t-1}$ released at the previous moment, compare it with a preset fixed threshold T, and thereby decide whether to perturb the current data set. Thus only some of the data sets $D_t$ need to be perturbed, and the number of data sets that need perturbation is denoted C.
As one of the preferable modes of the present invention, the following perturbation method is applied in the evening time period of the step (2), and the perturbation method comprises the following steps:
(1) privacy budget allocation: before the data sets D are perturbed, the privacy budget is first allocated so that the whole data perturbation stage satisfies differential privacy; let $\varepsilon = \varepsilon_1 + \varepsilon_2$, where $\varepsilon_1$ is the privacy budget used in the judgment process, with $\varepsilon_1 = k\varepsilon$, and $\varepsilon_2$ is the privacy budget used in perturbing the data sets D;
(2) judgment stage: a Laplace mechanism satisfying differential privacy is used to add perturbation noise vectors following the Laplace distribution
Figure GDA0002546640590000042
A noisy threshold is calculated
Figure GDA0002546640590000043
Figure GDA0002546640590000044
together with, for the data set of the current moment
Figure GDA0002546640590000045
and the data set released at the previous moment
Figure GDA0002546640590000046
the noisy value of the distribution difference between them
Figure GDA0002546640590000047
(3) perturbation stage: for
Figure GDA0002546640590000048
a privacy budget of
Figure GDA0002546640590000049
is introduced, and a Laplace mechanism satisfying differential privacy is used to add perturbation noise following the Laplace distribution
Figure GDA0002546640590000051
$Noise_{m-1} = (x_1, x_2, \ldots, x_{m-1})$,
Figure GDA0002546640590000052
The noisy data set is then computed
Figure GDA0002546640590000053
As one preferable aspect of the present invention, in the judgment stage of step (2): when
Figure GDA0002546640590000054
holds,
Figure GDA0002546640590000055
is perturbed; otherwise
Figure GDA0002546640590000056
is used in place of the user position distribution of the current moment, i.e. what is published is
Figure GDA0002546640590000057
As one preferable aspect of the present invention, in the perturbation stage of step (3): if privacy budget remains at the last moment, all of the remaining budget is used for the noise added at that moment; once the privacy budget $\varepsilon_2$ is exhausted, each subsequent data set is published as the most recently released data set.
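The evening-period procedure (noisy threshold test, conditional perturbation, reuse of the previous release) can be sketched as below. Several details are assumptions rather than the patent's own definitions, since the relevant formulas are equation images missing from this text: the L1 distance between count vectors, the noise scales `2/eps1` and `4/eps1`, the even split `eps2 / max_perturbed`, the direction of the comparison, and the choice to count the first data set against that budget. The function and parameter names are hypothetical.

```python
import random

def laplace_sample(scale, rng):
    # Difference of two exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def zero_sum_noise(m, scale, rng):
    noise = [laplace_sample(scale, rng) for _ in range(m - 1)]
    return noise + [-sum(noise)]             # keeps the total user count fixed

def release_evening(datasets, threshold, eps1, eps2, max_perturbed, rng=None):
    """Release a sequence of count vectors; perturb only on large changes."""
    if max_perturbed < 1:
        raise ValueError("max_perturbed must be at least 1")
    rng = rng or random.Random()
    per_release = eps2 / max_perturbed       # assumed even split of eps2
    noisy_threshold = threshold + laplace_sample(2.0 / eps1, rng)
    released, used = [], 0
    for counts in datasets:
        if not released:
            perturb = True                   # first evening data set is perturbed
        elif used >= max_perturbed:
            perturb = False                  # eps2 exhausted: reuse last release
        else:
            distance = sum(abs(a - b) for a, b in zip(counts, released[-1]))
            noisy_distance = distance + laplace_sample(4.0 / eps1, rng)
            perturb = noisy_distance >= noisy_threshold
        if perturb:
            noise = zero_sum_noise(len(counts), 1.0 / per_release, rng)
            released.append([c + x for c, x in zip(counts, noise)])
            used += 1
        else:
            released.append(list(released[-1]))
    return released

data = [[3, 2, 2], [3, 2, 2], [1, 4, 2]]     # hypothetical counts, 7 users
out = release_evening(data, 2.0, 0.5, 1.0, 2, rng=random.Random(1))
out_exhausted = release_evening(data, 2.0, 0.5, 1.0, 1, rng=random.Random(1))
```

The sketch only illustrates the control flow; it does not reproduce the remaining-budget rule for the last moment, and its noise scales should not be read as the calibration required for the stated privacy guarantee.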
The invention also discloses a track privacy protection system for mobile users based on differential privacy, applied to a server and operating in a scenario in which n users $U = \{u_1, u_2, \ldots, u_n\}$ move among m communication base stations $L = \{l_1, l_2, \ldots, l_m\}$, where $l_i$ denotes the position of the i-th communication base station, $1 \le i \le m$, and $u_j$ denotes the j-th user, $1 \le j \le n$. The system comprises the following modules:
an initialization module: used for having the server of the communication operator record the user's information when user $u_j$ accesses communication base station $l_i$: the anonymized user ID, the position of the accessed communication base station, and the access time t;
a data aggregation module: used for having the server statistically aggregate the user information collected over a period of time at a fixed time interval, extract the position of the communication base station that each user accessed most frequently in each time period, and calculate the total number of users at each communication base station in that time period
Figure GDA0002546640590000058
Figure GDA0002546640590000059
which represents the number of users at communication base station $l_i$ at time t, $1 \le i \le m$;
a data perturbation module: used for dividing time into a daytime period, an evening period and a late-night period according to human movement characteristics; from
Figure GDA00025466405900000510
the distribution of the number of people at each base station in each time period is calculated as
Figure GDA0002546640590000061
and a differential privacy mechanism is applied to the data of these three time periods
Figure GDA0002546640590000062
with a different perturbation treatment for each period;
a data release module: used for having the communication operator take the processed data
Figure GDA0002546640590000063
and release it.
Compared with the prior art, the invention has the advantages that:
(1) the invention perturbs the number of users covered by each communication base station, whereas other schemes perturb the position of each individual user; this greatly reduces the privacy budget consumed, preserves the usability of the data, and reduces the computation cost;
(2) the invention is the first to protect users' movement trajectories in aggregated data based on communication base stations by means of a differential privacy mechanism, and the introduced noise keeps the total number of people covered by the communication base stations unchanged, which makes it harder for an adversary to recover user trajectories;
(3) according to how the data sets $D_{t-1}$ and $D_t$ of adjacent moments differ in different time periods, the invention divides time into a daytime period, an evening period and a late-night period, and applies a different perturbation treatment to each of the three periods using a differential privacy mechanism.
Drawings
Fig. 1 is a schematic view of an application scenario of embodiment 1 of the present invention;
fig. 2 is a flowchart of main implementation steps of a track privacy protection method for a mobile user based on differential privacy according to embodiment 1 of the present invention;
fig. 3 is a diagram of an aggregated data distribution of the number of people covered by a communication base station at a certain time according to embodiment 1 of the present invention.
Detailed Description
The following examples are given for the detailed implementation and specific operation of the present invention, but the scope of the present invention is not limited to the following examples.
Example 1
Referring to Figs. 1-3: in order to prevent the track privacy of mobile device users from being leaked during communication, the track privacy protection method for mobile users based on differential privacy of this embodiment is applied to a server and operates in a scenario in which n users $U = \{u_1, u_2, \ldots, u_n\}$ move among m communication base stations $L = \{l_1, l_2, \ldots, l_m\}$, where $l_i$ denotes the position of the i-th communication base station, $1 \le i \le m$, and $u_j$ denotes the j-th user, $1 \le j \le n$. When user $u_j$ is within the coverage of base station $l_i$, it transmits signals through base station $l_i$ to obtain the requested services, and the background server collects the relevant information of user $u_j$. The method comprises the following steps:
S1, initialization stage: when user $u_j$ accesses communication base station $l_i$, the server of the communication operator records the user's information: the anonymized user ID, the position of the accessed communication base station, and the access time t. At any moment each user is within the coverage area of some communication base station; when the user requests a service, the user communicates with that base station, which records the user's identity information, position information and access time and sends them to the background server;
S2, data aggregation stage: the server statistically aggregates the user information collected over a period of time at a fixed time interval, extracts the position of the communication base station that each user accessed most frequently in each time period, and calculates the total number of users at each communication base station in that time period
Figure GDA0002546640590000071
Figure GDA0002546640590000072
which represents the number of users at communication base station $l_i$ at time t, $1 \le i \le m$; that is, the server aggregates the user information collected over a period of time at a fixed time interval, extracts the position of the communication base station that each user accessed most frequently in each time period, and counts the total number of users at each communication base station in that time period;
S3, data perturbation stage: time is divided into a daytime period, an evening period and a late-night period according to human movement characteristics; from
Figure GDA0002546640590000081
the distribution of the number of people at each base station in each time period is calculated as
Figure GDA0002546640590000082
and a differential privacy mechanism is applied to the data of these three time periods
Figure GDA0002546640590000083
with a different perturbation treatment for each period;
S4, data release stage: the communication operator takes the processed data
Figure GDA0002546640590000084
and releases it.
Further, the data aggregation stage of step S2 specifically includes the following operation flows:
(1) time period division: time is divided into intervals chosen according to the required resolution of the user movement trajectory; the larger the time interval, the lower the resolution of the trajectory, and the smaller the time interval, the more precise the trajectory;
(2) position extraction: after the time intervals are fixed, a user may have one or more access records at different communication base stations within a given interval; the base station that the user accessed most frequently is then extracted as the position visited by the user in that time period;
(3) data statistics: after step (2), each user is assigned to exactly one communication base station under each timestamp, so the number of users in the coverage area of each base station $(l_1, l_2, \ldots, l_m)$ under timestamp t can be counted
Figure GDA0002546640590000085
This yields the distribution of the number of users in the coverage area of each communication base station in a given region, i.e. the data set under real conditions
Figure GDA0002546640590000086
Further, the data perturbation stage of step S3 specifically includes:
(1) in the daytime period, from 7 am to 6 pm, the base station access positions of users change continuously; the user counts in the data sets $D_{t-1}$ and $D_t$ of adjacent moments differ, so for the data set $D_t$ at each moment a Laplace mechanism satisfying differential privacy is applied directly to add perturbation noise following the Laplace distribution
Figure GDA0002546640590000087
(2) in the evening period, set from 6 pm to 11 pm, the base station access positions of users are updated more slowly; the user count distributions of the data sets $D_{t-1}$ and $D_t$ of adjacent moments still differ, but the difference may be large or small;
(3) in the late-night period, from 11 pm to 7 am, the base station access positions of users essentially do not change, and the user distributions of the data sets $D_{t-1}$ and $D_t$ of adjacent moments are almost identical; because a user's track privacy can be exposed only while the user is moving, the original data are published directly in this period.
Further, in order to keep the total number n of users unchanged, only the first m-1 base stations are perturbed independently in step (1); the noise vector introduced for the first m-1 base stations is $Noise_{m-1} = (x_1, x_2, \ldots, x_{m-1})$, and the noise added to the m-th base station is the negative of the sum of the noise added to the first m-1 base stations,
Figure GDA0002546640590000091
Further, in step (2): the position track privacy of users is threatened only when the user distributions of the data sets $D_{t-1}$ and $D_t$ of adjacent moments differ greatly; it therefore suffices to calculate the distribution difference between the data set $D_t$ of the current moment and the data set $D_{t-1}$ released at the previous moment, compare it with a preset fixed threshold T, and thereby decide whether to perturb the current data set. Thus only some of the data sets $D_t$ need to be perturbed, and the number of data sets that need perturbation is denoted C.
Further, the following perturbation method is applied in the evening time period of the step (2), and the perturbation method comprises the following steps:
(1) privacy budget allocation: before the data sets D are perturbed, the privacy budget is first allocated so that the whole data perturbation stage satisfies differential privacy; let $\varepsilon = \varepsilon_1 + \varepsilon_2$, where $\varepsilon_1$ is the privacy budget used in the judgment process, with $\varepsilon_1 = k\varepsilon$, and $\varepsilon_2$ is the privacy budget used in perturbing the data sets D;
(2) judgment stage: a Laplace mechanism satisfying differential privacy is used to add perturbation noise vectors following the Laplace distribution
Figure GDA0002546640590000101
A noisy threshold is calculated
Figure GDA0002546640590000102
Figure GDA0002546640590000103
together with, for the data set of the current moment
Figure GDA0002546640590000104
and the data set released at the previous moment
Figure GDA0002546640590000105
the noisy value of the distribution difference between them
Figure GDA0002546640590000106
(3) perturbation stage: for
Figure GDA0002546640590000107
a privacy budget of
Figure GDA0002546640590000108
is introduced, and a Laplace mechanism satisfying differential privacy is used to add perturbation noise following the Laplace distribution
Figure GDA0002546640590000109
$Noise_{m-1} = (x_1, x_2, \ldots, x_{m-1})$,
Figure GDA00025466405900001010
The noisy data set is then computed
Figure GDA00025466405900001011
Further, in the judgment stage of step (2): when
Figure GDA00025466405900001012
holds,
Figure GDA00025466405900001013
is perturbed; otherwise
Figure GDA00025466405900001014
is used in place of the user position distribution of the current moment, i.e. what is published is
Figure GDA00025466405900001015
Further, in the perturbation stage of step (3): if privacy budget remains at the last moment, all of the remaining budget is used for the noise added at that moment; once the privacy budget $\varepsilon_2$ is exhausted, each subsequent data set is published as the most recently released data set.
This embodiment also discloses a track privacy protection system for mobile users based on differential privacy, applied to a server and operating in a scenario in which n users $U = \{u_1, u_2, \ldots, u_n\}$ move among m communication base stations $L = \{l_1, l_2, \ldots, l_m\}$, where $l_i$ denotes the position of the i-th communication base station, $1 \le i \le m$, and $u_j$ denotes the j-th user, $1 \le j \le n$. The system comprises the following modules:
an initialization module: used for having the server of the communication operator record the user's information when user $u_j$ accesses communication base station $l_i$: the anonymized user ID, the position of the accessed communication base station, and the access time t;
a data aggregation module: used for having the server statistically aggregate the user information collected over a period of time at a fixed time interval, extract the position of the communication base station that each user accessed most frequently in each time period, and calculate the total number of users at each communication base station in that time period
Figure GDA0002546640590000111
Figure GDA0002546640590000112
which represents the number of users at communication base station $l_i$ at time t, $1 \le i \le m$;
a data perturbation module: used for dividing time into a daytime period, an evening period and a late-night period according to human movement characteristics; from
Figure GDA0002546640590000113
the distribution of the number of people at each base station in each time period is calculated as
Figure GDA0002546640590000114
and a differential privacy mechanism is applied to the data of these three time periods
Figure GDA0002546640590000115
with a different perturbation treatment for each period;
a data release module: used for having the communication operator take the processed data
Figure GDA0002546640590000116
and release it.
For better understanding, as shown in Fig. 1, suppose there are 7 users $U = \{u_1, u_2, u_3, u_4, u_5, u_6, u_7\}$ and 3 communication base stations $L = \{l_1, l_2, l_3\}$. Whenever a user requests a service through a base station, the communication base station sends the user's information to the server, which stores it as a record; for example, when user $u_1$ requests a service within the coverage area of base station $l_1$, the server stores a record
Figure GDA0002546640590000117
The server aggregates the collected data records. Within time period $t_1$, the server finds that user $u_1$ has 3 access records, namely
Figure GDA0002546640590000118
Figure GDA0002546640590000119
which show that user $u_1$ accessed communication base station $l_1$ twice and communication base station $l_2$ once; the most frequently accessed base station, $l_1$, is taken as the base station position visited by user $u_1$ in time period $t_1$. The base station positions visited by the other users in time period $t_1$ are determined in the same way, generating the data set
Figure GDA00025466405900001110
In the same way, the data set for time $t_2$ is generated
Figure GDA00025466405900001111
and the data set for time $t_3$
Figure GDA00025466405900001112
Figure GDA00025466405900001113
Thereby obtaining a data set
Figure GDA00025466405900001114
If
Figure GDA00025466405900001115
is a data set in the daytime period, then for the data set $D_t$ at each moment a Laplace mechanism satisfying differential privacy is applied directly to add perturbation noise following the Laplace distribution
Figure GDA00025466405900001116
To keep the total number of users unchanged, only the first 2 base stations are perturbed independently; the noise vector introduced for them is $Noise_2 = (x_1, x_2) = (1, 2)$, and the noise added to the third base station is the negative of the sum of the noise added to the first 2 base stations, $x_3 = -x_1 - x_2 = -3$.
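The arithmetic of this daytime example can be checked directly. The block below is a worked equation only; the symbols $d_i$ (true count at base station $l_i$) and $\tilde d_i$ (released count) are notation assumed here for illustration, since the patent's own symbols appear in equation images not reproduced in this text, and it assumes each of the 7 users is assigned to exactly one base station.

```latex
\begin{align*}
x_1 + x_2 + x_3 &= 1 + 2 + (-3) = 0, \\
\sum_{i=1}^{3} \tilde d_i &= \sum_{i=1}^{3} (d_i + x_i)
  = \sum_{i=1}^{3} d_i + 0 = 7 .
\end{align*}
```

The released counts therefore still sum to the 7 users of the example, whatever the individual true counts are.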
If
Figure GDA0002546640590000121
is a data set in the evening time period, then for the data set at the first time
Figure GDA0002546640590000122
the Laplace mechanism is introduced for perturbation, adding noise
Figure GDA0002546640590000123
Noise_2 = (x1, x2) = (1, 0) and x3 = -1, which gives
Figure GDA0002546640590000124
Figure GDA0002546640590000125
For the data set at the next time t2
Figure GDA0002546640590000126
it must be judged whether perturbation is needed. The noisy value of the fixed threshold T and the noisy value of the distance between
Figure GDA0002546640590000127
and
Figure GDA0002546640590000128
are computed separately, which requires introducing a noise vector generated by the Laplace mechanism
Figure GDA0002546640590000129
Noisy threshold
Figure GDA00025466405900001210
Noisy distance
Figure GDA00025466405900001211
When
Figure GDA00025466405900001212
the second-layer Laplace mechanism is applied, adding the noise vector
Figure GDA00025466405900001213
which gives
Figure GDA00025466405900001214
If
Figure GDA00025466405900001215
then
Figure GDA00025466405900001216
By the same method one obtains
Figure GDA00025466405900001217
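The evening-period decision can be sketched as follows. The exact formulas are in the figures, so several details here are assumptions: the L1 distance between count vectors, the noise scales `1 / eps1` and `1 / eps2`, the comparison direction (perturb when the noisy distance is at least the noisy threshold), and all function names. The input counts are hypothetical.

```python
import random

def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def l1_distance(a, b):
    # Assumed distance measure between two count vectors.
    return sum(abs(x - y) for x, y in zip(a, b))

def evening_release(current, last_released, threshold, eps1, eps2, rng):
    """Return (data set to publish, whether it was freshly perturbed)."""
    # First layer: perturb both the threshold and the distance (scales assumed).
    noisy_threshold = threshold + laplace(1.0 / eps1, rng)
    noisy_distance = l1_distance(current, last_released) + laplace(1.0 / eps1, rng)
    if noisy_distance < noisy_threshold:
        # Small change: republish the previously released data set.
        return list(last_released), False
    # Second layer: zero-sum Laplace noise on the current data set.
    noise = [laplace(1.0 / eps2, rng) for _ in current[:-1]]
    noise.append(-sum(noise))
    return [c + x for c, x in zip(current, noise)], True

published, fresh = evening_release([3, 2, 2], [4, 2, 1], threshold=1.0,
                                   eps1=0.5, eps2=0.5, rng=random.Random(0))
print(published, fresh)
```

Republishing the previous release when the change is small spends no second-layer budget at that time, which is the point of judging before perturbing.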
If
Figure GDA00025466405900001218
is a data set in the late-night time period, no perturbation is performed.
In the above example, during the daytime period, noise is added directly using Laplace noise that satisfies differential privacy. In the evening time period, a two-layer Laplace mechanism is used: in the first layer, the fixed threshold T is perturbed and then compared with the noisy distance to judge whether the second-layer Laplace mechanism should be executed to add noise to the current data set D_i. By the differential privacy guarantee of the Laplace mechanism, the first layer satisfies ε1-differential privacy and the second layer satisfies ε2-differential privacy. By the composition property, the whole mechanism satisfies ε-differential privacy with ε = ε1 + ε2; that is, deleting the base station access records of any single user has only a bounded influence on the finally released data set, so an attacker cannot recover the user's movement track from the released result, and the track privacy of mobile users is protected during the release of aggregated data.
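For reference, the composition property invoked here is the standard sequential composition result for differential privacy. The notation below (mechanisms M_1 and M_2, neighbouring data sets D and D', output set S) is introduced only for illustration and does not appear in the original text.

```latex
% If M_1 is \varepsilon_1-differentially private and M_2 is
% \varepsilon_2-differentially private, then the combined release
% M(D) = (M_1(D), M_2(D)) satisfies, for all neighbouring D, D' and all S:
\Pr[M(D) \in S] \;\le\; e^{\varepsilon_1 + \varepsilon_2} \, \Pr[M(D') \in S]
```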
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (7)

1. A track privacy protection method for mobile users based on differential privacy, characterized in that the method is applied to a server and operates in a mobile scenario with m communication base stations L = {l1, l2, ..., lm} and n users U = {u1, u2, ..., un}, where l_i denotes the position of the i-th communication base station, 1 ≤ i ≤ m, and u_j denotes the j-th user, 1 ≤ j ≤ n, the method comprising the following steps:
S1, initialization stage: when user u_j enters the coverage area of communication base station l_i, the server of the communication operator records the user's relevant information: the anonymized user ID number, the position information of the accessed communication base station, and the access time t;
S2, data aggregation stage: the server statistically aggregates the user information collected over a period of time at a given time interval, extracts the position information of the communication base station most frequently accessed by each user in each time interval, and calculates the total number of users in each communication base station in that time interval
Figure FDA0002546640580000011
Figure FDA0002546640580000012
represents the total number of users covered by communication base station l_i at time t, 1 ≤ i ≤ m;
S3, data perturbation stage: time is divided into a daytime period, an evening period and a late-night period according to human movement characteristics; from
Figure FDA0002546640580000013
the distribution of the number of users in each base station in each time period is calculated
Figure FDA0002546640580000014
and a differential privacy mechanism is introduced so that the data in these three time periods
Figure FDA0002546640580000015
undergo different perturbation processing;
S4, data release stage: the communication operator takes the processed data
Figure FDA0002546640580000016
and releases it;
the data aggregation stage of step S2 specifically includes the following operation flows:
(1) time period division: time is divided into intervals according to the required resolution of the user movement track; the larger the time interval, the lower the resolution of the user movement track, and the smaller the time interval, the more accurate the user movement track;
(2) position extraction: after the time intervals are divided, a user may have one or more access records at different communication base stations within a given time interval; in that case, the base station most frequently accessed by the user is extracted as the position visited by the user in that time period;
(3) data statistics: after step (2), each user is associated with only one communication base station under each timestamp, so the number of users in the coverage area of each base station (l1, l2, ..., lm) under timestamp t is calculated
Figure FDA0002546640580000021
thereby obtaining the distribution of the number of users in the coverage areas of the communication base stations in a given region, that is, the data set under real-time conditions
Figure FDA0002546640580000022
The data perturbation stage of step S3 specifically includes:
(1) in the daytime period, from 7 am to 6 pm, the base station visited by a user is continuously updated and changing; the numbers of users in the data sets D_{t-1} and D_t at adjacent times differ, so for the data set D_t at each time the Laplace mechanism satisfying ε-differential privacy is applied directly, adding perturbation noise that follows the Laplace distribution
Figure FDA0002546640580000023
(2) in the evening time period, set from 6 pm to 11 pm, the base station visited by a user is updated more slowly; the user-count distributions of the data sets D_{t-1} and D_t at adjacent times still differ, but the difference may be large or small;
(3) in the late-night time period, from 11 pm to 7 am, the base station visited by a user does not change, and the user distributions of the data sets D_{t-1} and D_t at adjacent times are almost unchanged; because a user's track privacy can be exposed only while the user is moving, the original data is published directly in this period.
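As an illustration of the three-period split, the sketch below maps an hour of the day to a period. The half-open boundaries (07:00 to 18:00, 18:00 to 23:00) and the treatment of late night as the remaining hours wrapping past midnight (23:00 to 07:00) are assumptions, as are the function name and the period labels.

```python
def time_period(hour):
    """Classify an hour of the day (0-23) into one of the three periods."""
    if not 0 <= hour < 24:
        raise ValueError("hour must be in the range 0..23")
    if 7 <= hour < 18:
        return "daytime"      # perturb every data set directly
    if 18 <= hour < 23:
        return "evening"      # judge against the threshold before perturbing
    return "late_night"       # publish the original data

print([time_period(h) for h in (6, 7, 17, 18, 22, 23)])
# ['late_night', 'daytime', 'daytime', 'evening', 'evening', 'late_night']
```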
2. The differential privacy-based track privacy protection method for mobile users according to claim 1, wherein, to ensure that the total number of users n does not change, only the first m-1 base stations are perturbed in step (1); the noise vector introduced for the first m-1 base stations is Noise_{m-1} = (x1, x2, ..., x_{m-1}), and the noise added to the m-th base station is the negative of the sum of the noise added to the first m-1 base stations,
Figure FDA0002546640580000024
3. The differential privacy-based track privacy protection method for mobile users according to claim 1, wherein in step (2): since the user distribution difference between the data sets D_{t-1} and D_t at adjacent times may be large or small, it is only necessary to calculate the distribution difference between the data set D_t at the current time and the data set D_{t-1} released at the previous time, compare it with a set fixed threshold T, and judge whether to perturb the current data set; thus only a part of the data sets D_t needs to be perturbed, and the number of data sets D requiring perturbation is set as C.
4. The differential privacy-based track privacy protection method for mobile users according to claim 1, wherein the following perturbation method is applied in the evening time period of step (2), comprising the following steps:
(1) privacy budget allocation: before the data set D is perturbed, the privacy budget ε is first allocated so that differential privacy is satisfied throughout the data perturbation stage; let ε = ε1 + ε2, where ε1 is the privacy budget used in the judging process, with ε1 = kε, and ε2 is the privacy budget used in perturbing the data set D;
(2) judging stage: in this process, the Laplace mechanism satisfying differential privacy is used to add a perturbation noise vector that follows the Laplace distribution
Figure FDA0002546640580000031
to calculate the noisy threshold
Figure FDA0002546640580000032
Figure FDA0002546640580000033
together with, between the data set at the current time
Figure FDA0002546640580000034
and the data set released at the previous time
Figure FDA0002546640580000035
the noisy value of the distribution difference
Figure FDA0002546640580000036
(3) perturbation stage: for
Figure FDA0002546640580000037
a privacy budget of
Figure FDA0002546640580000038
is introduced, and the Laplace mechanism satisfying differential privacy is used to add perturbation noise that follows the Laplace distribution
Figure FDA0002546640580000039
Noise_{m-1} = (x1, x2, ..., x_{m-1}),
Figure FDA00025466405800000310
and the noisy data set is computed as
Figure FDA00025466405800000311
5. The differential privacy-based track privacy protection method for mobile users according to claim 4, wherein in the judging stage of step (2): when
Figure FDA00025466405800000312
then
Figure FDA00025466405800000313
is perturbed; otherwise
Figure FDA0002546640580000041
is used in place of the user location distribution at the current time, that is, the released data set is
Figure FDA0002546640580000042
6. The differential privacy-based track privacy protection method for mobile users according to claim 4, wherein in the perturbation stage of step (3): when privacy budget remains, all of the remaining privacy budget is used to add noise at the last time; when the privacy budget ε2 is exhausted, each subsequent data set is published as the last released data set.
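The budget handling in this claim can be sketched as a small tracker. This is an assumption-laden illustration: the per-perturbation share `eps2 / C` is a guess at the budget shown only in the formula image, and the class name, method name and `is_last` flag are hypothetical.

```python
class PerturbationBudget:
    """Track how much of the second-layer budget eps2 is left."""

    def __init__(self, eps2, max_perturbations):
        self.remaining = eps2
        # Assumed even split of eps2 over the C data sets expected to need noise.
        self.share = eps2 / max_perturbations

    def spend(self, is_last=False):
        """Return the budget for this release, or 0.0 if nothing is left."""
        if self.remaining <= 1e-12:
            return 0.0  # exhausted: the caller republishes the last release
        # At the last time, use up everything that remains.
        amount = self.remaining if is_last else min(self.share, self.remaining)
        self.remaining -= amount
        return amount

budget = PerturbationBudget(eps2=1.0, max_perturbations=4)
print(budget.spend())              # 0.25
print(budget.spend(is_last=True))  # 0.75
print(budget.spend())              # 0.0
```

A return value of 0.0 signals that no further noise can be added, so the previously released data set is published again.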
7. A differential privacy-based track privacy protection system for mobile users using the method of any one of claims 1 to 6, wherein the system is applied to a server and operates in a mobile scenario with m communication base stations L = {l1, l2, ..., lm} and n users U = {u1, u2, ..., un}, where l_i denotes the position of the i-th communication base station, 1 ≤ i ≤ m, and u_j denotes the j-th user, 1 ≤ j ≤ n, the system comprising the following modules:
an initialization module: configured so that, when user u_j enters the coverage area of communication base station l_i, the server of the communication operator records the user's relevant information: the anonymized user ID number, the position information of the accessed communication base station, and the access time t;
a data aggregation module: configured so that the server statistically aggregates the user information collected over a period of time at a given time interval, extracts the position information of the communication base station most frequently accessed by each user in each time period, and calculates the total number of users contained in each communication base station in that time period
Figure FDA0002546640580000043
Figure FDA0002546640580000044
represents the total number of users covered by communication base station l_i at time t, 1 ≤ i ≤ m;
a data perturbation module: configured to divide time into a daytime period, an evening period and a late-night period according to human movement characteristics, to calculate from
Figure FDA0002546640580000045
the distribution of the number of users in each base station in each time period
Figure FDA0002546640580000046
and to introduce a differential privacy mechanism so that the data in these three time periods
Figure FDA0002546640580000047
undergo different perturbation processing;
a data release module: configured for the communication operator to take the processed data
Figure FDA0002546640580000048
and release it.
CN201810916399.6A 2018-08-13 2018-08-13 Track privacy protection method and system for mobile user based on differential privacy Active CN109104696B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810916399.6A CN109104696B (en) 2018-08-13 2018-08-13 Track privacy protection method and system for mobile user based on differential privacy

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810916399.6A CN109104696B (en) 2018-08-13 2018-08-13 Track privacy protection method and system for mobile user based on differential privacy

Publications (2)

Publication Number Publication Date
CN109104696A CN109104696A (en) 2018-12-28
CN109104696B true CN109104696B (en) 2020-10-02

Family

ID=64849638

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810916399.6A Active CN109104696B (en) 2018-08-13 2018-08-13 Track privacy protection method and system for mobile user based on differential privacy

Country Status (1)

Country Link
CN (1) CN109104696B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109670341A (en) * 2018-12-29 2019-04-23 中山大学 The method for secret protection that a kind of pair of structural data and semi-structured data combine
CN110602631B (en) * 2019-06-11 2021-03-05 东华大学 Processing method and processing device for location data for resisting conjecture attack in LBS
CN110516476B (en) * 2019-08-31 2022-05-13 贵州大学 Geographical indistinguishable location privacy protection method based on frequent location classification
CN112580701B (en) * 2020-12-09 2022-07-12 哈尔滨理工大学 Mean value estimation method and device based on classification transformation disturbance mechanism
CN113207120A (en) * 2021-03-30 2021-08-03 郑州铁路职业技术学院 Differential privacy method for collecting user real-time position information in mobile crowd sensing
CN115017440B (en) * 2022-05-31 2024-05-07 湖南大学 Aggregation position data release method based on differential privacy protection

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104394509B (en) * 2014-11-21 2018-10-30 西安交通大学 A kind of efficient difference disturbance location intimacy protection system and method

Also Published As

Publication number Publication date
CN109104696A (en) 2018-12-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant