CN112818402B - Method for realizing k anonymity of track data release based on point density segmentation track - Google Patents

Method for realizing k anonymity of track data release based on point density segmentation track

Info

Publication number
CN112818402B
Authority
CN
China
Prior art keywords
track
data set
virtual
trv
set model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110213797.3A
Other languages
Chinese (zh)
Other versions
CN112818402A (en)
Inventor
徐红云
杨丰源
陆涛
余宛书
熊镔
时浩南
孙雨虹
张紫怡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN202110213797.3A priority Critical patent/CN112818402B/en
Publication of CN112818402A publication Critical patent/CN112818402A/en
Application granted granted Critical
Publication of CN112818402B publication Critical patent/CN112818402B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29Geographical information databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/906Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Remote Sensing (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for realizing k-anonymity of track data release based on point density segmentation of tracks, which comprises the following steps: 1) acquiring basic track data and establishing a track data set model; 2) establishing a DGH tree as the track loss model; 3) adding virtual points to the track data set model to generate a track data set model containing the virtual points and a virtual point mark data set model; 4) clustering the track data set model containing the virtual points, marking the clustering center to which each point belongs, and generating a mark data set model; 5) traversing the track data set model and segmenting the tracks according to the mark data set model to generate a segmented track data set model; 6) calculating the loss of the segmented track data set model with a dynamic sequence alignment algorithm, and clustering based on information loss with an iterative track k-anonymous clustering algorithm. The method segments tracks based on the point density of the track data set and reduces the information loss incurred during k-anonymization.

Description

Method for realizing k anonymity of track data release based on point density segmentation track
Technical Field
The invention relates to the technical field of track data privacy protection and release, in particular to a method for realizing k anonymity of track data release based on point density segmentation tracks.
Background
With the rapid development of technology, mobile devices have become ubiquitous, and movement trajectory data is widely collected through the cellular networks that mobile devices connect to and through the applications running on them. The rapid development of mass storage and data processing technology has made public release of track data extremely convenient.
Publicly released track data plays an important role in the work of research organizations, and releasing it is also an important way for operators, government departments and similar bodies to demonstrate transparency. However, such track data may also be exploited by malicious attackers.
In track data release, the main privacy protection target is the correspondence between sensitive data in a user's track and the individual user. To ensure that personal private data is not attacked or leaked when track data is published, organizations process the track data in various ways before release. Scholars have studied the privacy protection problem of track data extensively and proposed a number of privacy protection methods, including the following:
1) Shaham et al., in "Privacy Preserving Location Data Publishing: A Machine Learning Approach", cluster tracks with a heuristic algorithm and with the k-means algorithm respectively, propose processing track data with a regional generalization hierarchy tree, calculate the loss according to that tree, and generalize the track data to achieve k'-anonymity (the prime distinguishes this k from the k in k-means). In practical application, however, this method suffers from generalization loss at some points on the track.
2) Tamersoy et al., in "Anonymization of Longitudinal Electronic Medical Records", apply a heuristic method based on the concept of generalization to achieve k-anonymity of a data set, but the algorithm incurs a large information loss while anonymizing the data.
3) Marco et al., in "Towards Privacy-Preserving Publishing of Spatiotemporal Trajectory Data", propose the k-merge algorithm to solve the problem of effective generalization encountered when anonymizing spatiotemporal trajectory data sets, and propose a method that achieves k-anonymity based on k-merge. The method anonymizes tracks and protects user privacy from attack, but can cause a large amount of information loss.
Although the above methods can protect user privacy, they incur a large information loss in the process.
Disclosure of Invention
The invention aims to overcome the defects of the prior art, provides a method for realizing k anonymity of track data distribution based on point density segmentation tracks, solves the problem of large information loss of the existing algorithm to a certain extent, and reduces the information loss under the condition of protecting the privacy of users.
In order to realize the purpose, the technical scheme provided by the invention is as follows: a method for realizing k anonymity of track data distribution based on point density segmentation tracks comprises the following steps:
1) acquiring basic track data including longitude and latitude information of a track and a time sequence relation of a track point set, and establishing a track data set model T;
2) building a DGH (regional generalization hierarchy) tree of a track loss model by utilizing longitude and latitude information of a track;
3) adding virtual points between adjacent points of each track in the track data set model T to generate the track data set model $T_{virtual}$ containing the virtual points and a virtual point mark data set model virtual;
4) clustering the track data set model $T_{virtual}$ containing the virtual points, treated as a single point set, marking the clustering center to which each point belongs, and generating a mark data set model mark;
5) traversing each track in the track data set model T and judging, by means of the mark data set model mark, whether adjacent points of each track belong to the same clustering center; if not, segmenting the track there; if so, keeping it intact; thereby generating the segmented track data set model $T_{partition}$;
6) calculating the loss of the segmented track data set model $T_{partition}$ with a dynamic sequence alignment algorithm, clustering based on information loss with an iterative track k-anonymous clustering algorithm, and generating a k-anonymous data set to serve as the data set for data release.
In step 1), the trajectory data set model T is defined as follows:
Definition 1, track data set model: $T = [tr_1, tr_2, tr_3, \ldots, tr_n]$, where $tr_i$ is the point set of the i-th track, $i = 1, 2, 3, \ldots, n$; $tr_i = [p_1, p_2, p_3, \ldots, p_m]$, where $p_j$ is the j-th point of track $tr_i$, $j = 1, 2, 3, \ldots, m$; $p_j = (p_j.X, p_j.Y) \in tr_i$, where $p_j.X$ and $p_j.Y$ are the longitude and latitude of point $p_j$ in track $tr_i$.
In step 2), the maximum value and the minimum value of longitude and latitude are respectively solved through the longitude and latitude information of the track to obtain the area range, then the area is uniformly divided, and a track loss model DGH (regional generalized hierarchy) tree is established, which comprises the following steps:
2.1) respectively solving the maximum value and the minimum value of the longitude and the latitude through the longitude and latitude information of the track, and defining a rectangular area P;
2.2) for the rectangular region P, dividing the rectangular region P into Nx sections with equal length in the transverse direction and Ny sections with equal length in the longitudinal direction;
2.3) respectively establishing transverse and longitudinal DGH (regional generalized hierarchical) trees through the Nx, the Ny and the track data set model T;
wherein, the DGH (regional generalization hierarchy) tree is defined as follows:
definition 2, regional generalized hierarchical tree: the position attribute in the map is divided into a plurality of equally-long cells, the cells are used as leaf nodes to establish a full binary tree, and if the number of the leaf nodes is not enough to fill the bottom layer of the binary tree, a plurality of invalid points are added for filling.
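Definition 2 can be illustrated for a single axis with a short Python sketch. It is only an illustration under stated assumptions: the function names `dgh_height`, `cell_index` and `ancestor` are hypothetical, and the full binary tree is kept implicit (a node is identified by its level above the bottom layer and its index within that level) rather than built as an explicit structure.

```python
def dgh_height(n_cells):
    """Height of the full binary tree whose bottom layer holds n_cells leaves.

    The bottom layer is padded with invalid leaves up to the next power of
    two, as Definition 2 describes, so the height is ceil(log2(n_cells)).
    """
    return (n_cells - 1).bit_length()


def cell_index(value, lo, hi, n_cells):
    """Map a coordinate in [lo, hi] to the index of its equal-length cell."""
    width = (hi - lo) / n_cells
    idx = int((value - lo) / width)
    # Clamp so that value == hi falls into the last valid cell.
    return min(max(idx, 0), n_cells - 1)


def ancestor(leaf, levels_up):
    """Return (level, index) of the node `levels_up` layers above a leaf.

    In a full binary tree that node covers 2**levels_up bottom-layer cells.
    """
    return (levels_up, leaf >> levels_up)
```

For example, 100 cells need a tree of height 7 whose bottom layer has 128 slots, 28 of them invalid padding.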
In step 3), each track in the track data set model T is traversed and virtual points are added between each pair of adjacent points in the track, generating the track data set model $T_{virtual}$ containing the virtual points and the virtual point mark data set model virtual; the track data set model T, the track data set model $T_{virtual}$ containing virtual points, the virtual point mark data set model virtual and the virtual points are defined as follows:
Definition 3, track data set model: $T = [tr_1, tr_2, tr_3, \ldots, tr_n]$, where $tr_i$ is the point set of the i-th track, $i = 1, 2, 3, \ldots, n$; $tr_i = [p_1, p_2, p_3, \ldots, p_m]$, where $p_j$ is the j-th point of track $tr_i$, $j = 1, 2, 3, \ldots, m$; $p_j = (p_j.X, p_j.Y) \in tr_i$, where $p_j.X$ and $p_j.Y$ are the longitude and latitude of point $p_j$ in track $tr_i$;
Definition 4, track data set model containing virtual points: $T_{virtual} = [trv_1, trv_2, trv_3, \ldots, trv_n]$, where $trv_i$ is the point set of the i-th track containing virtual points; $trv_i = [pv_1, pv_2, pv_3, \ldots, pv_m]$, where $pv_j$ is the j-th point of track $trv_i$; $pv_j = (pv_j.X, pv_j.Y) \in trv_i$, where $pv_j.X$ and $pv_j.Y$ are the longitude and latitude of point $pv_j$ in track $trv_i$;
Definition 5, virtual point mark data set model: $virtual = [vir_1, vir_2, vir_3, \ldots, vir_n]$, where $vir_i$ is the virtual point mark list corresponding to the i-th track of $T_{virtual}$; $vir_i = [q_1, q_2, q_3, \ldots, q_m]$, where $q_j$ is the j-th entry of $vir_i$; $q_j = 0$ denotes a real point and $q_j = 1$ denotes a virtual point; virtual and $T_{virtual}$ correspond one-to-one by position: $vir_i$ corresponds to $trv_i$ and $q_j$ corresponds to $pv_j$, where $q_j \in vir_i$ and $pv_j \in trv_i$ are the j-th entries of $vir_i$ and $trv_i$ respectively;
Definition 6, virtual point: for the line segment between two adjacent points of a track, starting from one endpoint, a virtual point is added at every fixed distance along the segment, so that segments of different lengths have the same influence on the point density of the area they pass through.
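The virtual point construction of Definitions 4 to 6 can be sketched as follows. This is a minimal sketch under assumptions not stated in the text: the name `add_virtual_points` is hypothetical, the fixed distance `step` is measured as plain Euclidean distance in coordinate units, and no virtual point is emitted where it would coincide with a real endpoint.

```python
import math


def add_virtual_points(track, step):
    """Return (points, flags) for one track.

    points: the original points with a virtual point inserted every `step`
            units along each segment (one entry trv_i of T_virtual).
    flags:  0 for a real point, 1 for a virtual point (one entry vir_i).
    """
    if step <= 0:
        raise ValueError("step must be positive")
    if not track:
        return [], []
    points, flags = [track[0]], [0]
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        k = 1
        # Walk along the segment; stop before reaching the real endpoint.
        while k * step < length:
            t = k * step / length
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            flags.append(1)
            k += 1
        points.append((x1, y1))
        flags.append(0)
    return points, flags
```

A segment of length 3 with `step = 1` receives two virtual points, while a segment of length 1 receives none, so longer segments contribute proportionally more points to the density estimate.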
In step 4), the track data set model $T_{virtual}$ containing virtual points is clustered as a single point set, the generated clustering centers are numbered, and the mark data set model mark records the number of the clustering center to which each point belongs, comprising the following steps:
4.1) treating the track data set model $T_{virtual}$ containing virtual points as a single point set and clustering it to generate clustering centers, numbering the clustering centers, and recording the number of the clustering center corresponding to each point in $T_{virtual}$;
4.2) traversing the track data set model $T_{virtual}$ containing virtual points, judging through the virtual point mark data set model virtual whether each point is a virtual point, and, for each real point, recording in the mark data set model mark the number of the clustering center to which it belongs; the track data set model $T_{virtual}$ containing virtual points and the mark data set model mark are defined as follows:
Definition 7, track data set model containing virtual points: $T_{virtual} = [trv_1, trv_2, trv_3, \ldots, trv_n]$, where $trv_i$ is the point set of the i-th track containing virtual points, $i = 1, 2, 3, \ldots, n$; $trv_i = [pv_1, pv_2, pv_3, \ldots, pv_m]$, where $pv_j$ is the j-th point of track $trv_i$, $j = 1, 2, 3, \ldots, m$; $pv_j = (pv_j.X, pv_j.Y) \in trv_i$, where $pv_j.X$ and $pv_j.Y$ are the longitude and latitude of point $pv_j$ in track $trv_i$;
Definition 8, mark data set model: $mark = [mar_1, mar_2, mar_3, \ldots, mar_n]$, where $mar_i$ is the mark list corresponding to the i-th track; $mar_i = [z_1, z_2, z_3, \ldots, z_m]$, where $z_j$ is the j-th entry of $mar_i$; mark and $T_{virtual}$ correspond one-to-one by position: $mar_i$ corresponds to $trv_i$ and $z_j$ corresponds to $pv_j$, where $z_j \in mar_i$ and $pv_j \in trv_i$ are the j-th entries of $mar_i$ and $trv_i$ respectively.
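Step 4.2 can be sketched independently of the clustering algorithm, which the text does not name. The sketch below assumes that some clustering of the whole point set has already produced one cluster center number per point of $T_{virtual}$ (the hypothetical input `labels`), and it keeps only the numbers of real points so that the result lines up with the original tracks in T; both the function name `build_mark` and that filtering choice are assumptions.

```python
def build_mark(labels, virtual):
    """Keep the cluster center number of every real point.

    labels:  per-track lists of cluster center numbers, one per point of
             T_virtual (output of any clustering over the whole point set).
    virtual: per-track lists of flags, 0 = real point, 1 = virtual point.
    """
    return [
        [z for z, q in zip(labels_i, vir_i) if q == 0]
        for labels_i, vir_i in zip(labels, virtual)
    ]
```

With `labels = [[27, 27, 8, 8, 33]]` and `virtual = [[0, 1, 1, 0, 0]]`, the three real points keep the numbers 27, 8 and 33.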
In step 5), the track data set model T and the mark data set model mark are combined to judge whether the cluster center numbers of adjacent points of each track in T are the same; if the numbers differ, the track is segmented between the two points, and if they are the same, the track is kept unchanged, generating the segmented track data set model $T_{partition}$; the track data set model T, the mark data set model mark and the segmented track data set model $T_{partition}$ are defined as follows:
Definition 9, track data set model: $T = [tr_1, tr_2, tr_3, \ldots, tr_n]$, where $tr_i$ is the point set of the i-th track, $i = 1, 2, 3, \ldots, n$; $tr_i = [p_1, p_2, p_3, \ldots, p_m]$, where $p_j$ is the j-th point of track $tr_i$, $j = 1, 2, 3, \ldots, m$; $p_j = (p_j.X, p_j.Y) \in tr_i$, where $p_j.X$ and $p_j.Y$ are the longitude and latitude of point $p_j$ in track $tr_i$;
Definition 10, mark data set model: $mark = [mar_1, mar_2, mar_3, \ldots, mar_n]$, where $mar_i$ is the mark list corresponding to the i-th track; $mar_i = [z_1, z_2, z_3, \ldots, z_m]$, where $z_j$ is the j-th entry of $mar_i$; mark and $T_{virtual}$ correspond one-to-one by position: $mar_i$ corresponds to $trv_i$ and $z_j$ corresponds to $pv_j$, where $z_j \in mar_i$ and $pv_j \in trv_i$ are the j-th entries of $mar_i$ and $trv_i$ respectively; the track data set model $T_{virtual}$ containing virtual points is defined as follows:
Definition 11, track data set model containing virtual points: $T_{virtual} = [trv_1, trv_2, trv_3, \ldots, trv_n]$, where $trv_i$ is the point set of the i-th track containing virtual points; $trv_i = [pv_1, pv_2, pv_3, \ldots, pv_m]$, where $pv_j$ is the j-th point of track $trv_i$; $pv_j = (pv_j.X, pv_j.Y) \in trv_i$, where $pv_j.X$ and $pv_j.Y$ are the longitude and latitude of point $pv_j$ in track $trv_i$;
Definition 12, segmented track data set model: $T_{partition} = [trp_1, trp_2, trp_3, \ldots, trp_n]$, where $trp_i$ is the point set of the i-th segmented track; $trp_i = [pp_1, pp_2, pp_3, \ldots, pp_m]$, where $pp_j$ is the j-th point of track $trp_i$; $pp_j = (pp_j.X, pp_j.Y) \in trp_i$, where $pp_j.X$ and $pp_j.Y$ are the longitude and latitude of point $pp_j$ in track $trp_i$.
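The segmentation rule of step 5 can be sketched as below. The sketch assumes that each mark list has already been restricted to the real points of its track, so that `mark[i][j]` is the cluster center number of `T[i][j]`; the function name `split_by_mark` is hypothetical.

```python
def split_by_mark(T, mark):
    """Cut each track wherever two adjacent points have different
    cluster center numbers, and collect all resulting segments."""
    T_partition = []
    for tr, mar in zip(T, mark):
        if not tr:
            continue
        segment = [tr[0]]
        for prev_z, z, point in zip(mar, mar[1:], tr[1:]):
            if z != prev_z:
                # Adjacent points belong to different centers: close the
                # current segment and start a new one at this point.
                T_partition.append(segment)
                segment = []
            segment.append(point)
        T_partition.append(segment)
    return T_partition
```

A track whose points carry the numbers `[27, 27, 8, 8]` is cut once, between the second and third points, giving two segments of two points each.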
In step 6), the tracks of the segmented track data set model $T_{partition}$ are clustered with the iterative track k-anonymous clustering algorithm; the segmented track data set model $T_{partition}$, the information loss, the dynamic sequence alignment algorithm, the progressive sequence alignment algorithm and the iterative track k-anonymous clustering algorithm are defined as follows:
Definition 13, segmented track data set model: $T_{partition} = [trp_1, trp_2, trp_3, \ldots, trp_n]$, where $trp_i$ is the point set of the i-th segmented track, $i = 1, 2, 3, \ldots, n$; $trp_i = [pp_1, pp_2, pp_3, \ldots, pp_m]$, where $pp_j$ is the j-th point of track $trp_i$, $j = 1, 2, 3, \ldots, m$; $pp_j = (pp_j.X, pp_j.Y) \in trp_i$, where $pp_j.X$ and $pp_j.Y$ are the longitude and latitude of point $pp_j$ in track $trp_i$;
Definition 14, information loss: the loss incurred when a node is generalized to its parent node or a higher-level node; the information loss of generalizing $node_j$ to $node_i$ is calculated as:
$$Loss(node_i, node_j) = \log_2(LF(node_i)) - \log_2(LF(node_j)) \quad (1)$$
where $Loss(node_i, node_j)$ is the information loss caused by generalizing $node_j$ to $node_i$, $node_i$ and $node_j$ are the numbers of the two nodes, and the function $LF()$ returns the number of bottom-layer leaf nodes owned by a node;
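Formula (1) becomes easy to evaluate once the tree is a full binary tree, because a node that sits h layers above the bottom owns exactly 2^h leaves. The sketch below relies on that property; identifying a node only by its level above the bottom layer, and the names `leaf_count` and `loss`, are assumptions made for illustration.

```python
import math


def leaf_count(level):
    """LF() for a node `level` layers above the bottom of a full binary tree."""
    return 2 ** level


def loss(level_i, level_j):
    """Formula (1): log2(LF(node_i)) - log2(LF(node_j)).

    level_i is the level of the more general node node_i, level_j the level
    of the node node_j being generalized.
    """
    return math.log2(leaf_count(level_i)) - math.log2(leaf_count(level_j))
```

Under this assumption the loss is simply the number of layers climbed: generalizing a leaf (level 0) to a node three layers up costs 3, and generalizing a node to itself costs 0.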
Definition 15, dynamic sequence alignment algorithm: applied to any two tracks A and B, where A has length a and B has length b; dynamic programming is used, with the recurrence equation:
$$dp[i][j] = \min\bigl(dp[i-1][j-1] + Loss(node_i, node_j),\; dp[i-1][j] + Loss(node_i, node_{root}),\; dp[i][j-1] + Loss(node_j, node_{root})\bigr) \quad (2)$$
where $node_i$ and $node_j$ are the numbers of two nodes, $node_{root}$ is the root node, and $dp[][]$ is a two-dimensional matrix of size $(a+1) \times (b+1)$; $dp[i][j]$ is the entry in row $i+1$, column $j+1$ of $dp[][]$, $dp[i-1][j-1]$ is the entry in row $i$, column $j$, $dp[i-1][j]$ is the entry in row $i$, column $j+1$, and $dp[i][j-1]$ is the entry in row $i+1$, column $j$; the recurrence equation yields the $(a+1) \times (b+1)$ sequence alignment loss matrix $dp[][]$, and backtracking through this matrix finds the strategy that minimizes the loss of merging the two tracks and generates the merged track;
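The recurrence of Definition 15 can be sketched as a standard dynamic programming table. Several choices below are assumptions made for illustration and are not taken from the text: each track is reduced to a sequence of leaf-cell indices on a single axis, the cost of matching two cells is the height of their lowest common ancestor, the cost of leaving a point unmatched is the tree height H (a leaf generalized to the root), and the first row and column are initialized with accumulated unmatched costs. The names `lca_height` and `align_loss` are hypothetical, and the backtracking step is omitted.

```python
def lca_height(a, b):
    """Height of the lowest common ancestor of leaf cells a and b
    in a full binary tree with heap-style leaf numbering."""
    h = 0
    while a != b:
        a >>= 1
        b >>= 1
        h += 1
    return h


def align_loss(A, B, H):
    """Fill the (a+1) x (b+1) sequence alignment loss matrix dp."""
    a, b = len(A), len(B)
    dp = [[0] * (b + 1) for _ in range(a + 1)]
    for i in range(1, a + 1):
        dp[i][0] = dp[i - 1][0] + H  # prefix of A left unmatched
    for j in range(1, b + 1):
        dp[0][j] = dp[0][j - 1] + H  # prefix of B left unmatched
    for i in range(1, a + 1):
        for j in range(1, b + 1):
            dp[i][j] = min(
                dp[i - 1][j - 1] + lca_height(A[i - 1], B[j - 1]),  # match
                dp[i - 1][j] + H,  # point of A generalized to the root
                dp[i][j - 1] + H,  # point of B generalized to the root
            )
    return dp
```

The bottom-right entry `dp[a][b]` is the minimum total loss of merging the two tracks under these assumptions; identical tracks align at zero loss.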
Definition 16, progressive sequence alignment algorithm (Progressive Sequence Alignment, PSA): the longest track in a group of tracks is selected as the base track; the remaining tracks are then selected from the group in any order, each exactly once, and merged with the base track by dynamic sequence alignment, the merged track becoming the new base track each time;
Definition 17, iterative track k-anonymous clustering algorithm (Iterative track Clustering): to achieve k-anonymity of tracks, the number of clusters to generate is first determined from the value of k and empty clusters are created; each cluster is then visited in turn and the following operations are performed: a track is randomly drawn from the track set and placed in the cluster as its first track; the dynamic sequence alignment algorithm is run between this track and each remaining track to calculate the information loss of aligning them; and the k-1 tracks with the smallest information loss are placed in the cluster; after tracks have been added to all clusters, the final generalization loss of each track cluster is calculated by the progressive sequence alignment algorithm, and the k-anonymous track data set is generated.
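The cluster-filling loop of Definition 17 can be sketched as follows. This is a sketch under assumptions rather than the method as claimed: the number of clusters is taken to be the number of tracks divided by k (integer division), tracks left over are returned separately because the text does not say how they are handled, the callable `pair_loss` stands in for the dynamic sequence alignment loss between two tracks, and the final progressive sequence alignment step is omitted. The function name and the `seed` parameter are hypothetical.

```python
import random


def iterative_k_clustering(tracks, k, pair_loss, seed=0):
    """Greedily build clusters of k tracks each.

    Returns (clusters, leftovers), both as lists of track indices.
    """
    rng = random.Random(seed)
    remaining = list(range(len(tracks)))
    clusters = []
    for _ in range(len(tracks) // k):
        # Randomly draw the first track of this cluster.
        first = remaining.pop(rng.randrange(len(remaining)))
        # Rank the other tracks by their alignment loss against it.
        remaining.sort(key=lambda t: pair_loss(tracks[first], tracks[t]))
        # The k-1 tracks with the smallest loss join the cluster.
        clusters.append([first] + remaining[: k - 1])
        remaining = remaining[k - 1:]
    return clusters, remaining
```

Because each cluster ends up with exactly k tracks, every merged track released from a cluster represents at least k original tracks.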
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. For the first time, the invention segments individual tracks based on the overall point density of the data set, and can effectively integrate data sets from different sources.
2. The invention reduces unnecessary information loss in the track anonymization process by cutting the long tracks and zigzag tracks in the data set.
3. By segmenting tracks based on point density, the invention addresses the low availability and low anonymization efficiency of anonymized medium-length and long track data.
4. By segmenting the track data set through analysis of point density, the method prevents tracks of greatly differing lengths from forming a cluster during track clustering, and significantly reduces the running time of the clustering process.
5. The method is widely applicable in the field of data publishing, adapts well to data sets from different sources, offers high availability and short running time, and has broad prospects in the field of track privacy protection.
Drawings
FIG. 1 is a logic flow diagram of the method of the present invention.
FIG. 2(a) is a flow chart of the data preprocessing of the present invention.
Fig. 2(b) is a flow chart of the invention for building a DGH (regional generalized hierarchical) tree.
FIG. 2(c) is a flow chart showing the pretreatment process of the present invention.
FIG. 3 is a track clustering flow chart of the present invention.
FIG. 4 is a road network model graph constructed by experimentally selected trajectories according to the present invention.
FIG. 5 is a graph of the results of the segmentation of the k-anonymous data set from the experiments of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
As shown in fig. 1 to fig. 3, the method of this embodiment for realizing k-anonymity of track data release based on point density segmentation of tracks uses auxiliary devices such as mobile phone application software, a vehicle-mounted signal machine, a road surface positioning signal machine, and a cloud server, and includes the following steps:
1) acquiring basic track data including longitude and latitude information of a track and a time sequence relation of a track point set, and establishing a track data set model T; the track data consists of 270 user movement tracks obtained from the Geolife data set and clipped to a 1 km × 1 km area of Beijing (longitude 116.300000° to 116.316000°, latitude 39.989500° to 40.000000°); the road network model formed by these 270 tracks is shown in fig. 4.
The trajectory dataset model T is defined as follows:
Definition 1, track data set model: $T = [tr_1, tr_2, tr_3, \ldots, tr_n]$, where $tr_i$ is the point set of the i-th track, $i = 1, 2, 3, \ldots, n$; $tr_i = [p_1, p_2, p_3, \ldots, p_m]$, where $p_j$ is the j-th point of track $tr_i$, $j = 1, 2, 3, \ldots, m$; $p_j = (p_j.X, p_j.Y) \in tr_i$, where $p_j.X$ and $p_j.Y$ are the longitude and latitude of point $p_j$ in track $tr_i$.
2) Building a DGH (regional generalization hierarchy) tree of a track loss model by utilizing longitude and latitude information of a track;
respectively solving the maximum value and the minimum value of longitude and latitude through the longitude and latitude information of the track to obtain the area range, then uniformly dividing the area and establishing a track loss model DGH (regional generalized hierarchy) tree, comprising the following steps:
2.1) respectively solving the maximum value and the minimum value of the longitude and the latitude through the longitude and latitude information of the track, and defining a rectangular area P;
2.2) for the rectangular region P, dividing the rectangular region P into Nx sections with equal length in the transverse direction and dividing the rectangular region P into Ny sections with equal length in the longitudinal direction;
2.3) respectively establishing transverse and longitudinal DGH (regional generalized hierarchical) trees by the Nx, Ny and the track data set model T;
wherein, the DGH (regional generalized hierarchical) tree is defined as follows:
definition 2, regional generalization hierarchical tree: the position attribute in the map is divided into a plurality of equally-long cells, the cells are used as leaf nodes to establish a full binary tree, and if the number of the leaf nodes is not enough to fill the bottom layer of the binary tree, a plurality of invalid points are added for filling;
According to the selected number of segmentation areas N and the longitude and latitude ranges $x_1 \sim x_2$ and $y_1 \sim y_2$, the height H of the DGH tree and the area sizes $d_x$ and $d_y$ are calculated:
$$H = \lceil \log_2 N \rceil$$
$$d_x = \frac{x_2 - x_1}{N}$$
$$d_y = \frac{y_2 - y_1}{N}$$
Following these steps with N = 100, longitude $x_1$ = 116.300000°, $x_2$ = 116.316000° and latitude $y_1$ = 39.989500°, $y_2$ = 40.000000° gives H = 7, $d_x$ = 0.000160, $d_y$ = 0.000105.
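The arithmetic of this embodiment can be checked with a few lines of Python. The sketch assumes that the tree height is the base-2 logarithm of N rounded up (so that 100 cells fit in a bottom layer of 128) and that each cell size is the coordinate range divided by N; the variable names `lon_cell` and `lat_cell` are hypothetical.

```python
N = 100
x1, x2 = 116.300000, 116.316000  # longitude range in degrees
y1, y2 = 39.989500, 40.000000    # latitude range in degrees

# Smallest H with 2**H >= N, i.e. ceil(log2(N)) without float rounding.
H = (N - 1).bit_length()

lon_cell = (x2 - x1) / N  # cell size along the longitude axis
lat_cell = (y2 - y1) / N  # cell size along the latitude axis

print(H, round(lon_cell, 6), round(lat_cell, 6))
```

The longitude range of 0.016° divides into cells of 0.000160° and the latitude range of 0.0105° into cells of 0.000105°, with a tree height of 7.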
3) Adding virtual points between adjacent points of each track in the track data set model T to generate the track data set model $T_{virtual}$ containing the virtual points and a virtual point mark data set model virtual;
Each track in the track data set model T is traversed and virtual points are added between each pair of adjacent points in the track, generating the track data set model $T_{virtual}$ containing the virtual points and the virtual point mark data set model virtual; the track data set model T, the track data set model $T_{virtual}$ containing virtual points, the virtual point mark data set model virtual and the virtual points are defined as follows:
Definition 3, track data set model: $T = [tr_1, tr_2, tr_3, \ldots, tr_n]$, where $tr_i$ is the point set of the i-th track, $i = 1, 2, 3, \ldots, n$; $tr_i = [p_1, p_2, p_3, \ldots, p_m]$, where $p_j$ is the j-th point of track $tr_i$, $j = 1, 2, 3, \ldots, m$; $p_j = (p_j.X, p_j.Y) \in tr_i$, where $p_j.X$ and $p_j.Y$ are the longitude and latitude of point $p_j$ in track $tr_i$;
Definition 4, track data set model containing virtual points: $T_{virtual} = [trv_1, trv_2, trv_3, \ldots, trv_n]$, where $trv_i$ is the point set of the i-th track containing virtual points; $trv_i = [pv_1, pv_2, pv_3, \ldots, pv_m]$, where $pv_j$ is the j-th point of track $trv_i$; $pv_j = (pv_j.X, pv_j.Y) \in trv_i$, where $pv_j.X$ and $pv_j.Y$ are the longitude and latitude of point $pv_j$ in track $trv_i$;
Definition 5, virtual point mark data set model: $virtual = [vir_1, vir_2, vir_3, \ldots, vir_n]$, where $vir_i$ is the virtual point mark list corresponding to the i-th track of $T_{virtual}$; $vir_i = [q_1, q_2, q_3, \ldots, q_m]$, where $q_j$ is the j-th entry of $vir_i$; $q_j = 0$ denotes a real point and $q_j = 1$ denotes a virtual point; virtual and $T_{virtual}$ correspond one-to-one by position, e.g. $vir_i$ corresponds to $trv_i$ and $q_j$ corresponds to $pv_j$, where $q_j \in vir_i$ and $pv_j \in trv_i$ are the j-th entries of $vir_i$ and $trv_i$ respectively;
Definition 6, virtual point: for the line segment between two adjacent points of a track, starting from one endpoint, a virtual point is added at every fixed distance along the segment, so that segments of different lengths have the same influence on the point density of the area they pass through;
In this embodiment, the virtual obtained from the experiment is [[0,1,1,1,0,1,1,0,1,1,0,1,1,0,1,0, ...], ...].
4) Cluster the track data set model $T_{virtual}$ containing virtual points, treating all of its points as a single point set, mark the cluster center of each point, and generate the mark data set model mark;
The track data set model $T_{virtual}$ containing virtual points is clustered as a single point set, the generated cluster centers are numbered, and the mark data set model mark records the number of the cluster center to which each point belongs, comprising the following steps:
4.1) clustering the track data set model $T_{virtual}$ containing virtual points as a single point set to generate cluster centers, numbering the cluster centers, and recording the number of the cluster center corresponding to each point in $T_{virtual}$;
4.2) traversing the track data set model $T_{virtual}$ containing virtual points, determining through the virtual point marker data set model virtual whether each point is a virtual point, and, for each real point, recording in the mark data set model mark the number of the cluster center to which it belongs; the track data set model $T_{virtual}$ containing virtual points and the mark data set model mark are defined as follows:
Definition 7, track data set model containing virtual points: $T_{virtual} = [trv_1, trv_2, trv_3, \ldots, trv_n]$, where $trv_i$ is the point set of the i-th track containing virtual points, $i = 1, 2, 3, \ldots, n$; $trv_i = [pv_1, pv_2, pv_3, \ldots, pv_m]$, where $pv_j$ is the j-th point of track $trv_i$, $j = 1, 2, 3, \ldots, m$; $pv_j = (pv_j.X, pv_j.Y) \in trv_i$, where $pv_j.X$ and $pv_j.Y$ are the longitude and latitude of point $pv_j$ in track $trv_i$;
Definition 8, mark data set model: $mark = [mar_1, mar_2, mar_3, \ldots, mar_n]$, where $mar_i$ is the mark list in the mark data set model mark corresponding to the i-th track containing virtual points; $mar_i = [z_1, z_2, z_3, \ldots, z_m]$, where $z_j$ is the j-th entry of $mar_i$; the mark data set model mark and the track data set model $T_{virtual}$ containing virtual points correspond one-to-one by position, e.g. $mar_i$ corresponds to $trv_i$ and $z_j$ corresponds to $pv_j$, where $z_j \in mar_i$, $pv_j \in trv_i$, and $z_j$ and $pv_j$ are the j-th entries of $mar_i$ and $trv_i$ respectively;
In this embodiment, the mark obtained from the experiment is [[27, 27, 8, 8, 33, 33, 33, 33, 33, 33, 33, ……], ……].
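Steps 4.1) and 4.2) can be sketched as below. The passage here does not name the clustering algorithm, so the sketch takes it as a caller-supplied function `cluster_fn` (hypothetical) that maps a flat list of points to one cluster number per point. The names `label_tracks` and `build_mark` are likewise illustrative, and the sketch assumes, following step 4.2), that mark keeps one entry per real point.

```python
def label_tracks(T_virtual, cluster_fn):
    """Step 4.1: cluster all points of T_virtual as one point set.

    cluster_fn is an assumed callable: list of (x, y) -> list of cluster numbers.
    Returns per-track label lists aligned with T_virtual.
    """
    flat = [p for track in T_virtual for p in track]
    flat_labels = list(cluster_fn(flat))
    if len(flat_labels) != len(flat):
        raise ValueError("cluster_fn must return one label per point")
    labels, pos = [], 0
    for track in T_virtual:
        # Slice the flat label list back into per-track lists.
        labels.append(flat_labels[pos:pos + len(track)])
        pos += len(track)
    return labels


def build_mark(labels_virtual, virtual):
    """Step 4.2: keep the cluster number of each real point (flag 0) only."""
    mark = []
    for labels, flags in zip(labels_virtual, virtual):
        if len(labels) != len(flags):
            raise ValueError("labels and flags must have the same length")
        mark.append([lab for lab, q in zip(labels, flags) if q == 0])
    return mark
```

Passing the clustering step in as a function keeps the sketch independent of any particular library; any density-based or centroid-based method that returns integer labels would fit.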
5) Traverse each track in the track data set model T and use the mark data set model mark to judge whether adjacent points of the track belong to the same cluster center; if not, split the track between them, and if so, keep them together, generating the segmented track data set model $T_{partition}$;
By combining the track data set model T with the mark data set model mark, it is judged whether the cluster center numbers of adjacent points of each track in T are the same; if the numbers differ, the track is split at that position, and if they are the same, the track is left unsplit, generating the segmented track data set model $T_{partition}$; the track data set model T, the mark data set model mark and the segmented track data set model $T_{partition}$ are defined as follows:
Definition 9, track data set model: $T = [tr_1, tr_2, tr_3, \ldots, tr_n]$, where $tr_i$ is the point set of the i-th track, $i = 1, 2, 3, \ldots, n$; $tr_i = [p_1, p_2, p_3, \ldots, p_m]$, where $p_j$ is the j-th point of track $tr_i$, $j = 1, 2, 3, \ldots, m$; $p_j = (p_j.X, p_j.Y) \in tr_i$, where $p_j.X$ and $p_j.Y$ are the longitude and latitude of point $p_j$ in track $tr_i$;
Definition 10, mark data set model: $mark = [mar_1, mar_2, mar_3, \ldots, mar_n]$, where $mar_i$ is the mark list in the mark data set model mark corresponding to the i-th track containing virtual points; $mar_i = [z_1, z_2, z_3, \ldots, z_m]$, where $z_j$ is the j-th entry of $mar_i$; the mark data set model mark and the track data set model $T_{virtual}$ containing virtual points correspond one-to-one by position, e.g. $mar_i$ corresponds to $trv_i$ and $z_j$ corresponds to $pv_j$, where $z_j \in mar_i$, $pv_j \in trv_i$, and $z_j$ and $pv_j$ are the j-th entries of $mar_i$ and $trv_i$ respectively; the track data set model $T_{virtual}$ containing virtual points is defined as follows:
Definition 11, track data set model containing virtual points: $T_{virtual} = [trv_1, trv_2, trv_3, \ldots, trv_n]$, where $trv_i$ is the point set of the i-th track containing virtual points; $trv_i = [pv_1, pv_2, pv_3, \ldots, pv_m]$, where $pv_j$ is the j-th point of track $trv_i$; $pv_j = (pv_j.X, pv_j.Y) \in trv_i$, where $pv_j.X$ and $pv_j.Y$ are the longitude and latitude of point $pv_j$ in track $trv_i$;
Definition 12, segmented track data set model: $T_{partition} = [trp_1, trp_2, trp_3, \ldots, trp_n]$, where $trp_i$ is the point set of the i-th segmented track; $trp_i = [pp_1, pp_2, pp_3, \ldots, pp_m]$, where $pp_j$ is the j-th point of track $trp_i$; $pp_j = (pp_j.X, pp_j.Y) \in trp_i$, where $pp_j.X$ and $pp_j.Y$ are the longitude and latitude of point $pp_j$ in track $trp_i$;
In this embodiment, the $T_{partition}$ obtained from the experiment is [[27, 27], [8, 8], [33, 33, 33, 33, 33, 33, 33, ……], ……].
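The splitting rule of step 5) can be sketched as follows: a track is cut wherever two adjacent points carry different cluster center numbers. The function names `split_track` and `partition_dataset` are illustrative, and the sketch assumes each track in T has exactly one mark entry per point.

```python
def split_track(track, labels):
    """Split one track wherever adjacent points have different cluster numbers.

    track:  list of points (any objects) in time order.
    labels: cluster center number of each point, same length as track.
    Returns a list of sub-tracks; every sub-track lies within one cluster.
    """
    if len(track) != len(labels):
        raise ValueError("track and labels must have the same length")
    segments, start = [], 0
    for i in range(1, len(track)):
        if labels[i] != labels[i - 1]:
            # Cluster number changes between point i-1 and point i: cut here.
            segments.append(track[start:i])
            start = i
    if track:
        segments.append(track[start:])
    return segments


def partition_dataset(T, mark):
    """Apply split_track to every track and collect all sub-tracks."""
    T_partition = []
    for track, labels in zip(T, mark):
        T_partition.extend(split_track(track, labels))
    return T_partition
```

Applied to the cluster numbers themselves, a list such as [27, 27, 8, 8, 33, 33, 33] is split into [27, 27], [8, 8] and [33, 33, 33], which matches the grouping shown in the experimental example.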
6) For the segmented track data set model $T_{partition}$, compute the loss with the dynamic sequence alignment algorithm, cluster the tracks based on information loss with the iterative track k-anonymous clustering algorithm, and generate a k-anonymous data set to be used as the data set for release; the k-anonymous data set is shown in Figure 5, where the tracks in the same color region form one k-anonymous set for track data release.
The tracks in the segmented track data set model $T_{partition}$ are clustered using the iterative track k-anonymous clustering algorithm; the segmented track data set model $T_{partition}$, the information loss, the dynamic sequence alignment algorithm, the progressive sequence alignment algorithm and the iterative track k-anonymous clustering algorithm are defined as follows:
Definition 13, segmented track data set model: $T_{partition} = [trp_1, trp_2, trp_3, \ldots, trp_n]$, where $trp_i$ is the point set of the i-th segmented track, $i = 1, 2, 3, \ldots, n$; $trp_i = [pp_1, pp_2, pp_3, \ldots, pp_m]$, where $pp_j$ is the j-th point of track $trp_i$, $j = 1, 2, 3, \ldots, m$; $pp_j = (pp_j.X, pp_j.Y) \in trp_i$, where $pp_j.X$ and $pp_j.Y$ are the longitude and latitude of point $pp_j$ in track $trp_i$;
Definition 14, information loss: the loss incurred when a node is generalized to its parent node or to a higher-level node; the information loss of generalizing $node_j$ to $node_i$ is computed as:
$$\mathrm{Loss}(node_i, node_j) = \log_2(\mathrm{LF}(node_i)) - \log_2(\mathrm{LF}(node_j)) \tag{1}$$
where $\mathrm{Loss}(node_i, node_j)$ is the information loss caused by generalizing $node_j$ to $node_i$, $node_i$ and $node_j$ are the numbers of two nodes, and the function LF() returns the number of bottom-level leaf nodes owned by a node;
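Equation (1) can be illustrated on a full binary tree. The sketch below assumes a heap-style node numbering (root = 1, children of node v are 2v and 2v + 1) and a known tree `height`; this numbering and the helper names `leaf_count` and `loss` are assumptions for illustration, not something the text specifies.

```python
import math


def leaf_count(node, height):
    """LF(node): number of bottom-level leaves under `node`.

    Assumes a full binary tree of the given height with heap numbering,
    so the depth of a node is bit_length(node) - 1.
    """
    depth = node.bit_length() - 1
    if node < 1 or depth > height:
        raise ValueError("node is outside the tree")
    return 2 ** (height - depth)


def loss(node_i, node_j, height):
    """Equation (1): loss of generalizing node_j up to node_i."""
    return math.log2(leaf_count(node_i, height)) - math.log2(leaf_count(node_j, height))
```

With height 3 (eight leaves numbered 8 to 15), generalizing leaf 8 to the root gives log2(8) - log2(1) = 3, and generalizing leaf 9 to node 2 gives log2(4) - log2(1) = 2. The value is non-negative when $node_i$ is the ancestor.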
Definition 15, dynamic sequence alignment algorithm: the algorithm acts on any two tracks A and B, where the length of A is a and the length of B is b, and uses dynamic programming with the following recurrence:
$$dp[i][j] = \min\bigl(dp[i-1][j-1] + \mathrm{Loss}(node_i, node_j),\ dp[i-1][j] + \mathrm{Loss}(node_i, node_{root}),\ dp[i][j-1] + \mathrm{Loss}(node_j, node_{root})\bigr) \tag{2}$$
where $node_i$ and $node_j$ are the numbers of two nodes, $node_{root}$ denotes the root node, dp[][] is a two-dimensional matrix of size $(a+1) \times (b+1)$, dp[i][j] is the entry in row (i+1), column (j+1) of dp[][], dp[i-1][j-1] is the entry in row i, column j, dp[i-1][j] is the entry in row i, column (j+1), and dp[i][j-1] is the entry in row (i+1), column j; the recurrence yields the $(a+1) \times (b+1)$ sequence alignment loss matrix dp[][], and backtracking through this matrix finds the strategy that minimizes the loss of synthesizing the two tracks and generates the synthesized track;
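The recurrence of equation (2) can be sketched as a standard dynamic programming table. The two cost callables are hypothetical stand-ins: `pair_loss(x, y)` plays the role of Loss(node_i, node_j) and `gap_loss(x)` the role of Loss(node, node_root). The initialization of the first row and column is an assumption, since the text gives only the recurrence, and the backtracking step is not shown.

```python
def alignment_matrix(A, B, pair_loss, gap_loss):
    """Fill the (a+1) x (b+1) sequence alignment loss matrix dp.

    A, B:      two tracks given as sequences of nodes.
    pair_loss: assumed callable, cost of aligning one node of A with one of B.
    gap_loss:  assumed callable, cost of generalizing an unmatched node to the root.
    """
    a, b = len(A), len(B)
    dp = [[0.0] * (b + 1) for _ in range(a + 1)]
    # Assumed border: a prefix aligned against nothing pays the gap cost.
    for i in range(1, a + 1):
        dp[i][0] = dp[i - 1][0] + gap_loss(A[i - 1])
    for j in range(1, b + 1):
        dp[0][j] = dp[0][j - 1] + gap_loss(B[j - 1])
    for i in range(1, a + 1):
        for j in range(1, b + 1):
            dp[i][j] = min(
                dp[i - 1][j - 1] + pair_loss(A[i - 1], B[j - 1]),  # align both
                dp[i - 1][j] + gap_loss(A[i - 1]),                 # skip in A
                dp[i][j - 1] + gap_loss(B[j - 1]),                 # skip in B
            )
    return dp
```

The bottom-right entry dp[a][b] is the minimum total alignment loss; following the minimizing choices back from that entry recovers which points were paired and which were generalized to the root.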
Definition 16, Progressive Sequence Alignment (PSA) algorithm: the longest track in a group of tracks is selected as the base track; the remaining tracks of the group are then selected one at a time in any order, each track being selected only once, and synthesized with the base track by dynamic sequence alignment; the track synthesized by dynamic sequence alignment becomes the new base track;
Definition 17, iterative track k-anonymous clustering algorithm (Iterative Track Clustering): to achieve k-anonymity of the tracks, the number of clusters to generate is first determined from the value of k and empty clusters are created; the following operations are then performed for each cluster in turn: a track is drawn at random from the track set and placed in the cluster as its first track; the dynamic sequence alignment algorithm is run between this track and each of the remaining tracks, one by one, to compute the information loss of aligning them; and the k-1 tracks with the smallest information loss are placed in the cluster; after tracks have been added to all clusters, the final generalization loss of each track cluster is computed by the progressive sequence alignment algorithm, and the k-anonymous track data set is generated.
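The cluster-filling loop of Definition 17 can be sketched as follows. Several details are assumptions rather than statements from the text: the number of clusters is taken as floor(n / k), tracks left over after all clusters are filled are simply returned, and `align_cost` is a hypothetical callable that returns the alignment information loss between two tracks. The final progressive sequence alignment step is not shown.

```python
import random


def iterative_k_anonymous_clusters(tracks, k, align_cost, seed=0):
    """Group tracks into clusters of size k by smallest alignment loss.

    tracks:     list of tracks.
    k:          anonymity parameter (cluster size), k >= 1.
    align_cost: assumed callable (track, track) -> information loss.
    Returns (clusters, leftover) as lists of track indices.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    rng = random.Random(seed)
    remaining = list(range(len(tracks)))
    clusters = []
    for _ in range(len(tracks) // k):  # assumed cluster count
        # Draw the first track of the cluster at random.
        first = remaining.pop(rng.randrange(len(remaining)))
        # Rank every remaining track by its alignment loss to the first track.
        ranked = sorted(remaining, key=lambda idx: align_cost(tracks[first], tracks[idx]))
        chosen = ranked[:k - 1]
        for idx in chosen:
            remaining.remove(idx)
        clusters.append([first] + chosen)
    return clusters, remaining
```

Because each cluster ends up with exactly k tracks, generalizing the members of a cluster to one synthesized track makes each of them indistinguishable from at least k-1 others.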
In conclusion, with the above scheme, the invention provides a new method for track privacy protection: the tracks are first segmented according to point density and then clustered with the iterative track k-anonymous clustering algorithm, which reduces the information loss in the track clustering procedure and effectively protects the track privacy of users; the method therefore has practical value and is worth popularizing.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (7)

1. A method for realizing k-anonymity of track data release based on point density segmentation of tracks, characterized by comprising the following steps:
1) acquiring basic track data, including the longitude and latitude information of the tracks and the time sequence relation of the track point sets, and establishing a track data set model T;
2) building a track loss model DGH tree from the longitude and latitude information of the tracks; wherein the DGH tree is defined as follows:
Definition 2, regional generalization hierarchy tree, denoted DGH tree: the position attribute in the map is divided into a number of cells of equal length; these cells are then used as leaf nodes to build a full binary tree, and if the number of leaf nodes is insufficient to fill the bottom layer of the binary tree, invalid nodes are added as padding;
3) adding virtual points between adjacent points of each track in the track data set model T to generate a track data set model $T_{virtual}$ containing virtual points and a virtual point marker data set model virtual;
4) clustering the track data set model $T_{virtual}$ containing virtual points as a single point set, marking the cluster center of each point, and generating a mark data set model mark;
5) traversing each track in the track data set model T and using the mark data set model mark to judge whether adjacent points of the track belong to the same cluster center; if not, splitting the track between them, and if so, keeping them together, generating a segmented track data set model $T_{partition}$;
6) for the segmented track data set model $T_{partition}$, computing the loss with a dynamic sequence alignment algorithm, clustering based on information loss with an iterative track k-anonymous clustering algorithm, and generating a k-anonymous data set as the data set for release;
Definition 14, information loss: the loss incurred when a node is generalized to its parent node or to a higher-level node; the information loss of generalizing $node_j$ to $node_i$ is computed as:
$$\mathrm{Loss}(node_i, node_j) = \log_2(\mathrm{LF}(node_i)) - \log_2(\mathrm{LF}(node_j)) \tag{1}$$
where $\mathrm{Loss}(node_i, node_j)$ is the information loss caused by generalizing $node_j$ to $node_i$, $node_i$ and $node_j$ are the numbers of two nodes, and the function LF() returns the number of bottom-level leaf nodes owned by a node;
Definition 15, dynamic sequence alignment algorithm: the algorithm acts on any two tracks A and B, where the length of A is a and the length of B is b, and uses dynamic programming with the following recurrence:
$$dp[i][j] = \min\bigl(dp[i-1][j-1] + \mathrm{Loss}(node_i, node_j),\ dp[i-1][j] + \mathrm{Loss}(node_i, node_{root}),\ dp[i][j-1] + \mathrm{Loss}(node_j, node_{root})\bigr) \tag{2}$$
where $node_i$ and $node_j$ are the numbers of two nodes, $node_{root}$ denotes the root node, dp[][] is a two-dimensional matrix of size $(a+1) \times (b+1)$, dp[i][j] is the entry in row (i+1), column (j+1) of dp[][], dp[i-1][j-1] is the entry in row i, column j, dp[i-1][j] is the entry in row i, column (j+1), and dp[i][j-1] is the entry in row (i+1), column j; the recurrence yields the $(a+1) \times (b+1)$ sequence alignment loss matrix dp[][], and backtracking through this matrix finds the strategy that minimizes the loss of synthesizing the two tracks and generates the synthesized track;
Definition 17, iterative track k-anonymous clustering algorithm: to achieve k-anonymity of the tracks, the number of clusters to generate is first determined from the value of k and empty clusters are created; the following operations are then performed for each cluster in turn: a track is drawn at random from the track set and placed in the cluster as its first track; the dynamic sequence alignment algorithm is run between this track and each of the remaining tracks, one by one, to compute the information loss of aligning them; and the k-1 tracks with the smallest information loss are placed in the cluster; after tracks have been added to all clusters, the final generalization loss of each track cluster is computed by the progressive sequence alignment algorithm, and the k-anonymous track data set is generated.
2. The method for realizing k-anonymity of track data release based on point density segmentation of tracks according to claim 1, characterized in that: in step 1), the track data set model T is defined as follows:
Definition 1, track data set model: $T = [tr_1, tr_2, tr_3, \ldots, tr_n]$, where $tr_i$ is the point set of the i-th track, $i = 1, 2, 3, \ldots, n$; $tr_i = [p_1, p_2, p_3, \ldots, p_m]$, where $p_j$ is the j-th point of track $tr_i$, $j = 1, 2, 3, \ldots, m$; $p_j = (p_j.X, p_j.Y) \in tr_i$, where $p_j.X$ and $p_j.Y$ are the longitude and latitude of point $p_j$ in track $tr_i$.
3. The method for realizing k-anonymity of track data release based on point density segmentation of tracks according to claim 1, characterized in that: in step 2), the maximum and minimum values of the longitude and of the latitude are computed from the longitude and latitude information of the tracks to obtain the extent of the region, and the region is then divided uniformly to build the track loss model DGH tree, comprising the following steps:
2.1) computing the maximum and minimum values of the longitude and of the latitude from the longitude and latitude information of the tracks, and defining a rectangular region P;
2.2) dividing the rectangular region P into Nx segments of equal length in the horizontal direction and into Ny segments of equal length in the vertical direction;
2.3) building the horizontal and vertical DGH trees from Nx, Ny and the track data set model T.
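Steps 2.1) to 2.3) can be sketched for one axis as follows. This is only an illustration under stated assumptions: the helper names `bounding_box`, `cell_index` and `padded_leaf_count` are hypothetical, and padding the leaf count up to the next power of two is one way to realize the "invalid nodes added as padding" of Definition 2.

```python
def bounding_box(T):
    """Step 2.1: rectangular region P as (min_x, max_x, min_y, max_y)."""
    xs = [x for track in T for x, y in track]
    ys = [y for track in T for x, y in track]
    return min(xs), max(xs), min(ys), max(ys)


def cell_index(value, low, high, n_cells):
    """Step 2.2: index of the equal-length cell on one axis that holds `value`."""
    if high == low:
        return 0
    idx = int((value - low) / (high - low) * n_cells)
    # The maximum value falls on the upper border; clamp it into the last cell.
    return min(idx, n_cells - 1)


def padded_leaf_count(n_cells):
    """Step 2.3: smallest power of two >= n_cells, the bottom layer of a full binary tree."""
    size = 1
    while size < n_cells:
        size *= 2
    return size
```

For example, with Nx = 5 cells the bottom layer of the horizontal tree needs 8 leaves, so 3 of them would be invalid padding nodes.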
4. The method for realizing k-anonymity of track data release based on point density segmentation of tracks according to claim 1, characterized in that: in step 3), each track in the track data set model T is traversed, virtual points are added between each pair of adjacent points of the track, and the track data set model $T_{virtual}$ containing virtual points and the virtual point marker data set model virtual are generated; the track data set model T, the track data set model $T_{virtual}$ containing virtual points, the virtual point marker data set model virtual and the virtual point are defined as follows:
Definition 3, track data set model: $T = [tr_1, tr_2, tr_3, \ldots, tr_n]$, where $tr_i$ is the point set of the i-th track, $i = 1, 2, 3, \ldots, n$; $tr_i = [p_1, p_2, p_3, \ldots, p_m]$, where $p_j$ is the j-th point of track $tr_i$, $j = 1, 2, 3, \ldots, m$; $p_j = (p_j.X, p_j.Y) \in tr_i$, where $p_j.X$ and $p_j.Y$ are the longitude and latitude of point $p_j$ in track $tr_i$;
Definition 4, track data set model containing virtual points: $T_{virtual} = [trv_1, trv_2, trv_3, \ldots, trv_n]$, where $trv_i$ is the point set of the i-th track containing virtual points; $trv_i = [pv_1, pv_2, pv_3, \ldots, pv_m]$, where $pv_j$ is the j-th point of track $trv_i$; $pv_j = (pv_j.X, pv_j.Y) \in trv_i$, where $pv_j.X$ and $pv_j.Y$ are the longitude and latitude of point $pv_j$ in track $trv_i$;
Definition 5, virtual point marker data set model: $virtual = [vir_1, vir_2, vir_3, \ldots, vir_n]$, where $vir_i$ is the virtual point marker list corresponding to the i-th track in the track data set model $T_{virtual}$ containing virtual points; $vir_i = [q_1, q_2, q_3, \ldots, q_m]$, where $q_j$ is the j-th entry of $vir_i$; $q_j = 0$ denotes a real point and $q_j = 1$ denotes a virtual point; the virtual point marker data set model virtual and the track data set model $T_{virtual}$ containing virtual points correspond one-to-one by position: $vir_i$ corresponds to $trv_i$ and $q_j$ corresponds to $pv_j$, where $q_j \in vir_i$, $pv_j \in trv_i$, and $q_j$ and $pv_j$ are the j-th entries of $vir_i$ and $trv_i$ respectively;
Definition 6, virtual point: for the line segment between two adjacent points of a track, a virtual point is added at every fixed distance along the segment, starting from one endpoint, so that line segments of different lengths have the same influence on the point density of the area in which they lie.
5. The method for realizing k-anonymity of track data release based on point density segmentation of tracks according to claim 1, characterized in that: in step 4), the track data set model $T_{virtual}$ containing virtual points is clustered as a single point set, the generated cluster centers are numbered, and the mark data set model mark records the number of the cluster center to which each point belongs, comprising the following steps:
4.1) clustering the track data set model $T_{virtual}$ containing virtual points as a single point set to generate cluster centers, numbering the cluster centers, and recording the number of the cluster center corresponding to each point in $T_{virtual}$;
4.2) traversing the track data set model $T_{virtual}$ containing virtual points, determining through the virtual point marker data set model virtual whether each point is a virtual point, and, for each real point, recording in the mark data set model mark the number of the cluster center to which it belongs; the track data set model $T_{virtual}$ containing virtual points and the mark data set model mark are defined as follows:
Definition 7, track data set model containing virtual points: $T_{virtual} = [trv_1, trv_2, trv_3, \ldots, trv_n]$, where $trv_i$ is the point set of the i-th track containing virtual points, $i = 1, 2, 3, \ldots, n$; $trv_i = [pv_1, pv_2, pv_3, \ldots, pv_m]$, where $pv_j$ is the j-th point of track $trv_i$, $j = 1, 2, 3, \ldots, m$; $pv_j = (pv_j.X, pv_j.Y) \in trv_i$, where $pv_j.X$ and $pv_j.Y$ are the longitude and latitude of point $pv_j$ in track $trv_i$;
Definition 8, mark data set model: $mark = [mar_1, mar_2, mar_3, \ldots, mar_n]$, where $mar_i$ is the mark list in the mark data set model mark corresponding to the i-th track containing virtual points; $mar_i = [z_1, z_2, z_3, \ldots, z_m]$, where $z_j$ is the j-th entry of $mar_i$; the mark data set model mark and the track data set model $T_{virtual}$ containing virtual points correspond one-to-one by position: $mar_i$ corresponds to $trv_i$ and $z_j$ corresponds to $pv_j$, where $z_j \in mar_i$, $pv_j \in trv_i$, and $z_j$ and $pv_j$ are the j-th entries of $mar_i$ and $trv_i$ respectively.
6. The method for realizing k-anonymity of track data release based on point density segmentation of tracks according to claim 1, characterized in that: in step 5), by combining the track data set model T with the mark data set model mark, it is judged whether the cluster center numbers of adjacent points of each track in T are the same; if the numbers differ, the track is split at that position, and if they are the same, the track is left unsplit, generating the segmented track data set model $T_{partition}$; the track data set model T, the mark data set model mark and the segmented track data set model $T_{partition}$ are defined as follows:
Definition 9, track data set model: $T = [tr_1, tr_2, tr_3, \ldots, tr_n]$, where $tr_i$ is the point set of the i-th track, $i = 1, 2, 3, \ldots, n$; $tr_i = [p_1, p_2, p_3, \ldots, p_m]$, where $p_j$ is the j-th point of track $tr_i$, $j = 1, 2, 3, \ldots, m$; $p_j = (p_j.X, p_j.Y) \in tr_i$, where $p_j.X$ and $p_j.Y$ are the longitude and latitude of point $p_j$ in track $tr_i$;
Definition 10, mark data set model: $mark = [mar_1, mar_2, mar_3, \ldots, mar_n]$, where $mar_i$ is the mark list in the mark data set model mark corresponding to the i-th track containing virtual points; $mar_i = [z_1, z_2, z_3, \ldots, z_m]$, where $z_j$ is the j-th entry of $mar_i$; the mark data set model mark and the track data set model $T_{virtual}$ containing virtual points correspond one-to-one by position: $mar_i$ corresponds to $trv_i$ and $z_j$ corresponds to $pv_j$, where $z_j \in mar_i$, $pv_j \in trv_i$, and $z_j$ and $pv_j$ are the j-th entries of $mar_i$ and $trv_i$ respectively; the track data set model $T_{virtual}$ containing virtual points is defined as follows:
Definition 11, track data set model containing virtual points: $T_{virtual} = [trv_1, trv_2, trv_3, \ldots, trv_n]$, where $trv_i$ is the point set of the i-th track containing virtual points; $trv_i = [pv_1, pv_2, pv_3, \ldots, pv_m]$, where $pv_j$ is the j-th point of track $trv_i$; $pv_j = (pv_j.X, pv_j.Y) \in trv_i$, where $pv_j.X$ and $pv_j.Y$ are the longitude and latitude of point $pv_j$ in track $trv_i$;
Definition 12, segmented track data set model: $T_{partition} = [trp_1, trp_2, trp_3, \ldots, trp_n]$, where $trp_i$ is the point set of the i-th segmented track; $trp_i = [pp_1, pp_2, pp_3, \ldots, pp_m]$, where $pp_j$ is the j-th point of track $trp_i$; $pp_j = (pp_j.X, pp_j.Y) \in trp_i$, where $pp_j.X$ and $pp_j.Y$ are the longitude and latitude of point $pp_j$ in track $trp_i$.
7. The method for realizing k-anonymity of track data release based on point density segmentation of tracks according to claim 1, characterized in that: in step 6), the tracks in the segmented track data set model $T_{partition}$ are clustered using the iterative track k-anonymous clustering algorithm; the segmented track data set model $T_{partition}$ and the progressive sequence alignment algorithm are defined as follows:
Definition 13, segmented track data set model: $T_{partition} = [trp_1, trp_2, trp_3, \ldots, trp_n]$, where $trp_i$ is the point set of the i-th segmented track, $i = 1, 2, 3, \ldots, n$; $trp_i = [pp_1, pp_2, pp_3, \ldots, pp_m]$, where $pp_j$ is the j-th point of track $trp_i$, $j = 1, 2, 3, \ldots, m$; $pp_j = (pp_j.X, pp_j.Y) \in trp_i$, where $pp_j.X$ and $pp_j.Y$ are the longitude and latitude of point $pp_j$ in track $trp_i$;
Definition 16, progressive sequence alignment algorithm: the longest track in a group of tracks is selected as the base track; the remaining tracks of the group are then selected one at a time in any order, each track being selected only once, and synthesized with the base track by dynamic sequence alignment; the track synthesized by dynamic sequence alignment becomes the new base track.
CN202110213797.3A 2021-02-26 2021-02-26 Method for realizing k anonymity of track data release based on point density segmentation track Active CN112818402B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110213797.3A CN112818402B (en) 2021-02-26 2021-02-26 Method for realizing k anonymity of track data release based on point density segmentation track

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110213797.3A CN112818402B (en) 2021-02-26 2021-02-26 Method for realizing k anonymity of track data release based on point density segmentation track

Publications (2)

Publication Number Publication Date
CN112818402A CN112818402A (en) 2021-05-18
CN112818402B true CN112818402B (en) 2022-07-26

Family

ID=75863874

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110213797.3A Active CN112818402B (en) 2021-02-26 2021-02-26 Method for realizing k anonymity of track data release based on point density segmentation track

Country Status (1)

Country Link
CN (1) CN112818402B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant