CN110162997A - Anonymous privacy protection method based on interpolation points
- Publication number
- CN110162997A (application number CN201910340914.5A)
- Authority
- CN
- China
- Prior art keywords
- track
- distance
- anonymous
- imhdt
- point
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/10—Geometric CAD
- G06F30/18—Network design, e.g. design based on topological or interconnect aspects of utility systems, piping, heating ventilation air conditioning [HVAC] or cabling
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Bioethics (AREA)
- Health & Medical Sciences (AREA)
- Computer Hardware Design (AREA)
- Geometry (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Computer Networks & Wireless Communication (AREA)
- Computational Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Computer Security & Cryptography (AREA)
- Software Systems (AREA)
- Storage Device Security (AREA)
- Complex Calculations (AREA)
Abstract
The present invention belongs to the technical field of privacy protection and provides an anonymous privacy protection method based on interpolation points. The method comprises the following steps: S1, preprocess the original track data set Ts to form several track equivalence classes Ecs whose timestamps are consistent; S2, cluster the tracks in each track equivalence class according to the IMHDT distance metric, forming several track anonymity groups in each track equivalence class, where each anonymity group contains no fewer than k tracks; S3, perturb the tracks in each anonymity group so that they finally satisfy interpolation-track (k, δ)-anonymity. With the track timestamps as the constraint, interpolation points are restricted to the track segments of the corresponding timestamps, which reduces data distortion during anonymization and increases data availability while still protecting the privacy of the published data.
Description
Technical field
The invention belongs to the technical field of privacy protection and provides an anonymous privacy protection method based on interpolation points.
Background art
In modern society, track information can be conveniently collected and shared by GPS-equipped mobile phones, PDAs, car navigation systems, smart wearable devices and the like. Users can therefore conveniently use location-based services (LBS), such as "find nearby gas stations" or "record my movement track". The collected track information can support business decisions, for example opening a supermarket in an area where location information is dense; such areas usually have high commercial value, so investors can maximize their returns. It can also support applications such as urban planning. Track information is highly valuable because of the spatio-temporal information it contains, but it can also be collected and analyzed by malicious organizations, which leaks users' privacy.
The published data set therefore needs to be anonymized to solve the privacy leakage problem. At the same time, the data output by a privacy protection system should not excessively change track characteristics such as the length and duration of the users' tracks. How to preserve the availability of the published data while ensuring that individual tracks cannot be identified by an attacker is the problem that track privacy protection applications currently need to address. Among the many existing methods for protecting privacy in track data publishing, most do not consider the availability of the published data.
Summary of the invention
An embodiment of the invention provides an anonymous privacy protection method based on interpolation points. With the track timestamps as the constraint, interpolation points are restricted to the track segments of the corresponding timestamps, which reduces data distortion during anonymization and increases data availability while still protecting the privacy of the published data.
To achieve the above object, the invention provides an anonymous privacy protection method based on interpolation points, which specifically comprises the following steps:
S1, preprocess the original track data set Ts to form several track equivalence classes Ecs whose timestamps are consistent;
S2, cluster the tracks in each track equivalence class according to the IMHDT distance metric, forming several track anonymity groups in each track equivalence class, where each anonymity group contains no fewer than k tracks;
S3, perturb the tracks in each anonymity group so that they finally satisfy interpolation-track (k, δ)-anonymity.
Further, step S1 specifically comprises the following steps:
S11, define the track preprocessing fragment value $P_i$;
S12, obtain the start and end timestamps $\{t_b, t_e\}$ of the original track Tr;
S13, obtain the timestamp $t_i$ that is later than the start time $t_b$ and satisfies $t_i \bmod P_i = 0$, and the timestamp $t_j$ that is earlier than the end time $t_e$ and satisfies $t_j \bmod P_i = 0$;
S14, truncate the original track to $\{t_i, t_j\}$ and put it into the track equivalence class D{i, j}.
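Steps S11 to S14 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names `trim_track` and `build_equivalence_classes` are hypothetical, timestamps are assumed to be integers, and the sketch assumes that the aligned start is the first multiple of the fragment value strictly after the start time and the aligned end is the last multiple strictly before the end time.

```python
from collections import defaultdict


def trim_track(track, fragment):
    """Trim one track to fragment-aligned start and end timestamps.

    track: list of (t, x, y) tuples sorted by integer timestamp t.
    Returns (t_i, t_j, trimmed_points), or None if nothing survives.
    """
    t_b, t_e = track[0][0], track[-1][0]
    # Assumption: first multiple of `fragment` strictly after t_b,
    # last multiple of `fragment` strictly before t_e.
    t_i = (t_b // fragment + 1) * fragment
    t_j = ((t_e - 1) // fragment) * fragment
    if t_i >= t_j:
        return None  # track too short for this fragment value: suppressed
    trimmed = [p for p in track if t_i <= p[0] <= t_j]
    return t_i, t_j, trimmed


def build_equivalence_classes(tracks, fragment):
    """Group trimmed tracks by their aligned (t_i, t_j) pair."""
    classes = defaultdict(list)
    for track in tracks:
        result = trim_track(track, fragment)
        if result is not None:
            t_i, t_j, trimmed = result
            classes[(t_i, t_j)].append(trimmed)
    return dict(classes)
```

A larger fragment value tends to merge more tracks into the same class but trims more points from each track, which is the trade-off that the choice of fragment value has to balance.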
Further, step S2 specifically comprises the following steps:
S21, put the unclustered tracks of each track equivalence class into the set active, and randomly select one track from active;
S22, compute the IMHDT distance from every other track in active to the selected track, and take the track with the largest IMHDT distance as the center track;
S23, compute the IMHDT distance from every other track in active to the center track;
S24, take the k-1 tracks nearest to the center track in IMHDT distance, form an anonymity cluster from them together with the center track, and add the cluster to the set anonymity;
S25, obtain the farthest of the k-1 nearest tracks; if the IMHDT distance between that track and the center track is greater than the threshold max_radius, suppress the anonymity cluster;
The IMHDT distance is the Hausdorff distance based on interpolation points under the time constraint.
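The greedy loop of steps S21 to S25 can be sketched as below. The function `greedy_cluster` and its parameters are hypothetical names chosen for this illustration; the distance is passed in as a callable so that any track metric (IMHDT in the method described here) can be plugged in. The sketch assumes k is at least 2 and treats leftover tracks that cannot fill a group of k as suppressed.

```python
import random


def greedy_cluster(tracks, k, max_radius, distance, rng=None):
    """Greedy clustering of one equivalence class.

    Returns (clusters, suppressed), each a list of lists of track indices.
    """
    rng = rng or random.Random(0)
    active = list(range(len(tracks)))  # indices of unclustered tracks
    clusters, suppressed = [], []
    while len(active) >= k:
        seed = rng.choice(active)  # S21: random starting track
        candidates = [i for i in active if i != seed]
        # S22: the track farthest from the seed becomes the center track.
        center = max(candidates,
                     key=lambda i: distance(tracks[i], tracks[seed])) if candidates else seed
        # S23/S24: the k-1 tracks nearest to the center join its cluster.
        others = sorted((i for i in active if i != center),
                        key=lambda i: distance(tracks[i], tracks[center]))
        nearest = others[:k - 1]
        members = [center] + nearest
        # S25: suppress the cluster if its farthest member exceeds max_radius.
        radius = distance(tracks[nearest[-1]], tracks[center]) if nearest else 0.0
        (suppressed if radius > max_radius else clusters).append(members)
        # Accepted or not, the members leave the active set.
        active = [i for i in active if i not in members]
    if active:
        suppressed.append(active)  # assumption: fewer than k leftovers are suppressed
    return clusters, suppressed
```

Choosing the center as the track farthest from a random seed tends to start clusters at the edge of the data, which is a common heuristic in greedy k-member clustering.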
Further, the IMHDT distance between two tracks Tr1 and Tr2 is computed as follows:
S221, for each track sampling point $Tr1\_node_{T=t_i}$, compute its shortest distance to the track segment of Tr2 adjacent to timestamp $t_i$ on one side (between the Tr2 sampling points at $t_{i-1}$ and $t_i$);
S222, compute the shortest distance from the track sampling point $Tr1\_node_{T=t_i}$ to the track segment of Tr2 adjacent to timestamp $t_i$ on the other side (between the Tr2 sampling points at $t_i$ and $t_{i+1}$);
S223, take the smaller of the distances from steps S221 and S222 as the IMHDT distance of the track sampling point $Tr1\_node_{T=t_i}$;
S224, the average of the IMHDT distances of all sampling points of track Tr1 is the IMHDT distance between tracks Tr1 and Tr2.
Further, the shortest distance from a track sampling point to a track segment is computed as follows:
Determine whether an interpolation point exists on the track segment such that the line connecting the track sampling point and the interpolation point is perpendicular to the track segment;
If it exists, the Euclidean distance from the track sampling point to the interpolation point is the shortest distance from the track sampling point to the track segment;
If it does not exist, the smaller of the distances from the track sampling point to the two endpoints of the track segment is the shortest distance from the track sampling point to the track segment.
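The point-to-segment rule and the per-track averaging described above can be sketched together as follows. The names `point_to_segment` and `imhdt` are hypothetical, tracks are assumed to be lists of (x, y) points sampled at identical timestamps, and the handling of the first and last sampling points (which have only one adjacent segment) is an assumption of this sketch rather than something stated in the text.

```python
import math


def point_to_segment(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq > 0.0:
        # Position of the perpendicular foot along a -> b (0 at a, 1 at b).
        s = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
        if 0.0 <= s <= 1.0:
            foot = (a[0] + s * dx, a[1] + s * dy)  # the interpolation point
            return math.dist(p, foot)
    # No interpolation point on the segment: use the nearer endpoint.
    return min(math.dist(p, a), math.dist(p, b))


def imhdt(tr1, tr2):
    """Mean over tr1's points of the distance to tr2's time-adjacent segments."""
    if not tr1 or len(tr1) != len(tr2):
        raise ValueError("tracks must be non-empty and share timestamps")
    total = 0.0
    for i, p in enumerate(tr1):
        candidates = []
        if i > 0:
            candidates.append(point_to_segment(p, tr2[i - 1], tr2[i]))
        if i < len(tr2) - 1:
            candidates.append(point_to_segment(p, tr2[i], tr2[i + 1]))
        if not candidates:  # single-point tracks: fall back to point distance
            candidates.append(math.dist(p, tr2[i]))
        total += min(candidates)
    return total / len(tr1)
```

Note that this quantity is directed: it averages over the sampling points of the first track only, so `imhdt(tr1, tr2)` and `imhdt(tr2, tr1)` may differ.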
The anonymous privacy protection method based on interpolation points proposed by the embodiment of the invention has the following beneficial effects:
1. Several preprocessing fragment values are used to regularize the tracks during preprocessing, and the fragment value is chosen by comprehensively comparing, for each candidate, the number of regularized tracks retained and the quality of the resulting anonymous tracks. This reduces track suppression during preprocessing and benefits the subsequent track anonymization.
2. Track uncertainty theory is introduced into the interpolation-point anonymity model. What makes track data unique is that every sampling point may become a quasi-identifier, and moving the sampling points directly increases the anonymization cost. Using the inherent uncertainty region of a track as its anonymity region therefore helps reduce the anonymization cost.
3. The Hausdorff distance based on interpolation points is used to measure track distance during clustering, and it is proved theoretically that this distance from an anonymous track to the center track is always less than or equal to the Euclidean distance between the same tracks. Clustering with this distance therefore yields clusters with a smaller generalization area than clustering with the Euclidean distance.
4. An anonymity model is proposed in which, during anonymization, a sampling point is replaced by an interpolation point on a track segment adjacent to that sampling point. The model reduces track perturbation during anonymization, thereby reducing data distortion and increasing data availability while still protecting the privacy of the published data.
Brief description of the drawings
Fig. 1 is a schematic diagram of the uncertainty region of a track sampling point provided by an embodiment of the invention;
Fig. 2 is a schematic diagram of three original tracks that do not satisfy track (3, δ)-anonymity, provided by an embodiment of the invention;
Fig. 3 is a schematic diagram of the three original tracks after anonymization, provided by an embodiment of the invention;
Fig. 4 is a flowchart of the anonymous privacy protection method based on interpolation points provided by an embodiment of the invention;
Fig. 5 is a schematic diagram of interpolation-track similarity measurement without the time constraint, provided by an embodiment of the invention;
Fig. 6 is a schematic diagram of interpolation-track similarity measurement under the time constraint, provided by an embodiment of the invention;
Fig. 7 is a schematic diagram comparing the Euclidean distance between tracks with the Hausdorff distance based on interpolation points, provided by an embodiment of the invention;
Fig. 8 is a schematic diagram of the anonymization operation based on interpolation points provided by an embodiment of the invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the invention clearer, the invention is further described below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and are not intended to limit it.
Definitions of related terms:
1) Track
A track is usually the movement route of a moving object over a period of time. Track information is collected by sensing devices equipped with a positioning system; at the corresponding times these devices store the coordinates of the moving object and send them to the track information collector. A track has two different representations.
In the first, a track Tr consists of a sequence of triples $(t_i, x_i, y_i)$ ordered by timestamp:
$$Tr = \{(t_1, x_1, y_1), (t_2, x_2, y_2), \ldots, (t_n, x_n, y_n)\}$$
where $x_i, y_i$ are the coordinates of the track at timestamp $t_i$, $1 \le i \le n$.
In the second, a track consists of a sequence of continuous polyline segments:
$$Tr = \{\overline{p_1 p_2}, \overline{p_2 p_3}, \ldots, \overline{p_{plen_{Tr}-1}\, p_{plen_{Tr}}}\}$$
where $p_j$ is a sampling point of track Tr, $plen_{Tr}$ is the length of track Tr, and $\overline{p_j p_{j+1}}$ is the piecewise linear approximation between two track points that simulates the real path. As the sampling interval approaches 0, the track gets closer to the real movement route, but the higher the sampling frequency, the higher the cost of storing and parsing the track.
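2) Hausdorff distance
The Hausdorff distance is a measure of the distance between two point sets that originates in the image domain. Given two point sets $T_i = \{a_1, a_2, \ldots, a_m\}$ and $T_j = \{b_1, b_2, \ldots, b_n\}$, the Hausdorff distance between $T_i$ and $T_j$ is defined as follows:
$$H(T_i, T_j) = \max\big(h(T_i, T_j),\ h(T_j, T_i)\big)$$
where
$$h(T_i, T_j) = \max_{a \in T_i} \min_{b \in T_j} \lVert a - b \rVert$$
Because the original Hausdorff distance computes the distance between point sets with a maximum and a minimum, it is strongly affected by outliers. To improve the robustness of the Hausdorff distance to isolated points and noise, a modified Hausdorff distance was proposed that reduces the influence of outliers by averaging. The modified Hausdorff distance is expressed as:
$$h_{MHD}(T_i, T_j) = \frac{1}{m} \sum_{a \in T_i} \min_{b \in T_j} \lVert a - b \rVert$$
The Hausdorff distance based on interpolation points is expressed as follows:
$$h_{IMHD}(T_i, T_j) = \frac{1}{m} \sum_{a \in T_i} \min_{\odot} \lVert a - \odot \rVert$$
where the interpolation point $\odot$ ranges over the points of the track segments of $T_j$ rather than only over its sampling points.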
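As a small illustration of why averaging helps, the classical and the averaged directed distances can be compared on a point set with one outlier. The function names below are hypothetical, and the averaged version is only the directed form (one direction, mean of nearest-neighbor distances), which is an assumption about the variant meant here.

```python
import math


def directed_hausdorff(A, B):
    """Largest nearest-neighbor distance from any point of A to the set B."""
    return max(min(math.dist(a, b) for b in B) for a in A)


def hausdorff(A, B):
    """Classical (symmetric) Hausdorff distance between point sets A and B."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))


def directed_mean_hausdorff(A, B):
    """Averaged variant: mean nearest-neighbor distance from A to B."""
    return sum(min(math.dist(a, b) for b in B) for a in A) / len(A)


A = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]  # the last point is an outlier
B = [(0.0, 0.0), (1.0, 0.0)]
```

With these sets the single outlier alone determines the classical distance, while the averaged variant spreads its effect over all three points of A.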
3) Uncertainty of track sampling points
Existing positioning technology cannot locate a coordinate point with perfect precision, so the actual position is usually generalized to a circular region. This property is the basis of track (k, δ)-anonymity. The invention also uses this property to anonymize tracks; the difference is that, instead of using the uncertainty region of the original track sampling points directly, it replaces sampling points with interpolation points to obtain a smaller anonymization cost.
Because positioning in practice is inexact, let the uncertainty be expressed by a threshold δ. The circular region centered on a track sampling point with radius δ is then the uncertainty region of that sampling point (as shown in Fig. 1):
$$dist(p_{real}, p) \le \delta$$
where $p_{real}$ is the true position on the track and $p$ is the sampling point. $p_{real}$ may lie anywhere in the uncertainty region.
4) Co-localization of tracks
To finally achieve anonymization, the corresponding sampling points of every pair of tracks must lie within each other's uncertainty regions, so that the tracks are co-localized. Define the center track Tr and the anonymous track Tr':
$$Tr = \{(t_1, x_1, y_1), (t_2, x_2, y_2), \ldots, (t_n, x_n, y_n)\}$$
$$Tr' = \{(t_1, x'_1, y'_1), (t_2, x'_2, y'_2), \ldots, (t_n, x'_n, y'_n)\}$$
Each sampling point of Tr' lies in the uncertainty region of the corresponding sampling point of the center track. Assuming the track metric function is the Euclidean distance, the following must hold:
$$\sqrt{(x_i - x'_i)^2 + (y_i - y'_i)^2} \le \delta, \quad 1 \le i \le n$$
The two tracks are then said to satisfy co-localization, denoted Coloc(Tr, Tr').
5) Track (k, δ)-anonymity group
If every pair of tracks in a track anonymity group satisfies co-localization with uncertainty threshold δ, and the group contains at least k tracks, the group is a track (k, δ)-anonymity group.
Track (k, δ)-anonymity builds on the uncertainty of track sampling points described above. In Fig. 2, the corresponding sampling points of the anonymous tracks Tr1 and Tr2 both fall within the uncertainty region of radius δ centered on the first sampling point of the center track, so Tr1, Tr2 and the center track satisfy (3, δ)-anonymity at the first sampling point. The second sampling point satisfies only (2, δ)-anonymity, because the corresponding sampling point of Tr2 is not in the uncertainty region. Likewise, the third and fourth sampling points satisfy only (2, δ)-anonymity. To convert the set of these three tracks into a track (3, δ)-anonymity group, the offending track points must be moved, and this operation distorts the data. Fig. 3 shows the track set obtained from the three original tracks after anonymization, which satisfies (3, δ)-anonymity; the gray sampling points in Fig. 3 are formed by moving the original track points that did not satisfy the condition. All tracks now satisfy co-localization, so the group is a (3, δ)-anonymity group.
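The two definitions above translate directly into predicates. The sketch below uses hypothetical names (`coloc`, `is_k_delta_group`) and assumes tracks are lists of (t, x, y) tuples; it mirrors the situation of Fig. 2 and Fig. 3 with made-up coordinates that are not taken from the figures.

```python
import math
from itertools import combinations


def coloc(tr_a, tr_b, delta):
    """True if both tracks share timestamps and all same-time points are within delta."""
    if [p[0] for p in tr_a] != [p[0] for p in tr_b]:
        return False
    return all(math.dist(pa[1:], pb[1:]) <= delta for pa, pb in zip(tr_a, tr_b))


def is_k_delta_group(group, k, delta):
    """True if the group has at least k tracks and every pair is co-localized."""
    return len(group) >= k and all(
        coloc(a, b, delta) for a, b in combinations(group, 2)
    )


# Illustrative data only: tr2 strays from the others at timestamp 2.
center = [(1, 0.0, 0.0), (2, 1.0, 0.0), (3, 2.0, 0.0)]
tr1 = [(1, 0.0, 0.5), (2, 1.0, 0.5), (3, 2.0, 0.5)]
tr2 = [(1, 0.0, -0.5), (2, 1.0, 3.0), (3, 2.0, -0.5)]
tr2_moved = [(1, 0.0, -0.5), (2, 1.0, -0.5), (3, 2.0, -0.5)]
```

Here the group containing `tr2` fails the (3, 1.0) check because of one sampling point, and passes once that point has been moved, which is the kind of perturbation the anonymization step performs.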
The problem to be solved by the invention is to anonymize the original track data set so that, under the above attack assumption, the risk of privacy leakage for track owners is reduced. Any operation on the original track data set, however, distorts the published data and reduces the value of the published data set.
Evaluation indices
1) Data distortion
Clustering groups data by some feature of the data themselves, so that similarity between clusters is low and similarity within a cluster is high. Similarity is central to clustering, and track clustering is no exception: how to express the similarity between tracks is the core of a track clustering algorithm. NWA, the classic algorithm for track k-anonymity, computes similarity with the Euclidean distance, whereas the invention uses the Hausdorff distance based on interpolation points, which is proved in this description to be less than or equal to the widely used Euclidean distance. Using this Hausdorff distance therefore lets the clustering process form clusters with a smaller generalization area, and a smaller generalization area reduces the data distortion introduced by clustering.
Data distortion: [formula not recoverable from the source text]
where len(Ecs) is the number of clusters after clustering, $ClusterArea(Ec_i)$ is the generalization area of cluster $Ec_i$, and MaxArea is the total area of the track region.
2) Anonymization cost
Track anonymization transforms the data within the clusters formed by track clustering, that is, it moves track sampling points so that they satisfy (k, δ)-anonymity. Since each cluster already meets the requirement of k tracks after clustering, this step only needs to make the sampling points of every track in the cluster lie no farther than δ from the center track.
Anonymization cost: [formula not recoverable from the source text]
where TranslationNode is the moving distance of a track sampling point, and maxTranslation is the moving distance of all the points in the track.
Fig. 4 is a flowchart of the anonymous privacy protection method based on interpolation points provided by an embodiment of the invention. The method specifically comprises the following steps:
S1, preprocess the original track data set Ts to form several track equivalence classes Ecs whose timestamps are consistent;
Because the sampling times of different tracks in real life differ considerably, which affects the measurement of track similarity, the original tracks need to be divided into several equivalence classes by timestamp, with the tracks in each equivalence class sharing consistent timestamps. However, few tracks have identical timestamps, so classifying directly by timestamp inevitably produces too many equivalence classes with too few tracks in each. An equivalence class with fewer than k tracks does not satisfy the k-anonymity requirement and is suppressed, which leads to poor anonymized data quality. The invention adopts the track preprocessing of the NWA algorithm: by suppressing some track points, each track equivalence class obtains more tracks. Compared with suppressing entire track equivalence classes as described above, this retains a large number of tracks and greatly improves the quality of track anonymization.
Because the choice of preprocessing fragment value is a trade-off between the number of tracks retained and the quality of the anonymized data, the invention preprocesses with several different fragment values, forming several groups of equivalence classes. In the experimental part, each index is analyzed comprehensively and a suitable preprocessing fragment value is selected for the experiments, balancing anonymous track quality against the amount of suppressed track data.
Algorithm 1 is the preprocessing of the original tracks. Its inputs are the original track set Ts and the track preprocessing fragment value $P_i$; its output is the track equivalence classes produced with that fragment value, each of which has consistent track timestamps. For each track in the data set, first record its start and end timestamps $[t_b, t_e]$; take $t_i$ as the timestamp that is greater than $t_b$ and satisfies $t_i \bmod P_i = 0$, and $t_j$ as the timestamp that is less than the end timestamp $t_e$ and satisfies $t_j \bmod P_i = 0$. The track is truncated to $[t_i, t_j]$ and put into the track equivalence class with the same i, j values.
Finally, all tracks are placed in the equivalence class of the corresponding i, j values, and the tracks in each equivalence class have the same start and end times. Note: the timestamps of the data set used by the invention are continuous, that is, if the start and end timestamps of two tracks are the same, all of their timestamps are the same.
S2, cluster the tracks in each track equivalence class according to the IMHDT distance metric, forming several track anonymity groups in each equivalence class, where each anonymity group contains no fewer than k tracks;
The clustering process measures the tracks within the same timestamp equivalence class with a specific similarity function; tracks with higher similarity are placed in the same to-be-anonymized group, and each group contains no fewer than k tracks. The core of the process is how to determine the similarity of two tracks, that is, the choice of track metric function. The classic metric function is the Euclidean distance: first compute the Euclidean distance between the track sampling points at each corresponding timestamp, then take the arithmetic mean over all timestamps; this value is the Euclidean distance between the tracks. The invention proposes a new metric, IMHDT, the Hausdorff distance based on interpolation points under the time constraint, which is computed as follows:
$$dist(p_a, p_b) = \min\big(\lVert p_a - \odot_{\overline{p_{b-1} p_b}} \rVert,\ \lVert p_a - \odot_{\overline{p_b p_{b+1}}} \rVert\big)$$
where $dist(p_a, p_b)$ is the distance between the sampling points $p_a$ and $p_b$ taken at the same timestamp, and the interpolation point $\odot$ on a segment is the point that minimizes $dist(p_a, \odot)$, generally the foot of the perpendicular from the sampling point $p_a$ to the polyline segment $\overline{p_{b-1} p_b}$.
$$dist(Tr_a, Tr_b) = \frac{1}{t} \sum_{i=1}^{t} dist\big(p_a^{(i)}, p_b^{(i)}\big)$$
where $dist(Tr_a, Tr_b)$ is the IMHDT distance between tracks $Tr_a$ and $Tr_b$, and $t$ is the number of sampling points.
Not introducing a time constraint into the track similarity comparison may yield a misleading result. Fig. 5 shows interpolation-track similarity measurement without the time constraint: when interpolation points are searched without the time constraint, the computed IMHD distance between the two tracks in Fig. 5 is very small, yet the two tracks actually run in opposite directions and the distances between their sampling points differ greatly, especially at times t1 and t4. Measuring tracks with IMHD in such a situation therefore does not reflect reality. Fig. 6 shows interpolation-track similarity measurement under the time constraint: the search for an interpolation point is restricted to the track segments adjacent to the same sampling instant, and the resulting IMHDT distance is relatively large, which matches the actual situation. The search range for interpolation points is also greatly reduced, since they no longer need to be searched over the whole track as with IMHD, which improves computational efficiency.
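The effect of the time constraint can be reproduced with a small sketch. The function names are hypothetical and the coordinates are made up for illustration; they are not taken from Fig. 5 or Fig. 6. The helper clamps the projection onto the segment, which gives the same value as the rule "perpendicular foot if it exists, otherwise the nearer endpoint".

```python
import math


def seg_dist(p, a, b):
    """Distance from p to the closest point of segment a-b (clamped projection)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    n = dx * dx + dy * dy
    s = 0.0 if n == 0 else ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / n
    s = max(0.0, min(1.0, s))
    return math.dist(p, (a[0] + s * dx, a[1] + s * dy))


def mean_min_dist(tr1, tr2, time_constrained):
    """IMHDT if time_constrained, otherwise IMHD (search over all segments of tr2)."""
    n = len(tr2)
    total = 0.0
    for i, p in enumerate(tr1):
        if time_constrained:
            # Only the segments of tr2 that touch the sampling point at index i.
            idx = [j for j in (i - 1, i) if 0 <= j < n - 1]
        else:
            idx = range(n - 1)
        total += min(seg_dist(p, tr2[j], tr2[j + 1]) for j in idx)
    return total / len(tr1)


# Two tracks on almost the same path, travelled in opposite directions.
forward = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
backward = [(3.0, 0.1), (2.0, 0.1), (1.0, 0.1), (0.0, 0.1)]
imhd = mean_min_dist(backward, forward, time_constrained=False)
imhdt = mean_min_dist(backward, forward, time_constrained=True)
```

Without the constraint the two tracks look almost identical, because every point lies 0.1 from the other path; with the constraint the first and last sampling points are compared only against segments that are about two units away, so the distance grows by roughly an order of magnitude.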
The invention uses the Hausdorff distance based on interpolation points under the time constraint (IMHDT) as the track metric function and clusters tracks with a greedy clustering algorithm. Because, under the same conditions, the IMHDT distance is less than or equal to the Euclidean distance (proved below), the anonymity groups formed have a smaller generalization radius than those formed with the Euclidean distance, so the clusters have a smaller generalization area, which reduces the track data distortion caused by generalization.
The distance from an anonymous track to the center track computed with the Hausdorff distance based on interpolation points is always less than or equal to the Euclidean distance between the same tracks. The proof is as follows.
The Hausdorff distance of a track sampling point is computed differently in different cases:
1. As shown at t1 in Fig. 7, when no perpendicular can be dropped from the sampling point of the anonymous track onto the track segments adjacent to that timestamp in the center track, no interpolation point can be constructed. The Hausdorff distance of the sampling point then equals the Euclidean distance;
2. As shown at t3 in Fig. 7, when a perpendicular can be dropped from the sampling point of the anonymous track onto only one of the track segments adjacent to that timestamp in the center track, one interpolation point is obtained. The Hausdorff distance of the sampling point is then the Euclidean distance from the sampling point to the interpolation point. The three points (the sampling point, the interpolation point and the corresponding sampling point of the center track) form a right triangle, and since the hypotenuse is longer than either leg, the Hausdorff distance between the sampling points is less than the Euclidean distance;
3. As shown at t2 in Fig. 7, when perpendiculars can be dropped from the sampling point of the anonymous track onto both track segments adjacent to that timestamp in the center track, two interpolation points are obtained. The Hausdorff distance of the sampling point is then the shorter of the Euclidean distances from the sampling point to the two interpolation points. By the same argument, the Hausdorff distance between the sampling points is less than the Euclidean distance.
The IMHDT distance of a track is the mean of the above point distances, and in all three cases the point distance is less than or equal to the Euclidean distance. It follows that the distance from an anonymous track to the center track computed with the Hausdorff distance based on interpolation points is always less than or equal to the Euclidean distance between the same tracks. The two values are equal when none of the sampling points can drop a perpendicular onto the track segments adjacent to its timestamp in the center track.
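The right-triangle step of case 2 can be written out as a short worked inequality. The symbols are introduced only for this illustration and do not appear in the original text: $p$ is the sampling point of the anonymous track, $c$ is the sampling point of the center track at the same timestamp, and $\odot$ is the foot of the perpendicular from $p$ onto a segment of the center track that has $c$ as an endpoint.

```latex
% Right angle at \odot, so Pythagoras applies to the triangle (p, \odot, c):
\lVert p - c \rVert^{2}
  = \lVert p - \odot \rVert^{2} + \lVert \odot - c \rVert^{2}
  \ \ge\ \lVert p - \odot \rVert^{2}
\quad\Longrightarrow\quad
\lVert p - \odot \rVert \ \le\ \lVert p - c \rVert .

% Averaging over the t sampling points preserves the inequality:
\frac{1}{t} \sum_{i=1}^{t} \lVert p_i - \odot_i \rVert
  \ \le\
\frac{1}{t} \sum_{i=1}^{t} \lVert p_i - c_i \rVert .
```

Equality in the first line holds exactly when $\odot = c$, which is consistent with the statement that the two distances coincide when no proper interpolation point can be constructed.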
The invention evaluates the tracks in each to-be-anonymized group and confines the other tracks of the group to the uncertainty region of the center track, thereby achieving co-localization of the tracks and the goal of privacy protection. Moreover, unlike the track (k, δ)-anonymity model, the invention perturbs with interpolation points instead of the original uncertain sampling points, obtaining smaller data distortion, a lower anonymization cost and higher data availability.
Algorithm 2 is the clustering algorithm. Its inputs are the track equivalence classes Ecs and the privacy protection degree k; its output is the set of clustered track equivalence classes clusteredEcs. First, set max_radius; a cluster whose IMHDT distance exceeds this threshold is suppressed. Second, initialize the set of unclustered tracks to empty. Then perform clustering on each preprocessed track equivalence class: initialize the clustered track set clustered, and if the number of tracks in the equivalence class is less than k, suppress the class. Initialize the set active and fill it with all tracks of the equivalence class, to represent the tracks not yet clustered. Initialize the center track set and randomly select one track from active. Compute the IMHDT distance from every unclustered track to that track and take the track with the largest IMHDT distance as the center track; then compute the IMHDT distance from every unclustered track to the center track. Initialize the anonymity cluster, then take the k-1 tracks nearest in IMHDT distance together with the center track to form an anonymity cluster, and add the cluster to the set anonymity. Next, find the farthest of these k-1 nearest tracks; if the IMHDT distance between this track and the center track is greater than the max_radius threshold set earlier, these tracks are suppressed. After this step, whether or not the tracks were accepted as anonymous tracks (added to anonymity), they must be removed from the unclustered set active; when active becomes empty, clustering of this track equivalence class ends. After all equivalence classes have been clustered, the track clustering process of the invention ends.
Algorithm 3 computes the IMHDT distance between two tracks. Its input is two tracks Tr1 and Tr2; its output is the IMHDT distance between them. For each track sampling point $Tr1\_node_{T=t_i}$, first compute the shortest distance to one of the track segments of Tr2 adjacent to timestamp $t_i$, then compute the shortest distance to the other adjacent track segment. The smaller of the two is taken as the IMHDT distance of the point; the point distances are accumulated and then averaged to give the IMHDT distance between the tracks.
Algorithm 4 computes the shortest distance from a track sampling point to a track segment. Its inputs are a track sampling point and the track segment formed by two track sampling points; its output is the shortest distance from the sampling point to the segment. First determine whether an interpolation point exists on the track segment such that the line connecting the sampling point and the interpolation point is perpendicular to the segment. If so, return the Euclidean distance from the sampling point to the interpolation point; if not, return the smaller of the distances from the sampling point to the two endpoints of the segment.
S3: Perform the anonymization operation on each anonymous group so that it satisfies interpolated-trajectory (k, δ)-anonymity.
As shown in Figure 8, this step perturbs the trajectories in the anonymous group so that they meet the interpolated-trajectory (k, δ)-anonymity requirement. Specifically, each trajectory sample point that does not meet the requirement is moved so that its distance to the center trajectory is less than or equal to δ.
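The move applied to a single sample point can be sketched as follows. This is a hedged illustration only: the function name `move_within_delta` is hypothetical, and moving the point along the straight line toward its reference point on the center trajectory is an assumption, since the text specifies only that the final distance must be at most δ.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def move_within_delta(p: Point, q: Point, delta: float) -> Tuple[Point, float]:
    """Move sample point p toward reference point q until dist(p, q) <= delta.

    Returns the (possibly unchanged) point and the translation distance.
    q is the point on the center trajectory that p is measured against.
    """
    d = math.dist(p, q)
    if d <= delta:
        return p, 0.0  # already satisfies the requirement
    # Shrink the offset from q so that the new distance equals delta.
    scale = delta / d
    moved = (q[0] + (p[0] - q[0]) * scale, q[1] + (p[1] - q[1]) * scale)
    return moved, d - delta


# Example: a point 5 units from its reference, with delta = 2.
new_point, cost = move_within_delta((0.0, 5.0), (0.0, 0.0), 2.0)
```

The returned translation distance is the original distance minus δ, so a smaller starting distance to the reference point means a smaller perturbation.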
Using interpolation points instead of trajectory sample points for the anonymization operation reduces the anonymization cost. The proof is as follows. The translation distances required under the two distance measures are:
$\mathrm{Translation}(\mathrm{IMHDT}) = \mathrm{Eurp}(Trp_{\odot i}, Tr_i) - \delta$
$\mathrm{Translation}(\mathrm{Eurp}) = \mathrm{Eurp}(Trp_i, Tr_i) - \delta$
As proved above, the interpolation-point-based Hausdorff distance from an anonymous trajectory to the center trajectory is always less than or equal to the Euclidean distance between the same trajectories:
$\mathrm{IMHDT}(Trp_i, Tr_i) = \mathrm{Eurp}(Trp_{\odot i}, Tr_i) \le \mathrm{Eurp}(Trp_i, Tr_i)$
Because the trajectory uncertainty threshold δ is fixed, subtracting δ from both sides gives $\mathrm{Translation}(\mathrm{IMHDT}) \le \mathrm{Translation}(\mathrm{Eurp})$. This proves that using interpolation points instead of trajectory sample points for the anonymization operation reduces the anonymization cost of trajectory points.
The anonymous privacy protection method based on interpolation points proposed by the embodiments of the present invention has the following beneficial effects:
1. During trajectory preprocessing, multiple preprocessing segment values are used to regularize the trajectories. The number of regularized trajectories retained and the quality of the anonymized trajectories under the different segment values are compared comprehensively to determine the segment value. This helps reduce trajectory suppression during preprocessing and benefits the subsequent trajectory anonymization.
2. Trajectory uncertainty theory is introduced into the interpolation-point anonymity model. Trajectory data is distinctive in that every sample point may become a quasi-identifier, so moving the sample points directly increases the anonymization cost. Introducing the inherent uncertainty region of a trajectory as its anonymity region therefore helps reduce the anonymization cost.
3. During trajectory clustering, trajectory distance is measured with the interpolation-point-based Hausdorff distance, and it is proved theoretically that this distance from an anonymous trajectory to the center trajectory is always less than or equal to the Euclidean distance between the same trajectories. Clustering with this distance therefore yields clusters that cover a smaller area than clustering with the Euclidean distance.
4. An anonymity model is proposed in which, during trajectory anonymization, interpolation points on the trajectory segments adjacent to a sample point replace the sample point. Using this model reduces trajectory perturbation during anonymization and thus reduces data distortion, increasing data utility while still protecting the privacy of the published data.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (5)
1. An anonymous privacy protection method based on interpolation points, characterized in that the method comprises the following steps:
S1: preprocessing the original trajectory data set Ts to form several trajectory equivalence classes Ecs whose timestamps are consistent;
S2: clustering the trajectories in each trajectory equivalence class according to the IMHDT distance metric to form several trajectory anonymous groups in each trajectory equivalence class, wherein each anonymous group contains no fewer than k trajectories;
S3: perturbing the trajectories in each anonymous group so that they finally satisfy interpolated-trajectory (k, δ)-anonymity.
2. The anonymous privacy protection method based on interpolation points according to claim 1, characterized in that step S1 comprises the following steps:
S11: defining a trajectory processing segment value $P_i$;
S12: obtaining the start and end timestamps $\{t_b, t_e\}$ of the original trajectory Tr;
S13: obtaining the timestamp $t_i$ that is later than the start time $t_b$ and equals 0 modulo $P_i$, and the timestamp $t_j$ that is earlier than the end time $t_e$ and equals 0 modulo $P_i$;
S14: truncating the original trajectory to $\{t_i, t_j\}$ and placing it in the trajectory equivalence class D{i, j}.
3. The anonymous privacy protection method based on interpolation points according to claim 1, characterized in that step S2 comprises the following steps:
S21: placing the unclustered trajectories of each trajectory equivalence class into the active set, and randomly selecting one trajectory from the active set;
S22: calculating the IMHDT distance from the other trajectories in the active set to the selected trajectory, and taking the trajectory with the largest IMHDT distance as the center trajectory;
S23: calculating the IMHDT distance from the other trajectories in the active set to the center trajectory;
S24: forming an anonymous cluster from the center trajectory and the k-1 trajectories nearest to it by IMHDT distance, and adding the anonymous cluster to the anonymity set;
S25: obtaining the farthest of the k-1 nearest trajectories, and suppressing the anonymous cluster if the IMHDT distance between that trajectory and the center trajectory is greater than the threshold max_radius;
wherein the IMHDT distance is the Hausdorff distance of interpolation points under a time constraint.
4. The anonymous privacy protection method based on interpolation points according to claim 3, characterized in that the IMHDT distance between two trajectories Tr1 and Tr2 is calculated as follows:
S221: calculating the shortest distance from each trajectory sample point $Tr1\_node_{T=t_i}$ to a first trajectory segment;
S222: calculating the shortest distance from the trajectory sample point $Tr1\_node_{T=t_i}$ to a second trajectory segment;
S223: taking the smaller of the distances from steps S221 and S222 as the IMHDT distance of the trajectory sample point $Tr1\_node_{T=t_i}$;
S224: taking the average of the IMHDT distances of all trajectory sample points of Tr1 as the IMHDT distance between trajectories Tr1 and Tr2.
5. The anonymous privacy protection method based on interpolation points according to claim 4, characterized in that the shortest distance from a trajectory sample point to a trajectory segment is calculated as follows:
determining whether there is an interpolation point on the trajectory segment such that the line joining the trajectory sample point and the interpolation point is perpendicular to the trajectory segment;
if there is, the Euclidean distance from the trajectory sample point to the interpolation point is the shortest distance from the trajectory sample point to the trajectory segment;
if there is not, the smaller of the distances from the trajectory sample point to the two endpoints of the trajectory segment is the shortest distance from the trajectory sample point to the trajectory segment.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910340914.5A CN110162997B (en) | 2019-04-25 | 2019-04-25 | Anonymous privacy protection method based on interpolation points |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110162997A (en) | 2019-08-23 |
CN110162997B (en) | 2021-01-01 |
Family
ID=67640021
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910340914.5A Active CN110162997B (en) | 2019-04-25 | 2019-04-25 | Anonymous privacy protection method based on interpolation points |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110162997B (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160285827A1 (en) * | 2012-02-23 | 2016-09-29 | Tenable Network Security, Inc. | System and method for facilitating data leakage and/or propagation tracking |
CN103605362A (en) * | 2013-09-11 | 2014-02-26 | 天津工业大学 | Learning and anomaly detection method based on multi-feature motion modes of vehicle traces |
CN105760780A (en) * | 2016-02-29 | 2016-07-13 | 福建师范大学 | Trajectory data privacy protection method based on road network |
CN107358113A (en) * | 2017-06-01 | 2017-11-17 | 徐州医科大学 | Based on the anonymous difference method for secret protection of micro- aggregation |
CN108734022A (en) * | 2018-04-03 | 2018-11-02 | 安徽师范大学 | The secret protection track data dissemination method divided based on three-dimensional grid |
CN108733774A (en) * | 2018-04-27 | 2018-11-02 | 上海世脉信息科技有限公司 | A kind of unemployment dynamic monitoring method based on big data |
CN109376184A (en) * | 2018-10-16 | 2019-02-22 | 网链科技集团有限公司 | A method of windward driving is taken based on big data |
Non-Patent Citations (2)
Title |
---|
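ZHAOWEI HU, JING YANG, JIANPEI ZHANG: "Trajectory privacy protection method based on the time interval divided", Computers & Security * |
Guo Liangmin, Wang Anxin, Zheng Xiaoyao (郭良敏、王安鑫、郑孝遥): "Trajectory privacy protection method based on region partitioning" (基于区域划分的轨迹隐私保护方法), Journal of Computer Applications (计算机应用) * |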
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111026930A (en) * | 2019-12-02 | 2020-04-17 | 东北大学 | Track data privacy protection method based on track segmentation |
CN111026930B (en) * | 2019-12-02 | 2021-06-01 | 东北大学 | Track data privacy protection method based on track segmentation |
CN111259434A (en) * | 2020-01-08 | 2020-06-09 | 广西师范大学 | Privacy protection method for individual preference position in track data release |
CN111259434B (en) * | 2020-01-08 | 2022-04-12 | 广西师范大学 | Privacy protection method for individual preference position in track data release |
CN111625587A (en) * | 2020-05-28 | 2020-09-04 | 泰康保险集团股份有限公司 | Data sharing apparatus |
CN111625587B (en) * | 2020-05-28 | 2022-02-15 | 泰康保险集团股份有限公司 | Data sharing apparatus |
CN112883423A (en) * | 2021-02-25 | 2021-06-01 | 吉林师范大学 | Similarity-based k-anonymous privacy protection method for release track |
CN112883423B (en) * | 2021-02-25 | 2023-02-17 | 吉林师范大学 | Similarity-based k-anonymous privacy protection method for release track |
CN113672975A (en) * | 2021-08-03 | 2021-11-19 | 支付宝(杭州)信息技术有限公司 | Privacy protection method and device for user track |
Also Published As
Publication number | Publication date |
---|---|
CN110162997B (en) | 2021-01-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||