CN110162997A - Anonymous method for secret protection based on interpolation point - Google Patents


Info

Publication number
CN110162997A
Authority
CN
China
Prior art keywords
track
distance
anonymous
imhdt
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910340914.5A
Other languages
Chinese (zh)
Other versions
CN110162997B (en
Inventor
汪小寒
张泽培
何增宇
王涛春
孙丽萍
郑孝遥
罗永龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Normal University
Original Assignee
Anhui Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Normal University filed Critical Anhui Normal University
Priority to CN201910340914.5A priority Critical patent/CN110162997B/en
Publication of CN110162997A publication Critical patent/CN110162997A/en
Application granted granted Critical
Publication of CN110162997B publication Critical patent/CN110162997B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/10Geometric CAD
    • G06F30/18Network design, e.g. design based on topological or interconnect aspects of utility systems, piping, heating ventilation air conditioning [HVAC] or cabling

Abstract

The present invention belongs to the technical field of privacy protection and provides an interpolation-point-based anonymity method for privacy protection. The method comprises the following steps: S1, preprocess the original trajectory data set Ts to form several trajectory equivalence classes Ecs whose trajectories have consistent timestamps; S2, cluster the trajectories in each trajectory equivalence class according to the IMHDT distance metric, forming several trajectory anonymity groups within each equivalence class, where each anonymity group contains no fewer than k trajectories; S3, perturb the trajectories in each anonymity group so that the group finally satisfies interpolation-trajectory (k, δ)-anonymity. Using the trajectory timestamps as a constraint, interpolation points are restricted to the trajectory segments of the corresponding timestamps, which reduces data distortion during anonymization and increases data availability while still protecting the privacy of the published data.

Description

Anonymous method for secret protection based on interpolation point
Technical field
The invention belongs to the technical field of privacy protection and provides an interpolation-point-based anonymity method for privacy protection.
Background
In modern society, trajectory information can be conveniently collected and shared by GPS-equipped mobile phones, PDAs, car navigation systems, smart wearable devices and the like. Users can therefore easily use location-based services (LBS), such as "find nearby gas stations" or "record my movement trajectory". The collected trajectory information can support business decisions, for example opening a supermarket in an area where location information is dense, since such areas usually have high commercial value and maximize the investor's return. It can also support applications such as urban planning. Trajectory information is highly valuable because it contains specific spatio-temporal information, but it can also be collected and analyzed by malicious parties, leaking user privacy.
The published data set therefore needs to be anonymized to solve the privacy leakage problem. At the same time, the data output by a privacy protection system should not excessively change trajectory characteristics such as the length and duration of the users' trajectories. How to preserve the availability of the published data while ensuring that individual trajectories cannot be identified by an attacker is the problem that current trajectory privacy protection applications need to address. Among the many existing methods for protecting privacy in trajectory data publishing, most do not consider the availability of the published data.
Summary of the invention
The embodiments of the present invention provide an interpolation-point-based anonymity method for privacy protection. Using the trajectory timestamps as a constraint, interpolation points are restricted to the trajectory segments of the corresponding timestamps, which reduces data distortion during anonymization and increases data availability while still protecting the privacy of the published data.
To achieve the above object, the present invention provides an interpolation-point-based anonymity method for privacy protection, which comprises the following steps:
S1, preprocess the original trajectory data set Ts to form several trajectory equivalence classes Ecs whose trajectories have consistent timestamps;
S2, cluster the trajectories in each trajectory equivalence class according to the IMHDT distance metric, forming several trajectory anonymity groups within each equivalence class, where each anonymity group contains no fewer than k trajectories;
S3, perturb the trajectories in each anonymity group so that the group finally satisfies interpolation-trajectory (k, δ)-anonymity.
Further, step S1 comprises the following steps:
S11, define the trajectory preprocessing fragment value P_i;
S12, obtain the start and end timestamps {t_b, t_e} of the original trajectory Tr;
S13, obtain a timestamp t_i that is later than the start time t_b and satisfies t_i mod P_i = 0, and a timestamp t_j that is earlier than the end time t_e and satisfies t_j mod P_i = 0;
S14, truncate the original trajectory to {t_i, t_j} and place it in the trajectory equivalence class D{i, j}.
Further, step S2 comprises the following steps:
S21, put the trajectories of each trajectory equivalence class that have not yet been clustered into the active set, and select one trajectory at random from the active set;
S22, calculate the IMHDT distance from every other trajectory in the active set to the selected trajectory, and take the trajectory with the largest IMHDT distance as the central trajectory;
S23, calculate the IMHDT distance from every other trajectory in the active set to the central trajectory;
S24, form an anonymity cluster from the central trajectory and the k-1 trajectories with the smallest IMHDT distance to it, and add the anonymity cluster to the anonymity set;
S25, identify the farthest of the k-1 nearest trajectories; if the IMHDT distance between that trajectory and the central trajectory is greater than the threshold max_radius, suppress the anonymity cluster;
The IMHDT distance is the Hausdorff distance of interpolation points under a time constraint.
Further, the IMHDT distance between two trajectories Tr1 and Tr2 is calculated as follows:
S221, for each trajectory sampling point Tr1_node_{T=t_i}, calculate the shortest distance from the sampling point to one trajectory segment of Tr2 adjacent to timestamp t_i;
S222, calculate the shortest distance from the sampling point Tr1_node_{T=t_i} to the other trajectory segment of Tr2 adjacent to timestamp t_i;
S223, take the smaller of the distances from step S221 and step S222 as the IMHDT distance of the sampling point Tr1_node_{T=t_i};
S224, the average of the IMHDT distances of all sampling points of trajectory Tr1 is the IMHDT distance between trajectories Tr1 and Tr2.
Further, the shortest distance from a trajectory sampling point to a trajectory segment is calculated as follows:
Determine whether there is an interpolation point on the trajectory segment such that the line from the trajectory sampling point to the interpolation point is perpendicular to the segment;
If there is, the Euclidean distance from the trajectory sampling point to the interpolation point is the shortest distance from the sampling point to the segment;
If there is not, the smaller of the distances from the trajectory sampling point to the two endpoints of the segment is the shortest distance from the sampling point to the segment.
The interpolation-point-based anonymity method for privacy protection proposed by the embodiments of the present invention has the following beneficial effects:
1. Several preprocessing fragment values are used to regularize the trajectories during preprocessing. The number of trajectories retained under each fragment value and the quality of the resulting anonymized trajectories are compared comprehensively to choose the fragment value, which reduces trajectory suppression during preprocessing and facilitates the subsequent anonymization.
2. Trajectory uncertainty theory is introduced into the interpolation-point anonymity model. Trajectory data are distinctive in that every sampling point may become a quasi-identifier, so moving sampling points directly increases the anonymization cost. Using the inherent uncertainty region of a trajectory as its anonymity region therefore helps reduce the anonymization cost.
3. The interpolation-point-based Hausdorff distance is used to measure the distance between trajectories during clustering. It is proved theoretically that the interpolation-point-based Hausdorff distance from an anonymized trajectory to the central trajectory is always less than or equal to the Euclidean distance between the same trajectories. Clustering with this distance therefore yields clusters with a smaller generalization area than clustering with the Euclidean distance.
4. An anonymity model is proposed in which an interpolation point on a trajectory segment adjacent to a sampling point replaces that sampling point during anonymization. This model reduces trajectory perturbation during anonymization, which reduces data distortion and increases data availability while still protecting the privacy of the published data.
Brief description of the drawings
Fig. 1 is a schematic diagram of the uncertainty region of a trajectory sampling point provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of three original trajectories that do not satisfy trajectory (3, δ)-anonymity, provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of the three original trajectories after anonymization, provided by an embodiment of the present invention;
Fig. 4 is a flowchart of the interpolation-point-based anonymity method for privacy protection provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of interpolation-trajectory similarity measurement without a time constraint, provided by an embodiment of the present invention;
Fig. 6 is a schematic diagram of interpolation-trajectory similarity measurement under a time constraint, provided by an embodiment of the present invention;
Fig. 7 is a schematic diagram comparing the Euclidean distance between trajectories with the interpolation-point-based Hausdorff distance, provided by an embodiment of the present invention;
Fig. 8 is a schematic diagram of the interpolation-point-based anonymization operation provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and are not intended to limit it.
Definitions of related terms:
1) Trajectory
A trajectory is the route travelled by a moving object over a period of time. Trajectory information is collected by sensing devices equipped with a positioning system; these devices store the coordinates of the moving object at the corresponding times and send them to the trajectory collector. A trajectory can be represented in two ways.
In the first representation, a trajectory Tr consists of a sequence of triples (t_i, x_i, y_i) ordered by timestamp:
$$Tr = \{(t_1,x_1,y_1),(t_2,x_2,y_2),\ldots,(t_n,x_n,y_n)\}$$
where (x_i, y_i) are the coordinates of the trajectory at timestamp t_i, 1 ≤ i ≤ n.
In the second representation, a trajectory consists of a sequence of connected line segments:
$$Tr = \overline{p_1 p_2},\ \overline{p_2 p_3},\ \ldots,\ \overline{p_{len_{Tr}-1}\,p_{len_{Tr}}}$$
where p_j is a sampling point of trajectory Tr, len_Tr is the length of trajectory Tr, and each segment between two consecutive trajectory points is a piecewise-linear approximation of the real path. As the sampling interval approaches 0, the trajectory approaches the real route of movement, but the higher the sampling frequency, the higher the cost of storing and parsing the trajectory.
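The two representations can be held in one data structure. The following is a minimal sketch; the names `Sample` and `segments` are hypothetical and chosen here only for illustration. The triple form is stored directly, and the segment form is derived from it by pairing consecutive sampling points.

```python
from typing import List, NamedTuple, Tuple


class Sample(NamedTuple):
    """One sampling point (t_i, x_i, y_i) of a trajectory."""
    t: int      # timestamp
    x: float    # x coordinate
    y: float    # y coordinate


def segments(traj: List[Sample]) -> List[Tuple[Sample, Sample]]:
    """Segment view of a trajectory: consecutive sampling points joined pairwise."""
    return list(zip(traj, traj[1:]))


tr = [Sample(1, 0.0, 0.0), Sample(2, 1.0, 0.0), Sample(3, 1.0, 1.0)]
segs = segments(tr)  # two segments: p1-p2 and p2-p3
```

A trajectory with n sampling points yields n-1 segments, so storage and parsing cost grow with the sampling frequency, as noted above.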
2) Hausdorff distance
The Hausdorff distance is a measure of the distance between two point sets that originates in image processing. Given two point sets T_i = {a_1, a_2, ..., a_m} and T_j = {b_1, b_2, ..., b_n}, the Hausdorff distance between T_i and T_j is defined as:
$$H(T_i,T_j)=\max\bigl(h(T_i,T_j),\,h(T_j,T_i)\bigr)$$
where
$$h(T_i,T_j)=\max_{a\in T_i}\,\min_{b\in T_j}\lVert a-b\rVert$$
Because the original Hausdorff distance uses a maximum and a minimum to compute the distance between point sets, it is strongly affected by outliers. To make the Hausdorff distance more robust to isolated points and noise, an improved Hausdorff distance has been proposed that reduces the influence of outliers by averaging. The improved Hausdorff distance is expressed as:
$$h_{MHD}(T_i,T_j)=\frac{1}{m}\sum_{a\in T_i}\,\min_{b\in T_j}\lVert a-b\rVert$$
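The difference between the classical and the averaged Hausdorff distance is easiest to see on a point set with one outlier. The sketch below is illustrative only: the function names are hypothetical, and taking the larger of the two directed averages for the improved distance is an assumption based on the common formulation, not something this text specifies.

```python
import math


def directed(a_pts, b_pts, reduce):
    """Reduce (max or mean) the nearest-neighbour distances from a_pts to b_pts."""
    nearest = [min(math.dist(a, b) for b in b_pts) for a in a_pts]
    return reduce(nearest)


def mean(values):
    return sum(values) / len(values)


def hausdorff(a_pts, b_pts):
    """Classical Hausdorff distance: max-min in both directions."""
    return max(directed(a_pts, b_pts, max), directed(b_pts, a_pts, max))


def modified_hausdorff(a_pts, b_pts):
    """Averaged variant: mean-min in both directions (symmetrisation is an assumption)."""
    return max(directed(a_pts, b_pts, mean), directed(b_pts, a_pts, mean))


A = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]   # (10, 0) is an outlier
B = [(0.0, 1.0), (1.0, 1.0)]
h_classic = hausdorff(A, B)
h_average = modified_hausdorff(A, B)
```

In this example the single outlier determines the classical distance entirely, while the averaged distance is pulled up by it only in proportion to its share of the points.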
Hausdorff distance based on interpolation point is expressed as follows:
3) Uncertainty of trajectory sampling points
Existing positioning technology cannot locate a coordinate point with perfect accuracy, so the actual position is usually generalized to a circular region. This property is the basis of trajectory (k, δ)-anonymity. The present invention also uses this property to anonymize trajectories, but instead of using the uncertainty region of the original trajectory sampling point, it replaces the sampling point with an interpolation point to obtain a lower anonymization cost.
Because positioning in practice is inexact, assume the inexactness is represented by an uncertainty threshold δ. The circular region centered on a trajectory sampling point with radius δ is then the uncertainty region of that sampling point (as shown in Fig. 1):
$$dist(p_{real}, p) \le \delta$$
where p_real is the real position on the trajectory and p is the sampling point; p_real may lie anywhere in the uncertainty region.
4) Co-localization of trajectories
To achieve anonymization, the corresponding sampling points of every pair of trajectories must lie within each other's uncertainty regions, so that the trajectories are co-localized. Define a central trajectory Tr and an anonymized trajectory Tr′:
$$Tr = \{(t_1,x_1,y_1),(t_2,x_2,y_2),\ldots,(t_n,x_n,y_n)\}$$
$$Tr' = \{(t_1,x'_1,y'_1),(t_2,x'_2,y'_2),\ldots,(t_n,x'_n,y'_n)\}$$
Each sampling point of Tr′ lies in the uncertainty region of the corresponding sampling point of the central trajectory. Assuming the Euclidean distance is used as the trajectory metric, the following must hold:
$$\sqrt{(x_i-x'_i)^2+(y_i-y'_i)^2}\le\delta,\quad 1\le i\le n$$
The two trajectories are then said to be co-localized, denoted Coloc(Tr, Tr′).
5) Trajectory (k, δ)-anonymity group
If every pair of trajectories in a trajectory anonymity group is co-localized with uncertainty threshold δ, and the group contains at least k trajectories, then the group is a trajectory (k, δ)-anonymity group.
Trajectory (k, δ)-anonymity builds on the uncertainty of trajectory sampling points described above. In Fig. 2, the uncertainty region of the first sampling point of the central trajectory is the circle centered on that point with radius δ; the corresponding sampling points of trajectories Tr1 and Tr2 both lie in this region, so Tr1, Tr2 and the central trajectory satisfy (3, δ)-anonymity at the first sampling point. The second sampling point satisfies only (2, δ)-anonymity, because the corresponding sampling point of Tr2 is outside the uncertainty region. Likewise, the third and fourth sampling points satisfy only (2, δ)-anonymity. To convert the set of these three trajectories into a trajectory (3, δ)-anonymity group, the offending sampling points must be moved, and this operation causes data distortion. Fig. 3 shows the three trajectories after anonymization, when they satisfy (3, δ)-anonymity: the gray sampling points in Fig. 3 are obtained by moving the original points that did not satisfy the condition. All trajectories are now co-localized, so the group is a (3, δ)-anonymity group.
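The co-localization condition and the (k, δ)-anonymity group condition can both be checked mechanically. The sketch below is illustrative: the function names `coloc` and `is_k_delta_group` are hypothetical, trajectories are assumed to be lists of (t, x, y) triples with identical timestamps, and the Euclidean distance is assumed as the point metric.

```python
import math
from itertools import combinations


def coloc(tr_a, tr_b, delta):
    """True if every pair of same-timestamp sampling points is within delta."""
    if len(tr_a) != len(tr_b):
        return False
    return all(
        ta == tb and math.hypot(xa - xb, ya - yb) <= delta
        for (ta, xa, ya), (tb, xb, yb) in zip(tr_a, tr_b)
    )


def is_k_delta_group(group, k, delta):
    """True if the group has at least k trajectories and all pairs are co-localized."""
    return len(group) >= k and all(
        coloc(a, b, delta) for a, b in combinations(group, 2)
    )


center = [(1, 0.0, 0.0), (2, 1.0, 0.0)]
tr1 = [(1, 0.0, 0.5), (2, 1.0, 0.5)]
tr2 = [(1, 0.0, -0.5), (2, 1.0, 2.0)]   # second point lies outside the region
```

With δ = 1, `center` and `tr1` form a (2, δ)-anonymity group, while adding `tr2` breaks the condition at the second timestamp, which mirrors the situation described for Fig. 2.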
The problem to be solved by the present invention is to anonymize the original trajectory data set so as to reduce the risk of privacy leakage for trajectory owners under the above attack assumption. Any operation on the original trajectory data set distorts the published data and reduces the value of the published data set.
Evaluation indices
1) Data distortion
Clustering groups data according to their own features, so that similarity between clusters is low and similarity within a cluster is high. Similarity is the key to any clustering process, and trajectory clustering is no exception: how to express the similarity between trajectories is the core of a trajectory clustering algorithm. The classical trajectory k-anonymity algorithm NWA computes similarity with the Euclidean distance, whereas the present invention uses the Hausdorff distance computed on interpolation points, which is proved in this document to be less than or equal to the widely used Euclidean distance. Clustering with the Hausdorff distance therefore forms clusters with a smaller generalization area, and a smaller generalization area reduces data distortion during clustering.
Data distortion
where len(Ecs) is the number of clusters after clustering, ClusterArea(Ec_i) is the generalization area of cluster Ec_i, and MaxArea is the total area of the trajectory region.
2) Anonymization cost
Trajectory anonymization performs data transformation within the clusters formed by trajectory clustering, that is, it moves trajectory sampling points so that they satisfy (k, δ)-anonymity. Since each cluster already contains at least k trajectories after clustering, this step only needs to ensure that the distance from every sampling point of each trajectory to the central trajectory is no more than δ.
Anonymization cost
where TranslationNode is the distance a trajectory sampling point is moved, and maxTranslation is the moving distance of all points in the trajectory.
Fig. 4 is a flowchart of the interpolation-point-based anonymity method for privacy protection provided by an embodiment of the present invention. The method comprises the following steps:
S1, preprocess the original trajectory data set Ts to form several trajectory equivalence classes Ecs whose trajectories have consistent timestamps;
Because the sampling times of different real-world trajectories differ considerably, which affects the measurement of trajectory similarity, the original trajectories need to be divided into several equivalence classes by timestamp, where the trajectories of each equivalence class have consistent timestamps. However, few trajectories share exactly the same timestamps, so classifying directly by timestamp inevitably produces too many equivalence classes with too few trajectories each. An equivalence class with fewer than k trajectories cannot satisfy k-anonymity and is suppressed, which degrades the quality of the anonymized data. The present invention adopts the trajectory preprocessing of the NWA algorithm: by suppressing some trajectory points, each trajectory equivalence class obtains more trajectories. Compared with suppressing entire equivalence classes as described above, this retains a large number of trajectories and greatly improves anonymization quality.
Because choosing the preprocessing fragment value requires a trade-off between the number of trajectories retained and the quality of the anonymized data, the present invention preprocesses with several different fragment values, forming several groups of equivalence classes. In the experimental part, each index is analyzed comprehensively and a suitable fragment value is selected for the experiments, balancing the loss of anonymized-trajectory quality against the amount of suppressed trajectory data.
Algorithm 1 is the preprocessing of the original trajectories. Its inputs are the original trajectory set Ts and the preprocessing fragment value P_i; its output is the trajectory equivalence classes produced with that fragment value, each of which has consistent trajectory timestamps. For each trajectory in the data set, the start and end timestamps [t_b, t_e] are recorded first. Then t_i is taken as a timestamp greater than t_b with t_i mod P_i = 0, and t_j as a timestamp less than the end timestamp t_e with t_j mod P_i = 0. The trajectory is truncated to [t_i, t_j] and placed in the trajectory equivalence class with the same i, j values.
Finally, all trajectories are placed in the equivalence class with the corresponding i, j values in the equivalence class set, and the trajectories in each equivalence class have the same start and end times. Note: the timestamps of the data set used by the present invention are continuous, that is, if two trajectories have the same start and end timestamps, all of their timestamps are the same.
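The preprocessing described for Algorithm 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `preprocess` and the parameter `p` (the fragment value) are hypothetical, timestamps are assumed to be integers, a timestamp equal to t_b or t_e that is already a multiple of `p` is kept (an assumption about the boundary case), and trajectories too short to span one fragment are assumed to be suppressed.

```python
from collections import defaultdict


def preprocess(trajectories, p):
    """Group trajectories into equivalence classes keyed by (t_i, t_j)."""
    classes = defaultdict(list)
    for traj in trajectories:              # traj: list of (t, x, y), sorted by t
        t_b, t_e = traj[0][0], traj[-1][0]
        t_i = -(-t_b // p) * p             # first multiple of p that is >= t_b
        t_j = (t_e // p) * p               # last multiple of p that is <= t_e
        if t_i >= t_j:
            continue                       # assumption: too short, suppressed
        trimmed = [pt for pt in traj if t_i <= pt[0] <= t_j]
        classes[(t_i, t_j)].append(trimmed)
    return dict(classes)


trajs = [
    [(t, float(t), 0.0) for t in range(3, 18)],    # spans 3..17
    [(t, 0.0, float(t)) for t in range(5, 16)],    # spans 5..15
    [(t, 1.0, 1.0) for t in range(11, 14)],        # spans 11..13, too short
]
ecs = preprocess(trajs, 5)
```

With a fragment value of 5, the first two trajectories are both trimmed to the interval [5, 15] and land in the same equivalence class, although their original start and end times differ.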
S2, cluster the trajectories in each trajectory equivalence class according to the IMHDT distance metric, forming several trajectory anonymity groups within each equivalence class, where each anonymity group contains no fewer than k trajectories;
Clustering measures the trajectories within the same timestamp equivalence class using a specific similarity function; trajectories with high similarity are placed in the same group to be anonymized, and each such group contains no fewer than k trajectories. The core of the process is how to determine the similarity of two trajectories, that is, the choice of trajectory metric function. The classical metric is the Euclidean distance: the Euclidean distance between the sampling points at each corresponding timestamp is computed first, and the arithmetic mean over all timestamps is the Euclidean distance between the trajectories. The present invention proposes a new metric, IMHDT, the Hausdorff distance of interpolation points under a time constraint, which is calculated as follows:
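$$dist(p_a, p_b) = \min_{\odot \in \overline{p_{b-1} p_b}} dist(p_a, \odot)$$
where dist(p_a, p_b) is the distance between sampling points p_a and p_b, and the interpolation point ⊙ minimizes dist(p_a, ⊙); it is generally the foot of the perpendicular from sampling point p_a to segment p_{b-1}p_b.
$$dist(Tr_a, Tr_b) = \frac{1}{t}\sum_{i=1}^{t} dist(p_a^{i}, p_b^{i})$$
where dist(Tr_a, Tr_b) is the IMHDT distance between trajectories Tr_a and Tr_b, and t is the number of sampling points.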
Without a time constraint, trajectory similarity comparison can give the opposite of the true result. Fig. 5 shows interpolation-trajectory similarity measurement without a time constraint: when interpolation points are searched without a time constraint, the IMHD distance computed for the two trajectories in Fig. 5 is very small, yet the two trajectories actually run in opposite directions and their sampling points are far apart, especially at times t1 and t4. Measuring trajectories with IMHD therefore does not reflect reality in such cases. Fig. 6 shows interpolation-trajectory similarity measurement under a time constraint: the search for interpolation points is restricted to the trajectory segments adjacent to the same sampling time, and the resulting IMHDT distance is relatively large, which matches reality. The search range for interpolation points is also greatly reduced, since the full-trajectory search of IMHD is not needed, so computational efficiency improves.
The present invention uses the interpolation-point-based Hausdorff distance under a time constraint (IMHDT) as the trajectory metric function and clusters trajectories with a greedy clustering algorithm. Since the IMHDT distance is less than or equal to the Euclidean distance under the same conditions (a proof is given below), the anonymity groups formed by clustering have a smaller generalization radius than with the Euclidean distance, so the clusters have a smaller generalization area, which reduces the trajectory data distortion caused by generalization.
The interpolation-point-based Hausdorff distance from an anonymized trajectory to the central trajectory is always less than or equal to the Euclidean distance between the same trajectories. The proof is as follows:
The Hausdorff distance of a trajectory sampling point is calculated differently in different cases:
1. As shown at t1 in Fig. 7, when no perpendicular can be dropped from the sampling point of the anonymized trajectory onto the central-trajectory segments adjacent to that timestamp, no interpolation point can be constructed. The Hausdorff distance of the sampling point then equals the Euclidean distance.
2. As shown at t3 in Fig. 7, when a perpendicular can be dropped from the sampling point of the anonymized trajectory onto only one of the central-trajectory segments adjacent to that timestamp, one interpolation point is obtained. The Hausdorff distance of the sampling point is then the Euclidean distance from the sampling point to the interpolation point. The three points form a right triangle, and since the hypotenuse of a right triangle is longer than either leg, the Hausdorff distance between the sampling points is less than the Euclidean distance.
3. As shown at t2 in Fig. 7, when a perpendicular can be dropped from the sampling point of the anonymized trajectory onto both central-trajectory segments adjacent to that timestamp, two interpolation points are obtained. The Hausdorff distance of the sampling point is then the shorter of the Euclidean distances from the sampling point to the two interpolation points. By the same argument, the Hausdorff distance between the sampling points is less than the Euclidean distance.
The IMHDT distance of a trajectory is the mean of the point distances above, and in all three cases the point distance is less than or equal to the Euclidean distance. It follows that the interpolation-point-based Hausdorff distance from an anonymized trajectory to the central trajectory is always less than or equal to the Euclidean distance between the same trajectories. The two values are equal when no sampling point can drop a perpendicular onto the central-trajectory segments adjacent to its timestamp.
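The right-triangle argument can be written out explicitly. In the sketch below, the labels p, c and ⊙ are introduced here for illustration and are not the patent's notation: p is a sampling point of the anonymized trajectory at timestamp t_i, c is the sampling point of the central trajectory at the same timestamp, and ⊙ is the foot of the perpendicular from p onto a central-trajectory segment that has c as an endpoint, so the right angle is at ⊙.

$$
\lVert p - c\rVert^{2} = \lVert p - \odot\rVert^{2} + \lVert \odot - c\rVert^{2} \ \ge\ \lVert p - \odot\rVert^{2}
\quad\Longrightarrow\quad
\lVert p - \odot\rVert \le \lVert p - c\rVert
$$

Writing d_i for the per-point distance at timestamp t_i (equal to the distance to the interpolation point when one exists, and to the Euclidean distance otherwise, as in case 1), averaging over the n timestamps gives

$$
IMHDT(Tr', Tr) = \frac{1}{n}\sum_{i=1}^{n} d_i \ \le\ \frac{1}{n}\sum_{i=1}^{n} \lVert p_i - c_i\rVert = Eurp(Tr', Tr)
$$

with equality exactly when every d_i equals the corresponding Euclidean distance.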
For the trajectories in a group to be anonymized, the present invention restricts the other trajectories of the group to the uncertainty region of the central trajectory, thereby co-localizing the trajectories and achieving privacy protection. Moreover, unlike the trajectory (k, δ)-anonymity model, the present invention perturbs toward interpolation points instead of the sampling points of the uncertainty region, which yields less data distortion, a lower anonymization cost and higher data availability.
Algorithm 2 is the clustering algorithm. Its inputs are the trajectory equivalence classes Ecs and the privacy protection degree k; its output is the set of clustered trajectory equivalence classes clusteredEcs. First, max_radius is set: a cluster whose IMHDT distance exceeds this threshold is suppressed. Second, the set of unclustered trajectories is initialized to empty. The clustering operation is then executed on each preprocessed trajectory equivalence class. The clustered trajectory set clustered is initialized; if the equivalence class contains fewer than k trajectories, it is suppressed. The active set is initialized and filled with all trajectories of the equivalence class, representing the trajectories not yet clustered. The central trajectory set is initialized, and one trajectory is selected at random from the active set. The IMHDT distance from every unclustered trajectory to this trajectory is calculated, and the trajectory with the largest IMHDT distance is chosen as the first central trajectory. The IMHDT distance from every unclustered trajectory to the central trajectory is then calculated. The anonymity cluster anonymity is initialized, and the k-1 trajectories with the smallest IMHDT distance together with the central trajectory form an anonymity cluster, which is added to the anonymity set. Before that, the farthest of the k-1 nearest trajectories is identified; if its IMHDT distance to the central trajectory exceeds the threshold max_radius set earlier, these trajectories are suppressed. Whether or not the trajectories are accepted as anonymized trajectories (added to the anonymity set) in this step, they must be deleted from the unclustered set active. When the active set becomes empty, the clustering of this equivalence class ends. When all equivalence classes have been clustered, the trajectory clustering process of the present invention ends.
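The greedy loop of Algorithm 2 for a single equivalence class can be sketched as below. This is an illustration under stated assumptions rather than the patented implementation: the function name `greedy_cluster` and its parameters are hypothetical, the distance function is passed in as `dist` (IMHDT in the method described here), each later center is assumed to be the unclustered trajectory farthest from the previous center, and fewer than k leftover trajectories are assumed to be suppressed.

```python
import random


def greedy_cluster(trajs, k, max_radius, dist, rng=random):
    """Greedy clustering of one equivalence class; returns groups of indices."""
    if len(trajs) < k:
        return []                                   # whole class suppressed
    active = list(range(len(trajs)))                # indices not yet clustered
    groups = []
    pivot = rng.choice(active)                      # random starting trajectory
    while len(active) >= k:
        # center = unclustered trajectory farthest from the current pivot
        center = max(active, key=lambda i: dist(trajs[i], trajs[pivot]))
        others = sorted((i for i in active if i != center),
                        key=lambda i: dist(trajs[i], trajs[center]))
        nearest = others[:k - 1]
        radius = dist(trajs[nearest[-1]], trajs[center]) if nearest else 0.0
        if radius <= max_radius:
            groups.append([center] + nearest)       # accepted anonymity group
        # accepted or suppressed, these trajectories leave the active set
        removed = set([center] + nearest)
        active = [i for i in active if i not in removed]
        pivot = center                              # assumption: next pivot
    return groups


# Toy data: one sampling point per trajectory, distance = gap along x.
toy = [[(0, 0.0, 0.0)], [(0, 0.1, 0.0)], [(0, 10.0, 0.0)], [(0, 10.1, 0.0)]]
gap = lambda a, b: abs(a[0][1] - b[0][1])
groups = greedy_cluster(toy, 2, 1.0, gap)
```

On the toy data the two nearby pairs are grouped regardless of which trajectory is drawn first; lowering `max_radius` below the pair spacing suppresses both groups.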
Algorithm 3 is the specific algorithm for calculating the IMHDT distance in two tracks, and input is two track Tr1, Tr2.It is defeated IMHDT distance between this two tracks out.Each track sampled point Tr1_node is calculated firstT=tiTo orbit segmentBetween the shortest distance, then calculateTo orbit segment Between the shortest distance.It is finally minimized the IMHDT distance as the point, is averaged again after cumulative as between track IMHDT distance.
Algorithm 4 computes the shortest distance from a trajectory sampling point to a trajectory segment. Its inputs are a trajectory sampling point and a trajectory segment formed by two trajectory sampling points; its output is the shortest distance from the sampling point to the segment. The algorithm first checks whether an interpolation point exists on the trajectory segment such that the line joining the sampling point and the interpolation point is perpendicular to the segment. If such a point exists, the Euclidean distance from the sampling point to the interpolation point is returned; otherwise, the smaller of the distances from the sampling point to the two endpoints of the segment is returned.
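The perpendicular-foot test can be written compactly with a dot product. The following is a minimal sketch for planar coordinates; the function name `point_to_segment_distance` and the tuple representation of points are illustrative assumptions.

```python
import math

def point_to_segment_distance(p, a, b):
    """Shortest distance from sampling point p to the segment a-b.

    If the perpendicular from p meets the segment, that foot is the
    interpolation point and its distance to p is returned; otherwise
    the distance to the nearer endpoint is returned.
    """
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq > 0:                  # skip degenerate segments (a == b)
        # t is the position of the perpendicular foot, 0 at a and 1 at b.
        t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
        if 0.0 <= t <= 1.0:
            foot = (a[0] + t * dx, a[1] + t * dy)
            return math.dist(p, foot)
    return min(math.dist(p, a), math.dist(p, b))
```

For example, the point (1, 1) projects onto the segment from (0, 0) to (2, 0) at (1, 0), so the perpendicular branch applies; the point (3, 4) projects beyond the endpoint (2, 0), so the endpoint branch applies.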
S3: perform the anonymization operation on each anonymous group so that it satisfies interpolation trajectory (k, δ)-anonymity.
As shown in Figure 8, this step requires perturbing the trajectories in the anonymous group so that they meet the interpolation trajectory (k, δ)-anonymity requirement. Specifically, each trajectory sampling point that does not meet the requirement is moved so that its distance to the center trajectory is less than or equal to δ.
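The movement of a single non-conforming point can be sketched as below. This is an illustration under assumptions: `target` is a hypothetical parameter denoting the reference point on the center trajectory against which the distance is measured (in the interpolation-point model this would be the interpolation point), and the point is moved along the straight line toward it by the smallest amount that brings the distance down to δ.

```python
import math

def move_within_delta(p, target, delta):
    """Return p unchanged if it is within delta of target; otherwise
    return the point on the line from target to p at distance delta."""
    d = math.dist(p, target)
    if d <= delta:
        return p                        # already satisfies the requirement
    scale = delta / d                   # shrink the offset to length delta
    return (target[0] + (p[0] - target[0]) * scale,
            target[1] + (p[1] - target[1]) * scale)
```

In this sketch the distance moved equals the original distance minus δ, so a smaller starting distance to the reference point directly means a smaller perturbation.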
Using interpolation points instead of trajectory sampling points in the anonymization operation reduces the anonymization cost. The proof is as follows:
$\mathrm{Translation}(\mathrm{IMHDT}) = \mathrm{Eurp}(Trp\_\odot_i, Tr_i) - \delta$
$\mathrm{Translation}(\mathrm{Eurp}) = \mathrm{Eurp}(Trp_i, Tr_i) - \delta$
As proved above, the interpolation-point-based Hausdorff distance from an anonymous trajectory to the center trajectory is always less than or equal to the Euclidean distance between the same trajectories:
$\mathrm{IMHDT}(Trp_i, Tr_i) = \mathrm{Eurp}(Trp\_\odot_i, Tr_i) \le \mathrm{Eurp}(Trp_i, Tr_i)$
Because the trajectory uncertainty threshold δ is fixed, subtracting δ from both sides preserves the inequality, so $\mathrm{Translation}(\mathrm{IMHDT}) \le \mathrm{Translation}(\mathrm{Eurp})$. Replacing trajectory sampling points with interpolation points in the anonymization operation therefore reduces the anonymization cost of the trajectory points.
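As a purely illustrative instance with hypothetical numbers (not taken from the source), suppose δ = 1, the distance from the interpolation point to the center trajectory is 3, and the point-wise Euclidean distance between the sampling points is 5. The two movement costs then compare as follows:

```latex
\begin{aligned}
\mathrm{Translation}(\mathrm{IMHDT}) &= 3 - 1 = 2,\\
\mathrm{Translation}(\mathrm{Eurp})  &= 5 - 1 = 4,\\
\mathrm{Translation}(\mathrm{IMHDT}) &\le \mathrm{Translation}(\mathrm{Eurp}).
\end{aligned}
```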
The anonymous privacy protection method based on interpolation points proposed in the embodiments of the present invention has the following beneficial effects:
1. Multiple preprocessing fragment values are used to regularize the trajectories during preprocessing. The fragment value is chosen by jointly comparing, across the candidate fragment values, the number of regularized trajectories retained and the quality of the resulting anonymous trajectories. This reduces trajectory suppression during preprocessing and benefits the subsequent trajectory anonymization.
2. Trajectory uncertainty theory is introduced into the interpolation-point anonymity model. Trajectory data is distinctive in that any of its sampling points may become a quasi-identifier, and moving the sampling points directly increases the anonymization cost. Using the inherent uncertainty region of a trajectory as its anonymity region therefore helps reduce the anonymization cost.
3. The Hausdorff distance based on interpolation points is used to measure trajectory distance during clustering. It is proved theoretically that this distance from an anonymous trajectory to the center trajectory is always less than or equal to the Euclidean distance between the same trajectories. Clustering with this distance therefore yields clusters with a smaller generalization area than clustering with the Euclidean distance.
4. An anonymity model is proposed in which, during anonymization, a sampling point is replaced by an interpolation point on the trajectory segments adjacent to that sampling point. The model reduces trajectory perturbation during anonymization, which reduces data distortion and increases data utility while still protecting the privacy of the published data.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (5)

1. An anonymous privacy protection method based on interpolation points, characterized in that the method comprises the following steps:
S1: preprocess the original trajectory data set Ts to form several trajectory equivalence classes Ecs whose timestamps are consistent;
S2: cluster the trajectories in each trajectory equivalence class according to the IMHDT distance metric, so that several trajectory anonymous groups are formed within each equivalence class, wherein each anonymous group contains no fewer than k trajectories;
S3: perturb the trajectories in each anonymous group so that they finally satisfy interpolation trajectory (k, δ)-anonymity.
2. The anonymous privacy protection method based on interpolation points according to claim 1, characterized in that step S1 comprises the following steps:
S11: define the trajectory processing fragment value P_i;
S12: obtain the start and end timestamps {t_b, t_e} of the original trajectory Tr;
S13: obtain the timestamp t_i that is later than the start time t_b and equal to 0 modulo P_i, and the timestamp t_j that is earlier than the end time t_e and equal to 0 modulo P_i;
S14: truncate the original trajectory to the interval {t_i, t_j} and put it into the trajectory equivalence class D{i, j}.
3. The anonymous privacy protection method based on interpolation points according to claim 1, characterized in that step S2 comprises the following steps:
S21: put the unclustered trajectories of each trajectory equivalence class into the active set, and randomly select one trajectory from the active set;
S22: compute the IMHDT distance from the other trajectories in the active set to the selected trajectory, and take the trajectory with the largest IMHDT distance as the center trajectory;
S23: compute the IMHDT distance from the other trajectories in the active set to the center trajectory;
S24: take the k-1 trajectories nearest to the center trajectory in IMHDT distance, together with the center trajectory, to form an anonymous cluster, and add the anonymous cluster to the anonymity set;
S25: obtain the farthest of the k-1 nearest trajectories; if the IMHDT distance between that trajectory and the center trajectory is greater than the threshold max_radius, suppress the anonymous cluster;
wherein the IMHDT distance is the Hausdorff distance of interpolation points under a time constraint.
4. The anonymous privacy protection method based on interpolation points according to claim 3, characterized in that the IMHDT distance between two trajectories Tr1 and Tr2 is computed as follows:
S221: for each trajectory sampling point Tr1_node_{T=t_i}, compute the shortest distance from the point to one trajectory segment of Tr2;
S222: compute the shortest distance from the trajectory sampling point Tr1_node_{T=t_i} to a second trajectory segment of Tr2;
S223: take the smaller of the distances obtained in steps S221 and S222 as the IMHDT distance of the trajectory sampling point Tr1_node_{T=t_i};
S224: the average of the IMHDT distances of all sampling points of trajectory Tr1 is the IMHDT distance between trajectories Tr1 and Tr2.
5. The anonymous privacy protection method based on interpolation points according to claim 4, characterized in that the shortest distance from a trajectory sampling point to a trajectory segment is computed as follows:
determine whether an interpolation point exists on the trajectory segment such that the line joining the trajectory sampling point and the interpolation point is perpendicular to the trajectory segment;
if it exists, the Euclidean distance from the trajectory sampling point to the interpolation point is the shortest distance from the trajectory sampling point to the trajectory segment;
if it does not exist, the smaller of the distances from the trajectory sampling point to the two endpoints of the trajectory segment is the shortest distance from the trajectory sampling point to the trajectory segment.
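The preprocessing steps S11 to S14 of claim 2 can be illustrated with the following rough sketch. It rests on several assumptions that the claim does not fix: timestamps are integers, "later than" and "earlier than" are read inclusively, trajectories too short to contain a full window are dropped, and the function name `build_equivalence_classes` and the `(t, x, y)` point layout are hypothetical. Resampling of points inside the window is out of scope here.

```python
from collections import defaultdict

def build_equivalence_classes(trajectories, p):
    """Group trajectories by their trimmed time window (t_i, t_j).

    trajectories : list of trajectories, each a time-sorted list of (t, x, y)
    p            : fragment value; window bounds are multiples of p
    """
    classes = defaultdict(list)
    for tr in trajectories:
        t_b, t_e = tr[0][0], tr[-1][0]  # start and end timestamps
        t_i = -(-t_b // p) * p          # first multiple of p at or after t_b
        t_j = (t_e // p) * p            # last multiple of p at or before t_e
        if t_i >= t_j:                  # no usable window (assumed dropped)
            continue
        trimmed = [pt for pt in tr if t_i <= pt[0] <= t_j]
        classes[(t_i, t_j)].append(trimmed)
    return dict(classes)
```

With p = 10, a trajectory spanning timestamps 3 to 22 and another spanning 8 to 25 both fall into the class keyed by (10, 20), while a trajectory spanning 11 to 14 contains no full window and is left out.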
CN201910340914.5A 2019-04-25 2019-04-25 Anonymous privacy protection method based on interpolation points Active CN110162997B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910340914.5A CN110162997B (en) 2019-04-25 2019-04-25 Anonymous privacy protection method based on interpolation points

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910340914.5A CN110162997B (en) 2019-04-25 2019-04-25 Anonymous privacy protection method based on interpolation points

Publications (2)

Publication Number Publication Date
CN110162997A true CN110162997A (en) 2019-08-23
CN110162997B CN110162997B (en) 2021-01-01

Family

ID=67640021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910340914.5A Active CN110162997B (en) 2019-04-25 2019-04-25 Anonymous privacy protection method based on interpolation points

Country Status (1)

Country Link
CN (1) CN110162997B (en)

Also Published As

Publication number Publication date
CN110162997B (en) 2021-01-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant