CN103678905B - Method for evaluating quality based on track data - Google Patents

Publication number
CN103678905B
CN103678905B CN201310661072.6A CN201310661072A CN103678905B CN 103678905 B CN103678905 B CN 103678905B CN 201310661072 A CN201310661072 A CN 201310661072A CN 103678905 B CN103678905 B CN 103678905B
Authority
CN
China
Prior art keywords
grid
data
track
entropy
track data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310661072.6A
Other languages
Chinese (zh)
Other versions
CN103678905A (en)
Inventor
黄�俊
张帆
李晔
须成忠
王丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS
Priority to CN201310661072.6A
Publication of CN103678905A
Application granted
Publication of CN103678905B

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention discloses a quality evaluation method based on track data, comprising the steps of: S1, representing the track data as a set of four-dimensional tuples and obtaining a map; S2, grouping the track data by entity, sorting each group along the time dimension, and converting every track in the track data into an ordered set of line segments; S3, dividing the map into equal-sized grid cells and projecting the track data onto the grid; S4, computing the entropy of each grid cell, and computing the mean entropy and the weighted average entropy over all grid cells. The invention uses entropy to evaluate track data quality effectively, which makes it possible to compare the data quality of multiple track datasets and to analyze whether a dataset is suitable for data analysis. The evaluation is simple, effective, and low-cost.

Description

Method for evaluating quality based on track data
Technical field
The present invention relates to the field of information technology, and in particular to a quality evaluation method based on track data.
Background
Humans, animals, and all movable devices trace out tracks of their behavior over time, and advances in science and technology make it possible to record the movement tracks of people or movable objects with a variety of devices. For example, installing GPS equipment in a taxi records the movement track of the vehicle, and the communication between a mobile phone and the communication base stations (check-ins) reflects the movement track of its user.
The analysis of track data is of key importance in geographic information science, human dynamics, animal behavior science, business decision-making, and other fields. However, limited device capability, errors during transmission, or problems during data entry can all produce low-quality track data. A large amount of low-quality data may reflect a false picture, so that the expected analysis cannot be carried out. Assessing data quality has therefore become a key task in analyzing and using track data.
In the prior art, data are mostly inspected manually, category by category, to find the parts that violate convention or are obviously wrong. For example, tracks are drawn on a map by hand, and the obviously faulty tracks are then identified by eye. This takes a great deal of manpower to preprocess and visualize the data, and finding the defects depends on the experience and luck of the inspector.
Summary of the invention
The technical problem to be solved by the present invention is to provide an accurate quality evaluation method based on track data.
The technical solution of the invention is a quality evaluation method based on track data, comprising the steps of: S1, representing the track data as a set of four-dimensional tuples and obtaining a map; S2, grouping the track data by entity, sorting each group along the time dimension, and converting every track in the track data into an ordered set of line segments; S3, dividing the map into equal-sized grid cells and projecting the track data onto the grid; S4, computing the entropy of each grid cell, and computing the mean entropy and the weighted average entropy over all grid cells.
Preferably, step S1 specifically comprises: S11, representing the track data as a set of four-dimensional tuples <o, t, x, y>, each tuple being one positioning fix; S12, o is the identifier of the positioned entity, and entities and identifiers are in one-to-one correspondence; S13, t is the time at which the fix occurred; S14, x and y are the position of entity o at time t, and the x and y coordinate axes span a two-dimensional plane, i.e. the map.
Preferably, the two-dimensional plane is the plane whose coordinate axes are the Earth's longitude and latitude.
Preferably, the two-dimensional plane is a plane generated by applying a linear or non-linear invertible transformation to the longitude and latitude coordinate axes.
Preferably, step S2 specifically comprises: S21, grouping the track data by entity o, so that fixes with the same identifier fall into one group; S22, sorting the fixes within each group by time t; S23, recording two fixes adjacent in time within a group as one track segment whose two end points are the two fixes, thereby converting the track data into an ordered set of line segments.
Preferably, step S3 comprises dividing the map into equal-sized squares using families of straight lines parallel to the x and y coordinate axes and spaced a fixed distance apart, assigning a corresponding number to each grid cell, and projecting each positioning fix into a grid cell of the map according to its position.
Preferably, step S4 comprises: S41, for each numbered grid cell, finding all track segments whose start point falls in that cell, counting the distribution of the end points of those segments, and computing the entropy of the cell; S42, computing an entropy for each grid cell on the map and thereby describing the entropy distribution of the map; S43, taking the average and the weighted average of the entropy distribution as the mean entropy and the weighted average entropy.
Beneficial effects of the invention: entropy is used to evaluate track data quality effectively, which makes it possible to compare the data quality of multiple track datasets and to analyze whether a dataset is suitable for data analysis. The evaluation is simple, effective, and low-cost.
Brief description of the drawings
Fig. 1 is the flow chart of the method for evaluating quality of one embodiment of the invention.
Fig. 2 is the entropy result figure of the method for evaluating quality of one embodiment of the invention.
Detailed description
The invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
As shown in Fig. 1, an embodiment of the invention provides a quality evaluation method based on track data, comprising the steps of:
S1, representing the track data as a set of four-dimensional tuples and obtaining a map;
S2, grouping the track data by entity, sorting each group along the time dimension, and converting every track in the track data into an ordered set of line segments;
S3, dividing the map into equal-sized grid cells and projecting the track data onto the grid;
S4, computing the entropy of each grid cell, and computing the mean entropy and the weighted average entropy over all grid cells.
The embodiment uses entropy to evaluate track data quality effectively. Evaluating track data in this way makes it possible to compare the data quality of multiple track datasets and to analyze whether a dataset is suitable for data analysis. The evaluation is simple, effective, and low-cost.
Specifically, step S1 comprises:
S11, representing the track data as a set of four-dimensional tuples <o, t, x, y>, each tuple being one positioning fix; that is, the track data are regarded as a set of fixes;
S12, o is the identifier of the positioned entity, and entities and identifiers are in one-to-one correspondence: one identifier denotes exactly one entity, just as the license plate number of a taxi denotes exactly one taxi;
S13, t is the time at which the fix occurred;
S14, x and y are the position of entity o at time t, and the x and y coordinate axes span a two-dimensional plane, i.e. the map.
Further, the two-dimensional plane is either the plane whose coordinate axes are the Earth's longitude and latitude, or a plane generated by applying a linear or non-linear invertible transformation to the longitude and latitude axes.
Step S2 specifically comprises:
S21, grouping the track data by entity o, so that fixes with the same identifier fall into one group;
S22, sorting the fixes within each group by time t, with earlier fixes placed first;
S23, recording two fixes adjacent in time within a group as one track segment whose two end points are the two fixes, the earlier fix being the start point and the other fix the end point, thereby converting the track data into an ordered set of line segments.
Step S3 comprises dividing the map into equal-sized squares using families of straight lines parallel to the x and y coordinate axes and spaced a fixed distance apart, and assigning a corresponding number to each grid cell, such that a given number identifies exactly one grid cell on the map. Each positioning fix is then projected into a grid cell of the map according to its position and is thereby mapped to the corresponding number.
Computing the entropy of the grid cells, and the mean entropy and weighted average entropy over all grid cells, comprises:
S41, for each numbered grid cell on the map, finding all track segments whose start point falls in that cell, counting the distribution of the end points of those segments, and computing the entropy of the cell;
S42, computing an entropy for each grid cell on the map and thereby describing the entropy distribution of the map;
S43, taking the average and the weighted average of the entropy distribution as the mean entropy and the weighted average entropy.
The embodiment further provides detailed implementation steps, as follows.
The track data are converted into the following form:
P = <o, t, x, y>
which states that object o is at position (x, y) at time t. Here o uniquely identifies an entity that exists in the real world, and (x, y) are two coordinates on a two-dimensional plane.
The coordinate axes (x, y) of this plane can be: 1. the Earth's longitude and latitude; or 2. the axes of a plane generated by applying any linear or non-linear invertible transformation to longitude and latitude. Each track record thus becomes one positioning fix, and the whole track dataset is a set of positioning fixes. The origin of the coordinate axes can be any point in the plane.
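The tuple form above can be sketched as a small record type. This is only an illustration: the name `Fix`, the helper `parse_fix`, and the comma-separated input format are assumptions made for the example and do not come from the source.

```python
from typing import NamedTuple


class Fix(NamedTuple):
    """One positioning fix P = <o, t, x, y> (hypothetical record type)."""
    o: str    # entity identifier, e.g. a license plate (one identifier per entity)
    t: float  # time of the fix, here seconds since an arbitrary epoch
    x: float  # first map coordinate, e.g. longitude
    y: float  # second map coordinate, e.g. latitude


def parse_fix(record: str) -> Fix:
    """Parse an assumed 'o,t,x,y' text record into a Fix."""
    o, t, x, y = (field.strip() for field in record.split(","))
    return Fix(o, float(t), float(x), float(y))


# A track dataset is simply a collection of fixes.
fixes = [parse_fix(r) for r in ["taxi-A, 0, 114.05, 22.54",
                                "taxi-A, 30, 114.06, 22.54"]]
```

Keeping the record immutable makes it safe to share the same fix between the two track segments that it ends and starts.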
All positioning fixes are grouped by o. Assuming there are k distinct entities o in total, the fixes within each group are sorted by time t, giving the ordered data groups
$\{P_j \mid j = 1, \dots, k_i\}, \quad i = 1, \dots, k,$
where $k_i$ is the number of fixes in the i-th group.
Within each group, two fixes adjacent in time are recorded as one track segment whose two end points are the two fixes:
$\{(P_j, P_{j+1}) \mid j = 1, \dots, k_i - 1\}, \quad i = 1, \dots, k,$
where $(P_j, P_{j+1})$ is a track segment, $P_j$ is its start point, and $P_{j+1}$ is its end point.
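The grouping, sorting, and pairing described above can be sketched as follows. The names `Fix` and `build_segments` are hypothetical, and the sample data are invented for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Fix(NamedTuple):
    o: str    # entity identifier
    t: float  # time of the fix
    x: float
    y: float


def build_segments(fixes):
    """Group fixes by entity, sort each group by time, pair adjacent fixes."""
    groups = defaultdict(list)
    for fix in fixes:
        groups[fix.o].append(fix)

    segments = []
    for group in groups.values():
        ordered = sorted(group, key=lambda f: f.t)
        # (P_j, P_{j+1}) for j = 1 .. k_i - 1; a group with one fix yields nothing.
        segments.extend(zip(ordered, ordered[1:]))
    return segments


fixes = [Fix("a", 2, 2.0, 0.0), Fix("b", 0, 5.0, 5.0),
         Fix("a", 0, 0.0, 0.0), Fix("a", 1, 1.0, 0.0)]
segments = build_segments(fixes)
```

Here entity "a" contributes two segments (t = 0 to 1 and t = 1 to 2), while entity "b" has a single fix and contributes none.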
The two-dimensional plane is a map. Given a distance d, a family of parallel straight lines is drawn on the map parallel to each of the two coordinate axes (x, y), with adjacent parallel lines a distance d apart. The whole map is thereby divided into equal-sized squares. An invertible mapping is established that maps each grid cell to a natural number in N. The length d is chosen with reference to the positioning time interval and the average moving speed in the data. Each point P on the map is projected onto the grid cell that contains it and hence onto the number of that cell, i.e.:
$F: \mathbb{R}^2 \to \mathbb{N}, \quad F(P) = r$
The end points of every track segment in the segment set are projected onto the grid of cell size d, so that each track segment is converted into a pair of grid cell numbers:
$(P_j, P_{j+1}) \to (F(P_j), F(P_{j+1})) = (r_{j,s}, r_{j+1,t})$
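One possible choice of the mapping F is row-major numbering over a bounded map. The sketch below assumes a map whose lower-left corner is (x0, y0) and which is n_cols cells wide; these parameters and the name `make_cell_index` are illustrative assumptions, since the source only requires that the cell numbering be invertible.

```python
import math


def make_cell_index(x0: float, y0: float, d: float, n_cols: int):
    """Return F: (x, y) -> cell number for square cells of side d."""
    def cell_index(x: float, y: float) -> int:
        col = math.floor((x - x0) / d)
        row = math.floor((y - y0) / d)
        if not (0 <= col < n_cols) or row < 0:
            raise ValueError("point lies outside the assumed map bounds")
        # Row-major numbering; (row, col) is recovered with divmod(r, n_cols).
        return row * n_cols + col
    return cell_index


F = make_cell_index(x0=0.0, y0=0.0, d=1.0, n_cols=10)
segment = ((0.5, 0.5), (1.2, 3.7))           # (start point, end point)
cell_pair = (F(*segment[0]), F(*segment[1]))  # (0, 31)
```

Because `divmod(r, n_cols)` returns the row and column of a cell number, the numbering is invertible on the bounded map, as the text requires.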
After the set of track segments has been converted into grid-number tuples by the above method, for each grid cell r, find all segments that start in that cell:
$\{(F(P_j), F(P_{j+1})) \mid F(P_j) = r\}$
The entropy of a grid cell r is defined as:
where:
$c_{rp} = |\{(F(P_j), F(P_{j+1})) \mid F(P_j) = r \wedge F(P_{j+1}) = p\}|$
and
In this way, the entropy distribution over all grid cells of the map is obtained.
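The defining formulas for the entropy are not legible in the text above, so the following is only a sketch under an assumption: that the entropy of cell r is the Shannon entropy of the distribution of end cells, $E_r = -\sum_p (c_{rp}/c_r) \log_2 (c_{rp}/c_r)$ with $c_r = \sum_p c_{rp}$. The base-2 logarithm and the name `cell_entropies` are illustrative choices, not taken from the source.

```python
import math
from collections import Counter, defaultdict


def cell_entropies(cell_pairs):
    """Map each start cell r to the entropy of its end-cell distribution.

    cell_pairs is an iterable of (start_cell, end_cell) numbers.
    Assumes Shannon entropy in bits; the source formula is not shown.
    """
    end_counts = defaultdict(Counter)      # end_counts[r][p] == c_rp
    for r, p in cell_pairs:
        end_counts[r][p] += 1

    entropies = {}
    for r, counts in end_counts.items():
        c_r = sum(counts.values())         # segments starting in cell r
        entropies[r] = -sum((c / c_r) * math.log2(c / c_r)
                            for c in counts.values())
    return entropies


pairs = [(0, 0), (0, 1), (0, 1), (0, 2), (5, 6), (5, 6)]
entropy = cell_entropies(pairs)
```

In this invented example, segments leaving cell 0 end in three different cells with proportions 1/4, 1/2, 1/4, giving 1.5 bits, while every segment leaving cell 5 ends in cell 6, giving an entropy of zero.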
The mean entropy of a dataset is obtained by computing the entropy of each grid cell with the above method and then averaging:
The weighted average entropy is:
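The two aggregate formulas are likewise not legible above. A minimal sketch, assuming that the average runs over the cells that have at least one outgoing segment and that each cell is weighted by the number of segments starting in it, could look like this. Both assumptions, and the function names, are hypothetical.

```python
def mean_entropy(entropies):
    """Unweighted average of the per-cell entropies."""
    return sum(entropies.values()) / len(entropies)


def weighted_mean_entropy(entropies, weights):
    """Average of per-cell entropies weighted by weights[r].

    The choice of weight (here: number of segments starting in cell r)
    is an assumption; the source does not show the weighting it uses.
    """
    total = sum(weights[r] for r in entropies)
    return sum(weights[r] * e for r, e in entropies.items()) / total


entropies = {0: 1.5, 5: 0.0}     # per-cell entropies in bits
start_counts = {0: 4, 5: 2}      # segments starting in each cell
avg = mean_entropy(entropies)                          # 0.75
wavg = weighted_mean_entropy(entropies, start_counts)  # 1.0
```

Weighting by segment count lets busy cells dominate the score, whereas the plain mean treats a cell with two segments the same as a cell with thousands.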
To assess the quality of track datasets, their weighted average entropies or mean entropies are compared; the dataset with the lower value has the better data quality.
Fig. 2 shows the entropy computed for the full-day taxi GPS data of 28 July 2013. Two samples were extracted from the source data, with upload frequencies of 2 fixes per minute and 1 fix per minute respectively, and their entropy distributions are shown. The result is that the higher the upload frequency, the smaller the entropy.
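The source does not say how the two samples were extracted. One simple way to derive a lower-frequency sample from a higher-frequency one is to keep a fix only when enough time has passed since the last kept fix; the sketch below illustrates that idea on timestamps alone, with the hypothetical helper `thin_by_interval`.

```python
def thin_by_interval(times, min_gap):
    """Keep timestamps that are at least min_gap after the last kept one."""
    kept = []
    for t in sorted(times):
        if not kept or t - kept[-1] >= min_gap:
            kept.append(t)
    return kept


# One entity reporting every 30 s (2 fixes per minute) for five minutes.
times = list(range(0, 300, 30))
once_per_minute = thin_by_interval(times, min_gap=60)  # [0, 60, 120, 180, 240]
```

Running the entropy evaluation on both the original and the thinned data would then allow the kind of comparison reported for Fig. 2.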
The specific embodiments described above do not limit the scope of protection of the invention. Any other corresponding changes and modifications made in accordance with the technical concept of the invention shall fall within the scope of protection of the claims.

Claims (6)

1. A quality evaluation method based on track data, characterized by comprising the steps of:
S1, representing the track data as a set of four-dimensional tuples and obtaining a map, wherein step S1 specifically comprises:
S11, representing the track data as a set of four-dimensional tuples <o, t, x, y>, each tuple being one positioning fix;
S12, o is the identifier of the positioned entity, and entities and identifiers are in one-to-one correspondence;
S13, t is the time at which the fix occurred;
S14, x and y are the position of entity o at time t, and the x and y coordinate axes span a two-dimensional plane, i.e. the map;
S2, grouping the track data by entity, sorting each group along the time dimension, and converting every track in the track data into an ordered set of line segments;
S3, dividing the map into equal-sized grid cells and projecting the track data onto the grid;
S4, computing the entropy of each grid cell, and computing the mean entropy and the weighted average entropy over all grid cells.
2. The quality evaluation method according to claim 1, characterized in that the two-dimensional plane is the plane whose coordinate axes are the Earth's longitude and latitude.
3. The quality evaluation method according to claim 1, characterized in that the two-dimensional plane is a plane generated by applying a linear or non-linear invertible transformation to the longitude and latitude coordinate axes.
4. The quality evaluation method according to claim 1, characterized in that step S2 specifically comprises:
S21, grouping the track data by entity o, so that fixes with the same identifier fall into one group;
S22, sorting the fixes within each group by time t;
S23, recording two fixes adjacent in time within a group as one track segment whose two end points are the two fixes, thereby converting the track data into an ordered set of line segments.
5. The quality evaluation method according to claim 1, characterized in that step S3 comprises dividing the map into equal-sized squares using families of straight lines parallel to the x and y coordinate axes and spaced a fixed distance apart, assigning a corresponding number to each grid cell, and projecting each positioning fix into a grid cell of the map according to its position.
6. The quality evaluation method according to claim 1, characterized in that step S4 comprises:
S41, for each numbered grid cell, finding all track segments whose start point falls in that cell, counting the distribution of the end points of those segments, and computing the entropy of the cell;
S42, computing an entropy for each grid cell on the map and thereby describing the entropy distribution of the map;
S43, taking the average and the weighted average of the entropy distribution as the mean entropy and the weighted average entropy.
CN201310661072.6A 2013-12-09 2013-12-09 Method for evaluating quality based on track data Active CN103678905B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310661072.6A CN103678905B (en) 2013-12-09 2013-12-09 Method for evaluating quality based on track data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310661072.6A CN103678905B (en) 2013-12-09 2013-12-09 Method for evaluating quality based on track data

Publications (2)

Publication Number Publication Date
CN103678905A CN103678905A (en) 2014-03-26
CN103678905B true CN103678905B (en) 2017-06-13

Family

ID=50316435

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310661072.6A Active CN103678905B (en) 2013-12-09 2013-12-09 Method for evaluating quality based on track data

Country Status (1)

Country Link
CN (1) CN103678905B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108279428B (en) * 2017-01-05 2020-10-16 武汉四维图新科技有限公司 Map data evaluating device and system, data acquisition system, acquisition vehicle and acquisition base station
CN111369787A (en) * 2018-12-26 2020-07-03 杭州海康威视***技术有限公司 Vehicle track prediction method and device and electronic equipment
CN110866000B (en) * 2019-11-20 2022-04-08 珠海格力电器股份有限公司 Data quality evaluation method and device, electronic equipment and storage medium
CN111811542B (en) * 2020-08-07 2021-08-06 中国矿业大学(北京) Path-finding performance distribution calculation method and system based on track data
CN113642845B (en) * 2021-07-13 2023-09-26 同济大学 Quality evaluation method for road traffic perception track data

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101464158A (en) * 2009-01-15 2009-06-24 上海交通大学 Automatic generation method for road network grid digital map based on GPS positioning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
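Automatic generation method for vector road network maps based on GPS tracks; 孔庆杰 et al.; 《中国科学技术大学学报》 (Journal of University of Science and Technology of China); 20120815; pp. 7256-7260 *
Autonomous navigation performance evaluation method based on track analysis; 王勇鑫 et al.; 《计算机工程》 (Computer Engineering); 20110320; Vol. 37, No. 6; pp. 141-144 *
Retrieval, mining and application of large-scale track data; 袁晶; 《中国博士学位论文全文数据库》 (China Doctoral Dissertations Full-text Database); 20130131; full text *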

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant