CN107644069A - Thinning method for high-density monitoring data

Info

Publication number: CN107644069A
Application number: CN201710815056.6A
Authority: CN
Inventors: 陈有为; 万景琨; 曲洋; 高山岳
Original assignee: Qianxun Position Network Co Ltd
Current assignee: Qianxun Position Network Co Ltd
Priority date: 2017-09-11
Filing date: 2017-09-11
Publication date: 2018-01-30
Anticipated expiration: 2037-09-11
Also published as: CN107644069B

Abstract

The present invention discloses a thinning method for high-density monitoring data. The method uses the Douglas-Peucker algorithm, augmented with pre-pruning conditions, to obtain the required set of data points from high-density monitoring data. The method describes the trend of data fluctuations well with as few points as possible. Using the pre-pruned Douglas-Peucker algorithm also reduces the computational complexity of thinning and efficiently extracts the data points of interest.

Description

Thinning method for high-density monitoring data

Technical field

The present invention relates to data thinning methods, in particular to a thinning method for high-density monitoring data, and further to a thinning method for high-density monitoring data of satellite navigation ground-based augmentation differential base stations.

Background

With the rapid development of satellite navigation, Internet technology, and the sharing economy, demand for high-precision positioning keeps growing. To meet this demand, ground-based augmentation technology can be used to apply differential corrections to satellite data. A stable base station service is an important link in guaranteeing final service quality. During base station operation and maintenance, it is necessary to observe the utilization of the base station's central processing unit (CPU), memory, and other equipment. The sampling interval of such data is typically one second or even one millisecond. This high frequency makes the data volume very large, so the data is characteristically high-density. At the same time, the very high sampling frequency of the monitoring process produces a large amount of temporally redundant data. If all data points covering a longer period are extracted and visualized on a browser page, the page struggles to load that much data and cannot meet the demand for low-latency queries. For example, observing CPU utilization over one day at a sampling interval of one second yields 86,400 data points (24 × 60 × 60). Drawing all of these points in a browser page causes the page to load slowly or to fail to load at all. In practice, however, the data fluctuates very little during many periods, and such periods can be represented with only a small number of data points. To meet this need, the data must be thinned so that the trend of data fluctuations is described well with as few points as possible.

Summary of the invention

The problem solved by the present invention is that high-density monitoring data needs to be thinned, so that the trend of data fluctuations is described well with as few points as possible.

A further problem solved by the present invention is how to reduce the computational complexity of the thinning method.

To solve the above problems, the present invention provides a thinning method for high-density monitoring data. The method uses the Douglas-Peucker algorithm with added pre-pruning conditions to obtain the required set of data points from high-density monitoring data.

Compared with the prior art, the present invention has at least the following advantages:

1. The present invention obtains the required set of data points from the monitoring data by using the Douglas-Peucker algorithm with added pre-pruning. The data is thereby thinned, and the trend of data fluctuations is described well with as few points as possible. Using the pre-pruned Douglas-Peucker algorithm also reduces the computational complexity of thinning and efficiently extracts the data points of interest.

2. The method allows the required maximum number of data points to be set in advance, and the set of data points produced by the method does not exceed that maximum, which avoids a cumbersome post-processing step.

3. High-density base station data does not fluctuate violently for long periods. For data segments with small fluctuations, fewer data points can therefore be taken, or even only the two end points, so that more data points are allocated to the data segments with larger fluctuations.

4. A segmented processing strategy is used: according to the difference standard deviation, more data points are allocated to the data segments that fluctuate more strongly and more frequently, and the difference standard deviation is an effective estimate of how violently each data segment fluctuates. The segmentation strategy also makes parallel or distributed processing possible.
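By way of illustration only, the segment-wise budgeting described in advantages 3 and 4 might be sketched as follows. This text does not reproduce the allocation formula, so the proportional rule used here is an assumption, as are the population standard deviation, the two end points reserved for every segment, and all function and parameter names (`split_equal`, `diff_std`, `allocate_points`, `num_segments`, `n_total`, `eps0`).

```python
import math
from statistics import pstdev


def split_equal(values, num_segments):
    """Split values into at most num_segments contiguous chunks of near-equal size."""
    size = math.ceil(len(values) / num_segments)
    return [values[i:i + size] for i in range(0, len(values), size)]


def diff_std(segment):
    """Standard deviation of the first differences P[k+1] - P[k] of one segment."""
    diffs = [b - a for a, b in zip(segment, segment[1:])]
    # Assumption: population standard deviation; fewer than two differences give 0.
    return pstdev(diffs) if len(diffs) > 1 else 0.0


def allocate_points(values, num_segments, n_total, eps0):
    """Return (segments, sigmas, budgets) for a series of sampled values.

    Assumption: every segment keeps its two end points, and the remaining
    budget is shared in proportion to the thresholded difference standard
    deviation. The totals stay within n_total when n_total >= 2 * len(segments).
    """
    segments = split_equal(values, num_segments)
    sigmas = [diff_std(s) for s in segments]
    # Segments whose fluctuation is below the complexity threshold count as flat.
    sigmas = [0.0 if s < eps0 else s for s in sigmas]
    budgets = [2] * len(segments)
    spare = n_total - 2 * len(segments)
    total = sum(sigmas)
    if spare > 0 and total > 0:
        for i, s in enumerate(sigmas):
            budgets[i] += int(spare * s / total)  # truncation never overshoots
    return segments, sigmas, budgets
```

Because each segment's difference standard deviation depends only on that segment's own data, the `diff_std` calls could be run in parallel or on separate machines, which is the property advantage 4 points to.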

Detailed description

To describe the technical content, structural features, objects achieved, and effects of the present invention in detail, a detailed description is given below with reference to embodiments.

The purpose of the present invention is to extract the data points of interest from the high-density monitoring data of a base station, such that the line connecting these data points reflects the fluctuation of the data well. The method divides the data into segments of equal data volume and calculates the difference standard deviation of each data segment to reflect how violently it changes. According to the size of the standard deviation and the preset maximum number of data points, the maximum number of data points available to each data segment is allocated. Within each data segment, an improved Douglas-Peucker algorithm is used to select points. The detailed process of the method is as follows:

1. Set the maximum number of data points to keep, n, and the data segment complexity threshold ε₀. Initialize the final point set S to the empty set.

2. Calculate the number of data segments, [formula missing], and segment the data. The number of data points in each segment is [formula missing], where N denotes the total number of data points.

3. Calculate the difference result of data segment i, where each difference is the (k+1)-th data point P_{k+1} minus the k-th data point P_k.

4. Calculate the standard deviation σ_i of the difference result of data segment i. If σ_i is less than the data segment complexity threshold ε₀, set the difference standard deviation σ_i of that data segment to 0. σ_i is calculated as follows: [formula missing]

5. Calculate the number of data points available to each segment. The maximum number of data points allocated to segment i is denoted n_i: [formula missing]

6. Apply the improved Douglas-Peucker algorithm within each data segment. The detailed process is as follows:

● Initialize the maximum distance threshold ε₁ of the Douglas-Peucker algorithm, set the current recursion depth d to 1, and set the minimum number of data points contained in a segment, n_min.

● Calculate the Euclidean distance from each data point in the segment to the line connecting the segment's two end data points P_S and P_E, and add P_S and P_E to the set S. Note that the abscissa of a data point is the time and the ordinate is the value of the data point. Let P_max be the data point with the maximum distance, and let d_max be its distance to the line. If d_max < ε₁, the recursion terminates. Otherwise, calculate the maximum number of data points that can have been taken at the current depth, n_max = 2^(d-1) + 1. If n_max ≥ n_i, the recursion terminates. Otherwise, let n_i′ be the number of data points in the current segment. If n_i′ ≤ n_min, the recursion terminates. Otherwise, add P_max to S, repeat the above recursive procedure on the new segments (P_S, P_max) and (P_max, P_E), and increase the depth by 1.
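The recursion of step 6 might be sketched as follows, purely as an illustration. The three stopping tests are applied in the order the step lists them, with the depth test taken as n_max ≥ n_i (stop once the assumed point count reaches the segment's budget). The function names `perpendicular_distance` and `thin_segment`, the index-based bookkeeping, and the representation of data points as `(time, value)` tuples are assumptions made for this sketch.

```python
import math


def perpendicular_distance(p, a, b):
    """Euclidean distance from point p to the straight line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:  # degenerate line: fall back to point-to-point distance
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / length


def thin_segment(points, n_i, eps1, n_min):
    """Select points from one data segment.

    points: list of (time, value) tuples sorted by time.
    n_i:    maximum number of data points allocated to this segment.
    eps1:   maximum distance threshold of the Douglas-Peucker step.
    n_min:  a sub-segment with at most this many points is not split further.
    """
    if len(points) < 3:
        return list(points)
    kept = {0, len(points) - 1}  # indices of P_S and P_E

    def recurse(lo, hi, depth):
        if hi - lo < 2:
            return  # no interior point left to test
        # Interior point farthest from the line P_S-P_E.
        idx, d_max = max(
            ((k, perpendicular_distance(points[k], points[lo], points[hi]))
             for k in range(lo + 1, hi)),
            key=lambda item: item[1],
        )
        if d_max < eps1:
            return  # the segment is already well described by its end points
        if 2 ** (depth - 1) + 1 >= n_i:
            return  # pre-pruning: assumed point count at this depth reaches n_i
        if hi - lo + 1 <= n_min:
            return  # pre-pruning: too few points in the current sub-segment
        kept.add(idx)
        recurse(lo, idx, depth + 1)
        recurse(idx, hi, depth + 1)

    recurse(0, len(points) - 1, 1)
    return [points[k] for k in sorted(kept)]
```

For example, on the five points `(0, 0.0), (1, 0.0), (2, 5.0), (3, 0.0), (4, 0.0)` with `n_i=5`, `eps1=1.0`, and `n_min=2`, the sketch keeps the two end points and the peak at time 2, while `n_i=2` keeps only the two end points.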

Compared with the prior art, the embodiments of the present invention have at least the following differences and effects:

First, the method allows the required maximum number of data points to be set in advance, and the set of data points produced by the method does not exceed that maximum, which avoids a cumbersome post-processing step. Second, high-density base station data does not fluctuate violently for long periods, so for segments with small fluctuations fewer data points can be taken, or even only the two end points, and more data points can be allocated to the segments with larger fluctuations. Third, with the segmented processing strategy, the difference standard deviation gives an effective estimate of how violently each data segment fluctuates. Finally, compared with the traditional Douglas-Peucker algorithm, pre-pruning is added: 1) Given the application scenario of a base station, adjacent data points do not show sudden large fluctuations, so recursion can stop when a segment contains few data points. 2) Given the preset maximum number of data points, and assuming that every segment recurses all the way down, the number of data points taken on reaching the current level is 2^(d-1) + 1, where d is the current recursion depth. If this assumed number is greater than or equal to the maximum number of data points that may be taken, recursion stops. In this way, the data points of interest, namely the extreme points with large changes, are extracted efficiently at relatively low computational cost.
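The count 2^(d-1) + 1 in the second pruning condition can be checked with a short recurrence. This is a sketch under the stated worst-case assumption that every sub-segment is split at every level. The symbol c_d, the number of points taken on reaching depth d, is introduced here for illustration and does not appear in the original text.

```latex
% Depth 1 holds only the two end points P_S and P_E.
c_1 = 2
% At depth d there are c_d - 1 sub-segments, and each contributes at most one new point P_max.
c_{d+1} = c_d + (c_d - 1) = 2c_d - 1
% Solving the recurrence gives the count used in the pruning test.
c_d = 2^{d-1} + 1
```

Substituting back confirms the closed form: 2(2^(d-1) + 1) - 1 = 2^d + 1, which is c_{d+1}.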

Claims

1. A thinning method for high-density monitoring data of a satellite navigation ground-based augmentation differential base station, characterized in that: the method uses the Douglas-Peucker algorithm with added pre-pruning conditions to obtain the required set of data points from the high-density monitoring data.

2. The thinning method for high-density monitoring data of a satellite navigation ground-based augmentation differential base station as claimed in claim 1, characterized in that: the method further comprises the following steps:

(1) Parameter initialization, including the maximum number of data points to keep n, the data segment complexity threshold ε₀, and the final point set S, which is initialized to the empty set.

(2) Calculate the number of data segments, and segment the data.

(3) Calculate the difference result of each data segment and the standard deviation of that difference result, and calculate the number of data points available to each data segment, where the maximum number of data points allocated to segment i is denoted n_i.

(4) Use the Douglas-Peucker algorithm with added pre-pruning conditions to select points from the data of each data segment, so as to form the required set of data points.

3. The thinning method for high-density monitoring data of a satellite navigation ground-based augmentation differential base station as claimed in claim 2, characterized in that: the data volume of each data segment is equal.

4. The thinning method for high-density monitoring data of a satellite navigation ground-based augmentation differential base station as claimed in claim 2, characterized in that: step (4) comprises the following steps:

Initialize the maximum distance threshold ε₁ of the Douglas-Peucker algorithm, set the current recursion depth d to 1, and set the minimum number of data points contained in a data segment, n_min.

Calculate the Euclidean distance from each data point in the data segment to the line connecting the data segment's two end data points P_S and P_E, and add P_S and P_E to the set S. Let P_max be the data point with the maximum distance, and let d_max be its distance to the line. If d_max < ε₁, the recursion terminates. Otherwise, calculate the maximum number of data points that can have been taken at the current depth, n_max. If n_max ≥ n_i, the recursion terminates. Otherwise, let n_i′ be the number of data points in the current data segment. If n_i′ ≤ n_min, the recursion terminates. Otherwise, add P_max to S, repeat the above recursive procedure on the new data segments (P_S, P_max) and (P_max, P_E), and increase the depth by 1.

5. A thinning method for high-density monitoring data, characterized in that: the method uses the Douglas-Peucker algorithm with added pre-pruning conditions to obtain the required set of data points from the high-density monitoring data.

6. The thinning method for high-density monitoring data as claimed in claim 5, characterized in that: the method further comprises the following steps:

(2) Calculate the number of data segments, and segment the data.

(3) Calculate the difference result of each data segment and the standard deviation of that difference result, and calculate the number of data points available to each data segment, where the maximum number of data points allocated to segment i is denoted n_i.

(4) Use the Douglas-Peucker algorithm with added pre-pruning conditions to select points from the data of each data segment, so as to form the required set of data points.

7. The thinning method for high-density monitoring data of a satellite navigation ground-based augmentation differential base station as claimed in claim 6, characterized in that: the data volume of each data segment is equal.

8. The thinning method for high-density monitoring data of a satellite navigation ground-based augmentation differential base station as claimed in claim 6, characterized in that: step (4) comprises the following steps:

Initialize the maximum distance threshold ε₁ of the Douglas-Peucker algorithm, set the current recursion depth d to 1, and set the minimum number of data points contained in a segment, n_min.