CN106709250A - Data flow abnormality detection method based on parallel Kalman algorithm - Google Patents
- Publication number
- CN106709250A (application CN201611197599.8A)
- Authority
- CN
- China
- Prior art keywords
- value
- measured value
- factor
- influence
- time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16Z—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS, NOT OTHERWISE PROVIDED FOR
- G16Z99/00—Subject matter not provided for in other main groups of this subclass
Landscapes
- Testing Or Calibration Of Command Recording Devices (AREA)
Abstract
The invention discloses a data stream anomaly detection method based on a parallel Kalman algorithm. The method comprises the following steps: (1) measurement data from a sensor is acquired over a period of time; (2) the data is compared with the measured values from the preceding period; if a change has occurred, an estimate is computed from the measured value with the Kalman algorithm, and the absolute value of the difference between the estimate and the measured value is compared with a specified threshold; if the absolute difference is not smaller than the threshold, the measured value is judged to be an anomalous value and the next step is carried out; (3) the cause of the anomalous value is determined by considering a time influence factor, a space influence factor, and other factors that affect anomaly detection, such as the flood season, the weather, and human factors; the cause is recorded and the information is stored in a database. The method takes the time influence factor, the space influence factor, and the other provenance-information influence factors into account, which improves detection precision, and it decomposes the algorithm into tasks that are processed in parallel, which improves efficiency.
Description
Technical field
The invention belongs to the technical field of big data management, and in particular relates to a data stream anomaly detection method based on a parallel Kalman algorithm.
Background
Anomaly detection methods for data streams generally require extensive computation, and the data must also be corrected and fused. This is a complex process, so ensuring the accuracy and efficiency of detection is essential.
Although Kalman algorithms also exist in the prior art, where time, space, and other provenance information are considered at all, each is computed and detected only individually. As a result, the precision of anomaly detection over the data stream as a whole is low, and errors can arise during detection.
Summary of the invention
Object of the invention: the invention provides a data stream anomaly detection method based on a parallel Kalman algorithm that addresses the low accuracy and low efficiency of data stream anomaly detection.
The invention discloses a data stream anomaly detection method based on a parallel Kalman algorithm, comprising the following steps:
Step 1: Acquire a segment of measurement data from a sensor, mainly real-time water level measurements.
Step 2: Compare the sensor's current measured value with the measured values from the preceding period to judge whether the measured value is stable and unchanged. If the measured value has not changed over the period, extract a summary of the data and store the data in the database. If the measured value differs from the data of the preceding period, proceed to Step 3.
Step 3: Compute an estimate from the measured value of Step 2 using the Kalman algorithm, and compare the absolute value of the difference between the estimate and the measured value with a given threshold. If it is less than the threshold, judge the value to be normal, extract a summary, and store it in the database; otherwise judge it to be an anomalous value and proceed to Step 4.
Step 4: Compute the time influence factor from the sensor's measured values over the preceding period. If the time influence factor is less than its threshold, judge the value to be erroneous, with anomaly type "single-value anomaly"; record it, replace the anomalous value with a corrected value, and store the result in the database. Otherwise proceed to Step 5.
Step 5: Compute the space influence factor from the measured values of the sensors associated with this sensor's position. If the space influence factor is less than its threshold, judge the value to be erroneous, with anomaly type "continuous anomaly of a single sensor"; record the anomaly, correct it, store it in the database, and handle it according to the sensor's state. Otherwise proceed to Step 6.
Step 6: Determine the cause of the anomalous value from the other factors that affect anomaly detection, including the flood season, the weather, and human factors. The anomaly type is "continuous anomaly of multiple sensors". Finally, record the cause of the anomalous value and store the information in the database.
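The decision cascade of Steps 3 to 6 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `classify`, the `Verdict` type, and the parameter names are hypothetical, and the influence factors are assumed to have been computed elsewhere.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    label: str        # "normal" or an anomaly type
    next_action: str  # what the method does with the value


def classify(measured, estimate, residual_threshold,
             time_factor, time_threshold,
             space_factor, space_threshold):
    """Walk Steps 3 to 6 for one measured value that has already changed (Step 2)."""
    # Step 3: compare |estimate - measured| with the residual threshold.
    if abs(estimate - measured) < residual_threshold:
        return Verdict("normal", "summarize and store")
    # Step 4: a small time influence factor means an isolated bad reading.
    if time_factor < time_threshold:
        return Verdict("single-value anomaly", "replace with corrected value and store")
    # Step 5: a small space influence factor means only this sensor is affected.
    if space_factor < space_threshold:
        return Verdict("continuous anomaly of a single sensor", "record, correct, store")
    # Step 6: several sensors agree, so an external cause is looked for.
    return Verdict("continuous anomaly of multiple sensors",
                   "determine cause from flood season, weather, human factors")
```

Each test only runs when the previous one failed to explain the value, so the more expensive spatial and provenance checks are reached only for readings that remain unexplained.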
Based on the characteristics of stream data, the invention provides a parallel Kalman method based on multidimensional influence factors. The algorithm uses three dimensions of information (time, space, and other provenance information) as influence factors, which improves the accuracy of the anomaly detection result; it then decomposes the work into tasks that are processed in parallel, which improves the efficiency of the algorithm; finally, the algorithm is tested experimentally, which demonstrates its feasibility.
Further, in Step 3, the specific operations for computing the estimate with the Kalman algorithm and comparing the difference between the estimate and the measured value with the given threshold are as follows:
Step 3.1: Input the initial state estimate, the initial mean square error estimate, and the initial covariance.
Step 3.2: Decompose a segment of measurement data preceding the current time using the wavelet transform, and extract the high-frequency part whose coefficients are below a threshold as the measurement noise, obtaining a measured value.
Step 3.3: Apply a time-dependent weight to the measured values in this period.
Step 3.4: Estimate the measured value at the current time from the measured value at the previous time, then correct the state estimate in real time using the measured value at the current time. The three tasks (updating the measurement noise, prediction, and correction) are processed in parallel, and their results are combined to obtain an estimate.
Step 3.5: Using the results obtained in Step 3.1 and Step 3.3, compare the estimate with the measured value. If the absolute value of the difference between them is greater than the threshold, proceed to Step 4; otherwise judge the value to be normal.
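The noise extraction of Step 3.2 can be illustrated with a short sketch. It assumes a one-level Haar transform as the wavelet, which the description does not specify; the function name `haar_noise` and its parameters are hypothetical. It follows the four stages named in the description: decomposition, threshold screening, reconstruction, and noise extraction.

```python
import math


def haar_noise(samples, coeff_threshold):
    """Return the part of `samples` carried by small one-level Haar detail coefficients."""
    if len(samples) % 2:
        raise ValueError("one-level Haar needs an even number of samples")
    root2 = math.sqrt(2.0)
    noise = []
    for k in range(0, len(samples), 2):
        # Decomposition: the detail coefficient is the high-frequency part of the pair.
        detail = (samples[k] - samples[k + 1]) / root2
        # Threshold screening: small details are treated as noise, large ones as signal.
        kept = detail if abs(detail) < coeff_threshold else 0.0
        # Reconstruction from the kept detail alone (approximation set to zero).
        noise.extend([kept / root2, -kept / root2])
    return noise
```

A small wobble between neighbouring samples is returned as noise, while a large jump is left out of the noise estimate because its detail coefficient exceeds the threshold.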
Further, in Step 3.4 the three tasks (updating the measurement noise, prediction, and correction) are processed in parallel and finally combined to obtain the estimate. The specific steps are as follows:
3.4.1: Update the measurement noise
Wavelet decomposition is applied to a sequence of L measured values. The input is the L measured values y(t-L-1), y(t-L), ..., y(t-1), and the output is the covariance $Q_w(t)$ of the measurement noise at time t. The formula for updating the measurement noise is:
$Q_w(t-1) = E[w(t-1)\,w^T(t-1)]$
where the noise is extracted by the wavelet transform: $w(t-1) = \mathrm{WaveDec}(y(t-1))$.
A forgetting factor is then added:
$Q_w(t-1) = (1-\lambda_t)\,Q_w(t-1) + \lambda_t\,[w(t-1)\,w^T(t-1) - C(t-1)\,P(t|t-1)\,C^T(t-1)]$
where $\lambda_t = (1-\lambda)/(1-\lambda^t)$ and $0 < \lambda < 1$.
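The forgetting-factor update above can be written out in scalar form. This is a minimal sketch under the assumption that the state and measurement are one-dimensional, so the transposes disappear; the function names are hypothetical.

```python
def forgetting_weight(t, lam):
    """lambda_t = (1 - lam) / (1 - lam**t), for t >= 1 and 0 < lam < 1."""
    if not 0.0 < lam < 1.0 or t < 1:
        raise ValueError("need 0 < lam < 1 and t >= 1")
    return (1.0 - lam) / (1.0 - lam ** t)


def update_noise_variance(q_prev, w, c, p_pred, t, lam):
    """Scalar form of the forgetting-factor update for the measurement noise variance."""
    lam_t = forgetting_weight(t, lam)
    # (1 - lambda_t) * old variance + lambda_t * [w w^T - C P(t|t-1) C^T]
    return (1.0 - lam_t) * q_prev + lam_t * (w * w - c * p_pred * c)
```

At t = 1 the weight equals 1, so the first update discards the initial value entirely; as t grows the weight falls toward 1 - lam, which keeps recent noise samples dominant.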
3.4.2: Prediction task
The prediction stage computes the state estimate and the mean square error estimate. The inputs of the prediction task are therefore the state estimate $\hat{x}(t-1)$ and the error estimate $P(t-1)$ at time t-1, and the outputs are the state estimate $\hat{x}(t|t-1)$ and the error estimate $P(t|t-1)$ at time t.
Compute the one-step state estimate:
Compute the one-step mean square error estimate:
$P(t|t-1) = F(t-1)\,P(t-1)\,F^T(t-1) + Q_v(t-1)$
where $Q_v(t-1)$ is the system noise variance at time t-1; here $Q_v(t)$ is constant.
3.4.3: Correction task
The correction task mainly computes the Kalman filter gain and then corrects the state estimate and the error estimate using the Kalman gain and the measured value. The inputs of the correction stage are the error estimate $P(t|t-1)$ at time t, the measurement noise variance $Q_w(t-1)$ at time t-1, the state estimate $\hat{x}(t|t-1)$ at time t, and the measured value $y(t)$ at time t; the outputs are the state estimate $\hat{x}(t)$ and the error estimate $P(t)$ at time t.
Compute the filter gain from the mean square error estimate:
Update the state estimate from the one-step state estimate and the filter gain:
Update the mean square error:
$P(t) = [1 - K(t)\,C(t-1)]\,P(t|t-1)$
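The prediction and correction stages can be sketched in scalar form. This is a minimal illustration that assumes the textbook Kalman forms for the one-step state estimate, the gain, and the state update, which are not reproduced in this text; only the two covariance formulas are taken from the description. The function names and the scalar simplification (F, C, and the variances as plain floats) are hypothetical.

```python
def predict(x_prev, p_prev, f, q_v):
    """One-step prediction: returns (x_pred, p_pred)."""
    x_pred = f * x_prev            # assumed standard form, not shown in the text
    p_pred = f * p_prev * f + q_v  # P(t|t-1) = F P(t-1) F^T + Qv
    return x_pred, p_pred


def correct(x_pred, p_pred, y, c, q_w):
    """Correction: returns (x_new, p_new, gain)."""
    gain = p_pred * c / (c * p_pred * c + q_w)  # assumed standard gain
    x_new = x_pred + gain * (y - c * x_pred)    # assumed standard state update
    p_new = (1.0 - gain * c) * p_pred           # P(t) = [1 - K C] P(t|t-1)
    return x_new, p_new, gain
```

For example, `predict(1.0, 1.0, 1.0, 1.0)` yields a predicted variance of 2.0, and correcting that prediction with a measurement of 2.0 and a noise variance of 2.0 gives a gain of 0.5, so the corrected state lands halfway between prediction and measurement.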
3.4.4: Compute the combined measurement estimate:
Further, the time influence factor in Step 4 is computed as follows:
where $\lambda_t(t)$ is the time-dimension influence factor at time t, $y_i(t-j)$ is the measured value of node i at time t-j, and $\hat{y}_i(t-j)$ is the predicted value of node i at time t-j.
Further, the space influence factor in Step 5 is computed as follows:
where $\lambda_s(t)$ is the space-dimension influence factor at time t, $y_i(t)$ is the measured value of node i at time t, and $\hat{y}_i(t)$ is the predicted value of node i at time t.
Further, the other factor affecting anomaly detection in Step 6 is mainly the flood season, and the flood-season influence factor is computed as follows:
where $\lambda_f(t)$ is the flood-season influence factor at time t; N and M are the numbers of samples in the non-flood period and the flood period, respectively; and $P_{N,t}$ and $P_{M,t}$ are the water levels sampled at time t in the non-flood period and the flood period, respectively.
To improve the accuracy of data stream anomaly detection, the invention introduces time, space, and other provenance information and proposes a Kalman method based on multidimensional influence factors. On this basis, to improve the efficiency of the algorithm, the tasks are parallelized, yielding a parallel Kalman method based on multidimensional influence factors. Finally, experiments on the proposed algorithm demonstrate that it is feasible and effective.
Brief description of the drawings
Fig. 1 is a flow chart of the existing Kalman algorithm;
Fig. 2 is a flow chart of the data stream anomaly detection method of the invention;
Fig. 3 is a flow chart of the Kalman algorithm as improved by the invention;
Fig. 4 is a schematic diagram of the weight function;
Fig. 5 compares the running times of the AKF, WKF, and MDF-KF algorithms in the embodiment;
Fig. 6 compares the running times of the AKF, WKF, MDF-KF, and PKF algorithms in the embodiment.
Detailed description
The invention is further illustrated below with reference to specific embodiments. It should be understood that these embodiments only illustrate the invention and do not limit its scope; after reading the invention, modifications of its various equivalent forms by those skilled in the art fall within the scope defined by the claims appended to this application.
The parallel Kalman algorithm based on multidimensional influence factors is developed in two stages. First, to improve the accuracy of anomaly detection on data streams and to make it possible to determine the type and cause of an anomaly, influence factors in three dimensions (time, space, and other provenance information) are added, giving a Kalman method based on multidimensional influence factors. Then, to improve efficiency, the algorithm is decomposed into tasks that run in parallel, giving a parallel Kalman algorithm based on multidimensional influence factors.
Kalman algorithm based on multidimensional influence factors:
1. Algorithm improvement
(1) Improved wavelet-transform extraction of measurement noise
The Kalman algorithm is essentially a continuously repeating loop that keeps correcting the estimate $\hat{x}(t)$ with the measured value $y(t)$, thereby improving the accuracy of prediction. However, the basic Kalman filter algorithm only applies to systems whose measurement noise is known, so its performance in practical applications is unsatisfactory. The Kalman algorithm based on multidimensional influence factors therefore improves how the measurement noise is obtained; the flow chart of the Kalman algorithm before the improvement is shown in Fig. 1.
The Kalman algorithm based on multidimensional influence factors obtains the measurement noise in two steps:
1. Before performing anomaly detection, the algorithm first selects a segment of measurement data of length L preceding the current time, decomposes it using the wavelet transform, and extracts the high-frequency part whose coefficients are below a threshold as the measurement noise. The Kalman algorithm based on multidimensional influence factors uses a threshold-based method, and the noise extraction process can be subdivided into four steps: wavelet decomposition, threshold screening, wavelet reconstruction, and noise extraction.
2. After the measurement noise has been extracted in real time with the wavelet transform, the algorithm also applies a time-dependent weight to the measurement noise in this period, which further improves the accuracy of the measurement noise estimate. Suppose that, in a data segment of time length L, the measurement noise values obtained by the wavelet transform are $w_1, w_2, \ldots, w_L$; then the noise value at time L+1 is:
where $\lambda_t = (1-\lambda)/(1-\lambda^t)$ is the weight of the measurement noise $w_t$ at time t, with $\lambda \in (0,1)$. Fig. 4 is a schematic diagram of the weight function. Its first panel shows the measurement noise values at time k and at the L-1 preceding times, obtained by the wavelet transform; the original method takes the average of these L measurement noise values as the measurement noise value at time t+1. This clearly introduces a large error, because measurement noise from earlier times has less influence on the measurement noise at the current time. The second panel shows the weight function: the closer a time is to the current time, the larger its weight, and the farther away it is, the smaller its weight. This property matches the way noise at different times influences the current noise, so the weight function is added to the wavelet-transform computation of the measurement noise to improve its accuracy. The third panel shows the result after the weights are applied.
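The effect of the weight $\lambda_t = (1-\lambda)/(1-\lambda^t)$ can be checked numerically. The sketch below assumes the weight is applied recursively, in the same way as in the forgetting-factor update, because the weighted formula itself is not reproduced in this text; both function names are hypothetical. Under that assumption, folding L values recursively is equivalent to a weighted average whose weights grow geometrically toward the newest value.

```python
def recursive_weighted_noise(values, lam):
    """Fold values in time order with weight lambda_t = (1 - lam) / (1 - lam**t)."""
    estimate = 0.0
    for t, value in enumerate(values, start=1):
        lam_t = (1.0 - lam) / (1.0 - lam ** t)
        estimate = (1.0 - lam_t) * estimate + lam_t * value
    return estimate


def effective_weights(length, lam):
    """Weight each of `length` values ends up with; the last entry is the newest."""
    norm = (1.0 - lam) / (1.0 - lam ** length)
    return [norm * lam ** (length - i) for i in range(1, length + 1)]
```

The effective weights sum to 1 and increase toward the most recent sample, which is the behaviour described for the second panel of Fig. 4, whereas a plain average would give every sample the same weight 1/L.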
(2) Adding multidimensional influence factors
In wireless sensor network environments, existing data fusion operations mainly fuse data along the time dimension and the space dimension. The time dimension refers to fusing data from the same sensor node at different times; the space dimension refers to fusing data between adjacent nodes at the same time. On this basis, data provenance is added to the anomaly detection algorithm, and three dimensions of factors that influence anomaly detection are proposed: time, space, and other provenance information.
For the time dimension, while performing anomaly detection at the current time, the Kalman algorithm based on multidimensional influence factors also incorporates the detection results for the node's measured values at several preceding times to obtain the time influence factor. The time range to be compared can be adjusted dynamically as needed, which improves the adaptability of the algorithm in practical applications. The time influence factor is defined as follows:
where $\lambda_t(t)$ is the time-dimension influence factor at time t, $y_i(t-j)$ is the measured value of node i at time t-j, and $\hat{y}_i(t-j)$ is the predicted value of node i at time t-j.
For the space dimension, the Kalman algorithm based on multidimensional influence factors considers how the anomaly detection results of nodes that are adjacent to the current node, or related to it in other ways such as upstream and downstream, affect the detection result of the current node, and fuses the data of these spatially related nodes. The space-dimension influence factor is defined as follows:
where $\lambda_s(t)$ is the space-dimension influence factor at time t, $y_i(t)$ is the measured value of node i at time t, and $\hat{y}_i(t)$ is the predicted value of node i at time t.
Data fusion in the dimension of other provenance information refers to combining the predicted value of the Kalman algorithm with other provenance information that affects anomaly detection, including the weather, the flood season, the operating condition of the sensors, and human activity. Because changes in river water level during the flood period differ markedly from those during the non-flood period, the concept of a flood-season influence factor is proposed, defined as follows:
where $\lambda_f(t)$ is the flood-season influence factor at time t; N and M are the numbers of samples in the non-flood period and the flood period, respectively; and $P_{N,t}$ and $P_{M,t}$ are the water levels sampled at time t in the non-flood period and the flood period, respectively.
2. Algorithm implementation
(1) First, each sensor node runs Kalman detection on its own data. If the value at the current time is detected as anomalous, the time-dimension influence factor $\lambda_t(t)$ is computed from the data of the preceding period. If $\lambda_t(t)$ is greater than or equal to the threshold $\xi_t$, the difference between the predicted value and the measured value at the current time is recorded together with $\lambda_t(t)$, and the procedure moves to the second step; otherwise the anomaly is regarded as a single anomalous point, the anomalous value is replaced with the predicted value, and the result is recorded in the database.
(2) Then, the space-dimension influence factor $\lambda_s(t)$ is computed from the characteristic values (the prediction difference and $\lambda_t(t)$) transmitted by each related sensor. If $\lambda_s(t)$ is greater than or equal to the threshold $\xi_s$, $\lambda_s(t)$ is recorded and the procedure moves to the third step; otherwise the anomaly is regarded as an anomaly of the sensor's measured value, and the result is recorded in the database.
(3) Finally, the anomaly type is determined from the $\lambda_s(t)$ passed in by the system and from other provenance information such as the flood season and the sensor's working state.
Parallel Kalman algorithm based on multidimensional influence factors
1. Parallelization improvement
In a sensor network, parallelizing the Kalman algorithm makes full use of the nodes in the network and improves the efficiency of the algorithm.
There are two common ways to parallelize the Kalman algorithm:
(1) Matrix decomposition. The iterations of the Kalman algorithm involve many matrix additions and multiplications. These matrix operations can be decomposed and simplified so that they are computed simultaneously, achieving multi-machine execution.
(2) Task decomposition. The Kalman algorithm consists mainly of two stages, prediction and correction. In a single-machine Kalman algorithm, the CPU must wait for the prediction computation to finish before starting the correction, which seriously affects computational efficiency. In a distributed environment, the two stages can be separated, and the results are then passed through communication between processors.
This algorithm is ultimately decomposed into four tasks: updating the measurement noise, prediction, correction, and computing the three-dimensional influence factors. Of these, updating the measurement noise, prediction, and correction can be executed in parallel:
(1) Updating the measurement noise
The method by which the improved Kalman algorithm based on multidimensional influence factors extracts the measurement noise was described above. In this process, wavelet decomposition must be applied to a sequence of L measured values. The input of this task is therefore the L measured values y(t-L-1), y(t-L), ..., y(t-1), and the output is the covariance $Q_w(t)$ of the measurement noise at time t. The formula for updating the measurement noise is:
$Q_w(t-1) = E[w(t-1)\,w^T(t-1)]$ (formula 5)
where $w(t-1) = \mathrm{WaveDec}(y(t-1))$ is the wavelet noise extraction.
A forgetting factor is then added:
$Q_w(t-1) = (1-\lambda_t)\,Q_w(t-1) + \lambda_t\,[w(t-1)\,w^T(t-1) - C(t-1)\,P(t|t-1)\,C^T(t-1)]$ (formula 6)
where $\lambda_t = (1-\lambda)/(1-\lambda^t)$ and $0 < \lambda < 1$.
(2) Prediction task
Fig. 1 gives the flow chart of the existing Kalman algorithm, in which the prediction stage computes the state estimate and the error covariance matrix. The inputs of the prediction task are therefore the state estimate $\hat{x}(t-1)$ and the error estimate $P(t-1)$ at time t-1, and the outputs are the state estimate $\hat{x}(t|t-1)$ and the error estimate $P(t|t-1)$ at time t. The formulas of the prediction task are:
$P(t|t-1) = F(t-1)\,P(t-1)\,F^T(t-1) + Q_v(t-1)$ (formula 8)
where $Q_v(t-1)$ is the system noise variance at time t-1. Because the application scenario of the invention is a hydrological sensor network on a river, which can be regarded as a time-invariant system, $Q_v(t)$ is constant in the short term.
(3) Correction task
The correction task mainly computes the Kalman filter gain and then corrects the state estimate and the error estimate using the Kalman gain and the measured value. The inputs of the correction stage are the error estimate $P(t|t-1)$ at time t, the measurement noise variance $Q_w(t-1)$ at time t-1, the state estimate $\hat{x}(t|t-1)$ at time t, and the measured value $y(t)$ at time t; the outputs are the state estimate $\hat{x}(t)$ and the error estimate $P(t)$ at time t. The formulas of the correction stage are:
$P(t) = [1 - K(t)\,C(t-1)]\,P(t|t-1)$ (formula 11)
As the description of the three tasks above shows, the output of each task is the input of another, so the three stages must be adjusted before they can be parallelized. The method used here is to delay the correction stage and advance the prediction stage by one time step; that is, the correction stage is unchanged, and the formulas of the prediction stage become:
In this way, the three tasks described above can each be computed simultaneously on a different processor, which saves computation time.
Besides the three tasks executed simultaneously, there is a further task that computes the anomaly result from the three dimensional influence factors. Because this task needs the result of the improved Kalman algorithm, it cannot run at the same time as the three tasks above and is therefore carried out after the three parallel tasks.
2. Parallel implementation
In the implementation, the initialization parameters of the system are provided first, and then three processors compute the three tasks described above. Because the correction task has been delayed, the three tasks do not need to wait for one another; they only need to communicate after finishing their computation in order to pass on the results. Finally, the measurement estimate is output for the next step of processing. Fig. 3 gives the flow chart of the improved parallel algorithm.
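The scheduling pattern described here can be sketched with Python's standard `concurrent.futures` module. The modified prediction formula is not reproduced in this text, so the sketch only illustrates the pattern: three tasks read one snapshot of the previous round's outputs, so none of them waits on another within a round. The task bodies are scalar stand-ins, and all names (`noise_task`, `predict_task`, `correct_task`, `run_round`, the snapshot keys) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def noise_task(snapshot):
    # Stand-in: variance of the recent measurement window around its mean.
    window = snapshot["window"]
    mean = sum(window) / len(window)
    return sum((v - mean) ** 2 for v in window) / len(window)


def predict_task(snapshot):
    # Stand-in prediction from the previous round's corrected state.
    f = snapshot["f"]
    return f * snapshot["x"], f * snapshot["p"] * f + snapshot["q_v"]


def correct_task(snapshot):
    # Stand-in correction using the previous round's prediction and noise variance.
    x_pred, p_pred = snapshot["x_pred"], snapshot["p_pred"]
    gain = p_pred / (p_pred + snapshot["q_w"])
    return x_pred + gain * (snapshot["y"] - x_pred), (1.0 - gain) * p_pred


def run_round(snapshot, pool):
    """Run the three tasks concurrently on one read-only snapshot."""
    futures = [pool.submit(task, snapshot)
               for task in (noise_task, predict_task, correct_task)]
    q_w, (x_pred, p_pred), (x, p) = (future.result() for future in futures)
    # Results are exchanged only after all three tasks have finished.
    return {**snapshot, "q_w": q_w, "x_pred": x_pred, "p_pred": p_pred, "x": x, "p": p}
```

A thread pool keeps the example self-contained; for CPU-bound work in CPython, separate processes or a distributed runtime would be needed to obtain a real speed-up.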
Combining the Kalman algorithm based on multidimensional influence factors with its parallel version, the method adds the time influence factor, the space influence factor, and the other provenance information (here, the flood-season factor), and parallelizes the three tasks of extracting the measurement noise, state estimation, and state correction. The overall flow of the detection method is shown in Fig. 2.
Extracting the measurement noise requires wavelet decomposition of the measured values over a period preceding the current time, and this process takes far longer than state estimation and state correction, so the noise extraction task needs to be divided further. Because the noise extraction at each time is not coupled to that at any other time, the task can be split directly.
Experimental verification
1. Experimental results and analysis of the Kalman algorithm based on multidimensional influence factors
Analysis of the detection accuracy of the Kalman algorithm based on multidimensional influence factors. The Kalman algorithm based on a time forgetting factor (Amnesic Kalman Filtering, AKF), the Kalman algorithm based on the wavelet transform (Wavelet Kalman Filtering, WKF), and the Kalman algorithm based on multidimensional influence factors (MDF-KF) were each run on the same data, and their detection performance on different types of anomalous points was compared. From the real-time telemetered water level data set of a river, 1000 consecutive records from May were selected. First, 5 non-consecutive records were chosen at random from the 1000 and the preset threshold was added to or subtracted from each, to serve as single anomalous data points. Then a segment of 10 consecutive records was chosen and the threshold was added to each, to represent continuous anomalous points. Finally, 5 more consecutive records were chosen and changed to successively decreasing values, to represent an abrupt drop in the measured value. Repeated experiments led to the following conclusions. Compared with the other two algorithms, AKF has the highest error rate, with relatively serious missed detections and false detections, and it also performs poorly on continuous anomalous points. WKF has a slightly lower error rate than AKF and fewer missed detections, and it performs slightly better on single and continuous constant anomalous values, but it performs poorly on continuous abrupt changes and its false detection rate is also high. The Kalman algorithm based on multidimensional influence factors combines the advantages of the two, markedly increasing the number of continuous anomalous values detected while also reducing the numbers of missed and false detections, which improves detection accuracy.
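The construction of the test set can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the function name `inject_anomalies`, the placement of the two blocks in separate halves of the series, and the choice to lower the drop segment by one threshold per step are all hypothetical, and the series is assumed to hold at least a few dozen records.

```python
import random


def inject_anomalies(levels, threshold, seed=0):
    """Return (modified copy, labels) with three kinds of synthetic anomalies."""
    rng = random.Random(seed)
    data = list(levels)
    labels = {}
    # Reserve two contiguous blocks in separate halves so the kinds never overlap.
    run_start = rng.randrange(0, len(data) // 2 - 10)
    drop_start = rng.randrange(len(data) // 2, len(data) - 5)
    for i in range(run_start, run_start + 10):  # 10 consecutive records + threshold
        data[i] += threshold
        labels[i] = "continuous"
    base = data[drop_start - 1]
    for step, i in enumerate(range(drop_start, drop_start + 5), start=1):
        data[i] = base - step * threshold       # successively decreasing values
        labels[i] = "drop"
    # 5 isolated records whose neighbours are not already marked as anomalous.
    while sum(1 for kind in labels.values() if kind == "single") < 5:
        i = rng.randrange(1, len(data) - 1)
        if all(j not in labels for j in (i - 1, i, i + 1)):
            data[i] += rng.choice((-1, 1)) * threshold
            labels[i] = "single"
    return data, labels
```

Returning the labels alongside the modified series makes it possible to count missed and false detections for each anomaly type separately.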
Analysis of the execution time of the Kalman algorithm based on multidimensional influence factors. For the telemetry data selected above, 8 sampling points between 100 and 5000 records were taken, and the running times (in seconds) of the three algorithms were compared; Fig. 5 is a line chart of the running times at different data volumes. The experiment shows that the running time of AKF grows little as the data volume increases, because AKF does not need to extract the measurement noise in real time and its computation is the simplest. WKF and the Kalman algorithm based on multidimensional influence factors both use the wavelet transform to extract the measurement noise over a time window, so their running times grow quickly with the data volume. Although the running time of the Kalman algorithm based on multidimensional influence factors is higher than that of AKF and slightly higher than that of WKF, its anomaly detection accuracy is clearly higher than that of the earlier algorithms. Table 1 compares the detection error rates of the algorithms (in the table, MDF-KF abbreviates the Kalman algorithm based on multidimensional influence factors).
Table 1: Comparison of algorithm detection error rates
2. Experimental results and analysis of the parallel Kalman algorithm based on multidimensional influence factors
The invention mainly targets online detection of anomalous data, and the data volume in the whole wireless sensor network is not very large, so at most 5000 records were used in the experiment. Using the same telemetered water level data set, 8 sampling points between 100 and 5000 records were taken, and the running time (in seconds) of the parallelized algorithm was compared with those of AKF, WKF, and the Kalman algorithm based on multidimensional influence factors (MDF-KF); Fig. 6 compares the execution times.
AKF has the shortest running time, while WKF and MDF-KF take longer. The parallel Kalman algorithm based on multidimensional influence factors (PKF) performs poorly at low data volumes, because Storm must first spend some time allocating resources before it processes data. When the data volume is low, the time spent allocating tasks and resources makes up a large share of the total time, so parallelization is not effective; once the data volume grows to a certain level, the cost of allocating tasks and resources matters less and the advantage of parallelization appears. Fig. 6 shows that once the data volume reaches 1000 records, the parallelized algorithm needs less time than the algorithm before parallelization. Table 2 gives the error rates when anomaly detection is performed with each of the influence factors separately. The results show that the time-dimension influence factor reduces the number of missed detections and increases the number of continuous anomalous values detected, the space-dimension influence factor reduces the number of false detections, and the influence factor for other provenance information also reduces the number of false detections.
Table 2: Comparison of error rates for different dimensions
Claims (6)
1. A data stream anomaly detection method based on a parallel Kalman algorithm, characterized by comprising the following steps:
Step 1: Acquire a period of measurement data from a sensor, mainly real-time water level measurements;
Step 2: Compare the sensor's current measured value with the measured values of the preceding period to judge
whether the measured value is stable and unchanged; if the measured value has not changed over the period,
extract a summary of the data and store it in the database; if the measured value differs from the data of
the preceding period, proceed to step 3;
Step 3: Compute an estimate from the measured value of step 2 using the Kalman algorithm, and compare the
absolute value of the difference between the estimate and the measured value with a given threshold; if it is
less than the threshold, judge the value normal, extract a summary, and store it in the database; otherwise
judge it an abnormal value and proceed to step 4;
Step 4: Compute the time influence factor from the sensor's measured values over the preceding period; if the
time influence factor is less than a threshold, judge the value erroneous, with the anomaly type being a
single-value anomaly, then record it, replace the abnormal value with a corrected one, and store it in the
database; otherwise proceed to step 5;
Step 5: Compute the space influence factor from the measured values of the sensors associated with the
sensor's position; if the space influence factor is less than a threshold, judge the value erroneous, with
the anomaly type being a single-sensor continuous anomaly, then record the anomaly, correct it, store it in
the database, and handle it according to the sensor state; otherwise proceed to step 6;
Step 6: Judge the cause of the abnormal value according to other factors that influence anomaly detection,
including flood season, weather, and human factors; the anomaly type is a multi-sensor continuous anomaly;
finally record the cause of the abnormal value and store the information in the database.
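The six steps of claim 1 form a decision cascade, which can be sketched as follows. This is a minimal illustration under stated assumptions: the names `Verdict` and `classify` are hypothetical, the stability check of step 2 is simplified to a comparison with a single previous value, and the estimate and the two influence factors are passed in as already-computed numbers. Database storage, correction of the abnormal value, and the cause analysis of step 6 are left out.

```python
from enum import Enum


class Verdict(Enum):
    STABLE = "stable, summary stored"
    NORMAL = "normal value"
    SINGLE_VALUE = "single-value anomaly"
    SINGLE_SENSOR = "single-sensor continuous anomaly"
    MULTI_SENSOR = "multi-sensor continuous anomaly"


def classify(measured, previous, estimate, time_factor, space_factor,
             residual_threshold, time_threshold, space_threshold):
    """Walk one measurement through the cascade of steps 2 to 6."""
    # Step 2 (simplified): an unchanged value needs no filtering.
    if measured == previous:
        return Verdict.STABLE
    # Step 3: small residual between estimate and measurement is normal.
    if abs(estimate - measured) < residual_threshold:
        return Verdict.NORMAL
    # Step 4: low time influence factor means an isolated bad value.
    if time_factor < time_threshold:
        return Verdict.SINGLE_VALUE
    # Step 5: low space influence factor means only this sensor is off.
    if space_factor < space_threshold:
        return Verdict.SINGLE_SENSOR
    # Step 6: otherwise several sensors are affected; the cause is
    # sought among flood season, weather, and human factors.
    return Verdict.MULTI_SENSOR


print(classify(9.0, 5.0, 5.0, 0.2, 0.0, 1.0, 1.0, 1.0))
```

Each branch returns as soon as a cause is found, so the more expensive factors only need to be computed for values that fail the earlier checks.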
2. The data stream anomaly detection method based on a parallel Kalman algorithm according to claim 1,
characterized in that the specific operation steps in step 3 for computing the estimate with the Kalman
algorithm and comparing the difference between the estimate and the measured value with the given threshold
are as follows:
Step 3.1: Input the initial state estimate, the initial mean square error estimate, and the initial covariance;
Step 3.2: Decompose the segment of measurement data preceding the current time using a wavelet transform, and
extract the high-frequency part whose coefficients are below a threshold as the measurement noise, obtaining
a measured value;
Step 3.3: Apply a time-dependent weight to the measured values in this period;
Step 3.4: Estimate the measured value at the current time from the measured value at the previous time, then
correct the state estimate in real time using the measured value at the current time; process the three
tasks of updating the measurement noise, prediction, and correction in parallel, and combine their results to
compute an estimate;
Step 3.5: Using the results obtained in step 3.1 and step 3.3, judge the difference between the estimate and
the measured value; if the absolute value of the difference is greater than the threshold, proceed to step 4;
otherwise judge the value normal.
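Step 3.2 can be illustrated with a one-level Haar wavelet decomposition. This is a sketch only: the claim does not name a wavelet family or a decomposition depth, so the Haar wavelet, the single level, and the function names `haar_level1` and `split_noise` are all assumptions made for illustration. Detail (high-frequency) coefficients whose magnitude is below the threshold are treated as measurement noise; larger ones are kept in the signal.

```python
import math


def haar_level1(signal):
    """One-level Haar transform of an even-length sequence."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s
              for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s
              for i in range(0, len(signal), 2)]
    return approx, detail


def split_noise(signal, coeff_threshold):
    """Split a signal into a denoised part and an extracted noise part.

    Detail coefficients below coeff_threshold in magnitude go to the
    noise; the rest stay in the signal. The two parts sum to the input.
    """
    approx, detail = haar_level1(signal)
    s = math.sqrt(2.0)
    denoised, noise = [], []
    for a, d in zip(approx, detail):
        n = d if abs(d) < coeff_threshold else 0.0  # noise coefficient
        k = d - n                                   # kept coefficient
        # Inverse Haar step, applied separately to signal and noise.
        denoised += [(a + k) / s, (a - k) / s]
        noise += [n / s, -n / s]
    return denoised, noise


levels = [10.0, 10.2, 10.1, 15.0]  # hypothetical water levels
clean, w = split_noise(levels, 1.0)
print(clean)
print(w)
```

In this example the small jitter between the first two samples is moved into the noise sequence, while the large jump in the second pair stays in the signal, where the later residual test can flag it.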
3. The data stream anomaly detection method based on a parallel Kalman algorithm according to claim 1,
characterized in that the specific steps in step 3.4 for processing the three tasks of updating the
measurement noise, prediction, and correction in parallel, and finally combining them to obtain the estimate,
are as follows:
3.4.1: Update the measurement noise
Perform wavelet decomposition on a measurement sequence of length L; the input is the measured values
$y(t-L-1), y(t-L), \ldots, y(t-1)$, and the output is the covariance $Q_w(t)$ of the measurement noise at
time t. The measurement noise is updated by:
$Q_w(t-1) = E[w(t-1) w^T(t-1)]$
where the wavelet noise extraction is $w(t-1) = \mathrm{WaveDec}(y(t-1))$;
A forgetting factor is then added:
$Q_w(t-1) = (1-\lambda_t) Q_w(t-1) + \lambda_t [w(t-1) w^T(t-1) - C(t-1) P(t|t-1) C^T(t-1)]$
where $\lambda_t = (1-\lambda)/(1-\lambda^t)$ and $0 < \lambda < 1$;
3.4.2: Prediction task
The prediction step computes the state estimate and the mean square error estimate. The inputs of the
prediction task are therefore the state estimate at time t-1 and the error estimate $P(t-1)$ at time t-1, and
the outputs are the state estimate at time t and the error estimate $P(t|t-1)$ at time t;
Compute the one-step state estimate:
Compute the one-step mean square error estimate:
$P(t|t-1) = F(t-1) P(t-1) F^T(t-1) + Q_v(t-1)$
where $Q_v(t-1)$ is the system noise variance at time t-1, and $Q_v(t)$ is constant here;
3.4.3: Correction task
The correction task mainly computes the Kalman filter gain and then corrects the state estimate and the error
estimate according to the Kalman gain and the measured value. The inputs of the correction step are the error
estimate $P(t|t-1)$ at time t, the measurement noise variance $Q_w(t-1)$ at time t-1, the state estimate at
time t, and the measured value $y(t)$ at time t; the outputs are the state estimate at time t and the error
estimate $P(t)$ at time t;
Compute the filter gain from the mean square error estimate:
Update the state estimate from the one-step state estimate and the filter gain:
Update the mean square error:
$P(t) = [1 - K(t) C(t-1)] P(t|t-1)$
3.4.4: Compute the combined measurement estimate:
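The three tasks of claim 3 can be sketched for the scalar case as follows. Several points are assumptions rather than the patent's own formulas: the one-step state estimate and the filter gain are not reproduced in the text, so the textbook Kalman forms are used in their place; the exponent in the forgetting weight is read as a power of the forgetting factor; all function and parameter names are hypothetical; and the tasks run one after another here instead of in parallel on Storm.

```python
def forgetting_weight(lam, t):
    """Weight (1 - lam) / (1 - lam**t), for 0 < lam < 1 and t >= 1."""
    return (1.0 - lam) / (1.0 - lam ** t)


def update_measurement_noise(q_w, w, c, p_pred, lam, t):
    """Task 3.4.1: blend the old noise variance with the new sample.

    w is the noise sample extracted by the wavelet step. The bracketed
    term can go negative in practice; no clamping is applied here.
    """
    lt = forgetting_weight(lam, t)
    return (1.0 - lt) * q_w + lt * (w * w - c * p_pred * c)


def predict(x_prev, p_prev, f, q_v):
    """Task 3.4.2: one-step prediction of state and error."""
    x_pred = f * x_prev              # textbook form, assumed
    p_pred = f * p_prev * f + q_v    # as given in the claim
    return x_pred, p_pred


def correct(x_pred, p_pred, y, c, q_w):
    """Task 3.4.3: gain, corrected state, corrected error."""
    k = p_pred * c / (c * p_pred * c + q_w)  # textbook gain, assumed
    x = x_pred + k * (y - c * x_pred)        # textbook update, assumed
    p = (1.0 - k * c) * p_pred               # as given in the claim
    return x, p, k


# Hypothetical numbers: level 10.0, then a reading of 11.0.
x_pred, p_pred = predict(10.0, 1.0, 1.0, 0.5)
x, p, k = correct(x_pred, p_pred, 11.0, 1.0, 0.5)
print(x_pred, p_pred, x, p, k)
```

Keeping the three tasks as separate pure functions mirrors the decomposition in the claim: each takes only the quantities listed as its inputs, which is what makes it possible to schedule them as separate units of work.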
4. The data stream anomaly detection method based on a parallel Kalman algorithm according to any one of
claims 1 to 3, characterized in that the time influence factor in step 4 is computed as follows:
Where: $\lambda_t(t)$ is the time-dimension influence factor at time t, and $y_i(t-j)$ is the measured value
of node i at time t-j, paired with the estimated value of node i at time t-j.
5. The data stream anomaly detection method based on a parallel Kalman algorithm according to claim 4,
characterized in that the space influence factor in step 5 is computed as follows:
Where: $\lambda_s(t)$ is the space-dimension influence factor at time t, and $y_i(t)$ is the measured value
of node i at time t, paired with the estimated value of node i at time t.
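The formulas for the two influence factors are not reproduced in the text, so the following is a stand-in and not the patent's definition. It assumes, purely for illustration, that each factor is the mean absolute difference between measured and estimated values: over earlier time steps of one node for the time dimension, and over associated nodes at one time step for the space dimension. The function names are hypothetical.

```python
def mean_abs_residual(measured, estimated):
    """Mean absolute difference between paired measured and estimated values."""
    if not measured or len(measured) != len(estimated):
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(y - e) for y, e in zip(measured, estimated)) / len(measured)


def time_influence(history_measured, history_estimated):
    """Stand-in time factor: residuals of one node at times t-1, t-2, ..."""
    return mean_abs_residual(history_measured, history_estimated)


def space_influence(neighbour_measured, neighbour_estimated):
    """Stand-in space factor: residuals of associated nodes at time t."""
    return mean_abs_residual(neighbour_measured, neighbour_estimated)


# Hypothetical data: the node's recent history tracked its estimates well,
# so a sudden large residual now would look like an isolated bad value.
print(time_influence([10.0, 10.2, 10.1], [10.1, 10.1, 10.1]))
print(space_influence([12.0, 13.5], [10.0, 10.5]))
```

Under this stand-in, a small time factor means the sensor's recent past was consistent, which fits the claim's reading of a low factor as a single-value anomaly; a small space factor means the neighbours are behaving, which fits a single-sensor anomaly.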
6. The data stream anomaly detection method based on a parallel Kalman algorithm according to claim 1,
characterized in that the other factor influencing anomaly detection in step 6 is mainly the flood season,
and the flood-season influence factor is computed as follows:
Where: $\lambda_f(t)$ is the flood-season influence factor at time t; N and M are the numbers of samples in
the non-flood period and the flood period, respectively; $P_{N,t}$ and $P_{M,t}$ are the sampled water level
values at time t in the non-flood period and the flood period, respectively.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611197599.8A CN106709250A (en) | 2016-12-22 | 2016-12-22 | Data flow abnormality detection method based on parallel Kalman algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611197599.8A CN106709250A (en) | 2016-12-22 | 2016-12-22 | Data flow abnormality detection method based on parallel Kalman algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106709250A true CN106709250A (en) | 2017-05-24 |
Family
ID=58938799
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611197599.8A Pending CN106709250A (en) | 2016-12-22 | 2016-12-22 | Data flow abnormality detection method based on parallel Kalman algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106709250A (en) |
Non-Patent Citations (5)
Title |
---|
KATSUYA KONDO等: "《European Signal Processing Conference》", 31 December 2015 * |
焉晓贞等: ""基于卡尔曼滤波的动态传感数据流估计方法"", 《仪器仪表学报》 * |
王永利等: ""数据流上异常数据的在线检测与修正"", 《应该科学学报》 * |
花青等: ""基于多维滑窗的异常数据检测方法"", 《计算机应用》 * |
高羽等: ""小波变换域估计观测噪声方差的Kalman滤波算法及其在数据融合中的应用"", 《电子学报》 * |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108981679A (en) * | 2017-05-31 | 2018-12-11 | 精工爱普生株式会社 | Circuit device, physical amount measuring device, electronic equipment and moving body |
CN107484196B (en) * | 2017-08-14 | 2020-10-09 | 博锐尚格科技股份有限公司 | Data quality assurance method for sensor network and computer readable medium |
CN107484196A (en) * | 2017-08-14 | 2017-12-15 | 北京上格云技术有限公司 | The quality of data ensuring method and computer-readable medium of sensor network |
CN108571997A (en) * | 2017-12-26 | 2018-09-25 | 深圳市鼎阳科技有限公司 | A kind of method and apparatus that measured point is steadily contacted in detection probe |
CN108616838A (en) * | 2018-04-29 | 2018-10-02 | 山东省计算中心(国家超级计算济南中心) | Agricultural greenhouse Data Fusion method based on Kalman filtering algorithm |
CN109388772A (en) * | 2018-09-04 | 2019-02-26 | 河海大学 | A kind of taboo search method that time-based Large Scale Graphs equilibrium k is divided |
CN109522520A (en) * | 2018-11-09 | 2019-03-26 | 河海大学 | The multiple small echo coherent analysis method of groundwater level fluctuation and multiple factors |
CN109699021A (en) * | 2018-12-31 | 2019-04-30 | 宁波工程学院 | One kind is based on time-weighted agriculture Internet of Things method for diagnosing faults |
CN109699021B (en) * | 2018-12-31 | 2021-08-10 | 宁波工程学院 | Agricultural Internet of things fault diagnosis method based on time weighting |
CN109990789A (en) * | 2019-03-27 | 2019-07-09 | 广东工业大学 | A kind of flight navigation method, apparatus and relevant device |
CN112650281A (en) * | 2020-12-14 | 2021-04-13 | 一飞(海南)科技有限公司 | Multi-sensor tri-redundancy system, control method, unmanned aerial vehicle, medium and terminal |
CN112650281B (en) * | 2020-12-14 | 2023-08-22 | 一飞(海南)科技有限公司 | Multi-sensor three-redundancy system, control method, unmanned aerial vehicle, medium and terminal |
CN114137636A (en) * | 2021-11-11 | 2022-03-04 | 四川九通智路科技有限公司 | Regional meteorological monitoring management method and system for annular pressure sensor |
CN114137636B (en) * | 2021-11-11 | 2022-08-12 | 四川九通智路科技有限公司 | Regional meteorological monitoring management method and system for annular pressure sensor |
CN115388931A (en) * | 2022-10-27 | 2022-11-25 | 河北省科学院应用数学研究所 | Credible monitoring method, monitoring terminal and storage medium for sensor abnormal data |
CN115388931B (en) * | 2022-10-27 | 2023-02-03 | 河北省科学院应用数学研究所 | Credible monitoring method, monitoring terminal and storage medium for sensor abnormal data |
CN115795350A (en) * | 2023-01-29 | 2023-03-14 | 北京众驰伟业科技发展有限公司 | Abnormal data information processing method in production process of blood rheology test cup |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106709250A (en) | Data flow abnormality detection method based on parallel Kalman algorithm | |
US20170300546A1 (en) | Method and Apparatus for Data Processing in Data Modeling | |
US8700550B1 (en) | Adaptive model training system and method | |
CN113435725B (en) | Power grid host dynamic threshold setting method based on FARIMA-LSTM prediction | |
Visser | Estimation and detection of flexible trends | |
CN106529145A (en) | Bridge monitoring data prediction method based on ARIMA-BP neural network | |
CN110991625B (en) | Surface anomaly remote sensing monitoring method and device based on recurrent neural network | |
CN107977710A (en) | Electricity consumption abnormal data detection method and device | |
CN111461321A (en) | Improved deep reinforcement learning method and system based on Double DQN | |
CN111639798A (en) | Intelligent prediction model selection method and device | |
CN104156615A (en) | Sensor test data point anomaly detection method based on LS-SVM | |
Meng et al. | Analysis of ecological resilience to evaluate the inherent maintenance capacity of a forest ecosystem using a dense Landsat time series | |
CN115587666A (en) | Load prediction method and system based on seasonal trend decomposition and hybrid neural network | |
Grobler et al. | Using page's cumulative sum test on modis time series to detect land-cover changes | |
CN114580260A (en) | Landslide section prediction method based on machine learning and probability theory | |
CN113868953A (en) | Multi-unit operation optimization method, device and system in industrial system and storage medium | |
Pueyo et al. | A dynamic model to characterize beat-to-beat adaptation of repolarization to heart rate changes | |
Chabane et al. | Sensor fault detection and diagnosis using zonotopic set-membership estimation | |
CN113835947B (en) | Method and system for determining abnormality cause based on abnormality recognition result | |
CN117271979A (en) | Deep learning-based equatorial Indian ocean surface ocean current velocity prediction method | |
CN115935283B (en) | Drought cause tracing method based on multi-element nonlinear causal analysis | |
CN115759263A (en) | Strategy effect evaluation method and device based on cause and effect inference | |
CN115310705A (en) | Method and device for determining gas emission quantity and computer readable storage medium | |
CN113569324A (en) | Slope deformation monitoring abnormal data analysis and optimization method | |
CN115310359A (en) | Method, device, equipment and medium for determining transient emission of nitrogen oxides |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | | Application publication date: 20170524 |