CN113255593B - Sensor information anomaly detection method facing space-time analysis model - Google Patents

Sensor information anomaly detection method facing space-time analysis model

Info

Publication number
CN113255593B
Authority
CN
China
Prior art keywords
time
information
abnormal
sensor
sensor data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110708811.7A
Other languages
Chinese (zh)
Other versions
CN113255593A (en
Inventor
王妍
张新
邹晓涛
雷健
刘建国
杨朴
杨素花
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Smart Water Development Research Institute
Original Assignee
BEIJING WATER CONSERVANCY AUTOMATION INST
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING WATER CONSERVANCY AUTOMATION INST
Priority to CN202110708811.7A
Publication of CN113255593A
Application granted
Publication of CN113255593B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/02 Preprocessing
    • G06F2218/04 Denoising
    • G06F2218/06 Denoising by applying a scale-space analysis, e.g. using wavelet analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12 Classification; Matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Testing Or Calibration Of Command Recording Devices (AREA)
  • Testing And Monitoring For Control Systems (AREA)
  • Alarm Systems (AREA)

Abstract

The invention relates to a sensor information anomaly detection method oriented to a space-time analysis model, comprising the following steps: acquiring a sensor data set; determining first abnormal information in the time distribution characteristics according to the time information and the sensor data values in the sensor data set; determining second abnormal information in the spatial distribution characteristics; and screening the site information corresponding to the first abnormal information and the site information corresponding to the second abnormal information to determine abnormal sites. By screening the sites flagged in the time distribution characteristics against the sites flagged in the spatial distribution characteristics, the method examines the sensor data values along both time and space, so that sites with abnormal sensor data values are determined more accurately and abnormal data are detected more efficiently.

Description

Sensor information anomaly detection method facing space-time analysis model
Technical Field
The invention relates to the field of data detection, in particular to a sensor information anomaly detection method facing a space-time analysis model.
Background
With China's rapid economic development and accelerating industrialization, cities continue to expand, and water pollution has become one of the main obstacles to the country's sustainable development strategy. Detecting the extent, depth and diffusion of water pollution has therefore become a necessary part of urban development.
Because China's territory is vast, detecting and analyzing water pollution across such a wide area is time-consuming and labor-intensive work: it generally requires people to take samples on site, which increases labor costs.
If water pollution is not treated in time, it can cause severe and hard-to-reverse losses to society. However, because the degree and nature of pollution differ from place to place and monitoring and management cannot be carried out at all times, water pollution keeps getting worse.
Disclosure of Invention
Therefore, the invention provides a sensor information anomaly detection method oriented to a space-time analysis model, which addresses the problem of detecting abnormal data from sensors that measure the degree of water pollution and deciding whether those data should be retained.
In order to achieve the above object, the present invention provides a method for detecting sensor information abnormality for a spatio-temporal analysis model, comprising:
acquiring a sensor data set, wherein the sensor data set comprises information of a site where a sensor is located, longitude and latitude information of the sensor, time information for generating a sensor data value and the sensor data value;
constructing a time distribution characteristic of the sensor data according to the time information and the sensor data value in the sensor data set, and determining first abnormal information in the time distribution characteristic;
constructing a spatial distribution characteristic of the sensor data according to the longitude and latitude information and the sensor data value in the sensor data set, and determining second abnormal information in the spatial distribution characteristic;
screening the site information corresponding to the first abnormal information and the site information corresponding to the second abnormal information to determine abnormal sites;
determining the first abnormal information in the time distribution characteristic comprises:
constructing a spatio-temporal cube from the sensor data set, the spatio-temporal cube comprising the sensor data values of a plurality of sites in time and space;
dividing the time information into a first time, a second time, a third time and a fourth time;
presetting a first standard data value, a second standard data value, a third standard data value and a fourth standard data value in the central control unit;
if the average data value in the first time is larger than the first standard data value, the first abnormal information lies in the data of the sensors operating in the first time;
if the average data value in the second time is larger than the second standard data value, the first abnormal information lies in the data of the sensors operating in the second time;
if the average data value in the third time is larger than the third standard data value, the first abnormal information lies in the data of the sensors operating in the third time;
if the average data value in the fourth time is larger than the fourth standard data value, the first abnormal information lies in the data of the sensors operating in the fourth time;
screening the site information corresponding to the first abnormal information and the site information corresponding to the second abnormal information to determine abnormal sites comprises:
setting an abnormal value A for each site, where A = K1 × a1 + K2 × a2, a1 is a first abnormal value, a2 is a second abnormal value, K1 is a first coefficient and K2 is a second coefficient, and presetting an early warning value A0 in the central control unit;
if the abnormal value A of a site is less than the early warning value A0, the data value acquired by the sensor of that site is normal and consistent with the normal sensor data value for the current longitude and latitude and the current time.
Further, when the first abnormal information is determined to lie in the first time, the first time is divided into n1 time periods, where n1 is an even number, and one time node is arbitrarily selected from each of the n1 time periods. The data values at the n1 time nodes are compared with the first standard data value, and if, for at least n1/2 of the n1 time nodes, the difference between the data value and the first standard data value is higher than a preset standard difference value, the sensor data values corresponding to the n1 nodes are determined to be the first abnormal information.
Further, when the first abnormal information is determined to lie in the second time, the second time is divided into n2 time periods, where n2 is an even number, and one time node is arbitrarily selected from each of the n2 time periods. The data values at the n2 time nodes are compared with the second standard data value, and if, for at least n2/2 of the n2 time nodes, the difference between the data value and the second standard data value is higher than a preset standard difference value, the sensor data values corresponding to the n2 nodes are determined to be the first abnormal information.
Further, when the first abnormal information is determined to lie in the third time, the third time is divided into n3 time periods, where n3 is an even number, and one time node is arbitrarily selected from each of the n3 time periods. The data values at the n3 time nodes are compared with the third standard data value, and if, for at least n3/2 of the n3 time nodes, the difference between the data value and the third standard data value is higher than a preset standard difference value, the sensor data values corresponding to the n3 nodes are determined to be the first abnormal information.
Further, when the first abnormal information is determined to lie in the fourth time, the fourth time is divided into n4 time periods, where n4 is an even number, and one time node is arbitrarily selected from each of the n4 time periods. The data values at the n4 time nodes are compared with the fourth standard data value, and if, for at least n4/2 of the n4 time nodes, the difference between the data value and the fourth standard data value is higher than a preset standard difference value, the sensor data values corresponding to the n4 nodes are determined to be the first abnormal information.
Further, the determining second anomaly information in the spatially distributed features comprises:
constructing a spatio-temporal cube from the sensor data set, the spatio-temporal cube comprising temporal and spatial sensor data values for the plurality of sites;
dividing the space information according to the region information, wherein the divided space comprises at least one site information;
selecting any site in any divided space and constructing Thiessen polygons for the site, wherein at least three Thiessen polygons are constructed, namely a first Thiessen polygon, a second Thiessen polygon and a third Thiessen polygon;
and determining the sensor data values of all the sites in the Thiessen polygons, calculating the G value of the site using the Grubbs test, and comparing the G value with the critical value given in the Grubbs table; if the G values in the first Thiessen polygon, the second Thiessen polygon and the third Thiessen polygon are all larger than the critical value, the sensor data value of the site is second abnormal information.
Further, the method also comprises determining whether the abnormal data of the site's sensor need to be corrected;
if the abnormal data are caused by the sensor's time in use, the abnormal data are corrected, the sensor data value being corrected according to time, altitude, terrain and monsoon;
if the abnormal data are not caused by the sensor, the abnormal data do not need to be corrected.
Further, correcting the sensor data values according to time, altitude, terrain and monsoon includes:
if the abnormal sensor data value is determined to be caused by an abnormality of the sensor, determining the time at which the site's sensor data were generated, the altitude of the site, the terrain of the site and the monsoon at the site, and selecting the corresponding parameters in the correction value matrix to correct the sensor data value.
Further, if the sensor data value is inconsistent with the time at which it was generated at the site, the sensor data are corrected using the time factor t in the correction value matrix C(t, h, f, g, s);
if the sensor data value is inconsistent with the altitude of the site, the sensor data are corrected using the altitude factor h in the correction value matrix C(t, h, f, g, s);
if the sensor data value is inconsistent with the topography of the site, the sensor data are corrected using the topography factor f in the correction value matrix C(t, h, f, g, s);
if the sensor data value is inconsistent with the terrain of the site, the sensor data are corrected using the terrain factor g in the correction value matrix C(t, h, f, g, s);
if the sensor data value is inconsistent with the monsoon at the site, the sensor data are corrected using the monsoon factor s in the correction value matrix C(t, h, f, g, s).
Further, when the sensor data are corrected, the time factor, the altitude factor, the topography and terrain factors, and the monsoon factor in the correction value matrix C(t, h, f, g, s) are superimposed on the sensor data value.
Compared with the prior art, the invention has the advantage that sites with abnormal sensor data are determined by screening the sites corresponding to the first abnormal information, found in the time distribution characteristics of the sensor data set, against the sites corresponding to the second abnormal information, found in its spatial distribution characteristics.
In particular, dividing the time narrows the range in which abnormal sensor data must be searched for, which speeds up detection, allows the corresponding sensor site to be located quickly and improves detection efficiency.
In particular, each time is cut further so that abnormal sensor data are divided more finely. The abnormal data are thus determined more accurately and the corresponding sites are located from them, which further improves the efficiency and speed of detecting those sites.
In particular, Thiessen polygons are constructed for the site, the G value (the Grubbs statistic) of the site is calculated from the data values of the sites in the Thiessen polygons using the Grubbs test, and the G value is compared with the critical value given in the Grubbs table. Whether the sensor data value of the site is second abnormal information is thereby determined quickly, which further improves the efficiency and speed of abnormal data detection.
In particular, the degree of abnormality of each site's sensor is evaluated and an early warning is issued if it is higher than the early warning value, so that the state of a site's sensor data values is apparent when site data are checked, which improves the efficiency of confirming sites with abnormal sensor information and speeds up detection.
Drawings
Fig. 1 is a flowchart of a spatio-temporal analytic model-oriented sensor information anomaly detection method according to an embodiment of the present invention.
Detailed Description
In order that the objects and advantages of the invention will be more clearly understood, the invention is further described below with reference to examples; it should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are only for explaining the technical principle of the present invention, and do not limit the scope of the present invention.
Referring to fig. 1, a method for detecting abnormal sensor information based on a spatio-temporal analysis model according to an embodiment of the present invention includes:
step S100: acquiring a sensor data set, wherein the sensor data set comprises information of a site where a sensor is located, longitude and latitude information of the sensor, time information for generating a sensor data value and the sensor data value;
step S200: constructing a time distribution characteristic of the sensor data according to the time information and the sensor data value in the sensor data set, and determining first abnormal information in the time distribution characteristic;
step S300: constructing a spatial distribution characteristic of the sensor data according to the longitude and latitude information and the sensor data value in the sensor data set, and determining second abnormal information in the spatial distribution characteristic;
step S400: screening the site information corresponding to the first abnormal information and the site information corresponding to the second abnormal information to determine abnormal sites.
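The data set acquired in step S100 can be pictured as a list of flat records. The following is a minimal sketch, assuming each reading carries one site identifier, one coordinate pair, one date and one value; the class name `SensorRecord`, the helper `group_by_site` and the sample values are hypothetical and are not taken from the patent.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SensorRecord:
    """One row of the sensor data set described in step S100."""
    site_id: str   # site where the sensor is located
    lon: float     # longitude of the sensor
    lat: float     # latitude of the sensor
    day: date      # time at which the value was generated
    value: float   # sensor data value


def group_by_site(records):
    """Group records by site and sort each site's readings chronologically."""
    by_site = defaultdict(list)
    for rec in records:
        by_site[rec.site_id].append(rec)
    return {site: sorted(rows, key=lambda r: r.day) for site, rows in by_site.items()}


# Hypothetical readings, for illustration only.
records = [
    SensorRecord("S1", 116.40, 39.90, date(2019, 9, 13), 3.0),
    SensorRecord("S2", 116.50, 39.95, date(2019, 9, 13), 1.0),
    SensorRecord("S1", 116.40, 39.90, date(2019, 9, 12), 2.0),
]
grouped = group_by_site(records)
```

Grouping by site first keeps the later temporal step (S200) and spatial step (S300) independent of how the raw records are stored.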
Specifically, the execution subject in the embodiment of the present invention is a central control unit, and the application scenario of the method is the detection of water pollution in a country. A plurality of sensors for detecting the degree of water pollution are arranged across the national territory, and the different sensors form a sensor information network that monitors the degree of water pollution in each region in real time. During detection, the water pollution information acquired by the sensors is transmitted to the central control unit, which then performs simulation processing on the acquired data to obtain the abnormal information in it. The abnormal information can be used to predict emergencies or the time at which they will occur, so that people can take protective measures in advance and the personal and property losses that emergencies bring to society are reduced.
Specifically, the method provided by the embodiment of the present invention screens the sites corresponding to the first abnormal information, found in the time distribution characteristics of the sensor data set, against the sites corresponding to the second abnormal information, found in the spatial distribution characteristics, to determine the sites with abnormal sensor data information.
Specifically, determining the first abnormal information in the time distribution characteristic includes:
constructing a spatio-temporal cube from the sensor data set, the spatio-temporal cube comprising the sensor data values of a plurality of sites in time and space;
dividing the time information into a first time, a second time, a third time and a fourth time;
presetting a first standard data value, a second standard data value, a third standard data value and a fourth standard data value in the central control unit;
if the average data value in the first time is larger than the first standard data value, the first abnormal information lies in the data of the sensors operating in the first time;
if the average data value in the second time is larger than the second standard data value, the first abnormal information lies in the data of the sensors operating in the second time;
if the average data value in the third time is larger than the third standard data value, the first abnormal information lies in the data of the sensors operating in the third time;
if the average data value in the fourth time is larger than the fourth standard data value, the first abnormal information lies in the data of the sensors operating in the fourth time.
Specifically, the time is cut into divisions; for example, the divisions may be the spring, summer, autumn and winter of a year, or other division methods may be used, which are not listed here. In the embodiment of the invention the time information is divided so that each divided time is internally consistent, and a different standard pollution degree is set for each of the first time, the second time, the third time and the fourth time. If the pollution amount in any time is higher than the standard pollution amount for that time, first abnormal information exists in that time.
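The seasonal comparison described above can be sketched as follows. This is a minimal illustration, assuming the four times are the meteorological seasons and that each has one preset standard value; the month-to-season mapping, the function name `flag_abnormal_seasons` and all numbers are hypothetical.

```python
from datetime import date
from statistics import mean

# Hypothetical mapping of calendar months to the four time divisions.
SEASON_OF_MONTH = {3: 1, 4: 1, 5: 1, 6: 2, 7: 2, 8: 2,
                   9: 3, 10: 3, 11: 3, 12: 4, 1: 4, 2: 4}


def flag_abnormal_seasons(readings, standards):
    """Return the divisions whose average value exceeds their standard value.

    readings  -- iterable of (date, value) pairs
    standards -- dict mapping division number (1..4) to its standard data value
    """
    by_season = {}
    for day, value in readings:
        by_season.setdefault(SEASON_OF_MONTH[day.month], []).append(value)
    return sorted(s for s, vals in by_season.items() if mean(vals) > standards[s])


# Illustrative data: spring averages 2.25, summer averages 4.5.
readings = [
    (date(2019, 4, 1), 2.0), (date(2019, 4, 2), 2.5),
    (date(2019, 7, 1), 4.0), (date(2019, 7, 2), 5.0),
]
standards = {1: 3.0, 2: 3.5, 3: 3.0, 4: 2.5}
abnormal = flag_abnormal_seasons(readings, standards)  # only division 2 exceeds
```

Only the flagged divisions need to be examined further, which is how the division narrows the search range.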
Specifically, when the first abnormal information is determined to lie in the first time, the first time is divided into n1 time periods, where n1 is an even number, and one time node is arbitrarily selected from each of the n1 time periods. The data values at the n1 time nodes are compared with the first standard data value, and if, for at least n1/2 of the n1 time nodes, the difference between the data value and the first standard data value is higher than a preset standard difference value, the sensor data values corresponding to the n1 nodes are determined to be the first abnormal information.
When the first abnormal information is determined to lie in the second time, the second time is cut into n2 time periods, where n2 is an even number, and one time node is arbitrarily selected from each of the n2 time periods. The data values at the n2 time nodes are compared with the second standard data value, and if, for at least n2/2 of the n2 time nodes, the difference between the data value and the second standard data value is higher than a preset standard difference value, the sensor data values corresponding to the n2 nodes are determined to be the first abnormal information.
When the first abnormal information is determined to lie in the third time, the third time is cut into n3 time periods, where n3 is an even number, and one time node is arbitrarily selected from each of the n3 time periods. The data values at the n3 time nodes are compared with the third standard data value, and if, for at least n3/2 of the n3 time nodes, the difference between the data value and the third standard data value is higher than a preset standard difference value, the sensor data values corresponding to the n3 nodes are determined to be the first abnormal information.
When the first abnormal information is determined to lie in the fourth time, the fourth time is cut into n4 time periods, where n4 is an even number, and one time node is arbitrarily selected from each of the n4 time periods. The data values at the n4 time nodes are compared with the fourth standard data value, and if, for at least n4/2 of the n4 time nodes, the difference between the data value and the fourth standard data value is higher than a preset standard difference value, the sensor data values corresponding to the n4 nodes are determined to be the first abnormal information.
Specifically, in the embodiment of the invention each time is cut further so that abnormal data values are divided more finely. The abnormal data are thus determined more accurately and the corresponding sites are located from them, which further improves the efficiency and speed of detecting the sites corresponding to the abnormal data.
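The refinement rule for a flagged time can be sketched as below. The sketch assumes that one node is drawn at random from each of the n periods and that the "difference" is the absolute difference from the standard value; both readings are interpretations of the text, and the function name `refine_abnormal` and the sample numbers are hypothetical.

```python
import random


def refine_abnormal(values, n, standard, max_diff, rng=None):
    """Sample one node per period; return the nodes if at least n/2 deviate.

    values   -- chronologically ordered data values for the flagged time
    n        -- number of periods (even, and no larger than len(values))
    standard -- standard data value for this time
    max_diff -- preset standard difference value
    Returns a list of (index, value) pairs, or [] if the rule is not met.
    """
    if n % 2 != 0 or not 0 < n <= len(values):
        raise ValueError("n must be a positive even number no larger than len(values)")
    rng = rng or random.Random()
    nodes = []
    for k in range(n):
        # Integer bounds of period k; each period holds at least one value.
        start = k * len(values) // n
        stop = (k + 1) * len(values) // n
        idx = rng.randrange(start, stop)  # arbitrary node inside period k
        nodes.append((idx, values[idx]))
    # Assumption: deviation is measured as an absolute difference.
    exceeding = sum(1 for _, v in nodes if abs(v - standard) > max_diff)
    return nodes if exceeding >= n // 2 else []


# Illustrative series: the first is uniformly high, the second is near standard.
high = refine_abnormal([5.0] * 8, n=4, standard=3.0, max_diff=1.0)
calm = refine_abnormal([3.1] * 8, n=4, standard=3.0, max_diff=1.0)
```

Passing a seeded `random.Random` as `rng` makes the arbitrary selection reproducible when results need to be audited.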
Specifically, determining the second abnormal information in the spatial distribution characteristics includes:
constructing a spatio-temporal cube from the sensor data set, the spatio-temporal cube comprising the sensor data values of a plurality of sites in time and space;
dividing the space information according to the region information, wherein each divided space comprises at least one site;
selecting any site in any divided space and constructing Thiessen polygons for the site, wherein at least three Thiessen polygons are constructed, namely a first Thiessen polygon, a second Thiessen polygon and a third Thiessen polygon;
and determining the data values of all the sites in the Thiessen polygons, calculating the G value of the site using the Grubbs test, and comparing the G value with the critical value given in the Grubbs table; if the G values in the first Thiessen polygon, the second Thiessen polygon and the third Thiessen polygon are all larger than the critical value, the sensor data value of the site is second abnormal information.
Specifically, the second abnormal information in the embodiment of the present invention is obtained by constructing Thiessen polygons for the site, calculating the G value (the Grubbs statistic) of the site from the data values of the sites in the Thiessen polygons using the Grubbs test, and comparing the G value with the critical value given in the Grubbs table. Whether the sensor data value of the site is second abnormal information is thereby determined quickly, which further improves the efficiency and speed of abnormal data detection.
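The Grubbs comparison can be sketched as follows, taking the standard form of the statistic, G = |x - mean| / s with s the sample standard deviation. The sketch does not build the Thiessen polygons: it assumes the data values of the sites in each polygon are already available as lists. The function names and the sample values are hypothetical, and the critical value shown is only illustrative; the real one should be read from a Grubbs table for the sample size and significance level in use.

```python
from statistics import mean, stdev


def grubbs_g(site_value, values):
    """Grubbs statistic G = |x - mean| / s for one site's value.

    values -- data values of all sites in the neighborhood, including site_value
    """
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        return 0.0     # all values identical: nothing stands out
    return abs(site_value - mean(values)) / s


def is_second_abnormal(site_value, neighborhoods, critical_values):
    """True only if G exceeds the critical value in every neighborhood."""
    return all(
        grubbs_g(site_value, values) > crit
        for values, crit in zip(neighborhoods, critical_values)
    )


# Hypothetical values for the sites inside three Thiessen polygons.
neighborhoods = [
    [2.0, 2.1, 1.9, 2.0, 9.0],
    [2.2, 1.8, 2.0, 2.1, 9.0],
    [1.9, 2.0, 2.0, 2.1, 9.0],
]
critical_values = [1.672, 1.672, 1.672]  # illustrative thresholds for 5 samples
flag = is_second_abnormal(9.0, neighborhoods, critical_values)
```

Requiring all three polygons to agree makes the spatial test stricter than a single Grubbs test, which matches the "all larger than the critical value" condition in the text.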
Specifically, screening the site information corresponding to the first abnormal information and the site information corresponding to the second abnormal information to determine abnormal sites includes:
an abnormal value A is set for each site, where A = K1 × a1 + K2 × a2, a1 is the first abnormal value, a2 is the second abnormal value, K1 is the first coefficient and K2 is the second coefficient. If the abnormal value A of a site is greater than or equal to the early warning value A0, the sensor data of the site are abnormal, an early warning is issued, and the sensor data of the site need to be corrected;
if the abnormal value A of a site is less than the early warning value A0, the sensor data of the site are normal and consistent with the current longitude and latitude, and no correction is needed.
Specifically, the embodiment of the invention evaluates the degree of abnormality of the sensor data at each site and issues an early warning if it is higher than the early warning value, so that the state of a site's sensor data values is apparent when site data are checked, which improves the efficiency of confirming sites with abnormal sensor information and speeds up detection.
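The screening rule A = K1 × a1 + K2 × a2 compared against A0 can be written directly. In this sketch the coefficients, the early warning value and the choice of 0/1 abnormal values are hypothetical; the text does not state how a1, a2, K1, K2 or A0 are chosen.

```python
def anomaly_score(a1, a2, k1, k2):
    """Weighted abnormal value A = K1 * a1 + K2 * a2 for one site."""
    return k1 * a1 + k2 * a2


def needs_warning(a1, a2, k1, k2, a0):
    """True if A reaches the early warning value A0 (A >= A0)."""
    return anomaly_score(a1, a2, k1, k2) >= a0


# Hypothetical setting: temporal evidence weighted 0.6, spatial 0.4, A0 = 0.5.
temporal_only = needs_warning(a1=1.0, a2=0.0, k1=0.6, k2=0.4, a0=0.5)
spatial_only = needs_warning(a1=0.0, a2=1.0, k1=0.6, k2=0.4, a0=0.5)
```

With these illustrative weights a site flagged only in time triggers a warning while a site flagged only in space does not, which shows how K1 and K2 control the balance between the two kinds of evidence.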
The embodiment of the invention uses wavelet analysis to detect abnormal sampling points in the time dimension and determines the partition map of the sampling points from prior knowledge. It uses a Voronoi diagram to determine the spatial neighborhood of the object under study and the Grubbs statistical test to identify outliers, thereby locating abnormal sites in the water pollution monitoring station network of the study area. The embodiment avoids manual parameter input in spatio-temporal anomaly detection, reduces the influence of user experience on the detection result, ensures that local and global abnormal data are detected correctly, and improves the detection of different types of abnormal data.
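The text does not say which wavelet or which decision rule is used in the time dimension. As one possible illustration only, the sketch below computes single-level Haar detail coefficients and flags pairs of samples whose coefficient is large relative to the median magnitude; the Haar choice, the threshold multiplier `k` and the function names are all assumptions.

```python
import math
from statistics import median


def haar_detail(series):
    """Single-level Haar detail coefficients for consecutive sample pairs."""
    return [(series[i] - series[i + 1]) / math.sqrt(2)
            for i in range(0, len(series) - 1, 2)]


def abnormal_samples(series, k=3.0):
    """Start index of each sample pair whose detail coefficient is unusually large.

    A coefficient is flagged when its magnitude exceeds k times the median
    magnitude of all coefficients (an assumed, illustrative rule).
    """
    detail = haar_detail(series)
    scale = median(abs(c) for c in detail)
    return [2 * i for i, c in enumerate(detail) if abs(c) > k * scale]


# Illustrative series with one spike at index 5.
series = [2.0, 2.1, 2.0, 1.9, 2.0, 6.0, 2.1, 2.0]
flagged = abnormal_samples(series)
```

Because a single Haar level only compares non-overlapping pairs, a change that falls exactly between two pairs is not seen; a fuller treatment would use several decomposition levels or a shifted transform.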
Specifically, after step S400, it is determined whether the abnormal sensor data of the site need to be corrected;
if the abnormality is caused by the sensor's time in use, the abnormal data are corrected, the water pollution degree data being corrected according to time, altitude, terrain and monsoon;
if the abnormal data are not caused by an abnormality of the sensor, they do not need to be corrected and are instead used to predict extreme pollution that may occur in the future, so that people can take protective measures in advance based on the prediction and losses to people or property are reduced.
Specifically, the embodiment of the present invention determines the cause of the sensor data abnormality. If the abnormality is caused by the sensor, the acquired sensor data are corrected to improve their accuracy.
Specifically, correcting the sensor data according to time, altitude, topography, terrain, and monsoon includes:
a correction value matrix C (t, h, f, g, s) is arranged in the central control unit, wherein t represents a time factor, h represents an altitude factor, f represents a terrain factor, g represents a terrain factor, and s represents a monsoon factor, and if the sensor data is determined to be caused by the abnormality of the sensor, the time for generating the sensor data of the site, the altitude of the site, the terrain of the site and the monsoon of the site are determined, and corresponding parameters in the correction matrix are also selected to correct the sensor data;
if the sensor data does not accord with the time generated by the sensor data of the station, correcting the sensor data by using a time factor t in a correction value matrix C (t, h, f, g, s);
if the sensor data does not accord with the station altitude, correcting the sensor data by using an altitude factor h in a correction value matrix C (t, h, f, g, s);
if the sensor data does not accord with the terrain of the station, correcting the sensor data by using a terrain factor f in a correction value matrix C (t, h, f, g, s);
if the sensor data does not conform to the terrain of the station, correcting the sensor data by using a terrain factor g in a correction value matrix C (t, h, f, g, s);
if the sensor data does not match the site's monsoon, the sensor data is corrected using monsoon factor s in correction value matrix C (t, h, f, g, s).
Specifically, when a water pollution degree value is obtained from the sensor information, the sensor data value can be influenced by time, altitude, terrain, and monsoon, and these influences may occur simultaneously or separately. Providing the correction value matrix C (t, h, f, g, s) makes the correction of the sensor data value more accurate and improves the accuracy of the sensor data.
Specifically, when the sensor data is corrected, the time factor, the altitude factor, the terrain factor, and the monsoon factor in the correction value matrix C (t, h, f, g, s) are superimposed on the sensor data value, so that the corrected sensor data is closer to the true sensor data value.
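As a rough illustration of how the factors of C (t, h, f, g, s) might be superimposed on a reading, the following sketch assumes that each factor is an additive offset and that only the factors found to be inconsistent with the site are applied. The additive form, the `CorrectionFactors` class, the `correct_reading` function, and all numeric values are assumptions made for illustration; the description does not specify them.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CorrectionFactors:
    """Hypothetical additive offsets, one per factor of C (t, h, f, g, s)."""
    t: float = 0.0  # time factor
    h: float = 0.0  # altitude factor
    f: float = 0.0  # topography factor
    g: float = 0.0  # terrain factor
    s: float = 0.0  # monsoon factor


def correct_reading(value, factors, mismatched):
    """Superimpose the offset of every mismatched factor onto the raw value."""
    offsets = asdict(factors)
    unknown = set(mismatched) - set(offsets)
    if unknown:
        raise ValueError(f"unknown factor(s): {sorted(unknown)}")
    return value + sum(offsets[name] for name in set(mismatched))


# Example: the reading disagrees with the expected time and monsoon conditions only.
factors = CorrectionFactors(t=-0.2, h=0.1, s=-0.3)
corrected = correct_reading(3.0, factors, {"t", "s"})
```

Because the influences may occur together or separately, the set of mismatched factors is passed explicitly, and factors that agree with the site conditions leave the reading untouched.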
The sensor information in the embodiment of the present invention is water pollution information acquired by the sensor. It may be a pollution degree, the content of a representative metal in the polluted water, or another characteristic quantity of the water pollution information, which is not described further here.
Specifically, the method for detecting the sensor information abnormity facing to the space-time analysis model comprises the following steps:
obtaining a long-time-series sensor information data set (daily water pollution records covering nearly ten years) for the full station network in the research area (the Beijing water pollution monitoring station network, at city level and district level, 233 stations in total), and storing the data set in the distributed MapFile format of Hadoop (more than 850,000 records are stored in Hadoop to facilitate later access and computation);
the file contains the code ID of the sensor (used to distinguish different sites), the spatial variables (longitude coordinate X and latitude coordinate Y), the time variable value D (year, month, and day, such as 2000-01-01), and the attribute variable value Z (the water pollution degree; for example, if the water pollution degree of river A on 2019-09-13 is medium, Z is assigned 3), and the file is in the distributed MapFile format of Hadoop;
constructing a space-time cube of the water pollution degree data and extracting its space-time characteristics (the space-time distribution of ten years of pollution degree in the research area is presented at once and visually; aggregating the pollution degree space-time data then compresses the storage space of the data set and provides data support for visualizing its space-time distribution);
according to hydrological partition characteristics, selecting pollution influence factors and partitioning the research area (the aim of partitioning is to minimize the differences among sample points within a layer and maximize the differences between layers; considering the pollution distribution characteristics, the partition is based on the main variables that influence the pollution distribution, so that sample points in the same area follow similar pollution patterns and are comparable; for example, the pollution distribution characteristics of the research area are found through the space-time cube, and if they resemble the topographic distribution, a topographic distribution map is selected as the partition basis; sample points in the same area are compared with one another, and no comparison is performed across different areas);
constructing a space-time cube of the water pollution data, and mining the space-time distribution pattern of water pollution in the research area;
and selecting the sample point information of a specific time point or a key point of interest, and drawing the slice.
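The record layout (ID, X, Y, D, Z), the aggregation into a space-time cube, and the slicing step described above can be sketched as follows. This is a minimal in-memory illustration only: the `Record` class, the `build_cube` and `time_slice` functions, the 0.1-degree cell size, and the monthly time bins are assumptions, and the sketch does not read the Hadoop MapFile format.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass(frozen=True)
class Record:
    sensor_id: str  # code ID of the sensor, distinguishes sites
    x: float        # longitude coordinate X
    y: float        # latitude coordinate Y
    d: date         # time variable D
    z: float        # attribute variable Z (pollution degree, e.g. 3 for medium)


def build_cube(records, cell_deg=0.1):
    """Aggregate records into (x_bin, y_bin, year, month) cells holding the mean Z."""
    cells = defaultdict(list)
    for r in records:
        key = (int(r.x // cell_deg), int(r.y // cell_deg), r.d.year, r.d.month)
        cells[key].append(r.z)
    return {key: mean(values) for key, values in cells.items()}


def time_slice(cube, year, month):
    """Return the spatial slice {(x_bin, y_bin): mean Z} for one month."""
    return {
        (xb, yb): z
        for (xb, yb, y, m), z in cube.items()
        if (y, m) == (year, month)
    }


# Made-up sample records; station IDs and coordinates are illustrative.
records = [
    Record("S001", 116.43, 39.92, date(2019, 9, 13), 3.0),
    Record("S002", 116.45, 39.95, date(2019, 9, 14), 1.0),
    Record("S001", 116.43, 39.92, date(2019, 10, 1), 2.0),
]
cube = build_cube(records)
september = time_slice(cube, 2019, 9)
```

Aggregating daily values into cells is what compresses the data set: the cube stores one mean per cell and month instead of every daily record.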
Temporal anomaly detection. Wavelet analysis is applied to the long-time-series water pollution data of the stations in the research area; the stations or areas to be detected can be set as required.
To handle complex hydrological characteristics of water pollution such as periodicity and non-stationarity, abnormal sampling points in the time dimension are detected using wavelet analysis: an original time-series signal is established with time on the horizontal axis and water pollution amount on the vertical axis and is passed to the wavelet tool of MATLAB; the daily water pollution amount is analyzed with the db4 wavelet; the periodic pattern of water pollution is obtained from the low-frequency information, and abnormal water pollution signals are obtained from the high-frequency information.
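The split into low-frequency and high-frequency information can be illustrated with a much simpler transform than the one named above. The sketch below uses a single-level Haar transform in plain Python as a stand-in for the db4 analysis in MATLAB; it is not the db4 wavelet. The function names, the threshold multiplier `k`, and the sample series are assumptions made for illustration.

```python
from math import sqrt
from statistics import median


def haar_dwt(signal):
    """Single-level Haar transform: returns (approximation, detail) coefficients."""
    values = list(signal)
    if len(values) % 2:
        values.append(values[-1])  # pad odd-length input by repeating the last value
    pairs = list(zip(values[0::2], values[1::2]))
    approx = [(a + b) / sqrt(2) for a, b in pairs]   # low-frequency part (trend)
    detail = [(a - b) / sqrt(2) for a, b in pairs]   # high-frequency part (abrupt change)
    return approx, detail


def abnormal_pairs(signal, k=5.0):
    """Index pairs whose detail coefficient is large relative to a robust noise scale."""
    _, detail = haar_dwt(signal)
    mad = median(abs(d) for d in detail)
    scale = mad / 0.6745 if mad else 1e-12  # median-based noise estimate
    return [(2 * i, 2 * i + 1) for i, d in enumerate(detail) if abs(d) > k * scale]


# Made-up daily series with one spike at index 6.
daily = [2.0, 2.1, 2.0, 1.9, 2.1, 2.0, 9.0, 2.0, 2.0, 2.1, 1.9, 2.0]
flagged = abnormal_pairs(daily)
```

A single Haar level only compares each day with its pair partner, so it misses anomalies that span both days of a pair; a multi-level db4 decomposition, as used in the description, is better suited to real data.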
Spatial anomaly detection. The spatial neighborhood of the research object (a given sampling point) is determined using a Voronoi diagram, and the water pollution amount is calculated separately in each time dimension of interest (daily, monthly, and yearly abnormal sensor data).
Selecting any sample point in each area data set as an experimental sample point, drawing the Voronoi diagram of the experimental sample point, and forming a correspondence table (the sampling point data set file is converted, according to the longitude and latitude coordinates of the sample points, into a sampling point spatial data set in the industry-standard Shapefile format, and the longitude and latitude data set is projected into a Gaussian projection coordinate data set);
and judging abnormal values using the Grubbs statistical test, and locating the outlier stations in the water pollution monitoring station network in the research area.
Selecting any sample point as the sample point to be detected, collecting the water pollution data, in the corresponding time dimension, of all neighborhood sampling points in the sampling point data set other than the sample point to be detected, calculating the G value of the sample point to be detected using the Grubbs test, comparing the G value with the critical value given in the Grubbs table, selecting the sample points that are flagged three or more times (that is, determined to be abnormal in three or more neighborhoods), and further judging the abnormal condition of those sample points.
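The neighborhood test above can be sketched as follows, assuming the usual Grubbs statistic G = |x - mean| / s computed over the candidate together with its neighbors (the description does not state whether the candidate is included in the mean and standard deviation). The function names and the `critical_by_n` lookup are hypothetical, and the threshold in the example is an illustrative placeholder; real critical values should be taken from a Grubbs table for the chosen significance level.

```python
from statistics import mean, stdev


def grubbs_g(candidate, neighbours):
    """Grubbs statistic of the candidate within one neighbourhood."""
    sample = [candidate, *neighbours]
    spread = stdev(sample)
    return abs(candidate - mean(sample)) / spread if spread else 0.0


def is_spatial_outlier(candidate, neighbourhoods, critical_by_n, min_hits=3):
    """Flag the candidate if it is abnormal in at least `min_hits` neighbourhoods."""
    hits = sum(
        grubbs_g(candidate, nb) > critical_by_n[len(nb) + 1]
        for nb in neighbourhoods
    )
    return hits >= min_hits


# Placeholder threshold for a sample of size 5; look up the real value in a Grubbs table.
critical_by_n = {5: 1.7}
neighbourhoods = [[3.0, 3.0, 4.0, 3.0], [3.0, 4.0, 3.0, 3.0], [4.0, 3.0, 3.0, 3.0]]
suspicious = is_spatial_outlier(9.0, neighbourhoods, critical_by_n)
ordinary = is_spatial_outlier(3.0, neighbourhoods, critical_by_n)
```

Requiring agreement across three or more neighborhoods reduces the chance that a single unusual neighborhood causes a station to be flagged.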
Considering the two dimensions of time and space together, abnormalities in any time period in the research area are detected.
Selecting any sample point as the sample point to be detected, collecting the water pollution data, in the corresponding time dimension, of all neighborhood sampling points in the sampling point data set other than the sample point to be detected, calculating the G value of the sample point to be detected using the Grubbs test, comparing the G value with the critical value given in the Grubbs table, selecting the sample point, and further judging the abnormal condition of the sample point.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention; various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1. A sensor information anomaly detection method for a space-time analysis model is characterized by comprising the following steps:
acquiring a sensor data set, wherein the sensor data set comprises information of a site where a sensor is located, longitude and latitude information of the sensor, time information for generating a sensor data value and the sensor data value;
constructing a time distribution characteristic of the sensor data according to the time information and the sensor data value in the sensor data set, and determining first abnormal information in the time distribution characteristic;
constructing a spatial distribution characteristic of the sensor data according to the longitude and latitude information and the sensor data value in the sensor data set, and determining second abnormal information in the spatial distribution characteristic;
the determining second anomaly information in the spatially distributed features comprises:
constructing a spatio-temporal cube from the sensor data set, the spatio-temporal cube comprising temporal and spatial sensor data values for the plurality of sites;
dividing the space information according to the region information, wherein the divided space comprises at least one site information;
selecting any site in any space, constructing a Thiessen polygon of the site, and constructing at least three Thiessen polygons which are a first Thiessen polygon, a second Thiessen polygon and a third Thiessen polygon respectively when constructing the Thiessen polygon;
determining sensor data values of all sites in the Thiessen polygons, calculating G values of the sites by utilizing the Grubbs test, comparing the G values with a critical value given in a Grubbs table, and if the G values in the first Thiessen polygon, the second Thiessen polygon and the third Thiessen polygon are all larger than the critical value, indicating that the sensor data values of the sites are second abnormal information;
screening the site information corresponding to the first abnormal information and the site information corresponding to the second abnormal information to determine abnormal sites;
the determining first anomaly information in the time distribution features comprises:
constructing a spatio-temporal cube from the sensor data set, the spatio-temporal cube comprising temporal and spatial sensor data values for the plurality of sites;
dividing the time information into a first time, a second time, a third time and a fourth time;
a first standard data value, a second standard data value, a third standard data value and a fourth standard data value are arranged in the central control unit;
if the average data value in the first time is larger than the first standard data value, the first abnormal information is located in the sensor working in the first time;
if the average data value in the second time is larger than the second standard data value, the first abnormal information is located in the sensor working in the second time;
if the average data value in the third time is larger than the third standard data value, the first abnormal information is located in the sensor working in the third time;
if the average data value in the fourth time is larger than the fourth standard data value, the first abnormal information is located in the sensor working in the fourth time;
screening the site information corresponding to the first abnormal information and the site information corresponding to the second abnormal information, and determining an abnormal site includes:
the method comprises the steps that an abnormal value A is set for each station, the abnormal value A = K1 × a1+ K2 × a2, wherein a1 is a first abnormal value, a2 is a second abnormal value, K1 is a first coefficient, K2 is a second coefficient, and an early warning value A0 is set in a central control unit;
and if the abnormal value A of the data value of the station is less than the early warning value A0, the data value acquired by the sensor of the station is normal and accords with the normal sensor data value of the current longitude and latitude and the current time.
2. The method for detecting sensor information abnormality based on spatio-temporal analytic model of claim 1, wherein when it is determined that the first abnormality information is within the first time, the first time is divided into n1 time periods, n1 is an even number, then a time node is arbitrarily selected from n1 time periods, the difference between the data values of n1 time nodes and the first standard data value is compared, and if the difference between the data value of at least n1/2 time nodes and the first standard data value is higher than a preset standard difference among n1 time nodes, the sensor data value corresponding to n1 nodes is determined as the first abnormality information.
3. The method for detecting sensor information abnormality based on spatio-temporal analytic model of claim 2, wherein when the first abnormal information is determined to be within the second time, the second time is divided into n2 time periods, n2 is an even number, then a time node is arbitrarily selected from n2 time periods, the difference between the data value of n2 time nodes and the second standard data value is compared, and if the difference between the data value of at least n2/2 time nodes and the second standard data value is higher than a preset standard difference in the n2 time nodes, the sensor data value corresponding to n2 nodes is determined as the first abnormal information.
4. The method for detecting sensor information abnormality based on spatio-temporal analytic model of claim 3, wherein when the first abnormal information is determined to be within the third time, the third time is divided into n3 time periods, n3 is an even number, then a time node is arbitrarily selected from n3 time periods, the difference between the data values of n3 time nodes and the third standard data value is compared, and if the difference between the data value of at least n3/2 time nodes and the third standard data value is higher than a preset standard difference in the n3 time nodes, the sensor data value corresponding to n3 nodes is determined as the first abnormal information.
5. The method for detecting sensor information abnormality based on spatio-temporal analytic model of claim 4, wherein when the first abnormality information is determined to be within the fourth time, the fourth time is divided into n4 time periods, n4 is an even number, then a time node is arbitrarily selected from n4 time periods, the difference between the data values of n4 time nodes and the fourth standard data value is compared, and if the difference between the data value of at least n4/2 time nodes and the fourth standard data value is higher than a preset standard difference among the n4 time nodes, the sensor data value corresponding to n4 nodes is determined as the first abnormality information.
6. The spatio-temporal analytic model-oriented sensor information anomaly detection method according to claim 1,
determining whether abnormal data of the sensor of the station needs to be corrected;
correcting the abnormal data if the abnormal data is caused by the using time of the sensor information, and correcting the sensor data value according to time, altitude, terrain and monsoon when the abnormal data is corrected;
if the abnormal data does not belong to the abnormal data caused by the sensor information, the abnormal data does not need to be corrected.
7. The spatio-temporal analytic model-oriented sensor information anomaly detection method according to claim 6,
modifying the sensor data values as a function of time, altitude, terrain, and monsoon includes:
and if the sensor data value is determined to be caused by the abnormality of the sensor information, determining the time of generation of the sensor data of the site, the altitude of the site, the terrain of the site and the monsoon of the site, and selecting corresponding parameters in the correction matrix to correct the sensor data value.
8. The method for detecting sensor information abnormality for a spatio-temporal analytic model according to any of claims 1 to 7,
if the sensor data value does not accord with the time generated by the sensor data value of the station, correcting the sensor data by using a time factor t in a correction value matrix C (t, h, f, g, s);
if the sensor data value does not accord with the station altitude, correcting the sensor data by using an altitude factor h in a correction value matrix C (t, h, f, g, s);
if the sensor data value does not accord with the topography of the station, correcting the sensor data by using a topography factor f in a correction value matrix C (t, h, f, g, s);
if the sensor data value does not accord with the terrain of the station, correcting the sensor data by using a terrain factor g in a correction value matrix C (t, h, f, g, s);
if the sensor data value does not match the site's monsoon, the sensor data is corrected using monsoon factor s in correction value matrix C (t, h, f, g, s).
9. The spatio-temporal analytic model-oriented sensor information anomaly detection method according to claim 8,
in correcting the sensor data, the sensor data values are superimposed by the time factor, the altitude factor, the terrain factor, and the monsoon factor in the correction value matrix C (t, h, f, g, s).
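For illustration only, the threshold logic recited in claims 1 to 5 can be sketched as follows. The function names and example values are hypothetical, and the use of an absolute difference in the node check is an assumption; the claims do not define how the first and second abnormal values a1 and a2 or the coefficients K1 and K2 are obtained.

```python
def period_is_suspect(values, standard):
    """Claim 1: a time period is suspect when its average exceeds its standard value."""
    return sum(values) / len(values) > standard


def confirm_by_nodes(node_values, standard, max_diff):
    """Claims 2 to 5: confirm when at least half of the n sampled nodes deviate too far."""
    n = len(node_values)
    if n == 0 or n % 2:
        raise ValueError("n must be a positive even number")
    far = sum(abs(v - standard) > max_diff for v in node_values)
    return far >= n // 2


def site_is_normal(a1, a2, k1, k2, a0):
    """Claim 1: composite abnormal value A = K1*a1 + K2*a2 compared with warning value A0."""
    return k1 * a1 + k2 * a2 < a0


# Made-up values for one time period and one site.
suspect = period_is_suspect([4.0, 5.0, 6.0], standard=4.5)
confirmed = confirm_by_nodes([5.0, 6.0, 4.6, 4.4], standard=4.5, max_diff=0.4)
normal = site_is_normal(a1=0.2, a2=0.1, k1=0.6, k2=0.4, a0=0.5)
```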
CN202110708811.7A 2021-06-25 2021-06-25 Sensor information anomaly detection method facing space-time analysis model Active CN113255593B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110708811.7A CN113255593B (en) 2021-06-25 2021-06-25 Sensor information anomaly detection method facing space-time analysis model

Publications (2)

Publication Number Publication Date
CN113255593A CN113255593A (en) 2021-08-13
CN113255593B true CN113255593B (en) 2021-09-24

Family

ID=77189780

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110708811.7A Active CN113255593B (en) 2021-06-25 2021-06-25 Sensor information anomaly detection method facing space-time analysis model

Country Status (1)

Country Link
CN (1) CN113255593B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114236645B (en) * 2021-11-26 2022-07-26 中国水利水电科学研究院 Large-scale rainfall monitoring abnormal site screening method

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103179602A (en) * 2013-03-15 2013-06-26 无锡清华信息科学与技术国家实验室物联网技术中心 Method and device for detecting abnormal data of wireless sensor network
CN108770079A (en) * 2018-05-31 2018-11-06 深圳汇通智能化科技有限公司 A kind of water environment monitoring system based on underwater robot
CN110830946A (en) * 2019-11-15 2020-02-21 江南大学 Mixed type online data anomaly detection method
CN110895526A (en) * 2019-11-29 2020-03-20 南京信息工程大学 Method for correcting data abnormity in atmosphere monitoring system
CN110992690A (en) * 2019-11-29 2020-04-10 中原工学院 False data detection method based on space-time outliers in Internet of vehicles
CN111858712A (en) * 2020-07-20 2020-10-30 上海仪电(集团)有限公司中央研究院 In-situ water quality inspection data time-space analysis and anomaly detection method and system
CN112597691A (en) * 2020-09-01 2021-04-02 新天绿色能源股份有限公司 LightGBM algorithm-based fault early warning method for wind turbine generator variable pitch motor temperature sensor

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107205244A (en) * 2016-03-18 2017-09-26 哈尔滨工业大学(威海) A kind of design method of the sensor network anomaly data detection based on temporal correlation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A DATA MINING APPROACH FOR MULTIVARIATE OUTLIER DETECTION IN;M. Bakon等;《IGARSS 2016》;20161231;全文 *
基于Voronoi和空间自相关的离群点检测;王妍等;《计算机工程》;20100131;全文 *

Also Published As

Publication number Publication date
CN113255593A (en) 2021-08-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 100036 Beijing city Haidian District Cuiwei Road No. 3

Patentee after: Beijing Smart Water Development Research Institute

Address before: 100036 Beijing city Haidian District Cuiwei Road No. 3

Patentee before: Beijing Water Conservancy Automation Inst.