CN114816825B - Internet of things gateway data error correction method

Info

Publication number: CN114816825B
Application number: CN202210717724.2A
Authority: CN
Inventors: 蔡黔江; 严可达; 许大为; 侯金彪; 占浩; 刘强; 涂杰
Original assignee: Optical Valley Technology Co ltd
Current assignee: Optical Valley Technology Co ltd
Priority date: 2022-06-23
Filing date: 2022-06-23
Publication date: 2022-09-09
Anticipated expiration: 2042-06-23
Also published as: CN114816825A

Abstract

The invention relates to an error correction method for Internet of Things gateway data, belonging to the technical field of big data analysis. The method comprises the following steps: acquiring gateway sample data; dividing the sample data into a plurality of equal-length optimal time series units according to an optimal length value; calculating the autocorrelation and the normality of each optimal time series unit; determining the attention of each optimal time series unit from its autocorrelation and its normality; training a single-class support vector machine (OCSVM) algorithm classifier using the attention of each optimal time series unit; and correcting errors in the Internet of Things gateway data using the trained classifier. Because the attention of each optimal time series unit controls how strongly that unit influences the classifier during training, the accuracy of the classifier is improved.

Description

Error correction method for gateway data of Internet of things

Technical Field

The invention belongs to the technical field of big data analysis, and particularly relates to an error correction method for gateway data of an Internet of things.

Background

With the expansion of the application of the internet of things in actual life and production, the characteristic of taking data as the center is increasingly prominent. Whether the internet of things can be widely applied depends on extraction of useful information in gateway data to a certain extent, namely mining of the gateway data, and the data quality directly determines the extraction efficiency of the useful information and the correctness of final decision of the internet of things, so that the function realization and the user experience of an application scene are influenced. In order to be able to extract useful information in gateway data efficiently, the quality of the data needs to be improved.

In the scene of the internet of things, abnormal data can be generated due to factors such as unstable sensor performance, data transmission network faults, interference and damage caused by human or natural environments and the like, so that the data quality is rapidly reduced, and therefore, the identification of the abnormal data in the gateway data of the internet of things is particularly important.

The single-class support vector machine (OCSVM) algorithm is an algorithm for detecting abnormal data that can build a detection classifier from normal data alone. However, when the classifier is trained, samples in the sample data that may in fact be abnormal can interfere with the classifier's learning of the characteristics of normal data, so that the accuracy of the classifier in detecting abnormal data is low.

Disclosure of Invention

The invention provides an error correction method for gateway data of the Internet of things, and aims to solve the problem that, when a single-class support vector machine algorithm classifier is currently trained, samples in the sample data that may belong to abnormal data can interfere with the classifier's learning of the characteristics of normal data, so that the accuracy of the classifier in detecting abnormal data is low.

The invention discloses an error correction method for gateway data of the Internet of things, which adopts the following technical scheme: the method comprises the following steps:

acquiring single type sample data of a gateway;

dividing the sample data according to any length numerical value in a preset time length range to obtain a plurality of time sequence units with equal length, and forming time sequence data corresponding to the length numerical value by the plurality of time sequence units with equal length;

acquiring time series data corresponding to each length value within a preset time length range, fitting the time series data corresponding to each length value, determining an optimal length value according to a fitting result, and dividing the sample data into a plurality of optimal time series units with equal length according to the optimal length value;

calculating the autocorrelation of each optimal time-series unit;

converting all the obtained optimal time sequence units into a multidimensional space, wherein the dimension of the multidimensional space is equal to the optimal length value;

determining a neighboring data set of each of the optimal time series units within the multi-dimensional space centered on each of the optimal time series units and a radius of a numerical value determined from the sample data;

determining the normality of each optimal time sequence unit according to each optimal time sequence unit and the adjacent data set corresponding to each optimal time sequence unit;

determining the attention degree of each optimal time sequence unit according to the autocorrelation of each optimal time sequence unit and the normal degree of each optimal time sequence unit;

and training a single-class support vector machine algorithm classifier by using the attention of each optimal time sequence unit, and correcting the error of the gateway data of the Internet of things by using the trained classifier.

Further, the fitting the time series data corresponding to each length value and determining an optimal length value according to a fitting result includes:

fitting the time sequence data corresponding to each length value to obtain a fitting result corresponding to each length value;

when the fitting result corresponding to any length value is larger than the threshold value determined by the length value, marking the fitting result corresponding to the length value to obtain a marked fitting result;

judging the fitting result corresponding to each length value to obtain all the post-labeling fitting results, and selecting the maximum post-labeling fitting result from all the post-labeling fitting results;

and taking the length value corresponding to the maximum value of the marked fitting result as an optimal length value.

Further, said calculating an autocorrelation of each of said optimal time series units comprises:

respectively fitting the sample data contained in each optimal time sequence unit by using a least square method to obtain the autocorrelation of each optimal time sequence unit;

the autocorrelation of each optimal time series unit is calculated by the following formula:

[formula not reproduced in this text]

wherein the terms of the formula denote, respectively: the autocorrelation of a given optimal time series unit; the total number of the sample data within that optimal time series unit; the true value of each sample datum in that optimal time series unit; and the predicted value of the same sample datum, obtained from the linear equation fitted according to the least squares method.

Further, said determining a neighboring data set of each said optimal time series unit within said multidimensional space centered at each said optimal time series unit and having a radius of a numerical value determined from said sample data comprises:

converting all the obtained optimal time sequence units into the multidimensional space to obtain a plurality of multidimensional coordinate points;

selecting any one of the optimal time sequence units to be recorded as a first optimal time sequence unit;

selecting the maximum value of the sample data and the minimum value of the sample data in the sample data, and calculating the difference value between the maximum value of the sample data and the minimum value of the sample data;

establishing a first multi-dimensional space geometry in the multi-dimensional space with the first optimal time series unit as a center and the difference value as a radius;

taking all the multi-dimensional coordinate points contained in the first multi-dimensional space geometrical body as an adjacent data set of the first optimal time sequence unit in the multi-dimensional space, and simultaneously calculating the density and the density center of the adjacent data set;

according to the determination method of the adjacent data sets of any optimal time sequence unit in the multidimensional space, the adjacent data sets of each optimal time sequence unit in the multidimensional space are determined, and the density center of each adjacent data set are calculated at the same time.

Further, the determining the degree of normality of each optimal time-series unit according to each optimal time-series unit and the adjacent data set corresponding to each optimal time-series unit includes:

calculating the distance between each optimal time sequence unit and the density center of the adjacent data set corresponding to each optimal time sequence unit;

and determining the normality degree of each optimal time sequence unit according to each distance and the density of the adjacent data sets corresponding to each distance.

Further, the normality of each of the optimal time series units is calculated by the following formula:

[formula not reproduced in this text]

wherein the terms of the formula denote, respectively: the normality of a given optimal time series unit; the total number of the sample data in that optimal time series unit, which is also the dimension of the multidimensional space; the data of that optimal time series unit in a given dimension; the data of the density center of its adjacent data set in the same dimension; the distance between the optimal time series unit and the density center of its adjacent data set; the data of a given element of its adjacent data set in the same dimension; the total number of data contained in its adjacent data set; and the density of its adjacent data set.

Further, the attention of each of the optimal time series units is calculated by the following formula:

[formula not reproduced in this text]

wherein the terms of the formula denote, respectively: the attention of a given optimal time series unit; the normality of that optimal time series unit; and the autocorrelation of that optimal time series unit;

the formula also contains a judgment function defined by two cases, each of which assigns the function a value when its condition holds [conditions and values not reproduced in this text].

Further, the training of the single-class support vector machine algorithm classifier by using the attention of each optimal time series unit comprises:

introducing the attention of each optimal time series unit into the optimization objective function of the OCSVM algorithm to obtain the decision function of the single-class support vector machine algorithm classifier;

and training a single-class support vector machine algorithm classifier by using the decision function.

The invention has the beneficial effects that:

the OCSVM is a single classification algorithm which can construct an abnormal data classifier only by normal data, but when the classifier is trained, samples possibly belonging to the abnormal data in sample data influence the classifier to learn the characteristics of the normal data, so that the accuracy of the classifier in detecting the abnormal data is low. If the influence of the abnormal samples on the classifier is reduced, the classifier can better learn the characteristics of normal data, and the accuracy of detecting abnormal data by the classifier is improved.

For a small-scale Internet of Things application scenario adopting a heterogeneous deployment strategy, the gateway data have the following characteristics: 1) the gateway data are closely connected and time-correlated; they remain relatively stable over a certain period, do not change sharply, and adjacent gateway data are the most strongly related. 2) The Internet of Things gateway collects data continuously in a specific mode, so the gateway data exist as a data stream over time and are therefore dynamic. Based on these characteristics, the gateway data need to be converted into time series units of a certain length for anomaly detection, and the anomaly determination depends on two aspects: 1) the correlation of each time series unit itself: the data within a unit should follow the same trend, so data that deviate from that trend are possibly abnormal; 2) the normality of each time series unit: because normal data and abnormal data are formed by different mechanisms, abnormal data lie far from normal data, so the farther a unit is from the cluster center, the more likely it is to be abnormal.

Therefore, the invention provides an error correction method for gateway data of the Internet of things, which is improved based on a single-type support vector machine algorithm. The method comprises the steps of converting gateway sample data into time sequence units with a certain length, determining the attention degree of each time sequence unit according to the autocorrelation of each time sequence unit and the normality characteristic of each time sequence unit, controlling the influence of each time sequence unit on a single-class support vector machine algorithm classifier in the training process according to the attention degree, and improving the accuracy of the classifier.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.

Fig. 1 is a schematic flowchart illustrating the general steps of an internet-of-things gateway data error correction method according to an embodiment of the present invention;

Fig. 2 is a flowchart illustrating step S6 of the internet of things gateway data error correction method according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.

As shown in fig. 1, an embodiment of a method for correcting error in gateway data of the internet of things includes:

S1, acquiring single-type sample data of the gateway.

The invention constructs a classifier to detect abnormal data through a single-class support vector machine algorithm. To ensure the accuracy of the classifier, the sample data must be of good quality. Therefore, second-level data (data recorded once per second) collected while the Internet of Things is operating stably are used as sample data to train the classifier, and the trained classifier is used to detect abnormal data of the Internet of Things gateway. The invention acquires sample data of a single type from the gateway. For example, in an Internet of Things application scenario, if the sensor is a temperature sensor, gateway temperature sample data are acquired; if the sensor is a pressure sensor, gateway pressure sample data are acquired.

S2, dividing the sample data according to any length value in a preset time length range to obtain a plurality of time series units of equal length, and forming the time series data corresponding to that length value from the plurality of equal-length time series units.

Because the gateway sample data are closely connected and time-correlated, remain relatively stable over a certain period, and do not change rapidly, anomaly detection needs to check that the gateway data follow a consistent trend over a period of time; the gateway data therefore need to be converted into time series data. When converting gateway data into time series data, the length of the time series units must be given, and the length must ensure that the autocorrelation of each time series unit is large enough.

The preset time length range in the invention is [range not reproduced in this text]. The sample data are divided according to any length value within this range to obtain a plurality of time series units of equal length, and these equal-length time series units form the time series data corresponding to that length value. For example: if the length value is 5 s, the sample data are divided according to the length value of 5 s to obtain a plurality of time series units of length 5 s, and all the time series units of length 5 s form the time series data corresponding to the length value of 5 s. If the length value is 10 s, the sample data are divided according to the length value of 10 s to obtain a plurality of time series units of length 10 s, and all the time series units of length 10 s form the time series data corresponding to the length value of 10 s. Similarly, time series data corresponding to each length value in the preset time length range are obtained.
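The division in step S2 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: it assumes one sample per second, non-overlapping units, and that a trailing remainder shorter than the chosen length is dropped (the source does not say how a remainder is handled). The function name `split_into_units` and the readings are hypothetical.

```python
from typing import List, Sequence


def split_into_units(samples: Sequence[float], length: int) -> List[List[float]]:
    """Split samples into consecutive, non-overlapping units of `length` points.

    Assumption: a trailing remainder shorter than `length` is dropped so that
    all units have equal length.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    n_units = len(samples) // length
    return [list(samples[i * length:(i + 1) * length]) for i in range(n_units)]


# Twelve one-second temperature readings (made-up values).
readings = [20.0, 20.1, 20.2, 20.1, 20.3, 20.4,
            20.5, 20.4, 20.6, 20.7, 20.8, 20.9]

units_5s = split_into_units(readings, 5)  # units of length 5 s
units_4s = split_into_units(readings, 4)  # units of length 4 s
```

Repeating the call for every length value in the preset range yields the time series data corresponding to each length value.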

S3, acquiring time series data corresponding to each length numerical value in a preset time length range, fitting the time series data corresponding to each length numerical value, determining an optimal length numerical value according to a fitting result, and dividing the sample data into a plurality of optimal time series units with equal length according to the optimal length numerical value.

Fitting the time series data corresponding to each length value and determining an optimal length value according to the fitting result comprises: fitting the time series data corresponding to each length value to obtain a fitting result corresponding to each length value; when the fitting result corresponding to any length value is larger than the threshold determined by that length value, marking the fitting result to obtain a marked fitting result; checking the fitting result corresponding to every length value in this way to obtain all the marked fitting results; selecting the maximum among all the marked fitting results; and taking the length value corresponding to that maximum as the optimal length value.

According to the invention, time series data corresponding to each length value in the preset time length range are obtained. The least squares method is used to fit the time series data corresponding to each length value, giving a fitting result for each length value. The length value of the time series units is denoted M. Because, as explained in step S1, the invention uses second-level data from the stable operation period of the Internet of Things as sample data, the total number of sample data in each time series unit obtained by dividing according to the length value M is also M.

The fitting result corresponding to any length value is calculated by the following formula (1):

[formula (1) not reproduced in this text]

wherein M denotes the total number of the sample data contained in a given time series unit and also the length value of the time series unit; n denotes the number of time series units obtained by dividing the sample data according to the length M; and the remaining terms denote, respectively: the true value of each datum in a given time series unit; the predicted value of that datum, derived from the linear formula fitted according to the least squares method; the mean of the M data contained in that time series unit; and the threshold determined by the length value M.

When linear fitting is carried out, the length value M of the time series unit limits the fitting effect, so the requirement on the fitting effect differs for different length values M. The threshold of the fitting effect is therefore set as a value determined by the length value M [expression not reproduced in this text]. When the length value M is selected, the fitting result corresponding to M is marked, giving a marked fitting result, only if the fitting effect is greater than this threshold.

In the same way, the time series data corresponding to each length value in the whole preset range are fitted to obtain a plurality of fitting results; the maximum marked fitting result is selected from all the marked fitting results; the length value corresponding to it is taken as the optimal length value; and the sample data are divided into a plurality of equal-length optimal time series units according to the optimal length value. Because, as described in step S1, the invention uses second-level data from the stable operation period of the Internet of Things as sample data, the total number of sample data in each optimal time series unit obtained by dividing according to the optimal length value is equal to the optimal length value.
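The selection of the optimal length value in step S3 can be sketched as follows. Formula (1) and the threshold expression are not available in this text, so the sketch makes two loudly labeled assumptions: the fitting result for a length value is taken to be the coefficient of determination R² of a least-squares line, averaged over all units of that length, and the threshold is supplied by the caller as a hypothetical function of the length value. All function names are illustrative.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple


def fit_line(y: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line y = a*t + b over t = 0, 1, ..., len(y) - 1."""
    m = len(y)
    t_mean = (m - 1) / 2
    y_mean = sum(y) / m
    sxx = sum((t - t_mean) ** 2 for t in range(m))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    a = sxy / sxx if sxx else 0.0
    return a, y_mean - a * t_mean


def r_squared(y: Sequence[float]) -> float:
    """Coefficient of determination of the least-squares line through y."""
    a, b = fit_line(y)
    y_mean = sum(y) / len(y)
    ss_res = sum((v - (a * t + b)) ** 2 for t, v in enumerate(y))
    ss_tot = sum((v - y_mean) ** 2 for v in y)
    return 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot


def fitting_result(samples: Sequence[float], length: int) -> float:
    """ASSUMPTION: mean R^2 over all units stands in for formula (1)."""
    n_units = len(samples) // length
    units = [samples[i * length:(i + 1) * length] for i in range(n_units)]
    return sum(r_squared(u) for u in units) / n_units


def choose_optimal_length(
    samples: Sequence[float],
    lengths: Sequence[int],
    threshold: Callable[[int], float],  # hypothetical threshold per length
) -> Optional[int]:
    """Mark results above the threshold, then return the best marked length."""
    marked: Dict[int, float] = {}
    for m in lengths:
        if m < 2 or m > len(samples):
            continue  # a line fit needs at least two points and one unit
        score = fitting_result(samples, m)
        if score > threshold(m):
            marked[m] = score
    return max(marked, key=marked.get) if marked else None


# Made-up signal whose trend changes every 4 samples.
wave = [0, 1, 2, 3, 3, 2, 1, 0] * 3
best = choose_optimal_length(wave, [3, 4, 6, 8], lambda m: 0.5)
```

If no length value passes its threshold the sketch returns `None`; the source does not describe that case.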

S4, calculating the autocorrelation of each optimal time series unit.

The autocorrelation of each optimal time series unit is calculated by the following formula (2):

[formula (2) not reproduced in this text]

wherein the terms of the formula denote, respectively: the autocorrelation of a given optimal time series unit; the total number of the sample data within that optimal time series unit; the true value of each sample datum in that optimal time series unit; and the predicted value of the same sample datum, obtained from the linear equation fitted according to the least squares method.
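The least-squares fit behind step S4 can be sketched as follows. Formula (2) is not available in this text, so the mapping from residuals to a score is an assumption made only for illustration: the hypothetical `residual_score` returns 1 / (1 + mean squared residual), which is 1 for a perfectly linear unit and falls as the unit departs from its fitted line.

```python
from typing import List, Sequence


def linear_predictions(y: Sequence[float]) -> List[float]:
    """Predicted values from the least-squares line over t = 0..len(y) - 1."""
    m = len(y)
    t_mean = (m - 1) / 2
    y_mean = sum(y) / m
    sxx = sum((t - t_mean) ** 2 for t in range(m))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx if sxx else 0.0
    intercept = y_mean - slope * t_mean
    return [slope * t + intercept for t in range(m)]


def residual_score(y: Sequence[float]) -> float:
    """ASSUMED autocorrelation proxy: 1 / (1 + mean squared residual).

    This is not formula (2) of the source, only a stand-in built from the
    same ingredients (true values and least-squares predicted values).
    """
    pred = linear_predictions(y)
    mse = sum((v - p) ** 2 for v, p in zip(y, pred)) / len(y)
    return 1.0 / (1.0 + mse)


smooth = [20.0, 20.5, 21.0, 21.5, 22.0]  # follows a single linear trend
spiky = [20.0, 20.5, 35.0, 21.5, 22.0]   # one reading breaks the trend

score_smooth = residual_score(smooth)
score_spiky = residual_score(spiky)
```

The unit containing the spike receives the lower score, which matches the stated intent that a unit whose data break the common trend is more likely to be abnormal.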

S5, converting all the obtained optimal time series units into a multidimensional space, wherein the dimension of the multidimensional space is equal to the optimal length value.

In the invention, all the obtained optimal time series units are converted into a multidimensional space to obtain a plurality of multidimensional coordinate points. For example: for a given optimal length value, the total number of sample data in each optimal time series unit obtained by dividing is equal to that value, so all the optimal time series units are converted into a space whose dimension equals the optimal length value, and each unit becomes one coordinate point in that space.

S6, taking each optimal time series unit as a center and taking the numerical value determined according to the sample data as a radius, and determining an adjacent data set of each optimal time series unit in the multidimensional space.

Wherein, as shown in Fig. 2:

S61, converting all the obtained optimal time series units into the multidimensional space to obtain a plurality of multidimensional coordinate points;

S62, selecting any one of the optimal time series units as a first optimal time series unit;

S63, selecting the maximum value and the minimum value of the sample data, and calculating the difference value between the maximum value and the minimum value;

S64, establishing a first multidimensional space geometry in the multidimensional space by taking the first optimal time series unit as a center and the difference value as a radius;

S65, taking all the multidimensional coordinate points contained in the first multidimensional space geometry as the adjacent data set of the first optimal time series unit in the multidimensional space, and simultaneously calculating the density and the density center of the adjacent data set;

S66, determining, by the same method, the adjacent data set of each optimal time series unit in the multidimensional space, and simultaneously calculating the density and the density center of each adjacent data set.

In the invention, each optimal time series unit is mapped into the multidimensional space, and the data in a given optimal time series unit are written as one point of that space [notation not reproduced in this text]. Taking a given optimal time series unit as the center and the difference value as the side length, a geometry is established in the multidimensional space, and the sample data contained in this geometry are recorded as the adjacent data set of that optimal time series unit in the multidimensional space. Here the difference value is the difference between the maximum value and the minimum value of the single-type gateway sample data.
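Steps S5 and S6 can be sketched as follows. The sketch treats each unit as a point, uses the difference between the largest and smallest sample as the radius of a Euclidean ball, and collects the points inside it. Two things are assumptions rather than statements of the source: the geometry is taken to be a Euclidean ball (the text speaks of both a radius and a side length), and the density center is taken to be the coordinate-wise mean of the adjacent data set. Function names and data are hypothetical.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def neighbour_set(units: Sequence[Point], index: int, radius: float) -> List[Point]:
    """Units lying within `radius` of units[index] (the unit itself included).

    ASSUMPTION: the geometry is a Euclidean ball around the unit.
    """
    centre = units[index]
    return [u for u in units if math.dist(u, centre) <= radius]


def density_centre(points: Sequence[Point]) -> List[float]:
    """ASSUMPTION: the density center is the coordinate-wise mean."""
    k = len(points)
    dims = len(points[0])
    return [sum(p[j] for p in points) / k for j in range(dims)]


# Made-up samples; with an optimal length of 2 each unit is a 2-D point.
samples = [1.0, 1.2, 1.1, 1.0, 1.2, 1.1, 5.0, 5.2]
units = [samples[i:i + 2] for i in range(0, len(samples), 2)]

# Radius: difference between the largest and smallest sample value.
radius = max(samples) - min(samples)

near_first = neighbour_set(units, 0, radius)  # unit inside the cluster
near_last = neighbour_set(units, 3, radius)   # isolated unit
centre_first = density_centre(near_first)
```

In this toy data the clustered unit gathers three neighbours while the isolated unit finds only itself, which is the contrast the density in step S7 is meant to capture.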

S7, determining the normality of each optimal time series unit according to each optimal time series unit and the adjacent data set corresponding to each optimal time series unit.

Because abnormal data make up only a small part of the acquired gateway data, and because abnormal data and normal data are formed by different mechanisms, abnormal data lie far from normal data. The normality of each time series unit is therefore judged from the density of the adjacent data set of the unit within a certain neighborhood and from the distance between the unit and the density center of that adjacent data set. The larger the density of the adjacent data set of a time series unit in the multidimensional space, and the smaller the distance between the unit and the density center of the adjacent data set, the greater the probability that the unit belongs to normal data and the smaller the probability that it belongs to abnormal data.

The normality of each optimal time series unit in the invention is calculated by the following formula (3):

[formula (3) not reproduced in this text]

wherein the terms of the formula denote, respectively: the normality of a given optimal time series unit; the total number of the sample data in that optimal time series unit, which is also the dimension of the multidimensional space; the data of that optimal time series unit in a given dimension; the data of the density center of its adjacent data set in the same dimension; the distance between the optimal time series unit and the density center of its adjacent data set; the data of a given element of its adjacent data set in the same dimension; the total number of data contained in its adjacent data set; and the density of its adjacent data set. The larger the density of the adjacent data set of an optimal time series unit, the greater the probability that the unit belongs to normal data and the smaller the probability that it belongs to abnormal data.
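A normality score with the behaviour described for step S7 can be sketched as follows. Formula (3) is not available in this text, so every definition in the block is an assumption chosen only to reproduce the stated monotonic behaviour: the density is the fraction of all units that fall in the adjacent data set, the density center is the coordinate-wise mean, and the score is density / (1 + distance to the density center).

```python
import math
from typing import Sequence

Point = Sequence[float]


def normality(unit: Point, neighbours: Sequence[Point], total_units: int) -> float:
    """ASSUMED normality score: grows with density, shrinks with distance.

    None of the three definitions below is taken from formula (3).
    """
    k = len(neighbours)
    # Assumed density center: coordinate-wise mean of the adjacent data set.
    centre = [sum(p[j] for p in neighbours) / k for j in range(len(unit))]
    distance = math.dist(unit, centre)
    # Assumed density: share of all units that lie in the adjacent data set.
    density = k / total_units
    return density / (1.0 + distance)


# Made-up 2-D units: three clustered points and one isolated point.
cluster = [[1.0, 1.2], [1.1, 1.0], [1.2, 1.1]]
isolated = [5.0, 5.2]

score_clustered = normality(cluster[0], cluster, total_units=4)
score_isolated = normality(isolated, [isolated], total_units=4)
```

The clustered unit scores higher than the isolated one because its adjacent data set is denser, even though the isolated unit sits exactly on its own density center.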

S8, determining the attention of each optimal time series unit according to the autocorrelation and the normality of each optimal time series unit.

The larger the autocorrelation and the normality of an optimal time series unit, the greater the probability that it belongs to normal data and the larger its attention; the smaller the autocorrelation and the normality of an optimal time series unit, the greater the probability that it belongs to abnormal data and the smaller its attention. By setting a larger attention for normal data and a smaller attention for abnormal data, the influence of samples that may be abnormal on the classifier is reduced during training, so that the classifier learns more of the characteristics of normal data and its accuracy in detecting abnormal data is improved.

The attention of each optimal time series unit is calculated by the following formula (4):

[formula (4) not reproduced in this text]

wherein the terms of the formula denote, respectively: the attention of the time series unit; the normality of the time series unit; and the autocorrelation of the time series unit;

the formula also contains a judgment function defined by two cases, each of which assigns the function a value when its condition holds [conditions and values not reproduced in this text].

S9, training a single-class support vector machine algorithm classifier by using the attention of each optimal time series unit, and correcting errors in the gateway data of the Internet of things by using the trained classifier.

The single-class support vector machine algorithm classifier is trained using the attention of each optimal time series unit. During training, the attention of each optimal time series unit is first introduced into the optimization objective function of the OCSVM algorithm to obtain the decision function of the classifier, and the classifier is then trained using this decision function.
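One common way to let a per-sample weight enter the OCSVM objective is to scale each slack variable by that weight. The block below writes this out as a sketch only; the source does not give its objective function. It assumes the attention of the i-th unit acts as the weight a_i, and the symbols a_i, N, ν, ξ_i, α_i, φ and k are notation introduced here. The offset ρ is the usual OCSVM offset and is unrelated to the density of an adjacent data set.

```latex
\min_{w,\,\xi,\,\rho}\quad
  \frac{1}{2}\lVert w\rVert^{2}
  + \frac{1}{\nu N}\sum_{i=1}^{N} a_i\,\xi_i - \rho
\qquad \text{s.t.}\quad
  w\cdot\varphi(x_i) \ge \rho - \xi_i,\quad \xi_i \ge 0,\quad i = 1,\dots,N

f(x) = \operatorname{sgn}\!\Bigl(\sum_{i=1}^{N} \alpha_i\,k(x_i, x) - \rho\Bigr),
\qquad 0 \le \alpha_i \le \frac{a_i}{\nu N},\quad \sum_{i=1}^{N}\alpha_i = 1
```

Under this form a unit with small attention has a small upper bound on its multiplier α_i, so it can shift the decision boundary less than a unit with large attention.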

When the trained classifier is used for correcting the data to be detected of the gateway of the Internet of things, the type of the data to be detected is ensured to be the same as the type of sample data used when the single-class support vector machine algorithm classifier is trained. Dividing the data to be detected into a plurality of optimal time sequence units with equal length according to the optimal length numerical value, inputting the optimal time sequence units with equal length into a decision function of a single-class support vector machine algorithm classifier, and outputting the numerical value of the decision function. And judging whether the data to be detected is normal data or not according to the numerical value of the decision function.

If the value of the decision function is 1, the time series data to be detected are a normal sample; if the value of the decision function is -1, the time series data to be detected are an abnormal sample.
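The detection step can be sketched as follows: the data to be detected are divided with the optimal length value, each unit is passed to the decision function, and units for which it returns -1 are flagged. The function `toy_decision` is a hypothetical stand-in for a trained classifier, used only so that the sketch runs; it is not the OCSVM decision function.

```python
from typing import Callable, List, Sequence


def detect_abnormal_units(
    data: Sequence[float],
    optimal_length: int,
    decision: Callable[[List[float]], int],
) -> List[bool]:
    """Return one flag per unit: True where the decision function gives -1."""
    n_units = len(data) // optimal_length
    flags = []
    for i in range(n_units):
        unit = list(data[i * optimal_length:(i + 1) * optimal_length])
        flags.append(decision(unit) == -1)
    return flags


def toy_decision(unit: List[float]) -> int:
    """HYPOTHETICAL stand-in for a trained classifier: +1 normal, -1 abnormal."""
    return 1 if max(unit) - min(unit) < 2.0 else -1


# Made-up readings; the second unit contains a spike.
to_check = [20.0, 20.1, 20.2, 20.1, 35.0, 20.3, 20.2, 20.3, 20.4]
flags = detect_abnormal_units(to_check, 3, toy_decision)
```

In practice the `decision` argument would be the decision function of the classifier trained in step S9 on data of the same type.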

In summary, the invention provides an error correction method for gateway data of the internet of things, which is improved based on a single-type support vector machine algorithm. The method comprises the steps of converting gateway sample data into time sequence units with a certain length, determining the attention degree of each time sequence unit according to the autocorrelation of each time sequence unit and the normality characteristic of each time sequence unit, controlling the influence of each time sequence unit on a single-class support vector machine algorithm classifier in the training process according to the attention degree, and improving the accuracy of the classifier.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims

1. An error correction method for gateway data of an internet of things is characterized by comprising the following steps:

acquiring single type sample data of a gateway;

dividing the sample data according to any length value in a preset time length range to obtain a plurality of time sequence units with equal length, and forming time sequence data corresponding to the length value by the plurality of time sequence units with equal length;

calculating the autocorrelation of each optimal time-series unit;

determining a neighboring data set of each of the optimal time series units within the multi-dimensional space centered on each of the optimal time series units and having a radius of a numerical value determined from the sample data;

2. The internet of things gateway data error correction method according to claim 1, wherein fitting the time series data corresponding to each length value and determining an optimal length value according to a fitting result includes:

3. The method for correcting the error of the gateway data of the internet of things according to claim 1, wherein the calculating the autocorrelation of each optimal time sequence unit comprises:

wherein the terms of the formula denote, respectively:

the autocorrelation of the given optimal time series unit;

the total number of the sample data within that optimal time series unit;

the true value of an individual sample datum within that optimal time series unit; and

the predicted value of that sample datum in that optimal time series unit, given by a linear equation fitted according to the least squares method.
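The least-squares prediction referred to in this claim can be sketched as follows. This shows only how predicted values might be obtained from a fitted line; it does not reproduce the autocorrelation formula itself. The function names are hypothetical, and fitting against the sample index 0..n-1 is an assumption.

```python
from typing import List, Sequence, Tuple


def fit_line(values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept of values against positions 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predicted_values(values: Sequence[float]) -> List[float]:
    """Values of the fitted line at each sample position of the unit."""
    slope, intercept = fit_line(values)
    return [slope * x + intercept for x in range(len(values))]


unit = [1.0, 3.0, 5.0, 7.0]
predictions = predicted_values(unit)
# Differences between true and predicted values for each sample in the unit
residuals = [y - p for y, p in zip(unit, predictions)]
```

For a unit that lies exactly on a line, as in this example, the residuals are all zero; a noisier unit produces larger residuals.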

4. The method according to claim 1, wherein the determining the neighboring data set of each optimal time-series unit in the multidimensional space with each optimal time-series unit as a center and a numerical value determined according to the sample data as a radius comprises:

establishing a first multi-dimensional space geometry within the multi-dimensional space centered on the first optimal time series unit and with the difference as a radius;

taking all the multi-dimensional coordinate points contained in the first multi-dimensional space geometric body as an adjacent data set of the first optimal time sequence unit in the multi-dimensional space, and simultaneously calculating the density and the density center of the adjacent data set;
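The neighbourhood construction in this claim can be sketched as follows, assuming Euclidean distance in the multi-dimensional space. The function names are hypothetical. Taking the density center as the centroid of the adjacent data set and the density as its point count are simple assumptions made for illustration; the claim text does not state these definitions.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def adjacent_data_set(center: Point, points: Sequence[Point], radius: float) -> List[Point]:
    """Points lying inside or on the hypersphere of the given radius around center."""
    return [p for p in points if math.dist(center, p) <= radius]


def density_center(neighbors: Sequence[Point]) -> List[float]:
    """Assumption: the density center is the centroid of the adjacent data set."""
    if not neighbors:
        raise ValueError("adjacent data set is empty")
    dims = len(neighbors[0])
    return [sum(p[d] for p in neighbors) / len(neighbors) for d in range(dims)]


def density(neighbors: Sequence[Point]) -> int:
    """Assumption: the density is the number of points in the adjacent data set."""
    return len(neighbors)


# Each optimal time series unit is treated as one point; 2-D points for brevity
unit_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
neighbors = adjacent_data_set(unit_points[0], unit_points, radius=1.5)
center_of_density = density_center(neighbors)
neighbor_count = density(neighbors)
```

In this example the point (5.0, 5.0) lies outside the radius, so the adjacent data set of the first unit contains the remaining three points, including the unit itself.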

5. The method for correcting the error of the gateway data of the internet of things according to claim 4, wherein the determining the normality of each optimal time-series unit according to each optimal time-series unit and the adjacent data set corresponding to each optimal time-series unit comprises:

6. The method for correcting error in gateway data of the internet of things as claimed in claim 5, wherein the calculation formula of the normality of each optimal time series unit is as follows:

wherein the terms of the formula include:

the normality of the given optimal time series unit;

a given dimension of data in that optimal time series unit;

the same dimension of data of each datum in the adjacent data set of that optimal time series unit; and

the density of the adjacent data set of that optimal time series unit.

7. The method for correcting the error of the gateway data of the internet of things according to claim 1, wherein a calculation formula of the attention of each optimal time sequence unit is shown as follows:

wherein the terms of the formula denote, respectively:

the attention of the given optimal time sequence unit;

the normality of that optimal time series unit; and

the autocorrelation of that optimal time series unit;

wherein one term of the formula is defined piecewise, taking one value when a first condition holds and another value when a second condition holds.

8. The method for correcting errors in gateway data of the internet of things according to claim 1, wherein training a single-class support vector machine algorithm classifier by using the attention of each optimal time series unit comprises: