CN112101482A - Method for detecting abnormal parameter mode of missing satellite data - Google Patents


Publication number
CN112101482A
Authority
CN
China
Prior art keywords
data
missing
satellite
time
masking
Prior art date
Legal status
Granted
Application number
CN202011152977.7A
Other languages
Chinese (zh)
Other versions
CN112101482B (en
Inventor
翟磊
鲍军鹏
颜博
郭思尧
张国亭
张超
辛为政
万俊伟
魏巍
Current Assignee
Xian Jiaotong University
CETC 54 Research Institute
Original Assignee
Xian Jiaotong University
CETC 54 Research Institute
Priority date
Filing date
Publication date
Application filed by Xian Jiaotong University and CETC 54 Research Institute
Priority to CN202011152977.7A
Publication of CN112101482A
Application granted
Publication of CN112101482B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/245 Classification techniques relating to the decision surface
    • G06F18/2451 Classification techniques relating to the decision surface linear, e.g. hyperplane
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Radio Relay Systems (AREA)

Abstract

A method for detecting abnormal parameter patterns in satellite data with missing values, comprising: data preprocessing, in which the satellite time series data are normalized and standardized; anomaly detection, in which the satellite time series data containing missing values are loaded into an artificial neural network model, a batch processing parameter is set, the data are grouped according to the value of that parameter, and anomaly detection is then performed on each group; and result visualization, in which the detection results are stored and displayed graphically. The method uses a new neural network model together with the inherent temporal dependence of satellite time series data to detect abnormal satellite patterns accurately even when a large amount of data is missing, thereby supporting the stability of the satellite fault diagnosis and health management system.

Description

Method for detecting abnormal parameter mode of missing satellite data
Technical Field
The invention belongs to the technical field of intelligent information processing and computers, and particularly relates to a method for detecting abnormal parameter modes of missing satellite data.
Background
Satellite time series data may have missing values because of reception blind zones or interference; that is, there are time points or time periods with no observations. Missing time series data make applications such as pattern recognition, anomaly detection and fault diagnosis considerably more difficult.
Over the past decades, researchers have developed various methods for handling missing values in time series data. A simple solution is to ignore the missing data and analyze only the observed data, but this performs poorly when the missing rate is high and too few samples remain. Another approach is data filling (imputation), in which missing values are replaced with substitute values. The filled-in data pattern, however, does not necessarily coincide with the true pattern implied by the original data. Meanwhile, in recent years recurrent neural networks have performed well in many time series applications, including machine translation, classification and prediction tasks. Recurrent neural networks have several useful properties, such as strong predictive performance, the ability to capture long-term temporal dependence and the ability to handle variable-length data, which make them a reliable basis for solving time series problems.
In summary, research on time series data with missing values has attracted increasing attention. Until now, such data have been processed by using estimation methods to guess the missing values, which introduces a certain deviation into the data; when the deviation is very large, it can even affect the judgments made about the time series.
Disclosure of Invention
To overcome the defects of the prior art, the invention aims to provide a method for detecting abnormal parameter patterns in satellite data with missing values. The method avoids the deviation described above: it does not represent the data by estimating missing values and does not otherwise process the missing values, but takes the time series data with missing values directly as the initial complete data, which preserves the authenticity of the data and reduces its deviation. The method is applied to satellite time series data with missing values, and the abnormal parameter patterns in those data are detected with a new neural network module that differs from conventional models in both network structure and calculation method.
In order to achieve the purpose, the invention adopts the technical scheme that:
A method for detecting abnormal parameter patterns in satellite data with missing values, comprising:
data preprocessing, in which the satellite time series data are normalized and standardized;
anomaly detection, in which the satellite time series data containing missing values are loaded into an artificial neural network model, a batch processing parameter is set, the data are grouped according to the value of that parameter, and anomaly detection is then performed on each group; if a pattern detected by the model does not belong to any existing data pattern category, it is called an abnormal pattern, and all abnormal patterns are uniformly defined as one new category during detection;
and result visualization, in which the detection results are stored and displayed graphically.
The data preprocessing comprises loading a data set and generating satellite time series data containing a large amount of missing data: a large number of random or continuous missing values are introduced into the complete satellite time series data, which completes the preprocessing of the satellite time series data.
The artificial neural network model is based on a recurrent neural network module, which it improves and optimizes. On the one hand, a mask is used to represent which data are missing; on the other hand, a time interval is used to represent the time difference between two adjacent observed satellite time series values. For any input time series, the mask and the time interval are applied to the model input and to the network state, so that the long-term temporal dependence in the observations is captured and the detection results are improved.
The masking means that: if the input value at time t is not missing, the mask value is set to 1; otherwise it is set to 0 to indicate that the input value is missing. The value of the time interval changes with the number of missing values. The time interval is measured by an attenuation term, which is added to both the input part and the output part of a single neural network module unit so as to measure the time interval between adjacent time series inputs.
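The normalization and standardization step can be sketched as follows. The source does not say which formulas are used, so min-max scaling and z-score standardization are assumptions here, and the function names `min_max_normalize` and `standardize` are hypothetical.

```python
import numpy as np

def min_max_normalize(x, eps=1e-12):
    """Scale a series to [0, 1] (assumed normalization; eps guards constant series)."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    return (x - lo) / max(hi - lo, eps)

def standardize(x, eps=1e-12):
    """Shift to zero mean and scale to unit variance (assumed standardization)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / max(x.std(), eps)

series = np.array([2.0, 4.0, 6.0, 8.0])
normalized = min_max_normalize(series)
standardized = standardize(series)
```

Either transform could be applied per parameter before masking; which one the authors actually used is not stated.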
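A minimal sketch of how the mask and the time interval could be derived from a series is shown below. The mask rule (1 if observed, 0 if missing) and the initial interval of 0 come from the text; the accumulation rule for the interval (it keeps growing across consecutive missing values) is an assumption, as are the use of NaN to mark missing values and the name `mask_and_intervals`.

```python
import numpy as np

def mask_and_intervals(values, timestamps):
    """Return (mask, delta): mask is 1 where observed, delta is the time since
    the last observed value (assumed rule; delta is 0 at the first step)."""
    values = np.asarray(values, dtype=float)
    timestamps = np.asarray(timestamps, dtype=float)
    mask = (~np.isnan(values)).astype(float)
    delta = np.zeros_like(timestamps)
    for t in range(1, len(values)):
        gap = timestamps[t] - timestamps[t - 1]
        # If the previous step was missing, the interval keeps accumulating.
        delta[t] = gap if mask[t - 1] == 1 else gap + delta[t - 1]
    return mask, delta

values = [1.0, np.nan, np.nan, 4.0, 5.0]   # NaN marks a missing value (assumption)
timestamps = [0, 1, 2, 3, 4]               # data are recorded once per second
mask, delta = mask_and_intervals(values, timestamps)
```

With two consecutive missing values, the interval at the next observation grows to 3, which matches the statement that the interval changes with the number of missing values.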
The calculation method of the neural network module is as follows:
[Equation (1): formula image BDA0002741759410000031]

In the formula, [symbol image BDA0002741759410000032] represents the output of the neural network module unit at time $t$; the time interval variable at time $t$ (its symbol is lost in the text) represents the value of the time interval, and its value at the initial instant is 0; the matrices $W_r$, $W_{h_r}$ and the vector $b_r$ are model parameters representing, respectively, the initialized attenuation weight, the attenuation weight corresponding to a single neural network module unit, and the initialized attenuation bias.

[Symbol image BDA0002741759410000033] is constrained so that only its diagonal entries exist and all other positions are 0, giving [symbol image BDA0002741759410000034]; [symbol image BDA0002741759410000035] represents the attenuation ratio of the attenuation term in the input $x$ direction at time $t$.

[Equations (2) and (3): formula images BDA0002741759410000036 and BDA0002741759410000037]

$$s_t = \sigma(W_s x_t' + U_s h_{t-1}' + V_s m_t + b_s) \qquad (4)$$

$$z_t = \sigma(W_z x_t' + U_z h_{t-1}' + V_z m_t + b_z) \qquad (5)$$

[Equations (6) and (7): formula images BDA0002741759410000038 and BDA0002741759410000039]

In the above formulas, $m_t$ denotes the masking value, $x_t$ the input at time $t$, and $h_{t-1}$ the output value of the network module unit at time $t-1$. The finally computed $h_t$ is the final output value of a single neural network module unit at time $t$, called the output value of the corresponding hidden module unit of the model. The matrices $W_s$, $W_z$, $W$, $U_s$, $U_z$, $U$, $V_s$, $V_z$, $V$ and the vectors $b_s$, $b_z$, $b$ are all model parameters; $\sigma$ denotes the sigmoid activation function; [symbol image BDA00027417594100000310] denotes element-wise multiplication; $s_t$, $z_t$ and $h_t'$ are intermediate calculation quantities; $x_t'$ is calculated from $x_t$ and replaces the true input value at each time; and $h_{t-1}'$ is calculated from $h_{t-1}$ and replaces the output value of the neural network module unit at that time.

As the above calculation method shows, unlike the conventional method, the neural network module introduces two new variables, $m_t$ and [symbol image BDA00027417594100000311], which act on the input part at time $t$ and on the output part of the corresponding hidden module unit, respectively. $m_t$ indicates whether a value is missing at time $t$ and acts only on the input part. [Symbol image BDA00027417594100000312], the attenuation term, acts on both the input part and the output part of the hidden module unit, so that the notion of a time interval is added to the inputs at adjacent times and the temporal characteristics of the time series data are better represented.
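One possible reading of a single module unit is sketched below. Only equations (4) and (5) are taken from the text. The exponential attenuation, the masked and attenuated input, the attenuated previous state, the tanh candidate and the final blend all stand in for equations (1) to (3), (6) and (7), whose images are not available, and are assumptions modeled on decay-based gated recurrent units. All function and parameter names are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def init_params(n_in, n_hid, seed=0):
    """Small random parameters; the key names are illustrative only."""
    rng = np.random.default_rng(seed)
    p = {}
    for g in ("s", "z", "c"):                 # two gates and the candidate state
        p["W_" + g] = rng.normal(scale=0.1, size=(n_hid, n_in))
        p["U_" + g] = rng.normal(scale=0.1, size=(n_hid, n_hid))
        p["V_" + g] = rng.normal(scale=0.1, size=(n_hid, n_in))
        p["b_" + g] = np.zeros(n_hid)
    p["w_rx"] = np.full(n_in, 0.1)            # diagonal input-attenuation weights
    p["b_rx"] = np.zeros(n_in)
    p["W_rh"] = np.full((n_hid, n_in), 0.1)   # state-attenuation weights
    p["b_rh"] = np.zeros(n_hid)
    return p

def cell_step(p, x_t, m_t, delta_t, h_prev):
    # ASSUMED attenuation: exp(-max(0, W * delta + b)), a value in (0, 1].
    r_x = np.exp(-np.maximum(0.0, p["w_rx"] * delta_t + p["b_rx"]))
    r_h = np.exp(-np.maximum(0.0, p["W_rh"] @ delta_t + p["b_rh"]))
    # ASSUMED x_t' and h_{t-1}': masked, attenuated input and attenuated state.
    x_in = m_t * r_x * np.where(m_t == 1, x_t, 0.0)
    h_in = r_h * h_prev
    # Equations (4) and (5) from the text.
    s_t = sigmoid(p["W_s"] @ x_in + p["U_s"] @ h_in + p["V_s"] @ m_t + p["b_s"])
    z_t = sigmoid(p["W_z"] @ x_in + p["U_z"] @ h_in + p["V_z"] @ m_t + p["b_z"])
    # ASSUMED candidate state and update.
    h_cand = np.tanh(p["W_c"] @ x_in + p["U_c"] @ (s_t * h_in)
                     + p["V_c"] @ m_t + p["b_c"])
    return (1.0 - z_t) * h_in + z_t * h_cand

params = init_params(n_in=1, n_hid=4)
xs = np.array([[0.2], [np.nan], [np.nan], [0.9]])   # NaN marks missing inputs
ms = np.array([[1.0], [0.0], [0.0], [1.0]])
deltas = np.array([[0.0], [1.0], [2.0], [3.0]])
h = np.zeros(4)
for x_t, m_t, d_t in zip(xs, ms, deltas):
    h = cell_step(params, x_t, m_t, d_t, h)
```

The last hidden state `h` would then feed a classifier; the sketch makes no claim about the patent's actual parameterization.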
The masking includes random masking and continuous masking, wherein:
random masking selects the missing positions at random: a missing ratio is set, and the initially complete satellite time series data are masked at random according to that ratio;
continuous masking randomly selects the number of continuous masking runs; for each run, a position on the time series is selected at random together with a random length, and all values within that length starting from that position are set as missing.
Result visualization uses a visualization tool to display the satellite time series data, including the data without missing values, the data with random-mask missing values, the data with continuous-mask missing values, and the abnormal pattern detection results.
Compared with the prior art, the invention adds two new variables to the conventional recurrent neural network module and takes the temporal dependence of time series data fully into account. With the new model constructed in this way, abnormal parameter patterns in satellite time series data can be detected more accurately, even when the data contain a large number of missing values.
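The two masking schemes can be sketched as mask generators. The series length 600, the 10% ratio, the run count range 0 to 100 and the run length range 0 to 30 are the values reported in the embodiment; the function names and the use of NumPy's `Generator` are assumptions.

```python
import numpy as np

def random_mask(length, missing_ratio, rng):
    """Mask with a fixed share of randomly chosen missing positions (0 = missing)."""
    mask = np.ones(length, dtype=int)
    n_missing = int(round(length * missing_ratio))
    missing_idx = rng.choice(length, size=n_missing, replace=False)
    mask[missing_idx] = 0
    return mask

def continuous_mask(length, max_runs, max_run_length, rng):
    """Mask with a random number of missing runs, each with random start and length."""
    mask = np.ones(length, dtype=int)
    n_runs = rng.integers(0, max_runs + 1)
    for _ in range(n_runs):
        start = rng.integers(0, length)
        run = rng.integers(0, max_run_length + 1)
        mask[start:start + run] = 0           # runs may overlap or hit the end
    return mask

rng = np.random.default_rng(0)
m_rand = random_mask(600, 0.10, rng)
m_cont = continuous_mask(600, 100, 30, rng)
```

Because continuous runs may overlap, the total number of missing values under `continuous_mask` varies from call to call, while `random_mask` always removes exactly the requested share.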
Drawings
Fig. 1 is an overall framework diagram of the modules of the present invention.
Fig. 2 is a structural diagram of the neural network module of the present invention.
Fig. 3 is a time series graph of the satellite load power parameter in the satellite time series data set without missing values.
Fig. 4 is a time series graph of the satellite load power parameter in the satellite time series data with random masking.
Fig. 5 is a time series graph of the satellite load power parameter in the satellite time series data with continuous masking.
Fig. 6 shows the abnormal pattern detection result for the satellite load power parameter with random masking.
Detailed Description
The embodiments of the present invention will be described in detail below with reference to the drawings and examples.
As shown in Fig. 1, a system for detecting abnormal parameter patterns in satellite data with missing values includes a data preprocessing module, an anomaly detection module and a result visualization module. The corresponding detection method includes:
Data preprocessing, in which the satellite time series data are normalized and standardized. Specifically, a data set is loaded and satellite time series data containing a large amount of missing data are generated: a large number of random or continuous missing values are introduced into the complete satellite time series data, which completes the preprocessing of the satellite time series data.
Anomaly detection, in which the satellite time series data containing missing values are loaded into an artificial neural network model, a batch processing parameter is set, the data are grouped according to the value of that parameter, and anomaly detection is then performed on each group; if a pattern detected by the model does not belong to any existing data pattern category, it is called an abnormal pattern, and all abnormal patterns are uniformly defined as one new category during detection.
Result visualization, in which the detection results are stored and displayed graphically. Specifically, a visualization tool displays the satellite time series data, including the data without missing values, the data with random-mask missing values, the data with continuous-mask missing values, and the abnormal pattern detection results.
The invention uses the artificial neural network model together with the inherent temporal dependence of satellite time series data to detect abnormal satellite patterns accurately when a large amount of data is missing, thereby supporting the stability of the satellite fault diagnosis and health management system. It performs better than existing recurrent neural network models, helps practitioners in related fields analyze time series data with missing values more efficiently, and extends the structural framework of existing recurrent neural networks. Unlike a conventional recurrent neural network used for parameter anomaly detection, the model better captures the long-term temporal dependence in time series data with missing values and therefore produces better detection results.
The artificial neural network model is the core of the invention. It is based on a recurrent neural network module, which it improves and optimizes by introducing the mask and time interval variables. On the one hand, the mask represents which data are missing; on the other hand, the time interval represents the time difference between two adjacent observed satellite time series values and thereby stands in for the temporal dependence between the input time series data. The time interval variable preserves the temporal dependence in the time series data, which benefits the detection result and makes it easier to interpret. For any input time series, the invention applies the mask and the time interval to the model input and to the network state, captures the long-term temporal dependence in the observations, and improves the detection results.
In the present invention, masking means that: if the input value at time t is not missing, the mask value is set to 1; otherwise it is set to 0 to indicate that the input value is missing. The masking includes random masking and continuous masking, wherein:
random masking selects the missing positions at random: a missing ratio is set, and the initially complete satellite time series data are masked at random according to that ratio;
continuous masking randomly selects the number of continuous masking runs; for each run, a position on the time series is selected at random together with a random length, and all values within that length starting from that position are set as missing.
In the invention, the value of the time interval changes with the number of missing values. The time interval is measured by an attenuation term, which is added to both the input part and the output part of a single neural network module unit so as to measure the time interval between adjacent time series inputs.
The calculation method and parameter settings of the neural network module differ from those of the conventional model and are more complex. For any input time series, the mask and the time interval are applied to each input and output unit of the neural network module, which improves the detection result. The calculation method is as follows:
[Equation (1): formula image BDA0002741759410000061]

In the formula, [symbol image BDA0002741759410000062] represents the output of the neural network module unit at time $t$; the time interval variable at time $t$ (its symbol is lost in the text) represents the value of the time interval, and its value at the initial instant is 0; the matrices $W_r$, $W_{h_r}$ and the vector $b_r$ are model parameters representing, respectively, the initialized attenuation weight, the attenuation weight corresponding to a single neural network module unit, and the initialized attenuation bias.

[Symbol image BDA0002741759410000063] is constrained so that only its diagonal entries exist and all other positions are 0, giving [symbol image BDA0002741759410000064]; [symbol image BDA0002741759410000065] represents the attenuation ratio of the attenuation term in the input $x$ direction at time $t$.

[Equations (2) and (3): formula images BDA0002741759410000066 and BDA0002741759410000067]

$$s_t = \sigma(W_s x_t' + U_s h_{t-1}' + V_s m_t + b_s) \qquad (4)$$

$$z_t = \sigma(W_z x_t' + U_z h_{t-1}' + V_z m_t + b_z) \qquad (5)$$

[Equations (6) and (7): formula images BDA0002741759410000071 and BDA0002741759410000072]

In the above formulas, $m_t$ denotes the masking value, $x_t$ the input at time $t$, and $h_{t-1}$ the output value of the network module unit at time $t-1$. The finally computed $h_t$ is the final output value of a single neural network module unit at time $t$, called the output value of the corresponding hidden module unit of the model. The matrices $W_s$, $W_z$, $W$, $U_s$, $U_z$, $U$, $V_s$, $V_z$, $V$ and the vectors $b_s$, $b_z$, $b$ are all model parameters; $\sigma$ denotes the sigmoid activation function; [symbol image BDA0002741759410000073] denotes element-wise multiplication; $s_t$, $z_t$ and $h_t'$ are intermediate calculation quantities; $x_t'$ is calculated from $x_t$ and replaces the true input value at each time; and $h_{t-1}'$ is calculated from $h_{t-1}$ and replaces the output value of the neural network module unit at that time.

As the above calculation method shows, unlike the conventional method, the neural network module introduces two new variables, $m_t$ and [symbol image BDA0002741759410000074], which act on the input part at time $t$ and on the output part of the corresponding hidden module unit, respectively. $m_t$ indicates whether a value is missing at time $t$ and acts only on the input part. [Symbol image BDA0002741759410000075], the attenuation term, acts on both the input part and the output part of the hidden module unit, so that the notion of a time interval is added to the inputs at adjacent times and the temporal characteristics of the time series data are better represented.
Fig. 2 is a structural diagram of the neural network module of the present invention.
The structural diagram shows what is distinctive about the neural network module: it introduces two new variables, the mask and the time interval. The time interval represents attenuation and acts on both the input and the output of the network module unit; the mask indicates whether data are missing and acts only on the input. The model differs from the conventional model mainly in the time interval variable. The conventional model has no notion of an attenuation term in its structure, and the input at each time and the output of the network module unit are applied to the network model directly, without any change. The invention improves on this: by adding a variable, it can represent the loss of information passed from time t to time t+1, which can also be regarded as the degree of attenuation of that information.
The distinctive structure of the neural network module means that its calculation method differs from that of the conventional model. Compared with the most common conventional model, the invention additionally considers the time delay factor and applies the masking and attenuation operations to each input and to the output of the network module unit, whereas the most common conventional model does not include any masking or attenuation step.
In addition, compared with conventional models that also introduce the mask and time interval variables, the neural network module of the invention modifies the specific calculation method. When a mask is introduced, the conventional method uses the mean and the last observed value to estimate the input state at each time; the invention instead applies the masking and attenuation operations directly to the input, and applies the mean and the last observed value in the specific mask-handling cases.
In a specific application, appropriate parameters are selected from the satellite time series data set according to the requirements, and the way the data of the selected parameters change is analyzed. The format of the time series data for the selected parameters is then converted accordingly, which completes the preliminary preprocessing of the time series data.
The initial satellite time series data are recorded once per second. The load power of the satellite is selected as the initial data. Fig. 3 is a time series graph of the satellite load power parameter in the satellite time series data set without missing values. As Fig. 3 shows, the satellite data exhibit a clear periodic variation. Taking the trend of the parameter into account, a fixed window of length 600 is chosen to cut the data. Using this window, a certain amount of time series data is cut from the data of the parameter to form an initial data set.
For the satellite load power parameter, data of length 300 are cut first, and 300 such data items for the parameter are selected to form an initial training data set. As Fig. 3 shows, the parameter contains 4 different pattern types: rising, falling, floating and stationary. From each data item in the initial training data set, segments of length 100 representing these 4 patterns are cut in turn and labeled accordingly, which completes the preliminary preprocessing of the initial training data set.
After the preliminary preprocessing, mask-based missing values are added to the current training data set to simulate time series data with missing values. The invention divides data loss into two situations: missing data from random masking and missing data from continuous masking. Each situation is further handled in three ways: replacing masked values with 0, replacing masked values with the mean, and deleting masked values.
In overview: (1) for zero replacement, because the mask takes values in {0,1}, applying the mask to the time series data replaces each missing value with 0; (2) for mean replacement, the 0 values that indicate missing data in case (1) are replaced with the mean of the time series data, so that the missing values in each input time series are replaced with the mean of that input; (3) for deletion, part of the data in each input time series is deleted to create the missing values.
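The fixed-window cutting can be sketched as follows. The window length of 600 comes from the text; the use of non-overlapping windows, the synthetic periodic series and the name `fixed_windows` are assumptions made for illustration.

```python
import numpy as np

def fixed_windows(series, window=600):
    """Cut a 1-D series into non-overlapping windows, dropping any remainder."""
    series = np.asarray(series, dtype=float)
    n = len(series) // window
    return series[: n * window].reshape(n, window)

# Synthetic periodic signal standing in for the load power parameter.
series = np.sin(np.linspace(0.0, 20.0 * np.pi, 6100))
windows = fixed_windows(series)
```

Each row of `windows` would then be labeled and masked in the later steps.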
For the random mask case, Fig. 4 shows the time series graph of the satellite load power parameter of Fig. 3 after random masking; the red boxes mark how the data change once the random mask is added. The random mask is applied at a ratio of 10%. As the figure shows, the fluctuation range of the masked data increases and the data visibly move up and down. Three ways of handling the missing values are considered:
(1) Zero replacement. The mask takes values in {0,1}, so applying the mask to the satellite time series data replaces each missing value with 0; the mask is simply multiplied by the input value x and no other processing is needed.
(2) Mean replacement. The 0 values that indicate missing data are replaced with the mean of the time series data, so that the missing values in each input satellite time series are replaced with the mean of that input. Because the initial satellite time series data may themselves contain 0 values, positions where the initial data are 0 are set as not missing to avoid confusion. Positions that are 0 after multiplying the input value x by the mask, and at which x itself is not 0, are replaced with the corresponding mean.
(3) Deletion. In the two cases above, other values replace the missing values; alternatively, the missing values can simply be deleted, which is simple and direct. Under random masking the number of missing values differs between input time series, so to keep the process simple a fixed mask length is used, which gives every input the same number of missing values and keeps all inputs the same length after deletion.
The original time series data may contain 0 values, so, as in the mean replacement case, positions where the original data are 0 are set as not missing to avoid confusion. Positions that are 0 after multiplying the input value x by the mask, and at which x itself is not 0, are deleted. In a specific experiment, the mask length is set to 100: 100 different positions are randomly selected on each input in the initial satellite time series data and masked, and the masked positions are deleted from the initial data.
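The three ways of handling masked values can be sketched on a single series. The rule that original zeros are never treated as missing follows the text; taking the mean over the observed values only is an assumption (the text says "the mean of the input data"), and all function names are hypothetical.

```python
import numpy as np

def protect_zeros(x, mask):
    """Mark positions whose original value is 0 as observed, to avoid confusion."""
    return np.where(x == 0, 1, mask)

def zero_fill(x, mask):
    """Case (1): multiply by the mask so missing values become 0."""
    return x * mask

def mean_fill(x, mask):
    """Case (2): replace missing values with the mean of the observed values."""
    filled = x * mask
    filled[mask == 0] = x[mask == 1].mean()
    return filled

def delete_missing(x, mask):
    """Case (3): drop the missing positions entirely."""
    return x[mask == 1]

x = np.array([2.0, 5.0, 4.0, 6.0, 11.0])
mask = np.array([1, 0, 1, 1, 0])
zeroed = zero_fill(x, mask)
averaged = mean_fill(x, mask)
shortened = delete_missing(x, mask)
guarded = protect_zeros(np.array([0.0, 5.0]), np.array([0, 0]))
```

Deletion shortens the series, which is why the text fixes the number of masked positions per input so that all inputs keep the same length.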
For the continuous mask case, refer to FIG. 5, which plots the time series of the satellite load power parameter of FIG. 3 after continuous masking; the red boxes mark where the data changed after the continuous masks were added. As the figure shows, the masked data exhibits a degree of continuity, which reflects the effect of continuous masking. Three ways of handling the missing values are considered: (1) Replacing masked values with 0. Following the continuous mask method, missing runs within a certain range are set; the mask takes values in {0,1}, a masked (missing) value is represented by 0, and the mask is multiplied directly by the input value x. In the specific experiment, an integer c is randomly selected from the range 0 to 100 to represent the number of continuous masks applied to the initial satellite time series data. Each time a mask is applied, an integer r is randomly selected within the length (600) of each input satellite time series to indicate the starting position of that continuous mask, and an integer l is randomly selected from the range 0 to 30 to represent its length. This yields continuous masks with a degree of randomness, with missing data represented by 0. (2) Replacing masked values with the mean. The mean is used instead of 0 to indicate missing data. The basic method is the same as in the random mask case: after the continuous mask operation is completed, each position that is 0 in the time series data and at which the input value x is not 0 is replaced with the corresponding mean.
In the specific experiment, the mask operation on the initial satellite time series data is first completed according to the 0-replacement method. To avoid confusion, positions that were originally 0 in the satellite time series data are treated as not masked. Each remaining 0 position is then replaced with the mean of the corresponding input time series. (3) Deleting masked values. A fixed set of positions, a fixed mask length, and a fixed number of masks are selected to perform the continuous mask operation, which ensures that all time series have the same length after the masked values are deleted. After the continuous mask operation, the fixed masked positions are deleted from the time series data. In the specific experiment, 10 fixed missing positions are selected for each satellite time series input, and the missing length at each position is 10, so that each input has continuous missing runs with a total length of 100. These continuous missing positions are then deleted from the current satellite time series data.
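The three continuous-mask strategies above can be sketched as follows. The names `continuous_mask`, `fill_missing`, and `delete_fixed_runs`, the evenly spaced start positions, and the synthetic series are assumptions made for illustration; the source gives only the ranges for c, r, and l and the fixed 10 × 10 deletion layout.

```python
import random

def continuous_mask(length=600, max_runs=100, max_run_len=30, seed=None):
    """Build a {0,1} mask with randomly placed runs of missing values (0)."""
    rng = random.Random(seed)
    mask = [1] * length
    runs = rng.randint(0, max_runs)            # c: number of continuous masks
    for _ in range(runs):
        start = rng.randrange(length)          # r: starting position
        run_len = rng.randint(0, max_run_len)  # l: length of this run
        for i in range(start, min(start + run_len, length)):
            mask[i] = 0
    return mask

def fill_missing(series, mask, strategy="zero"):
    """Strategies (1) and (2): mark missing values with 0 or the series mean."""
    fill = 0.0 if strategy == "zero" else sum(series) / len(series)
    # A position counts as missing only if it is masked and was not originally 0.
    return [fill if (m == 0 and x != 0) else x for x, m in zip(series, mask)]

def delete_fixed_runs(series, starts, run_len=10):
    """Strategy (3): delete fixed runs so every series keeps the same length."""
    dropped = {i for s in starts for i in range(s, s + run_len)}
    return [x for i, x in enumerate(series) if i not in dropped]

series = [1.0 + (i % 5) for i in range(600)]   # synthetic stand-in, no zeros
mask = continuous_mask(seed=1)
zero_filled = fill_missing(series, mask, "zero")
mean_filled = fill_missing(series, mask, "mean")
shortened = delete_fixed_runs(series, starts=range(0, 600, 60))  # 10 runs of 10
print(len(zero_filled), len(mean_filled), len(shortened))  # 600 600 500
```

In this sketch the mean is taken over the original, unmasked series; the source does not state whether masked points are excluded when the mean is computed.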
After the mask operation is completed, the data preprocessing module is finished, and the satellite time series data now contains a certain amount of missing data. 20 percent of the current training data set with missing data is selected as the test data set for the subsequent training and testing process.
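A 20 percent test selection could look like the sketch below. It assumes a disjoint hold-out split, which the source does not state explicitly; `split_train_test` and its arguments are hypothetical names.

```python
import random

def split_train_test(samples, test_ratio=0.2, seed=None):
    """Shuffle sample indices and hold out test_ratio of them for testing."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    n_test = int(round(len(samples) * test_ratio))
    test = [samples[i] for i in indices[:n_test]]
    train = [samples[i] for i in indices[n_test:]]
    return train, test

samples = list(range(100))  # stand-in for 100 preprocessed inputs
train, test = split_train_test(samples, test_ratio=0.2, seed=0)
print(len(train), len(test))  # 80 20
```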
As can be seen from FIG. 2, the core of the invention consists of two parts, the mask and the time interval. In the structure diagram, the time interval is represented by a decay (attenuation) term, which is added to the input and to the output of each network module unit in order to measure the time interval between adjacent time series inputs. On the basis of this structure, a framework is constructed that uses the neural network module to predict parameter modes in satellite time series data. In the prediction framework, the input at each time step contains three parts: the input value, the mask, and the time interval. Each network module unit is implemented according to the specific calculation process; after a certain number of time steps a complete neural network model is formed, and the output of the last network module unit is taken as the final output value of the neural network.
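How a time interval and a decay factor can be derived from the mask is sketched below. The patent's own decay formula is not reproduced in this text, so the sketch borrows the formulation of the GRU-D model (Che et al., "Recurrent Neural Networks for Multivariate Time Series with Missing Values"): the interval grows while preceding points are missing, and the decay is exp(-max(0, w·δ + b)). The function names, the unit sampling step, and the scalar weight and bias are assumptions.

```python
import math

def time_intervals(mask, step=1.0):
    """Time since the last observed value; 0 at the initial time step."""
    delta = [0.0]
    for t in range(1, len(mask)):
        if mask[t - 1] == 1:
            delta.append(step)                 # previous point was observed
        else:
            delta.append(step + delta[t - 1])  # gap keeps growing
    return delta

def decay(delta_t, weight, bias):
    """Decay factor in (0, 1]; larger gaps give stronger decay (GRU-D form)."""
    return math.exp(-max(0.0, weight * delta_t + bias))

mask = [1, 1, 0, 0, 1, 1]
delta = time_intervals(mask)
print(delta)  # [0.0, 1.0, 1.0, 2.0, 3.0, 1.0]
print([round(decay(d, weight=0.5, bias=0.0), 3) for d in delta])
```

A factor of this kind can scale both the imputed input and the previous hidden state, so that information from a long-ago observation counts for less.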
The constructed mode prediction network is then used to predict on the preprocessed satellite time series data containing missing values. The data is reshaped, a fixed time step and number of middle-layer units are set, and each input is processed recurrently over the time steps. Because the model captures long-term temporal dependence, the output of the last network module unit is taken as the output of the whole network. This output is passed through a linear layer and a softmax activation function to give a 4-dimensional tensor whose entries are the predicted probabilities of the 4 different modes of the selected satellite load power parameter. The index of the largest probability is the mode class that the neural network predicts for that input.
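The output stage described above (linear layer, softmax, index of the largest probability) can be sketched as follows. The hidden vector, weights, and bias are made-up illustrative values, and a hidden size of 3 is used only to keep the example short.

```python
import math

def linear(hidden, weights, bias):
    """Affine map from the last hidden state to one logit per mode."""
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(weights, bias)]

def softmax(logits):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_mode(hidden, weights, bias):
    """Return the 4 mode probabilities and the index of the most likely mode."""
    probs = softmax(linear(hidden, weights, bias))
    return probs, max(range(len(probs)), key=probs.__getitem__)

hidden = [0.2, -0.1, 0.4]                                 # illustrative hidden state
weights = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]]    # 4 modes x 3 hidden units
bias = [0.0, 0.0, 0.0, 0.0]
probs, mode = predict_mode(hidden, weights, bias)
print(mode)  # 2
```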
Based on the predicted pattern type, the input data is subjected to abnormal pattern detection.
For each input time series with missing data in the satellite load power parameter, training parameters are set for the 4 different modes in the parameter, and the 4 modes are each trained so that the model can predict all 4 of them. Beyond these classes, an input may contain an abnormal mode that differs from the existing modes. The abnormal mode detection process is carried out for such inputs.
On the test data set, the trained model predicts the mode class of each input, and the prediction result is used to judge whether the input contains an abnormal mode. If the values in the output 4-dimensional probability vector differ clearly, with one value close to 1 and the others close to 0, the model can predict the mode class well and the data is judged to be normal. If the values in the output 4-dimensional probability vector are close to one another with no obvious difference, the model cannot predict the mode class of the input, the data is judged to be abnormal, and the mode is an abnormal mode.
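One way to turn this decision rule into code is to compare the two largest probabilities: a clearly peaked vector indicates a recognized mode, while a flat vector indicates an abnormal mode. The margin threshold of 0.5 is a hypothetical value chosen for illustration; the source gives no numeric criterion.

```python
def is_abnormal(probs, min_margin=0.5):
    """Flag an input as abnormal when no mode clearly dominates."""
    ranked = sorted(probs, reverse=True)
    # Margin between the most likely and the second most likely mode.
    return (ranked[0] - ranked[1]) < min_margin

print(is_abnormal([0.97, 0.01, 0.01, 0.01]))  # False
print(is_abnormal([0.30, 0.25, 0.25, 0.20]))  # True
```

In practice the threshold would need to be tuned on data with known normal and abnormal inputs.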
The mode class prediction is carried out on every input of the test set: the mode class of each input is predicted, abnormal modes in the data are found, and the abnormal portions are displayed with a graphical visualization tool. This realizes the method for detecting abnormal parameter modes in missing satellite data.
Taking the load power parameter of the satellite as an example, the satellite load power parameter subjected to the random mask operation in fig. 4 is selected for explanation.
The initial data is first preprocessed. Each sample is shaped to 1 × 600, and only the portion of the data representing the satellite load power is selected, forming the initial training data set. According to the 4 different modes of the satellite load power parameter, the initial training data set is divided into 4 data sets, each representing a different mode type, and the corresponding mode type labels are added to them. During division, the length of each sample is set to 100, and the divided samples are shuffled, completing the preliminary processing of the initial training data set.
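The division into labelled, shuffled samples of length 100 can be sketched as follows. The assumption that each 600-point series belongs to a single mode, the dictionary layout `series_by_mode`, and the function names are introduced here for illustration only.

```python
import random

def segment(series, seg_len=100):
    """Cut one series into consecutive, non-overlapping segments of seg_len."""
    return [series[i:i + seg_len]
            for i in range(0, len(series) - seg_len + 1, seg_len)]

def build_dataset(series_by_mode, seg_len=100, seed=None):
    """Attach the mode label to every segment, then shuffle the samples."""
    rng = random.Random(seed)
    samples = []
    for label, series_list in series_by_mode.items():
        for series in series_list:
            samples.extend((seg, label) for seg in segment(series, seg_len))
    rng.shuffle(samples)
    return samples

# Synthetic stand-in: one constant 600-point series per mode (labels 0 to 3).
series_by_mode = {mode: [[float(mode)] * 600] for mode in range(4)}
dataset = build_dataset(series_by_mode, seed=0)
print(len(dataset))  # 24
```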
Thereafter, a random mask is added to the current data, and the mean is used to replace the missing values. Masks are added at random positions covering 10% of the current data, and each masked value is represented by the mean of the corresponding sample. This completes the preprocessing of the training data set. A portion of the current training data set is then selected as the test data set, completing the division into training and test data sets.
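The 10% random mask with mean replacement used in this embodiment can be sketched as follows. The function name, the returned mask, and the choice to compute the mean from the unmasked sample are assumptions.

```python
import random

def random_mask_mean(sample, ratio=0.10, seed=None):
    """Mask a fixed ratio of positions and fill them with the sample mean."""
    rng = random.Random(seed)
    n_masked = int(len(sample) * ratio)
    mask = [1] * len(sample)
    for position in rng.sample(range(len(sample)), n_masked):
        mask[position] = 0                     # 0 marks a missing value
    mean = sum(sample) / len(sample)
    filled = [x if m == 1 else mean for x, m in zip(sample, mask)]
    return filled, mask

sample = [float(i) for i in range(100)]        # synthetic length-100 sample
filled, mask = random_mask_mean(sample, ratio=0.10, seed=0)
print(sum(mask))  # 90
```

Returning the mask alongside the filled values keeps the missing positions available as a separate network input.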
The new neural network provided by the invention is then built. For the data of this embodiment, the parameters are set as follows: the time step is set to 5, the number of units in the middle layer is 20, and the current training and test data are correspondingly reshaped to 5 × 20. The network model is trained with the number of training rounds set to 5000, the learning rate set to 0.001, and batch_size set to 100. The training loss is computed with MSE, and the Adam optimizer is used to minimize it. The loss value and accuracy are observed in each training round, and the results show that the network can accurately predict the mode type.
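The reshaping and the loss named above can be sketched as follows. Reading "5 × 20" as 5 time steps of 20 values each per length-100 sample is an interpretation, and comparing the 4 mode probabilities with a one-hot label under MSE is likewise an assumption; the `CONFIG` keys are hypothetical names for the hyperparameters stated in the text.

```python
CONFIG = {
    "time_steps": 5,
    "hidden_units": 20,
    "epochs": 5000,
    "learning_rate": 0.001,
    "batch_size": 100,
}

def to_steps(sample, steps=5, width=20):
    """Reshape a flat length-100 sample into 5 time steps of 20 values."""
    if len(sample) != steps * width:
        raise ValueError("sample length must equal steps * width")
    return [sample[i * width:(i + 1) * width] for i in range(steps)]

def one_hot(label, n_classes=4):
    """Encode a mode label as a one-hot target vector."""
    return [1.0 if i == label else 0.0 for i in range(n_classes)]

def mse(predicted, target):
    """Mean squared error between predicted probabilities and the target."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)

steps = to_steps(list(range(100)), CONFIG["time_steps"], 20)
print(len(steps), len(steps[0]))               # 5 20
print(mse([0.5, 0.5, 0.0, 0.0], one_hot(0)))   # 0.125
```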
The trained neural network model is then used to perform mode prediction on the test data set, and the prediction results are output. The results show good test performance, indicating that the model can predict the mode type of the parameter well.
In addition, adding the mask may cause some data in the data set to exhibit modes that differ from the existing ones. In this case, if the mode that the model predicts for an input differs from the existing modes, or the prediction result differs clearly from them, the input is judged to have become abnormal when the mask was added, and the abnormal modes in the input are output and displayed with a graphical visualization tool.
One preprocessed data set with random masks added is selected as input. The previously trained mode prediction network model performs mode prediction on the input data according to the 4 different modes of the satellite load power parameter, outputs the predicted mode type, outputs the abnormal modes in the input data, and displays them graphically.
Referring to FIG. 6, which shows the abnormal mode detection result for satellite load power parameter data with random masks, the red frame marks the mode in which an abnormality was detected in the input data. The graphical display shows that an abnormal mode different from the existing 4 modes is generated in the input data.
With this neural network model, the mode type of a given input can be predicted accurately, and abnormal modes corresponding to different parameters in time series data with missing values can be detected.

Claims (8)

1. A method for detecting abnormal pattern of parameters of missing satellite data, comprising:
data preprocessing, namely normalizing and standardizing the satellite time series data;
anomaly detection, namely loading the satellite time series data containing missing values based on an artificial neural network model, setting a batch processing parameter, grouping the satellite time series data according to the parameter value, and then performing anomaly detection on each group of satellite time series data containing missing values;
and result visualization, namely storing the detection result and displaying the anomaly detection result graphically.
2. The method for detecting abnormal pattern of parameters in missing satellite data as claimed in claim 1, wherein said data preprocessing comprises: loading a data set and generating satellite time series data containing a large amount of missing data, wherein the complete satellite time series data is set to have a large number of random or continuous missing values, thereby realizing the preprocessing of the satellite time series data.
3. The method for detecting abnormal pattern of parameters of missing satellite data as claimed in claim 1, wherein said artificial neural network model is improved and optimized with reference to a recurrent neural network module, on one hand, masking is used to represent data missing; on the other hand, the time interval is used for representing the time difference between the observed adjacent two satellite time sequence data; for arbitrary input time series data, masking and time intervals are applied to the input of the model and the network state, long-term time dependency on time series data observation is captured, and detection results are improved.
4. The method for detecting abnormal pattern of parameters of missing satellite data as claimed in claim 3, wherein said masking is: if the input value at the time t is not missing, setting the masking value to be 1, otherwise, setting the masking value to be 0 so as to represent that the input value is missing; the value of the time interval changes along with the change of the number of the missing values, the time interval is measured by an attenuation term, and the attenuation term is added to the input part and the output part of the single neural network module unit respectively so as to measure the time interval between the adjacent time sequence data input.
5. The method for detecting abnormal pattern of parameters of missing satellite data as claimed in claim 4, wherein said neural network module is calculated as follows:
[Formula image FDA00027417594000000210]
in the formula, the symbol shown in [formula image FDA0002741759400000029] represents the attenuation term corresponding to the output of the neural network module unit at time t, and acts on the output of the hidden module unit; δ_t represents the value of the time interval, and the value of δ_t at the initial time is 0; the matrices W_r and W_{h_r} and the vector b_r are model parameters respectively representing the initialized attenuation weight, the attenuation weight corresponding to a single neural network module unit, and the initialized attenuation bias;
the quantity shown in [formula image FDA0002741759400000028] is specified as the form shown in [formula image FDA0002741759400000026], in which only the diagonal entries are variables and all other positions are 0, and the symbol shown in [formula image FDA0002741759400000027] represents the attenuation ratio of the attenuation term in the direction of the input x at time t;
[Formula image FDA0002741759400000021]
[Formula image FDA0002741759400000022]
s_t = σ(W_s x'_t + U_s h'_{t-1} + V_s m_t + b_s)   (4)
z_t = σ(W_z x'_t + U_z h'_{t-1} + V_z m_t + b_z)   (5)
[Formula image FDA0002741759400000023]
[Formula image FDA0002741759400000024]
in the above formulas, m_t represents the masking value, which acts only on the input at time t and indicates whether a value is missing at time t; x_t represents the input at time t; h_{t-1} represents the output value of the network module unit at time t-1; and h_t, obtained at the end of the calculation, is the final output value of a single neural network module unit at time t, called the output value of the hidden module unit of the model; the matrices W_s, W_z, W, U_s, U_z, U, V_s, V_z, V and the vectors b_s, b_z, b are model parameters; σ represents the sigmoid activation function; the operator shown in [formula image FDA0002741759400000025] represents element-wise multiplication; s_t, z_t and h'_t are intermediate calculated quantities; x'_t is calculated from x_t and replaces the true input value at each time; and h'_{t-1} is calculated from h_{t-1} and replaces the output value of the neural network module unit at that time.
6. The method for detecting abnormal pattern of parameters of missing satellite data according to claim 3 or 4 or 5, wherein said masking comprises random masking and continuous masking, wherein:
randomly masking, namely randomly selecting missing positions to represent masking, setting the proportion of missing, and randomly masking initial complete satellite time sequence data according to the proportion;
continuous masking, namely randomly selecting the number of times continuous masking occurs, wherein each masking randomly selects a position on the time series data and a random length, and all values within that length starting from the position are missing.
7. The method of claim 1, wherein the result is graphically visualized, and the visualization tool is used to display the satellite timing data, including non-missing data, random mask missing data, continuous mask missing data, and abnormal pattern detection result.
8. The method of claim 1, wherein the pattern detected by the model is not classified as an existing data pattern, and all abnormal patterns are defined as a new class when detecting.
CN202011152977.7A 2020-10-26 2020-10-26 Method for detecting abnormal parameter mode of missing satellite data Active CN112101482B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011152977.7A CN112101482B (en) 2020-10-26 2020-10-26 Method for detecting abnormal parameter mode of missing satellite data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011152977.7A CN112101482B (en) 2020-10-26 2020-10-26 Method for detecting abnormal parameter mode of missing satellite data

Publications (2)

Publication Number Publication Date
CN112101482A true CN112101482A (en) 2020-12-18
CN112101482B CN112101482B (en) 2022-05-06

Family

ID=73786019

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011152977.7A Active CN112101482B (en) 2020-10-26 2020-10-26 Method for detecting abnormal parameter mode of missing satellite data

Country Status (1)

Country Link
CN (1) CN112101482B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113077357A (en) * 2021-03-29 2021-07-06 国网湖南省电力有限公司 Power time sequence data abnormity detection method and filling method thereof
CN113077357B (en) * 2021-03-29 2023-11-28 国网湖南省电力有限公司 Power time sequence data anomaly detection method and filling method thereof
CN113949656A (en) * 2021-10-15 2022-01-18 任桓影 Security protection network monitoring system based on artificial intelligence
CN113949656B (en) * 2021-10-15 2022-11-04 国家电投集团江西电力有限公司景德镇发电厂 Security protection network monitoring system based on artificial intelligence

Also Published As

Publication number Publication date
CN112101482B (en) 2022-05-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant