CN107832170B - Method and device for recovering missing data - Google Patents

Method and device for recovering missing data

Publication number
CN107832170B
CN107832170B CN201711047191.7A CN201711047191A CN107832170B CN 107832170 B CN107832170 B CN 107832170B CN 201711047191 A CN201711047191 A CN 201711047191A CN 107832170 B CN107832170 B CN 107832170B
Authority
CN
China
Prior art keywords
matrix
data
scada data
factor
missing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711047191.7A
Other languages
Chinese (zh)
Other versions
CN107832170A (en)
Inventor
张光磊
刘源
邱忠营
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Goldwind Science and Creation Windpower Equipment Co Ltd
Original Assignee
Beijing Goldwind Science and Creation Windpower Equipment Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Goldwind Science and Creation Windpower Equipment Co Ltd
Priority to CN201711047191.7A
Publication of CN107832170A
Application granted
Publication of CN107832170B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1443 Transmit or communication errors

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a device for recovering missing data. The method comprises the following steps: acquiring a plurality of groups of data; performing probability matrix decomposition on a numerical matrix formed from the plurality of groups of data; determining the positions of missing data in the plurality of groups of data; computing, from the result of the probability matrix decomposition, the product of the elements corresponding to each position of missing data, and taking that product as the missing data; and restoring the computed missing data to the positions of the missing data in the plurality of groups of data.

Description

Method and device for recovering missing data
Technical field
The present invention relates to the field of data processing and, more particularly, to a method and a device for recovering missing data.
Background art
In the field of data processing, processing generally has to be carried out on complete data.
Take data compression as an example. Compression techniques fall into two broad classes, lossless and lossy. Data compression based on principal component analysis (PCA) is a lossy compression algorithm: it removes redundancy according to the linear correlation between different variables, thereby achieving dimensionality reduction and data compression. However, most current PCA-based compression algorithms need to select a batch of data in advance for principal component analysis, and when newly generated data cannot be reconstructed well from the current principal components, the principal components have to be updated.
That is, when the data are incomplete, for example because of data transmission errors, principal component analysis cannot be carried out directly; generally the incomplete part of the data has to be removed before the principal components are computed. This simple treatment may lose some of the data patterns, so that the resulting principal components are inaccurate, which leads to large reconstruction errors.
Moreover, this problem exists not only in data compression but also in other data processing techniques.
Summary of the invention
The present invention has been made in view of the above problem, and its object is to provide a method and a device for recovering missing data.
According to one aspect of the present invention, a method for recovering missing data is provided, comprising: obtaining a plurality of groups of data; performing probability matrix decomposition on the numerical matrix composed of the plurality of groups of data; determining the positions of the data missing from the plurality of groups of data; computing, in the result of the probability matrix decomposition, the product of the elements corresponding to the positions of the missing data, as the missing data; and restoring the computed missing data to the positions of the data missing from the plurality of groups of data.
According to another aspect of the present invention, a device for recovering missing data is provided, comprising: a data acquisition unit, which obtains a plurality of groups of data; a probability matrix decomposition unit, which performs probability matrix decomposition on the numerical matrix composed of the plurality of groups of data; a missing position determination unit, which determines the positions of the data missing from the plurality of groups of data; a missing data calculation unit, which computes, in the decomposition result of the probability matrix decomposition unit, the product of the elements corresponding to the positions of the missing data, as the missing data; and a data recovery unit, which restores the missing data computed by the missing data calculation unit to the positions of the data missing from the plurality of groups of data.
According to another aspect of the present invention, a computer-readable medium is provided, which stores a computer program that, when executed by a processor, implements the steps of the above method for recovering missing data.
According to another aspect of the present invention, a computer device is provided, comprising: a processor; and a memory storing a computer program executable on the processor, wherein the steps of the above method for recovering missing data are implemented when the computer program is executed by the processor.
According to the present invention, probability matrix decomposition (Probabilistic Matrix Factorization, PMF) performs its iterative computation using the known part of the data, so the missing part of the data can be reconstructed from the known part. In this way, no data patterns are lost.
Brief description of the drawings
Fig. 1 is a flowchart of the method for recovering missing data according to Embodiment one of the present invention.
Fig. 2 is a flowchart of the method for recovering missing data according to Embodiment two of the present invention.
Fig. 3 is a block diagram of the device for recovering missing data according to Embodiment three of the present invention.
Fig. 4 is a block diagram of the device for recovering missing data according to Embodiment four of the present invention.
Detailed description of the embodiments
Hereinafter, embodiments of the present invention are described with reference to the drawings.
In the present invention, the missing data in a plurality of groups of data are recovered by analyzing the groups with probability matrix decomposition.
Note that, in the present invention, the plurality of groups of data consists of two or more groups, each containing multiple data items; the data type of these items is numeric or convertible to numeric, and each group preferably contains the same number of data items.
Embodiment one
In this embodiment, it is assumed that the plurality of groups of data contains missing data.
Fig. 1 is a flowchart of the method for recovering missing data according to Embodiment one of the present invention.
Referring to Fig. 1, first, in step S110, a plurality of groups of data is obtained and assembled into a corresponding numerical matrix. Specifically, the groups of data are obtained from a data source. In one embodiment, the data source is one or more monitoring devices; that is, in this step, groups of monitoring data are obtained in chronological order from the one or more monitoring devices as the plurality of groups of data.
As an example, suppose the plurality of groups of data is the SCADA (Supervisory Control And Data Acquisition) data shown in Table 1 below. In this step, the groups of data are obtained in chronological order from multiple sensors serving as monitoring devices and assembled into the numerical matrix A shown in formula (1). Each row of the matrix A represents the SCADA data at one moment, and each column represents the measurement results of one sensor.
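Table 1
Date-time             Sensor 1   Sensor 2   ……   Sensor n
2016/3/15 15:25:36    0.5        0.2        ……   0.9
2016/3/15 15:25:45    0.4        0.2        ……   ?
2016/3/15 15:25:52    0.1        ?          ……   0.7
2016/3/15 15:25:58    0.9        0.4        ……   0.2
2016/3/15 15:26:06    0.2        0.0        ……   0.1
Here, "?" denotes a missing value.
$$A = \begin{pmatrix} 0.5 & 0.2 & \cdots & 0.9 \\ 0.4 & 0.2 & \cdots & ? \\ 0.1 & ? & \cdots & 0.7 \\ 0.9 & 0.4 & \cdots & 0.2 \\ 0.2 & 0.0 & \cdots & 0.1 \end{pmatrix} \qquad (1)$$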
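As a minimal sketch of how the data of Table 1 might be held in memory, the following Python snippet builds the numerical matrix with NumPy. Only the three sensor columns printed in Table 1 are used, and representing "?" by `np.nan` is an assumption of this sketch, not something the patent prescribes.

```python
import numpy as np

# Values copied from Table 1 (sensor 1, sensor 2, sensor n).
# np.nan stands in for the "?" entries (an assumption of this sketch).
A = np.array([
    [0.5, 0.2,    0.9],
    [0.4, 0.2,    np.nan],
    [0.1, np.nan, 0.7],
    [0.9, 0.4,    0.2],
    [0.2, 0.0,    0.1],
])

n_moments, n_sensors = A.shape          # rows are moments, columns are sensors
n_missing = int(np.isnan(A).sum())      # count of "?" entries
print(n_moments, n_sensors, n_missing)
```

Using NaN as the marker keeps the matrix purely floating-point, which matches the data type conversion described for step S110.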
The above example shows the case where the plurality of groups of data is SCADA data and the data are inherently numeric. In fact, however, depending on the sensor data type, SCADA data may include both numeric and enumerated types; the numeric type divides further into integer and floating-point, and the enumerated type divides further into Boolean and categorical.
Therefore, so that missing data can be recovered from the acquired data, step S110 also preprocesses the plurality of groups of data by data type conversion as needed: non-numeric variables are converted into numeric variables (for example, Boolean variables are represented by 0 and 1), and integer variables are then converted into floating-point variables. After the missing data have been recovered, the floating-point variables are converted back to the original data types by reversing this data type conversion.
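The type conversion and its reversal could look like the sketch below. The helper names `to_float_column` and `from_float_column` are hypothetical, and the rules used on the way back (a 0.5 threshold for Boolean values, rounding to the nearest integer) are assumptions of this sketch; the patent only states that the conversion is reversed.

```python
import numpy as np

def to_float_column(values):
    """Convert a Boolean or integer column to float64 and remember its dtype."""
    arr = np.asarray(values)
    return arr.astype(np.float64), arr.dtype

def from_float_column(values, original_dtype):
    """Convert a recovered float column back to its original dtype."""
    arr = np.asarray(values, dtype=np.float64)
    if original_dtype == np.bool_:
        return arr >= 0.5                             # assumed threshold
    if np.issubdtype(original_dtype, np.integer):
        return np.rint(arr).astype(original_dtype)    # assumed rounding rule
    return arr.astype(original_dtype)

flags, flags_dtype = to_float_column([True, False, True])     # Boolean -> 1.0 / 0.0
counts, counts_dtype = to_float_column([3, 7, 12])            # integer -> float
restored_flags = from_float_column([0.9, 0.1, 0.6], flags_dtype)
restored_counts = from_float_column([2.8, 7.2, 12.0], counts_dtype)
```

Categorical variables would additionally need a mapping from categories to numbers, which is left out here.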
In addition to the data type conversion, this step may also normalize the plurality of groups of data as actually needed. Taking SCADA data as an example, normalization linearly transforms the data of each sensor into the range 0 to 1, to prevent rounding errors from affecting some fields to different degrees. In practice, mean removal is generally sufficient to achieve normalization: for SCADA data, the mean of all data generated by a sensor is subtracted from each data item of that sensor; for other types of data, the mean of each column is subtracted from every data item in that column. Similarly, after the missing data have been recovered, the normalization must be reversed, so key information used in normalization, such as the mean, maximum and minimum of the data, should be saved during the process.
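A minimal sketch of the 0 to 1 normalization and its reversal follows. The function names and the choice to compute the statistics with NaN-aware reductions (so that missing entries are ignored) are assumptions of this sketch.

```python
import numpy as np

def normalize(A):
    """Scale each column to [0, 1] using only the known (non-NaN) entries."""
    col_min = np.nanmin(A, axis=0)
    col_max = np.nanmax(A, axis=0)
    # Guard against constant columns, where max equals min.
    span = np.where(col_max > col_min, col_max - col_min, 1.0)
    stats = {"min": col_min, "span": span}   # saved for the reverse step
    return (A - col_min) / span, stats

def denormalize(A_norm, stats):
    """Reverse the normalization with the saved statistics."""
    return A_norm * stats["span"] + stats["min"]

raw = np.array([[10.0, 200.0],
                [20.0, np.nan],
                [30.0, 400.0]])
scaled, stats = normalize(raw)
restored = denormalize(scaled, stats)
```

Mean removal would follow the same pattern, with `np.nanmean` in place of the minimum and no division.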
Note that although the above example shows the case where the plurality of groups of data is SCADA data, the invention is not limited to this. In the present invention the data may come from various sources: data correlated in time, such as human height and weight data or economic growth data; data correlated in space; or even data that are not correlated with one another.
Then, in step S120, probability matrix decomposition is performed on the numerical matrix.
Probability matrix decomposition is a matrix decomposition method based on a probabilistic graphical model. Unlike the singular value decomposition of the prior art, it does not need to satisfy orthogonality, and the decomposed matrices are optimized iteratively by gradient descent.
Specifically, probability matrix decomposition is a decomposition of the form shown in formula (2) below: for the numerical matrix $A = \{a_{ij}\}$, a first factor matrix $U_k$ and a second factor matrix $V_k$ are solved, and the product of $U_k$ and the conjugate transpose $V_k^{*}$ of $V_k$ is taken as the result of the probability matrix decomposition of A.
$$A \approx U_k V_k^{*} \qquad (2)$$
Note that in formula (2) the first factor matrix $U_k$ is not necessarily a unitary matrix, whereas the second factor matrix $V_k$ is a unitary matrix; $V_k^{*}$ denotes the conjugate transpose of $V_k$.
As can be seen, the result of the probability matrix decomposition differs from the result $A = U \Sigma V^{*}$ of the singular value decomposition of the prior art in that the intermediate diagonal matrix $\Sigma$ is eliminated.
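To illustrate only the relationship between the two forms, the sketch below takes a complete rank-k matrix, computes its singular value decomposition with NumPy, and folds the diagonal matrix into the first factor. This is not how the patent obtains the factors (it solves for them iteratively, and singular value decomposition needs complete data); the example matrix is an arbitrary assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 2
A = rng.normal(size=(6, k)) @ rng.normal(size=(k, 4))   # complete rank-2 example

U, s, Vh = np.linalg.svd(A, full_matrices=False)
U_k = U[:, :k] * s[:k]        # singular values folded into the first factor
V_k = Vh[:k, :].conj().T      # second factor with orthonormal columns

same_product = np.allclose(A, U_k @ V_k.conj().T)             # A = U_k V_k^*
orthonormal = np.allclose(V_k.conj().T @ V_k, np.eye(k))      # V_k^* V_k = I
print(same_product, orthonormal)
```

With the diagonal matrix absorbed this way, the first factor is generally not unitary while the second factor keeps orthonormal columns, which is the shape of formula (2).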
Further, the basic idea of the probability matrix decomposition in the present invention is as follows: in the probability matrix decomposition of the numerical matrix A, the first factor matrix $U_k$ and the second factor matrix $V_k$ are solved so as to minimize an objective function of each element $a_{ij}$ of A and the corresponding elements of $U_k$ and $V_k$.
Specifically, a dimension is first determined, namely the number of principal components k, which may also be regarded as the first k columns of the numerical matrix A. Then $U_k$ and $V_k$ are solved iteratively so that the following objective function is minimized:
$$\phi = \sum_{(i,j)} \left( a_{ij} - u_i^{T} v_j \right)^2 + \lambda \left( \sum_i \lVert u_i \rVert^2 + \sum_j \lVert v_j \rVert^2 \right) \qquad (3)$$
where $u_i$ and $v_j$ are the transposes of the i-th row vector of $U_k$ and the j-th row vector of $V_k$, respectively, $\lambda$ is the weight coefficient of the regularization term, and the first sum runs over the positions $(i, j)$ of the known elements of A.
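Specifically, the probability matrix decomposition proceeds as follows:
(1) Randomly initialize the variables $u_i$ and $v_j$.
(2) Let $e_{ij} = a_{ij} - u_i^{T} v_j$ and compute the gradients $\partial\phi/\partial u_i$ and $\partial\phi/\partial v_j$ of the objective function $\phi$.
(3) Update $u_i$ and $v_j$ according to these gradients: $u_i \leftarrow u_i - \alpha\,\partial\phi/\partial u_i$ and $v_j \leftarrow v_j - \beta\,\partial\phi/\partial v_j$, where $\alpha$ and $\beta$ are preset step sizes.
(4) Compute the objective function value $\phi^{t+1}$.
(5) Repeat (3) and (4) until a predetermined convergence condition is reached, for example $\phi^{t+1} < \varepsilon$ or $|\phi^{t+1} - \phi^{t}| < \varepsilon$, where $\varepsilon$ is a preset threshold.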
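Steps (1) to (5) can be sketched in Python as below, for real-valued data. Several details are assumptions of this sketch rather than statements of the patent: the objective is taken to be a squared error over the known entries plus a squared-norm penalty weighted by `lam`, the per-entry gradients are derived from that assumed form, the parameters are updated one known entry at a time, and the function name `pmf_sgd` and all default values are hypothetical.

```python
import numpy as np

def pmf_sgd(A, k, lam=0.01, alpha=0.01, beta=0.01, eps=1e-8,
            max_epochs=2000, seed=0):
    """Factorize A ~ U @ V.T using only the known (non-NaN) entries of A."""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    U = 0.1 * rng.standard_normal((n, k))     # step (1): random initialization
    V = 0.1 * rng.standard_normal((m, k))
    known = np.argwhere(~np.isnan(A))         # (i, j) positions of known entries

    def objective():
        # Squared error over known entries plus the assumed regularization term.
        err = np.where(np.isnan(A), 0.0, A - U @ V.T)
        return (err ** 2).sum() + lam * ((U ** 2).sum() + (V ** 2).sum())

    phi = objective()
    for _ in range(max_epochs):
        for i, j in rng.permutation(known):   # one known entry per update
            e_ij = A[i, j] - U[i] @ V[j]      # step (2): residual
            grad_u = -2.0 * e_ij * V[j] + 2.0 * lam * U[i]
            grad_v = -2.0 * e_ij * U[i] + 2.0 * lam * V[j]
            U[i] -= alpha * grad_u            # step (3): gradient update
            V[j] -= beta * grad_v
        phi_new = objective()                 # step (4): new objective value
        if abs(phi_new - phi) < eps:          # step (5): convergence test
            break
        phi = phi_new
    return U, V
```

A call such as `U, V = pmf_sgd(A, k=2)` returns the two factors; entries of `A` that are NaN never enter an update, which is why the decomposition tolerates missing data.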
The above probability matrix decomposition can be implemented with, for example, the alternating least squares method, the Levenberg-Marquardt algorithm or the Wiberg algorithm.
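As one possible reading of the alternating least squares option, the sketch below fixes one factor and solves a small regularized least squares problem for each row of the other, using only the known entries. The function name `pmf_als`, the regularization form and the default values are assumptions of this sketch.

```python
import numpy as np

def pmf_als(A, k, lam=0.1, n_iters=50, seed=0):
    """Alternating least squares on the known (non-NaN) entries of A."""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    known = ~np.isnan(A)
    A0 = np.where(known, A, 0.0)              # placeholder values, never used as data
    U = rng.standard_normal((n, k))
    V = rng.standard_normal((m, k))
    reg = lam * np.eye(k)
    for _ in range(n_iters):
        for i in range(n):                    # fix V, solve for row i of U
            Vi = V[known[i]]
            U[i] = np.linalg.solve(Vi.T @ Vi + reg, Vi.T @ A0[i, known[i]])
        for j in range(m):                    # fix U, solve for row j of V
            Uj = U[known[:, j]]
            V[j] = np.linalg.solve(Uj.T @ Uj + reg, Uj.T @ A0[known[:, j], j])
    return U, V
```

Each inner solve is a k by k linear system, so no step size has to be chosen, at the price of more work per sweep than a single gradient update.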
In addition, as can be seen from the above, each iteration needs only one known data item to update the parameters, so probability matrix decomposition can decompose the numerical matrix A even if A contains missing data.
Then, in step S130, the positions of the data missing from the plurality of groups of data are determined.
In step S140, the product of the elements in the result of the probability matrix decomposition that correspond to each position of missing data is computed as the missing data.
Specifically, since the result of the probability matrix decomposition shown in formula (2) is $U_k V_k^{*}$, the missing data are obtained by multiplying the elements at the corresponding positions of the matrices $U_k$ and $V_k^{*}$ according to the positions of the missing data in the matrix A; that is, a missing element at position $(i, j)$ is obtained as the product of row $i$ of $U_k$ and column $j$ of $V_k^{*}$.
In step S150, the computed missing data are restored to the positions of the missing data in the plurality of groups of data. The completed plurality of groups of data is thereby obtained.
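Steps S130 to S150 can be sketched as follows, assuming missing entries are marked with NaN and that the factors come from a decomposition such as the one in step S120. The function name `recover_missing` and the small hand-made factors are hypothetical.

```python
import numpy as np

def recover_missing(A, U, V):
    """Fill the NaN entries of A with the matching entries of U @ V^*."""
    missing = np.isnan(A)                     # step S130: positions of missing data
    estimate = U @ V.conj().T                 # entry (i, j): row i of U times column j of V^*
    completed = A.copy()
    completed[missing] = estimate[missing]    # steps S140 and S150
    return completed, np.argwhere(missing)

U = np.array([[1.0], [2.0], [3.0]])
V = np.array([[0.5], [1.0]])
A = np.array([[0.5, 1.0],
              [1.0, np.nan],
              [np.nan, 3.0]])
completed, positions = recover_missing(A, U, V)
```

Known entries are left untouched; only the positions found in step S130 receive values from the product.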
According to the method for recovering missing data of this embodiment, because probability matrix decomposition needs only one known data item per iteration for the parameter update, the probability matrix decomposition of the numerical matrix can be performed with high precision even if the acquired plurality of groups of data contains missing data. The missing data are then computed from the result of the decomposition and the acquired groups of data are completed, so that complete data are available for other data processing.
Embodiment two
In this embodiment, the missing data in the plurality of groups of data are not only recovered; the plurality of groups of data is also compressed.
Fig. 2 is a flowchart of the method for recovering missing data according to Embodiment two of the present invention.
As shown in Fig. 2, in addition to steps S110 to S150 of Embodiment one, which recover the missing data, this embodiment includes steps S260 and S270, which compress and decompress the data. Steps S110 to S150 are not described again here.
In step S260, the plurality of groups of data is compressed using the result of the probability matrix decomposition.
Specifically, based on formula (4) below, the data are dimension-reduced and compressed by multiplying the result of the probability matrix decomposition of step S120 by the second factor matrix $V_k$ obtained in step S120:
$$B = U_k V_k^{*} V_k = U_k \qquad (4)$$
The matrix B obtained from formula (4) is the compressed data obtained by dimension-reduction compression of the numerical matrix A. In addition, because decompression of the matrix B requires the conjugate transpose $V_k^{*}$ of the second factor matrix $V_k$, this matrix must be saved.
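The compression step relies on the second factor matrix having orthonormal columns, so that its conjugate transpose times itself is the identity. The patent does not say how this property is ensured; the sketch below assumes real-valued factors and uses a QR decomposition to rewrite the same product with an orthonormal second factor, then applies the multiplication of step S260. The function names are hypothetical.

```python
import numpy as np

def orthonormalize(U, V):
    """Rewrite U @ V.T as U2 @ V2.T where V2 has orthonormal columns."""
    Q, R = np.linalg.qr(V)          # V = Q @ R, Q has orthonormal columns
    return U @ R.T, Q               # U @ (Q R).T = (U R.T) @ Q.T

def compress(U, V):
    """Multiply the decomposition result U @ V.T by the second factor matrix."""
    return (U @ V.T) @ V

rng = np.random.default_rng(1)
U = rng.standard_normal((8, 2))     # arbitrary example factors
V = rng.standard_normal((5, 2))
U2, V2 = orthonormalize(U, V)
B = compress(U2, V2)                # 8 x 2 compressed data instead of 8 x 5
```

After orthonormalization the compressed matrix coincides with the first factor, so in practice the first factor could be stored directly together with the second.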
Then, in step S270, the compressed data are decompressed when needed.
Specifically, as can be seen from formula (4), after the dimension-reduction compression only the first factor matrix $U_k$ remains (generally $k \ll m$, where m is the number of columns of A), so for decompression and reconstruction it is simply multiplied by the conjugate transpose $V_k^{*}$ of the second factor matrix $V_k$ to obtain the decompressed data. The data compressed in step S260 are therefore decompressed according to formula (5) below:
$$\hat{A} = B V_k^{*} = U_k V_k^{*} \qquad (5)$$
$\hat{A}$ is the decompressed matrix.
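A minimal round trip through compression and decompression might look like this. The small factors are made up for illustration, and the second factor is given orthonormal columns through a QR decomposition, which is an assumption of this sketch.

```python
import numpy as np

def decompress(B, V):
    """Multiply the compressed data by the conjugate transpose of V."""
    return B @ V.conj().T

# Second factor with orthonormal columns (obtained here via QR, an assumption).
V_k = np.linalg.qr(np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]))[0]
U_k = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]])

A_hat = U_k @ V_k.T             # completed matrix given by the decomposition
B = A_hat @ V_k                 # compressed data, 4 x 2
A_restored = decompress(B, V_k)
```

The decompressed matrix equals the product of the two factors, so any information in the original data that the factors do not capture is not restored; this is the lossy part of the scheme.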
In addition, in the decompression step S270, after decompression the inverse of the data preprocessing of step S110 must also be performed to transform the decompressed data back to the original data types.
Note that steps S260 and S270 only need to be executed after step S120; they do not necessarily have to be executed after step S150.
The method for recovering missing data of this embodiment not only recovers the missing data in the plurality of groups of data so as to provide complete data, but also achieves dimension-reduction compression of groups of data that contain missing data, without losing data patterns and hence without causing large reconstruction errors. Furthermore, because groups of data containing missing data can be compressed substantially, storage space and transmission cost are saved.
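As a rough illustration of the saving, assume hypothetical sizes that are not taken from the patent: n = 10000 moments, m = 100 sensors and k = 10. The uncompressed matrix holds nm values, while the compressed form holds the n by k matrix B plus the m by k second factor matrix needed for decompression, which gives a ratio of

$$\frac{nm}{k(n+m)} = \frac{10000 \times 100}{10 \times (10000 + 100)} = \frac{1\,000\,000}{101\,000} \approx 9.9$$

The ratio approaches m/k as the number of rows grows, since the second factor matrix is stored only once.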
Based on the same inventive concept, the present invention provides devices corresponding to the methods of Embodiment one and Embodiment two, which are described separately below.
Embodiment three
Fig. 3 is a block diagram of the device for recovering missing data according to Embodiment three of the present invention.
As shown in Fig. 3, the device 300 for recovering missing data of this embodiment includes a data acquisition unit 310, a probability matrix decomposition unit 320, a missing position determination unit 330, a missing data calculation unit 340 and a data recovery unit 350.
The data acquisition unit 310 obtains a plurality of groups of data and assembles them into a corresponding numerical matrix. Specifically, the data acquisition unit 310 obtains the groups of data from a data source. In one embodiment, the data source is one or more monitoring devices; that is, the data acquisition unit 310 obtains groups of monitoring data in chronological order from the one or more monitoring devices as the plurality of groups of data.
In addition, as needed, the data acquisition unit 310 also preprocesses the plurality of groups of data, for example by data type conversion and normalization, and saves key information used in normalization, such as the mean, maximum and minimum of the data.
The probability matrix decomposition unit 320 performs probability matrix decomposition on the numerical matrix. Specifically, for the numerical matrix $A = \{a_{ij}\}$, the probability matrix decomposition unit 320 solves a first factor matrix $U_k$ and a second factor matrix $V_k$, and takes the product of $U_k$ and the conjugate transpose $V_k^{*}$ of $V_k$ as the result of the probability matrix decomposition of A. Further, what the probability matrix decomposition unit 320 solves in the probability matrix decomposition of A is the first factor matrix $U_k$ and the second factor matrix $V_k$ that minimize an objective function of each element $a_{ij}$ of A and the corresponding elements of $U_k$ and $V_k$. More specifically, the probability matrix decomposition unit 320 performs probability matrix decomposition on A according to formula (3) above and obtains a decomposition result of the form shown in formula (2). The process by which the probability matrix decomposition unit 320 performs the decomposition is the same as the process of step S120 in Embodiment one, and a detailed description is omitted here.
In this embodiment, it is assumed that the plurality of groups of data obtained by the data acquisition unit 310 contains missing data.
The missing position determination unit 330 determines the positions of the data missing from the plurality of groups of data obtained by the data acquisition unit 310.
The missing data calculation unit 340 computes, in the decomposition result of the probability matrix decomposition unit 320, the product of the elements corresponding to the positions of the data missing from the plurality of groups of data, as the missing data. Specifically, since the result of the probability matrix decomposition shown in formula (2) above is $U_k V_k^{*}$, the missing data calculation unit 340 multiplies the elements at the corresponding positions of the matrices $U_k$ and $V_k^{*}$ obtained by the probability matrix decomposition unit 320 to obtain the corresponding missing data in the matrix A.
The data recovery unit 350 restores the missing data computed by the missing data calculation unit 340 to the positions of the data missing from the plurality of groups of data.
Functionally, the device for recovering missing data of this embodiment can implement the method for recovering missing data of Embodiment one.
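If the units are realized as software components, their cooperation could be sketched as a small class. The class name, the method names and the idea of passing the decomposition in as a callable are all hypothetical choices of this sketch; the reference numerals in the comments only indicate which unit each method stands for.

```python
from typing import Callable, Tuple
import numpy as np

Factorizer = Callable[[np.ndarray], Tuple[np.ndarray, np.ndarray]]

class MissingDataRecoveryDevice:
    """Hypothetical software counterpart of units 310 to 350."""

    def __init__(self, factorize: Factorizer):
        self.factorize = factorize                        # role of unit 320

    def acquire(self, groups) -> np.ndarray:              # role of unit 310
        return np.asarray(groups, dtype=np.float64)

    def locate_missing(self, A: np.ndarray) -> np.ndarray:    # role of unit 330
        return np.isnan(A)

    def estimate(self, U, V, missing) -> np.ndarray:      # role of unit 340
        return (U @ V.conj().T)[missing]

    def restore(self, A, missing, values) -> np.ndarray:  # role of unit 350
        completed = A.copy()
        completed[missing] = values
        return completed

    def run(self, groups) -> np.ndarray:
        A = self.acquire(groups)
        U, V = self.factorize(A)
        missing = self.locate_missing(A)
        return self.restore(A, missing, self.estimate(U, V, missing))

# Stub decomposition returning fixed factors, used only to exercise the pipeline.
device = MissingDataRecoveryDevice(
    lambda A: (np.array([[1.0], [2.0]]), np.array([[1.0], [3.0]]))
)
completed = device.run([[1.0, float("nan")], [2.0, 6.0]])
```

Injecting the decomposition keeps the pipeline independent of whether gradient descent, alternating least squares or another solver is used.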
Embodiment four
In this embodiment, the missing data in the plurality of groups of data are not only recovered; the plurality of groups of data is also compressed.
Fig. 4 is a block diagram of the device for recovering missing data according to Embodiment four of the present invention.
As shown in Fig. 4, in addition to the data acquisition unit 310, the probability matrix decomposition unit 320, the missing position determination unit 330, the missing data calculation unit 340 and the data recovery unit 350 of the device 300 of Embodiment three, the device 400 for recovering missing data of this embodiment includes a compression unit 460 and a decompression unit 470. Units 310 to 350 are not described again here.
The compression unit 460 compresses the plurality of groups of data using the decomposition result of the probability matrix decomposition unit 320. Specifically, the compression unit 460 multiplies the result of the probability matrix decomposition by the second factor matrix $V_k$ obtained by the probability matrix decomposition, to obtain the compressed data. More specifically, the compression unit 460 performs the dimension-reduction compression of the data based on formula (4) above, and saves the matrix obtained by the probability matrix decomposition that is needed for decompression.
The decompression unit 470 decompresses the dimension-reduced, compressed data. Specifically, the decompression unit 470 multiplies the data compressed by the compression unit 460 by the conjugate transpose $V_k^{*}$ of the second factor matrix $V_k$, to obtain the decompressed data. More specifically, the decompression unit 470 decompresses the compressed data according to formula (5) above. In addition, after decompression, the decompression unit 470 also performs the inverse of the preprocessing applied to the plurality of groups of data by the data acquisition unit 310, to transform the decompressed data back to the original data types.
Functionally, the device for recovering missing data of this embodiment can implement the method for recovering missing data of Embodiment two.
According to an embodiment of the present invention, a computer device is also provided. The computer device includes a processor and a memory; the memory stores a computer program executable on the processor, and when the computer program is executed by the processor, the steps of the method for recovering missing data according to the embodiments of the present invention are implemented.
Moreover, it should be understood that each unit in the devices according to the exemplary embodiments of the present invention can be implemented as a hardware component and/or a software component. According to the processing performed by each unit as defined, those skilled in the art can implement each unit using, for example, a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).
In addition, the methods according to the exemplary embodiments of the present invention may be implemented as a computer program on a computer-readable recording medium. Those skilled in the art can implement the computer program from the description of the above methods. When the computer program is executed in a computer, the above method of the present invention is implemented.
Although the present invention has been particularly shown and described with reference to its exemplary embodiments, those skilled in the art should understand that various changes in form and detail may be made without departing from the spirit and scope of the present invention as defined by the claims.

Claims (12)

1. A method for recovering missing SCADA data, characterized by comprising:
obtaining a plurality of groups of SCADA data;
preprocessing the plurality of groups of SCADA data by data type conversion, in which non-numeric SCADA data are converted into integer SCADA data and the integer SCADA data are then converted into floating-point SCADA data;
performing probability matrix decomposition on the numerical matrix composed of the preprocessed plurality of groups of SCADA data;
determining the positions of the SCADA data missing from the plurality of groups of SCADA data;
computing, in the result of the probability matrix decomposition, the product of the elements corresponding to the positions of the SCADA data missing from the plurality of groups of SCADA data, as the missing SCADA data; and
restoring the computed missing SCADA data to the positions of the SCADA data missing from the plurality of groups of SCADA data,
wherein, in the probability matrix decomposition step, a first factor matrix and a second factor matrix are solved for the numerical matrix, and the product of the first factor matrix and the conjugate transpose of the second factor matrix is taken as the result of the probability matrix decomposition,
and wherein solving the first factor matrix and the second factor matrix for the numerical matrix specifically comprises: in the probability matrix decomposition of the numerical matrix, solving the first factor matrix and the second factor matrix that minimize the following objective function of each element of the numerical matrix and the corresponding elements of the first factor matrix and the second factor matrix:
$$\phi = \sum_{(i,j)} \left( a_{ij} - u_i^{T} v_j \right)^2 + \lambda \left( \sum_i \lVert u_i \rVert^2 + \sum_j \lVert v_j \rVert^2 \right)$$
where $u_i$ is the transpose of the i-th row vector of the first factor matrix, $v_j$ is the transpose of the j-th row vector of the second factor matrix, and $\lambda$ is the weight coefficient of the regularization term.
2. The method for recovering missing SCADA data according to claim 1, characterized in that the step of computing the product of the corresponding elements comprises:
multiplying the elements in the first factor matrix and the second factor matrix that respectively correspond to the position of the missing SCADA data, and taking the product as the missing SCADA data.
3. The method for recovering missing SCADA data according to claim 1, characterized by further comprising:
compressing the plurality of groups of SCADA data using the result of the probability matrix decomposition.
4. The method for recovering missing SCADA data according to claim 3, characterized in that compressing the plurality of groups of SCADA data using the result of the probability matrix decomposition specifically comprises multiplying the result of the probability matrix decomposition by the second factor matrix to obtain compressed SCADA data.
5. The method for recovering missing SCADA data according to claim 4, characterized in that the compressed SCADA data are multiplied by the conjugate transpose of the second factor matrix to obtain decompressed SCADA data.
6. A device for recovering missing SCADA data, characterized by comprising:
a data acquisition unit, which obtains a plurality of groups of SCADA data and preprocesses the plurality of groups of SCADA data by data type conversion, in which non-numeric SCADA data are converted into integer SCADA data and the integer SCADA data are then converted into floating-point SCADA data;
a probability matrix decomposition unit, which performs probability matrix decomposition on the numerical matrix composed of the preprocessed plurality of groups of SCADA data;
a missing position determination unit, which determines the positions of the SCADA data missing from the plurality of groups of SCADA data;
a missing data calculation unit, which computes, in the decomposition result of the probability matrix decomposition unit, the product of the elements corresponding to the positions of the SCADA data missing from the plurality of groups of SCADA data, as the missing SCADA data; and
a data recovery unit, which restores the missing SCADA data computed by the missing data calculation unit to the positions of the SCADA data missing from the plurality of groups of SCADA data,
wherein the probability matrix decomposition unit solves a first factor matrix and a second factor matrix for the numerical matrix, and takes the product of the first factor matrix and the conjugate transpose of the second factor matrix as the result of the probability matrix decomposition,
and wherein, in the probability matrix decomposition of the numerical matrix, the probability matrix decomposition unit solves the first factor matrix and the second factor matrix that minimize the following objective function of each element of the numerical matrix and the corresponding elements of the first factor matrix and the second factor matrix:
$$\phi = \sum_{(i,j)} \left( a_{ij} - u_i^{T} v_j \right)^2 + \lambda \left( \sum_i \lVert u_i \rVert^2 + \sum_j \lVert v_j \rVert^2 \right)$$
where $u_i$ is the transpose of the i-th row vector of the first factor matrix, $v_j$ is the transpose of the j-th row vector of the second factor matrix, and $\lambda$ is the weight coefficient of the regularization term.
7. The device for recovering missing SCADA data according to claim 6, characterized in that the missing data finding unit multiplies the elements of the first factor matrix and of the second factor matrix that correspond to the position of the missing SCADA data, and takes the result as the missing SCADA data.
8. The device for recovering missing SCADA data according to claim 6, characterized by further comprising:
a compression unit, which compresses the multiple groups of SCADA data using the decomposition result of the probability matrix decomposition unit.
9. The device for recovering missing SCADA data according to claim 8, characterized in that the compression unit multiplies the result of the probability matrix decomposition by the second factor matrix to obtain compressed SCADA data.
10. The device for recovering missing SCADA data according to claim 9, characterized by further comprising a decompression unit, which multiplies the compressed SCADA data by the transpose of the second factor matrix to obtain decompressed SCADA data.
11. A computer-readable medium storing a computer program, characterized in that, when the computer program is executed by a processor, the steps of the method for recovering missing SCADA data according to any one of claims 1 to 5 are implemented.
12. A computer device, characterized by comprising:
a processor; and
a memory storing a computer program executable on the processor, wherein, when the computer program is executed by the processor, the steps of the method for recovering missing SCADA data according to any one of claims 1 to 5 are implemented.
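The factorization and recovery steps recited in the claims can be illustrated with a minimal NumPy sketch. It is not the patented implementation. The objective function is not reproduced in this text, so the sketch assumes the commonly used regularized squared-error form: the sum over observed entries of (r_ij - u_i^T v_j)^2 plus λ times the squared norms of the factor vectors. The function name `pmf_recover`, the rank, the learning rate, the number of iterations, and the use of plain gradient descent are all hypothetical choices made for illustration. Rows of `U` and `V` play the roles of u_i^T and v_j^T, and missing SCADA values are represented as NaN.

```python
import numpy as np

def pmf_recover(R, rank=2, lam=0.01, lr=0.01, epochs=5000, seed=0):
    """Fill NaN entries of R from a regularized low-rank factorization.

    Assumed objective (not taken from the source text):
        0.5 * sum over observed (i, j) of (R[i, j] - U[i] . V[j])**2
        + 0.5 * lam * (||U||^2 + ||V||^2)
    """
    R = np.asarray(R, dtype=float)
    mask = ~np.isnan(R)              # True where a value was observed
    R0 = np.where(mask, R, 0.0)      # placeholder zeros, ignored via the mask
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = 0.1 * rng.standard_normal((n, rank))   # first factor matrix (rows u_i^T)
    V = 0.1 * rng.standard_normal((m, rank))   # second factor matrix (rows v_j^T)

    for _ in range(epochs):
        E = mask * (R0 - U @ V.T)    # residuals on observed entries only
        grad_U = -E @ V + lam * U
        grad_V = -E.T @ U + lam * V
        U -= lr * grad_U
        V -= lr * grad_V

    R_hat = U @ V.T                  # decomposition result
    # Keep observed values; write u_i . v_j into the missing positions.
    recovered = np.where(mask, R, R_hat)
    return recovered, U, V

# Example: a small rank-1 matrix with two entries removed.
truth = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 0.5, 2.0])
R = truth.copy()
R[0, 1] = np.nan
R[2, 2] = np.nan
recovered, U, V = pmf_recover(R, rank=1)
```

Each recovered value is the inner product of the factor rows that correspond to the missing position, which mirrors the element-wise product described in claim 7. Observed values are left untouched.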
CN201711047191.7A 2017-10-31 2017-10-31 Method and device for recovering missing data Active CN107832170B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711047191.7A CN107832170B (en) 2017-10-31 2017-10-31 Method and device for recovering missing data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711047191.7A CN107832170B (en) 2017-10-31 2017-10-31 Method and device for recovering missing data

Publications (2)

Publication Number Publication Date
CN107832170A CN107832170A (en) 2018-03-23
CN107832170B true CN107832170B (en) 2019-03-12

Family

ID=61651164

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711047191.7A Active CN107832170B (en) 2017-10-31 2017-10-31 Method and device for recovering missing data

Country Status (1)

Country Link
CN (1) CN107832170B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108880620B (en) * 2018-08-20 2021-06-11 广东石油化工学院 Power line communication signal reconstruction method
CN108918928B (en) * 2018-09-11 2020-11-10 广东石油化工学院 Power signal self-adaptive reconstruction method in load decomposition
CN109166626B (en) * 2018-10-29 2021-09-14 中山大学 Method for supplementing medical index missing data of peptic ulcer patient
CN112165403B (en) * 2020-09-29 2021-04-27 北京视界云天科技有限公司 UDP (user Datagram protocol) data packet recovery method and device, computer equipment and storage medium
CN113918541B (en) * 2021-12-13 2022-04-26 广州市玄武无线科技股份有限公司 Preheating data processing method and device and computer readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1526103A (en) * 2001-07-11 2004-09-01 ��ʽ���羫������ Dct matrix decomposing method and dct device
CN102402569A (en) * 2010-09-08 2012-04-04 索尼公司 Rating prediction device, rating prediction method, and program
CN103942545A (en) * 2014-05-07 2014-07-23 中国标准化研究院 Method and device for identifying faces based on bidirectional compressed data space dimension reduction

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
面向时序数据的矩阵分解;黄晓宇 等;《软件学报》;20150930;第2262页至第2275页 *

Also Published As

Publication number Publication date
CN107832170A (en) 2018-03-23

Similar Documents

Publication Publication Date Title
CN107832170B (en) Method and device for recovering missing data
CN107800437B (en) Data compression method and device
Sun et al. Feature selection using rough entropy-based uncertainty measures in incomplete decision systems
CN110175541B (en) Method for extracting sea level change nonlinear trend
Lan et al. Matrix recovery from quantized and corrupted measurements
CN103559205A (en) Parallel feature selection method based on MapReduce
CN112862127A (en) Sensor data exception handling method and device, electronic equipment and medium
Ding et al. An improved adaptive bivariate dimension-reduction method for efficient statistical moment and reliability evaluations
CN115618212A (en) Power data processing method and device, computer equipment and storage medium
Ritz Goodness‐of‐fit tests for mixed models
CN106407620B (en) A kind of engineering structure response surface stochastic finite element analysis processing method based on ABAQUS
CN109635452B (en) Efficient multimodal random uncertainty analysis method
CN107766294A (en) Method and device for recovering missing data
JP2017151497A (en) Time-sequential model parameter estimation method
Shmueli et al. Updating kernel methods in spectral decomposition by affinity perturbations
Hallmann et al. All solutions of the stochastic fixed point equation of the Quicksort process
Kruzick et al. Spectral statistics of lattice graph structured, non-uniform percolations
Greenwood et al. Information bounds for Gibbs samplers
Demir et al. Maximum likelihood estimation for the parameters of the generalized gompertz distribution under progressive type-ii right censored samples
Min et al. Variance reduced stochastic optimization for PCA and PLS
Chen et al. Bayesian hierarchical modelling on dual response surfaces in partially replicated designs
CN111382891A (en) Short-term load prediction method and short-term load prediction device
Tsagkatakis et al. Matrix and tensor signal modelling in cyber physical systems
Tanak et al. A new lifetime distribution by maximizing entropy: properties and applications
de Gooijer et al. On the choice of basis in proper orthogonal decomposition-based surrogate models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant