Summary of the invention
To overcome the above-mentioned deficiencies of the prior art, the invention provides a sparse representation method for online electric-power image data. Based on the stochastic gradient descent algorithm, the invention can process data quickly and effectively, and handles electric-power big data and other large-scale data more effectively.
To achieve the above object, the invention adopts the following technical scheme:
A sparse representation method for online electric-power image data, the method comprising the steps of:
(1) establishing the sparse coding model and initializing zero vectors;
(2) randomly sampling a data sample and performing iterative processing;
(3) updating the sparse coding coefficients;
(4) calculating the approximation error;
(5) checking the algorithm for convergence.
Preferably, in step (1), the sparse coding model is:

min_w ||y − Φw||_2^2 + λ||w||_1 (1)

where y ∈ R^n is the target vector, the matrix Φ ∈ R^(n×m) is the dictionary used for the sparse representation of the vector, m is the number of atoms in the dictionary, w ∈ R^m is the sparse coding coefficient vector, λ is the step-size parameter, and R^(n×m) denotes the matrices of data samples with n rows and m columns;
u and v are initialized as zero vectors, and the sparse coding coefficient w is expressed as:

w = (u_1 − v_1, …, u_m − v_m) (2)

where u_m and v_m are the m-th components of the vectors u and v, respectively.
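The decomposition in formula (2) can be illustrated with a short sketch (a hypothetical illustration, not part of the patent text): any real coefficient vector w splits into non-negative vectors u and v with w = u − v, and the operator [x]_+ = max{0, x} keeps both in the non-negative orthant.

```python
import numpy as np

def split_pos_neg(w):
    """Split a coefficient vector w into non-negative u, v with w = u - v.

    u holds the positive parts, v the magnitudes of the negative parts, so
    both stay in the non-negative orthant that projected updates require.
    """
    u = np.maximum(w, 0.0)   # [x]_+ = max{0, x}, applied componentwise
    v = np.maximum(-w, 0.0)
    return u, v

w = np.array([0.7, 0.0, -1.2, 3.0])
u, v = split_pos_neg(w)
print(np.allclose(u - v, w))              # u - v recovers w: True
print((u >= 0).all() and (v >= 0).all())  # both non-negative: True
```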
Preferably, in step (2), the t-th component y_t of the randomly sampled data sample y is used to update u_i and v_i (1 ≤ i ≤ m) successively according to the iterative formulas below:

In the formulas, t (1 ≤ t ≤ n) is the index of the sample point currently chosen at random for the approximation, Φ(t) ∈ R^m is the t-th (1 ≤ t ≤ n) row vector of the dictionary Φ, Φ_i(t) ∈ R is the i-th component of the t-th row vector of Φ and is a real value, y_t is the t-th component of the sample data y, and [x]_+ = max{0, x}.
Preferably, in step (3), the u_i and v_i obtained after the iteration are substituted into formula (2) to update the sparse coding coefficients.
Preferably, in step (4), the approximation error is calculated by the following formula:
Preferably, in step (5), the algorithm is checked for convergence: when the approximation error recerr < 1×10^(−5), the algorithm has converged, the iteration stops, and the sparse coding w is output; otherwise, return to step (2) and continue the iterative processing.
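The error formula itself is not reproduced in the text above; the sketch below assumes a common choice, the squared residual normalised by the energy of y, and shows the convergence test of step (5). The function name is introduced here for illustration.

```python
import numpy as np

def approx_error(y, Phi, w):
    """Relative approximation error of the reconstruction Phi @ w.

    Assumed form (the patent does not reproduce the formula): the squared
    residual divided by the squared norm of the target y.
    """
    return np.linalg.norm(y - Phi @ w) ** 2 / np.linalg.norm(y) ** 2

rng = np.random.default_rng(0)
Phi = rng.standard_normal((8, 4))
w = np.array([1.0, 0.0, -0.5, 0.0])
y = Phi @ w                       # exact target, so the error is zero
recerr = approx_error(y, Phi, w)
print(recerr < 1e-5)              # convergence test of step (5): True
```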
Compared with the prior art, the beneficial effects of the invention are:
Based on the stochastic gradient descent algorithm, the invention studies sparse representation of electric-power big data. It not only processes data quickly and effectively, substantially improving solving efficiency, but also obtains the sparse coding coefficients effectively, achieving compression of the sparse coding of big data, reducing storage space, and lowering the memory requirements on the hardware.
Embodiment
The invention is described in further detail below with reference to the accompanying drawings.
The sparse coding model is an effective method of representing a signal. The coding model is:

min_w ||y − Φw||_2^2 + λ||w||_1 (1)

where y ∈ R^n is the target vector, the matrix Φ ∈ R^(n×m) is the dictionary used for the sparse representation of the vector, m is the number of atoms in the dictionary, w ∈ R^m is the sparse coding coefficient vector, λ is the step-size parameter, and R^(n×m) denotes the matrices of data samples with n rows and m columns. The solved w should both recover the target y accurately and guarantee sufficient sparsity. Traditional methods all solve for w with interior-point methods or gradient descent. The now widely used interior-point solver L1-Magic has complexity O(n^2.5), where n is the dimension of the data to be processed, so its computational cost is high. Although the traditional gradient descent method reduces the amount of computation per iteration, it handles the non-smooth L1 norm with difficulty. Meanwhile, in a big-data setting, the large data volume and the high feature dimension n of the data samples make solving for the sparse representation coefficient w particularly difficult. For big data in the power industry in particular (e.g. the high-definition line-patrol images and surveillance images taken by unmanned aerial vehicles), not only is the feature dimension high, the data are also normally acquired online, so higher efficiency is required of the sparse representation algorithm.
The invention adopts a stochastic sampling strategy to approximate the sample data: each iteration processes only a randomly selected feature point of the data, without traversing all the feature points of every sample, which reduces the dimension of the sample data to be processed. At the same time, to handle the non-smooth L1 norm effectively, the invention decomposes w into the difference of two variables, regarding w as the combination (u_1 − v_1, …, u_m − v_m). Although this appears to introduce more variables, the computation in the actual solving procedure becomes more concise, and it also neatly avoids the difficulty traditional gradient descent has in differentiating at w. The iterative update formulas for the i-th components of the vectors u and v (denoted u_i and v_i) are as follows:
where t (1 ≤ t ≤ n) is the index of the sample point currently chosen at random for the approximation, Φ(t) ∈ R^m is the t-th (1 ≤ t ≤ n) row vector of the dictionary Φ, Φ_i(t) ∈ R is the i-th component of the t-th row vector of Φ and is a real value, and y_t is the t-th component of the sample data y.
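The update formulas themselves appear only as drawings in the original document; the sketch below is an assumed truncated-gradient form consistent with the symbols defined above, not the patent's verbatim equations: for a randomly chosen row t, with residual r = Φ(t)·w − y_t, each pair u_i, v_i moves against the gradient of the smooth term plus λ, and [x]_+ projects back onto the non-negative orthant. The learning rate eta is a name introduced here.

```python
import numpy as np

def sgd_component_update(u, v, Phi, y, t, lam, eta):
    """One stochastic update of (u, v) from the randomly chosen row t.

    Assumed truncated-gradient form (the patent shows the formulas only as
    figures): r = Phi(t) . w - y_t, then each component moves against the
    gradient and is clipped by [x]_+ = max{0, x}.
    """
    w = u - v
    r = Phi[t] @ w - y[t]                      # scalar residual at row t
    u_new = np.maximum(u - eta * (Phi[t] * r + lam), 0.0)
    v_new = np.maximum(v - eta * (-Phi[t] * r + lam), 0.0)
    return u_new, v_new

rng = np.random.default_rng(1)
Phi = rng.standard_normal((6, 3))
y = rng.standard_normal(6)
u = np.zeros(3)
v = np.zeros(3)                                # step 1: zero vectors
u, v = sgd_component_update(u, v, Phi, y, t=0, lam=0.1, eta=0.05)
print((u >= 0).all() and (v >= 0).all())       # projection keeps both non-negative
```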
Figure 1 is a flow chart of the sparse representation method for online electric-power image data provided by the invention; the concrete steps are as follows:
1. Establish the sparse coding model and initialize zero vectors.
The sparse coding model is:

min_w ||y − Φw||_2^2 + λ||w||_1 (1)

where y ∈ R^n is the target vector, the matrix Φ ∈ R^(n×m) is the dictionary used for the sparse representation of the vector, m is the number of atoms in the dictionary, w ∈ R^m is the sparse coding coefficient vector, λ is the step-size parameter, and R^(n×m) denotes the matrices of data samples with n rows and m columns;
u and v are initialized as zero vectors, and the sparse coding coefficient w is expressed as:

w = (u_1 − v_1, …, u_m − v_m) (2)

where u_m and v_m are the m-th components of the vectors u and v, respectively.
2. Randomly sample a data sample and perform iterative processing.
The t-th component y_t of the randomly sampled data sample y is used to update u_i and v_i (1 ≤ i ≤ m) successively according to the iterative formulas below:

In the formulas, t (1 ≤ t ≤ n) is the index of the sample point currently chosen at random for the approximation, Φ(t) ∈ R^m is the t-th (1 ≤ t ≤ n) row vector of the dictionary Φ, Φ_i(t) ∈ R is the i-th component of the t-th row vector of Φ and is a real value, y_t is the t-th component of the sample data y, and [x]_+ = max{0, x}.
3. Update the sparse coding coefficients.
The u_i and v_i obtained after the iteration are substituted into formula (2) to update the sparse coding coefficients.
4. Calculate the approximation error.
The approximation error is calculated by the following formula:
5. Check the algorithm for convergence.
When the approximation error recerr < 1×10^(−5), the algorithm has converged, the iteration stops, and the sparse coding w is output; otherwise, return to step 2 and continue the iterative processing.
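Steps 1 to 5 above can be sketched end to end as follows. The update and error formulas are assumptions (truncated-gradient steps on a random row and a relative squared residual), since the original reproduces them only as figures; the learning rate eta and the function name are introduced for the sketch.

```python
import numpy as np

def sparse_code_sgd(y, Phi, lam=1e-4, eta=0.02, max_iter=50000, tol=1e-5,
                    seed=0):
    """Steps 1-5: stochastic-gradient sparse coding with w = u - v.

    Assumed forms (the original shows the formulas only as figures):
    truncated-gradient updates on a random row t, and the relative squared
    residual as the approximation error recerr.
    """
    rng = np.random.default_rng(seed)
    n, m = Phi.shape
    u = np.zeros(m)                        # step 1: initialise zero vectors
    v = np.zeros(m)
    y_energy = np.linalg.norm(y) ** 2
    for _ in range(max_iter):
        t = rng.integers(n)                # step 2: random sample point t
        r = Phi[t] @ (u - v) - y[t]        # residual at the chosen row
        u = np.maximum(u - eta * (Phi[t] * r + lam), 0.0)
        v = np.maximum(v - eta * (-Phi[t] * r + lam), 0.0)
        w = u - v                          # step 3: update the coefficients
        recerr = np.linalg.norm(y - Phi @ w) ** 2 / y_energy  # step 4
        if recerr < tol:                   # step 5: convergence check
            break
    return w

rng = np.random.default_rng(2)
Phi = rng.standard_normal((20, 10))
w_true = np.zeros(10)
w_true[[1, 7]] = [1.5, -2.0]               # sparse ground-truth coefficients
y = Phi @ w_true
w_hat = sparse_code_sgd(y, Phi)
recerr = np.linalg.norm(y - Phi @ w_hat) ** 2 / np.linalg.norm(y) ** 2
print(recerr < 1e-2)                       # reconstruction error is small: True
```

Because the toy system y = Φw_true is consistent, the per-row gradients vanish near the solution and the constant-step iteration settles well below the tolerance.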
An image reconstruction experiment is carried out on the USPS data set. USPS is a handwritten-digit data set containing the ten classes of Arabic numerals from "0" to "9". The concrete flow of the sparse representation method is as follows:
A. Pull the pixels of each digital image in the USPS training set into a column vector y_i, y_i ∈ R^(n×1).
B. Establish the sparse decomposition model:
C. Randomly select a pixel of y_i and use the present method to obtain the coding: initialize the vectors u and v, then update u_i and v_i successively according to the iterative formulas.
D. Update the sparse coding coefficient w = (u_1 − v_1, …, u_m − v_m).
E. Reconstruct a handwritten-digit image Y: Y = Φw.
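The bookend steps of the flow A to E can be illustrated with a small sketch. The 16×16 image here is synthetic (it stands in for a USPS digit), the dictionary is random, and np.linalg.lstsq stands in for the stochastic-gradient coder of steps B to D; all sizes and names are illustrative assumptions.

```python
import numpy as np

# Step A: pull the pixels of a (synthetic) 16x16 digit image into a column
# vector y_i, as is done for each image in the USPS training set.
rng = np.random.default_rng(3)
image = rng.random((16, 16))
y_i = image.reshape(-1, 1)        # column vector of shape (256, 1)
print(y_i.shape)                  # (256, 1)

# Step E: given a dictionary Phi and a coding w (np.linalg.lstsq is a
# stand-in for the stochastic-gradient solver of steps B-D), reconstruct
# the image as Y = Phi @ w and reshape it back to 16x16.
Phi = rng.standard_normal((256, 64))
w, *_ = np.linalg.lstsq(Phi, y_i, rcond=None)
Y = (Phi @ w).reshape(16, 16)
print(Y.shape)                    # (16, 16)
```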
Table 1 compares the running time of the proposed method with that of the traditional L1-Magic method. As the iteration proceeds, the number of nonzero coefficients decreases gradually and then stabilizes, forming the sparse representation.
Table 1  Average reconstruction time over 10 experiments

                L1-Magic    Ours
Running time    4.7265 s    0.3421 s
The process of solving the sparse coding with the stochastic gradient method is described in detail below, using the classical example of image super-resolution with sparse coding. The whole algorithm is as follows:
Input: low-resolution image y_l
1) Divide the image y_l into several blocks, for example 5×5 or 3×3 blocks.
2) Take each small block x_l in a certain order, preprocess x_l, and pull x_l into a column vector.
3) Use the stochastic gradient method introduced herein, w = (u_1 − v_1, …, u_m − v_m), to solve for the sparse coding w.
4) Use the obtained sparse coding and the high-resolution dictionary to recover the high-resolution image block x_h.
Output: high-resolution image y_h
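The block-wise loop of steps 1) to 4) can be sketched as below. The paired dictionaries Phi_l and Phi_h are hypothetical, np.linalg.lstsq stands in for the stochastic-gradient sparse coder w = u − v described above, blocks are taken on a non-overlapping grid for simplicity, and the toy example uses an identical dictionary pair so the result is checkable; in a real setting Phi_h would hold higher-resolution atoms.

```python
import numpy as np

def super_resolve(y_l, Phi_l, Phi_h, block=3):
    """Block-wise super-resolution loop sketched from steps 1)-4).

    For each block: pull the block into a vector, solve for a coding
    against the low-resolution dictionary (lstsq is a stand-in for the
    stochastic-gradient coder), then recover the block from the
    high-resolution dictionary.
    """
    H, W = y_l.shape
    y_h = np.zeros_like(y_l)
    for r in range(0, H - block + 1, block):      # step 2): blocks in order
        for c in range(0, W - block + 1, block):
            x_l = y_l[r:r + block, c:c + block].reshape(-1)
            w, *_ = np.linalg.lstsq(Phi_l, x_l, rcond=None)  # step 3) stand-in
            x_h = Phi_h @ w                        # step 4): recover the block
            y_h[r:r + block, c:c + block] = x_h.reshape(block, block)
    return y_h

rng = np.random.default_rng(4)
Phi_l = rng.standard_normal((9, 9))               # dictionary for 3x3 blocks
Phi_h = Phi_l.copy()                              # identical pair: y_h == y_l
y_l = rng.random((9, 9))
y_h = super_resolve(y_l, Phi_l, Phi_h)
print(np.allclose(y_h, y_l))                      # True for the identical pair
```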
In the reconstruction process, the classical sparse coding algorithm needs to solve a coding for each small block, which requires a very large computational overhead; with our method, the computational cost can be greatly reduced.
The computer configuration used for the experiments in this embodiment is a 64-bit operating system, 16 GB of memory, and an Intel processor; the software environment is MATLAB R2012a.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical scheme of the invention, not to limit it. Although the invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the specific embodiments of the invention may still be modified or equivalently replaced, and any modification or equivalent replacement that does not depart from the spirit and scope of the invention should be encompassed within the claims of the invention.