A distributed-integrated soft sensing method based on the Girvan-Newman algorithm
Technical field
The present invention relates to soft sensing methods for industrial processes, and in particular to a distributed-integrated soft sensing method based on the Girvan-Newman algorithm.
Background art
In modern industrial processes, measurement technology and equipment play a critical role in the entire production system, and the resulting measurement data provide a solid basis for production scheduling, process monitoring, and other industrial "big data" applications. Advanced instrumentation has developed rapidly in recent years: information such as flow, liquid level, pressure, and temperature can be measured easily in industrial processes, and data that directly or indirectly reflect product quality can also be measured in real time by instruments. However, compared with instruments for flow or temperature, equipment that measures product quality information online in real time is usually expensive and costly to maintain. If product quality data are instead obtained by offline analysis, their acquisition is delayed, so operators cannot learn the product quality in a timely and accurate manner. Yet quality monitoring cannot do without data that reflect product quality. Over the past decade or so, with the widespread use of data-driven methods, soft sensing technology has emerged. It establishes a regression model between easily measured process data and product quality data, and thereby estimates the quality data in real time. In recent years, research on soft sensing methods has attracted growing attention from both industry and academia.
The core of soft sensing is to establish a regression model between input data (usually information that is easy to measure in an industrial process, such as pressure, temperature, and flow) and output data (usually measurement indices that directly or indirectly reflect quality information, such as concentration). In the existing literature and patents, commonly used methods for building the regression model include statistical regression, neural networks, and support vector machines. In addition, modern industrial processes, especially in the process industries, are growing in production scale, and multiple production units are interconnected. If a single soft sensor model is established directly, the achievable soft sensing accuracy is often unsatisfactory. In recent years, distributed soft sensor models have received attention because the generalization performance of multiple models is usually better than that of a single model. In general, multi-model modeling first requires decomposing the process object. However, because production units are interleaved in complex ways and control systems introduce feedback, an effective process decomposition requires sufficient process mechanism knowledge and operator experience.
To facilitate the wider application of soft sensing, the first problem facing a distributed-integrated soft sensing scheme is how to decompose the entire production process object without requiring process mechanism knowledge. Moreover, even if the process object can be decomposed successfully, the second problem is how to integrate the distributed soft sensor models into one so as to obtain the final estimate of product quality. The term "distributed-integrated" used here means that the soft sensor models must first be distributed, but since only one estimate or prediction of product quality can be given, the results of the multiple soft sensor models must then be integrated.
Summary of the invention
The main technical problem to be solved by the present invention is how to decompose a large-scale production process from a data perspective so as to establish distributed soft sensor models, and how to integrate the distributed soft sensing results to obtain the estimate of the product quality data.
The technical solution adopted by the present invention to solve the above technical problem is a distributed-integrated soft sensing method based on the Girvan-Newman algorithm, comprising the following steps:
(1): From the historical database of the production process object, find the data corresponding to the indices that reflect product quality to form the output matrix $Y \in R^{n \times f}$; the sampled data corresponding to the output $Y$ form the input matrix $X \in R^{n \times m}$, where $n$ is the number of training samples, $m$ is the number of process measurement variables, $f$ is the number of quality indices, $R$ is the set of real numbers, and $R^{n \times m}$ denotes the set of real matrices of dimension $n \times m$.
(2): After calculating the means $\mu_1, \mu_2, \ldots, \mu_f$ and standard deviations $\delta_1, \delta_2, \ldots, \delta_f$ of the column vectors of the output matrix $Y$, standardize each row vector of $Y$ according to the formula $\bar{y} = (y - \mu)\Delta^{-1}$ to obtain the standardized output matrix $\bar{Y}$, where the row vectors $y$ and $\bar{y}$ denote any corresponding row of $Y$ and $\bar{Y}$ respectively, the output mean vector is $\mu = [\mu_1, \mu_2, \ldots, \mu_f]$, and the diagonal elements of the output standard deviation diagonal matrix $\Delta$ are $\delta_1, \delta_2, \ldots, \delta_f$.
(3): Standardize the matrix $X$ in the same way to obtain the standardized input matrix $\bar{X}$.
(4): After calculating the correlation matrix $C$ according to the formula $C = \bar{X}^T \bar{X}/(n-1)$, set the number of neighbors to $c$ and initialize $j = 1$.
(5): Set the $c$ largest elements of the $j$-th row vector of the matrix $C$ to 1, and set the remaining elements of that row to 0.
(6): Determine whether the condition $j < m$ is met. If so, set $j = j + 1$ and return to step (5); if not, the updated correlation matrix $C$ is obtained.
(7): Calculate the matrix $\Xi$ according to the formula $\Xi = \max\{C, C^T\}$, where $\max\{C, C^T\}$ takes the element-wise maximum of the matrices $C$ and $C^T$, and the superscript $T$ denotes the transpose of a matrix or vector. The matrix $\Xi$ can be regarded as the connection network matrix among the $m$ variables: each variable is a node in the network, an element 1 indicates that there is an edge between two nodes, and an element 0 indicates that there is no edge between them.
(8): Apply hierarchical clustering to the $m$ variables using the Girvan-Newman algorithm, as follows:
1. Calculate the edge betweenness of every edge in the connection network of the $m$ nodes.
2. Find the edge with the highest betweenness and remove it from the network, that is, set the corresponding elements of the connection network matrix $\Xi$ to 0.
3. Recalculate the edge betweenness of the remaining edges in the network.
4. Repeat steps 2 and 3 until all edges in the connection network have been removed, that is, until the matrix $\Xi$ becomes a zero matrix.
The edge betweenness of an edge in step 1 above is defined as follows: starting from a given node, count the number of shortest paths to the other nodes that pass through this edge; repeat the same calculation for every node, and add up the values obtained for the different starting nodes. The sum is the edge betweenness of this edge.
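The edge betweenness calculation and the edge removal loop of step (8) can be sketched in plain Python as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, shortest path counts are shared fractionally among tied paths (Brandes-style accumulation), and the loop stops as soon as the network has split into a requested number of blocks `n_blocks`. That stopping rule is an assumption, since the text removes edges until none remain and does not state how the number of blocks D is read off the hierarchy.

```python
from collections import deque

def edge_betweenness(adj):
    """Edge betweenness of an undirected, unweighted graph.

    adj maps each node to the set of its neighbours (no self-loops).
    Returns {frozenset({u, v}): betweenness}.
    """
    bet = {frozenset((u, v)): 0.0 for u in adj for v in adj[u] if u != v}
    for src in adj:
        # Breadth-first search: count shortest paths from src to every node.
        dist = {src: 0}
        sigma = {v: 0.0 for v in adj}
        sigma[src] = 1.0
        preds = {v: [] for v in adj}
        order = []
        queue = deque([src])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, crediting each edge on the way.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += share
                delta[v] += share
    # Every undirected path was counted once from each of its two endpoints.
    return {e: b / 2.0 for e, b in bet.items()}

def components(adj):
    """Connected components of the graph as a list of node sets."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def girvan_newman_blocks(adj, n_blocks):
    """Remove highest-betweenness edges until n_blocks components exist."""
    adj = {v: set(nb) - {v} for v, nb in adj.items()}  # copy, drop self-loops
    while len(components(adj)) < n_blocks:
        bet = edge_betweenness(adj)
        if not bet:          # no edges left to remove
            break
        u, v = max(bet, key=bet.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)
```

For example, for two triangles joined by a single bridge edge, the bridge carries every shortest path between the two triangles, so it is removed first and the two triangles become the two blocks.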
(9): According to the hierarchical clustering result of step (8), the $m$ variables can be divided into $D$ blocks, and the input matrix $\bar{X}$ can correspondingly be divided into $D$ matrices: $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_D$.
Steps (8) and (9) above decompose the process object by means of the Girvan-Newman algorithm: on the basis of the correlations among the variables, the process measurement variables are divided into $D$ sub-blocks, which completes the first step in establishing the distributed soft sensor models.
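The variable network on which this decomposition operates, built in steps (3) to (7), can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions: the function name `connection_network` is hypothetical, the sample standard deviation (divisor n - 1) is assumed for standardization, and the correlation matrix is assumed to be the usual sample correlation of the standardized data. The c largest entries of each row are taken literally, so the diagonal entry (self-correlation 1) is always among them.

```python
import numpy as np

def connection_network(X, c):
    """Return the 0/1 connection network matrix for input data X (n x m)."""
    # Step (3): standardize every column to zero mean and unit sample std.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    n, m = Xs.shape
    # Step (4): correlation matrix of the m variables (assumed formula).
    C = Xs.T @ Xs / (n - 1)
    # Steps (5)-(6): in every row keep the c largest entries as 1, rest 0.
    A = np.zeros((m, m), dtype=int)
    for j in range(m):
        top = np.argsort(C[j])[-c:]   # indices of the c largest entries
        A[j, top] = 1
    # Step (7): element-wise maximum with the transpose makes it symmetric.
    return np.maximum(A, A.T)
```

Because each variable selects itself, it is linked to c - 1 other variables by its own row, plus any variables that selected it. Self-loops carry no shortest paths between distinct nodes, so they do not affect the edge betweenness used for clustering.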
(10): Use the partial least squares algorithm to establish the soft sensor model between $\bar{X}_d$ and the output matrix $\bar{Y}$: $\bar{Y} = \bar{X}_d B_d + E_d$, where $d = 1, 2, \ldots, D$, $B_d$ is the regression coefficient matrix, and $E_d$ is the error matrix.
It should be emphasized that although step (10) uses the partial least squares algorithm to establish the soft sensor model, the method of the present invention can equally use a neural network, support vector regression, or the kernel partial least squares algorithm to establish the soft sensor model.
(11): Repeat step (10) until $D$ soft sensor models are obtained, and use the regression coefficient matrices $B_1, B_2, \ldots, B_D$ to calculate the output estimate of each soft sensor model according to the formula $\hat{Y}_d = \bar{X}_d B_d$.
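(12): Merge the output estimates $\hat{Y}_1, \hat{Y}_2, \ldots, \hat{Y}_D$ into one matrix $\hat{Y} = [\hat{Y}_1, \hat{Y}_2, \ldots, \hat{Y}_D]$, and use the partial least squares algorithm to establish the soft sensor model between $\hat{Y}$ and the output $\bar{Y}$: $\bar{Y} = \hat{Y}\tilde{B} + \tilde{E}$, where $\tilde{B}$ is the regression coefficient matrix and $\tilde{E}$ is the error matrix.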
Steps (1) to (12) above constitute the offline modeling stage of the method of the present invention, in which step (11) establishes the $D$ distributed soft sensor models and step (12) integrates the distributed soft sensing results. Steps (13) to (18) below constitute the online soft sensing procedure of the method of the present invention.
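Because both modeling stages are linear when partial least squares is used, the two-stage model can be collapsed into a single linear map, which may help clarify what the integration step contributes. The worked equation below is a sketch under assumed notation: $\bar{z}_d$ is the standardized input of block $d$ for one sample, $B_d$ is the regression coefficient matrix of block $d$, $\tilde{B}$ denotes the integration coefficient matrix of step (12), and $\tilde{B}_d$ (a symbol introduced here for illustration) is its $d$-th block of $f$ rows.

```latex
\begin{aligned}
\hat{y} &= [\,\bar{z}_1 B_1,\ \bar{z}_2 B_2,\ \ldots,\ \bar{z}_D B_D\,] \in R^{1 \times Df}, \\
\tilde{B} &= \begin{bmatrix} \tilde{B}_1 \\ \vdots \\ \tilde{B}_D \end{bmatrix},
\qquad \tilde{B}_d \in R^{f \times f}, \\
\tilde{y} &= \hat{y}\,\tilde{B} = \sum_{d=1}^{D} \bar{z}_d\, B_d\, \tilde{B}_d .
\end{aligned}
```

Under this reading, the integration step gives each block model a learned $f \times f$ weighting instead of a plain average. The collapse no longer holds if a nonlinear model such as a neural network replaces partial least squares in either stage.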
(13): Collect the sample data $z \in R^{1 \times m}$ of the process object at the new sampling instant, and apply to it the same standardization as was applied to the matrix $X$ to obtain the vector $\bar{z}$.
(14): According to the hierarchical clustering result of step (8), divide the vector $\bar{z}$ into $D$ row vectors $\bar{z}_1, \bar{z}_2, \ldots, \bar{z}_D$, and initialize $d = 1$.
(15): Calculate the output estimate $y_d$ of the $d$-th soft sensor model according to the formula $y_d = \bar{z}_d B_d$.
(16): Determine whether the condition $d < D$ is met. If so, set $d = d + 1$ and return to step (15); if not, merge the output estimates of the $D$ soft sensor models into one vector $\hat{y} = [y_1, y_2, \ldots, y_D]$.
(17): According to the formula $\tilde{y} = \hat{y}\tilde{B}$, integrate the distributed soft sensor models into the final output estimate $\tilde{y}$; the estimate of the quality indices at the current sampling instant is then $\tilde{y}\Delta + \mu$, where $\mu$ and $\Delta$ are the mean vector and the standard deviation diagonal matrix of step (2).
(18): Return to step (13) and continue soft sensing of the quality indices at the next sampling instant.
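The online procedure of steps (13) to (17) amounts to a few matrix products per sample, which the following NumPy sketch illustrates. All names are hypothetical: `blocks` holds the column indices of each variable block, `B_list` the block regression coefficient matrices, and `B_int` the integration coefficient matrix of step (12). The input means and standard deviations `x_mean` and `x_std` are assumed to have been stored during offline modeling, and the standard deviations are passed as vectors rather than diagonal matrices.

```python
import numpy as np

def online_estimate(z, x_mean, x_std, blocks, B_list, B_int, y_mean, y_std):
    """Estimate the quality indices for one new sample z of length m."""
    # Step (13): standardize the new sample like the training inputs.
    z_bar = (z - x_mean) / x_std
    # Steps (14)-(16): split into blocks, apply each block model, and
    # merge the D block estimates into one vector of length D * f.
    y_hat = np.concatenate([z_bar[idx] @ B for idx, B in zip(blocks, B_list)])
    # Step (17): integrate the block estimates, then undo the output
    # standardization (multiplying by y_std equals a diagonal matrix product).
    y_tilde = y_hat @ B_int
    return y_tilde * y_std + y_mean

# Toy usage with made-up coefficients: m = 4 inputs, f = 1 output, D = 2.
z = np.array([1.0, 5.0, 7.0, 3.0])
est = online_estimate(
    z,
    x_mean=np.zeros(4), x_std=np.ones(4),
    blocks=[[0, 1], [2, 3]],
    B_list=[np.array([[1.0], [0.0]]), np.array([[0.0], [2.0]])],
    B_int=np.array([[0.5], [0.5]]),
    y_mean=np.array([10.0]), y_std=np.array([2.0]),
)
```

In the toy call the block estimates are 1 and 6, their integrated value is 3.5, and rescaling gives 3.5 × 2 + 10 = 17.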
Compared with conventional methods, the method of the present invention has the following advantages:
The method of the present invention can be said to be the first to use the Girvan-Newman algorithm to partition variables into blocks, without requiring any process mechanism knowledge or prior operating experience. The method uses the Girvan-Newman algorithm to decompose the process object, which lays the foundation for establishing distributed soft sensor models. In addition, although step (10) uses the partial least squares algorithm to establish the soft sensor models, the method is not limited to partial least squares: other methods such as neural networks, support vector machines, and kernel partial least squares can also be used. Furthermore, to integrate the distributed soft sensing results, the method uses the partial least squares algorithm a second time. The method of the present invention is therefore a novel distributed-integrated soft sensing method suited to soft sensing of the product quality indices of large-scale production process objects.
Brief description of the drawings
Fig. 1 is the implementation flowchart of the method of the present invention.
Fig. 2 is the implementation flowchart of the Girvan-Newman algorithm.
Detailed description of the embodiments
The method of the present invention is described in detail below with reference to the accompanying drawings.
As shown in Fig. 1, the present invention provides a distributed-integrated soft sensing method based on the Girvan-Newman algorithm. The specific implementation of the method is as follows:
Step (1): From the historical database of the production process object, find the data corresponding to the indices that reflect product quality to form the output matrix $Y \in R^{n \times f}$; the sampled data corresponding to the output $Y$ form the input matrix $X \in R^{n \times m}$, where $n$ is the number of training samples, $m$ is the number of process measurement variables, $f$ is the number of quality indices, $R$ is the set of real numbers, and $R^{n \times m}$ denotes the set of real matrices of dimension $n \times m$.
Step (2): After calculating the means $\mu_1, \mu_2, \ldots, \mu_f$ and standard deviations $\delta_1, \delta_2, \ldots, \delta_f$ of the column vectors of the output matrix $Y$, standardize each row vector of $Y$ according to the formula $\bar{y} = (y - \mu)\Delta^{-1}$ to obtain the standardized output matrix $\bar{Y}$, where the row vectors $y$ and $\bar{y}$ denote any corresponding row of $Y$ and $\bar{Y}$ respectively, the output mean vector is $\mu = [\mu_1, \mu_2, \ldots, \mu_f]$, and the diagonal elements of the output standard deviation diagonal matrix $\Delta$ are $\delta_1, \delta_2, \ldots, \delta_f$.
Step (3): Standardize the matrix $X$ in the same way to obtain the standardized input matrix $\bar{X}$.
Step (4): After calculating the correlation matrix $C$ according to the formula $C = \bar{X}^T \bar{X}/(n-1)$, set the neighbor parameter $c$ and initialize $j = 1$.
Step (5): Set the $c$ largest elements of the $j$-th row vector of the matrix $C$ to 1, and set the remaining elements of that row to 0.
Step (6): Determine whether the condition $j < m$ is met. If so, set $j = j + 1$ and return to step (5); if not, the updated correlation matrix $C$ is obtained.
Step (7): Calculate the matrix $\Xi$ according to the formula $\Xi = \max\{C, C^T\}$, where $\max\{C, C^T\}$ takes the element-wise maximum of the matrices $C$ and $C^T$, and the superscript $T$ denotes the transpose of a matrix or vector. The matrix $\Xi$ can be regarded as the connection network matrix among the $m$ variables: each variable is a node in the network, an element 1 indicates that there is an edge between two nodes, and an element 0 indicates that there is no edge between them.
Step (8): Apply hierarchical clustering to the $m$ variables using the Girvan-Newman algorithm. The implementation flow of the Girvan-Newman algorithm is shown in Fig. 2, and the specific implementation comprises steps (8.1) to (8.4) below.
Step (8.1): Calculate the edge betweenness of every edge in the connection network of the $m$ nodes.
Step (8.2): Find the edge with the highest betweenness and remove it from the network, that is, set the corresponding elements of the connection network matrix $\Xi$ to 0.
Step (8.3): Recalculate the edge betweenness of the remaining edges in the network.
Step (8.4): Repeat steps (8.2) and (8.3) until all edges in the connection network have been removed, that is, until the matrix $\Xi$ becomes a zero matrix.
Step (9): According to the hierarchical clustering result of step (8), the $m$ variables can be divided into $D$ blocks, and the input matrix $\bar{X}$ can correspondingly be divided into $D$ matrices: $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_D$.
Step (10): Use the partial least squares algorithm to establish the soft sensor model between $\bar{X}_d$ and the output matrix $\bar{Y}$: $\bar{Y} = \bar{X}_d B_d + E_d$. The specific implementation is as follows:
1. Initialize $h = 1$, set the matrices $Z = \bar{X}_d$ and $Y_0 = \bar{Y}$, and initialize the vector $u$ as the first column of the matrix $Y_0$.
2. Calculate the input weight vector $w_h$ according to the formula $w_h = Z^T u/(u^T u)$, and normalize it to unit length with the formula $w_h = w_h/\|w_h\|$.
3. Calculate the score vector $s_h$ according to the formula $s_h = Z w_h/(w_h^T w_h)$.
4. Calculate the output weight vector $g_h$ according to the formula $g_h = Y_0^T s_h/(s_h^T s_h)$.
5. Update the vector $u$ according to the formula $u = Y_0 g_h$.
6. Repeat steps 2 to 5 until $s_h$ converges; the convergence criterion is that the elements of the vector $s_h$ no longer change.
7. Retain the input weight vector $w_h$ and the output weight vector $g_h$, and calculate the projection vector $p_h$ according to the formula $p_h = Z^T s_h/(s_h^T s_h)$.
8. Update the input matrix $Z$ and the matrix $Y_0$ according to the following two formulas:
$Z = Z - s_h p_h^T$ (7)
$Y_0 = Y_0 - s_h g_h^T$ (8)
9. Determine whether the condition $h < M_d$ is met, where $M_d$ is the number of column vectors of the matrix $\bar{X}_d$. If so, set $h = h + 1$ and return to step 2; if not, form the matrix $W_0 = [w_1, w_2, \ldots, w_h]$ from all the input weight vectors obtained, the matrix $G_0 = [g_1, g_2, \ldots, g_h]$ from all the output weight vectors, and the matrix $P_0 = [p_1, p_2, \ldots, p_h]$ from all the projection vectors.
10. Use cross-validation to determine the number $k_d$ of projection vectors retained in the partial least squares model. The soft sensor model established by the partial least squares algorithm can then be expressed as $\bar{Y} = \bar{X}_d B_d + E_d$, where the regression coefficient matrix is $B_d = W(P^T W)^{-1} G^T$ and the matrices $W$, $P$, and $G$ consist of columns 1 to $k_d$ of the matrices $W_0$, $P_0$, and $G_0$ respectively.
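The iteration in sub-steps 1 to 10 can be sketched with NumPy as follows. This is a minimal sketch and not a tuned implementation: the function name `pls_nipals`, the tolerance `tol`, and the iteration cap `max_iter` are hypothetical, and the number of retained components `k` is passed in directly instead of being chosen by cross-validation. The vector `u` is re-initialized from the deflated output matrix for each new component, which is an assumption of this sketch.

```python
import numpy as np

def pls_nipals(X, Y, k, tol=1e-10, max_iter=500):
    """Return B such that Y is approximated by X @ B with k PLS components."""
    Z = np.array(X, dtype=float)      # working copy of the input block
    Y0 = np.array(Y, dtype=float)     # working copy of the outputs
    W, P, G = [], [], []
    for _ in range(k):
        u = Y0[:, 0].copy()           # start from the first output column
        s_old = np.zeros(Z.shape[0])
        for _ in range(max_iter):
            w = Z.T @ u / (u @ u)     # input weight vector
            w /= np.linalg.norm(w)    # normalize to unit length
            s = Z @ w / (w @ w)       # score vector
            g = Y0.T @ s / (s @ s)    # output weight vector
            u = Y0 @ g
            if np.linalg.norm(s - s_old) <= tol * np.linalg.norm(s):
                break                 # scores no longer change
            s_old = s
        p = Z.T @ s / (s @ s)         # projection (loading) vector
        Z -= np.outer(s, p)           # deflate the inputs
        Y0 -= np.outer(s, g)          # deflate the outputs
        W.append(w)
        P.append(p)
        G.append(g)
    W, P, G = np.column_stack(W), np.column_stack(P), np.column_stack(G)
    # Regression coefficients B = W (P^T W)^{-1} G^T.
    return W @ np.linalg.inv(P.T @ W) @ G.T
```

When all components are retained and the input block has full column rank, the result coincides with ordinary least squares, which gives a convenient sanity check. Retaining fewer components trades some fit for robustness to collinear inputs.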
Step (11): Repeat step (10) until $D$ soft sensor models are obtained, and use the regression coefficient matrices $B_1, B_2, \ldots, B_D$ to calculate the output estimate of each soft sensor model according to the formula $\hat{Y}_d = \bar{X}_d B_d$.
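Step (12): Merge the output estimates $\hat{Y}_1, \hat{Y}_2, \ldots, \hat{Y}_D$ into one matrix $\hat{Y} = [\hat{Y}_1, \hat{Y}_2, \ldots, \hat{Y}_D]$, and use the partial least squares algorithm to establish the soft sensor model between $\hat{Y}$ and the output $\bar{Y}$: $\bar{Y} = \hat{Y}\tilde{B} + \tilde{E}$. The specific implementation is the same as sub-steps 1 to 10 of step (10).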
Steps (1) to (12) above constitute the offline modeling stage of the method of the present invention. The output mean vector $\mu$ and the output standard deviation diagonal matrix $\Delta$ of step (2), the hierarchical clustering result of step (8), the regression coefficient matrices of step (11), and the regression coefficient matrix of step (12) need to be retained so that they can be called during online distributed-integrated soft sensing.
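The offline stage and the quantities it retains can be sketched as follows. To keep the sketch short, ordinary least squares (`np.linalg.lstsq`) is swapped in for partial least squares in both stages; this is a stand-in chosen for brevity, not the regression the method specifies. The function names and dictionary keys are hypothetical, `blocks` is assumed to hold the column indices of each variable block from the clustering step, and the input means and standard deviations are stored as well because the online stage needs them to standardize new samples.

```python
import numpy as np

def fit_linear(X, Y):
    """Stand-in regressor: ordinary least squares instead of PLS."""
    return np.linalg.lstsq(X, Y, rcond=None)[0]

def train_offline(X, Y, blocks):
    """Fit one model per variable block, then an integration model."""
    x_mean, x_std = X.mean(axis=0), X.std(axis=0, ddof=1)
    y_mean, y_std = Y.mean(axis=0), Y.std(axis=0, ddof=1)
    Xs = (X - x_mean) / x_std                 # standardized inputs
    Ys = (Y - y_mean) / y_std                 # standardized outputs
    # One regression coefficient matrix per block of input variables.
    B_list = [fit_linear(Xs[:, idx], Ys) for idx in blocks]
    # Block-wise output estimates, merged column-wise into one matrix.
    Y_hat = np.hstack([Xs[:, idx] @ B for idx, B in zip(blocks, B_list)])
    # Integration model from the merged estimates to the outputs.
    B_int = fit_linear(Y_hat, Ys)
    # Everything the online stage has to recall later.
    return {
        "x_mean": x_mean, "x_std": x_std,
        "y_mean": y_mean, "y_std": y_std,
        "blocks": blocks, "B_list": B_list, "B_int": B_int,
    }
```

With a least squares integration stage, the integrated fit on the training data can never be worse than any single block model, because choosing that block alone is one of the candidate solutions.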
Step (13): Collect the sample data $z \in R^{1 \times m}$ of the process object at the latest sampling instant, and apply to it the same standardization as was applied to the matrix $X$ to obtain the vector $\bar{z}$.
Step (14): According to the hierarchical clustering result of step (8), divide the vector $\bar{z}$ into $D$ row vectors $\bar{z}_1, \bar{z}_2, \ldots, \bar{z}_D$, and initialize $d = 1$.
Step (15): Calculate the output estimate $y_d$ of the $d$-th soft sensor model according to the formula $y_d = \bar{z}_d B_d$.
Step (16): Determine whether the condition $d < D$ is met. If so, set $d = d + 1$ and return to step (15); if not, merge the output estimates of the $D$ soft sensor models into one vector $\hat{y} = [y_1, y_2, \ldots, y_D]$.
Step (17): According to the formula $\tilde{y} = \hat{y}\tilde{B}$, integrate the distributed soft sensor models into the final output estimate $\tilde{y}$; the estimate of the quality indices at the current sampling instant is then $\tilde{y}\Delta + \mu$, where $\mu$ and $\Delta$ are the mean vector and the standard deviation diagonal matrix of step (2).
Step (18): Return to step (13) and continue soft sensing of the quality indices at the next sampling instant.
The above embodiments are merely preferred embodiments of the present invention. Any modifications and changes made to the present invention within the spirit of the present invention and the protection scope of the claims fall within the protection scope of the present invention.