CN110309491B - Transient phase partitioning method and system based on local Gaussian mixture model

Info

Publication number: CN110309491B
Application number: CN201910571289.5A
Authority: CN
Inventors: 刘井响; 王丹; 彭周华; 刘陆
Original assignee: Dalian Maritime University
Current assignee: Dalian Maritime University
Priority date: 2019-06-26
Filing date: 2019-06-26
Publication date: 2022-10-14
Anticipated expiration: 2039-06-26
Also published as: CN110309491A

Abstract

The invention discloses a phase partitioning method and system based on a local Gaussian mixture model. The method comprises the following steps: S1, collecting samples and creating a historical training data set; S2, selecting a portion of the samples from the historical training data set to create a first Gaussian distribution model so as to determine a first steady-state phase; S3, based on the previously determined Gaussian model, creating a Gaussian mixture model containing two Gaussian components to determine the next steady-state phase; S4, determining any transient phase that may exist between two steady-state phases based on the two determined adjacent steady-state phase models; and S5, repeating S3 and S4 to complete the phase division of all sample data. The invention greatly reduces redundant calculation and improves computational efficiency. It adopts a step-by-step updating strategy that determines the steady-state and transient phases progressively in sampling-time order, so the number of phases need not be specified in advance and the division result needs no post-processing.

Description

Transient phase partitioning method and system based on local Gaussian mixture model

Technical Field

The invention relates to the technical field of batch process statistical modeling, in particular to a transient phase dividing method and a transient phase dividing system in a multiphase batch process.

Background

Batch processing is a very common production mode in modern industry and is widely applied in fine chemicals, pharmaceuticals, metallurgy, semiconductors and similar industries. As technology develops and demand diversifies, batch processes become increasingly complex. The most direct manifestation is that a single batch process comprises several different operation stages, or several different reaction/variation stages; such a process is called a multiphase batch process, and each such stage is called a phase. Take the penicillin fermentation process as an example: assuming one fermentation run lasts 400 h, the first 45 h are a pre-culture stage and the remaining 355 h are a fed-batch stage, that is, raw material is fed into the reactor from the 45th hour onward. From the standpoint of reaction mechanism, a typical penicillin fermentation process can be divided into four phases (stages): a lag stage, an exponential growth stage, a stationary stage and an autolysis stage. Because penicillin fermentation is a typical slowly time-varying process, the transition between phases is gradual rather than abrupt and therefore not obvious: samples lying between two steady-state phases partially retain the characteristics of the preceding steady-state phase while also exhibiting characteristics of the next one. The phase corresponding to such samples is called a transient phase. Accurately and reasonably dividing a multiphase batch process into its phases helps deepen understanding of the process mechanism and improves the accuracy of process modeling.

Some research results on multiphase division already exist. One is multiphase principal component analysis, which resembles an exhaustive search and uses a repetition-factor index to divide batches; however, this method is unsuitable for processes without obvious inflection points. Clustering is also widely used for phase partitioning in batch processes; however, methods based on the K-means algorithm require the number of clusters to be specified in advance and ignore the temporal order of the samples, so the partition result is disordered, requires further post-processing, and is difficult to interpret. The temporal relationship between samples within a phase is therefore a factor that cannot be neglected, and transient phases should be divided as accurately as the multiple steady-state phases. In short, neither of the two methods above adequately accounts for temporal order or for transient-phase division.

Disclosure of Invention

In view of this, and specifically to address the shortcomings of existing phase partitioning methods, which do not consider temporal order or transient-phase division, a phase partitioning method based on a local Gaussian mixture model is provided.

A phase partitioning method based on a local Gaussian mixture model comprises the following steps:

S1, collecting samples and creating a historical training data set;

S2, selecting a portion of the samples from the historical training data set in sampling-time order to create a first Gaussian distribution model so as to determine a first steady-state phase;

S3, based on the previously determined Gaussian model, creating a Gaussian mixture model containing two Gaussian components to determine the next steady-state phase;

S4, determining any transient phase that may exist between two steady-state phases based on the two determined adjacent steady-state phase models;

S5, repeating S3 and S4 to complete the phase division of all sample data.

Optionally, in one embodiment, selecting a portion of the samples from the historical training data set to create a first Gaussian distribution model so as to determine the first steady-state phase data includes:

S21, sequentially selecting the first $N_1$ samples from the historical training data set and calculating their mean and variance to obtain the corresponding Gaussian distribution model $p(x \mid 1)$, where $p(x \mid 1)$ denotes the probability density function of the first Gaussian distribution model and $x$ denotes the collected sample data;

S22, extracting sample points for steady-state phase verification, that is, starting from the ($N_1/2$)-th sample point, verifying the samples to find three consecutive sample points that satisfy the first verification condition, and recording the sequence number corresponding to these sample points as

The verification condition is

where ρ is a pre-specified threshold;

S23, judging whether $N_1$ is equal to

if so, the result has converged, that is, the first steady-state phase is determined and the method proceeds to the next step; otherwise, setting

and returning to step S21 to iterate until $N_1$ is equal to
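The loop formed by S21 to S23 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it works on scalar samples rather than multivariate ones, and because the verification condition and the update rule are not legible in this text, it assumes that the condition is $p(x_n \mid 1) < \rho$ and that $N_1$ is reset to the recorded sequence number. The function names `fit_gaussian`, `gaussian_pdf` and `first_steady_phase`, and the `max_iter` guard, are hypothetical.

```python
import math


def fit_gaussian(samples):
    """Return (mean, variance) of a list of scalar samples."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean, max(var, 1e-12)  # small floor avoids division by zero


def gaussian_pdf(x, mean, var):
    """Univariate Gaussian probability density."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def first_steady_phase(data, n1, rho=0.001, run=3, max_iter=100):
    """Iterate S21-S23 and return the sample count of the first steady phase."""
    for _ in range(max_iter):
        mean, var = fit_gaussian(data[:n1])      # S21: model from first n1 samples
        start = n1 // 2                          # S22: start at N1/2 (rounded down)
        n_hat = len(data)                        # fallback when no boundary is found
        for i in range(start, len(data) - run + 1):
            # ASSUMED condition: `run` consecutive samples with density below rho
            if all(gaussian_pdf(data[j], mean, var) < rho for j in range(i, i + run)):
                n_hat = i                        # samples preceding the low-density run
                break
        if n_hat == n1:                          # S23: converged
            return n1
        n1 = n_hat                               # ASSUMED update rule
    return n1


# Toy series: 60 samples near 0 followed by 60 samples near 10.
toy = [0.1 * math.sin(i) for i in range(60)] + [10 + 0.1 * math.sin(i) for i in range(60)]
boundary = first_steady_phase(toy, n1=20)
```

On this toy series the loop grows the first phase from the initial 20 samples until the recorded index stops changing, which happens at the jump between the two levels.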

Optionally, in one embodiment, creating a Gaussian mixture model containing two Gaussian components based on the previously determined Gaussian model to determine the next steady-state phase includes:

S31, based on the previously determined Gaussian model, creating and training a mixture model containing two Gaussian distribution functions; without loss of generality, assuming that the first c-1 steady-state phases have been determined, where c is an integer greater than or equal to 2, the formula of the mixture model is

$p(x \mid \theta_c) = \alpha_{c-1}\, p(x \mid c-1) + \alpha_c\, p(x \mid c)$

where the probability density function $p(x \mid c-1)$ of the (c-1)-th Gaussian model has already been determined, and the (c-1)-th steady-state phase, containing $N_{c-1}$ samples, is denoted $X_{c-1}$; the probability density function $p(x \mid c)$ of the c-th Gaussian model is to be determined, and the c-th steady-state phase, assumed to contain $N_c$ samples, is denoted $X_c$. Let $X_m = \{X_{c-1}, X_c\}$ be the training data of the Gaussian mixture model, with corresponding training parameters $\theta_c = \{\alpha_{c-1}, \alpha_c, \mu_c, \Sigma_c\}$, where $\alpha_{c-1}$ and $\alpha_c$ are the mixing coefficients of the (c-1)-th and c-th Gaussian components in the Gaussian mixture model, and $\mu_c$ and $\Sigma_c$ are, respectively, the mean vector and covariance matrix of the c-th Gaussian probability density function $p(x \mid c)$; the mixture model is trained using the expectation-maximization (EM) algorithm;

S32, extracting sample points to perform steady-state phase verification on the mixture model, that is, starting from the ($N_{c-1}/2$)-th sample point, verifying the samples to find three consecutive sample points that satisfy the second verification condition, and recording the sequence number corresponding to these sample points as

The verification condition is

where ρ is a pre-specified threshold;

S33, judging whether

is equal to $N_c + N_{c-1}$; if so, the result has converged, that is, the c-th steady-state phase is determined; otherwise, setting

and returning to step S31 to iterate until

is equal to $N_c + N_{c-1}$.
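The training in S31 can be illustrated with a small EM routine in which the previously determined component is frozen and only the mixing coefficients and the new component are updated. This is a sketch under stated assumptions: scalar samples with a variance in place of the covariance matrix $\Sigma_c$, a fixed iteration count instead of a convergence test, and hypothetical names (`em_fixed_first`, `mean_init`, `var_init`).

```python
import math


def gaussian_pdf(x, mean, var):
    """Univariate Gaussian probability density."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_fixed_first(data, mean_prev, var_prev, mean_init, var_init, n_iter=50):
    """EM for alpha_prev*p(x|c-1) + alpha_c*p(x|c) with component c-1 frozen.

    Returns (alpha_prev, alpha_c, mean_c, var_c).
    """
    alpha_c, mean_c, var_c = 0.5, mean_init, var_init
    for _ in range(n_iter):
        # E-step: responsibility of the new component c for every sample
        resp = []
        for x in data:
            p_prev = (1.0 - alpha_c) * gaussian_pdf(x, mean_prev, var_prev)
            p_c = alpha_c * gaussian_pdf(x, mean_c, var_c)
            total = p_prev + p_c
            resp.append(p_c / total if total > 0.0 else 0.5)
        # M-step: update only alpha_c, mean_c and var_c
        weight = sum(resp)
        if weight < 1e-12:
            break  # component c has collapsed; keep the last estimate
        alpha_c = weight / len(data)
        mean_c = sum(r * x for r, x in zip(resp, data)) / weight
        var_c = max(sum(r * (x - mean_c) ** 2 for r, x in zip(resp, data)) / weight, 1e-12)
    return 1.0 - alpha_c, alpha_c, mean_c, var_c


# Toy data: 60 samples near 0 (known phase) and 40 samples near 10 (new phase).
x_m = [0.1 * math.sin(i) for i in range(60)] + [10 + 0.1 * math.sin(i) for i in range(40)]
alpha_prev, alpha_c, mean_c, var_c = em_fixed_first(x_m, 0.0, 0.005, 9.0, 1.0)
```

Freezing the earlier component is what keeps each iteration local: only the data of two adjacent phases enter the fit, and only one component's parameters are re-estimated.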

Optionally, in one embodiment, determining, based on the two determined adjacent steady-state phase models, a transient phase that may exist between the two steady-state phases includes:

for the two determined steady-state phases $X_{c-1}$ and $X_c$, testing the samples starting from the first sample point of the c-th steady-state phase, and recording the consecutive sample points that satisfy $p(x_n \mid c) < \rho$ as the transient phase $X_{c-1,c}$.
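A minimal sketch of this test is given below. It assumes scalar samples and reads "consecutive" as the unbroken run at the start of the c-th phase, which is an interpretation rather than something the text states explicitly; `transient_samples` is a hypothetical helper name.

```python
import math


def gaussian_pdf(x, mean, var):
    """Univariate Gaussian probability density."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def transient_samples(phase_c, mean_c, var_c, rho=0.001):
    """Return the leading samples of phase c whose density p(x_n | c) is below rho."""
    run = []
    for x in phase_c:
        if gaussian_pdf(x, mean_c, var_c) < rho:
            run.append(x)   # still poorly explained by model c: transient
        else:
            break           # first well-explained sample ends the transient run
    return run


# Samples drifting toward the new steady level of 10 before settling there.
phase_c = [5.0, 7.0, 9.0, 10.0, 10.1, 9.9, 10.05]
transient = transient_samples(phase_c, mean_c=10.0, var_c=0.01)
```

If the first sample of the phase already has a density of at least ρ, the returned list is empty, which matches the wording that a transient phase "may" exist between two steady-state phases.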

In addition, in order to solve the defects of the traditional technology, a phase partitioning system based on a local Gaussian mixture model is also provided.

A local gaussian mixture model based phase partitioning system, comprising:

an acquisition unit for acquiring samples and creating a historical training data set;

the first Gaussian distribution creating unit is used for selecting partial samples from the historical training data set according to the sampling time sequence to create a first Gaussian distribution model so as to determine first steady-state phase data;

a Gaussian mixture model creating unit for creating, based on the previously determined Gaussian model, a Gaussian mixture model containing two Gaussian components to determine the next steady-state phase, and for completing the phase division of all sample data in cooperation with the transient phase acquisition unit;

a transient phase acquisition unit for determining corresponding transient phase data based on two adjacent steady state phase data.

Optionally, in one embodiment, the first gaussian distribution creating unit includes:

a first data acquisition module for sequentially selecting the first $N_1$ samples from the historical training data set and calculating their mean and variance to obtain the corresponding Gaussian distribution function $p(x \mid 1)$, where $p(x \mid 1)$ denotes the probability density function of the first Gaussian distribution model and $x$ denotes the collected sample data;

a first steady-state phase verification module for extracting sample points for steady-state phase verification, that is, starting from the ($N_1/2$)-th sample point, verifying the samples to find three consecutive sample points that satisfy the first verification condition, and recording the sequence number corresponding to these sample points as

The verification condition is

where ρ is a pre-specified threshold;

a first steady-state phase determination module for judging whether $N_1$ is equal to

if so, the result has converged, that is, the first steady-state phase is determined and processing proceeds to the next step; otherwise, setting

and having the first steady-state phase verification module iterate again until $N_1$ is equal to

Optionally, in one embodiment, the gaussian mixture model creating unit includes:

a second data obtaining module for creating and training, based on the previously determined Gaussian model, a Gaussian mixture model containing two Gaussian components; assuming that the first c-1 steady-state phases have been determined, where c is an integer greater than or equal to 2, the formula of the mixture model is

$p(x \mid \theta_c) = \alpha_{c-1}\, p(x \mid c-1) + \alpha_c\, p(x \mid c)$

where the probability density function $p(x \mid c-1)$ of the (c-1)-th Gaussian model has already been determined, and the (c-1)-th steady-state phase, containing $N_{c-1}$ samples, is denoted $X_{c-1}$; the probability density function $p(x \mid c)$ of the c-th Gaussian model is to be determined, and the c-th steady-state phase, assumed to contain $N_c$ samples, is denoted $X_c$. Let $X_m = \{X_{c-1}, X_c\}$ be the training data of the Gaussian mixture model, with corresponding training parameters $\theta_c = \{\alpha_{c-1}, \alpha_c, \mu_c, \Sigma_c\}$, where $\alpha_{c-1}$ and $\alpha_c$ are the mixing coefficients of the (c-1)-th and c-th Gaussian components in the Gaussian mixture model, and $\mu_c$ and $\Sigma_c$ are, respectively, the mean vector and covariance matrix of the c-th Gaussian probability density function $p(x \mid c)$; the mixture model is trained using the expectation-maximization (EM) algorithm;

a second steady-state phase verification module for extracting sample points to perform steady-state phase verification on the mixture model, that is, starting from the ($N_{c-1}/2$)-th sample point, verifying the samples to find three consecutive sample points that satisfy the second verification condition, and recording the sequence number corresponding to these sample points as

The verification condition is

where ρ is a pre-specified threshold;

a second steady-state phase determination module for judging whether

is equal to $N_c + N_{c-1}$; if so, the result has converged, that is, the c-th steady-state phase is determined; otherwise, setting

and having the second steady-state phase verification module iterate again until

is equal to $N_c + N_{c-1}$.

Optionally, in one embodiment, the processing performed by the transient phase acquisition unit includes: for the two determined steady-state phases $X_{c-1}$ and $X_c$, testing the samples starting from the first sample point of the c-th steady-state phase, and recording the consecutive sample points that satisfy $p(x_n \mid c) < \rho$ as the transient phase $X_{c-1,c}$.

In addition, to address the shortcomings of the conventional technology, a computer-readable storage medium is provided, which includes computer instructions that, when run on a computer, cause the computer to perform the method.

Implementing the embodiments of the invention overcomes the shortcomings of existing phase division methods, which do not consider temporal order or transient-phase division, and also provides the following beneficial effects: (1) from the viewpoint of Gaussian distributions, the invention describes a steady-state phase with a single Gaussian distribution and a transient phase with a mixture of two adjacent Gaussian distributions, so the phase division method can effectively divide the steady-state phases and determine the transient phases at the same time; (2) each iteration uses only local data for modeling and verification, which greatly reduces redundant calculation and improves computational efficiency; (3) the method adopts a step-by-step updating strategy that determines the steady-state and transient phases progressively in sampling-time order, so the number of phases need not be specified in advance and the division result needs no post-processing.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Wherein:

FIG. 1a is a diagram illustrating an initial model update in one embodiment;

FIG. 1b is a schematic diagram illustrating update of a mixture model according to an embodiment;

FIG. 2 is a diagram illustrating phase partitioning for a local Gaussian mixture model in one embodiment;

FIG. 3 is a schematic diagram of the penicillin fermentation process in one example;

FIG. 4 is a flow diagram of core steps in one embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. It will be understood that, as used herein, the terms "first," "second," and the like may be used herein to describe various elements, but these elements are not limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of the present application. The first and second elements are both elements, but they are not the same element.

To overcome the shortcomings of conventional phase partitioning methods, which do not consider temporal order or transient-phase division, this embodiment provides a phase partitioning method based on a local Gaussian mixture model. It uses an improved local Gaussian mixture model approach with step-wise probabilistic modeling to partition a multiphase batch process into phases. Each iteration establishes only a local model, namely either an initial model with a single Gaussian component or a mixture model containing two Gaussian components, so that the steady-state phases and transient phases of the process are determined together in an iterative manner. Specifically, as shown in FIG. 4, the method includes the following steps:

S1, collecting samples and creating a historical training data set; in some embodiments, batch process data are collected as

and unfolded into

to obtain the historical training data set;

S2, selecting a portion of the samples from the historical training data set in sampling-time order to create a first Gaussian distribution model so as to determine the first steady-state phase data; the purpose of this step is to establish an initial model containing only one Gaussian component, and a schematic diagram of the initial model update is shown in FIG. 1a; in some specific embodiments, selecting a portion of the samples from the historical training data set to create a first Gaussian distribution model so as to determine the first steady-state phase data comprises:

S22, extracting sample points for steady-state phase verification, that is, starting from the ($N_1/2$)-th sample point, verifying the samples to find three consecutive sample points that satisfy the first verification condition, and recording the sequence number corresponding to these sample points as

The verification condition is

where ρ is a pre-specified threshold, e.g., ρ = 0.001; if $N_1/2$ is not an integer, it may be rounded either down or up, e.g., if $N_1/2$ is 16.5, either 16 or 17 may be selected;

S23, judging whether $N_1$ is equal to

and returning to step S21 to iterate until $N_1$ is equal to

S3, based on the previously determined Gaussian model, creating a Gaussian mixture model containing two Gaussian components to determine the next steady-state phase; the purpose of this step is to establish a mixture model containing two Gaussian components; without loss of generality, it is assumed that the first c-1 steady-state phases have been determined and the c-th steady-state phase is to be determined, and a schematic diagram of the mixture model update is shown in FIG. 1b; in some specific embodiments, creating the Gaussian mixture model to determine the next steady-state phase data includes:

S31, creating and training a mixture model containing two Gaussian distribution functions, the formula of the mixture model being

$p(x \mid \theta_c) = \alpha_{c-1}\, p(x \mid c-1) + \alpha_c\, p(x \mid c)$

where, assuming that the first c-1 steady-state phases have been determined, the (c-1)-th steady-state phase, containing $N_{c-1}$ samples, is denoted $X_{c-1}$, the c-th steady-state phase, containing $N_c$ samples, is denoted $X_c$, and c is an integer greater than or equal to 2;

the mixture model is trained using the expectation-maximization (EM) algorithm, where $X_m = \{X_{c-1}, X_c\}$ denotes the training data and the corresponding training parameters are $\theta_c = \{\alpha_{c-1}, \alpha_c, \mu_c, \Sigma_c\}$; since the first c-1 steady-state phases have already been determined (for example, the mean and variance of the first Gaussian component have been determined in S2), this step only needs to compute the training parameters $\theta_c = \{\alpha_{c-1}, \alpha_c, \mu_c, \Sigma_c\}$;

S32, extracting sample points to perform steady-state phase verification on the mixture model, that is, starting from the ($N_{c-1}/2$)-th sample point, verifying the samples to find three consecutive sample points that satisfy the second verification condition, and recording the sequence number corresponding to these sample points as

The verification condition is

Where ρ is a pre-specified threshold, e.g., ρ =0.001;

S33, judging whether

is equal to $N_c + N_{c-1}$; if so, the result has converged, that is, the c-th steady-state phase is determined; otherwise, setting

and returning to step S31 to iterate until

is equal to $N_c + N_{c-1}$.
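For reference, the EM iteration for the two-component mixture $p(x \mid \theta_c)$ can be written out as below. This is the textbook EM update for a Gaussian mixture, specialised under the assumption that the parameters of component c-1 are held fixed; the responsibility symbol $\gamma_n$ and the sample count $N = N_{c-1} + N_c$ of $X_m$ are notation introduced here and do not appear in the source.

```latex
\begin{aligned}
\text{E-step:}\quad \gamma_n &= \frac{\alpha_c\, p(x_n \mid c)}{\alpha_{c-1}\, p(x_n \mid c-1) + \alpha_c\, p(x_n \mid c)}, \qquad x_n \in X_m, \\
\text{M-step:}\quad \alpha_c &= \frac{1}{N} \sum_{n=1}^{N} \gamma_n, \qquad \alpha_{c-1} = 1 - \alpha_c, \\
\mu_c &= \frac{\sum_{n=1}^{N} \gamma_n\, x_n}{\sum_{n=1}^{N} \gamma_n}, \\
\Sigma_c &= \frac{\sum_{n=1}^{N} \gamma_n\, (x_n - \mu_c)(x_n - \mu_c)^{\mathsf{T}}}{\sum_{n=1}^{N} \gamma_n}.
\end{aligned}
```

The two steps are alternated until the parameters stop changing; the mean and covariance of component c-1 never appear on the left-hand side, which reflects that only $\theta_c$ is estimated.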

S4, determining any transient phase that may exist between two steady-state phases based on the two determined adjacent steady-state phase models; a schematic diagram of the phase division of the method is shown in FIG. 2. In some specific embodiments, determining, based on the two determined adjacent steady-state phase models, a transient phase that may exist between the two steady-state phases includes: for the two determined steady-state phases $X_{c-1}$ and $X_c$, testing the samples starting from the first sample point of the c-th steady-state phase, and recording the consecutive sample points that satisfy $p(x_n \mid c) < \rho$ as the transient phase $X_{c-1,c}$.

S5, repeating S3 and S4 to complete the phase division of all sample data.

In addition, to address the shortcomings of the conventional technology, a phase partitioning system based on a local Gaussian mixture model is further provided, which includes:

an acquisition unit for acquiring samples and creating a historical training data set; in some embodiments, batch process data are collected as

and unfolded into

to obtain the historical training data set;

the first Gaussian distribution creating unit is used for selecting partial samples from the historical training data set according to the sampling time sequence to create a first Gaussian distribution model so as to determine first steady-state phase data; in some specific embodiments, the first gaussian distribution creating unit includes:

a first data acquisition module for sequentially selecting the first $N_1$ samples from the historical training data set and calculating their mean and variance to obtain the corresponding Gaussian distribution model $p(x \mid 1)$, where $p(x \mid 1)$ denotes the probability density function of the first Gaussian distribution model and $x$ denotes the collected sample data;

The verification condition is

Where ρ is a pre-specified threshold, e.g., ρ =0.001;

a Gaussian mixture model creating unit for creating, based on the previously determined Gaussian model, a Gaussian mixture model containing two Gaussian components to determine the next steady-state phase, and for completing the phase division of all sample data in cooperation with the transient phase acquisition unit; that is, the Gaussian mixture model creating unit determines the steady-state phases and the transient phase acquisition unit determines the transient phases; in some specific embodiments, the Gaussian mixture model creating unit includes:

a second data obtaining module for creating and training, based on the previously determined Gaussian model, a mixture model containing two Gaussian distribution functions; without loss of generality, assuming that the first c-1 steady-state phases have been determined, where c is an integer greater than or equal to 2, the formula of the mixture model is

$p(x \mid \theta_c) = \alpha_{c-1}\, p(x \mid c-1) + \alpha_c\, p(x \mid c)$

where the probability density function $p(x \mid c-1)$ of the (c-1)-th Gaussian model has already been determined, and the (c-1)-th steady-state phase, containing $N_{c-1}$ samples, is denoted $X_{c-1}$; the probability density function $p(x \mid c)$ of the c-th Gaussian model is to be determined, and the c-th steady-state phase, assumed to contain $N_c$ samples, is denoted $X_c$. Let $X_m = \{X_{c-1}, X_c\}$ be the training data of the Gaussian mixture model, with corresponding training parameters $\theta_c = \{\alpha_{c-1}, \alpha_c, \mu_c, \Sigma_c\}$, where $\alpha_{c-1}$ and $\alpha_c$ are the mixing coefficients of the (c-1)-th and c-th Gaussian components in the Gaussian mixture model, and $\mu_c$ and $\Sigma_c$ are, respectively, the mean vector and covariance matrix of the c-th Gaussian probability density function $p(x \mid c)$; the mixture model is trained using the expectation-maximization (EM) algorithm;

a second steady-state phase verification module for extracting sample points to perform steady-state phase verification on the mixture model, that is, starting from the ($N_{c-1}/2$)-th sample point, verifying the samples to find three consecutive sample points that satisfy the second verification condition, and recording the sequence number corresponding to these sample points as

The verification condition is

Where ρ is a pre-specified threshold, e.g., ρ =0.001;

a second steady-state phase determination module for judging whether

is equal to $N_c + N_{c-1}$; if so, the result has converged, that is, the c-th steady-state phase is determined; otherwise, setting

and having the second steady-state phase verification module iterate again until

is equal to $N_c + N_{c-1}$.

a transient phase acquisition unit for determining the corresponding transient phase data based on two adjacent steady-state phase data, so as to complete the phase division of all sample data. In some specific embodiments, the processing performed by the transient phase acquisition unit includes: for the two determined steady-state phases $X_{c-1}$ and $X_c$, testing the samples starting from the first sample point of the c-th steady-state phase, and recording the consecutive sample points that satisfy $p(x_n \mid c) < \rho$ as the transient phase $X_{c-1,c}$.

Based on the same inventive concept, the present invention also proposes a computer-readable storage medium comprising computer instructions which, when run on a computer, cause the computer to perform the method.

Based on the above technical solution, the effectiveness of the method is verified using a specific experimental example, namely the penicillin fermentation process; a schematic diagram of the penicillin fermentation process is shown in FIG. 3.

Specifically, the method comprises the following steps:

In the stage of collecting samples and creating a historical training data set: a total of 20 batches of normal data are generated for phase division, and white noise distributed as N(0, 0.04) is added to each batch. The reaction time of each batch is set to 400 h with one sample taken every 1 h, so each batch contains 400 sample points, and each sample point contains 11 variables, as listed in Table 1.
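The dimensions of this data set can be made concrete with a small placeholder generator. It is only a sketch: it produces pure noise in place of the fermentation simulator output, and it assumes that the 0.04 in N(0, 0.04) is a variance (so the standard deviation is 0.2), which the text does not state. The name `make_noisy_batches` is hypothetical.

```python
import random


def make_noisy_batches(n_batches=20, n_samples=400, n_vars=11,
                       noise_var=0.04, seed=0):
    """Return nested lists indexed [batch][sample][variable], filled with noise."""
    rng = random.Random(seed)
    noise_std = noise_var ** 0.5  # ASSUMPTION: 0.04 is a variance, so std = 0.2
    return [[[rng.gauss(0.0, noise_std) for _ in range(n_vars)]
             for _ in range(n_samples)]
            for _ in range(n_batches)]


batches = make_noisy_batches()
shape = (len(batches), len(batches[0]), len(batches[0][0]))
total_points = shape[0] * shape[1]  # sample points across all batches
```

In a real experiment the noise term would be added to the simulated trajectories of the 11 process variables rather than used on its own.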

TABLE 1

In the phase division stage, in which the first steady-state phase data and each subsequent steady-state phase data are determined in sampling-time order and all sample data are divided: the number of sample points used to initialize the first phase is set to three times the number of variables, i.e., $N_1 = 33$. The division results for ρ = 0.001 are shown in Table 2, from which it can be seen that the entire process is divided into approximately 10 steady-state phases and three transient phases. There are three very short phases between the first and fifth steady-state phases. In the actual fermentation process, the initial stage is a relatively stable pre-culture stage, which corresponds to the first steady-state phase. A vigorous reaction stage follows, corresponding to the next three short steady-state phases. The process then enters the fed-batch stage, passes through a period of transition into a stable fermentation stage, and finally reaches the autolysis stage. The partitioning results of the method thus correspond well to the actual process stages.
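The arithmetic behind the initial setting is small enough to spell out: with 11 variables the initial window is 33 samples, so the verification start index $N_1/2$ is 16.5 and must be rounded, which is the case the description covers by allowing either 16 or 17. The helper names below are hypothetical.

```python
import math


def initial_sample_count(n_vars, factor=3):
    """Initial modeling window: a multiple of the number of variables."""
    return factor * n_vars


def start_index_candidates(n1):
    """Admissible roundings of n1 / 2 (one value when n1 is even)."""
    half = n1 / 2
    return sorted({math.floor(half), math.ceil(half)})


n1 = initial_sample_count(11)
candidates = start_index_candidates(n1)
```

Which of the two candidates is used only shifts where verification begins by one sample, so it should have little effect on the converged boundary.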

TABLE 2

The embodiment of the invention has the following beneficial effects:

In addition to overcoming the shortcomings of existing phase division methods, which do not consider temporal order or transient-phase division, the method has the following beneficial effects: (1) from the viewpoint of Gaussian distributions, the invention describes a steady-state phase with a single Gaussian distribution and a transient phase with a mixture of two adjacent Gaussian distributions, so the phase division method can effectively divide the steady-state phases and also determine the transient phases; (2) each iteration uses only local data for modeling and verification, which greatly reduces redundant calculation and improves computational efficiency; (3) the method adopts a step-by-step updating strategy that determines the steady-state and transient phases progressively in sampling-time order, so the number of phases need not be specified in advance and the division result needs no post-processing.

The above embodiments represent only several implementations of the present application; their description is specific and detailed, but should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims

1. A phase partitioning method based on a local Gaussian mixture model, comprising the following steps:

S1, collecting samples and creating a historical training data set;

S5, repeating S3 and S4 to complete the phase division of all sample data;

wherein selecting a portion of the samples from the historical training data set to create a first Gaussian distribution model to determine first steady-state phase data comprises:

The verification condition is

Wherein ρ is a pre-specified threshold;

S23, judging whether $N_1$ is equal to

and returning to step S21 to iterate until $N_1$ is equal to

Further, in the above-mentioned case,

S31, based on the previously determined Gaussian model, creating and training a mixture model containing two Gaussian distribution functions; assuming that the first c-1 steady-state phases have been determined, where c is an integer greater than or equal to 2, the mixture model is given by

$p(x \mid \theta_c) = \alpha_{c-1}\, p(x \mid c-1) + \alpha_c\, p(x \mid c)$

wherein the probability density function $p(x \mid c-1)$ of the (c-1)-th Gaussian model is already determined, and the (c-1)-th steady-state phase, containing $N_{c-1}$ samples, is $X_{c-1}$; the probability density function $p(x \mid c)$ of the c-th Gaussian model is to be determined, and the c-th steady-state phase, assumed to contain $N_c$ samples, is $X_c$; $X_m = \{X_{c-1}, X_c\}$ denotes the training data of the Gaussian mixture model, and the corresponding training parameters are $\theta_c = \{\alpha_{c-1}, \alpha_c, \mu_c, \Sigma_c\}$; $\alpha_{c-1}$ and $\alpha_c$ are the mixing coefficients of the (c-1)-th and c-th Gaussian components in the Gaussian mixture model, respectively; $\mu_c$ and $\Sigma_c$ are the mean vector and covariance matrix, respectively, of the c-th Gaussian probability density function $p(x \mid c)$; and the mixture model is trained using the expectation-maximization (EM) algorithm;

The verification condition is

Where ρ is a pre-specified threshold;

S33, judging whether

and returning to step S31 to iterate until

is equal to $N_c + N_{c-1}$.
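The training in step S31 can be sketched as an EM loop in which the (c-1)-th component is held fixed and only α_{c-1}, α_c, μ_c and Σ_c are updated, matching the parameter set θ_c named in the claim. This is an illustrative sketch rather than the patented implementation: the function names, the initialization of the new component from the last `n_init` samples, the ridge term, and the stopping rule are all assumptions.

```python
import numpy as np

def log_gaussian(X, mu, cov):
    """Row-wise log density of a multivariate Gaussian."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    d = X - mu
    k = mu.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    maha = np.sum(d * np.linalg.solve(cov, d.T).T, axis=1)
    return -0.5 * (k * np.log(2.0 * np.pi) + logdet + maha)

def fit_local_gmm(X_m, mu_prev, cov_prev, n_init,
                  n_iter=100, ridge=1e-6, tol=1e-8):
    """EM for p(x) = a_prev * p(x | c-1) + a_c * p(x | c) with p(x | c-1) fixed.

    X_m holds the local training data {X_{c-1}, X_c} in time order.
    Assumption: the new component is initialized from the last n_init rows.
    """
    X_m = np.asarray(X_m, dtype=float)
    n, k = X_m.shape
    tail = X_m[-n_init:]
    mu_c = tail.mean(axis=0)
    cov_c = np.cov(tail, rowvar=False) + ridge * np.eye(k)
    alpha_c = n_init / n
    log_prev = log_gaussian(X_m, mu_prev, cov_prev)  # never re-estimated
    eps = 1e-12
    for _ in range(n_iter):
        # E-step: responsibility of the new component for each sample.
        a = np.log(1.0 - alpha_c) + log_prev
        b = np.log(alpha_c) + log_gaussian(X_m, mu_c, cov_c)
        resp = np.exp(b - np.logaddexp(a, b))
        # M-step: update only the mixing coefficients and the new component.
        n_c = max(resp.sum(), eps)
        alpha_new = float(np.clip(n_c / n, eps, 1.0 - eps))
        mu_c = resp @ X_m / n_c
        d = X_m - mu_c
        cov_c = (resp[:, None] * d).T @ d / n_c + ridge * np.eye(k)
        converged = abs(alpha_new - alpha_c) < tol
        alpha_c = alpha_new
        if converged:
            break
    return 1.0 - alpha_c, alpha_c, mu_c, cov_c
```

Holding the earlier component fixed means each iteration touches only the local data X_m, which is consistent with the description's point that local modeling avoids recomputing phases that have already been determined.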

2. The phase partitioning method according to claim 1, wherein determining a transient phase that may exist between two steady-state phases based on the two determined adjacent steady-state phase models comprises: for the two determined steady-state phases $X_{c-1}$ and $X_c$, testing from the first sample point of the c-th steady-state phase onward to find the consecutive sample points satisfying $p(x_n \mid c) < \rho$, which are recorded as the transient phase $X_{c-1,c}$.
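The test in claim 2 can be sketched as a scan over the densities p(x_n | c) of the samples assigned to the c-th phase, in time order. The sketch below assumes that "consecutive" means the unbroken run of samples at the start of the phase whose density is below ρ; this reading, and the function name `transient_length`, are assumptions made for illustration.

```python
def transient_length(densities, rho):
    """Count the leading samples of phase c whose density p(x_n | c) is below rho.

    densities: values of p(x_n | c) for the samples of the c-th phase,
    ordered by sampling time. The returned count is the number of samples
    at the start of the phase that would be relabelled as the transient
    phase X_{c-1,c}; the scan stops at the first sample with density >= rho.
    """
    count = 0
    for p in densities:
        if p < rho:
            count += 1
        else:
            break
    return count

# Two leading samples fall below rho = 0.001; the later dip is not
# contiguous with the start of the phase and is therefore ignored.
print(transient_length([1e-5, 2e-4, 0.5, 1e-6, 0.3], 0.001))  # prints 2
```

A return value of 0 corresponds to the case where no transient phase exists between the two steady-state phases.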

3. A phase partitioning system based on a local gaussian mixture model, comprising:

a first Gaussian distribution creating unit, configured to select a part of samples from the historical training data set according to a sampling time order to create a first Gaussian distribution model to determine first steady-state phase data;

a Gaussian mixture model creating unit, configured to create a Gaussian mixture model containing two Gaussian components based on the previously determined Gaussian model so as to determine the next steady-state phase, and to complete the phase division of all sample data in cooperation with the transient phase acquisition unit;

a transient phase acquisition unit for determining a transient phase that may exist between two steady-state phases based on the determined two adjacent steady-state phase models;

wherein the first gaussian distribution creating unit includes:

a first data acquisition module for sequentially selecting the first $N_1$ samples of the historical training data set and calculating their mean and variance to obtain the corresponding Gaussian distribution function $p(x \mid 1)$, wherein $p(x \mid 1)$ represents the probability density function of the first Gaussian distribution model, and x represents the acquired sample data;

a first steady-state phase verification module for extracting sample points for steady-state phase verification, i.e. from the Nth ₁ The 2 sample points start to verify to find three continuous sample points meeting the first verification condition and mark the sequence numbers corresponding to the sample points meeting the first verification condition as

The verification condition is

Where ρ is a pre-specified threshold;

a first steady-state phase determination module for determining whether $N_1$ is equal to

The Gaussian mixture model creating unit includes:

$p(x \mid \theta_c) = \alpha_{c-1}\, p(x \mid c-1) + \alpha_c\, p(x \mid c)$

wherein the probability density function $p(x \mid c-1)$ of the (c-1)-th Gaussian model is already determined, and the (c-1)-th steady-state phase, containing $N_{c-1}$ samples, is $X_{c-1}$; the probability density function $p(x \mid c)$ of the c-th Gaussian model is to be determined, and the c-th steady-state phase, assumed to contain $N_c$ samples, is $X_c$; $X_m = \{X_{c-1}, X_c\}$ denotes the training data of the Gaussian mixture model, and the corresponding training parameters are $\theta_c = \{\alpha_{c-1}, \alpha_c, \mu_c, \Sigma_c\}$; $\alpha_{c-1}$ and $\alpha_c$ are the mixing coefficients of the (c-1)-th and c-th Gaussian components in the Gaussian mixture model, respectively; $\mu_c$ and $\Sigma_c$ are the mean vector and covariance matrix, respectively, of the c-th Gaussian probability density function $p(x \mid c)$; and the mixture model is trained using the expectation-maximization (EM) algorithm;

a second steady-state phase verification module for extracting sample points to perform steady-state phase verification on the mixed model, i.e. from N _c-1 The 2 sample points start to verify to find three continuous sample points meeting the second verification condition and mark the sequence numbers corresponding to the sample points meeting the second verification condition as

The verification condition is

Where ρ is a pre-specified threshold;

a second steady state phase determination module for determining

and the second steady-state phase verification module iterates again until

is equal to $N_c + N_{c-1}$.

4. The system of claim 3, wherein the processing of the transient phase acquisition unit comprises: for the two determined steady-state phases $X_{c-1}$ and $X_c$, testing from the first sample point of the c-th steady-state phase onward to find the consecutive sample points satisfying $p(x_n \mid c) < \rho$, which are recorded as the transient phase $X_{c-1,c}$.