CN110333077B - Rolling bearing fault diagnosis method based on approximate support vector - Google Patents

Publication number
CN110333077B
Authority
CN
China
Prior art keywords
support vector
formula
samples
index
sample
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910611796.7A
Other languages
Chinese (zh)
Other versions
CN110333077A (en
Inventor
熊庆宇
吉皇
吴映波
王凯歌
吴丹
邹青宏
何委燚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Application filed by Chongqing University
Priority to CN201910611796.7A
Publication of CN110333077A
Application granted
Publication of CN110333077B
Legal status: Active

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01M TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
    • G01M13/00 Testing of machine parts
    • G01M13/04 Bearings
    • G01M13/045 Acoustic or vibration analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08 Feature extraction

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Acoustics & Sound (AREA)
  • Testing Of Devices, Machine Parts, Or Other Structures Thereof (AREA)

Abstract

The invention discloses a rolling bearing fault diagnosis method based on approximate support vectors. First, a MiniBatchKmeans algorithm and a support vector machine are used to quickly find approximate support vectors; the samples near the approximate support vectors are then screened out of the original samples; finally, the support vector machine is trained on the screened samples. The invention greatly reduces the training time of the support vector machine while maintaining the accuracy of fault diagnosis.

Description

Rolling bearing fault diagnosis method based on approximate support vector
Technical Field
The invention relates to the field of bearing fault diagnosis, in particular to a rolling bearing fault diagnosis method based on approximate support vectors.
Background
With the continuous advance of science and technology, the use of modern large-scale machinery has matured, and much of that machinery is becoming increasingly precise and complex. Modern machines have greatly improved production efficiency. However, production accidents caused by mechanical failures are also increasing: once one component breaks down, the whole mechanical system stops working, production cannot proceed normally, and lives and property may even be seriously harmed. It is therefore important to identify the type of mechanical fault and take corresponding measures in time.
Rolling bearings are among the most important parts of machinery and also among the parts most prone to failure. According to the inventors' research, rolling bearing faults account for 35%-50% of mechanical faults; a method that can diagnose rolling bearing faults accurately and quickly is therefore urgently needed in the field.
At present, rolling bearing fault diagnosis relies mainly on identifying the vibration signal of the faulty bearing in its working state. The signal is acquired by a sensor assembly, including an acceleration sensor, mounted on the machine. Specifically, when the faulty part of a rolling bearing comes into contact with other parts, an impact pulse force is generated, so the vibration signal in that time period shows distinct fluctuation characteristics. Features extracted from the vibration signal produced while the faulty bearing runs can therefore be used to diagnose the fault type and fault severity. In recent years, various classifiers, including expert systems and artificial neural networks (ANNs), have been used for rolling bearing fault detection. However, the inventors found that in use these classifiers tend to fall into local optima and to overfit the data.
To address these problems, fault diagnosis methods based on the support vector machine have been introduced in the prior art. The support vector machine is a powerful machine learning classification algorithm; kernel functions make it applicable to both linearly separable and linearly inseparable data, it is less prone to overfitting than a neural network, and it is robust. According to the inventors' survey, practical applications of the support vector machine include: 1) detecting and identifying rolling bearing faults by combining EEMD with a support vector machine optimized by inter-cluster distance; 2) extracting sample entropy and energy ratio features from the original vibration signal through LMD and then diagnosing rolling bearing faults with an SVM classifier; 3) diagnosing wind power converter faults with an improved AdaBoost-SVM; 4) diagnosing gear faults by extracting features with EMD and combining them with a support vector machine optimized by particle swarm optimization; 5) diagnosing gear faults with a support vector machine whose parameters are optimized by backtracking search, which removes the need to set the parameters manually.
These applications show that the support vector machine can achieve good classification results. However, the support vectors are only the small number of samples that determine the hyperplane, so when the data volume is large, training incurs a high computational cost and is time-consuming. The cost can be mitigated by removing training data that has little or no influence on the construction of the hyperplane; the central difficulty is how to correctly distinguish support vectors from non-support vectors so that the latter can be removed. Prior-art methods for improving the efficiency of the support vector machine use various clustering methods to reduce the training data: a training set smaller than the original data is extracted from the clustered data and used to train the support vector machine, thereby reducing training time.
However, according to the inventors' research, these methods still cannot identify non-support vectors reasonably or with sufficient precision, and the clustering itself takes considerable time. For example, a support vector machine may be trained on the cluster centers produced by the K-means algorithm. Although this reduces training time to some extent, K-means itself is time-consuming when the data volume is large, and by the principle of K-means a cluster center is not an actual sample but the mean of all samples in the cluster. Moreover, most cluster centers are not support vectors, while samples that are not cluster centers may be support vectors, so suitable samples cannot be screened out to train the support vector machine; the fault diagnosis results become unreliable, which seriously affects troubleshooting and the use of the machinery.
Disclosure of Invention
The invention aims to provide a rolling bearing fault diagnosis method based on approximate support vectors, which greatly improves the calculation efficiency of a support vector machine and ensures the accuracy of fault diagnosis.
The technical scheme adopted by the invention for solving the technical problems is as follows: a rolling bearing fault diagnosis method based on approximate support vectors comprises the steps of firstly, respectively collecting vibration acceleration signals of a bearing in a normal state and different fault mode states through an acceleration sensor, and carrying out data preprocessing on the vibration acceleration signals to obtain sample points containing the vibration signals in different states, and further comprises the following steps:
S100, extracting time domain features and frequency domain features from the sample points, wherein the time domain features comprise a peak value, an average value, a root mean square value, a skewness, a kurtosis, a waveform index, a peak index, a pulse index, a kurtosis index, a margin index and a skewness index; meanwhile, Fourier transform is performed on the vibration signal of each sample point, and the frequency domain features of each vibration signal are extracted;
S200, performing Hilbert-Huang transform on the original vibration signal and extracting time-frequency domain features, including normalizing the first 4 eigenmode functions obtained by the decomposition and extracting the time-frequency domain feature quantities from the normalized data;
S300, arranging the feature quantity data extracted in S100 and S200 into a data set X = {X_1, X_2, ..., X_i, ..., X_n}, with corresponding fault types Y = {y_1, y_2, ..., y_i, ..., y_n}:
Figure GDA0003041394710000041
where X_1 to X_n denote the 1st to n-th feature quantity data set samples and y_1 to y_n denote the 1st to n-th fault types; x_1^(m) to x_n^(m) indicate that each of the samples 1 to n has m features; x_i^(j) denotes the j-th feature of the i-th sample; and y_i denotes the type label corresponding to the i-th sample.
Preferably, in S100, the peak value is the arithmetic mean of the 10 largest absolute values over the full length of one signal sample, and the average value is calculated by the formula
Figure GDA0003041394710000042
The root mean square value is calculated by the formula
Figure GDA0003041394710000043
The skewness is calculated by the formula
Figure GDA0003041394710000044
The kurtosis is calculated by the formula
Figure GDA0003041394710000045
The formula for calculating the waveform index is
Figure GDA0003041394710000046
The calculation formula of the peak index is
Figure GDA0003041394710000047
The pulse index is calculated by the formula
Figure GDA0003041394710000048
The kurtosis index is calculated by the formula
Figure GDA0003041394710000049
Wherein
Figure GDA00030413947100000410
Figure GDA0003041394710000051
The margin index is calculated by the formula
Figure GDA0003041394710000052
Wherein
Figure GDA0003041394710000053
The skewness index is calculated by the formula
Figure GDA0003041394710000054
where N is the number of sampling points and x_i is the i-th sampling point.
Preferably, the calculation formula of the frequency domain features in step S100 includes:
Figure GDA0003041394710000055
Figure GDA0003041394710000056
Figure GDA0003041394710000057
Figure GDA0003041394710000058
where f(k) is the spectrum, k = 1, 2, …, K; K is the number of spectral lines; f_k is the frequency value of the k-th spectral line; p_f1 reflects the vibration energy in the frequency domain; p_f2, p_f3, p_f4, p_f6, p_f7, p_f11, p_f12 and p_f13 describe the convergence of the spectral power; and p_f5, p_f8, p_f9 and p_f10 describe the position change of the main frequency.
Preferably, the S300 further includes the steps of:
S301, dividing the data set into k classes by the MiniBatchKmeans algorithm, wherein the k classes include classes containing only a single label y_i and classes containing h labels; for a class containing only a single label, the average value of its data is calculated by formula (1) as a "representative sample" with label y_i, and these representative samples are put into set M, i.e.
Figure GDA0003041394710000061
where
Figure GDA0003041394710000062
d represents the number of samples contained in the cluster; 1 ≤ i ≤ n, h > 1;
the data classes containing multiple labels are placed in set N; a data class containing h labels is subdivided into h sub-clusters so that each sub-cluster contains only a single label, h representative samples are selected from the h sub-clusters by formula (1), and these samples are put into set M;
S302, taking the samples in set M as training samples to train a support vector machine model;
S303, taking the support vectors of the support vector machine trained in the previous step as approximate support vectors, and putting all samples in the cluster to which each approximate support vector belongs into a set S;
S304, letting F = N ∪ S, and taking the samples in F as the final training samples to train the support vector machine.
The beneficial effects of the invention are as follows: the invention provides a new method that greatly reduces the training time of the support vector machine while maintaining accuracy. Approximate support vectors are found quickly by the MiniBatchKmeans algorithm and a support vector machine, and the samples near the approximate support vectors are then screened out of the original samples.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the present invention will be further described with reference to the accompanying drawings and embodiments, wherein the drawings in the following description are only part of the embodiments of the present invention, and for those skilled in the art, other drawings can be obtained without inventive efforts according to the accompanying drawings:
FIG. 1 is a schematic flow chart illustrating the steps of a rolling bearing fault diagnosis method based on approximate support vectors according to the present invention;
FIG. 2 is a schematic diagram a of sample training in a model of a rolling bearing fault diagnosis method based on approximate support vectors according to the present invention;
FIG. 3 is a schematic diagram b of sample training in a model of a rolling bearing fault diagnosis method based on approximate support vectors according to the present invention;
FIG. 4 is a schematic diagram c of sample training in a model of a rolling bearing fault diagnosis method based on approximate support vectors according to the present invention;
FIG. 5 is a schematic diagram of original sample data of a rolling bearing fault diagnosis method based on approximate support vectors according to the present invention;
FIG. 6 is a schematic diagram of a final training sample of the rolling bearing fault diagnosis method based on approximate support vectors according to the present invention;
FIG. 7 is a comparison graph of simulated data training time of the rolling bearing fault diagnosis method based on approximate support vectors according to the present invention;
FIG. 8 is a comparison graph of the simulation data accuracy of the rolling bearing fault diagnosis method based on approximate support vectors according to the present invention;
FIG. 9 is a diagram of an experimental apparatus for diagnosing a rolling bearing fault based on approximate support vectors according to the present invention;
FIG. 10 is a fault type and fault degree table of the rolling bearing fault diagnosis method based on approximate support vectors according to the present invention;
FIG. 11 is an empirical mode decomposition diagram of a rolling bearing fault diagnosis method based on approximate support vectors according to the present invention;
FIG. 12 is a comparison graph of the fault diagnosis model training time of the rolling bearing fault diagnosis method based on approximate support vectors according to the present invention;
FIG. 13 is a comparison graph of the fault diagnosis model accuracy of the rolling bearing fault diagnosis method based on approximate support vectors according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the following will clearly and completely describe the technical solutions in the embodiments of the present invention, and it is obvious that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without inventive step, are within the scope of the present invention.
In embodiment 1, a rolling bearing fault diagnosis method based on approximate support vectors includes that vibration acceleration signals of a bearing in a normal state and in different fault mode states are respectively collected by an acceleration sensor, and data preprocessing is performed on the vibration acceleration signals to obtain sample points containing vibration signals in different states, and further includes the following steps:
S100, extracting time domain features and frequency domain features from the sample points, wherein the time domain features comprise a peak value, an average value, a root mean square value, a skewness, a kurtosis, a waveform index, a peak index, a pulse index, a kurtosis index, a margin index and a skewness index; meanwhile, Fourier transform is performed on the vibration signal of each sample point, and the frequency domain features of each vibration signal are extracted;
S200, as shown in FIG. 11, performing Hilbert-Huang transform on the original vibration signal and extracting time-frequency domain features, including extracting the instantaneous amplitude, instantaneous phase, and instantaneous frequency of the first 4 eigenmode functions obtained by the decomposition;
S300, arranging the feature quantity data extracted in S100 and S200 into a data set X = {X_1, X_2, ..., X_i, ..., X_n}, with corresponding fault types Y = {y_1, y_2, ..., y_i, ..., y_n}:
Figure GDA0003041394710000091
where X_1 to X_n denote the 1st to n-th feature quantity data set samples and y_1 to y_n denote the 1st to n-th fault types; x_1^(m) to x_n^(m) indicate that each of the samples 1 to n has m features; x_i^(j) denotes the j-th feature of the i-th sample; and y_i denotes the type label corresponding to the i-th sample.
Specifically, as shown in FIG. 1, the vibration signal of the bearing is first collected, and the collected data is then preprocessed. The preprocessing takes the original data set as input and proceeds as follows: the data formats in the original data set are unified so that all values in the same column have the same data type, while different columns may have different data types; the data set is checked for missing values (null values), and samples containing missing values are discarded; the data set is checked for repeated samples, i.e. samples with the same attribute values and the same category, and repeated samples are merged into a single sample; the data set is checked for invalid samples, i.e. samples with the same attribute values but different categories, and invalid samples are removed; the data set is checked for abnormal values using a box plot, and samples containing abnormal values are discarded. The output is a data set that contains no missing values, abnormal values, repeated samples, or invalid samples and has a uniform data format.
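The cleaning steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `preprocess`, the `(features, label)` row layout, the use of `None` for missing values, and the 1.5 × IQR whisker rule for the box plot are all choices made for this example. Format unification and noise reduction are not covered.

```python
from statistics import quantiles

def preprocess(rows, whisker=1.5):
    """Clean a list of (features, label) rows; all names here are illustrative."""
    # 1) Discard samples that contain a missing value (None).
    rows = [(tuple(f), y) for f, y in rows
            if y is not None and all(v is not None for v in f)]
    # 2) Merge repeated samples (same features, same label) into one, keeping order.
    rows = list(dict.fromkeys(rows))
    # 3) Remove invalid samples: identical features but conflicting labels.
    labels_seen = {}
    for f, y in rows:
        labels_seen.setdefault(f, set()).add(y)
    rows = [(f, y) for f, y in rows if len(labels_seen[f]) == 1]
    # 4) Box-plot rule (assumed 1.5 * IQR whiskers), applied per feature column.
    if len(rows) < 2:
        return rows
    bounds = []
    for column in zip(*(f for f, _ in rows)):
        q1, _, q3 = quantiles(column, n=4)
        iqr = q3 - q1
        bounds.append((q1 - whisker * iqr, q3 + whisker * iqr))
    return [(f, y) for f, y in rows
            if all(lo <= v <= hi for v, (lo, hi) in zip(f, bounds))]
```

The order of the checks follows the text: conflicting samples are removed before the box-plot step so that they do not influence the quartiles.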
Finally, the feature data, the clusters, and the approximate support vectors are computed with the help of the MiniBatchKmeans algorithm to obtain the final training data, with which the diagnosis model is trained. The vibration acceleration signals of the bearing in the normal state and in different fault mode states are collected by an acceleration sensor, the collected signals are preprocessed by abnormal value elimination and noise reduction, and corresponding labels are added to obtain sample points containing the vibration signals in the different states.
In step S100, the time domain features include 5 dimensional statistical parameters, namely the peak value, average value, root mean square value, skewness, and kurtosis, where the peak value is the arithmetic mean of the 10 largest absolute values over the full length of one signal sample; the time domain features also include 6 dimensionless statistical parameters, namely the waveform index, peak index, pulse index, kurtosis index, margin index, and skewness index.
In step S200, the time-frequency domain features are extracted as follows: empirical mode decomposition (EMD) is first performed on the original signal, and Hilbert spectrum analysis is then performed in turn on the first 4 eigenmode functions (IMFs) obtained, as shown in FIG. 11, to extract the instantaneous amplitude, instantaneous phase, and instantaneous frequency of each IMF. The specific steps are:
S201, find the local maxima and minima of the signal, fit the upper envelope u(t) and the lower envelope v(t) by cubic spline interpolation, and calculate the mean of the two envelopes at each point to obtain the mean curve m_1(t), i.e. m_1(t) = [u(t) + v(t)]/2; with x(t) denoting the signal under analysis, subtract m_1(t) from x(t) to obtain h_1 = x(t) - m_1(t);
S202, if h_1 satisfies the two IMF conditions, h_1 is the first IMF component of x(t); otherwise, treat h_1 as the original signal and repeat S201-S202 to obtain the mean m_11 of its upper and lower envelopes, then check whether h_11 = h_1 - m_11 satisfies the two IMF conditions; if not, repeat the cycle k times, obtaining h_1k = h_1(k-1) - m_1k, until h_1k satisfies the two IMF conditions, and denote by c_1 = h_1k the 1st IMF component of x(t) obtained by EMD;
S203, separate c_1 from x(t) to obtain r_1 = x(t) - c_1, treat r_1 as the original data and repeat S201-S203 to obtain the 2nd IMF component c_2 of x(t); repeating the cycle n times yields n IMF components of x(t), so that
$$x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t)$$
where r_n is the residual component; the decomposition ends when the residual is a constant or a monotonic function, which represents the average trend of the signal.
The decomposition is a scale-filtering process: each IMF component reflects a characteristic scale of the signal and represents an intrinsic modal characteristic of the nonlinear, non-stationary signal. After the IMF components of the signal are obtained, the Hilbert transform can be applied to each IMF. Let the Hilbert transform of c_i(t) be
Figure GDA0003041394710000111
Then there is
Figure GDA0003041394710000112
Thus, the analytic signal of the signal x(t) is
Figure GDA0003041394710000113
where
Figure GDA0003041394710000114
is the instantaneous amplitude;
Figure GDA0003041394710000115
is the instantaneous phase. In this way, a local optimum approximation is obtained by modulating the frequency and amplitude of a sinusoid. According to the definition of the instantaneous frequency, the instantaneous frequency of the IMF component is
Figure GDA0003041394710000116
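For reference, the quantities named above are conventionally written as follows. These are the standard textbook forms of the Hilbert transform and the analytic signal; they may differ in notation from the patent's own formula images, and the symbols H[·], z_i(t), a_i(t), φ_i(t), ω_i(t) and f_i(t) are naming assumptions made here.

```latex
\begin{aligned}
H[c_i(t)] &= \frac{1}{\pi}\,\mathrm{P.V.}\!\int_{-\infty}^{\infty} \frac{c_i(\tau)}{t-\tau}\,d\tau \\
z_i(t) &= c_i(t) + j\,H[c_i(t)] = a_i(t)\,e^{j\varphi_i(t)} \\
a_i(t) &= \sqrt{c_i^{2}(t) + H^{2}[c_i(t)]} \\
\varphi_i(t) &= \arctan\frac{H[c_i(t)]}{c_i(t)} \\
\omega_i(t) &= \frac{d\varphi_i(t)}{dt}, \qquad f_i(t) = \frac{1}{2\pi}\,\frac{d\varphi_i(t)}{dt}
\end{aligned}
```

Here a_i(t) is the instantaneous amplitude, φ_i(t) the instantaneous phase, and ω_i(t) (or f_i(t) in hertz) the instantaneous frequency of the i-th IMF.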
The two IMF conditions are: 1) over the whole time range of the function, the number of local extrema and the number of zero crossings are equal or differ by at most one; 2) at any point in time, the mean of the envelope of the local maxima (upper envelope) and the envelope of the local minima (lower envelope) is 0. In FIG. 11, EMD of an original signal produces a series of IMFs; the figure shows that the signal features are essentially concentrated in the first 4 IMFs and are not obvious in the later ones, so the first 4 IMFs are selected and normalized, which improves the accuracy of data processing.
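The first IMF condition is easy to check on a sampled signal. The sketch below is illustrative only; the function names are hypothetical, and the second condition (zero mean envelope) is not checked because it requires the spline envelopes of S201.

```python
def count_extrema_and_zero_crossings(h):
    """Count strict local extrema and sign changes of a sampled signal h."""
    extrema = 0
    for i in range(1, len(h) - 1):
        is_max = h[i] > h[i - 1] and h[i] > h[i + 1]
        is_min = h[i] < h[i - 1] and h[i] < h[i + 1]
        if is_max or is_min:
            extrema += 1
    # A zero crossing is counted when consecutive samples have opposite signs.
    crossings = sum(1 for a, b in zip(h, h[1:]) if a * b < 0)
    return extrema, crossings

def satisfies_imf_count_condition(h):
    """IMF condition 1: extrema and zero-crossing counts differ by at most one."""
    extrema, crossings = count_extrema_and_zero_crossings(h)
    return abs(extrema - crossings) <= 1
```

A pure sinusoid passes this check, while the same sinusoid with a large constant offset fails it because it never crosses zero.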
Further, as shown in FIG. 9, a specific rolling bearing fault case is taken as an example. The test stand consists of a 2 horsepower motor, a torque sensor/encoder, a dynamometer, and control electronics; the test bearing supports the motor shaft, and an accelerometer is placed near the motor drive end to measure the vibration signals. Single-point faults are introduced into the test bearing by electrical discharge machining; the bearing is an SKF model. The fault diameters are 0.007, 0.014, 0.021, and 0.028 inches. Rolling bearing vibration signal data acquired at a sampling frequency of 12 kHz, a motor speed of 1772 r/min over the whole operating condition, and a given outer ring fault position are used as training samples and cover 12 different fault types and fault degrees, as shown in FIG. 10. The vibration signal data are divided into 2661 training samples and 897 test samples, each data sample being a vibration signal of 512 sampling points. In step S200, the most important features are concentrated mainly in the high-frequency part, so the first 4 eigenmode functions obtained by the decomposition are normalized and selected as feature quantities. As shown in FIGS. 12 and 13, the method of the present application takes less time and achieves higher accuracy than the SVM, BPNN, KNN, and ELM algorithms.
Further, in S100, the peak value is the arithmetic mean of the 10 largest absolute values over the full length of one signal sample, and the average value is calculated by the formula
Figure GDA0003041394710000121
The root mean square value is calculated by the formula
Figure GDA0003041394710000122
The skewness is calculated by the formula
Figure GDA0003041394710000123
The kurtosis is calculated by the formula
Figure GDA0003041394710000124
The formula for calculating the waveform index is
Figure GDA0003041394710000125
The calculation formula of the peak index is
Figure GDA0003041394710000126
The pulse index is calculated by the formula
Figure GDA0003041394710000127
The kurtosis index is calculated by the formula
Figure GDA0003041394710000128
Wherein
Figure GDA0003041394710000129
Figure GDA00030413947100001210
The margin index is calculated by the formula
Figure GDA0003041394710000131
Wherein
Figure GDA0003041394710000132
The skewness index is calculated by the formula
Figure GDA0003041394710000133
where N is the number of sampling points and x_i is the i-th sampling point.
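The eleven time domain features can be computed along the following lines. This is a sketch under assumptions: the peak value follows the text (mean of the 10 largest absolute values), but the remaining definitions are the ones commonly used in vibration analysis (raw third and fourth moments, ratios to the RMS value, mean absolute value, and square-root amplitude) and may differ in detail from the patent's formula images. The function name and dictionary keys are illustrative.

```python
import math

def time_domain_features(x, top=10):
    """Time domain features of one signal sample x (a non-empty list of floats)."""
    n = len(x)
    abs_x = [abs(v) for v in x]
    # Peak value: arithmetic mean of the `top` largest absolute values.
    peak = sum(sorted(abs_x, reverse=True)[:top]) / min(top, n)
    mean = sum(x) / n
    abs_mean = sum(abs_x) / n                       # mean absolute value
    rms = math.sqrt(sum(v * v for v in x) / n)      # root mean square value
    skewness = sum(v ** 3 for v in x) / n           # assumed: raw third moment
    kurtosis = sum(v ** 4 for v in x) / n           # assumed: raw fourth moment
    root_amp = (sum(math.sqrt(a) for a in abs_x) / n) ** 2  # square-root amplitude
    return {
        "peak": peak,
        "mean": mean,
        "rms": rms,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "waveform_index": rms / abs_mean,
        "peak_index": peak / rms,
        "pulse_index": peak / abs_mean,
        "kurtosis_index": kurtosis / rms ** 4,
        "margin_index": peak / root_amp,
        "skewness_index": skewness / rms ** 3,
    }
```

The dimensionless indices divide by the RMS or mean absolute value, so an all-zero signal would need to be handled separately.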
Preferably, the calculation formula of the frequency domain features in step S100 includes:
Figure GDA0003041394710000134
Figure GDA0003041394710000135
Figure GDA0003041394710000136
Figure GDA0003041394710000137
where f(k) is the spectrum, k = 1, 2, …, K; K is the number of spectral lines; f_k is the frequency value of the k-th spectral line; p_f1 reflects the vibration energy in the frequency domain; p_f2, p_f3, p_f4, p_f6, p_f7, p_f11, p_f12 and p_f13 describe the convergence of the spectral power; and p_f5, p_f8, p_f9 and p_f10 describe the position change of the main frequency.
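As an illustration of how spectral features of this kind are obtained, the sketch below computes an amplitude spectrum with a plain discrete Fourier transform and derives a few example statistics from it. The statistics chosen (mean amplitude, frequency centroid, spectral spread, dominant frequency) are assumptions for illustration; no claim is made about which of p_f1 to p_f13 they correspond to. All function names are hypothetical.

```python
import cmath
import math

def amplitude_spectrum(x, fs):
    """One-sided amplitude spectrum of x sampled at fs Hz (naive O(n^2) DFT)."""
    n = len(x)
    freqs, spec = [], []
    for k in range(1, n // 2 + 1):
        c = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)   # frequency of the k-th spectral line
        spec.append(abs(c) / n)    # amplitude of the k-th spectral line
    return freqs, spec

def spectral_features(freqs, spec):
    """Example spectral statistics (illustrative choices, not the patent's p_f list)."""
    total = sum(spec)
    centroid = sum(f * s for f, s in zip(freqs, spec)) / total
    spread = math.sqrt(sum((f - centroid) ** 2 * s
                           for f, s in zip(freqs, spec)) / total)
    dominant = freqs[max(range(len(spec)), key=spec.__getitem__)]
    return {
        "mean_amplitude": total / len(spec),   # energy-type feature
        "centroid": centroid,                  # position of the main frequency
        "spread": spread,                      # concentration of the spectrum
        "dominant_frequency": dominant,
    }
```

For 512-point samples the naive transform is fast enough; a library FFT would normally replace it in practice.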
Further, the S300 further includes the steps of:
S301, as shown in FIGS. 2 and 3, the data set is divided into k classes by the MiniBatchKmeans algorithm; among the k classes, some contain only a single label y_i and others contain multiple labels. For a cluster containing only a single label, the average of its data is calculated by formula (1) and taken as a representative sample, and the representative samples obtained are put into a set M. A class containing h labels is subdivided into h clusters, each containing only a single label, and h representative samples are selected from the h clusters by formula (1) and put into the set M:
$$\bar{X} = \frac{1}{d}\sum_{j=1}^{d} X_j \tag{1}$$
where d represents the number of samples contained in the cluster and X_j is the j-th sample in the cluster.
In the step S301, the method specifically includes the following steps:
S3011, randomly select k samples from the n training samples as the center points of the k classes.
S3012, randomly select g samples (g < n) from the n training samples for clustering, traverse the g samples, and assign each sample to the class of its nearest center point according to the Euclidean distance (formula (2)):
$$d(X_i, C_l) = \sqrt{\sum_{j=1}^{m} \left(x_i^{(j)} - c_l^{(j)}\right)^2} \tag{2}$$
where C_l is the l-th center point and c_l^(j) is its j-th feature.
S3013, calculate the average of the samples in each class and take it as the new center point.
S3014, repeat S3012-S3013 until the k center points no longer change or a sufficient number of iterations have been performed.
S3015, traverse all n training samples and assign each sample to the class of its nearest center point according to the Euclidean distance.
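Steps S3011 to S3015 can be sketched in plain Python as follows. This follows the steps as listed (each center is replaced by the mean of the batch samples assigned to it), which is simpler than the incremental center update used by some library implementations of mini-batch k-means. The function names, the `max_iter` and `seed` parameters, and the rule of keeping a center unchanged when no batch sample is assigned to it are assumptions of this example.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def nearest(x, centers):
    """Index of the center point closest to sample x."""
    return min(range(len(centers)), key=lambda c: euclidean(x, centers[c]))

def mini_batch_kmeans(X, k, g, max_iter=100, seed=0):
    rng = random.Random(seed)
    # S3011: pick k random samples as the initial center points.
    centers = [tuple(x) for x in rng.sample(X, k)]
    for _ in range(max_iter):
        # S3012: draw a mini-batch of g samples and assign each to its nearest center.
        groups = [[] for _ in range(k)]
        for x in rng.sample(X, g):
            groups[nearest(x, centers)].append(x)
        # S3013: the new center is the mean of the samples in the class
        # (assumption: a class that received no sample keeps its old center).
        new_centers = [
            tuple(sum(col) / len(grp) for col in zip(*grp)) if grp else centers[c]
            for c, grp in enumerate(groups)
        ]
        # S3014: stop when the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    # S3015: assign all n samples to their nearest center.
    return centers, [nearest(x, centers) for x in X]
```

Because only g of the n samples are visited per iteration, the cost of each update is independent of the full data set size; the single full pass happens once, in S3015.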
Among these k classes, as shown in FIG. 2, some classes contain only a single label y_i (1 ≤ i ≤ n) and others contain h labels (h > 1). For a class containing only a single label, the average of its data is calculated by formula (1) and taken as a "representative sample", whose label is y_i.
These representative samples are put into set M.
As shown in FIG. 3, a class containing h labels is subdivided into h sub-clusters, such that each sub-cluster contains only a single label. Formula (1) is applied to each of the h sub-clusters to select h "representative samples", which are put into the set M. Samples in classes containing multiple labels usually lie near the separating hyperplane of the support vector machine and strongly influence the classification result, so they are retained and placed in the set N.
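The construction of the sets M and N described above can be sketched as follows. This is an illustrative sketch only: the helper names `mean_vector` and `build_sets` and the list-of-tuples data layout are assumptions, and formula (1) is taken to be the component-wise mean of the d samples in a cluster, as the text describes.

```python
from collections import defaultdict


def mean_vector(vectors):
    """Formula (1) as described: component-wise average of the d samples."""
    d = len(vectors)
    return [sum(col) / d for col in zip(*vectors)]


def build_sets(samples, labels, cluster_ids):
    """Return (M, N): representative samples and retained mixed-class samples."""
    clusters = defaultdict(list)
    for x, y, c in zip(samples, labels, cluster_ids):
        clusters[c].append((x, y))
    M, N = [], []
    for members in clusters.values():
        by_label = defaultdict(list)
        for x, y in members:
            by_label[y].append(x)
        # One representative per label: a single-label class yields one,
        # a class with h labels yields h (one per single-label sub-cluster).
        for y, xs in by_label.items():
            M.append((mean_vector(xs), y))
        # Classes with more than one label are kept in full in set N.
        if len(by_label) > 1:
            N.extend(members)
    return M, N
```

For example, a class holding two samples of one label contributes a single averaged point to M, while a class holding one sample each of two labels contributes two points to M and both original samples to N.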
S302, taking the samples in the set M as training samples to train a support vector machine model;
S303, taking the support vectors of the support vector machine trained in S302 as approximate support vectors, and putting all samples in the cluster to which each approximate support vector belongs into a set S;
S304, letting F = N ∪ S, and taking the samples in F as the final training samples to train the support vector machine.
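Steps S301 to S304 can be sketched end to end with scikit-learn. This is a hedged sketch, not the patented implementation: the function name `approximate_sv_train`, the default `k`, the seed, and in particular the linear kernel are assumptions (the text does not name a kernel), and scikit-learn's `MiniBatchKMeans` stands in for the MiniBatchKmeans step.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.svm import SVC


def approximate_sv_train(X, y, k=200, kernel="linear", seed=0):
    """Sketch of S301-S304; returns the final SVM and the sample indices in F."""
    X, y = np.asarray(X), np.asarray(y)
    # S301: cluster the data, then build one representative per label per class.
    km = MiniBatchKMeans(n_clusters=k, random_state=seed, n_init=3)
    cluster_ids = km.fit_predict(X)
    reps, rep_labels, rep_members = [], [], []
    n_idx = []  # set N: samples of classes that contain several labels
    for c in np.unique(cluster_ids):
        members = np.flatnonzero(cluster_ids == c)
        present = np.unique(y[members])
        if len(present) > 1:
            n_idx.extend(members.tolist())
        for lab in present:
            sub = members[y[members] == lab]
            reps.append(X[sub].mean(axis=0))  # mean of the (sub-)cluster
            rep_labels.append(lab)
            rep_members.append(sub.tolist())
    # S302: train a first SVM on the representative samples (set M).
    coarse = SVC(kernel=kernel).fit(np.array(reps), np.array(rep_labels))
    # S303: representatives that become support vectors are the approximate
    # support vectors; all samples of their clusters form set S.
    s_idx = [i for sv in coarse.support_ for i in rep_members[sv]]
    # S304: F is the union of N and S; train the final SVM on F.
    f_idx = np.unique(np.array(n_idx + s_idx, dtype=int))
    final = SVC(kernel=kernel).fit(X[f_idx], y[f_idx])
    return final, f_idx
```

The first SVM sees only about k points, so it is cheap to train; its support vectors indicate which clusters lie near the decision boundary, and only those clusters (plus the mixed-label ones) are passed to the final training run.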
In this embodiment, the inventors evaluated the method experimentally against other algorithms. A random data set is first generated from a normal distribution, whose formula is
Figure GDA0003041394710000151
where μ and σ are the mean and standard deviation of the normal distribution; a smaller standard deviation indicates more concentrated data points, and a larger one indicates more dispersed data points. As shown in FIG. 5, two classes of normally distributed data containing 20000 data points are generated, with μ being (-1, -1) and (1, 1), respectively, and σ being 0.4 and 0.5, respectively.
As shown in FIG. 6, the previously generated data are processed with the algorithm flow provided by the present application, with the number of clusters set to 200. The number of samples is reduced to 128, and the remaining samples are those that strongly guide the generation of the support vectors. The support vector machine is then trained with these samples. As shown in FIG. 7, the training speed is greatly improved compared with an ordinary support vector machine, and, as shown in FIG. 8, the accuracy is maintained.
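A data set of this kind can be generated with the Python standard library, for example as below. The function name `make_two_class_data`, the seed, and the reading that the 20000 points are split evenly between the two classes are assumptions; the text does not say whether 20000 is the total or the per-class count.

```python
import random


def make_two_class_data(n_per_class=10000, seed=0):
    """Draw 2-D points from two isotropic normal distributions."""
    rng = random.Random(seed)
    # (mean vector, standard deviation, class label) as stated in the text.
    params = [((-1.0, -1.0), 0.4, 0), ((1.0, 1.0), 0.5, 1)]
    X, y = [], []
    for mu, sigma, label in params:
        for _ in range(n_per_class):
            X.append([rng.gauss(m, sigma) for m in mu])
            y.append(label)
    return X, y
```

With these parameters the two classes overlap only slightly, which makes the set a convenient test case for checking that a reduced training set still recovers the decision boundary.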

Claims (3)

1. A rolling bearing fault diagnosis method based on approximate support vectors comprises the steps of firstly, respectively collecting vibration acceleration signals of a bearing in a normal state and different fault mode states through an acceleration sensor, and carrying out data preprocessing on the vibration acceleration signals to obtain sample points containing the vibration signals in different states, and is characterized by further comprising the following steps:
S100, respectively extracting time domain features and frequency domain features from the sample points, wherein the time domain features comprise a peak value, a mean value, a root mean square value, a skewness, a kurtosis, a waveform index, a peak index, a pulse index, a kurtosis index, a margin index and a skewness index; meanwhile, Fourier transform is performed on the vibration signals of all the sample points, and the frequency domain features of all the vibration signals are extracted;
S200, performing Hilbert-Huang transform on the original vibration signal and extracting time-frequency domain features, including normalizing the first 4 intrinsic mode functions obtained by the decomposition, taking the normalized data as features, and extracting time-frequency domain feature quantities from the normalized data;
S300, arranging the feature quantity data extracted in S100 and S200 into a data set X = {X_1, X_2, ..., X_i, ..., X_n}, with the corresponding fault types Y = {y_1, y_2, ..., y_i, ..., y_n};
Figure FDA0003041394700000011
wherein X_1 to X_n represent the 1st to n-th feature quantity data samples, and y_1 to y_n represent the 1st to n-th fault types; x_1^(m) to x_n^(m) indicate that each of the 1st to n-th samples has m features, and x_i^(j) represents the j-th feature of the i-th sample; y_i represents the type label corresponding to the i-th sample; S300 further includes the following steps:
S301, dividing the data set into k classes by the MiniBatchKmeans algorithm, wherein the k classes include classes containing only a single label y_i and classes containing h labels; for a class containing only a single label, the average value of its data is calculated by formula (1) as a "representative sample" whose label is y_i, and these representative samples are put into a set M, that is,
Figure FDA0003041394700000021
wherein
Figure FDA0003041394700000022
d represents the number of samples contained in the cluster; 1 ≤ i ≤ n, h > 1;
data classes containing a plurality of labels are placed in a set N; a data class containing h labels is subdivided into h sub-clusters such that each sub-cluster contains only a single label, and h representative samples are selected from the h sub-clusters by formula (1) and put into the set M;
S302, taking the samples in the set M as training samples to train a support vector machine model;
S303, taking the support vectors of the support vector machine trained in the previous step as approximate support vectors, and putting all samples in the cluster to which each approximate support vector belongs into a set S;
S304, letting F = N ∪ S, and taking the samples in F as the final training samples to train the support vector machine.
2. The rolling bearing fault diagnosis method based on approximate support vectors according to claim 1, wherein in S100 the peak value is the arithmetic mean of the 10 values with the largest absolute values over the total length of one signal sample, and the mean value is calculated by the formula
Figure FDA0003041394700000023
The root mean square value is calculated by the formula
Figure FDA0003041394700000024
The skewness is calculated by the formula
Figure FDA0003041394700000025
The kurtosis is calculated by the formula
Figure FDA0003041394700000031
The formula for calculating the waveform index is
Figure FDA0003041394700000032
The calculation formula of the peak index is
Figure FDA0003041394700000033
The pulse index is calculated by the formula
Figure FDA0003041394700000034
The kurtosis index is calculated by the formula
Figure FDA0003041394700000035
Wherein
Figure FDA0003041394700000036
Figure FDA0003041394700000037
The margin index is calculated by the formula
Figure FDA0003041394700000038
Wherein
Figure FDA0003041394700000039
The skewness index is calculated by the formula
Figure FDA00030413947000000310
where N is the number of sampling points and x_i is the i-th sampling point.
3. The approximate support vector-based rolling bearing fault diagnosis method according to claim 1 or 2, wherein the calculation formula of the frequency domain features in the step S100 includes:
Figure FDA00030413947000000311
Figure FDA0003041394700000041
Figure FDA0003041394700000042
Figure FDA0003041394700000043
where f(k) is the spectrum, k = 1, 2, …, K, and K is the number of spectral lines; f_k is the frequency value of the k-th spectral line; pf_1 is the vibration energy value in the frequency domain; pf_2, pf_3, pf_4, pf_6, pf_7, pf_11, pf_12 and pf_13 are convergence values of the spectral power; and pf_5, pf_8, pf_9 and pf_10 are position variation values of the main frequency.
CN201910611796.7A 2019-07-08 2019-07-08 Rolling bearing fault diagnosis method based on approximate support vector Active CN110333077B (en)

Publications (2)

Publication Number Publication Date
CN110333077A CN110333077A (en) 2019-10-15
CN110333077B true CN110333077B (en) 2021-06-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant