Abstract
Real-time health monitoring of industrial components and systems that can detect, classify, and predict impending faults is critical to reducing operating and maintenance costs. This paper presents a softmax regression-based prognostic method for online health assessment and fault diagnosis. System conditions are evaluated by processing the information gathered from access controllers or sensors mounted at different points in the system, and maintenance is performed only when a failure or malfunction prognosis is indicated. Wavelet packet decomposition and fast Fourier transform techniques are used to extract features from nonstationary vibration signals. Wavelet packet energies and the fundamental frequency amplitude are used as features, and principal component analysis is used for feature reduction. The reduced features are input into softmax regression models to assess machine health and identify possible failure modes. The gradient descent method is used to determine the parameters of the softmax regression models. The effectiveness and feasibility of the proposed method are illustrated by applying it to a real centrifugal pump.
1. Introduction
Considerable efforts have been made to develop methods and tools for failure diagnosis. However, limited results have been reported on prognostics that can detect, analyze, and correct equipment problems before failures manifest, and that give system operators a sufficient time window to schedule maintenance without disrupting operations [1-4]. This paper presents a prognostic method for the online health assessment and fault diagnosis of centrifugal pumps using softmax regression.
One challenge in realizing effective health assessment and fault diagnosis of centrifugal pumps is selecting a proper feature space that reflects comprehensive performance [5]. Traditional time-domain and frequency-domain analyses are based on the assumption that the processed signals are stationary and linear. However, the vibration signal of a worn centrifugal pump is nonlinear and nonstationary [6-8]. Wavelet packet decomposition has particular advantages for collecting abundant information with arbitrary time-frequency resolution. It allows extraction of features that combine nonlinear and nonstationary characteristics [9]. Thus, wavelet packet decomposition is employed for feature extraction in this method.
The paper is organized as follows. Section 2 presents a state-of-the-art prognostic methodology and its related mathematics. Section 3 describes the test bed setup of a centrifugal pump and the assessment/classification results obtained by applying the proposed schemes to real data. Section 4 provides the summary and future research directions.
2. Methodology
The prognostic scheme is based on monitored data that contain centrifugal pump incipient failure signatures and on intelligent mathematical techniques that can be incorporated to detect and evaluate the risk of failure over a protracted period and classify which particular type of failure may occur.
2.1. Procedures of the methodology
The methodology has two major steps. First is the extraction and selection of features for health assessment and fault diagnosis. The search space is also reduced for fast computation. Second is the use of softmax regression for health assessment and root cause classification.
The process of the methodology is summarized in Fig. 1.
Fig. 1. The process of the methodology
2.2. Feature extraction using wavelet packet decomposition
The fault signals of a centrifugal pump are usually distributed in both high- and low-frequency bands, which wavelet packet decomposition can resolve finely. Wavelet packet analysis is a method of orthogonal decomposition based on multiresolution analysis. It divides the full frequency band of a signal at multiple levels, so that each band signal contains detailed information about the original signal. Therefore, wavelet packet decomposition is suitable for extracting both low- and high-frequency features [10-13]. The energy index of all bands, which reflects the signal characteristics, can be constructed by statistically analyzing all band signals produced by the wavelet packet decomposition. Determining the scale of the wavelet packet decomposition is important. A very low decomposition scale reduces the efficiency of fault feature extraction, whereas a very high decomposition scale increases the dimension of the feature vector and consequently slows the calculation [14]. Therefore, in centrifugal pump health assessment and fault classification, according to the vibration signal characteristics, the original signal $f\left(t\right)$ can be constructed by the sum of 8 components as Eq. (1), and eight frequency band energy indices ${E}_{3,j}$ can be calculated by three-layer decomposition:
where ${f}_{8}^{j}\left(t\right)$ is the wavelet packet component signal that can be expressed by a linear combination of wavelet packet functions as:
where $j$ and $k$ are integers and defined as the modulation, scale and translation parameter, respectively; ${S}_{8,k}^{j}$ are the wavelet packet coefficients, and ${\psi}_{8,k}^{j}\left(t\right)$ is the wavelet packet function:
where ${x}_{j,k}$ ($j=0, 1, \dots, 7$; $k=1, 2, \dots, n$) stands for the amplitude of the reconstructed signal ${S}_{3,j}$.
When a centrifugal pump has a fault, the energy of each band signal is strongly affected. Thus, the band energies are normalized into a feature vector $T$:
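The band-energy feature described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the Haar wavelet instead of the DB10 wavelet used in the experiments so that it depends only on NumPy, it normalizes by the total energy so that the entries of $T$ sum to 1, and the function name `haar_wavelet_packet_energies` and the synthetic test signal are hypothetical.

```python
import numpy as np

def haar_wavelet_packet_energies(signal, levels=3):
    """Normalized band energies after a `levels`-deep Haar wavelet packet split.

    Assumption: Haar filters stand in for DB10; bands are returned in the
    natural (filter-bank) order, not sorted by frequency.
    """
    nodes = [np.asarray(signal, dtype=float)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            if len(node) % 2:              # drop a trailing sample so pairs line up
                node = node[:-1]
            even, odd = node[0::2], node[1::2]
            next_nodes.append((even + odd) / np.sqrt(2.0))  # low-pass branch
            next_nodes.append((even - odd) / np.sqrt(2.0))  # high-pass branch
        nodes = next_nodes
    energies = np.array([np.sum(node ** 2) for node in nodes])
    return energies / energies.sum()       # assumed normalization: entries sum to 1

# Synthetic stand-in for a vibration record (hypothetical values).
rng = np.random.default_rng(0)
t = np.arange(1024) / 10240.0
x = np.sin(2 * np.pi * 48.3 * t) + 0.1 * rng.standard_normal(t.size)
T = haar_wavelet_packet_energies(x)        # eight-element feature vector
T_const = haar_wavelet_packet_energies(np.ones(64))
```

Three levels of splitting give $2^3 = 8$ bands, which matches the eight energy indices in the text. A constant signal puts all of its energy into the first band, which is a quick sanity check on the filter bank.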
2.3. PCA-based feature reduction
The PCA procedure employed in this paper is briefly presented here for completeness. At a given centrifugal pump operation stage $S$ (in this paper, only the normal machine behavior and machine operation with a worn bearing/impeller are considered), the signal features $X$ are characterized by the multivariate Gaussian distribution with mean ${\mu}_{s}$ and covariance matrix ${K}_{s}$. The symmetric matrix ${K}_{s}$ can now be represented as:
where $r$ is the rank of the covariance matrix ${K}_{s}$, ${\lambda}_{i}$, $i=1, 2, \dots, r$, are the nonzero eigenvalues of ${K}_{s}$, ${\nu}_{i}$ are the corresponding unit norm eigenvectors, and:
All eigenvalues of ${K}_{s}$ are real and greater than or equal to zero because of its positive semidefiniteness. Each eigenvalue ${\lambda}_{i}$, $i=1, 2, \dots, r$, depicts the amount of the covariance matrix energy projected in the direction of the corresponding eigenvector ${\nu}_{i}$ [15]. Only a few of the eigenvalues in $\mathrm{\Lambda}$ account for most of the energy in the covariance matrix ${K}_{s}$ when a high degree of correlation exists among the components of $X$. Thus, assuming that the eigenvalues ${\lambda}_{i}$, $i=1, 2, \dots, r$, are arranged in descending order, Eq. (5) can be represented as:
where:
where $p$ is the number of the principal components of ${K}_{s}$, ${\lambda}_{i}$, $i=1, 2, \dots, p$, are the largest $p$ eigenvalues of ${K}_{s}$, and ${\nu}_{i}$ are the corresponding unit norm eigenvectors.
A query item $\tilde{X}$ can now be transformed into a $p$-component random variable $\tilde{Y}$ given as:
If $\tilde{X}$ belongs to the class of signals from the centrifugal pump state $S$, then $\tilde{Y}$ should be normally distributed with zero mean and covariance ${I}_{p}$, where ${I}_{p}$ is the identity matrix of order $p$. Thus, for each query item $\tilde{X}$, its adherence to the class $S$ can be assessed through the Euclidean norm of the vector $\tilde{Y}$, which in turn corresponds to assessment and classification based on the softmax regression of the query item from the training classes [16].
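The reduction step can be illustrated with a short sketch. It assumes the transformation takes the whitening form $\tilde{Y} = \Lambda_p^{-1/2} V_p^{T} (\tilde{X} - \mu_s)$, which is the form consistent with the statement that $\tilde{Y}$ has zero mean and identity covariance. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_pca_whitener(X, p):
    """Estimate mean, the p largest eigenvalues and their eigenvectors.

    X: (n_samples, n_features) feature matrix from one pump state.
    """
    mu = X.mean(axis=0)
    K = np.cov(X - mu, rowvar=False)            # sample covariance matrix K_s
    eigvals, eigvecs = np.linalg.eigh(K)        # eigh: symmetric input, ascending order
    order = np.argsort(eigvals)[::-1][:p]       # indices of the p largest eigenvalues
    return mu, eigvals[order], eigvecs[:, order]

def transform(X_query, mu, lam, V):
    """Project onto the p principal directions and scale each to unit variance."""
    centered = np.atleast_2d(X_query) - mu
    return centered @ V / np.sqrt(lam)

# Synthetic correlated features (hypothetical): 500 samples, 5 features.
rng = np.random.default_rng(1)
mixing = rng.standard_normal((5, 5))
X = rng.standard_normal((500, 5)) @ mixing + 3.0
mu, lam, V = fit_pca_whitener(X, p=2)
Y = transform(X, mu, lam, V)                    # shape (500, 2)
norms = np.linalg.norm(Y, axis=1)               # Euclidean norm used to judge adherence
```

On the training data itself the transformed features have exactly zero mean and identity sample covariance, so a query item with an unusually large norm is unlikely to belong to state $S$.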
2.4. Logistic regression method
The machine condition description from daily maintenance records/logs is a dichotomous problem (either normal or failed) that can be represented using a logistic regression (LR) function [1]. The goal of LR is to find the best-fitting model to describe the relationship between a categorical dependent variable (the probability of an event, constrained between 0 and 1) and the independent variables. The logistic function is:
The logistic or logit model is:
where $g\left(x\right)$ is a linear combination of the independent variables ${x}_{1}, {x}_{2}, \dots, {x}_{k}$.
Computing $P\left(x\right)$ requires the parameters $\alpha $ and ${\beta}_{1}, \dots, {\beta}_{k}$ to be determined in advance. A dichotomous dependent variable makes estimation using ordinary least squares inappropriate [17]. Hence, LR chooses the parameters $\alpha $ and ${\beta}_{1}, \dots, {\beta}_{k}$ that maximize the likelihood of the observed data rather than those that minimize the sum of squared errors [1]. Then, the probability of failure for each input vector $x$ can be calculated according to Eq. (11).
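For reference, the logistic function and the logit model referred to in this subsection have the following standard form, written with the symbols $g(x)$, $\alpha$ and $\beta_1, \dots, \beta_k$ defined above. This is a sketch of the textbook expressions; the exact layout and equation numbers used by the authors are an assumption.

```latex
P(x) = \frac{e^{g(x)}}{1 + e^{g(x)}} = \frac{1}{1 + e^{-g(x)}},
\qquad
\ln\frac{P(x)}{1 - P(x)} = g(x) = \alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k .
```

The second expression shows why the model is linear in the log-odds: a unit change in $x_i$ shifts the log-odds of failure by $\beta_i$.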
2.5. Softmax regression method
In the softmax regression setting, we are interested in multiclass classification (as opposed to only binary classification), so the label $y$ can take on $k$ different values rather than only two. Thus, in the training set $\{({x}^{(1)},{y}^{(1)}),\dots ,({x}^{(m)},{y}^{(m)})\}$, we now have ${y}^{(i)}\in \{1, 2,\dots ,k\}$.
Given a test input $x$, we want the hypothesis to estimate the probability $p(y=j|x)$ for each value of $j=1, 2, \dots, k$. That is, the probability of the class label taking on each of the $k$ different possible values is estimated. Thus, the hypothesis outputs a $k$-dimensional vector (whose elements sum to 1) giving the $k$ estimated probabilities [18].
Concretely, our hypothesis ${h}_{\theta}\left(x\right)$ takes the form:
Here ${\theta}_{1},{\theta}_{2},\dots ,{\theta}_{k}\in {\mathfrak{R}}^{n+1}$ are the parameters of the softmax regression model. Notice that the term $1/{\sum}_{j=1}^{k}{e}^{{\theta}_{j}^{T}{x}^{\left(i\right)}}$ normalizes the distribution, so that it sums to one.
In the special case where $k=2$, softmax regression reduces to logistic regression. This shows that softmax regression is a generalization of logistic regression. The cost function used for softmax regression is:
where $1\{\cdot\}$ is the indicator function, so that $1\{\text{a true statement}\}=1$ and $1\{\text{a false statement}\}=0$.
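For reference, the hypothesis and cost function described above have the following standard form, written with the parameters $\theta_1, \dots, \theta_k$, the normalizing term and the indicator function as defined in the text. This is a sketch of the usual softmax regression expressions and is assumed, not confirmed, to match the authors' typesetting.

```latex
h_\theta\left(x^{(i)}\right) =
\begin{bmatrix}
p\left(y^{(i)} = 1 \mid x^{(i)}; \theta\right) \\
p\left(y^{(i)} = 2 \mid x^{(i)}; \theta\right) \\
\vdots \\
p\left(y^{(i)} = k \mid x^{(i)}; \theta\right)
\end{bmatrix}
= \frac{1}{\sum_{j=1}^{k} e^{\theta_j^{T} x^{(i)}}}
\begin{bmatrix}
e^{\theta_1^{T} x^{(i)}} \\
e^{\theta_2^{T} x^{(i)}} \\
\vdots \\
e^{\theta_k^{T} x^{(i)}}
\end{bmatrix},
\qquad
J(\theta) = -\frac{1}{m} \left[ \sum_{i=1}^{m} \sum_{j=1}^{k}
1\left\{ y^{(i)} = j \right\}
\log \frac{e^{\theta_j^{T} x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^{T} x^{(i)}}} \right].
```

For each training example only the term with $j = y^{(i)}$ survives the indicator, so $J(\theta)$ is the average negative log-probability assigned to the correct class.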
There is no known closed-form way to solve for the minimum of $J\left(\theta \right)$, and thus, as usual, we resort to an iterative optimization algorithm such as gradient descent or L-BFGS. Taking derivatives, the gradient is:
Softmax regression has the unusual property that it has a “redundant” set of parameters, so a weight decay term needs to be added to the cost function. This takes care of the numerical problems associated with softmax regression's overparameterized representation.
The modified cost function is:
The derivative of the modified cost function is:
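For reference, the gradient, the weight-decayed cost function and its derivative have the following standard form. This is a sketch under assumptions: the weight decay coefficient is written here as $\lambda > 0$ (not to be confused with the eigenvalues $\lambda_i$ of Section 2.3), and $\theta_{ij}$ denotes the $j$-th component of $\theta_i$; neither symbol is confirmed by the text.

```latex
\nabla_{\theta_j} J(\theta) = -\frac{1}{m} \sum_{i=1}^{m}
\left[ x^{(i)} \left( 1\left\{ y^{(i)} = j \right\}
- p\left( y^{(i)} = j \mid x^{(i)}; \theta \right) \right) \right],

J(\theta) = -\frac{1}{m} \left[ \sum_{i=1}^{m} \sum_{j=1}^{k}
1\left\{ y^{(i)} = j \right\}
\log \frac{e^{\theta_j^{T} x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^{T} x^{(i)}}} \right]
+ \frac{\lambda}{2} \sum_{i=1}^{k} \sum_{j=0}^{n} \theta_{ij}^{2},

\nabla_{\theta_j} J(\theta) = -\frac{1}{m} \sum_{i=1}^{m}
\left[ x^{(i)} \left( 1\left\{ y^{(i)} = j \right\}
- p\left( y^{(i)} = j \mid x^{(i)}; \theta \right) \right) \right]
+ \lambda \theta_j .
```

With $\lambda > 0$ the cost becomes strictly convex, so the minimizer is unique even though the unregularized parameterization is redundant.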
Minimizing $J\left(\theta \right)$ with respect to $\theta $ yields a working implementation of softmax regression.
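The training procedure of this subsection can be sketched as follows. It is a minimal illustration, not the authors' code: it uses plain batch gradient descent on the weight-decayed cost, labels run from 0 to $k-1$ rather than 1 to $k$, and the function names, step size, decay coefficient and synthetic three-class data are all hypothetical choices.

```python
import numpy as np

def softmax_probs(theta, X):
    """Class probabilities; theta is (k, n+1), X is (m, n+1) with a leading column of ones."""
    scores = X @ theta.T
    scores = scores - scores.max(axis=1, keepdims=True)  # stabilizes exp() without changing the result
    expo = np.exp(scores)
    return expo / expo.sum(axis=1, keepdims=True)

def train_softmax(X, y, k, decay=1e-3, lr=0.1, n_iter=2000):
    """Batch gradient descent on the softmax cost with weight decay."""
    m = X.shape[0]
    onehot = np.eye(k)[y]                       # row i encodes the indicator 1{y_i = j}
    theta = np.zeros((k, X.shape[1]))
    for _ in range(n_iter):
        probs = softmax_probs(theta, X)
        grad = -(onehot - probs).T @ X / m + decay * theta
        theta -= lr * grad
    return theta

# Synthetic three-class data in two dimensions (hypothetical).
rng = np.random.default_rng(2)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
y = np.repeat(np.arange(3), 50)
features = centers[y] + 0.5 * rng.standard_normal((150, 2))
X = np.hstack([np.ones((150, 1)), features])    # prepend the intercept term
theta = train_softmax(X, y, k=3)
probs = softmax_probs(theta, X)
accuracy = float((probs.argmax(axis=1) == y).mean())
```

Each row of `probs` sums to 1, and the predicted class is the index of the largest probability. With $k = 2$ the same code reproduces logistic regression up to the redundant parameterization.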
3. Experimental result
The methodology was applied to a centrifugal pump (Fig. 2) to evaluate its dynamic health condition. In addition, fault mode analysis was performed to identify the possible root cause. The centrifugal pump is driven by a motor with a stabilized speed of 2900 r/min. Four commonly occurring faults of the centrifugal pump were set, namely, bearing roller wearing, bearing inner race wearing, bearing outer race wearing and centrifugal pump impeller wearing.
Fig. 2. Centrifugal pump data acquisition system
3.1. Data acquisition system description
Three vibration signals were acquired from an installed accelerometer, with a sampling rate of 10.24 kHz.
3.2. Feature extraction and reduction
FFT was applied to each vibration signal to obtain the fundamental frequency amplitude. Three-level wavelet packet decomposition using the Daubechies wavelet (DB10) was applied to each vibration signal, and the fundamental frequency amplitude and the packet energies were used as features. A subset of feature components was determined using PCA. In this case, a four-dimensional feature vector was finally selected for health assessment and fault mode classification after feature reduction.
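The fundamental frequency amplitude can be read off an FFT as sketched below. This is an illustration under assumptions: the fundamental is taken to be the shaft rotation frequency (2900 r/min, about 48.3 Hz), the peak is searched within a small band around that value, and the function name, the tolerance and the synthetic signal are hypothetical.

```python
import numpy as np

def fundamental_amplitude(signal, fs, f0, tol=2.0):
    """Peak single-sided FFT amplitude within +/- tol Hz of the expected fundamental f0."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    spectrum = np.abs(np.fft.rfft(signal)) * 2.0 / n   # single-sided amplitude scaling
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    band = (freqs >= f0 - tol) & (freqs <= f0 + tol)
    return float(spectrum[band].max())

fs = 10240.0                      # sampling rate in Hz (10.24 kHz)
f0 = 2900.0 / 60.0                # assumed fundamental: shaft speed in Hz
t = np.arange(int(fs)) / fs       # one second of samples, 1 Hz frequency resolution
x = 1.5 * np.sin(2 * np.pi * f0 * t)
ffa = fundamental_amplitude(x, fs, f0)
```

Because 48.3 Hz does not fall exactly on an FFT bin, spectral leakage makes the estimate somewhat lower than the true amplitude of 1.5. A window function or a longer record would reduce that bias.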
3.3. Softmax regression models training
• Softmax regression model trained for health assessment.
When the softmax regression model was trained for health assessment, which is a two-class problem, softmax regression reduced to logistic regression. A total of 120 sets of data were used as training data, including 40 sets of data sampled under normal conditions [$P\left(x\right)=0$] versus 80 sets of fault data [$P\left(x\right)=1$]. The parameters $\alpha $ and ${\beta}_{1}, \dots, {\beta}_{4}$ were estimated using the maximum likelihood method to obtain the model for performance assessment, referred to as softmax regression model 1.
• Softmax regression model trained for fault diagnosis.
Four fault modes of the centrifugal pump, namely bearing roller wearing, bearing inner race wearing, bearing outer race wearing and centrifugal pump impeller wearing, were considered. Four sets of centrifugal pump vibration data (one set for each fault mode of the centrifugal pump) were used for training the centrifugal pump fault diagnosis model based on softmax regression.
Set 1 (bearing roller wearing): 40 subsets of bearing roller wearing data ($p(y=1|x)=1$) versus 40 subsets of non-wearing data ($p(y=1|x)=0$);
Set 2 (bearing inner race wearing): 40 subsets of bearing inner race wearing data ($p(y=2|x)=1$) versus 40 subsets of non-wearing data ($p(y=2|x)=0$);
Set 3 (bearing outer race wearing): 40 subsets of bearing outer race wearing data ($p(y=3|x)=1$) versus 40 subsets of non-wearing data ($p(y=3|x)=0$);
Set 4 (centrifugal pump impeller wearing): 40 subsets of centrifugal pump impeller wearing data ($p(y=4|x)=1$) versus 40 subsets of non-wearing data ($p(y=4|x)=0$).
The parameters ${\theta}_{1}, {\theta}_{2}, \dots, {\theta}_{k}$ were estimated using the gradient descent method to obtain the model for fault diagnosis, referred to as softmax regression model 2.
3.4. Validation
Four sets of centrifugal pump vibration data (one set for each fault mode) were used for validation of centrifugal pump health assessment and fault diagnosis models based on softmax regression.
• Set 1 (bearing roller wearing): 10 data under normal condition versus 40 bearing roller wearing data;
• Set 2 (bearing inner race wearing): 10 data under normal condition versus 32 bearing inner race wearing data;
• Set 3 (bearing outer race wearing): 10 data under normal condition versus 40 bearing outer race wearing data;
• Set 4 (centrifugal pump impeller wearing): 10 data under normal condition versus 30 impeller wearing data.
FFT and three-level wavelet packet decomposition using the Daubechies wavelet (DB10) were used to extract features from the four sets of validation data. The wavelet packet energies (E0 to E7) extracted from data sets 1 to 4 are shown in Figs. 3-6, respectively. The fundamental frequency amplitudes (ffa, for short) extracted from data sets 1 to 4 (ffa 1, ffa 2, ffa 3 and ffa 4) are shown in Fig. 7.
Fig. 3. The wavelet packet energies extracted from data set 1
Fig. 4. The wavelet packet energies extracted from data set 2
Fig. 5. The wavelet packet energies extracted from data set 3
Fig. 6. The wavelet packet energies extracted from data set 4
Fig. 7. The fundamental frequency amplitude extracted from data sets 1 to 4
The reduced features are input into the softmax regression models to assess the centrifugal pump health condition and identify possible failure modes. The confidence value ($CV$) is calculated from the probability of failure and is defined as $CV=1-P\left(x\right)$. When the centrifugal pump operates normally, $CV$ is close to 1; as the centrifugal pump approaches failure, $CV$ approaches 0. If the confidence value falls below a predetermined threshold (e.g., 0.8), the root fault diagnosis module is triggered, and the features are input into the fault diagnosis models to calculate the probability of each fault.
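The triggering rule can be sketched as follows, assuming the definition $CV = 1 - P(x)$ and the example threshold of 0.8. The function name and the failure-probability series are hypothetical.

```python
import numpy as np

def first_trigger(p_failure, threshold=0.8):
    """Return the CV series and the first index where CV drops below the threshold."""
    cv = 1.0 - np.asarray(p_failure, dtype=float)   # confidence value CV = 1 - P(x)
    below = np.flatnonzero(cv < threshold)
    return cv, (int(below[0]) if below.size else None)

# Hypothetical failure probabilities from the health assessment model over time.
p_series = [0.02, 0.05, 0.10, 0.35, 0.70, 0.90]
cv, trigger_index = first_trigger(p_series)
_, no_trigger = first_trigger([0.01, 0.02, 0.03])
```

In this example the fault diagnosis models would first be invoked at index 3, where $CV$ falls to 0.65. A healthy series that never crosses the threshold returns `None`.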
Fig. 8(a)-(d) show the overall health assessment of the four sets of centrifugal pump vibration data using softmax regression model 1. The probabilities of the different fault modes obtained from softmax regression model 2 are shown in Fig. 9(a)-(d).
Fig. 8. Health assessment result of four fault modes (panels a-d)
In Fig. 8, both bearing and impeller problems can be detected from the $CV$ drops. However, the differences among the four drops and their causes are difficult to identify from the $CV$ alone. In this methodology, the fault diagnosis module is triggered as soon as the confidence value falls below the predetermined threshold (0.8): the corresponding features are input into the trained model (softmax regression model 2) to calculate the probability of each fault mode. From time 10, the probability of fault mode 1 (bearing roller wearing, p1), fault mode 2 (bearing inner race wearing, p2), fault mode 3 (bearing outer race wearing, p3), and fault mode 4 (impeller wearing, p4) is very high [solid line in Fig. 9(a)-(d), respectively]. Consequently, the minor probability of the failure of these points is indicated in Fig. 9.
Fig. 9. Probability of fault modes 1, 2, 3, and 4 (panels a-d)
In conclusion, a softmax classifier is more suitable for centrifugal pump fault diagnosis than four separate binary classifiers using logistic regression. The four fault modes considered in this paper are mutually exclusive, so a softmax regression classifier is appropriate. In addition, softmax regression is simple and easy to implement.
4. Conclusions
A softmax regression-based approach for centrifugal pump health assessment and root cause classification is presented in this paper. Softmax regression combined with the gradient descent method is an effective and efficient tool for dynamic health assessment and root cause classification. WPT combined with PCA is a suitable feature extraction step with which appropriate features can be obtained from nonstationary signals. The method is generic and shows promising results for analyzing both stationary and nonstationary signals. Thus, it could be applied to other centrifugal pumps.
However, only four types of centrifugal pump fault modes are considered in this paper. More fault modes should be taken into consideration for better health assessment and fault diagnosis of centrifugal pumps. When the process is not time shifted, the coefficients of WPT can be used directly as features instead of the packet energies, which will be investigated in future research and applications.
References

[1] Yan J., Lee J. Machine degradation assessment and root cause classification using logistic regression method. ASME Journal of Manufacturing Science and Engineering, Vol. 127, 2005, p. 912-914.
[2] Yan J., Lee J., Koc M. Predictive algorithm for machine degradation detection using logistic regression. Fifth International Conference on Managing Innovations in Manufacturing, Milwaukee, 2002, p. 172-178.
[3] Kacprzynski G. J., Roemer M. J. Health management strategies for 21st century condition-based maintenance systems. International Comadem Congress, Houston, TX, 2000.
[4] Liao L. X., Lee J. J. Design of a reconfigurable prognostics platform for machine tools. Expert Systems with Applications, Vol. 37, 2010, p. 240-252.
[5] Pan Y., Chen J., Li X. Bearing performance degradation assessment based on lifting wavelet packet decomposition and fuzzy c-means. Mech. Syst. Sign. Proc., Vol. 24, 2010, p. 559-566.
[6] Soylemezoglu A., Jagannathan S., Saygin C. Mahalanobis-Taguchi system as a multi-sensor based decision making prognostics tool for centrifugal pump failures. Reliability, IEEE Transactions on, Vol. 60, 2011, p. 864-878.
[7] Ahonen T., Tiainen R., Viholainen J., Ahola J., Kestila J. Pump operation monitoring applying frequency converter. Power Electronics, Electrical Drives, Automation and Motion, 2008, p. 184-189.
[8] Wu X. W. Vibration faults diagnosis for centrifugal ventilator based on DDAG-SVM. Instrumentation & Measurement, Sensor Network and Automation, Vol. 1, 2012, p. 318-321.
[9] Li Z. X., Yan X. P., Yuan C. Q., Peng Z. X., Li L. Virtual prototype and experimental research on gear multi-fault diagnosis using wavelet-autoregressive model and principal component analysis method. Mech. Syst. Sign. Proc., Vol. 25, 2011, p. 2589-2607.
[10] Jafar Z., Javad P. Bearing fault detection using wavelet packet transform of induction motor stator current. Tribology International, Vol. 40, 2007, p. 763-769.
[11] Li J., Jiang P. F., Xiang Y. Y., Ti J. W. Experimental investigation for fault diagnosis based on a hybrid approach using wavelet packet and support vector classification. Scientific World Journal, Vol. 10, 2014, p. 1155-1160.
[12] Wang X., Liu C. W., Bi F. R., Bi X. Y., Shao K. Fault diagnosis of diesel engine based on adaptive wavelet packets and EEMD-fractal dimension. Mechanical Systems and Signal Processing, Vol. 41, 2013, p. 581-597.
[13] Zhang Z. Y., Wang Y., Wang K. S. Fault diagnosis and prognosis using wavelet packet decomposition, Fourier transform and artificial neural network. Journal of Intelligent Manufacturing, Vol. 24, 2013, p. 1213-1227.
[14] Yen G. G., Lin K. C. Wavelet packet feature extraction for vibration monitoring. IEEE Trans. Ind. Electron., Vol. 47, 2000, p. 650-667.
[15] Pirra M., Gandino E., Torri A., Garibaldi L., Machorro-López J. M. PCA algorithm for detection, localisation and evolution of damages in gearbox bearings. Journal of Physics: Conference Series, 2011, p. 305.
[16] Upadhyaya S. Baker-Demaray. Comparison of NN and LR classifiers in the context of screening native American elders with diabetes. Expert Systems with Applications, Vol. 40, 2013, p. 5830-5838.
[17] Djurdjanovic D., Ni J., Lee J. Time-frequency based sensor fusion in the assessment and monitoring of machine performance degradation. ASME International Mechanical Engineering Congress and Exposition, 2002.
[18] D’Ambrosio R., Iannello G., Soda P. Solving biomedical classification tasks by softmax reconstruction in ECOC framework. Computer-Based Medical Systems, 2013, p. 433-436.
About this article
This research was supported by the National Natural Science Foundation of China (Grant Nos. 61074083, 50705005 and 51105019), the Technology Foundation Program of National Defense (Grant No. Z132013B002), as well as the Innovation Foundation of BUAA for PhD Graduates.