Bearing fault diagnosis using a novel coding-statistic feature combined with NNC

The failure of rolling bearings is a common cause of breakdowns in rotating machinery, so bearing fault diagnosis is receiving increasing attention. In this paper, a new coding-statistic feature is proposed for bearing fault diagnosis. First, a waveform coding matrix (WCM) is derived from each signal using a coding algorithm, and a statistical feature is then extracted from the WCM with a pre-defined dictionary. Second, all statistical features are processed using two-dimensional principal component analysis (2DPCA) to reduce redundant information and dimensionality. Finally, a nearest neighbor classifier (NNC) is employed to classify the bearing faults. Two bearing fault classification problems are used to demonstrate the effectiveness of the proposed scheme. Experimental results show that the proposed scheme achieves average accuracies above 94 % on both problems.


Introduction
Rolling bearings are essential support components in rotating machinery and are widely used in industrial machines. Their failure may result in machine breakdowns and even casualties [1]. Hence, it is essential to diagnose their faults accurately and rapidly, especially at industrial sites.
In general, bearing fault diagnosis involves three procedures. The first is to collect monitoring data with sensors. The second is to process the acquired signals to extract sensitive features related to the bearing health state. The third is to diagnose the bearing health condition and/or locate the corresponding faults. Fault features are usually extracted from raw signals in the time domain, frequency domain and time-frequency domain. Time-domain analysis works directly on the time waveform of the vibration signals to extract features such as root-mean-square amplitude, skewness and kurtosis [2]. Frequency-domain analysis usually transforms the vibration signals into the frequency domain via the fast Fourier transform and relies on tools such as the cepstrum and Hilbert envelope spectrum analysis [3], [4]. Time-frequency features are usually extracted by means of the short-time Fourier transform [5], empirical mode decomposition [6], the Wigner-Ville distribution [7], wavelet analysis [8], HHT time-frequency analysis, and high-order spectral analysis [9]. However, the efficiency of the extraction process is of great importance for real-time diagnosis. Hence, more attention should be paid to the feature extraction phase of fault diagnosis.
Once the features are extracted, the bearing health state can be identified with one or more classifiers, such as distance classifiers, artificial neural networks (ANNs) and support vector machines (SVMs). For example, Liu et al. [10] performed bearing fault diagnosis with LS-SVM and empirical mode decomposition. Yang et al. [11] conducted fault diagnosis of rolling bearings based on SVMs and fractal dimension. Sreejith et al. [12] employed time-domain features and neural networks for bearing fault diagnosis. Deep learning algorithms have also been used for fault diagnosis of rotating machinery in the past few years [13], [14]. In general, artificial intelligence (AI) approaches, like ANNs and SVMs, may show improved performance over other approaches. In practice, however, it is not easy to apply AI techniques to provide effective decision support because of their expensive computations. Considering these shortcomings, a simple classifier, the nearest neighbor classifier (NNC), is employed in this study to classify the health states of bearings.
In this paper, a novel feature extraction method is proposed based on a coding strategy. First, a waveform coding matrix (WCM) is derived from the discrete time series using a coding technique to capture the signal structure, and a coding-statistic feature is obtained from the WCM with a pre-defined dictionary. Then two-dimensional principal component analysis (2DPCA) is employed to process these statistical features so that the redundancy and dimensionality of the raw features are reduced. Finally, an NNC is applied to classify the bearing faults. The main contributions of this work are a new feature derived directly from the time-domain signal to describe the health state of rolling bearings, and a fault diagnosis scheme built on 2DPCA and NNC.
The remainder of this paper is organized as follows. Section 2 presents the process of feature extraction using a coding strategy and statistical analysis in detail. Section 3 elaborates the proposed fault diagnosis scheme. Section 4 provides the experimental validation with simulated and real data. Finally, the conclusions are presented in Section 5 together with possible future work.

Coding-statistic feature extraction
In this paper, a numeric coding technique is introduced to capture the structural information of the collected signals during feature extraction. Suppose that the discrete time signal is $x = [x_1, x_2, \cdots, x_{L-1}, x_L]$, where $L$ represents the length of $x$; the proposed feature extraction can then be summarized as follows.
Step 1: Perform zero-mean and normalization processing. In order to eliminate the influence of the mean and to meet the requirements of discrete coding, the discrete time series $x$ is first centered to have mean 0 and scaled to have standard deviation 1 using the standardized $z$-score algorithm, and then normalized with the min-max normalization method described in Eq. (1):

$$u_i = 2\,\frac{x_i - \min(x)}{\max(x) - \min(x)} - 1, \qquad (1)$$

where $i = 1, 2, \cdots, L$, and $\max(x)$, $\min(x)$ denote the maximum and minimum of $x$ respectively. In particular, in an online monitoring system the maximum and minimum should be calculated from the currently encoded signal. That is, they are computed from the new series obtained by splicing the overlapping points acquired last time with the newly collected ones. In practice, they can also be specified manually and empirically. Finally, the discrete time series is normalized to values between -1 and 1.
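Step 1 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names `zscore` and `minmax_to_unit_range` are hypothetical, and the population standard deviation is used (the source does not say which estimator is meant). When the bounds are taken from the series itself, this choice does not change the result, because min-max scaling is invariant to the preceding shift and scale.

```python
import math

def zscore(x):
    """Center x to mean 0 and scale it to unit (population) standard deviation."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std == 0:
        raise ValueError("a constant signal cannot be standardized")
    return [(v - mean) / std for v in x]

def minmax_to_unit_range(z, lo=None, hi=None):
    """Map z linearly to [-1, 1]; lo/hi default to the min/max of z.

    Passing lo/hi explicitly mimics bounds that are specified manually.
    """
    lo = min(z) if lo is None else lo
    hi = max(z) if hi is None else hi
    if hi == lo:
        raise ValueError("lo and hi must differ")
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in z]

signal = [0.2, 1.5, -0.7, 3.1, 0.0, -2.4]   # toy data, not from the paper
u = minmax_to_unit_range(zscore(signal))
```

With manually specified bounds, values outside `[lo, hi]` fall outside [-1, 1] and are saturated by the quantizer in the next step.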
Step 2: Encode $u$ as integers using $2^n$-level quantization. Uniform quantization divides the value range into segments of equal width. In order to capture the structure of the time-domain waveform, uniform quantization is employed in this study, and the corresponding function "uencode" available in Matlab has been used for this purpose. The syntax of this function is "y = uencode(u, n)", where u is the normalized series from the previous step and n sets the number of quantization levels to $2^n$; n must be an integer between 2 and 32 (inclusive). In general, a value of n between 7 and 12 is recommended. The quantization rules are:
(1) If the input is less than -1, the output of "uencode" is 0.
(2) If the input is greater than 1, the output of "uencode" is $2^n - 1$.
The signal can then be encoded to integer values by applying $2^n$-level quantization as follows:

$$q_i = \mathrm{uencode}(u_i, n) + 1, \qquad (2)$$

where "1" is added to ensure that the minimum element of $q$ is one, so that the WCM can be generated in the next step.
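The quantization step can be illustrated with a small stand-in for "uencode". The sketch below is an approximation written for this explanation, not Matlab's implementation: `uniform_encode` and `encode_signal` are hypothetical names, and the handling of values that fall exactly on an interval boundary is an assumption that may differ from Matlab.

```python
import math

def uniform_encode(u, n):
    """Quantize values in [-1, 1] to integers 0 .. 2**n - 1, saturating outside."""
    if not 2 <= n <= 32:
        raise ValueError("n must be an integer between 2 and 32")
    levels = 2 ** n
    codes = []
    for v in u:
        k = math.floor((v + 1.0) / 2.0 * levels)   # index of the equal-width bin
        codes.append(min(max(k, 0), levels - 1))   # saturate below -1 and above 1
    return codes

def encode_signal(u, n):
    """Shift the codes by +1 so that the smallest possible code is 1."""
    return [c + 1 for c in uniform_encode(u, n)]

q = encode_signal([-1.5, -1.0, 0.0, 0.99, 1.0, 2.0], n=4)
print(q)  # [1, 1, 9, 16, 16, 16]
```

Inputs below -1 map to the lowest code and inputs above 1 to the highest, matching rules (1) and (2) after the +1 shift.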
Step 3: Calculate the waveform coding matrix (WCM). Once the result of uniform quantization is obtained in Step 2, the WCM can be constructed as $F \in \mathbb{R}^{2^n \times L}$ by labeling its elements with "0" or "1". The labeling rule is: in the column of $F$ that corresponds to sample $x_i$, label the element at the location given by the quantization output of $x_i$ as "1", and label all other elements as "0".
Step 4: Feature extraction through statistical analysis with a pre-defined dictionary. In order to perform feature extraction with the WCM of $x$, a dictionary is predefined as Eq. (3):

$$D = \{w_1, w_2, \cdots, w_{2^l}\}, \qquad (3)$$

where $w_i$ ($i = 1, 2, \cdots, 2^l$) stands for a word in dictionary $D$ with length $l$. Afterward, the total number of occurrences of each word in every row of $F$ is calculated and arranged in a statistical matrix $V$ with rows $V_j$ ($j = 1, 2, \cdots, 2^n$), each element of which denotes the number of non-overlapping occurrences of $w_i$ in the $j$th row of $F$. In the end, the coding-statistic feature $V$ is constructed through the statistical analysis of $F$ with the pre-defined dictionary.
[Fig. 1: steps of the feature extraction (waveform coding of x into F with n encoding bits, dictionary construction, and word counting in each row of F to build V).]
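Steps 3 and 4 can be sketched as follows. Several details are assumptions made for illustration only: rows of the matrix index the quantization levels and columns index the samples, the dictionary holds all binary words of length `l` in ascending order, and "non-overlapping" is read as scanning each row with windows of stride `l`. All function names are hypothetical.

```python
def waveform_coding_matrix(codes, n):
    """Binary matrix with 2**n rows (levels) and len(codes) columns (samples)."""
    wcm = [[0] * len(codes) for _ in range(2 ** n)]
    for col, code in enumerate(codes):
        wcm[code - 1][col] = 1          # codes are 1-based after the +1 shift
    return wcm

def build_dictionary(l):
    """All 2**l binary words of length l, e.g. '000' ... '111' for l = 3."""
    return [format(i, "0{}b".format(l)) for i in range(2 ** l)]

def coding_statistic_feature(wcm, l):
    """Count, per row, how often each word occurs in non-overlapping windows."""
    words = build_dictionary(l)
    index = {w: i for i, w in enumerate(words)}
    feature = []
    for row in wcm:
        counts = [0] * len(words)
        for start in range(0, len(row) - l + 1, l):   # stride l: no overlap
            word = "".join(str(b) for b in row[start:start + l])
            counts[index[word]] += 1
        feature.append(counts)
    return feature

codes = [1, 2, 2, 4, 3, 1]                  # toy quantizer output for n = 2
F = waveform_coding_matrix(codes, n=2)      # 4 x 6 binary matrix
V = coding_statistic_feature(F, l=3)        # 4 x 8 count matrix
```

Each column of `F` contains exactly one "1", so every row of `V` sums to the number of windows per row (here 2).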
The detailed steps are presented in Fig. 1. As illustrated in Fig. 1, the proposed coding form is computationally efficient because repeated segments need not be recomputed, which matters especially for real-time condition monitoring systems.
Next, the online computation of the WCM is described with a simple parameter setting, and the statistical feature extraction is illustrated intuitively. Fig. 2 gives an example of real-time waveform coding in a condition monitoring system. Here, the sample length is set to $L = 18$, and the quantization parameter is set to $n = 4$. In the beginning, the waveform coding of the signal acquired at time $t_1$ is performed to obtain the first WCM. When the second coding sample is collected at time $t_2$, the parts marked with grey boxes overlap with the first sample and need not be encoded again, which reduces the computational burden. Hence, only the parts whose elements are marked in pink are calculated via the waveform coding algorithm. Eventually, the second waveform coding is obtained by combining the reused codes with the newly computed ones. This reuse is what reduces the computational complexity.
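The reuse of already encoded samples can be sketched as a sliding update of the code sequence. The sketch assumes that the normalization bounds `lo` and `hi` are fixed in advance (the manually specified case), so that previously computed codes stay valid; `encode` and `slide_codes` are hypothetical names.

```python
import math

def encode(v, n, lo, hi):
    """Map v from [lo, hi] to an integer code in 1 .. 2**n (saturating)."""
    levels = 2 ** n
    k = math.floor((v - lo) / (hi - lo) * levels)
    return min(max(k, 0), levels - 1) + 1

def slide_codes(prev_codes, new_samples, n, lo, hi):
    """Reuse the codes of the overlapping samples; encode only the new ones."""
    shift = len(new_samples)
    if shift > len(prev_codes):
        raise ValueError("more new samples than the window can hold")
    return prev_codes[shift:] + [encode(v, n, lo, hi) for v in new_samples]

window_t1 = [0.0, 0.5, -0.5, 1.0]                       # toy window
codes_t1 = [encode(v, 2, -1.0, 1.0) for v in window_t1]
codes_t2 = slide_codes(codes_t1, [-1.0, 0.25], 2, -1.0, 1.0)
print(codes_t1, codes_t2)  # [3, 4, 2, 4] [2, 4, 1, 3]
```

Only two of the four samples in the second window are encoded; the other two codes are carried over unchanged.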
In the stage of statistical feature extraction, for simplicity, the length of a word in the pre-defined dictionary is set to $l = 3$. Then the dictionary is determined as in Eq. (4):

$$D = \{w_1, w_2, \cdots, w_8\}, \qquad (4)$$

where $w_1$-$w_8$ denote the words in dictionary $D$, i.e. the $2^3 = 8$ binary words of length 3. To illustrate the generation of the statistical feature intuitively, the feature extraction process is presented in detail using a WCM shown in Fig. 2, which is denoted as Eq. (5). As an example, the calculation for the first row of the WCM in Eq. (5) is written out in Eq. (6). The feature extraction can then be performed with the dictionary defined in Eq. (4). The calculation of the feature is illustrated in Fig. 3 (see also Eq. (7)). The calculation process shows that the length of the sliding window can be set larger in real applications to capture more structural information about the analyzed object, while a high computational efficiency is still retained for condition monitoring. In the subsequent sections, the fault diagnosis scheme that combines the derived statistical features with 2DPCA and NNC is presented.

A description of 2DPCA
Two-dimensional principal component analysis (2DPCA), a powerful tool for processing two-dimensional data, was developed for image representation [15] and has recently been used in fault diagnosis of rotating machinery [16]. In 2DPCA, the global scatter matrix is constructed and analyzed, and its leading eigenvalues are used to determine the projection basis. Finally, all original features are projected into the eigenspace to obtain the dimension-reduced information (also called sensitive features). The principle of 2DPCA can be summarized as follows.
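Suppose that there are $M$ two-dimensional samples, and the $k$th sample is denoted as $A_k$ ($k = 1, 2, \cdots, M$), where $A_k \in \mathbb{R}^{m \times h}$. The global scatter matrix is first computed as:

$$G = \frac{1}{M} \sum_{k=1}^{M} (A_k - \bar{A})^T (A_k - \bar{A}), \qquad (8)$$

where $\bar{A} = \frac{1}{M} \sum_{k=1}^{M} A_k$ is the mean matrix, and $(\cdot)^T$ represents the transposition operation. Then the eigenvalues and eigenvectors of $G$ are derived by solving:

$$G \varphi_i = \lambda_i \varphi_i. \qquad (9)$$

Sorting $\lambda_i$ ($i = 1, 2, \cdots, h$) in decreasing order, the eigenvalues can be denoted as:

$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_h, \qquad (10)$$

and the eigenvectors can be denoted according to Eq. (10) as:

$$\varphi_1, \varphi_2, \cdots, \varphi_h. \qquad (11)$$

Before the optimal projection axes used for dimensionality reduction are determined, an eigenvector selection criterion is introduced:

$$\frac{\sum_{i=1}^{d} \lambda_i}{\sum_{i=1}^{h} \lambda_i} \ge \theta, \qquad (12)$$

where $\theta$ is the selection threshold, specified as $\theta = 90$ % in this study, and $d$ is the smallest number of axes that satisfies the criterion. According to Eq. (12), the optimal projection coordinate can be constructed as:

$$P = [\varphi_1, \varphi_2, \cdots, \varphi_d]. \qquad (13)$$

A given sample $A \in \mathbb{R}^{m \times h}$ can be transformed with the above coordinate as:

$$Y = A P. \qquad (14)$$

Finally, the eigen matrix of $A$ is obtained as $Y \in \mathbb{R}^{m \times d}$. More detailed descriptions of 2DPCA can be found in Ref. [15].
JOURNAL OF VIBROENGINEERING. AUGUST 2022, VOLUME 24, ISSUE 5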
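The eigenvector selection criterion and the projection can be illustrated without a linear algebra library, assuming the eigenvalues and eigenvectors of the scatter matrix have already been computed elsewhere. The names `select_num_axes` and `project` are hypothetical, and choosing the smallest number of axes that reaches the threshold is an assumption about how the criterion is applied.

```python
def select_num_axes(eigenvalues, theta=0.90):
    """Smallest d whose leading eigenvalues reach the fraction theta of the total."""
    vals = sorted(eigenvalues, reverse=True)
    total = sum(vals)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    running = 0.0
    for d, lam in enumerate(vals, start=1):
        running += lam
        if running / total >= theta:
            return d
    return len(vals)

def project(sample, axes):
    """Compute Y = A P for A (m x h) and P (h x d), both given as lists of rows."""
    d = len(axes[0])
    return [[sum(row[i] * axes[i][j] for i in range(len(row))) for j in range(d)]
            for row in sample]

d = select_num_axes([6.0, 2.5, 1.0, 0.5])   # cumulative ratios 0.60, 0.85, 0.95
Y = project([[1, 2], [3, 4]], [[1], [0]])   # keeps only the first column
print(d, Y)  # 3 [[1], [3]]
```

A threshold of 90 % therefore keeps three of the four axes in this toy spectrum.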

NNC-based classification
After the dimension of the features is reduced, an NNC based on the nearest Frobenius distance is applied to construct the state recognition model. Here, the distance between a projected test sample $Y \in \mathbb{R}^{m \times d}$ and a projected base sample $Y_k \in \mathbb{R}^{m \times d}$ is measured with the Frobenius norm as:

$$\mathrm{dist}(Y, Y_k) = \| Y - Y_k \|_F, \qquad (15)$$

where $\|\cdot\|_F$ denotes the Frobenius norm. As mentioned above, the optimal projection coordinate is obtained as shown in Eq. (13). Therefore, the training samples can be projected into the eigenspace and denoted as $Y_1, Y_2, \cdots, Y_M$. Finally, given a testing sample $Y$, if:

$$\mathrm{dist}(Y, Y_j) = \min_{k} \mathrm{dist}(Y, Y_k), \qquad (16)$$

and $Y_j$ belongs to the class identity $\omega_c$ (where $c \in \{1, 2, \cdots, C\}$ and $C$ is the total class number of the training samples), the resulting decision is that $Y$ is classified as $\omega_c$. After this, the features of the vibration signals collected in real time are projected with the coordinate obtained in the NNC training phase, and then input into the state recognition model to identify the bearing health state.
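The decision rule is short enough to write out directly. The following is a minimal sketch of a nearest neighbor classifier on matrix-valued features with the Frobenius distance; the function names and the toy training data are illustrative only.

```python
import math

def frobenius_distance(a, b):
    """Frobenius norm of the difference of two equally sized matrices."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))

def nearest_neighbor_classify(test, train_features, train_labels):
    """Return the label of the training feature closest to the test feature."""
    best = min(range(len(train_features)),
               key=lambda k: frobenius_distance(test, train_features[k]))
    return train_labels[best]

train_features = [[[0.0, 0.0], [0.0, 0.0]],
                  [[1.0, 1.0], [1.0, 1.0]]]
train_labels = ["NO", "IF"]
label = nearest_neighbor_classify([[0.9, 1.1], [1.0, 0.8]],
                                  train_features, train_labels)
print(label)  # IF
```

There is no training beyond storing the projected features, which is why the classifier adds little computational cost.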

The proposed scheme for bearing diagnosis
The framework of the proposed scheme for bearing diagnosis is depicted in Fig. 4. The proposed method consists of two main steps. The first step is done offline and aims at generating a classification model. The second step, which is performed online, uses the model generated in the first step to classify the bearing health state. The process of the proposed method for offline training is given below:
(1) Measure the vibration signal from the bearing system using acceleration sensors.
(2) Divide the original signal into equal time segments with a sliding window (length: $L$) to obtain M sub-signals.
(3) Perform feature extraction according to the methodology described in Section 2.
(4) Reduce dimensionality and remove redundancy using 2DPCA, and obtain the projection coordinate $P$.
(5) Train the NNC model by projecting all training samples with $P$ into the eigenspace to obtain the training projective features $Y_k$ ($k = 1, 2, \ldots, M$).
For online fault diagnosis, the testing vibration signal is also acquired with length $L$, and the same WCM-based feature extraction is conducted to derive the original feature. The feature is then input into the NNC model to decide the current state. In addition, the computational complexity of the proposed method can be lower than that of traditional fault diagnosis methods, because the WCM is derived by a sliding scheme in the feature extraction phase when applied to online monitoring.
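The segmentation in step (2) can be sketched as below. The `step` parameter is an assumption introduced for illustration: the text specifies the window length but not how far the window advances, so the sketch defaults to non-overlapping segments and lets the caller choose an overlap.

```python
def segment_signal(signal, length, step=None):
    """Split a signal into windows of a fixed length.

    step is the hop between window starts; it defaults to `length`
    (no overlap). Trailing samples that do not fill a window are dropped.
    """
    step = length if step is None else step
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [signal[s:s + length]
            for s in range(0, len(signal) - length + 1, step)]

raw = list(range(10))                         # stand-in for a vibration record
segments = segment_signal(raw, length=4, step=2)
print(len(segments), segments[-1])  # 4 [6, 7, 8, 9]
```

Each segment would then go through the normalization, coding and counting steps to yield one feature matrix.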

Simulation study
To show more intuitively how the proposed coding technique expresses the structure of the collected signals, a simulation was conducted first. In this subsection, an impulse signal is employed to observe the performance of the waveform coding. A modified impulse signal from Ref. [17] is expressed in Eq. (17), in which two signal parameters are equal to 110 and 3,900 respectively, a uniformly distributed random number in $[-0.1, 0.1]$ is used to simulate the randomness caused by slippage, and $f_s$ is the sampling frequency ($f_s = 12{,}000$ Hz). A non-negativity condition is imposed on the time argument of the exponential function to ensure its causality. In this section, additive Gaussian noise is added to the simulated signals, and the signal-to-noise ratio (SNR) is set to -2 dB. This simulation study aims to show the waveform coding process of the proposed method. The generation process shows that the WCM can be regarded as a characteristic representation of the simulated signal: the intrinsic structure of the analyzed signal is mapped into a two-dimensional matrix. The original signal and its WCM are shown in Fig. 5, where only 1024 sampling points are displayed.
The feature extraction of the simulated data is omitted because of the length of the signals (i.e., $L = 1024$), for which the feature matrix is too large to show here. For the technical details of feature extraction, readers can refer to the example presented in Fig. 3.
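Adding Gaussian noise at a prescribed SNR can be sketched as follows. The clean signal used here is a placeholder sine wave rather than the impulse model of Eq. (17), and the function names, the seed and the choice to scale the noise so that the empirical SNR matches the target are assumptions made for illustration.

```python
import math
import random

def add_noise_at_snr(signal, snr_db, seed=0):
    """Add white Gaussian noise scaled so the empirical SNR equals snr_db."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    p_signal = sum(v * v for v in signal) / len(signal)
    p_noise = sum(v * v for v in noise) / len(noise)
    target_noise_power = p_signal / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / p_noise)
    return [s + scale * w for s, w in zip(signal, noise)]

def measured_snr_db(clean, noisy):
    """SNR in dB computed from the clean signal and the noisy signal."""
    p_signal = sum(v * v for v in clean) / len(clean)
    p_noise = sum((y - x) ** 2 for x, y in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(p_signal / p_noise)

fs = 12000                                    # sampling frequency in Hz
clean = [math.sin(2.0 * math.pi * 110.0 * k / fs) for k in range(1024)]
noisy = add_noise_at_snr(clean, snr_db=-2.0)
snr = measured_snr_db(clean, noisy)           # close to -2 dB by construction
```

At -2 dB the noise power exceeds the signal power, so the impulses are hard to see in the raw waveform.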

Bearing data description
In order to evaluate the performance of the proposed methodology, vibration signals collected at Case Western Reserve University [18] are used to validate the effectiveness of the diagnostic scheme. As depicted in Fig. 6 [...] In this section, two experiments were conducted on two different data subsets (A and B) to verify the robustness of the proposed method. During the random segmentation phase, the samples were randomly extracted with a fixed time window from the acquired raw vibration signals. In this way, 200 samples were created for each bearing state (NO, IF, OF or BF) under each load condition (C1, C2, C3 or C4). The two classification problems are summarized in Table 2.

Diagnostic performance analysis
In order to eliminate the redundancy of the original coding-statistic features and improve the computational efficiency, all original features of the training sets are processed with the 2DPCA algorithm to calculate the projection coordinate for dimensionality reduction. In 2DPCA, the contribution of the selected largest eigenvalues is set to $\theta = 90$ % to establish the projection coordinate. Then the original training and testing features are mapped with the projection basis. The NNC is then trained with the dimension-reduced features of the training samples to derive the state classification model. Note that training samples under all four load conditions are used to train the NNC model, and each experiment is repeated 20 times with randomly selected training samples; the average accuracies over the 20 randomized trials are calculated and recorded. The diagnostic performances on the two data sets are shown in Table 3.
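The repeated random-split protocol can be sketched generically. The training fraction, the seed and the `classify(sample, train_samples, train_labels)` callback signature are assumptions introduced for illustration; the actual split sizes are those of Table 2.

```python
import random

def average_accuracy_over_trials(samples, labels, classify,
                                 train_fraction=0.5, trials=20, seed=0):
    """Average test accuracy over repeated random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    accuracies = []
    for _ in range(trials):
        rng.shuffle(indices)
        cut = int(len(indices) * train_fraction)
        train, test = indices[:cut], indices[cut:]
        train_x = [samples[i] for i in train]
        train_y = [labels[i] for i in train]
        correct = sum(classify(samples[i], train_x, train_y) == labels[i]
                      for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)

# Toy check with a rule that ignores the training data entirely.
samples = [0.1 * k for k in range(10)] + [5.0 + 0.1 * k for k in range(10)]
labels = ["a"] * 10 + ["b"] * 10
rule = lambda x, train_x, train_y: "a" if x < 2.5 else "b"
acc = average_accuracy_over_trials(samples, labels, rule)
print(acc)  # 1.0
```

In the diagnostic setting, `classify` would be the nearest neighbor rule applied to the 2DPCA-projected features.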
Table 3 shows that high accuracies are achieved for both data sets A and B. For the first classification problem, testing samples from C1, C2, C3 and C4 were classified with accuracies of 94.30 %, 95.36 %, 95.15 % and 91.88 % respectively. An average accuracy of 94.42 % is obtained, which indicates that the proposed method can diagnose the health states of bearings with satisfactory performance. For the second classification problem, testing samples from C1, C2, C3 and C4 were classified with accuracies of 95.65 %, 95.47 %, 95.06 % and 93.07 % respectively. This shows that the proposed method is capable of handling this multi-class bearing fault classification problem. Interestingly, the average accuracy on data set B (10 classes) is slightly higher than that on data set A (4 classes). The main reason is that the samples in classes 2B, 5B and 8B are misclassified more often than the others.
In order to compare the fault diagnosis performance more intuitively, several published works are collected for comparison in Table 4. On the same bearing data, time- and frequency-domain features are calculated and classified with a combination of multiple adaptive neuro-fuzzy inference systems (ANFIS) in [19], where accuracies of 100.00 % and 91.33 % are obtained for the two classification problems respectively. For the ten-class classification problem, phase space features with a Support Vector Regression Machine (SVR) in [20], and the trace ratio criterion LDA (TR-LDA) with a kNN classifier in [21], are put forward to deal with this more complicated situation, and accuracies of 90.30 %, 92.50-98.00 % and 99.38 % are obtained respectively. The present work, with 94.81 % classification accuracy, is competitive on this problem. The comparison indicates that the proposed diagnostic scheme performs well on fault classification problems overall. Given the efficiency of its feature extraction and its classification performance, the proposed method is also suitable for fault diagnosis systems in industrial applications.

Conclusions
The waveform coding algorithm shows that the nature and regularity of a time series can be captured from its coded structure. Requirements in industrial applications keep rising, and the efficiency of the algorithms used is receiving increasing attention. Based on this consideration, a WCM-based feature extraction method is proposed in this paper. First, the statistical feature is obtained directly from the WCM of the time-domain series with the assistance of a coding algorithm. Then 2DPCA is employed to process the statistical features obtained in the previous phase. Finally, an NNC is applied for fault classification. Two groups of experiments are conducted to validate the performance of the presented methodology. From the feature extraction process and the fault classification results, the following conclusions are drawn.
1) The proposed algorithm has low algorithmic complexity and high efficiency. Because it avoids repeated calculation for the overlapping sections, as illustrated in Fig. 2, the proposed algorithm is well suited for online monitoring in real-time fault diagnosis systems.
2) The diagnosis scheme in this paper performs well on multi-class classification problems such as data set B in Table 2.
As described in Section 2, the parameter $n$ for quantization and the parameter $l$ for WCM-based feature calculation may influence the performance and robustness of the proposed coding method, and both are selected empirically. Therefore, future work will aim at the optimized selection of these parameters to further improve the robustness of the proposed method.