Adaptive feature selection method with FF-FC-MIC for the detection of mutual faults in rotating machinery

When a rotor unbalance fault and a bearing defect fault are coupled in a rotor system, the signals contain multiple fault components, and fault diagnosis of the rotor system requires a comprehensive set of multi-dimensional features. However, irrelevant information in the multi-dimensional features increases the complexity of the classification calculation and reduces the efficiency and accuracy of diagnosis. To eliminate redundant and irrelevant features, and to achieve good diagnostic results with fewer features, this paper proposes an adaptive feature selection method based on the maximum information coefficient, FF-FC-MIC (Feature-to-Feature and Feature-to-Category Maximum Information Coefficient). Firstly, a sparse representation algorithm is used to reconstruct the original signal to improve the signal-to-noise ratio, and the multi-dimensional features of the reconstructed signal are calculated. Secondly, the correlation between features is calculated through MIC to obtain a set of features that are weakly correlated with each other. Thirdly, MIC is used to calculate the correlation between features and signal categories to obtain a set of features that are strongly correlated with the signal categories. Finally, the FF-FC-MIC feature selection method is used for adaptive feature selection, and the selected features are input into an SVM to complete fault diagnosis. The method is analyzed with simulation signals and real experimental signals. The results show that the method can effectively remove redundant and irrelevant features in the coupling fault, reducing the feature dimension and the fault classification time while improving classification accuracy. Different experimental cases and comparisons with several feature selection methods further verify the accuracy and applicability of the proposed method.


Introduction
Rotating machinery plays an important role in industrial production. Because of its working environment and continuous operation, its rotor system has a high incidence of mechanical failure. When the rotor system fails, multiple and compound faults are common. For example, when the rotor is unbalanced, the bearing is damaged by the continuous impact; similarly, when a bearing fails due to wear, impact, pitting, fatigue stress, etc., the resulting shock and vibration adversely affect other parts and can easily cause them to fail. Compared with a single-fault signal, the signal of a compound fault is more complex, and its fault features are harder to extract. Therefore, coupling fault diagnosis of the rotor system is of great significance [1][2][3][4].
Accurate and effective extraction of fault features is the key to fault diagnosis [5]. However, in the actual operation of equipment, there are two problems in the separation and extraction of coupling fault features. First, the signal components of coupling faults are more complex, the signals of different faults affect each other, and the signal-to-noise ratio decreases, so weak fault features are covered by noise components and are difficult to extract. Second, the multi-dimensional feature set contains redundant features and features that are unrelated to the classification target, which lowers the efficiency of fault diagnosis and reduces the accuracy of fault classification [6,7].
The sparse representation method can characterize any signal as the superposition of a high-quality signal with structural features and a random noise signal, so it can extract periodic structural features from strong background noise to achieve a sparse expression of the signal. For example, [8] proposed a latent component decomposition method based on sparse representation to identify weak fault features of bearings and gears under strong background noise. Reference [9] proposed a novel collaborative sparsity-assisted fault diagnosis (CSFD) method that improves the feature extraction capability and fault classification performance for rotor systems under strong noise. In [10], periodic weighted kurtosis sparse denoising is combined with a periodic filtering method to extract the repetitive pulses in a compound fault, achieving denoising and fault separation of the bearing compound fault signal. Sparse representation reconstructs the superposition of matched atoms into a high-quality signal that retains the periodic structure of the original signal [11]. Therefore, this paper chooses the sparse representation method for signal noise reduction.
Feature selection measures the correlation between features and removes redundant and irrelevant features from the feature set, which improves the accuracy of fault diagnosis and reduces the training time of the model. PCA was first applied to data dimension reduction, but it can only capture linear relations between features and is not suitable for complex nonlinear relations [12]. Compared with the traditional correlation coefficients of Pearson, Spearman, Kendall, etc. [13][14][15], MIC can capture a wide range of relationships when there are enough samples and is not limited to specific functional forms; that is, it has universality and uniformity [16]. Sun et al. [17] proposed a two-stage method that uses MIC and the approximate Markov blanket to measure the correlation between features and classes, remove redundant features, and select feature subsets. For complex mechanical fault diagnosis, reference [18] proposed a feature selection method that combines Fisher Score and MIC to evaluate the correlation between features and complete feature selection for gearbox faults.
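The limitation of linear correlation measures mentioned above can be illustrated with a small sketch. It is not taken from the paper: the helper `pearson` and the test data are hypothetical, chosen only to show that the Pearson coefficient of a deterministic but nonlinear relation can be close to zero.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical data: x is symmetric around zero.
x = [i / 10 for i in range(-50, 51)]
y_linear = [2 * v + 1 for v in x]      # linear dependence
y_parabola = [v * v for v in x]        # deterministic, but nonlinear

r_linear = pearson(x, y_linear)        # close to 1
r_parabola = pearson(x, y_parabola)    # close to 0 although y depends on x
```

A dependence measure such as MIC is intended to assign a high score to both relations, which is the property the feature selection method relies on.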
To sum up, this paper proposes an adaptive feature selection method based on the maximum information coefficient under sparse representation, which comprehensively evaluates both feature-to-feature and feature-to-category correlation. Firstly, to avoid the empirical selection of parameters in sparse representation, a multi-step grid search method is used for parameter optimization to complete the reconstruction of the signal, and the multi-dimensional features of the reconstructed signal are calculated. Secondly, the maximum information coefficient (MIC) is used to calculate the correlation between features and between features and signal categories, in order to filter out redundant and irrelevant features. Finally, the FF-FC-MIC adaptive feature selection method is used to complete the feature selection, and the selected features are input to an SVM for fault diagnosis.
The main contributions of this paper can be summarized as follows:
1) The Orthogonal Matching Pursuit (OMP) sparse representation algorithm with an adaptive Gabor sub-dictionary removes the strong noise of coupling faults to highlight features.
2) The FF-FC-MIC method comprehensively considers both the feature-feature correlation and the feature-category correlation in fault signals to solve the adaptive selection of multi-dimensional features.
3) FF-FC-MIC removes redundant features in coupling faults, reduces the fault classification time and improves the fault diagnosis rate.
4) A simulation experiment and real experiments, combined with an SVM classification model, are used to verify the superiority of the proposed method for coupling faults.
The rest of the paper is organized as follows. Section 2 introduces the theoretical background of the method, including sparse representation and the feature selection method based on the maximum information coefficient. Section 3 describes the whole diagnostic process of the proposed method for coupling faults of the rotor system. Section 4 consists of two parts: a simulation experiment and real experiments. The feasibility of the method is preliminarily verified with a simulated coupling signal, and the method is then verified on three real cases and compared with other methods. Section 5 summarizes the paper and discusses future work.

Sparse representation
Assume that the signal is $y$ with length $N$, so that it can be regarded as a vector in $\mathbb{R}^{N}$, and that the redundant dictionary is $D = [d_1, d_2, d_3, \cdots, d_K]$, where $K > N$. The signal $y$ can then be expressed as a linear superposition of the basis functions, as shown in Eq. (1):
$$y = \sum_{k=1}^{K} \alpha_k d_k \tag{1}$$
where $\alpha_k$ is the coefficient of the basis function $d_k$, and $\alpha = [\alpha_1, \alpha_2, \cdots, \alpha_K]$ is the sparse representation of the signal.
The sparse representation method adopted in this paper is the Orthogonal Matching Pursuit (OMP) algorithm with an adaptive Gabor sub-dictionary. The Gabor atom is a kind of time-frequency atom with good time-frequency approximation performance. It tracks the stationary components of the signal well and can match the cyclic characteristics in the signal. Eq. (2) is the basic definition:
$$g_{\gamma}(t) = \frac{1}{\sqrt{s}}\, g\!\left(\frac{t-u}{s}\right) e^{ivt} \tag{2}$$
where $g(t) = e^{-\pi t^{2}}$ is the Gaussian window function; $\gamma = (s, u, v)$ is the atomic parameter; $s$ is the scale factor; $u$ is the displacement factor; $v$ is the frequency factor. Let $\hat{g}(\omega)$ be the Fourier transform of $g(t)$; then the Fourier transform of the Gabor atom can be obtained from Eq. (2), as shown in Eq. (3):
$$\hat{g}_{\gamma}(\omega) = \sqrt{s}\, \hat{g}\big(s(\omega - v)\big)\, e^{-i(\omega - v)u} \tag{3}$$
The calculation process of OMP is as follows: an over-complete dictionary of Gabor atoms is built from the original signal, the OMP algorithm matches the original signal against it, the best matching atom in the over-complete dictionary is found, and the residual signal is generated. A new over-complete dictionary is then generated according to the residual signal, and the next matching calculation is performed, until the residual signal meets the stopping condition [19].
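The matching loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a fixed dictionary instead of the adaptive sub-dictionary that is regenerated from each residual, real-valued cosine-modulated atoms, and hypothetical parameter grids, test signal, and stopping tolerance.

```python
import math
import random

def gabor_atom(n, s, u, v):
    """Real-valued, unit-norm Gabor atom: Gaussian window of scale s centred at u,
    modulated at frequency v (rad/sample). The cosine form is an assumption."""
    atom = [math.exp(-math.pi * ((t - u) / s) ** 2) * math.cos(v * (t - u))
            for t in range(n)]
    norm = math.sqrt(sum(a * a for a in atom))
    return [a / norm for a in atom]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def omp(signal, dictionary, max_atoms=10, tol=0.05):
    """Greedy OMP. Stops when the residual energy drops below tol times the
    signal energy. Returns (reconstruction, indices of the chosen atoms)."""
    residual = list(signal)
    basis, chosen = [], []          # basis: orthonormalised chosen atoms
    energy = dot(signal, signal)
    while len(chosen) < max_atoms and dot(residual, residual) > tol * energy:
        # Atom with the largest absolute correlation with the residual.
        k = max(range(len(dictionary)),
                key=lambda j: abs(dot(residual, dictionary[j])))
        # Gram-Schmidt: orthogonalise the new atom against the chosen ones.
        q = list(dictionary[k])
        for b in basis:
            c = dot(q, b)
            q = [qi - c * bi for qi, bi in zip(q, b)]
        q_norm = math.sqrt(dot(q, q))
        if q_norm < 1e-12:          # atom adds nothing new
            break
        q = [qi / q_norm for qi in q]
        basis.append(q)
        chosen.append(k)
        # Removing the component along q keeps the residual orthogonal to
        # every chosen atom, which is the "orthogonal" part of OMP.
        c = dot(residual, q)
        residual = [r - c * qi for r, qi in zip(residual, q)]
    reconstruction = [x - r for x, r in zip(signal, residual)]
    return reconstruction, chosen

# Hypothetical test: two Gabor-like impulses buried in Gaussian white noise.
random.seed(0)
n = 128
dictionary = [gabor_atom(n, s, u, v)
              for s in (8, 16, 32)
              for u in range(0, n, 8)
              for v in (0.2, 0.4, 0.8)]   # 144 atoms for 128 samples
clean = [3 * a + 2 * b for a, b in zip(gabor_atom(n, 16, 40, 0.4),
                                       gabor_atom(n, 16, 88, 0.4))]
noisy = [c + random.gauss(0, 0.05) for c in clean]
denoised, chosen = omp(noisy, dictionary)
err_noisy = sum((x - c) ** 2 for x, c in zip(noisy, clean))
err_denoised = sum((x - c) ** 2 for x, c in zip(denoised, clean))
```

Because only a few atoms are kept, most of the noise energy stays in the residual, which is why the reconstruction is expected to be closer to the clean signal than the noisy input.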

MIC
The calculation of MIC is based on the concept of mutual information and the grid partitioning method of information theory [20]. Mutual information is used to measure the correlation between two variables. Let $X = \{x_i, i = 1, 2, \ldots, n\}$ and $Y = \{y_i, i = 1, 2, \ldots, n\}$, where $n$ represents the number of samples. The mutual information between $X$ and $Y$ is defined as Eq. (4):
$$I(X, Y) = \sum_{x \in X} \sum_{y \in Y} p(x, y) \log_2 \frac{p(x, y)}{p(x)\, p(y)} \tag{4}$$
where $p(x, y)$ is the joint probability density of $X$ and $Y$, and $p(x)$ and $p(y)$ are the marginal probability densities of $X$ and $Y$ respectively.
When MIC is calculated, the set of data points of the two variables is distributed in two-dimensional space. $G$ is defined as a grid of size $a \times b$, where the range of $X$ is divided into $a$ segments and the range of $Y$ is divided into $b$ segments, and the limit of the grid is $a \times b < B$, $B = n^{0.6}$. The maximum value of the mutual information between variable $X$ and variable $Y$ over different grids is calculated and called the maximum mutual information. Eqs. (5) and (6) give the maximum mutual information and the resulting maximum information coefficient respectively:
$$I^{*}(X, Y) = \max_{G} I(X, Y) \tag{5}$$
$$\mathrm{MIC}(X, Y) = \max_{a \times b < B} \frac{I^{*}(X, Y)}{\log_2 \min(a, b)} \tag{6}$$
where $a$ and $b$ are the numbers of cells into which the ranges of $X$ and $Y$ are divided, respectively, and $I(X, Y)$ is the mutual information of variable $X$ and variable $Y$.
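The grid-based definition can be made concrete with a simplified sketch. It is only an approximation of MIC: the true MIC searches over the positions of the grid lines, whereas this sketch assumes equal-frequency bins for every grid size with $a \times b < n^{0.6}$, and it splits tied values arbitrarily. The function names and test data are hypothetical.

```python
import math
import random
from collections import Counter

def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding (almost) equal numbers of points."""
    order = sorted(range(len(values)), key=values.__getitem__)
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * k // len(values)
    return bins

def mutual_information(bx, by):
    """Mutual information (bits) of two sequences of discrete bin labels."""
    n = len(bx)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(c / n * math.log2(c * n / (px[i] * py[j]))
               for (i, j), c in pxy.items())

def mic_approx(x, y, exponent=0.6):
    """Approximate MIC: best normalised mutual information over
    equal-frequency grids of size a x b with a * b < n ** exponent."""
    budget = len(x) ** exponent
    best = 0.0
    a = 2
    while a * 2 < budget:
        b = 2
        while a * b < budget:
            mi = mutual_information(equal_frequency_bins(x, a),
                                    equal_frequency_bins(y, b))
            best = max(best, mi / math.log2(min(a, b)))
            b += 1
        a += 1
    return best

# Hypothetical data: a linear relation, a parabola, and unrelated noise.
random.seed(1)
x = [i / 100 for i in range(-100, 101)]
mic_line = mic_approx(x, [2 * v + 1 for v in x])
mic_parabola = mic_approx(x, [v * v for v in x])
mic_noise = mic_approx(x, [random.random() for _ in x])
```

Under these assumptions both functional relations score close to 1, while the unrelated pair scores much lower, which is the behaviour the feature selection steps depend on.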

Adaptive feature selection method based on MIC
Traditional feature selection relies only on the correlation between features to remove redundant features, without considering features that are unrelated to the signal category. MIC can measure not only the correlation between features, but also the correlation between features and signal categories [21]. To reduce the complexity of the classification calculation caused by irrelevant information in the multi-dimensional features, this paper proposes the FF-FC-MIC method to realize adaptive feature selection for the multi-dimensional information of coupled faults. Firstly, the MIC between features is calculated, and the features that are weakly correlated with the other features are selected in order to eliminate redundant features. Secondly, the MIC between each feature and the signal category is calculated, and the features that are strongly correlated with the signal category are selected. Finally, the feature sets obtained by the two methods are intersected, and the intersection is the final feature set used for classification.
According to the FF-MIC feature selection method, the correlation between features is calculated and the weakly correlated features are selected. Suppose there are $K$ kinds of signals with signal categories $C = \{c_1, c_2, \ldots, c_K\}$. $N$-dimensional features are extracted for each type of signal, and the complete set of features is $F = \{f_1, f_2, \ldots, f_N\}$. The maximum information coefficient between one feature of a single signal and the other features is calculated as shown in Eqs. (7) and (8), where $i$ and $j$ are the serial numbers of the features, and the grid sizes $a$, $b$ and the limit $B$ are the same as in Eqs. (5) and (6):
$$I^{*}(f_i, f_j) = \max_{G} I(f_i, f_j) \tag{7}$$
$$\mathrm{MIC}(f_i, f_j) = \max_{a \times b < B} \frac{I^{*}(f_i, f_j)}{\log_2 \min(a, b)} \tag{8}$$
According to Eqs. (7) and (8), the maximum information coefficients between the different features of a signal are calculated and listed as the FF-MIC matrix:
$$\text{FF-MIC} = \begin{bmatrix} \mathrm{MIC}(f_1, f_1) & \mathrm{MIC}(f_1, f_2) & \cdots & \mathrm{MIC}(f_1, f_N) \\ \mathrm{MIC}(f_2, f_1) & \mathrm{MIC}(f_2, f_2) & \cdots & \mathrm{MIC}(f_2, f_N) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{MIC}(f_N, f_1) & \mathrm{MIC}(f_N, f_2) & \cdots & \mathrm{MIC}(f_N, f_N) \end{bmatrix}$$
The elements in each column of the FF-MIC matrix are compared with the threshold value $T$, and the number of elements in column $j$ that are less than the threshold is denoted $q_j$; these counts form the set $Q$. The more elements in a column are less than the threshold, the weaker the correlation between that column's feature and the other features. The average of the counts is calculated, and the features whose count is greater than the average are selected:
$$Q = \{q_1, q_2, \ldots, q_N\}, \qquad \bar{q} = \frac{1}{N} \sum_{j=1}^{N} q_j, \qquad f_j \text{ is selected if } q_j > \bar{q}$$
where $Q$ is the set of the numbers of elements in each column that are less than the threshold $T$, and $\bar{q}$ is the average of these numbers.
All the selected features form a set $S = \{s_1, s_2, \ldots, s_m\}$, where $s_j$ is a selected feature, $j$ is the serial number of each selected feature, $m$ is the number of selected features, and $S \subseteq F$. When multiple types of signals are classified, the selected feature set of each type of signal is calculated separately, and the feature sets obtained from all signals are combined to form the feature matrix $F_{FF}$, $F_{FF} \subseteq F$.
The feature selection method based on FC-MIC selects the features that are strongly correlated with the signal categories by calculating the correlation between features and signal categories. The feature selection process is established as shown in Fig. 1, where the $x$-axis represents the fault category (each category is represented by a number), and the $y$-axis represents the feature values of all samples whose type is the value on the $x$-axis. The maximum information coefficients between the features and the signal categories are calculated, and the FC-MIC matrix of size $1 \times N$ is obtained:
$$\text{FC-MIC} = \big[\mathrm{MIC}(f_1, C),\ \mathrm{MIC}(f_2, C),\ \ldots,\ \mathrm{MIC}(f_N, C)\big]$$
The average value of all MIC values is taken as the threshold, and the features whose MIC is greater than the average are regarded as strongly correlated features and are selected. All strongly correlated features form the feature matrix $F_{FC}$, where $p$ is the number of selected features and $F_{FC} \subseteq F$.
The two parallel feature selection methods yield the feature matrices $F_{FF}$ and $F_{FC}$. The intersection of $F_{FF}$ and $F_{FC}$ is then taken to obtain the final feature subset $F_{sub}$, as shown in Eq. (20):
$$F_{sub} = F_{FF} \cap F_{FC} \tag{20}$$
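The two selection rules and their intersection can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the MIC values are taken as precomputed inputs, the per-signal FF selections are combined by set union (the text only says "combined"), and the threshold of 0.5 and the example matrices are hypothetical.

```python
def select_ff(ff_mic, threshold):
    """FF-MIC step for one signal type: keep features whose column has
    more below-threshold entries than the average column."""
    n = len(ff_mic)
    q = [sum(1 for i in range(n) if ff_mic[i][j] < threshold) for j in range(n)]
    q_mean = sum(q) / n
    return {j for j in range(n) if q[j] > q_mean}

def select_fc(fc_mic):
    """FC-MIC step: keep features whose MIC with the category exceeds the mean."""
    mean = sum(fc_mic) / len(fc_mic)
    return {j for j, value in enumerate(fc_mic) if value > mean}

def ff_fc_mic(ff_mic_per_signal, fc_mic, threshold):
    """Intersect the combined FF selection with the FC selection."""
    # Assumption: the per-signal FF selections are merged by union.
    ff_selected = set().union(*(select_ff(m, threshold) for m in ff_mic_per_signal))
    return sorted(ff_selected & select_fc(fc_mic))

# Hypothetical example with four features (indices 0..3) and one signal type.
ff_matrix = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.4],
    [0.2, 0.3, 1.0, 0.2],
    [0.1, 0.4, 0.2, 1.0],
]
fc_vector = [0.8, 0.3, 0.7, 0.2]
selected = ff_fc_mic([ff_matrix], fc_vector, threshold=0.5)
```

In this example the column counts are 2, 2, 3, 3 with a mean of 2.5, so FF-MIC keeps features 2 and 3; FC-MIC keeps features 0 and 2 (mean 0.5); the intersection is feature 2.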

The framework of fault diagnosis based on FF-FC-MIC
The different fault components in a coupled fault affect each other, so a single feature cannot support the diagnosis and recognition of a coupled fault, while the multi-dimensional feature set contains irrelevant information that affects the efficiency and accuracy of the diagnosis. To solve these problems, this paper proposes an adaptive feature selection method based on the maximum information coefficient. The technical route is shown in Fig. 2, and the steps are as follows.

[Fig. 2: flowchart of the technical route; recoverable block labels: "Feature subset", "Fault diagnosis"]
(1) Use the multi-step grid search method to optimize the displacement factor and frequency factor of the Gabor atomic dictionary to improve the noise reduction ability and computing efficiency of the sparse representation algorithm.
(2) Use the sparse representation algorithm of step (1) to decompose and reconstruct the original signal to reduce the noise interference of the coupled fault signal.
(3) Calculate the correlation between features and features through MIC, and obtain a set of weak correlation features between features to eliminate redundant features of multi-dimensional information.
(4) Calculate the correlation between features and signal categories through MIC to obtain a set of strong correlation features between features and signals, and eliminate irrelevant features.
(5) The FF-FC-MIC feature selection method is proposed for feature adaptive selection, and the selected features are input into SVM to complete the classification and identification of coupling faults.
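The paper does not spell out the multi-step grid search used in step (1). One common reading is a coarse-to-fine search, sketched below under that assumption: the grid is evaluated, then re-centred on the best point with a finer spacing. The objective `toy_cost`, the parameter ranges, and the grid sizes are hypothetical placeholders; in practice the objective would score the quality of the sparse reconstruction.

```python
def linspace(lo, hi, points):
    """Evenly spaced values from lo to hi inclusive."""
    step = (hi - lo) / (points - 1)
    return [lo + k * step for k in range(points)]

def multi_step_grid_search(objective, u_range, v_range, points=5, steps=3):
    """Coarse-to-fine minimisation of objective(u, v) over a 2-D box.
    Returns (best cost, best u, best v)."""
    (u_lo, u_hi), (v_lo, v_hi) = u_range, v_range
    best = None
    for _ in range(steps):
        candidates = [(objective(u, v), u, v)
                      for u in linspace(u_lo, u_hi, points)
                      for v in linspace(v_lo, v_hi, points)]
        step_best = min(candidates)
        if best is None or step_best[0] < best[0]:
            best = step_best
        # Shrink the box to one grid cell on each side of the best point,
        # clamped to the original search range.
        _, u_c, v_c = best
        du = (u_hi - u_lo) / (points - 1)
        dv = (v_hi - v_lo) / (points - 1)
        u_lo, u_hi = max(u_range[0], u_c - du), min(u_range[1], u_c + du)
        v_lo, v_hi = max(v_range[0], v_c - dv), min(v_range[1], v_c + dv)
    return best

def toy_cost(u, v):
    """Hypothetical objective with its minimum at u = 3.3, v = 0.72."""
    return (u - 3.3) ** 2 + (v - 0.72) ** 2

cost, u_best, v_best = multi_step_grid_search(toy_cost, (0.0, 10.0), (0.0, 2.0))
```

With three steps of a 5 x 5 grid, only 75 evaluations are needed, whereas a single uniform grid with the same final spacing would need many more; this is the efficiency argument for searching in several steps.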

Experimental simulation analysis
In this section, the effect of the proposed method is preliminarily verified by constructing simulation signals. As shown in Eqs. (21)(22)(23)(24), the bearing normal state, a single bearing fault, a single unbalance fault and an unbalance-bearing coupling fault are simulated respectively, and the additive noise term is a Gaussian white noise signal. Sixty groups of each of these four kinds of signals are simulated, 240 groups in total; for each signal, 30 groups are selected as training samples and 30 groups are used as test samples. Fig. 3 shows the unbalance-bearing coupling fault simulation signal and its signal composition. The improved sparse denoising algorithm is used to preprocess the simulated coupling fault signal and obtain a reconstructed signal with a high SNR, as shown in Fig. 4.
Table 1 describes the detailed information of the multi-dimensional features. Finally, an SVM is used for classification, with the radial basis function (RBF) as the kernel function, and the parameters of the SVM are optimized using a grid optimization algorithm. Fig. 5 shows the fault classification with a one-dimensional feature, the root mean square. Fig. 6 shows the fault classification with a two-dimensional feature, the root mean square value and kurtosis.
From Fig. 5 and Fig. 6, it can be seen that fault classification using the two-dimensional feature is better than that using a single one-dimensional feature. This shows that the multi-dimensional features of coupling faults express more comprehensive information, so the classification effect is better than that of a single feature. However, which features to select for fault classification remains a problem.
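The two features named above, root mean square and kurtosis, can be computed as in the following sketch. The exact definitions in Table 1 are not available here, so the non-excess kurtosis (fourth central moment divided by the squared variance) is an assumption, and the test signals are hypothetical.

```python
import math

def rms(x):
    """Root mean square of a sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def kurtosis(x):
    """Non-excess kurtosis: fourth central moment over squared variance."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 * m2)

# Hypothetical signals: a pure sine (smooth, unbalance-like) and the same
# sine with a periodic impulse added (bearing-defect-like).
sine = [math.sin(2 * math.pi * k / 100) for k in range(1000)]
impulsive = [v + (5.0 if k % 100 == 0 else 0.0) for k, v in enumerate(sine)]

features_sine = (rms(sine), kurtosis(sine))
features_impulsive = (rms(impulsive), kurtosis(impulsive))
```

For a sine over whole periods the RMS is $1/\sqrt{2}$ and the kurtosis is 1.5; adding impulses raises the kurtosis sharply, which illustrates why the two features carry complementary information about unbalance and bearing defects.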
The FF-FC-MIC method proposed in this paper is used to select suitable features. First, feature ordinals are selected by the FF-MIC method (weak feature-feature correlation) and the FC-MIC method (strong feature-category correlation) respectively, and then the intersection of these ordinals is calculated. The result of feature selection is shown in Table 2. Five features are selected using FF-FC-MIC, and the signal features are represented by the numbers 1-14.
[Table 2: feature selection results; recoverable row content: MIC 1,2,4,5,6,7,8,9,12,13,14 2,5,9,12,3,5,9,10,11,12,13]
The five features selected by FF-FC-MIC are input into the SVM for fault diagnosis. The FF-MIC method and principal component analysis (PCA) are compared with the proposed method. The results of full-feature classification without feature selection and the classification results of the three feature selection methods are shown in Fig. 7. It can be seen from Fig. 7 that the classification accuracy of the proposed FF-FC-MIC feature selection method reaches 100 %, and its classification time is the shortest, only 14.83 s. The simulation experiment shows that this method not only improves the accuracy of fault classification, but also reduces the time of fault classification.

Real experimental signal analysis
To analyze and verify the proposed method, rotor system faults were simulated on SQ company's mechanical fault comprehensive test bench. The rotor unbalance fault, load-end bearing fault, motor-end bearing fault and their coupling faults were simulated on the test bench. Fig. 8 shows the structure and fault settings of the experimental platform. The bearings used in the experiment were a load bearing of type ER-12K and a motor bearing of type 6203, and the unbalanced mass was composed of a screw and two gaskets mounted on the wheel. In the experiment, the motor speed was set at 1800 r/min and the sampling frequency was set at 12800 Hz. The vibration signal in this paper was collected by acceleration sensor B.
Three experimental cases were used to verify the proposed method, and each case was composed of signals of four different fault types. The detailed information on the signal fault types in the three experimental cases is shown in Table 3. The fault signals in case 1 are coupling faults between rotor unbalance and bearings; case 2 contains coupling fault signals between the motor bearing and the load bearing; case 3 includes a single rotor unbalance fault, a single bearing fault, an unbalance-bearing coupling fault, and a bearing-bearing coupling fault. 240 groups of signals were collected for each experimental case, including 60 groups of each type of fault signal (30 groups of training samples, 30 groups of test samples).
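As a quick check of the acquisition settings, the rotation frequency and the number of samples per shaft revolution follow directly from the stated motor speed and sampling frequency, and the sample counts follow from the stated group sizes. The symbols $f_r$, $f_s$ and $N_{rev}$ are introduced here only for this calculation.

```latex
\begin{align*}
f_r &= \frac{1800\ \text{r/min}}{60\ \text{s/min}} = 30\ \text{Hz}, \\
N_{rev} &= \frac{f_s}{f_r} = \frac{12800\ \text{Hz}}{30\ \text{Hz}} \approx 426.7\ \text{samples per revolution}, \\
\text{groups per case} &= 4 \times 60 = 240, \qquad
\text{training} = \text{test} = 4 \times 30 = 120 .
\end{align*}
```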

JOURNAL OF VIBROENGINEERING
The coupling fault signals in case 1 are analyzed experimentally using the method proposed in this paper. Firstly, the collected multi-type coupling fault signals were denoised with the improved sparse representation algorithm described in this paper. Figs. 9-12 show the reconstructed signals after denoising. It can be seen from Figs. 9-12 that the reconstructed signal preserves the useful impact features of the original signal while removing the useless components.
Fig. 15 shows the classification results of the unbalance-bearing coupling fault signals in case 1, whose features are selected by different feature selection methods and then classified by the SVM model. It can be seen from Fig. 15 that the classification accuracy of the 4-dimensional features selected by FF-FC-MIC in case 1 is 94.17 %, and the classification time is only 14.95 s. In addition, the PCA and FF-MIC feature selection methods, as well as the full feature set without feature selection, were used for comparative analysis, as shown in Fig. 15. Compared with the full-feature method without feature selection, the classification accuracy is the same, but the classification time of the FF-FC-MIC method is significantly reduced because useless features are removed by feature selection. Compared with the other two feature selection methods, FF-FC-MIC has clear advantages in maintaining accuracy and reducing classification time for unbalance-bearing coupling faults.
In the above analysis of three real cases of rotor system faults, the proposed method gives better classification results than the other methods. The experimental results show that this method can not only improve the accuracy of fault classification but also reduce the time of fault classification. Case 1 and case 2 mainly address the applicability of this method when complex coupling faults exist in the rotor system. Case 3 includes both single faults and coupled faults, and the classification accuracy still reaches 98.33 %. The three experiments show that the proposed method performs well in fault diagnosis of the rotor system.
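The reported accuracies can be related to sample counts with a short calculation. It assumes that accuracy is the fraction of correctly classified test samples and that each case has $4 \times 30 = 120$ test samples, as stated in the experimental setup; the numbers of correct classifications are inferred here and are not reported in the paper.

```latex
\begin{align*}
\frac{113}{120} &\approx 94.17\,\% \quad \text{(case 1: consistent with 7 misclassified test samples)}, \\
\frac{118}{120} &\approx 98.33\,\% \quad \text{(case 3: consistent with 2 misclassified test samples)} .
\end{align*}
```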

Conclusion
To address the problem that the features extracted from compound fault signals of bearings contain redundant and irrelevant features, this paper proposes a diagnosis method based on MIC adaptive feature selection. The simulation experiment and the experiments on real signals verify the superiority of the proposed FF-FC-MIC feature selection method. In the simulation experiment, because the simulated fault characteristics are obvious, single or low-dimensional features can also give a good fault classification effect. However, for the classification of complex coupling faults in measured signals, the characteristics of the coupled faults interfere with each other, so a single feature or randomly chosen low-dimensional features cannot achieve a good fault classification effect. Two sets of experiments on measured signals show that the FF-FC-MIC feature selection method can adaptively select multi-dimensional features, effectively eliminating redundant and irrelevant features. The comparison with other feature selection methods shows that the proposed method improves the classification accuracy of coupled fault diagnosis and reduces the time of fault classification, which verifies its effectiveness and practicability. The main conclusions are as follows:
1) The proposed method effectively removes redundant features and irrelevant features from the coupled fault feature information, reduces the feature dimension and reduces the time of fault classification.
2) The proposed method improves the accuracy of classification while reducing the feature dimension, which proves the superiority of the algorithm.
In the future, we hope to transfer this adaptive feature selection method to the residual life prediction of machinery. The selection of suitable features is a prerequisite for life prediction research, but current methods select features based on experience or on complex repeated trials, and the selected features are not necessarily optimal. We expect this work to contribute to research on residual life prediction.