A comprehensive review of mechanical fault diagnosis methods based on convolutional neural network

Mechanical fault diagnosis can prevent equipment failures from worsening and is important for the stable operation of mechanical equipment. First, this paper reviews three basic approaches to fault diagnosis and the common data-driven fault diagnosis methods, focusing on the characteristics and advantages of deep learning and convolutional neural networks (CNNs). Then, the basic structure and working principle of the CNN and some basic techniques for achieving better training results are introduced. Next, CNN-based fault diagnosis methods, their application scenarios, and their advantages and disadvantages are reviewed from the perspectives of data processing, data fusion, sample set construction, and so on. The related knowledge and concepts of transfer learning (TL) are then introduced, and the current application scenarios, advantages, and disadvantages of mechanical fault diagnosis techniques that combine transfer learning with convolutional neural networks are reviewed. Finally, the current difficulties and challenges of convolutional neural networks are discussed, and research directions for CNNs in the field of fault diagnosis are outlined. Although a number of similar reviews exist, this review aims to introduce the basic methods of fault diagnosis, leading to the basic applications of data-driven fault diagnosis, the use of CNNs in fault diagnosis, and the application scenarios, advantages, and disadvantages of combining TL and CNNs in fault diagnosis, as well as some open problems and prospects. It is intended to give researchers a basic understanding of the field.


Introduction
Fault diagnosis of mechanical equipment requires monitoring, diagnosing, and predicting the state of the equipment to ensure the stable operation of the machine [1][2][3]. As industrialization deepens and broadens, the safe and stable operation of mechanical equipment and its component mechanical systems is becoming more and more important in industrial production. In particular, the failure of large mechanical equipment usually brings huge economic losses and may cause heavy casualties. As mechanical equipment becomes more sophisticated and control systems become more complex, fault diagnosis of mechanical equipment is particularly necessary to maintain stable operation [4][5][6].
At present, mechanical equipment fault diagnosis methods mainly include methods based on physical models, methods based on signal processing, and data-driven methods [7][8][9][10], as shown in Fig. 1. The physical-model-based method obtains the running data of the mechanical equipment and analyzes it against the original physical model to determine the running state of the equipment. This method requires an in-depth understanding of the working principle of the equipment, and the more complex the equipment is, the more difficult it is to establish a complete physical model. The signal-processing-based method mainly uses a variety of filtering techniques to remove noise and highlight the fault signal. This method requires knowledge of the relevant fault characterization theory and mathematics, so it is difficult to popularize widely. Data-driven methods use data mining to find the deep fault feature representations behind the data and establish the mapping between the data and the faults in order to detect and identify the fault source [11]. The data-driven fault diagnosis process is shown in Fig. 2. Because these methods do not require deep professional knowledge and have a certain degree of intelligence, data-driven methods have good development prospects. As an early data-driven approach to mechanical fault diagnosis, machine learning has a wide range of applications and mature algorithms, such as the support vector machine (SVM), decision tree, random forest, logistic regression, naive Bayes, and neural network. Kumar A. et al. [12] proposed a multi-scale kernel support vector machine for rolling bearing fault diagnosis, which has higher accuracy and generalization ability than the traditional SVM. Wan et al. [13] proposed a fault identification method for rolling bearings based on random forest, which has higher recognition accuracy than the BP neural network and the K-nearest neighbor algorithm. Zhou et al. [14] used simulation software to build a diagnosis scheme for a diesel-electric hybrid power system based on a support vector machine; the fault recognition rate reached 98 %, which indicates that the SVM-based diagnosis algorithm can effectively identify faults. C Tutivén et al. [15] proposed an MDP-SVM fault discrimination method, which can identify multiple types of faults. Compared with the traditional algorithm, its accuracy reaches up to 95.397 %. Wang et al. [16] proposed an AdaBoost algorithm based on decision trees. Compared with the conventional algorithm, it has better generalization performance and needs fewer iterations to reach the same accuracy.
However, traditional machine learning methods require mathematical processing of the collected data to extract high-quality features before they can be applied to fault identification and diagnosis, which requires researchers to have certain data processing and signal analysis skills. Moreover, the generalization ability of the resulting model is weak, and the model can only solve specific problems. In general, traditional machine learning has the following disadvantages: (1) designing and extracting features requires professional knowledge and a mathematical foundation, and the result is strongly influenced by manual work; (2) the extracted features are shallow features with weak generalization ability; (3) the model training stage and the feature extraction stage are separated, so the whole pipeline cannot be optimized jointly; (4) the data processing capacity is weak, which makes it difficult to adapt to big data.
Deep learning does not need manual feature extraction. Data are fed directly into the model for training, and fault identification of mechanical equipment is achieved at the end. Depending on the network structure, stacked auto-encoders, recurrent neural networks, deep belief networks, convolutional neural networks, and others are widely used in mechanical fault diagnosis. For example, Gu et al. [17] proposed a fault diagnosis method based on multi-task deep learning. Compared with the single-task deep learning model, it has higher fault recognition accuracy and better anti-noise performance. Tran V. T. et al. [18] proposed a fault diagnosis method for industrial robots based on a deep belief network, and the results showed that the fault diagnosis accuracy was as high as 99.4 %. Park P. et al. [19] proposed a fault diagnosis method for AC motor systems based on a long short-term memory (LSTM) network, and the results show that the proposed method has a high fault recognition rate. Yan et al. [20] proposed an intelligent bearing fault diagnosis method based on an improved stacked auto-encoder, and the results showed that the fault recognition rate reached 98.93 %, which was better than the comparison method.
As a typical deep learning algorithm, CNN is widely used in machine vision, speech processing, and other fields [21]. Its wide application mainly stems from the following advantages: the multi-layer convolutional structure has powerful feature extraction ability; the pooling layer helps to prevent model overfitting; and feature extraction, feature selection, and classifier training are optimized jointly. The powerful feature extraction ability of CNN can uncover feature correspondences deep in the data. Although training on massive data leads to excessive parameters and slow iteration, CNN greatly reduces the number of model parameters through its local receptive field, weight sharing, and pooling operations, which not only improves the training speed but also helps prevent overfitting. These advantages make CNN well suited to the field of fault diagnosis.
Although CNN has achieved good results in mechanical fault diagnosis, introducing TL can further improve the fault recognition rate and the generalization ability of the model when only small samples are available. At present, transfer learning combined with CNN is considered to have great development potential in the field of mechanical fault diagnosis [22].
The existing literature usually reviews deep learning or transfer learning methods, theoretical architectures, and related books, or carries out classification experiments to verify the effectiveness of the described methods on problems in specific application objects, and then points out the problems and future development prospects of those methods [23][24][25]. For example, Qian et al. [26] take the rotating machinery of nuclear power plants as the object, introduce the deep learning algorithms and theoretical architectures in detail, and carry out comparative experiments of the same type to verify that the described models have better effectiveness and robustness. This paper summarizes the basic methods and development direction of fault diagnosis, which leads to the current hot research directions of machine learning and deep learning. Taking the convolutional neural network as an example of deep learning, it introduces the network's basic structure, principle, function, and application scenarios in combination with the literature. Because fault samples are often scarce, transfer learning is introduced, and its concept and basic methods are briefly described. The combination of transfer learning and convolutional neural networks is emphasized because of its functions, effects, and application scenarios in mechanical fault diagnosis. In this review, the development of CNN-based mechanical fault diagnosis in recent years is summarized in terms of basic concepts, basic methods, and application scenarios, rather than theoretical frameworks and mathematical derivations, to support the work of researchers.
In this paper, the development of mechanical fault diagnosis technology is reviewed, with an emphasis on data-driven fault diagnosis methods. Within the data-driven methods, the advantages and disadvantages of diagnosis methods based on traditional machine learning and on deep learning are analyzed one by one. Within the deep learning framework, the basic structure and working principle of convolutional neural networks are introduced, and their characteristics and advantages are analyzed. First, the CNN-based fault diagnosis process is introduced, and then CNN-based fault diagnosis methods are summarized in combination with research on faults in specific mechanical systems. Second, the principle of transfer learning is introduced, and the application of convolutional neural networks combined with transfer learning to mechanical system fault diagnosis is summarized. Finally, some conclusions are given, the current difficulties and challenges of convolutional neural networks are discussed, and research directions for CNN in fault diagnosis are outlined. As shown in Fig. 3, this paper is organized as follows: the second part introduces the basic structure and working principle of CNN; the third part covers mechanical fault diagnosis based on the convolutional neural network; the fourth part introduces TL (Transfer Learning) and details the combination of CNN and TL for mechanical fault diagnosis. Finally, some conclusions and prospects are given.

Basic principles of convolutional neural networks
The convolutional neural network is a kind of multi-layer feedforward neural network. Its local receptive field, weight sharing, pooling layer, and other structures can greatly decrease the number of model parameters and improve the training speed without losing expressive power. The structure of a typical convolutional neural network usually includes an input layer, convolutional layers, pooling layers, fully connected layers, and an output layer. The essence of stacking multiple convolution and pooling layers is to extract features from the original data repeatedly, and its function is equivalent to that of a filter. After several rounds of extraction, deep features that can be used for fault classification or recognition are gradually obtained. These features do not change with geometric transformations of the data. Because the network can make the data sparse, it is very suitable for deep learning on big data. Its basic structure is shown in Fig. 4.

Fig. 4. The framework of CNN
The input layer imports the collected information, after data processing, into the network in a form the network can accept. The quality of data processing usually directly affects the fault recognition rate. Convolutional layers mainly perform convolution operations, and different combinations of convolutional layers can extract deep features of the input; they are the core component of convolutional neural networks. In the pooling layer, downsampling is used to process the data again, which not only reduces the amount of computation but also reduces the risk of overfitting and speeds up convergence. There are two main pooling methods: max pooling and average pooling. After multiple convolutional and pooling layers, CNN usually adopts one or more fully connected layers as a high-level structure to carry out higher-level inference and classification. A softmax classifier is usually used for output.
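The layer sequence described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation of any cited work: the signal, the kernel, the dense weights, and all function names (`conv1d`, `max_pool1d`, `avg_pool1d`, `dense`, `softmax`) are hypothetical, the input is one-dimensional, and nothing is trained.

```python
import math

def conv1d(signal, kernel):
    """Valid cross-correlation, the operation CNN layers call convolution."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

def avg_pool1d(values, size):
    """Non-overlapping average pooling; a trailing partial window is dropped."""
    return [sum(values[i:i + size]) / size
            for i in range(0, len(values) - size + 1, size)]

def dense(features, weights, biases):
    """Fully connected layer: one weight row and one bias per output class."""
    return [sum(f * w for f, w in zip(features, row)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Turn class scores into probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical input signal and hand-picked (untrained) parameters.
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
kernel = [1.0, 0.0, -1.0]

feature_map = conv1d(signal, kernel)      # convolutional layer
pooled = max_pool1d(feature_map, 2)       # pooling layer (downsampling)
averaged = avg_pool1d(feature_map, 2)     # the alternative pooling method
weights = [[0.5, 0.5, 0.5], [-0.5, -0.5, -0.5]]
biases = [0.0, 0.0]
probs = softmax(dense(pooled, weights, biases))  # fully connected + softmax

print(feature_map, pooled, averaged, probs)
```

In a real network the kernel and dense weights would be learned by backpropagation, and several convolution and pooling stages would be stacked before the fully connected layers.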
The above is the basic structure of the convolutional neural network. After the network is built, data are imported into it for training. To obtain good training results, the following four aspects deserve special attention:
(1) Use of dropout. When the network performs well on the training set but fits the validation set very poorly, the model is overfitting. Dropout randomly drops some neurons in each layer of the network with a certain probability, so that the network structure is not the same in each iteration update, which helps avoid overfitting. Dropout is placed either in the input layer or in the fully connected layer, both of which can achieve good performance.
(2) Selection of the learning rate, which is equivalent to the step length of the learning process. For example, in gradient descent, if the step is too small, the minimum of the loss may not be reached even after many iterations; if the step is too large, the model may swing back and forth around the minimum. Only when the step is chosen properly can the network find the minimum quickly and well. At present, the learning rate is selected mainly from accumulated experience, without a fixed algorithm.
(3) Size of the convolution kernel. Only a kernel whose dimension is greater than 1 can act as a receptive field; the larger the kernel, the more parameters it has. For the same receptive field, smaller convolution kernels require fewer parameters and less computation.
(4)
The corresponding fault data can be collected by the sensor signal acquisition system through experiments or in real fault scenarios, and the data set is then constructed after preprocessing. However, since a single signal acquisition system cannot meet the requirements of complex mechanical fault diagnosis, multiple sensors are used to collect multiple signals. Data fusion is then carried out to complete the construction of the sample dataset, which improves the diagnostic accuracy for complex mechanical systems and supports the construction of a CNN-based fault diagnosis model.
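Two of the training considerations listed earlier in this section, the learning rate and the convolution kernel size, can be made concrete with a small sketch. Everything here is an assumption chosen for illustration: the toy loss f(x) = x², the three learning rates, and the function names `gradient_descent`, `receptive_field`, and `weight_count`. The kernel comparison assumes stride-1 two-dimensional convolutions with one input channel, one output channel, and no bias.

```python
def gradient_descent(lr, steps=20, x0=1.0):
    """Minimize the toy loss f(x) = x**2, whose gradient is 2x."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - lr * 2.0 * xs[-1])
    return xs

def receptive_field(kernel_sizes):
    """Receptive field (per axis) of stacked stride-1 convolutions."""
    return 1 + sum(k - 1 for k in kernel_sizes)

def weight_count(kernel_sizes):
    """Weights of stacked k x k kernels, single channel, no bias."""
    return sum(k * k for k in kernel_sizes)

too_small = gradient_descent(lr=0.001)  # barely moves in 20 steps
suitable = gradient_descent(lr=0.3)     # shrinks quickly towards 0
too_large = gradient_descent(lr=1.0)    # jumps between +1 and -1

print(too_small[-1], suitable[-1], too_large[-2:])

# Two stacked 3x3 kernels cover the same 5x5 region as one 5x5 kernel
# but use fewer weights.
print(receptive_field([3, 3]), weight_count([3, 3]))
print(receptive_field([5]), weight_count([5]))
```

On this toy loss the oscillation appears exactly at a learning rate of 1.0; for a real network the thresholds depend on the loss surface, which is one reason the learning rate is usually tuned empirically.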

Fault diagnosis method based on convolutional neural network
Although the generalization ability of the convolutional neural network has been greatly improved, a corresponding network still needs to be established for each fault diagnosis problem. A shallow network cannot extract deep features, so its fault diagnosis accuracy is not high. If the network is too deep, the results become worse, which is the problem of network degradation. So far, there is no universal convolutional neural network structure that can be applied to all fault diagnosis tasks, so the construction of the convolutional neural network model is particularly important.
The convolutional neural network is usually used to process image-type data. For mechanical fault diagnosis, some researchers transform the collected signals into two-dimensional images through data processing. For example, Che et al. [27] proposed a CNN-based fault diagnosis algorithm for rolling bearing cages to solve the problems of unstable vibration signals and difficult feature extraction. The fault recognition rate reached more than 99 %, with good generalization and robustness. Azamfar M. et al. [28] proposed a CNN-based fault diagnosis method for rolling bearings that addresses the non-stationarity, nonlinearity, and susceptibility to interference of rolling bearing signals and can change the dimension of the signal to suit the network input. The results show that the method can effectively identify fault types and has high stability. Piedad E. J. et al. [29] proposed a fault diagnosis method based on a two-dimensional convolutional neural network for rolling bearings under variable working conditions. The results show that, compared with a traditional convolutional neural network, this method significantly improves fault diagnosis accuracy and efficiency.
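As a rough illustration of turning a one-dimensional signal into a two-dimensional image, the sketch below min-max normalizes a segment to 0-255 gray levels and reshapes it row by row. This particular conversion is an assumption made for illustration; the text does not state which transformation the cited works [27]-[29] use (time-frequency representations are another common choice), and the function name `signal_to_image` is hypothetical.

```python
def signal_to_image(signal, size):
    """Map the first size*size samples to a size x size grid of gray levels."""
    n = size * size
    if len(signal) < n:
        raise ValueError("signal is too short for the requested image size")
    segment = signal[:n]
    lo, hi = min(segment), max(segment)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant segment
    pixels = [round(255 * (v - lo) / span) for v in segment]
    # Consecutive samples fill the image row by row.
    return [pixels[r * size:(r + 1) * size] for r in range(size)]

# Hypothetical 16-sample ramp standing in for a vibration segment.
ramp = [float(i) for i in range(16)]
image = signal_to_image(ramp, 4)
for row in image:
    print(row)
```

The resulting grid can be fed to a two-dimensional CNN in the same way as an ordinary grayscale image.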
Due to the translation invariance of the convolutional neural network, the data can be processed into two-dimensional image information for fault diagnosis. However, the collected one-dimensional temporal signals preserve the fault information of the machinery more completely, so some researchers have begun to feed the collected one-dimensional signals directly into the network for fault diagnosis. For example, Du et al. [30] proposed a fault diagnosis method for analog circuits based on a one-dimensional convolutional neural network, which realizes end-to-end fault diagnosis for analog circuits, effectively extracts deep fault features, and has higher classification accuracy and stability. Jin et al. [31] proposed a drill pipe fault diagnosis model based on a one-dimensional convolutional neural network, which can effectively identify drill pipe fault types with an average accuracy of 98.7 %. It also performs well under different working conditions and in noisy environments. Wu et al. [32] proposed a gearbox bearing fault diagnosis method based on adaptive noise cancellation and a one-dimensional convolutional neural network for the problem of strong vibration interference. The method can separate the periodic signal from the random signal, intelligently extract fault features from the random signal, and achieve high fault diagnosis accuracy under strong vibration interference. Niyongabo J. et al. [33] proposed a fault diagnosis method based on one-dimensional convolution and orthogonal regularization to address the poor fault diagnosis performance for industrial robots. The results show that, compared with existing methods, this method has a higher fault recognition rate and can effectively diagnose faults in industrial robots.
However, mechanical systems tend to be more complicated. By adopting data augmentation and expansion, multi-scale data processing, and multi-sensor fault signal acquisition, the fault types of mechanical systems can be identified more accurately. For example, An et al. [34] proposed a fault diagnosis method based on a multi-scale convolutional neural network for rolling bearings under large noise, variable load, and complex working conditions, which can effectively improve the fault diagnosis accuracy and enhance the robustness of the network. Li et al. [35] proposed a fault diagnosis method based on multi-scale one-dimensional deep convolution, aiming at the difficulty of fault feature extraction and recognition for electromechanical equipment. The results show that this method has a high fault recognition rate, high diagnosis accuracy, and strong robustness. Hasan M. J. et al. [36] proposed a bearing fault diagnosis method based on multi-sensor information fusion to solve the problem that mechanical faults of aero-engine bearings under multiple internal and external excitations are hard to diagnose. Compared with traditional fault diagnosis methods such as SVM and ANN, the accuracy of the method is improved by 36.92 % and 18.9 %, respectively, so it can effectively identify fault types with high accuracy. Other relevant literature is shown in the following Table 1.
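The multi-scale data processing mentioned above can be sketched as follows. One way to obtain several scales, assumed here purely for illustration, is to coarse-grain the signal by averaging non-overlapping windows of different lengths; the cited works may instead use parallel convolution branches with different kernel sizes, and the text does not specify which. The names `coarse_grain` and `multi_scale` and the chosen scales are hypothetical.

```python
def coarse_grain(signal, scale):
    """Average non-overlapping windows of length `scale` (partial tail dropped)."""
    return [sum(signal[i:i + scale]) / scale
            for i in range(0, len(signal) - scale + 1, scale)]

def multi_scale(signal, scales=(1, 2, 4)):
    """Return one coarse-grained view of the signal per scale."""
    return {s: coarse_grain(signal, s) for s in scales}

# Hypothetical short signal; each view would feed its own CNN branch.
signal = [1.0, 3.0, 5.0, 7.0, 2.0, 4.0, 6.0, 8.0]
views = multi_scale(signal)
for scale, view in views.items():
    print(scale, view)
```

Coarser views suppress high-frequency noise while finer views keep short transients, which is the usual motivation for combining them.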
Table 1
[37,38]; mine hoisting mechanism dynamic system fault diagnosis [39]; diesel engine misfire real-time fault diagnosis [40]; fan blade icing fault detection [41,42]; AC-DC transmission system fault diagnosis [43].
[66][67][68][69][70][71]: End-to-end fault diagnosis is realized by using the one-dimensional vibration signal directly, and the integrity of signal transmission is maintained. The original signal collected directly contains a lot of noise, so extracting hidden fault features from it places higher requirements on the network structure and the specific hyperparameters.
Variable convolution kernel: bearing fault identification [72][73][74][75]; fault diagnosis of numerical control machine tool ball screw pairs [76]; fault identification under data imbalance [77].

As mentioned in the above table, the convolutional neural network can be divided into one-dimensional convolution and two-dimensional convolution according to the dimension of the data processed. Researchers have achieved good results in fault diagnosis of complex mechanical systems based on these methods.
Part of the literature in the above table is selected, and the accuracies of the proposed models and the comparison models on the test set are examined and drawn as a line chart, as shown in Fig. 6.
In Fig. 6, the horizontal axis represents specific references and the vertical axis represents accuracy. "CNN variants" denotes a variant of the CNN framework, "CNN" denotes a plain convolutional neural network, and "ML" stands for machine learning methods such as SVM. The figure shows that deep learning frameworks represented by CNN generally perform better than machine learning in the field of fault diagnosis. In addition, machine learning usually requires manual extraction of fault features and generalizes poorly compared with end-to-end deep learning frameworks. For complex fault diagnosis, relying on a plain convolutional neural network alone generally cannot achieve good diagnostic results, so a specific CNN-based fault diagnosis framework has to be developed to achieve better results. However, developing new convolutional neural network structures for different fault problems costs both time and resources. Transfer learning can help the network accumulate prior knowledge and classify faults better even when fault samples are few. Existing research mainly assumes that the training and test samples are of high quality, sufficient in number, and identically distributed, but data collected in practice usually do not meet these conditions well. Transfer learning relaxes these restrictions: fine-tuning the parameters of a trained model can achieve high diagnostic accuracy with small samples while greatly reducing the training time of the network model. The application of transfer learning combined with deep learning to fault diagnosis has become one of the most promising research directions.

Fault diagnosis method based on convolutional neural network and transfer learning
As a classic and mature deep learning model, CNN is also limited by the requirements of identically distributed data and a sufficient number of training samples. Although convolutional neural networks have achieved good results in the field of fault diagnosis, introducing transfer learning can make them perform even better. To solve the problem of insufficient labeled sample data for fault diagnosis objects, researchers began to study the theory of transfer learning in the 1990s. Currently, transfer learning is widely used in fault diagnosis, and Stanford Professor Andrew Ng has said that transfer learning will be a driving force of machine learning in the future.

The concept and classification of transfer learning
There are two basic concepts in transfer learning: domain and task. The existing knowledge is called the source domain, and the corresponding task is called the source task. The knowledge to be learned and the corresponding task are called the target domain and the target task, respectively. A domain contains two parts: the sample set and its distribution in the feature space. A task likewise contains the labels and the decision function that maps feature samples to their labels.
Transfer learning extracts transferable structures or parameters from known source domains and source tasks and applies them to target tasks in another situation to solve problems (see Fig. 7). As shown in Fig. 8, transfer learning can be divided into different types according to the classification method used.
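The domain and task concepts described above are commonly formalized as follows. The symbols below (D, T, X, Y, P(X), f, and the subscripts S and T for source and target) are notation assumed here for illustration, following common usage in the transfer learning literature; they are not recovered from this paper's own figures or equations.

```latex
% A domain: a feature space and a marginal distribution over samples.
\begin{align}
\mathcal{D} &= \{\mathcal{X},\; P(X)\}, \qquad X = \{x_1, \dots, x_n\} \subset \mathcal{X} \\
% A task: a label space and a decision function learned from data.
\mathcal{T} &= \{\mathcal{Y},\; f(\cdot)\}, \qquad f : \mathcal{X} \to \mathcal{Y}
\end{align}
% Transfer learning: given a source pair (D_S, T_S) and a target pair
% (D_T, T_T) with D_S \neq D_T or T_S \neq T_T, use the knowledge in
% D_S and T_S to improve the target decision function f_T(\cdot).
\begin{equation}
(\mathcal{D}_S, \mathcal{T}_S) \;\longrightarrow\; f_T(\cdot) \ \text{on}\ \mathcal{D}_T,
\qquad \mathcal{D}_S \neq \mathcal{D}_T \ \ \text{or}\ \ \mathcal{T}_S \neq \mathcal{T}_T
\end{equation}
```

In fault diagnosis, a typical case is that the source and target domains share the same fault classes but differ in P(X), for example because the working condition or the machine has changed.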

Fault diagnosis based on CNN and transfer learning
Fault diagnosis based on transfer learning aims to accumulate the knowledge of a learned model and apply it to the target domain to improve learning efficiency. From the perspective of transfer learning, traditional machine learning methods, including the support vector machine and the artificial neural network, usually assume that the training set and the test set have the same feature distribution. Traditional machine learning starts from scratch whenever it meets a new task or a new field, and new fields usually do not have a large number of data samples. In transfer learning, the knowledge learned from the source domain is transferred to help solve the new target task, and learning efficiency is improved while adapting to small sample data.
The diagnosis process combining CNN and transfer learning is shown in the accompanying figure.
According to the classification of transfer learning methods, feature-based transfer learning mainly extracts features from the datasets of the source domain and the target domain and then maps the features into the same space to reduce the feature differences between the domains and find similar features. For instance, Liu et al. [78] proposed a rolling bearing fault diagnosis method using deep transfer learning and adaptive weighting to solve the problem that additional fault state samples affect the fault diagnosis accuracy. The results show that this method can overcome the influence of additional fault state samples. Its diagnostic accuracy is above 89 %, while that of the comparison method is below 80 %. Zhong S-s et al. [79], aiming at the inconsistency of bearing data feature distributions under variable working conditions, proposed a subdomain adaptive deep transfer learning fault diagnosis method. The results show that the average accuracy of this method is as high as 99 %, which is more effective than and superior to other methods. Zhao et al. [80] proposed a UATL (Unsupervised Adversarial Transfer Learning) bearing fault diagnosis method to solve problems such as the difficulty of obtaining bearing fault data labels and the weak generalization ability of the model. The results showed that the method had high diagnostic accuracy and good model generalization ability. The results show that this method can achieve high fault diagnosis accuracy using only a small number of samples, which is of value to the application of transfer learning in bearing fault diagnosis.
Hasan M. J. et al. [85] proposed a rolling bearing fault diagnosis method based on AlexNet and transfer learning to solve the problems that traditional bearing fault diagnosis methods require complex signal processing and expert knowledge and that fault data are scarce. The results show that the proposed method achieves 100 % diagnostic accuracy on the Case Western Reserve University Bearing Data Center dataset. At the same time, the proposed method still has high diagnostic accuracy when fault data are scarce, which is superior to existing advanced methods. Wang et al. [86] proposed a fault diagnosis method based on time-frequency analysis and transfer learning with the VGG19 network, aiming at the problem that bearing fault diagnosis relies on expert experience to extract features manually. The results showed that the diagnostic accuracy of this method was 5.42 % higher than that of the comparison method. At the same time, the validity of the method in signal processing applications is verified, and it can solve the problem of fault diagnosis with small samples. Hakim M. et al. [87] proposed a rolling bearing fault diagnosis method based on a one-dimensional CNN with multi-source domain transfer learning to address the dependence of mechanical equipment fault diagnosis on complete data and the scarcity of actual fault data. The results show that the classification accuracy of the proposed method is significantly higher than that of traditional fault diagnosis methods when fault data are sparse, and it has a faster convergence speed and better stability.
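The idea behind feature-based transfer described earlier in this section, reducing the feature differences between the source and target domains, can be illustrated with a toy discrepancy measure. The sketch below uses the squared distance between domain feature means, which corresponds to the maximum mean discrepancy with a linear kernel, and a simple per-domain centering as a stand-in for a learned feature mapping. Both choices are assumptions made for illustration; the text does not state which measure or mapping the cited works [78]-[80] use, and the function names and data are hypothetical.

```python
def mean_vector(samples):
    """Per-dimension mean of a list of equally sized feature vectors."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]

def linear_mmd(source, target):
    """Squared distance between feature means (MMD with a linear kernel)."""
    ms, mt = mean_vector(source), mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(ms, mt))

def center(samples):
    """Toy alignment step: subtract the domain mean from every sample."""
    m = mean_vector(samples)
    return [[v - mi for v, mi in zip(row, m)] for row in samples]

# Hypothetical 2-D features from two working conditions.
source = [[1.0, 2.0], [3.0, 4.0]]
target = [[4.0, 6.0], [6.0, 8.0]]

before = linear_mmd(source, target)
after = linear_mmd(center(source), center(target))
print(before, after)
```

In a deep feature-based method, a discrepancy term of this kind is typically added to the classification loss so that the network itself learns features on which the two domains look alike.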
With the advantages of transfer learning, the application of CNN to mechanical fault diagnosis has attracted more and more attention from researchers. A convolutional neural network model is trained with a large amount of labeled data, and its network structure or model parameters are extracted for another situation in which fault sample data are usually scarce and a high-quality fault diagnosis model is therefore difficult to train. The trained source domain model is migrated to the new target domain for fault identification or diagnosis, which not only reduces the training time but also improves the recognition accuracy. The combination of CNN and transfer learning provides technical support for fault diagnosis in new fields and under new working conditions. Other relevant literature is shown in the following Table 2.
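The model-based transfer described above, migrating a trained source model and adapting it with a few target samples, can be sketched with a deliberately tiny model. This is a toy under stated assumptions, not the procedure of any cited work: the "network" is a single frozen feature weight followed by a trainable linear head, the loss is mean squared error, and the parameter values, data, and function names (`predict`, `mse`, `fine_tune_head`) are all hypothetical.

```python
def predict(params, x):
    """Frozen feature extractor followed by a linear head."""
    feature = params["feat_w"] * x
    return params["head_w"] * feature + params["head_b"]

def mse(params, data):
    """Mean squared error over (input, target) pairs."""
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)

def fine_tune_head(params, data, lr=0.05, epochs=500):
    """Gradient descent on the head only; the transferred feature weight is frozen."""
    feat_w = params["feat_w"]
    head_w, head_b = params["head_w"], params["head_b"]
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            feature = feat_w * x
            err = head_w * feature + head_b - y
            grad_w += 2.0 * err * feature / n
            grad_b += 2.0 * err / n
        head_w -= lr * grad_w
        head_b -= lr * grad_b
    return {"feat_w": feat_w, "head_w": head_w, "head_b": head_b}

# Hypothetical parameters "trained" on a data-rich source domain.
source_model = {"feat_w": 2.0, "head_w": 1.0, "head_b": 0.0}
# Only three labeled samples are available in the target domain.
target_data = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]

target_model = fine_tune_head(source_model, target_data)
print(mse(source_model, target_data), mse(target_model, target_data))
```

With a real CNN the same pattern applies at a larger scale: the convolutional layers copied from the source model are kept fixed or updated with a small learning rate, and only the final layers are retrained on the scarce target data.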
As mentioned in the above table, researchers mainly study fault diagnosis in different situations and categories using feature-based and model-based transfer learning. In most cases, convolutional neural networks are used to extract features and train models, and migrating the network models to the target domain helps identify fault types there.
Part of the literature in the above table is selected, and the accuracies of the proposed models and the comparison models on the test set are examined and drawn as a bar chart, as shown in Fig. 10. The comparison indicates that the deep learning framework can be used to extract fault features and migrate the network structure to a new working condition, and that better fault diagnosis results can then be obtained by fine-tuning the network. In particular, the fault diagnosis method combining CNN and TL can effectively identify fault types in mechanical and other fields. Although transfer learning theory has been successfully applied in practice, it still needs further research.

Conclusions
As a mature deep learning model, the convolutional neural network has been widely used in the field of mechanical fault diagnosis. This paper introduces the basic structure and principle of CNN and summarizes the characteristics of its application in fault diagnosis in recent years. The concepts of transfer learning and the application of transfer learning combined with convolutional neural networks in fault diagnosis are then introduced. Finally, some remaining difficulties are described, and future development directions are outlined. Although convolutional neural networks and transfer learning have achieved good results in the field of mechanical fault diagnosis, some problems still require further study.

Difficulties
As a classic deep learning model, the convolutional neural network was introduced into the field of mechanical equipment fault diagnosis long ago. Although some achievements have been made, the characteristics of mechanical fault diagnosis and a number of difficulties still limit the further development of convolutional neural networks in this field. These characteristics and difficulties are as follows:
(1) The mechanical system is huge and complex, and faults form at different levels.
(2) Faults and features are not related by a simple linear correspondence but by a complex nonlinear mapping.
(3) Fault signals contain many interference factors, fault data are scarce, and the set of fault modes is incomplete.
(4) The amount of data is too large; limited by the hardware, the information processing capacity of the computer is insufficient.

CNN fault diagnosis based on data imbalance
A large amount of data can effectively train network models. At present, however, the construction of data sets faces the following problem: simulated fault data are plentiful, whereas actual fault data are scarce. This is because equipment cannot continue to operate once a fault occurs without risking a major accident, so most of the data actually monitored are normal. The imbalance between fault data and normal data makes network training difficult. A generative adversarial network can be used to simulate fault data and thereby ease the training difficulties caused by data imbalance. Transfer learning can also be used to transfer a similar network structure to the target task and fine-tune the network with small samples to solve fault identification for the target task.
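As a much simpler stand-in for the generative approach mentioned above, class imbalance can also be handled by re-weighting or resampling the classes. The following is a minimal sketch and not a method taken from the reviewed literature: `class_weights` and `random_oversample` are hypothetical helper names, and the 8:2 split between normal and fault samples is made up purely for illustration.

```python
import random
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: rare classes get proportionally larger weights."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are equally sized."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for c, m in counts.items():
        pool = [x for x, y in zip(samples, labels) if y == c]
        for _ in range(target - m):
            out_x.append(rng.choice(pool))
            out_y.append(c)
    return out_x, out_y


# Illustrative (made-up) monitoring record: 8 normal samples, 2 fault samples.
labels = ["normal"] * 8 + ["fault"] * 2
samples = list(range(len(labels)))  # placeholders standing in for signal segments

weights = class_weights(labels)      # could scale the per-class loss during training
balanced_x, balanced_y = random_oversample(samples, labels)
```

Class weights leave the data untouched and act on the loss, whereas oversampling changes the sample set itself; unlike a generative model, neither adds new information about the fault class.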

Migration application of CNN diagnostic system in the same type of devices
The successful application of a CNN fault diagnosis system requires a large amount of training data, and it is unrealistic to carry out a large number of fault tests on every device. Therefore, if data mapping between devices of the same type can be realized, so that fault pattern recognition can be migrated between them through mutual reconstruction, the CNN fault diagnosis system can be popularized in the field of fault diagnosis.

Application of CNN fault diagnosis method based on multi-sensor information fusion
With the increasing complexity of mechanical equipment, a single sensor can no longer fully describe a fault type. Collecting fault information from multiple sensors and using it jointly can improve the accuracy of fault identification. However, some problems remain in data fusion for multiple sensor types. At present, there is no optimal method for multi-sensor information fusion, and each fusion level has its own limitations. The appropriate fusion method also differs from one fault task to another. Therefore, the application of CNN fault diagnosis based on multi-sensor information fusion is still a major challenge.
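To make the notion of fusion levels concrete, the sketch below contrasts data-level fusion (stacking standardized raw channels as a multi-channel CNN input) with feature-level fusion (concatenating per-sensor features). It is an illustration under assumptions: the function names, the choice of RMS and peak value as features, and the two toy channels are hypothetical and are not drawn from any cited method.

```python
import math


def zscore(channel):
    """Standardize one sensor channel to zero mean and unit variance."""
    mean = sum(channel) / len(channel)
    std = math.sqrt(sum((v - mean) ** 2 for v in channel) / len(channel))
    if std == 0.0:
        return [0.0] * len(channel)
    return [(v - mean) / std for v in channel]


def data_level_fusion(channels):
    """Stack standardized channels into a [n_sensors][n_samples] array."""
    length = min(len(c) for c in channels)  # truncate to the shortest record
    return [zscore(c[:length]) for c in channels]


def feature_level_fusion(channels):
    """Concatenate simple per-sensor features (RMS and peak) into one vector."""
    fused = []
    for c in channels:
        rms = math.sqrt(sum(v * v for v in c) / len(c))
        peak = max(abs(v) for v in c)
        fused.extend([rms, peak])
    return fused


# Two toy channels with different scales and lengths (hypothetical sensors).
vibration = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
current = [10.0, 12.0, 10.0, 8.0]

stacked = data_level_fusion([vibration, current])      # input for a 2-channel CNN
features = feature_level_fusion([vibration, current])  # input for a classifier
```

Data-level fusion requires the channels to be aligned and of equal length, which is why the sketch truncates and standardizes them; feature-level fusion avoids that constraint but depends on the chosen features.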

Fig. 3. The structure of this paper

Selection of activation function: The nonlinear activation function can enhance the fitting ability of the network. The most commonly used activation function at present is ReLU, which alleviates the vanishing gradient problem in network training and speeds up training.

3. Mechanical fault diagnosis based on convolutional neural network

3.1. The basic process of fault diagnosis based on a convolutional neural network

Owing to the excellent performance of the convolutional neural network in many fields, researchers have applied it to fault diagnosis. The general steps are: first, defining the fault modes; second, constructing the data sets; third, preprocessing the data; fourth, building the convolutional neural network model; fifth, training the model; sixth, feeding the test samples into the model for evaluation and optimization; and finally, obtaining the fault diagnosis results. The process structure is shown in Fig. 5.

Fig. 5. The process of fault diagnosis

To construct a CNN-based fault diagnosis model, the corresponding fault data set must be established first. The quality of the data fed into the CNN model directly affects the result of fault diagnosis. Different diagnosis objects require different fault signals. The corresponding fault data can be collected by a sensor signal acquisition system, either through experiments or in real fault scenarios, and the data set is completed after preprocessing. However, since a single signal acquisition system cannot meet the requirements of complex mechanical fault diagnosis, multiple sensors are used to collect multiple signals. Data fusion is then carried out to complete the construction of the sample dataset, which improves the diagnostic accuracy for complex mechanical systems and supports the construction of a convolutional neural network fault diagnosis model.
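The dataset construction step described above can be sketched as follows: each recorded signal is cut into fixed-length windows, each window is normalized, and every window inherits the label of its record. This is a minimal sketch under assumptions; the helper names, the window length and step, and the use of min-max normalization are illustrative choices, not settings prescribed by the reviewed methods.

```python
def segment(signal, window, step):
    """Cut a 1-D signal into (possibly overlapping) windows of equal length."""
    return [signal[s:s + window]
            for s in range(0, len(signal) - window + 1, step)]


def min_max_normalize(sample):
    """Scale one window to the range [0, 1]; a constant window maps to zeros."""
    lo, hi = min(sample), max(sample)
    if hi == lo:
        return [0.0] * len(sample)
    return [(v - lo) / (hi - lo) for v in sample]


def build_dataset(records, window, step):
    """records: list of (signal, label) pairs -> (samples, labels)."""
    samples, labels = [], []
    for signal, label in records:
        for seg in segment(signal, window, step):
            samples.append(min_max_normalize(seg))
            labels.append(label)
    return samples, labels


# Two made-up recordings of 10 points each, one per condition.
records = [
    ([float(i) for i in range(10)], "normal"),
    ([0.0, 5.0] * 5, "fault"),
]
samples, labels = build_dataset(records, window=4, step=2)
```

A step smaller than the window makes neighbouring samples overlap, which increases the number of samples obtained from a short recording at the cost of correlation between them.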

Fig. 7. The concept of transfer learning

Fig. 8. The classification of transfer learning

Usually, the collected source domain signals are input into the established convolutional neural network model. The convolutional neural network is trained until a satisfactory result is obtained, and the parameters or network structure of the CNN are then transferred to a newly built fault diagnosis network. The collected fault signals of the target domain are input into the newly built network, which helps the target domain complete fault diagnosis and reduces training time when fault data in the target domain are scarce.
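The procedure above can be illustrated with a deliberately small sketch in plain Python. It assumes that the transferred part is a frozen convolutional feature extractor and that only a new classification head is fine-tuned on a few target-domain samples. The two hand-picked kernels merely stand in for parameters learned on a source domain, and all names and signals are hypothetical.

```python
import math


def conv1d_valid(signal, kernel):
    """Valid-mode 1-D cross-correlation, the operation used in CNN layers."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def extract_features(signal, kernels):
    """Frozen extractor: convolution -> ReLU -> global max pooling per kernel."""
    return [max(max(0.0, v) for v in conv1d_valid(signal, k)) for k in kernels]


def fine_tune_head(features, labels, epochs=200, lr=0.5):
    """Train only a logistic-regression head with SGD; kernels stay untouched."""
    w, b = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            grad = 1.0 / (1.0 + math.exp(-z)) - y  # d(cross-entropy)/dz
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


def predict(signal, kernels, w, b):
    """Return 1 (fault) or 0 (normal) for one signal."""
    x = extract_features(signal, kernels)
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


# ASSUMPTION: these kernels stand in for parameters learned on the source domain.
source_kernels = [[-1.0, 2.0, -1.0], [1 / 3, 1 / 3, 1 / 3]]


def make_signal(phase, spike_at=None):
    """Toy target-domain signal: small sinusoid, optionally with an impulse."""
    s = [0.1 * math.sin(2 * math.pi * i / 8 + phase) for i in range(32)]
    if spike_at is not None:
        s[spike_at] += 1.0
    return s


# Only a few labeled target-domain samples (0 = normal, 1 = fault).
train = [(make_signal(0.0), 0), (make_signal(1.0), 0),
         (make_signal(0.5, 10), 1), (make_signal(2.0, 20), 1)]
feats = [extract_features(s, source_kernels) for s, _ in train]
w, b = fine_tune_head(feats, [y for _, y in train])

test_signals = [(make_signal(3.0), 0), (make_signal(1.5, 5), 1)]
predictions = [predict(s, source_kernels, w, b) for s, _ in test_signals]
```

Because the kernels are reused unchanged, only three head parameters are fitted here, which is why a handful of target samples suffices in this toy setting; fine-tuning some or all transferred layers is the alternative when the target domain differs more strongly.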

Fig. 9. The fault diagnosis of CNN combined with transfer learning

Model-based transfer learning uses the source domain dataset to train the network and then transfers the network structure or parameters to the target domain, so that parameters are shared between the source domain and the target domain. Kumaresan S. et al. [81], addressing the problem of insufficient training data for fault diagnosis, proposed an intelligent fault diagnosis method for rolling bearings based on the combination of long short-term memory and transfer learning. The results show that this method is more intelligent than traditional fault diagnosis methods in identifying various fault categories and has better accuracy and generalization ability. Xia et al. [82] proposed a spacecraft fault diagnosis method based on deep transfer learning to address the small number of spacecraft telemetry data samples, the high noise, and the difficulty of fault identification in traditional fault diagnosis. The results show that this method can quickly and accurately identify spacecraft fault types. Qin et al. [83] proposed a fast fault diagnosis algorithm based on transfer learning and a deep residual network to address the heavy training load and long training time of existing deep learning methods for rolling bearing fault diagnosis. The results showed that, in experiments on the Case Western Reserve University and Paderborn University datasets, the algorithm achieves higher diagnostic accuracy and shorter training time than traditional fault diagnosis algorithms, so it can be used for the rapid diagnosis of bearing faults in a practical environment. Udmale S. S. et al. [84], addressing the problem of insufficient training samples in real environments, proposed a bearing diagnosis method based on small-sample transfer learning. The results show that this method can achieve high fault diagnosis accuracy using only a small number of samples, which is of value for the application of transfer learning in bearing fault diagnosis. Hasan M. J. et al. [85] proposed a rolling bearing fault diagnosis method based on AlexNet and transfer learning to address two limitations of traditional bearing fault diagnosis methods: they require complex signal processing and expert knowledge, and fault data are scarce. The results show that the proposed method achieves 100% diagnostic accuracy in the Case Western Reserve University

Fig. 10. Reference method bar chart

In Fig. 10, the horizontal axis represents specific references and the vertical axis represents accuracy. DL denotes the combination of transfer learning with a deep learning framework represented by CNN; ML denotes the combination of transfer learning with machine learning. The figure shows that the DL results are generally more accurate than the ML results. This indicates that end-to-end fault diagnosis carried out by a deep learning framework represented by CNN has an advantage over machine learning models that apply transfer learning after manual feature extraction, mainly because end-to-end fault diagnosis can optimize feature extraction and fault classification as a whole. In machine learning, by contrast, feature extraction and fault classification are carried out separately, so the two cannot be optimized jointly during transfer learning, which results in lower accuracy.

Table 1. Convolutional neural network classification table

Table 2. The classification of transfer learning