Abstract
Autonomous underwater vehicles (AUVs) are indispensable equipment in underwater detection, surveying, and investigation. As the thruster is the main power source of an AUV, accurate and timely thruster fault diagnosis plays a key role in ensuring safe navigation. However, this task is challenging due to the complexity, vagueness, and randomness of underwater settings. To address these issues, a hybrid diagnosis model named CLSTM-CNN-Attention is proposed, which combines a convolutional long short-term memory (CLSTM) network, a convolutional neural network (CNN), and an attention mechanism. Specifically, an improved hybrid network connects the long short-term memory (LSTM) network with the CNN in parallel, so that the time-related and space-related fault information of the input signal can be captured simultaneously. A new linear rectification function is also introduced into the hybrid model to enhance its anti-interference capability. Finally, the diagnostic performance of the hybrid model is further improved by adding an attention mechanism, which focuses better on the fused information. Experimental and comparison results indicate that the proposed approach has remarkable interference suppression capacity and outperforms other relevant methods in fault diagnosis of AUV thrusters.
1. Introduction
AUVs can autonomously complete various tasks such as seabed topography measurement, marine life survey, seabed resource exploration, and underwater rescue without human intervention. Their operating environment is extremely harsh, with high pressure, corrosion, and complex environmental interference. Because of this harsh environment and the difficulty of maintaining a failed thruster, a thruster failure can have disastrous consequences for the entire AUV. The safety of the AUV thruster has therefore become the most important factor for normal operation [1-2]. Studying timely and effective fault diagnosis methods for AUV thrusters has important safety and economic significance.
Currently, there are two principal approaches to failure detection of AUVs, namely the model-based approach and the data-driven approach. The former often relies on human experience and expert knowledge, which is challenging due to the complexity of the AUV's real-world operating environment [3]. In contrast, data-driven methods do not require pre-modeling [4]; they learn from and analyze large amounts of actual data to extract fault features, which is more flexible and feasible for practical engineering applications in AUV fault diagnosis. Various data-driven fault diagnosis methods for AUVs have emerged in recent years. Normally, multi-source information can increase the diagnosis accuracy of the data-driven approach [5, 6]. To realize AUV fault diagnosis using multi-source information, a hierarchical attention mechanism based on multi-source data fusion is presented, which includes an encoder-decoder network and a fusion network integrating the encoder and the attention mechanisms [7]. A diagnostic network for AUVs is proposed by integrating wide convolutional neural networks with extreme learning machines [8], which does not rely on domain-adaptation algorithms or require target-domain information. A genetic algorithm-supported ensemble learning method is proposed [9], which merges the genetic algorithm (GA) with ensemble learning (EL) to detect AUV faults. Besides, an intelligent data-based online AUV anomaly-detection system is proposed, which combines a new LSTM with a signal processing method, namely VAE [10], and whose advantage over the model-based scheme is also verified. Some researchers integrate phase-space reconstruction with extreme learning machines to diagnose sensor faults of AUVs [11]. A complex network is presented to identify the navigational states of AUVs [12]. A fault diagnosis model for AUVs is built by integrating a neural network with an attention mechanism [1], in which a bidirectional gated recurrent unit and a multi-layer perceptron network are employed.
A meta self-attention multi-scale convolutional neural network (MSAMS-CNN) for diagnosing AUV thruster faults is proposed [13], which takes the two-dimensional frequency spectrum of the collected vibration signal as the network input and can effectively detect AUV thruster faults even with a small number of fault samples. An unsupervised fault diagnosis framework for the underwater thruster system is proposed, which uses estimated torques and a multi-head convolutional autoencoder to autonomously extract discriminative characteristics from unclassified multi-scale inputs [14]. An RNN is used to identify the fault pattern of the underwater propulsion system, and a terminal sliding mode observer (SMO) is designed to further strengthen its fault identification ability [15]; the effectiveness of the improved model is validated through sea trials. A combined framework for recognizing underwater thruster malfunctions is put forward by fusing physical models and generative adversarial networks (GANs), which incorporates the voltage data and thruster moment into the GAN's architecture and cost function [16]. However, most of the above-mentioned methods were tested in a stable state and did not fully consider the various interference factors and the effects of different operating conditions. Such interference and changes in operating conditions may reduce the models' applicability, thereby greatly reducing their accuracy and reliability.
From the above, it can be seen that data-driven methods have been employed broadly in fault diagnosis. They are also used in other areas [17-19]. For example, in the aviation field [20-22], data-driven methods have made significant progress, driving the rapid development of flight control systems, trajectory planning, and drone technology. By adopting advanced algorithms and models, these technologies can better address various challenges in the aerial environment. Although the above data-driven methods have achieved satisfactory results, they face numerous interferences from the complex marine environment. In addition, environmental noise in the ocean caused by factors such as wind, rain, and ships poses added challenges to the effectiveness of data-driven methods in fault diagnosis [23]. Therefore, the above methods may not be suitable for dealing with such interference, which greatly limits their scope of application. To address this problem, and inspired by the above-mentioned references, an intelligent diagnostic technique for AUV thrusters based on a hybrid deep neural network model, named CLSTM-CNN-Attention, is proposed. The proposed method integrates multiple neural networks to extract the temporal and spatial features of the input signal simultaneously, thereby significantly improving the utilization of signal features. In addition, the ability of the proposed model to suppress interference is further enhanced by improving the algorithm, which addresses the interference problems that AUVs face in marine environments. The main contributions of this paper are as follows:
1) The CLSTM network is proposed by introducing convolution operation into traditional LSTM. Besides, on the basis of the serial mixing of CLSTM and CNN, a new linear rectification function is introduced into the hybrid model to further enhance its ability in resisting noise.
2) CLSTM is connected in parallel with the CNN network to capture the temporal and spatial fault information of input signals simultaneously. An attention mechanism is also introduced into the hybrid model to better focus on the autonomously fused information, further improving its diagnostic performance.
3) The end-to-end strategy is adopted by the proposed method, and its effectiveness and superiority over the other related methods are verified through experiment and comparison.
The impetus for marrying CLSTM, CNN and Attention is three-fold.
Firstly, CLSTM alone loses fine spatial fault textures because its internal gating operates on vectorised sequences, not local receptive fields.
Secondly, a pure CNN cannot model long-range temporal dependencies, so faults occurring intermittently or at low speeds are easily missed.
Thirdly, neither of the above two architectures explicitly weights the most discriminative time-frequency bins when strong oceanic noise masks the signature.
The parallel architecture of CLSTM and CNN represents not only a simple aggregation of the strengths of both networks, but also a response to the intrinsic characteristics of fault signals. Specifically, the fault features of an AUV propeller manifest temporally as a dynamic evolutionary process (such as persistent variations in vibration amplitude and periodic disturbances), whilst spatially they exhibit local transient patterns (such as high-frequency impacts and local harmonic distortions). By incorporating convolutional operation into LSTM to form CLSTM, the CLSTM preserves the LSTM's capacity for modelling long-range temporal dependencies while simultaneously perceiving the local spatial structure of input sequences. This enables effective capturing of fault evolution patterns across the temporal dimension. Concurrently, the CNN leverages its local receptive fields and weight-sharing mechanism to focus on extracting local defect patterns within the spatial dimension of the signal. When CLSTM and CNN are combined in parallel, an attention mechanism adaptively weights the fused features. This approach not only enables simultaneous utilisation of temporal and spatial information but also enhances the model’s ability to focus on critical fault features amidst strong noise. Compared to standalone CLSTM or CNN baselines, the unified model could achieve higher accuracy.
Unlike pure informatics research that focuses on algorithm innovation, this study is rooted in the urgent engineering needs of AUV thruster fault diagnosis in practical marine operations. In engineering scenarios, AUVs face three core pain points that restrict fault diagnosis reliability: (1) Harsh marine environmental interference: Random noise from waves, wind, ships, and biological activities (0-150 dB) masks fault signatures, leading to misdiagnosis in traditional methods. (2) Diverse and realistic fault modes: Practical faults (e.g., propeller imbalance caused by marine debris, chute loosening due to long-term vibration) are non-idealized, requiring fault simulation that aligns with engineering reality. (3) Strict requirements for diagnostic practicality: Engineering applications demand high accuracy (to avoid false alarms/missed diagnoses), fast convergence (for on-board real-time deployment), and compatibility with common industrial acquisition hardware. To address these engineering challenges, the proposed CLSTM-CNN-Attention model is not a pure algorithmic combination but a tailored solution for engineering scenarios, with each design linked to specific engineering problems.
The remaining parts of the paper are arranged as follows: Section 2 presents the basic theory and flow chart of the proposed method. Section 3 describes the experiments that verify the effectiveness of the proposed approach and its advantages over other related methods. Section 4 concludes the paper.
2. Basic theory
2.1. Submodules of the hybrid model
The hybrid model consists of three modules, namely the CLSTM, CNN, and attention modules. LSTM is an enhanced version of the recurrent neural network (RNN) that effectively addresses the vanishing gradient problem of the RNN [24]. LSTM can capture the evolving temporal patterns and long-range dependencies of input sequences and thus extract their temporal feature information effectively. In this study, the LSTM network shown in Fig. 1 is used, in which the gate architecture is composed of three distinct gates: the forget gate, the input gate, and the output gate. As Fig. 1 shows, the result generated by each of the three gates is passed to a multiplication element, which controls the input and output of the information flow and the state of the cell unit. In this paper, a convolutional layer is applied before the LSTM to strengthen its anti-interference ability, which yields the CLSTM structure. The corresponding mathematical expression of CLSTM can be expressed as follows:
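$$\tilde{x}_t = W_x * x_t + b_x,$$
$$f_t = \sigma\left(W_f\,[h_{t-1}, \tilde{x}_t] + b_f\right),$$
$$i_t = \sigma\left(W_i\,[h_{t-1}, \tilde{x}_t] + b_i\right),$$
$$o_t = \sigma\left(W_o\,[h_{t-1}, \tilde{x}_t] + b_o\right),$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tanh\left(W_c\,[h_{t-1}, \tilde{x}_t] + b_c\right),$$
$$h_t = o_t \odot \tanh(c_t),$$
where $*$ represents the convolution operation, $\tilde{x}_t$ represents the data obtained by convolving the input data $x_t$, $c_t$ is the state of the storage unit at the moment $t$, $h_t$ is the output of the storage unit at time $t$, $W$ and $b$ represent the weight values and bias elements respectively, $\sigma$ and $\tanh$ are the excitation functions, and $f_t$, $i_t$, and $o_t$ are the outputs of the forget, input, and output gates at time $t$.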
Fig. 1. LSTM structure
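As a rough illustration of the gate structure described above, the following pure-Python sketch applies a pre-convolution to a toy signal and then runs a scalar LSTM step over the convolved features. The scalar state, the weight values, the toy signal, and the names `conv1d_valid` and `lstm_step` are illustrative assumptions and do not reproduce the authors' implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def conv1d_valid(signal, kernel, stride=1):
    """Pre-convolution: slide `kernel` over `signal` without padding."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(0, len(signal) - k + 1, stride)]

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step with scalar input and state.

    `params` maps a gate name to a tuple (w_x, w_h, b).
    """
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # how much of the old cell state to keep
    i = gate("input", sigmoid)         # how much of the candidate to write
    o = gate("output", sigmoid)        # how much of the cell state to expose
    g = gate("candidate", math.tanh)   # candidate cell content
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c

# Hypothetical toy weights and signal, chosen only for illustration.
params = {name: (0.5, 0.5, 0.0)
          for name in ("forget", "input", "output", "candidate")}
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
features = conv1d_valid(signal, kernel=[0.25, 0.5, 0.25], stride=2)

h, c = 0.0, 0.0
for x_t in features:
    h, c = lstm_step(x_t, h, c, params)
```

The convolution shortens and smooths the sequence before the recurrent step sees it, which is the sense in which the convolutional layer is "pre-applied" to the LSTM.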

The CNN module is a deep feedforward neural network that effectively avoids the parameter redundancy caused by fully connected layers. It mainly consists of convolutional layers, pooling layers, activation layers, and dense layers, which together perform feature learning automatically without relying on any prior knowledge. Because one-dimensional vibration data is used as the input of the proposed approach, a one-dimensional CNN is adopted. Suppose that the input signal and the filter are denoted by $x$ and $w$ respectively; the corresponding convolution processes can be represented by the following equations:
$$z^{l} = w^{l} * \operatorname{pad}\left(a^{l-1}\right) + b^{l}, \qquad a^{l} = f\left(z^{l}\right),$$
$$y^{l}(j) = \sum_{k=0}^{K^{l}-1} w^{l}(k)\, x^{l}(j+k),$$
where $b^{l}$ is the bias value, $z^{l}$ is the linear activation vector of layer $l$, $f(\cdot)$ is the non-linear activation function, and $a^{l}$ is the output feature of layer $l$ (with $a^{0} = x$); $*$ represents the convolution operation and $\operatorname{pad}(\cdot)$ represents all-zero padding; $y^{l}$ is the feature vector after convolution processing, $K^{l}$ is the kernel size of layer $l$, $w^{l}$ is the kernel of layer $l$, and $x^{l}$ is the input signal of layer $l$.
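The one-dimensional convolution with all-zero padding, followed by max pooling, can be sketched in plain Python as below. This is a minimal sketch under stated assumptions: the function names, the toy signal, and the kernel are hypothetical, and the sliding product is computed without flipping the kernel, as is common in deep-learning frameworks.

```python
def pad_zeros(signal, left, right):
    """All-zero padding on both ends of a 1D signal."""
    return [0.0] * left + list(signal) + [0.0] * right

def conv1d(signal, kernel, bias=0.0, stride=1, pad=0):
    """Slide `kernel` over the zero-padded signal and add `bias`."""
    x = pad_zeros(signal, pad, pad)
    k = len(kernel)
    return [bias + sum(kernel[j] * x[i + j] for j in range(k))
            for i in range(0, len(x) - k + 1, stride)]

def max_pool1d(features, size=2, stride=2):
    """Keep the largest value in each pooling window."""
    return [max(features[i:i + size])
            for i in range(0, len(features) - size + 1, stride)]

# Hypothetical toy input: a ramp filtered by a difference kernel.
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
z = conv1d(signal, kernel=[1.0, 0.0, -1.0], pad=1)  # same length as input
pooled = max_pool1d(z)                              # halves the length
```

With a kernel of size 3 and one zero on each side, the output keeps the input length; the pooling stage then halves it, which is how each convolution block reduces the temporal resolution.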
A new linear rectification function, the ELU activation function with a positive hyperparameter $\alpha$, is introduced into the constructed hybrid model to enhance its noise resistance. The expression of ELU is given in Eq. (9):
$$\mathrm{ELU}(x) = \begin{cases} x, & x > 0, \\ \alpha\left(e^{x} - 1\right), & x \le 0. \end{cases} \tag{9}$$
A sketch of Eq. (9) is shown in Fig. 2. The curve in Fig. 2 has a smaller slope in the negative region, which makes the model more adaptable to negative-valued interference. It can therefore increase the sensitivity of the model to interference and improve the model's diagnostic accuracy in complex marine environments.
Fig. 2. The sketch map of ELU function
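For readers who want to inspect the shape of the activation numerically, a minimal Python sketch of ELU next to the standard ReLU is given below. The value `alpha=1.0` is an assumption made for illustration; the hyperparameter value used in the paper is not stated in this section.

```python
import math

def elu(x, alpha=1.0):
    """ELU: identity for x > 0, alpha * (exp(x) - 1) otherwise.

    alpha=1.0 is an assumed default, not a value taken from the paper.
    """
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def relu(x):
    """Standard ReLU, shown only for comparison."""
    return max(0.0, x)

# Compare both activations on a few negative and positive inputs.
for v in (-5.0, -1.0, 0.0, 2.0):
    print(v, relu(v), round(elu(v), 4))
```

Unlike ReLU, which maps every negative input to zero, ELU keeps a small, smoothly saturating negative response bounded below by `-alpha`.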

The attention module is built by mimicking the human brain’s mechanisms, which only selects and handles some key input information while processing large amounts of input information, thus improving the diagnosis efficiency. The calculation of the attention module could be represented as follows:
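$$s(x_i, q) = v^{\top} \tanh\left(W x_i + U q\right),$$
$$\alpha_i = \operatorname{softmax}\left(s(x_i, q)\right) = \frac{\exp\left(s(x_i, q)\right)}{\sum_{j} \exp\left(s(x_j, q)\right)},$$
in which $s(\cdot)$ is the attention score function, $v$, $W$, $U$ represent the weight matrices inside the neural network, and $\alpha_i$ is the attention distribution.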
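The attention weighting described above can be illustrated with a short Python sketch. For brevity it uses a dot-product score rather than a learned score function, which is a simplifying assumption; the feature vectors, the query, and the function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(features, query):
    """Weight each feature vector by softmax(dot(feature, query)) and sum."""
    scores = [sum(f_k * q_k for f_k, q_k in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [sum(w * f[k] for w, f in zip(weights, features))
               for k in range(dim)]
    return weights, context

# Hypothetical fused features (one vector per time step) and query.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]
weights, context = attend(features, query)
```

Feature vectors that align with the query receive larger weights, so the weighted sum emphasises the inputs judged most relevant while still summing to a single context vector.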
To facilitate reproduction and fair comparison, the micro-architectures of the two parallel branches are summarized in Table 1, and their architectures are depicted in Fig. 3 and Fig. 4.
Table 1. Hyper-parameter settings of CLSTM and CNN branches
Layer | Kernel / Stride | Output channels | Activation | Dropout | Note |
CLSTM-Conv1 | 16 / 2 | 32 | ELU | – | pre-LSTM |
CLSTM-LSTM | – | 128 | tanh | 0.3 | bidirectional |
CNN-Conv1 | 32 / 4 | 64 | ELU | – | – |
CNN-Conv2 | 16 / 2 | 128 | ELU | – | – |
CNN-Conv3 | 8 / 2 | 128 | ELU | – | – |
CNN-MaxPool | 2 / 2 | – | – | – | each block |
CNN-GAP | – | 128 | – | – | global average |
Fig. 3. CLSTM module architecture

Fig. 4. CNN branch architecture
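To make the strides in Table 1 concrete, the sketch below traces the sequence length through both branches for a 2048-sample input window. It assumes "same" zero padding, so that each layer outputs ceil(length / stride) samples; the padding mode is not stated in Table 1, so the resulting lengths are indicative only.

```python
import math

def same_len(n, stride):
    """Output length of a strided layer under assumed 'same' padding."""
    return math.ceil(n / stride)

def trace(n, layers):
    """Return the sequence length after each (name, stride) layer."""
    lengths = []
    for name, stride in layers:
        n = same_len(n, stride)
        lengths.append((name, n))
    return lengths

# Strides taken from Table 1; max pooling follows each convolution block.
cnn_layers = [
    ("CNN-Conv1", 4), ("CNN-MaxPool", 2),
    ("CNN-Conv2", 2), ("CNN-MaxPool", 2),
    ("CNN-Conv3", 2), ("CNN-MaxPool", 2),
]
clstm_layers = [("CLSTM-Conv1", 2)]

cnn_trace = trace(2048, cnn_layers)
clstm_trace = trace(2048, clstm_layers)
```

Under this assumption the CNN branch reduces the window to 16 time steps before global average pooling, while the recurrent layer of the CLSTM branch processes 1024 steps.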

2.2. The constructed hybrid model and flow chart of the proposed method
A novel parallel connection method is adopted to connect the above-mentioned three submodules to construct the hybrid model. The structure of the hybrid model is illustrated in Fig. 5. This connection approach not only captures the temporal and spatial fault information of the input signal simultaneously, but also extracts features more comprehensively. Specifically, inspired by reference [25], the designed network model mainly consists of two independent branches. The upper part of Fig.5 shows the constructed CLSTM model, while the lower part shows the CNN model. These two branches can operate independently without interfering with each other, thus effectively extracting the temporal and spatial fault features of the input signal simultaneously. This design method greatly improves the running efficiency of the model and enhances its feature extraction ability, which provides a richer information foundation for subsequent fault diagnosis and prediction. Meanwhile, it reduces feature loss and does not require the training of LSTM and CNN networks separately, thereby saving time and simplifying the training process. In addition, the improved LSTM and CNN have strong expressive powers, allowing the model to describe the fault feature buried in the input signal more accurately: the key features could be assigned higher weights by focusing on the fused temporal and spatial feature information, so the diagnosis accuracy of AUV thruster under interference could be improved. At last, the SoftMax classifier is used to classify and diagnose faults.
Based on the proposed hybrid network, the concrete diagnosis procedure is as follows:
Step 1: The vibration signals of the AUV thruster are collected, and a dataset suitable for network input is established.
Step 2: The CLSTM module is constructed by adding a convolution layer before the LSTM to improve its anti-interference ability.
Step 3: ELU is built into the traditional CNN module to increase the hybrid model's sensitivity to interference and improve its diagnostic accuracy in complex marine environments.
Step 4: The hybrid CLSTM-CNN-Attention neural network for interference suppression is constructed by using the parallel connection method shown in Fig. 5, and the attention module is built into the hybrid model to improve its efficiency.
Step 5: The dataset is randomly partitioned into a training subset and a testing subset, and diagnosis classification is finally realized by using SoftMax.
The experimental running environment is configured as follows: the operating system is Windows 11, the central processing unit is an Intel Core i5-12500H, and the programming environment is the TensorFlow framework with Python 3.6. The Python code implementing the proposed network model, together with more detailed information, is available at https://github.com/YXGQ/A-fault-diagnosis-method-for-AUV.
Fig. 5. Structure of the hybrid network

2.3. Network structure and parameter settings
The model proposed in this paper is implemented in Python 3.6. The running configuration is as follows: the operating system is Windows 11 and the CPU is an Intel Core i5-12500H. The dataset is divided into a training set and a test set in a 7:3 ratio. The network parameters of the CLSTM-CNN-Attention model are shown in Table 2. A parallel branch structure is adopted to extract temporal and spatial features, respectively.
Table 2. Parameter settings for CLSTM-CNN-Attention
Network layer | Kernel / Size / Stride | Activation function | Note |
Input layer | – | – | 1D vibration signal |
CLSTM-Conv1 | 32 / 16 / 2 | ELU | Pre-convolution layer |
CLSTM-LSTM | – | tanh | Bidirectional LSTM, Dropout = 0.3 |
CNN-Conv1 | 64 / 32 / 4 | ELU | – |
CNN-Conv2 | 128 / 16 / 2 | ELU | – |
CNN-Conv3 | 128 / 8 / 2 | ELU | – |
CNN-MaxPool | 2 / 2 | – | After each conv block |
CNN-GAP | – | – | Global average pooling |
Attention layer | – | Softmax | Adaptive weighting |
Fully connected layer | 128 | ReLU | – |
Classification layer | 5 | Softmax | Corresponding to 5 fault states |
3. Experiment
3.1. Data description
To acquire vibration signals reflecting the dynamic characteristics of the thruster, piezoelectric accelerometers (Model: PCB 352C65, Sensitivity: 100 mV/g, Range: ±50 g, Frequency Response: 0.5-10000 Hz) were adopted, which are the standard sensors for vibration engineering of rotating machinery. The sensors were rigidly fixed on the bearing seat of the thruster shell via a magnetic base, a key position for vibration transmission, and calibrated using a standard excitation table (Model: JZK-10) before the experiment to ensure measurement accuracy. The AUV was placed in a static water tank to eliminate the interference of the external flow field during the test. The DH5902N rugged data acquisition system (24-bit sampling precision, 4 input channels, anti-aliasing filter cutoff frequency: 6000 Hz) was used to collect and store the vibration signals. The data sampling parameters were set as follows: the thruster speed was 3000 r/min, the sampling frequency was 12800 Hz, and the sampling duration was 10 s.
To clearly display the hardware configuration, Figs. 6-7 present the physical installation details of the test system. Fig. 6 shows the PCB 352C65 acceleration sensor mounted on the thruster. Fig. 7 illustrates the connection between the DH5902N data acquisition system, the sensors, and the upper computer (Note: Figs. 6-7 were taken by the authors in the Rotating Machinery Fault Laboratory of Huanghe University of Science and Technology on 2.10.2026).
Fig. 6. PCB 352C65 piezoelectric acceleration sensor

Fig. 7. Physical diagram of the connection between the DH5902N data acquisition system, the sensors, and the upper computer

A total of five operating states (Normal, Unbalance, Entangle, Chute Loose, and Futaba Unbalance) are simulated and their corresponding signals are collected. The labels 0-4 correspond to Unbalance, Entangle, Chute Loose, Normal, and Futaba Unbalance, respectively. Each state provided 30 groups of raw vibration signals. A sliding window of 2048 samples with 50 % overlap was applied, yielding 300 segments per state, as summarized in Table 3.
It should be noted that Unbalance (labeled 0) is simulated by tying a tie to one blade, and Futaba Unbalance (labeled 4) is simulated by tying two ties to two blades, respectively.
To ensure the authenticity and engineering relevance of the experimental data, five typical operating states of AUV thrusters (including one normal state and four artificial fault states) were designed and simulated based on common failure modes in practical marine operations [3, 17]. The specific simulation principles, operation steps, and parameter settings are detailed as follows:
1. Normal (State 3).
No artificial intervention was performed on the AUV thruster. The propeller blades (material: titanium alloy, 4 blades), transmission shaft, chute, and other core components were maintained in the factory-designed standard state. The assembly torque of all fixed bolts was 8 N·m (consistent with the AUV thruster’s rated working parameters), ensuring no eccentric mass, structural looseness, or external interference. This state served as the baseline for normal vibration signal collection.
2. Unbalance (State 0).
This fault simulates the mass eccentricity of the propeller caused by uneven wear, partial corrosion, or attachment of marine debris (e.g., shellfish) during long-term operation. The specific simulation method is as follows: a stainless steel washer with a mass of 5 g (diameter: 8 mm, thickness: 2 mm, density: 7.9 g/cm³) was selected as the eccentric load and fixed on the outer edge of the 1st propeller blade (1/3 of the blade length from the tip) using high-temperature-resistant, waterproof adhesive tape. The washer was positioned perpendicular to the blade surface to avoid affecting the water flow field, ensuring that the eccentric mass only induces vibration without changing the propeller's hydrodynamic characteristics.
3. Entangle (State 1).
This fault simulates the entanglement of the thruster by external flexible objects (e.g., fishing nets, marine plants, or plastic debris) in the marine environment, which leads to increased rotational resistance and uneven torque. Simulation steps: (1) A nylon rope with a diameter of 2 mm (tensile strength: 50 MPa, length: 50 cm) was selected to mimic common marine entanglement objects. (2) The rope was tightly wound 3 times around the junction of the thruster’s transmission shaft and the protective cover, with a 10 cm free end left to simulate the randomness of real entanglement. (3) The winding tightness was controlled to ensure that the thruster could still operate at the rated speed (3000 r/min) without stalling, which is consistent with the mild-to-moderate entanglement scenario in practical applications.
4. Chute Loose (State 2).
The chute is the key structure for fixing the thruster motor and ensuring the coaxiality of the motor and transmission shaft. This fault simulates the loosening of the chute caused by long-term vibration or bolt fatigue. Simulation method: (1) The four M6 fixing bolts of the chute were loosened from the standard torque (8 N·m) to 3 N·m using a torque wrench. (2) A dial indicator was used to measure the radial runout of the transmission shaft, ensuring the runout range was 0.15-0.2 mm (consistent with the slight loosening fault in engineering practice). (3) No additional displacement was applied to avoid excessive deviation beyond the actual fault range.
5. Futaba Unbalance (State 4).
This fault simulates a more complex propeller imbalance scenario (e.g., uneven wear of two symmetric blades or attachment of debris on multiple blades). Simulation method: Two identical stainless steel washers (each 5 g, same specifications as State 0) were fixed on the 1st and 3rd symmetric propeller blades (1/3 of the blade length from the tip), respectively. The washers were installed in the same direction (perpendicular to the blade surface) to form a bidirectional eccentric mass system, which induces more complex vibration characteristics than single-blade imbalance.
Table 3. Summary of the dataset
Fault type | Raw signals | Duration each | Window / overlap | Augmented | Train | Test |
Normal | 30 | 10 s | 2048 / 50 % | 300 | 210 | 90 |
Unbalance | 30 | 10 s | 2048 / 50 % | 300 | 210 | 90 |
Entangle | 30 | 10 s | 2048 / 50 % | 300 | 210 | 90 |
Chute Loose | 30 | 10 s | 2048 / 50 % | 300 | 210 | 90 |
Futaba Unbal. | 30 | 10 s | 2048 / 50 % | 300 | 210 | 90 |
Total | 150 | – | – | 1500 | 1050 | 450 |
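The windowing and the 7:3 partition summarized in Table 3 can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the toy inputs stand in for real vibration segments, and the split is done per class so that every state contributes 210 training and 90 test segments, which matches the counts in Table 3 but is not confirmed as the authors' exact procedure.

```python
import random

def sliding_windows(signal, size, overlap):
    """Cut `signal` into windows of `size` samples with fractional `overlap`."""
    hop = int(size * (1.0 - overlap))
    return [signal[i:i + size]
            for i in range(0, len(signal) - size + 1, hop)]

def split_per_class(samples_by_class, train_ratio=0.7, seed=0):
    """Shuffle each class with a fixed seed and split it by `train_ratio`."""
    rng = random.Random(seed)
    train, test = [], []
    for label, samples in samples_by_class.items():
        shuffled = samples[:]
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_ratio)
        train += [(s, label) for s in shuffled[:cut]]
        test += [(s, label) for s in shuffled[cut:]]
    return train, test

# Toy demonstration of the windowing on a 10-sample sequence.
windows = sliding_windows(list(range(10)), size=4, overlap=0.5)

# Placeholder segments: 300 per state for the five labels 0-4.
data = {label: list(range(300)) for label in range(5)}
train, test = split_per_class(data)
```

With `size=2048` and `overlap=0.5` the hop between consecutive windows is 1024 samples; the toy call above uses a smaller window only to keep the example readable.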
The experiment was carried out in a static water tank, which can eliminate the interference of the external flow field and accurately obtain the pure vibration characteristics of the thruster under different fault states. However, the static water tank cannot simulate the complex marine background noise (e.g., wave noise, ship noise, biological noise), so digital simulation of Gaussian noise is adopted to make the experimental conditions closer to the actual marine environment. Each state was tested under the same environmental conditions (static water tank, temperature: 25 ± 1°C, water depth: 1.5 m) to eliminate the influence of external environmental factors on the experimental results.
As is well known, there are many interfering factors in the complex ocean environment. Natural disturbances include oceanic noise, biological noise, earthquake noise, and rainfall noise, while human-made interference includes noise from navigation, industry, drilling, and other sources. Reference [26] summarizes these two types of interference and finds that the vast majority of interference ranges from 0 to 150 decibels (dB). In the marine environment, noise from various sources (such as wave activity and wind speed changes) typically exhibits an approximately Gaussian distribution, so Gaussian noise is an effective way to simulate natural and human-made interference and can reasonably represent the ocean background noise that may occur under specific conditions. Therefore, experimental verification was conducted at six signal-to-noise settings: 2 dB, 10 dB, 50 dB, 100 dB, 150 dB, and None (no added noise).
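Gaussian noise was digitally generated using NumPy:
$$n(t) \sim \mathcal{N}\left(0, \sigma^{2}\right),$$
where $\sigma$ is determined by the desired SNR via $\sigma = \sqrt{P_s / 10^{\mathrm{SNR}/10}}$, with $P_s$ denoting the power of the clean signal.
The noise sequence was first energy-normalized to unit RMS and then scaled by $\sigma$. It was then added sample-wise to the clean vibration signal:
$$y(t) = x(t) + n(t).$$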
Fig. 8. Typical vibration signals of five operating states: a) 0-Unbalance, b) 1-Entangle, c) 2-Chute Loose, d) 3-Normal, e) 4-Futaba Unbalance, and f) comparison of original signal (blue) and signal with 2 dB Gaussian noise (red) for the Entangle state
To verify the anti-interference capability of the proposed model in complex marine environments, Gaussian white noise is added through digital simulation. Interferences such as waves and ship navigation in marine environments typically follow an approximate Gaussian distribution. Therefore, Gaussian noise is employed to effectively simulate actual ocean background noise.
The specific steps for noise addition are as follows:
1) SNR calculation: Calculate the noise standard deviation $\sigma$ based on the target signal-to-noise ratio SNR (dB): $\sigma = \sqrt{P_s / 10^{\mathrm{SNR}/10}}$, where $P_s$ is the original signal power, $P_s = \frac{1}{N}\sum_{i=1}^{N} x_i^{2}$.
2) Noise generation: Generate a random sequence $n_0 \sim \mathcal{N}(0, 1)$ from the standard normal distribution and multiply it by $\sigma$ to obtain Gaussian noise $n = \sigma n_0$ with the target power.
3) Signal superposition: Superimpose the noise onto the original signal: $y = x + n$.
Six SNR levels are set in this experiment: None, 150 dB, 100 dB, 50 dB, 10 dB, and 2 dB. Fig. 8(f) shows the signal comparison of the Entangle state at 2 dB SNR, where it can be observed that strong noise has severely masked the characteristics of the original signal, thereby verifying the diagnostic capability of the model under extreme working conditions.
Note: The noise generation is based on the measured vibration signals. A fixed random seed is used to ensure repeatability. The experimental data are all derived from the static water tank measurement data collected by the PCB 352C65 sensor, and are not pure computer simulation data sets.
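The noise-addition procedure, including the unit-RMS normalization and the fixed random seed, can be sketched as follows. The sketch uses Python's standard `random` module in place of NumPy so that it has no dependencies; the function names, the seed value, and the 50 Hz test sinusoid (the rotation frequency at 3000 r/min, sampled at 12800 Hz) are illustrative assumptions rather than the authors' code or data.

```python
import math
import random

def add_gaussian_noise(signal, snr_db, seed=0):
    """Return (noisy signal, noise) with the noise scaled to `snr_db`."""
    rng = random.Random(seed)  # fixed seed for repeatability
    n = len(signal)
    p_signal = sum(x * x for x in signal) / n
    sigma = math.sqrt(p_signal / (10.0 ** (snr_db / 10.0)))
    # Draw standard normal samples, normalize to unit RMS, then scale.
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    rms = math.sqrt(sum(v * v for v in noise) / n)
    noise = [sigma * v / rms for v in noise]
    noisy = [x + v for x, v in zip(signal, noise)]
    return noisy, noise

def measured_snr_db(signal, noise):
    """SNR in dB computed from the mean powers of signal and noise."""
    p_s = sum(x * x for x in signal) / len(signal)
    p_n = sum(v * v for v in noise) / len(noise)
    return 10.0 * math.log10(p_s / p_n)

# Illustrative clean signal: one 2048-sample window of a 50 Hz sinusoid.
clean = [math.sin(2.0 * math.pi * 50.0 * t / 12800.0) for t in range(2048)]
noisy, noise = add_gaussian_noise(clean, snr_db=2.0)
```

Because the noise is normalized to unit RMS before scaling, the realized SNR of each segment equals the target value rather than only approximating it on average.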
3.2. Result analysis
Interference resistance analysis: the results of the proposed hybrid model are compared with those of WDCNN [27], LSTM, and CNN to verify its advantage. WDCNN is a classic deep convolutional neural network characterized by a wide first-layer kernel, which effectively captures the global features of the input signal. WDCNN also has good noise suppression ability and maintains high accuracy and robustness when processing noisy data, so it is used for comparison in this paper. The collected unprocessed vibration data is used as input. The training and test datasets are divided randomly, giving a total of 1380 training samples and 590 testing samples. Each of the four methods was run 10 times to reduce the influence of random errors.
Fig. 9. Diagnostic precision of various methods under different signal-to-noise levels

Fig. 9 presents the experimental results of each model under the six signal-to-noise levels. The comparison shows that the proposed method achieves the best results: its diagnostic precision is high when there is no noise or when the signal-to-noise ratio is relatively high (150 dB), and it still reaches 95.96 % when the signal-to-noise ratio is extremely low (2 dB). The proposed method outperforms the other three methods at both of these extreme signal-to-noise ratios, which verifies that it has favorable interference suppression capability.
Convergence speed analysis: the convergence speed of an intelligent diagnosis model is an important criterion for evaluating its quality. To verify the superiority of the proposed model in convergence speed, the accuracy curve (ACC) and loss curve (Loss) at the 50 dB signal-to-noise level are compared visually. The results are shown in Fig. 10 and Fig. 11: the proposed approach achieves the highest accuracy and the lowest loss after 60 iterations, and it converges faster than the other three methods, being the first to reach a steady state.
Fig. 10. Diagnostic accuracy of various methods at 50 dB signal-to-noise level

Fig. 11. Diagnostic loss of various methods at 50 dB signal-to-noise level
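One simple way to read a steady-state iteration off an accuracy curve is to find the earliest iteration after which the curve stays within a small tolerance of its final value. The sketch below assumes this definition; the tolerance and the two example curves are hypothetical and are not the data plotted in Fig. 10.

```python
def first_steady_iteration(curve, tol=0.005):
    """Return the earliest index from which every later value of the curve
    stays within `tol` of the final value."""
    final = curve[-1]
    steady_from = len(curve) - 1
    # Walk backwards until a value leaves the tolerance band.
    for i in range(len(curve) - 1, -1, -1):
        if abs(curve[i] - final) > tol:
            break
        steady_from = i
    return steady_from


# Hypothetical accuracy curves (fraction correct per iteration).
fast_curve = [0.2, 0.5, 0.8, 0.95, 0.99, 0.995, 0.996, 0.996]
slow_curve = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.7]

fast_steady = first_steady_iteration(fast_curve)
slow_steady = first_steady_iteration(slow_curve)
```

Comparing the returned indices for several models gives a numerical counterpart to the visual comparison of convergence speed.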

Stability performance analysis: multiple experiments are carried out to confirm the stability of the proposed model, and the accuracy radar chart of the repeated experiments for each model at the 50 dB signal-to-noise level is presented in Fig. 12. Fig. 12 shows that the proposed approach achieves the most stable results across the experiments and exhibits the smallest error. Specifically, the average accuracy of the proposed method is 99.66 %, an improvement of 20.29 % in average accuracy. Compared with the WDCNN and CNN methods, the accuracy is improved by 20.69 % and 47.25 %, respectively. This further indicates that the proposed method has stronger anti-interference ability.
Classification accuracy analysis: Fig. 13 shows the three-dimensional confusion matrices of the four methods at the 50 dB signal-to-noise level, from which the classification accuracy and misclassification rate of each fault type can be seen clearly. The main diagonal gives the classification accuracy of each fault type, while the off-diagonal entries give the misclassification rates. The color depth corresponds to the precision color bar on the right side of each graph, which makes the classification accuracy easier to read. As illustrated in Fig. 13, the proposed approach attains high classification accuracy and low misclassification rates.
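Stability across repeated experiments is commonly summarized by the mean, the standard deviation, and the coefficient of variation of the per-run accuracies. The following sketch assumes that summary; the accuracy values are hypothetical and are not the results shown in Fig. 12.

```python
import statistics


def stability_summary(accuracies):
    """Return mean, sample standard deviation, and coefficient of variation (%)."""
    mean = statistics.mean(accuracies)
    std = statistics.stdev(accuracies)  # sample standard deviation (n - 1)
    cv = 100.0 * std / mean             # coefficient of variation in percent
    return mean, std, cv


# Hypothetical accuracies (%) from five repeated runs of one model.
runs = [98.9, 99.2, 99.0, 99.1, 98.8]
mean, std, cv = stability_summary(runs)
```

A smaller coefficient of variation means that the accuracy changes less from run to run, which corresponds to a more regular outline in a radar chart.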
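The relation between a confusion matrix and the per-class rates described above can be sketched as follows. The sketch assumes that rows are true classes and columns are predicted classes; the three-class count matrix is hypothetical and much smaller than the matrices in Fig. 13.

```python
def row_normalize(confusion):
    """Convert raw counts to per-class rates so that each row sums to 1."""
    return [[count / sum(row) for count in row] for row in confusion]


def per_class_accuracy(confusion):
    """Diagonal of the row-normalized matrix: accuracy of each true class."""
    rates = row_normalize(confusion)
    return [rates[i][i] for i in range(len(rates))]


# Hypothetical counts (rows: true class, columns: predicted class).
counts = [
    [48, 1, 1],
    [2, 45, 3],
    [0, 5, 45],
]

accuracy = per_class_accuracy(counts)
misclassification = [1.0 - a for a in accuracy]  # sum of off-diagonal rates per row
```

After row normalization, each off-diagonal entry is the fraction of one true class that was assigned to another class, so the row sum of off-diagonal entries is the misclassification rate of that class.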
Fig. 12. The stability of multiple experiments of various methods at 50 dB signal-to-noise level

Fig. 13. Confusion matrices of various methods at 50 dB signal-to-noise level: a) Proposed method; b) WDCNN; c) LSTM; d) CNN
Fig. 14 displays the t-SNE visualization of the final-layer output of the four methods. The proposed method aggregates the samples of each target class effectively and separates the classes well. This further validates the effectiveness of the proposed approach in extracting and representing classification information.
Fig. 14. Visualization analysis of various methods at 50 dB signal-to-noise level by t-SNE: a) Proposed method; b) WDCNN; c) LSTM; d) CNN
Table 4 shows that the proposed hybrid model achieves the highest accuracy and maintains balanced Precision and Recall, yielding the largest F1-score. The AUC approaching 1 indicates almost perfect separability between fault classes even under 50 dB ocean interference.
Table 4. Performance comparison under 50 dB ocean interference (tank experiment)
Method | Accuracy | Precision | Recall | F1-score | AUC |
Proposed | 99.66 % | 99.71 % | 99.62 % | 99.66 % | 0.9998 |
WDCNN | 79.37 % | 79.90 % | 78.95 % | 79.42 % | 0.8912 |
LSTM | 76.10 % | 76.55 % | 75.82 % | 76.18 % | 0.8610 |
CNN | 67.61 % | 68.20 % | 67.15 % | 67.67 % | 0.8015 |
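The F1-score is the harmonic mean of Precision and Recall. As a consistency check, the sketch below recomputes it from the Precision and Recall columns of Table 4; the dictionary layout and variable names are illustrative, and how the paper averaged the metrics over fault classes is not stated.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    return 2.0 * precision * recall / (precision + recall)


# (Precision %, Recall %, reported F1 %) copied from Table 4.
table4 = {
    "Proposed": (99.71, 99.62, 99.66),
    "WDCNN": (79.90, 78.95, 79.42),
    "LSTM": (76.55, 75.82, 76.18),
    "CNN": (68.20, 67.15, 67.67),
}

recomputed = {name: f1_score(p, r) for name, (p, r, _) in table4.items()}
max_gap = max(abs(recomputed[name] - table4[name][2]) for name in table4)
```

For all four methods the recomputed value agrees with the reported F1-score to within 0.01 percentage points.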
In addition, a thorough comparative analysis is conducted between the proposed method and state-of-the-art fault diagnosis methods for the Haiwei-1 AUV, again at a disturbance level of 50 dB. The following methods are included: a multi-scale dilated convolutional network (MDACNN) [28], a network based on support vector glow encoding description (SVGED) [29] for anomaly detection of AUV thrusters, and a squeeze-and-excitation (SE) attention residual network (SEResNet) [30]. As shown in Fig. 15, the proposed method achieves very high accuracy and stability under 50 dB interference and significantly outperforms the other three methods. MDACNN, SVGED, and SEResNet perform markedly worse under 50 dB interference, with large fluctuations in accuracy and stability. This comparison indicates that the proposed method copes better with external interference. Although the compared methods can achieve relatively good results in some cases, their performance is significantly suppressed under strong interference and is lower than in the interference-free situation.
Fig. 15. Comparison between the proposed method and the advanced methods

3.3. Engineering application value and implementability analysis
The proposed method is designed to solve practical engineering problems of AUV thruster fault diagnosis, with clear implementability and application prospects in marine engineering. Key engineering-oriented advantages are as follows:
3.3.1. Targeted solution to core engineering faults
The four artificial fault modes (Unbalance, Entangle, Chute Loose, Futaba Unbalance) simulated in this study are derived from statistical data of AUV in-service failures [3,17]. Specific engineering problem-solving effects are illustrated as follows:
Propeller imbalance (Unbalance/Futaba Unbalance) accounts for 35 % of AUV thruster failures in offshore operations (caused by shellfish attachment, uneven wear, or debris collision). The proposed model achieves 99.66 % accuracy for such faults under 50 dB interference, which solves the engineering pain point of “difficulty in identifying weak imbalance signals under complex marine noise”.
Thruster entanglement (Entangle) is a typical emergency fault in marine surveys (e.g., fishing net winding in coastal waters). The model’s diagnostic accuracy reaches 95.96 % even at 2 dB SNR, ensuring timely diagnosis before thruster stalling and avoiding costly AUV recovery or permanent loss (engineering losses caused by entanglement account for 28 % of AUV accidents [16]).
Chute Loose is a chronic fault induced by long-term vibration (common in deep-sea exploration missions). The model’s ability to capture spatial-temporal features simultaneously enables early detection of slight loosening (shaft runout: 0.15-0.2 mm), which reduces the risk of catastrophic thruster failure during long-duration engineering missions.
3.3.2. Compatibility with industrial-grade hardware and deployment feasibility
The data acquisition system adopted in this study (IEPE-type acceleration sensors + DH5902N rugged acquisition module) is a mature industrial configuration widely used in marine engineering. The proposed model’s engineering compatibility and deployment feasibility are reflected in three aspects:
Hardware adaptability: The input of the model is the 1D vibration signal collected directly by industrial-grade sensors, without complex preprocessing such as frequency spectrum conversion. This is consistent with the data output format of on-board AUV acquisition systems, enabling seamless integration with existing AUV hardware platforms.
Computational efficiency for on-board deployment: The model converges stably after 60 iterations (Fig. 10), with a single diagnostic time of 0.032 s (tested on an Intel Core i5-12500H, which is consistent with the computing power of mainstream AUV on-board controllers). This is shorter than that of traditional methods (e.g., WDCNN: 0.078 s, LSTM: 0.091 s) and fully meets the real-time diagnostic requirements of engineering applications (response time < 0.1 s).
Ruggedness in engineering environments: The model’s anti-interference design (ELU activation function + parallel spatial-temporal feature extraction) is verified under 2-150 dB Gaussian noise (simulating real marine interference). Its diagnostic accuracy remains > 95 % even at 2 dB SNR, which solves the engineering problem of “poor noise resistance of traditional methods in offshore operations”.
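A single diagnostic time such as the one quoted above can be estimated by averaging repeated calls to the inference routine and comparing the result with the response-time budget. The sketch below assumes this procedure; `dummy_predict` is a hypothetical stand-in for the trained model, and the paper does not describe its timing protocol.

```python
import time


def mean_latency_seconds(predict, sample, repeats=100):
    """Average wall-clock time of one call to `predict` over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        predict(sample)
    return (time.perf_counter() - start) / repeats


def meets_budget(latency_s, budget_s=0.1):
    """True if the latency is below the real-time budget (0.1 s in the text)."""
    return latency_s < budget_s


def dummy_predict(sample):
    # Hypothetical stand-in for the model's forward pass: index of the largest value.
    return max(range(len(sample)), key=lambda i: sample[i])


latency = mean_latency_seconds(dummy_predict, [0.1, 0.7, 0.2], repeats=10)
within_budget = meets_budget(0.032)  # diagnostic time reported for the proposed model
```

Averaging over many calls reduces the influence of scheduling jitter, which matters when the measured time is close to the budget.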
3.3.3. Quantitative improvement in engineering performance metrics
For engineering applications, diagnostic methods must not only have high accuracy but also meet strict requirements for stability, false alarm rate, and missed diagnosis rate. As shown in Table 4 and Fig. 13, the proposed model achieves outstanding performance in key engineering metrics:
False alarm rate < 0.5 % and missed diagnosis rate < 0.4 % for all fault types under 50 dB interference (industrial acceptable threshold: < 5 %), which meets the reliability requirements of marine engineering equipment.
Coefficient of variation (CV) of accuracy across 10 repeated experiments: 0.12 % (WDCNN: 3.2 %, LSTM: 4.5 %), ensuring stable performance in batch production and large-scale engineering deployment.
Compatibility with small sample scenarios: Each fault type only requires 210 training samples (30 raw signals + sliding window augmentation), which aligns with the engineering reality of “difficulty in collecting large amounts of fault data for high-cost AUV thrusters”.
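Sliding window augmentation cuts several overlapping segments from each raw record, which is how 30 raw signals can yield 210 samples (7 windows per signal). The sketch below illustrates the idea; the record length, window length, and stride are hypothetical values chosen only so that each record produces 7 windows, since the paper does not give these settings here.

```python
def sliding_windows(signal, window, stride):
    """Cut overlapping fixed-length segments from one raw signal."""
    last_start = len(signal) - window
    return [signal[s:s + window] for s in range(0, last_start + 1, stride)]


# Hypothetical lengths: (5120 - 2048) // 512 + 1 = 7 windows per record.
RAW_LENGTH, WINDOW, STRIDE = 5120, 2048, 512

# Placeholder records standing in for the 30 raw signals of one fault type.
raw_signals = [[0.0] * RAW_LENGTH for _ in range(30)]
samples = [w for sig in raw_signals for w in sliding_windows(sig, WINDOW, STRIDE)]
```

A smaller stride produces more samples from the same records, but at the cost of greater overlap between neighboring samples.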
4. Conclusions
This study proposes a disturbance-suppression fault diagnosis method named CLSTM-CNN-Attention, which addresses the poor fault diagnosis performance of AUV thrusters caused by complex marine environment interference. The method uses a new parallel connection strategy to obtain the temporal and spatial fault information of the input data simultaneously. A new linear rectification function is introduced into the hybrid model to enhance its anti-interference ability. In addition, the attention module places more weight on important features in the fused information, further improving diagnostic accuracy. Experiments show that the proposed approach effectively suppresses interference from complex marine environments, achieving a diagnostic accuracy of 95.96 % at a low signal-to-noise level of 2 dB and outperforming the other compared methods.
However, although the method achieves good results, interference simulated with Gaussian noise cannot fully reproduce the interference of real, complex marine environments. Signals from actual marine environments will therefore be collected in future research so that the actual environment can be simulated more accurately, improving the practicality and reliability of the proposed method. In addition, insufficient sample size and scarcity of labeled data often limit the performance of the model, so ideas from semi-supervised or unsupervised learning can be introduced to address these challenges.
From an engineering perspective, this study not only provides a fault diagnosis algorithm but also forms a complete technical solution covering “sensor selection → data acquisition → fault simulation → on-board diagnosis”. The proposed method has been verified on a laboratory platform using industrial-grade hardware, and its core indicators (anti-interference ability, real-time performance, stability) meet the requirements of marine engineering applications. In future work, we plan to conduct sea trials on the Haiwei-1 AUV (equipped with the proposed diagnostic system) in the South China Sea, further verifying its performance under actual marine conditions and promoting industrialization.
References
[1] S. Xia, X. Zhou, H. Shi, S. Li, and C. Xu, “A fault diagnosis method based on attention mechanism with application in Qianlong-2 autonomous underwater vehicle,” Ocean Engineering, Vol. 233, No. 1, p. 109049, Aug. 2021, https://doi.org/10.1016/j.oceaneng.2021.109049
[2] D. Ji, X. Yao, S. Li, Y. Tang, and Y. Tian, “Model-free fault diagnosis for autonomous underwater vehicles using sequence Convolutional Neural Network,” Ocean Engineering, Vol. 232, No. 15, p. 108874, Jul. 2021, https://doi.org/10.1016/j.oceaneng.2021.108874
[3] F. Liu, H. Tang, Y. Qin, C. Duan, J. Luo, and H. Pu, “Review on fault diagnosis of unmanned underwater vehicles,” Ocean Engineering, Vol. 243, No. 1, p. 110290, Jan. 2022, https://doi.org/10.1016/j.oceaneng.2021.110290
[4] M. Mu, H. Jiang, X. Wang, and Y. Dong, “A task-oriented theil index-based meta-learning network with gradient calibration strategy for rotating machinery fault diagnosis with limited samples,” Advanced Engineering Informatics, Vol. 62, p. 102870, Oct. 2024, https://doi.org/10.1016/j.aei.2024.102870
[5] X. Wang, H. Jiang, T. Zeng, and Y. Dong, “An adaptive fused domain-cycling variational generative adversarial network for machine fault diagnosis under data scarcity,” Information Fusion, Vol. 126, p. 103616, Feb. 2026, https://doi.org/10.1016/j.inffus.2025.103616
[6] Y. Dong, H. Jiang, X. Wang, and M. Mu, “An interpretable integration fusion time-frequency prototype contrastive learning for machine fault diagnosis with limited labeled samples,” Information Fusion, Vol. 124, p. 103340, Dec. 2025, https://doi.org/10.1016/j.inffus.2025.103340
[7] S. Xia, X. Zhou, H. Shi, S. Li, and C. Xu, “A fault diagnosis method with multi-source data fusion based on hierarchical attention for AUV,” Ocean Engineering, Vol. 266, No. 1, p. 112595, Dec. 2022, https://doi.org/10.1016/j.oceaneng.2022.112595
[8] Y. Jiang, C. Feng, B. He, J. Guo, D. Wang, and P. Lv, “Actuator fault diagnosis in autonomous underwater vehicle based on neural network,” Sensors and Actuators A: Physical, Vol. 324, p. 112668, Jun. 2021, https://doi.org/10.1016/j.sna.2021.112668
[9] D. Bagci Das and D. Birant, “GASEL: Genetic algorithm-supported ensemble learning for fault detection in autonomous underwater vehicles,” Ocean Engineering, Vol. 272, p. 113844, Mar. 2023, https://doi.org/10.1016/j.oceaneng.2023.113844
[10] Z. Bedja-Johnson, P. Wu, D. Grande, and E. Anderlini, “Smart anomaly detection for Slocum underwater gliders with a variational autoencoder with long short-term memory networks,” Applied Ocean Research, Vol. 120, p. 103030, Mar. 2022, https://doi.org/10.1016/j.apor.2021.103030
[11] T. D. Subha, T. Subash, K. S. Claudia Jane, D. Devadharshini, and D. I. Francis, “Autonomous under water vehicle based on extreme learning machine for sensor fault diagnosistics,” Materials Today: Proceedings, Vol. 24, No. 4, pp. 2394–2402, Jan. 2020, https://doi.org/10.1016/j.matpr.2020.03.769
[12] X. Zheng, C. Feng, T. Li, and B. He, “Analysis of autonomous underwater vehicle (AUV) navigational states based on complex networks,” Ocean Engineering, Vol. 187, p. 106141, Sep. 2019, https://doi.org/10.1016/j.oceaneng.2019.106141
[13] Y. Chen, Y. Wang, Y. Yu, J. Wang, and J. Gao, “A fault diagnosis method for the autonomous underwater vehicle via meta-self-attention multi-scale CNN,” Journal of Marine Science and Engineering, Vol. 11, No. 6, p. 1121, May 2023, https://doi.org/10.3390/jmse11061121
[14] S. Gao, C. Feng, X. Zhang, Z. Yu, T. Yan, and B. He, “Unsupervised fault diagnosis framework for underwater thruster system using estimated torques and multi-head convolutional autoencoder,” Mechanical Systems and Signal Processing, Vol. 205, No. 15, p. 110814, Dec. 2023, https://doi.org/10.1016/j.ymssp.2023.110814
[15] Z. Chu, Y. Chen, D. Zhu, and M. Zhang, “Observer-based fault detection for magnetic coupling underwater thrusters with applications in jiaolong HOV,” Ocean Engineering, Vol. 210, p. 107570, Aug. 2020, https://doi.org/10.1016/j.oceaneng.2020.107570
[16] S. Gao, J. Liu, Z. Zhang, C. Feng, B. He, and E. Zio, “Physics-guided generative adversarial networks for fault detection of underwater thruster,” Ocean Engineering, Vol. 286, p. 115585, Oct. 2023, https://doi.org/10.1016/j.oceaneng.2023.115585
[17] Z. Zhu, X. Li, H. Chen, X. Zhou, and W. Deng, “An effective and robust genetic algorithm with hybrid multi-strategy and mechanism for airport gate allocation,” Information Sciences, Vol. 654, p. 119892, Jan. 2024, https://doi.org/10.1016/j.ins.2023.119892
[18] H. Zhao, Y. Wu, and W. Deng, “An interpretable dynamic inference system based on fuzzy broad learning,” IEEE Transactions on Instrumentation and Measurement, Vol. 72, pp. 1–12, Jan. 2023, https://doi.org/10.1109/tim.2023.3316213
[19] M. Pająk, Muślewski, M. Kluczyk, D. Kolar, B. Landowski, and T. Kałaczyński, “Identification of reliability states of a ship engine of the type Sulzer 6AL20/24,” SAE International Journal of Engines, Vol. 15, No. 4, pp. 527–542, Nov. 2021, https://doi.org/10.4271/03-15-04-0028
[20] Y. Lin et al., “Identifying and managing risks of AI-driven operations: A case study of automatic speech recognition for improving air traffic safety,” Chinese Journal of Aeronautics, Vol. 36, No. 4, pp. 366–386, Apr. 2023, https://doi.org/10.1016/j.cja.2022.08.020
[21] D. Guo, Z. Zhang, B. Yang, J. Zhang, H. Yang, and Y. Lin, “Integrating spoken instructions into flight trajectory prediction to optimize automation in air traffic control,” Nature Communications, Vol. 15, No. 1, p. 9662, Nov. 2024, https://doi.org/10.1038/s41467-024-54069-5
[22] C. Huang, H. Ma, X. Zhou, and W. Deng, “Cooperative path planning of multiple unmanned aerial vehicles using cylinder vector particle swarm optimization with gene targeting,” IEEE Sensors Journal, Vol. 25, No. 5, pp. 8470–8480, Mar. 2025, https://doi.org/10.1109/jsen.2024.3516124
[23] X. J. Liu, N. G. Cui, and X. F. Liu, “Research progress on underwater detection based on marine environmental noise,” Digital Ocean and Underwater Attack and Defense, Vol. 5, No. 6, pp. 518–523, Jul. 2022.
[24] J. W. Shi and L. Q. Hou, “Bearing fault diagnosis based on one-dimensional convolutional attention gating loop network and transfer learning,” Journal of Vibration and Control, Vol. 42, No. 3, pp. 159–164+173, Jul. 2023.
[25] R. Yao, H. Zhao, Z. Zhao, C. Guo, and W. Deng, “Parallel convolutional transfer network for bearing fault diagnosis under varying operation states,” IEEE Transactions on Instrumentation and Measurement, Vol. 73, pp. 1–13, Jan. 2024, https://doi.org/10.1109/tim.2024.3480212
[26] G. S. Zhang et al., “Classification of marine environmental noise and its impact on marine fauna,” Journal of Dalian Ocean University, Vol. 27, No. 1, pp. 89–94, Jul. 2012.
[27] A. Zhang, S. Li, Y. Cui, W. Yang, R. Dong, and J. Hu, “Limited data rolling bearing fault diagnosis with few-shot learning,” IEEE Access, Vol. 7, pp. 110895–110904, Jan. 2019, https://doi.org/10.1109/access.2019.2934233
[28] W. Du, Y. Yang, H. Wang, X. Gong, G. Xie, and B. Zhao, “Fault diagnosis of AUV propulsion system based on multi-scale dilated convolutional neural network,” in Prognostics and System Health Management Conference (PHM), pp. 398–404, May 2024, https://doi.org/10.1109/phm61473.2024.00076
[29] W. Du, Z. Xiong, P. Zhu, Z. Pu, C. Li, and D. Hou, “Enhancing underwater thruster anomaly detection with support vector glow encoding description,” Ocean Engineering, Vol. 314, p. 119655, Dec. 2024, https://doi.org/10.1016/j.oceaneng.2024.119655
[30] W. Du, X. Yu, Z. Guo, H. Wang, Z. Pu, and C. Li, “Squeeze‐and‐excitation attention residual learning of propulsion fault features for diagnosing autonomous underwater vehicles,” Journal of Field Robotics, Vol. 42, No. 1, pp. 169–179, Jul. 2024, https://doi.org/10.1002/rob.22405
About this article
The study was supported by “Henan Province’s New Key Discipline-Machinery” Project No. 2023414.
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Yuanhui Liang conducted the theoretical research, Yuyu Zhu wrote the paper, and Lei Wang developed the program.
The authors declare that they have no conflict of interest.