Prediction of liquefaction-induced lateral spreading based on Neural network

In light of the inherent errors associated with existing methods for predicting lateral spreading of liquefied soil during earthquakes, a novel approach is proposed. A substantial dataset of lateral displacements was computed with the Newmark sliding block method from a comprehensive seismic motion database, and a neural network model was trained on it to calculate liquefaction-induced lateral displacement. A sensitivity model was first trained with six input features; based on the sensitivity analysis, a predictive model for liquefaction-induced lateral spreading was then developed with three parameters: moment magnitude, peak ground acceleration, and yield acceleration. This model was compared with empirical lateral spreading prediction models, and the results show notable agreement with them. Additionally, for 22 well-documented cases of liquefaction-induced lateral spreading, three high-quality models were employed to predict the residual shear strength of the soil. For these cases, the proposed model outperforms the empirical liquefaction-induced lateral spreading prediction models.


Introduction
Liquefaction is the phenomenon in which saturated sandy soil experiences an increase in pore water pressure due to cyclic shearing during an earthquake, subsequently followed by a gradual reduction. The finite deformation of the ground in gently sloping regions is referred to as lateral spreading induced by liquefaction. This lateral spreading can lead to shear failure in pile foundations, resulting in the cracking, stretching, and even collapse of surface structures. The magnitude of lateral spreading influences the seismic design of foundational infrastructure, and when lateral spreading is substantial, its impact on engineering facilities becomes significant. Hence, an accurate method for predicting liquefaction-induced lateral spreading is needed.
The Newmark sliding block method, introduced by Newmark in 1965 [1], was initially proposed for computing the permanent displacements of dams subjected to seismic loads. It later found wide application in calculating the permanent displacements of slopes and embankments under earthquake loads. Subsequently, numerous predictive models for lateral spreading based on the Newmark sliding block method have been proposed. For example, Faris et al. [2], building upon existing models and utilizing Bayesian regression alongside field and laboratory data, introduced a novel semi-empirical predictive model for liquefaction-induced lateral spreading. Bray and Travasarou [3] adopted moment magnitude, yield acceleration, and peak ground acceleration as research parameters, established a seismic-induced displacement database using 688 seismic records, and formulated corresponding displacement prediction models. Jibson [4], through comparative analysis of various parameters including critical acceleration ratio, moment magnitude, Arias intensity, and yield acceleration, employed regression on 2270 seismic records to propose a new displacement prediction model. Rathje and Saygili [5] proposed two displacement prediction models using a probabilistic approach: one relies on a single ground motion parameter (peak ground acceleration), while the other incorporates two ground motion parameters (peak ground acceleration and peak ground velocity). Hsieh and Lee [6], relying on a large body of strong-motion data, introduced a new displacement prediction model through data regression based on Arias intensity and yield acceleration. Ekstrom and Franke [7], incorporating generalized site conditions and lateral spreading reference parameter maps, devised a performance-based probabilistic lateral spreading model for earthquakes of specified recurrence periods. Du and Wang [8], employing a probabilistic approach, developed a one-step lateral spreading prediction model based on four seismic parameters: moment magnitude, rupture distance, fault type, and average shear wave velocity of the top 30 meters of soil. Little and Rathje [9] numerically simulated a range of model geometries, including factors such as liquefied soil layer thickness and free surface slope, to explore their impact on liquefaction-induced lateral spreading. Consequently, they proposed a bilinear surface model for predicting liquefaction-induced lateral spreading.
In recent years, with the rapid advancement of artificial intelligence and big data technologies, machine learning methods have gained increasing attention and application in lateral spreading prediction research. Machine learning algorithms have shown considerable potential for developing data-driven predictive models [10], [11]. Traditional analytical approaches often require precise functional relationships, yet complex seismic phenomena such as liquefaction-induced lateral spreading are difficult to describe accurately with simple mathematical models. Consequently, many researchers have turned to data-driven methods, utilizing machine learning algorithms to uncover patterns and regularities within data and construct corresponding predictive models. For example, Yang et al. [12], utilizing cases of liquefaction-induced lateral spreading, trained an artificial neural network model to predict the residual shear strength ratio of liquefied soil, which is then used for subsequent lateral spreading predictions. Demir and Sahin [13] employed multiple machine learning models, including eXtreme Gradient Boosting (XGBoost), Categorical Boosting (CatBoost), and Light Gradient Boosting Machine (LightGBM), to predict liquefaction-induced lateral spreading. They performed comparative analyses using particle swarm optimization and found particle swarm optimization to outperform other models. Gade et al. [14], considering factors such as moment magnitude, seismic source mechanism, and yield acceleration, proposed a new neural network displacement prediction model based on the Newmark sliding block method and a large dataset. This model exhibited good applicability in slope displacement prediction. Kaya et al. [15] compared the applicability of multigene genetic programming (MGGP), multilayer perceptron (MLP), and random forest (RF) models in predicting liquefaction-induced lateral spreading. They found that the MGGP model provided more accurate predictions than MLP and RF for both free-face and gently sloping ground conditions. From the literature, it is evident that machine learning algorithms can swiftly capture data characteristics and generate predictive values. However, they demand high-quality input data and cannot assess the inherent rationality of the inputs themselves. Therefore, to achieve accurate displacement prediction models, it is essential to establish a training database with strong and reasonable features.
The author has previously evaluated the reliability of applying the Newmark sliding block method to analyze liquefaction-induced lateral spreading [16]. Thus, this paper computes liquefaction-induced lateral spreading values under different yield accelerations based on the Newmark sliding block method. A neural network is then employed to train the sensitivity analysis model for lateral spreading parameters, using six input parameters, namely peak ground acceleration (PGA), yield acceleration ($a_y$), moment magnitude ($M_w$), the average shear wave velocity of the top 30 meters of soil ($V_{s30}$), focal mechanism (FM), and rupture distance ($R_{rup}$), and one output variable, the calculated lateral spreading value. Guided by the outcomes of the sensitivity analysis, three crucial seismic parameters are selected as inputs for training the predictive model of liquefaction-induced lateral spreading. The reliability of this predictive model is evaluated from three perspectives: the correlation coefficient, the RMSE, and comparison with empirical prediction models. Furthermore, the applicability of the model is demonstrated through predictions made on well-documented liquefaction-induced lateral spreading cases. The research framework is shown in Fig. 1.

Earthquake motion database and research methods
As previously discussed, machine learning algorithms can swiftly capture the features of data and produce predictive outcomes. Nonetheless, these algorithms demand high-quality input data and are unable to assess the rationality of the input data. Hence, in this section, an extensive collection of ground motion records is assembled for the purpose of computing lateral spreading under varying yield accelerations. The calculations are conducted using the Newmark sliding block method.

Earthquake motion database
Seismic events from various countries and regions, including China, the United States, Canada, Iran, Turkey, and Japan, were collected from the Pacific Earthquake Engineering Research Center [17]. This collection encompasses 27 significant seismic events, among which notable examples are the 1995 Kobe earthquake in Japan, the 1999 Chi-Chi earthquake in Taiwan, and the 1989 Loma Prieta earthquake, amounting to a total of 1960 seismic records. The seismic motion database used for the analysis in this research is detailed in Table 1. The corresponding seismic parameters for each event have also been systematically gathered and organized. These include PGA, moment magnitude ($M_w$), the average shear wave velocity of the top 30 meters of soil ($V_{s30}$), FM, and rupture distance ($R_{rup}$).

Newmark sliding block method
During soil liquefaction, once liquefaction is triggered and the sliding surface begins to form, the strength of saturated sandy soil rapidly diminishes. As the soil strength continues to decrease, only the residual shear strength of the soil remains within the liquefied site. When the shear forces induced by the earthquake, combined with the gravitational forces acting on the soil above the sliding surface, exceed the residual shear strength, the soil above the sliding surface begins to move. This movement accumulates displacement continuously until the earthquake subsides. The motion of the soil above the sliding surface can be described using the Newmark sliding block method. Specifically, by integrating twice the portion of the acceleration that exceeds the yield acceleration ($a_y$), the resulting displacement under seismic conditions can be determined. The ultimate sliding displacement is the cumulative result of multiple increments of sliding and is illustrated in Fig. 2. This approach has been employed by several scholars to predict lateral spreading induced by liquefaction, including Yang [16], Baziar [18], Taboada [19], and Olson and Johnson [20]. From Fig. 2, it is evident that the yield acceleration is a pivotal parameter in the computation of displacement using the Newmark sliding block method. Therefore, to obtain a substantial dataset of liquefaction-induced lateral spreading calculations suitable for machine learning algorithms, this section assumes the following values for the yield acceleration: 0.01 g, 0.03 g, 0.05 g, 0.075 g, 0.1 g, 0.15 g, 0.2 g, 0.25 g, and 0.3 g. The selection range for the yield acceleration is determined based on the analysis of corresponding models using 23 well-documented cases of liquefaction-induced lateral spreading, as compiled by Yang et al. [12]. It is important to note that the computation of the yield acceleration requires a limit equilibrium analysis model based on parameters such as soil distribution, soil strength, soil density, and groundwater levels [16]. Consequently, by employing the yield acceleration, the information contained in these parameters is inherently considered.
Subsequently, using the collected dataset of 1960 seismic records, lateral spreading under different yield accelerations is computed. Before calculating the liquefaction-induced lateral spreading, records for which the yield acceleration exceeds the PGA are excluded, as the calculated lateral spreading would be zero. In the end, a total of 8914 valid liquefaction-induced lateral spreading calculations are obtained.
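The double integration described in this section can be illustrated with a minimal sketch. It assumes a rigid block that slides in one direction only, a uniformly sampled record, and a synthetic sine-wave motion in place of a real accelerogram; the function names `newmark_displacement` and `sine_record` are hypothetical and are not taken from the study.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def newmark_displacement(accel_g, dt, a_y):
    """Cumulative downslope displacement (m) of a rigid sliding block.

    accel_g : ground acceleration samples, in units of g
    dt      : time step, in seconds
    a_y     : yield acceleration, in units of g
    """
    velocity = 0.0      # relative velocity of the block, m/s
    displacement = 0.0  # accumulated relative displacement, m
    for a in accel_g:
        # Sliding starts when a > a_y and continues until the
        # relative velocity has dropped back to zero.
        if a > a_y or velocity > 0.0:
            new_velocity = max(0.0, velocity + (a - a_y) * G * dt)
            displacement += 0.5 * (velocity + new_velocity) * dt
            velocity = new_velocity
    return displacement


def sine_record(pga, freq_hz=1.0, duration=5.0, dt=0.01):
    """Synthetic sine acceleration history (assumption, for illustration only)."""
    n = int(round(duration / dt))
    return [pga * math.sin(2.0 * math.pi * freq_hz * i * dt) for i in range(n)]


record = sine_record(pga=0.3)
d_small_ay = newmark_displacement(record, 0.01, 0.05)
d_large_ay = newmark_displacement(record, 0.01, 0.15)
d_no_slip = newmark_displacement(record, 0.01, 0.35)  # a_y > PGA: no sliding
```

In this sketch a yield acceleration above the PGA gives zero displacement, which is the reason such record and yield-acceleration pairs are excluded from the dataset.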

Sensitivity analysis model for lateral spreading parameters
Using the liquefaction-induced lateral spreading values calculated with the Newmark sliding block method, a neural network algorithm is applied to train a sensitivity analysis model for lateral spreading parameters. Six seismic parameters, PGA, yield acceleration ($a_y$), moment magnitude ($M_w$), average shear wave velocity of the top 30 meters of soil ($V_{s30}$), FM, and rupture distance ($R_{rup}$), serve as input features. The natural logarithm of the lateral spreading, $\ln(D)$, is assigned as the output feature. This neural network model is then used to conduct a sensitivity analysis on the factors influencing lateral spreading.

Parameter selection
Based on the findings of references [8] and [14], this research selects six seismic parameters, PGA, $a_y$, $M_w$, $V_{s30}$, FM, and $R_{rup}$, as input features, with $\ln(D)$ as the output feature. The model used for training the lateral spreading parameter sensitivity analysis is expressed in the following functional form:
$$\ln(D) = f(\mathrm{PGA}, a_y, M_w, V_{s30}, \mathrm{FM}, R_{rup}) \tag{1}$$
where $D$ represents the lateral spreading in mm; PGA stands for peak ground acceleration, measured in units of g; $a_y$ denotes the yield acceleration, also in units of g; $M_w$ signifies the moment magnitude; $V_{s30}$ represents the average shear wave velocity in the top 30 meters of soil, measured in m/s; FM corresponds to the focal mechanism, which takes three forms, i.e., Normal, Reverse, and Strike-Slip faults; and $R_{rup}$ signifies the rupture distance, measured in km. Within the dataset of 1960 seismic records used in this research, the ranges of the six seismic parameters mentioned above are given in Table 2. It is important to note that $a_y$ is assumed, and in the FM category, 1 corresponds to Normal fault, 2 to Reverse fault, and 3 to Strike-Slip fault. The distribution of PGA and $M_w$ is illustrated in Fig. 3(a), while the distribution of $R_{rup}$, $M_w$, and FM is presented in Fig. 3(b).

Neural network model
A neural network is a type of machine learning algorithm designed to emulate the structure and functionality of the human neural system. It comprises a hierarchical structure of multiple neurons and is utilized for prediction and classification tasks by learning patterns and features from input data. Neural networks optimize model parameters through forward and backward propagation algorithms to minimize prediction errors. A neural network model consists of an input layer, hidden layers, and an output layer. The input layer receives the input features and passes them to the hidden layers. The hidden layers play a key role in feature extraction and transformation. Finally, the output layer provides the responses of the predictive model.
In this research, there are six input neurons and one output neuron. According to the Universal Approximation Theorem, a single hidden layer is sufficient to describe continuous nonlinear functions [14]. Therefore, a single hidden layer is employed. Considering the nonlinear relationship between input and output features, the sigmoid function is chosen as the activation function so that the predicted values closely approximate the target values. The operation stops when the output signal meets the desired criteria; otherwise, error backpropagation is executed. This involves feeding error values back to the hidden layer and continuously adjusting the network weights, allowing the hidden layer to gain strong nonlinear mapping capabilities and ultimately produce the desired output. To minimize scale differences between the various features and enhance data uniformity, the input features are scaled to a range between -1 and 1 according to Eq. (2):
$$y = \frac{(y_{\max} - y_{\min})(x - x_{\min})}{x_{\max} - x_{\min}} + y_{\min} \tag{2}$$
where $x$ is the original feature value and $y$ is its scaled value, $y_{\min} = -1$ and $y_{\max} = 1$, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the feature being mapped.
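The scaling of each input feature to the range between -1 and 1 can be sketched as follows. This is a minimal illustration of min-max mapping; the function name `scale_feature` and the handling of a constant feature are assumptions, not details given in the study.

```python
def scale_feature(values, y_min=-1.0, y_max=1.0):
    """Linearly map a list of feature values onto [y_min, y_max]."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Assumption: a constant feature is mapped to the middle of the range.
        return [0.5 * (y_min + y_max) for _ in values]
    span = x_max - x_min
    return [(y_max - y_min) * (x - x_min) / span + y_min for x in values]


# Example with three hypothetical moment magnitudes.
scaled_magnitudes = scale_feature([5.0, 6.0, 7.0])
```

The smallest value maps to -1, the largest to 1, and intermediate values are interpolated linearly, so features with very different units (g, m/s, km) end up on a common scale.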
Based on this framework, 80 % of the data is randomly allocated for training, while the remaining 20 % forms the testing set. Bayesian regularization is employed for training. After multiple trials, the regression achieves the best results with 10 hidden neurons. The artificial neural network structure employed in this research is depicted in Fig. 4.
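As a rough illustration of this setup, the sketch below performs a random 80/20 split and evaluates the forward pass of a network with six inputs, one sigmoid hidden layer of 10 neurons, and one output. It does not implement Bayesian regularization or any training; the linear output neuron, the placeholder weights, and the names `split_indices` and `forward` are assumptions made for the example.

```python
import math
import random


def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Randomly split sample indices into training and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: sigmoid hidden layer followed by a linear output neuron."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


train_idx, test_idx = split_indices(8914)

# Placeholder (untrained) weights: 10 hidden neurons, 6 scaled inputs.
n_hidden, n_inputs = 10, 6
w_hidden = [[0.0] * n_inputs for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [1.0] * n_hidden
prediction = forward([0.0] * n_inputs, w_hidden, b_hidden, w_out, 0.0)
```

With all hidden weights set to zero, every hidden neuron outputs 0.5, which makes the placeholder prediction easy to check by hand; a trained model would replace these weights with fitted values.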
The neural network structure depicted in Fig. 4 was trained using the dataset from Section 1. Fig. 5 illustrates the performance of the model with six parameters related to lateral spreading. Across the training set, the testing set, and the overall dataset, the predicted results consistently exhibit a correlation coefficient of 0.94 or higher. This high and consistent correlation indicates the predictive reliability of the model.

Sensitivity analysis
Considering the challenges in acquiring all six parameters in engineering applications, this study conducts a sensitivity analysis to enhance the applicability of the lateral spreading prediction model. The analysis compares the influence of the six parameters on the prediction outcomes and identifies the three most sensitive parameters, which are then used to refine the model and make it more suitable for practical use. To this end, each of the parameters PGA, $a_y$, $M_w$, $V_{s30}$, FM, and $R_{rup}$ was multiplied in turn by scaling factors of 1.2 and 1.5. Using the modified data, a second prediction was made with the model, and the results were compared with the original predictions. The PGA is taken here as an example. First, the PGA is amplified by factors of 1.2 and 1.5. Then, using the sensitivity analysis model from Fig. 5, lateral spreading is predicted with the scaled PGA while all other data are kept constant. Finally, the correlation coefficient and the RMSE (Root Mean Square Error) are computed against the initial lateral spreading predictions. Repeating this process gives the correlation coefficient and RMSE for each parameter under each scaling factor. Comparing these values makes it possible to assess the sensitivity of the six parameters. The resulting sensitivity of each input feature is presented in Table 3.
From the variation of the correlation coefficient and RMSE values in Table 3, it is evident that among the six input parameters, $M_w$, $a_y$, and PGA have the most significant influence on liquefaction-induced lateral spreading. The energy intensity of an input earthquake is often described by $M_w$ and PGA, while PGA and $a_y$ are the critical quantities within the Newmark sliding block framework, representing the input and yield accelerations, respectively. Thus, these three parameters significantly influence the analysis of lateral spreading [1]. The RMSE is defined as:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(t_i - p_i\right)^2} \tag{3}$$
where $N$ represents the number of observed samples, $t_i$ denotes the $i$th target value, and $p_i$ denotes the $i$th predicted value. The RMSE measures the magnitude of the deviation between the target values and the predicted values.
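The perturbation procedure of the sensitivity analysis can be sketched as below: one feature is scaled by a factor, the model is evaluated again, and the RMSE between the original and perturbed predictions is computed. The `toy_model` here is a hypothetical stand-in for the trained network, and the function names are illustrative only.

```python
import math


def rmse(targets, predictions):
    """Root mean square error between two equally long sequences."""
    n = len(targets)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(targets, predictions)) / n)


def sensitivity_rmse(model, rows, feature_index, factor):
    """RMSE between baseline predictions and predictions with one feature scaled."""
    baseline = [model(row) for row in rows]
    perturbed = []
    for row in rows:
        scaled = list(row)                # copy so the input rows stay unchanged
        scaled[feature_index] *= factor   # scale only the selected feature
        perturbed.append(model(scaled))
    return rmse(baseline, perturbed)


def toy_model(row):
    """Hypothetical model: depends on feature 0 only, ignores feature 1."""
    return 2.0 * row[0] + 0.0 * row[1]


rows = [[0.1, 5.0], [0.2, 6.0], [0.3, 7.0]]
rmse_f0_12 = sensitivity_rmse(toy_model, rows, 0, 1.2)
rmse_f0_15 = sensitivity_rmse(toy_model, rows, 0, 1.5)
rmse_f1_15 = sensitivity_rmse(toy_model, rows, 1, 1.5)
```

A feature the model does not depend on yields an RMSE of zero, while an influential feature yields an RMSE that grows with the scaling factor; ranking features by this value mirrors the comparison summarized in Table 3.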

Comparison between the lateral spreading prediction model and the empirical models
In this section, the three highly sensitive parameters identified through the sensitivity analysis of liquefaction-induced lateral spreading parameters are selected as input features to train the lateral spreading prediction model. A subsequent comparison is made with existing empirical models to ascertain the applicability and effectiveness of the model proposed in this research.

Prediction model of liquefaction-induced lateral spreading
Building upon the original data and the input feature sensitivity analysis results from Table 3, the prediction model is constructed using moment magnitude ($M_w$), yield acceleration ($a_y$), and PGA as input features, with the calculated lateral spreading values as the output feature. The functional form of the predictive model is expressed in Eq. (4):
$$\ln(D) = f(M_w, a_y, \mathrm{PGA}) \tag{4}$$
where $D$ is the lateral spreading. Accordingly, this research employs a neural network structure with three input neurons and one output neuron. As before, to account for the nonlinear relationship between seismic parameters and liquefaction-induced lateral spreading, the sigmoid function is chosen as the activation function for the hidden layer so that the predicted values approximate the target values. The data is randomly divided into an 80 % training set and a 20 % testing set, and Bayesian regularization is employed for training. After multiple attempts, it was found that the optimal regression performance is achieved with seven hidden neurons.
Hence, employing the artificial neural network structure illustrated in Fig. 6, the model was trained to predict lateral spreading. The training results for the lateral spreading prediction model are depicted in Fig. 7. As seen in Fig. 7, for the training set, the testing set, and the overall dataset, the predicted results show a correlation coefficient of 0.93 or higher. This indicates that the model predicts liquefaction-induced lateral spreading with a high level of reliability.

Comparison with empirical lateral spreading prediction models
Comparing Fig. 5 and Fig. 7 shows that the lateral spreading prediction model depicted in Fig. 7 maintains predictive accuracy while halving the number of input parameters, which makes the model considerably easier to use. Furthermore, to objectively assess the reliability of the proposed lateral spreading prediction model, a comparative analysis was conducted with two established empirical models [4], [5] that employ identical input variables (moment magnitude, yield acceleration, and PGA). This comparative analysis offers additional insight into the model proposed in this research.
In Eqs. (5) and (6), $D$ represents the lateral spreading in cm. From Fig. 8(a), it can be observed that for constant values of the other seismic parameters, the lateral spreading increases as the moment magnitude increases. This is because a larger moment magnitude implies higher energy carried by seismic waves during the corresponding seismic event, making liquefaction-prone areas more susceptible to liquefaction and resulting in larger lateral spreading.
In Fig. 8(b), it is evident that with constant values for the other seismic parameters, an increase in the yield acceleration results in a decrease in the lateral spreading. This observation aligns with the displacement calculation theory of Newmark [1], where, under otherwise unchanged conditions, a higher yield acceleration leads to a smaller Newmark-calculated displacement. Moreover, a higher yield acceleration signifies a greater residual shear strength of the liquefied soil. Soil displacement begins only when the combined shear forces resulting from seismic and gravitational effects surpass the residual shear strength. As a result, the lateral spreading decreases.
Fig. 8(c) illustrates that different PGA have varying effects on liquefaction-induced lateral spreading.However, there is an overall trend of increasing lateral spreading with higher PGA, like the pattern observed for moment magnitude.This alignment with theoretical expectations and the calculation principles of Newmark [1] reinforces the consistency of the model.
The alignment of the proposed predictive model with theoretical analysis and its good agreement with existing empirical models highlight its reliability.This suggests that the proposed liquefaction-induced lateral spreading prediction model can effectively capture the intricate nonlinear relationship between input parameters and the output parameter, making it a dependable prediction tool.
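For readers who wish to reproduce the trends discussed for Fig. 8, the two empirical models can be sketched as below. The equations themselves are not reproduced in this text, so the functional forms and coefficients in the sketch are an assumption: they follow the forms commonly quoted for the Jibson 2007 magnitude-based regression and the Rathje and Saygili 2009 (PGA, M) scalar model, omit the scatter terms, and should be checked against [4] and [5] before any use. The function names are hypothetical.

```python
import math


def jibson_2007(m_w, a_y, pga):
    """Median displacement in cm (assumed form; a_y and pga in g)."""
    if a_y >= pga:
        return 0.0  # no sliding when the yield acceleration is not exceeded
    ratio = a_y / pga
    log_d = (
        -2.710
        + math.log10((1.0 - ratio) ** 2.335 * ratio ** -1.478)
        + 0.424 * m_w
    )
    return 10.0 ** log_d


def rathje_saygili_2009(m_w, a_y, pga):
    """Median displacement in cm (assumed form; a_y and pga in g)."""
    if a_y >= pga:
        return 0.0
    r = a_y / pga
    ln_d = (
        4.89
        - 4.85 * r
        - 19.64 * r ** 2
        + 42.49 * r ** 3
        - 29.06 * r ** 4
        + 0.72 * math.log(pga)
        + 0.89 * (m_w - 6.0)
    )
    return math.exp(ln_d)


# Trend checks at the reference values used in Fig. 8.
for model in (jibson_2007, rathje_saygili_2009):
    d_ref = model(7.0, 0.15, 0.3)
    d_bigger_m = model(7.5, 0.15, 0.3)
    d_bigger_ay = model(7.0, 0.20, 0.3)
    print(model.__name__, round(d_ref, 2), d_bigger_m > d_ref, d_bigger_ay < d_ref)
```

Under these assumed forms, both models predict larger displacement for larger moment magnitude and smaller displacement for larger yield acceleration, which is the same qualitative behaviour described for the proposed model.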

Example application of liquefaction-induced lateral spreading
To further validate the reliability of the proposed model for lateral spreading prediction, this section gathers and organizes a dataset of 22 well-documented cases of lateral spreading for validation analysis. The observed lateral spreading data serve as the benchmark for evaluating the effectiveness of the methodology presented in this research. Additionally, the predictions of the proposed model are compared with those of existing empirical models.

Example of lateral spreading of liquefaction
A dataset of well-documented liquefaction-induced lateral spreading cases has been gathered from published literature and organized for validation. This dataset includes detailed information about soil profiles, soil properties, and Standard Penetration Test (SPT) values of potentially liquefiable soils. The collected data has been compiled into a lateral spreading database, presented in Table 4. This table incorporates various parameters, including $(N_1)_{60}$, the SPT blow count normalized to standard atmospheric pressure and standardized hammer energy, and $(N_1)_{60\text{-}cs}$, the equivalent clean-sand normalized SPT blow count, which accounts for the fines content of the sand. Additionally, $\ln(D)$ denotes the natural logarithm of the lateral spreading observed in the field, with $D$ measured in millimeters (mm).

Application of lateral spreading prediction model
The determination of the yield acceleration for lateral spreading involves various factors, including soil layer distribution, soil strength, soil unit weight, and groundwater level.To achieve this, a limit equilibrium analysis model is set up using Slide software, utilizing the Morgenstern-Price method for limit equilibrium analysis.A critical step in this process is establishing the residual shear strength of the liquefiable soil, as emphasized by previous research [16].
To calculate the residual shear strength of the liquefiable soil, three different formulas are utilized, i.e., those of Olson and Johnson [20], Idriss and Boulanger [37], and Kramer and Wang [38]. These formulas are applied based on the standard penetration test (SPT) values of the liquefiable soil. The horizontal seismic coefficient is then adjusted until the factor of safety equals exactly 1.0; the corresponding coefficient is the yield acceleration of the site. The yield accelerations determined using the different residual shear strength prediction models are summarized in Table 5. This process ensures that the analysis captures the specific characteristics of the soil at each site.
The model proposed in this research is compared with the Jibson 2007 model [4] and the Rathje and Saygili 2009 model [5]. Using the residual shear strength predictions from the Olson and Johnson [20], Idriss and Boulanger [37], and Kramer and Wang [38] models, the lateral spreading for the 22 liquefaction cases described in Table 5 is predicted. These predictions are compared with observed lateral spreading data to assess the effectiveness of the proposed method. The predictive results are illustrated in Fig. 9.
To provide an objective assessment of the accuracy of the lateral spreading prediction method presented in this research, the evaluation is conducted using the correlation coefficient and RMSE. The results are detailed in Table 6. This analysis aims to validate the reliability and accuracy of the approach outlined in this research against established models and actual observations.

Conclusions
This research developed a predictive model for lateral spreading induced by liquefaction using a data-driven approach. The model is built on an artificial neural network and predicts lateral spreading from specific seismic parameters. A sensitivity analysis identified the most influential seismic parameters, which were then used to construct the model, further enhancing its applicability. Compared to existing empirical models, the proposed model performs well in predicting liquefaction-induced lateral spreading. The following conclusions and suggestions have been drawn.
1) Neural networks were able to capture the nonlinear relationships between input and output features, as demonstrated in the prediction of liquefaction-induced lateral spreading. Based on the sensitivity analysis, moment magnitude, yield acceleration, and peak ground acceleration are critical factors for an accurate lateral spreading prediction. The liquefaction-induced lateral spreading prediction model trained in this research performs well compared to empirical models across 22 well-documented cases of liquefaction-induced lateral spreading.
2) It should be emphasized that exploring alternative neural network structures or basis functions may potentially yield even greater improvements in accurately predicting liquefaction-induced lateral spreading.

Fig. 1. Research framework for prediction of lateral spreading based on neural network

Fig. 3. Distribution of relevant parameters of seismic records [21]: a) distribution of PGA and $M_w$; b) distribution of $R_{rup}$, $M_w$, and FM

Fig. 4. The neural network structure for the lateral spreading parameter sensitivity analysis model

Fig. 8. Comparison of the lateral spreading prediction model with the existing empirical prediction models [21]: a) variation with respect to $M_w$ with $a_y$ = 0.15 g, PGA = 0.3 g; b) variation with respect to $a_y$ with $M_w$ = 7.0, PGA = 0.3 g; c) variation with respect to PGA with $M_w$ = 7.0, $a_y$ = 0.15 g

[...] the Jibson 2007 model and the Rathje and Saygili 2009 model. This comparative analysis underscores the effectiveness and accuracy of the approach developed in this research for predicting lateral spreading.

Fig. 9. Prediction of liquefaction-induced lateral spreading based on different residual shear strength models of liquefied soil: a) Olson and Johnson model [20]; b) Idriss and Boulanger model [37]; c) Kramer and Wang model [38]

Table 6. Application of prediction model for lateral spreading (columns: model of residual shear strength; prediction method; evaluation index: correlation coefficient, RMSE)
Fig. 2. Diagram of dynamic permanent displacement calculated by Newmark sliding block method

Table 2. The ranges of parameters in the earthquake motion database

Table 5. Residual shear strength and yield acceleration of liquefied soil in different models