Abstract
Randomness complexity is a class of features widely used to describe the degradation of bearings. However, different randomness complexities exhibit different properties, and these properties need to be clarified. In this paper, we compare seven commonly used randomness complexities, namely approximate entropy, sample entropy, fuzzy entropy, Shannon entropy, permutation entropy, Lempel-Ziv complexity and ${C}_{0}$ complexity, using simulation signals that probe three different aspects and two sets of run-to-failure bearing data. The comparisons show that the complexities share a kind of similarity, and we propose a trend similarity index to quantify it. Based on the comparisons, we infer that randomness complexities form a family of features of rolling bearings’ degradation. Among the seven complexities discussed, sample entropy has the best performance and can serve as a good representative of the complexity features. The difference between complexity features and other features in monitoring bearings’ degradation is also discussed. The research provides a reference for multi-feature dimensionality reduction of rolling bearings by attribute selection methods.
Highlights
 There is a kind of similarity between different randomness complexities.
 A new similarity index is proposed to expound the trend similarity.
 Simulation signals with three different aspects are used for comparisons.
 For the seven randomness complexities in the paper, sample entropy has the best performance.
1. Introduction
The rolling bearing is one of the most frequently used components in rotating machinery and has an important influence on modern industry. The function of a bearing is to permit linear motion or constrained relative rotation between two parts. During operation, bearings are often subject to high loads and severe conditions. Under such conditions, defects often develop on the bearing components, and these defects are the most frequent cause of failure in mechanisms [1]. Most of the operational life of a bearing shows no significant trend until very close to failure [2]. Hence, it is pivotal and cost-effective to find suitable methods to detect faults and monitor the degradation process [3]. Condition-based monitoring (CBM) has therefore come into being, and the industry is undergoing a transition from time-based part replacement decisions in operational systems to CBM [4]. The most extensively used monitoring tool is the vibration signal. As a result, a plethora of methods for diagnosing bearing faults have been proposed; good reviews can be found in Refs. [5-9]. Although these features detect faults effectively, they can hardly describe the degradation of bearings, let alone estimate their remaining useful life (RUL). For instance, two main families of signal processing tools have gained a leading role in the diagnostics of such components: the kurtogram-based family and the cyclostationarity family [10], but neither can readily extract indicators of the degradation process. Many references use statistical parameters such as root mean square (RMS), kurtosis and crest factor as degradation indicators, but none of them always shows an increasing trend during the degradation process. It is therefore important to find a reliable, robust and trend-consistent feature that also clearly reflects the stage of the degradation.
Complexity is a class of features that has been widely used in many areas for decades. Rapp and Schmah [11, 12] classified a variety of complexity algorithms into two categories: randomness complexity and rule complexity. Language, for example, is complex: if a meaningful sentence is randomly scrambled, the original text has higher rule complexity but lower randomness complexity than the scrambled one. The “snow” displayed on a television when there is no signal presents high randomness complexity but low rule complexity. Boskoski et al. presented a rule complexity in Ref. [2]. They supposed that both periodic and purely random signals should have no complexity, and that complex signals lie somewhere in between and exhibit chaotic behavior. They defined the product of Rényi entropy and Jensen-Rényi divergence as a rule complexity.
Compared with rule complexity, randomness complexity is more commonly used and easier to implement. Many references have used randomness complexities for diagnosis and prognostics. Ref. [13] proposed a quantitative diagnosis method for spall-like bearing faults based on empirical mode decomposition (EMD) and approximate entropy (ApEn). Ref. [14] proposed a bearing diagnosis method based on EMD energy entropy and ANN. Ref. [15] presented a bearing diagnosis approach based on local characteristic-scale decomposition (LCD) and fuzzy entropy (FuzzyEn). Spectral entropy was applied as a complementary index for bearing performance degradation assessment in Ref. [16]. Yan et al. applied Lempel-Ziv complexity (LZC), ApEn and permutation entropy (PermEn) as features for bearing diagnosis in Refs. [17-19], respectively. Shannon entropy (ShEn) is selected as one of the basic features for prognostics in Ref. [20]. A bearing diagnosis method based on the empirical wavelet transform and fuzzy entropy is proposed in Ref. [21]. Zhao et al. applied multiscale fuzzy entropy and EEMD to motor bearings [22]. General mathematical morphological particle and mathematical morphological fractal dimension are proposed in Refs. [23, 24], respectively. In the extensive relevant literature, authors have applied many randomness complexities and have even combined them with signal processing methods such as EMD, LCD and the wavelet transform. Whatever form a randomness complexity takes, its basic principle is invariable: the greater the regularity, the lower the value of the randomness complexity. For convenience, in the remainder of the paper we write “complexity” for randomness complexity.
The history of complexity can be traced back to the 1940s, when Shannon first developed information theory and proposed Shannon entropy [25]. When ShEn is calculated in the frequency domain, it is called spectral entropy. A problem with ShEn is that it is relatively insensitive to changes in the tails of the distribution, so Rényi extended and generalized ShEn by proposing Rényi entropy in 1961 [26]. Kolmogorov complexity (KC), which measures pointwise randomness, was defined by Kolmogorov in 1965. KC differs from ShEn, which is concerned only with the average information of a random source [27]. Some equivalences between ShEn and KC are discussed in Ref. [28]. In 1976, Lempel and Ziv proposed a specific algorithm for the calculation of KC, called LZC [29]. In 1991, Pincus developed ApEn to overcome two limitations: accurate entropy calculation requires vast amounts of data, and it is greatly influenced by system noise [30]. Sample entropy (SampEn) is a modification of ApEn proposed by Richman and Moorman in 2000 [31]. It has two advantages over ApEn: a relatively trouble-free implementation and independence from data length. In addition, SampEn does not compare a template vector with itself. In Ref. [32], Costa et al. extended SampEn to multiscale SampEn, of which SampEn is the special case with the skipping parameter equal to one. In 2002, Bandt and Pompe introduced PermEn, which is based on comparisons of neighboring values of a time series [33]. FuzzyEn was first proposed by Chen et al.; it introduces a “membership degree” defined by a fuzzy function [34]. The concept of ${C}_{0}$ complexity (${C}_{0}C$) was proposed by Chen and Gu [35] and later improved by Cai and Sun [36, 37]. More than ten complexities have been proposed in the literature, and we do not enumerate them all here.
By contrast, fractal dimensions and Lyapunov exponents are special kinds of complexity: both are sensitive to noise and are generally used to test for chaotic behavior. Their range of application is narrow, and they lack adaptability.
In this paper, we explore the properties of seven commonly used complexities, i.e., ApEn, SampEn, FuzzyEn, ShEn, PermEn, LZC and ${C}_{0}C$, with simulation signals and run-to-failure bearing vibration data. The comparisons reveal similarities among them, and we propose an index to measure these similarities. The comparisons also show which complexity performs best. We discuss the similarities of the complexities and try to show that complexities are a good family of features of rolling bearings’ degradation.
The paper is organized as follows. In Section 2, the calculation procedures of the seven complexities are introduced; based on their algorithms, we classify them into three categories and discuss them by their definitions. Section 3 examines the performance of the complexities on simulation signals in three different aspects. In Section 4, we use two run-to-failure data sets (an inner race fault and an outer race fault) to examine their performance on real bearing data, and we propose a trend similarity index to measure the similarity between complexities. A detailed discussion is presented in Section 5. Finally, concluding remarks are given in Section 6.
2. Brief introduction of the seven complexities
The development of complexities was reviewed above. In this section, we briefly introduce the calculation procedures of the seven complexity features, namely ShEn, ApEn, SampEn, FuzzyEn, PermEn, LZC and ${C}_{0}C$. The parameters of the features are stated as well.
2.1. Shannon entropy
ShEn is the earliest proposed complexity. It quantifies the probability density function (PDF) of the signal as $ShEn=-{\sum}_{i}{p}_{i}\log {p}_{i}$, where $i$ runs over all amplitude values of the signal and ${p}_{i}$ is the probability that amplitude value ${a}_{i}$ occurs anywhere in the signal [25]. However, for measured signals the PDF is not known and must be estimated. Moreover, it is generally not reasonable to consider every amplitude value. An easy way to estimate the PDF is to use a histogram, in which the amplitude range ($N$) of the signal is divided linearly into $k$ bins so that the ratio $k/N$ is constant. The ratio $k/N$ characterizes the average filling of the histogram. It is worth noting that the histogram differs somewhat from the real PDF. To reduce this difference, the ratio $k/N$ should be set larger; however, a larger ratio reduces the noise resistance. In this paper, we set it to 50.
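The histogram estimate described above can be illustrated with a minimal Python sketch. The function name `shannon_entropy` and the parameter `n_bins` are hypothetical; how the setting of 50 in this paper maps onto a bin count is not specified here, so the default below is only an illustrative assumption.

```python
import numpy as np

def shannon_entropy(signal, n_bins=50):
    """Histogram estimate of ShEn = -sum(p_i * log(p_i)).

    n_bins is an assumed, illustrative choice of the number of
    equal-width amplitude bins.
    """
    counts, _ = np.histogram(np.asarray(signal, dtype=float), bins=n_bins)
    # Empty bins are dropped, i.e. 0 * log(0) is taken as 0.
    p = counts[counts > 0] / counts.sum()
    return float(-np.sum(p * np.log(p)))
```

A signal whose samples fall evenly into four bins gives $\log 4$, the maximum for four bins, whereas a constant signal gives zero.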
2.2. Approximate entropy
The calculation steps of ApEn are as follows [30]:
Step 1: Given an $N$-point time series $u=\left\{u\left(i\right)\right\}$, form the vector sequence $x\left(1\right)$ through $x(N-m+1)$, defined by $x\left(i\right)=\left[u\left(i\right),\cdots ,u\left(i+m-1\right)\right]$. Each vector consists of $m$ consecutive $u$ values starting from the $i$th point.
Step 2: The distance $d[x(i),x(j)]$ between vectors $x(i)$ and $x(j)$ is defined as the maximum difference in their respective scalar components:
$$d[x(i),x(j)]=\underset{k=1,2,\cdots ,m}{\max}\left|u(i+k-1)-u(j+k-1)\right|.$$
Step 3: For each vector $x(i)$, a measure of the similarity between $x(i)$ and the vectors $x(j)$, $j=1,2,\cdots ,N-m+1$, is constructed as:
$${C}_{i}^{m}(r)=\frac{1}{N-m+1}{\sum}_{j=1}^{N-m+1}\mathrm{\Theta}\left\{r-d[x(i),x(j)]\right\},$$
where $\mathrm{\Theta}\left\{x\right\}$ is the Heaviside step function:
$$\mathrm{\Theta}\left\{x\right\}=\begin{cases}1, & x\ge 0,\\ 0, & x<0.\end{cases}$$
The parameter $r$ is a tolerance value or similarity criterion, defined as $r=k\cdot std\left(u\right)$, where $k$ is a positive constant and $std(\cdot )$ is the standard deviation of the time series. ${C}_{i}^{m}\left(r\right)$ can be regarded as the regularity, or frequency, of patterns similar to a given pattern of window length $m$ within the tolerance $r$.
Step 4: Define ${\varphi}^{m}\left(r\right)={(N-m+1)}^{-1}{\sum}_{i=1}^{N-m+1}\log {C}_{i}^{m}\left(r\right)$, and define $ApEn(m,r)=\underset{N\to \infty}{\lim}\left[{\varphi}^{m}(r)-{\varphi}^{m+1}(r)\right]$. For a finite time series with $N$ data points, ApEn is estimated by the statistic $ApEn(m,r,N)={\varphi}^{m}\left(r\right)-{\varphi}^{m+1}\left(r\right)$.
The four steps can be described as phase-space reconstruction, distance calculation, similarity calculation and complexity calculation. The concept of ApEn is derived from the correlation dimension. To calculate the correlation dimension, the correlation integral must be obtained first, defined as:
$$C(r)=\underset{N\to \infty}{\lim}\frac{1}{{N}^{2}}{\sum}_{i,j=1}^{N}\mathrm{\Theta}\left\{r-\Vert x(i)-x(j)\Vert \right\},$$
which is similar to ${C}_{i}^{m}\left(r\right)$. It has been shown that $C\left(r\right)\propto {r}^{D}$ as $r\to 0$, where $D$ is the correlation dimension. Since $D=\underset{r\to 0}{\lim}\left(\log C(r)/\log r\right)$, $D$ can be estimated as the slope of the $\log C(r)$ versus $\log r$ curve as $r$ increases from a small value.
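The four ApEn steps can be sketched in Python as follows. This is a minimal illustration and not the implementation used in this paper: the names `approximate_entropy` and `_embed` are hypothetical, a match is counted when $d\le r$ (self-matches included), and the defaults $m=2$, $k=0.2$ follow Table 1. The pairwise distance matrix needs memory of order ${N}^{2}$, so the sketch suits short records only.

```python
import numpy as np

def _embed(u, m):
    """Step 1: rows are x(i) = [u(i), ..., u(i+m-1)], i = 1..N-m+1."""
    n_vec = len(u) - m + 1
    return np.array([u[i:i + m] for i in range(n_vec)])

def approximate_entropy(u, m=2, k=0.2):
    u = np.asarray(u, dtype=float)
    r = k * np.std(u)                      # tolerance r = k * std(u)

    def phi(mm):
        x = _embed(u, mm)
        # Step 2: maximum componentwise distance between every pair of vectors
        d = np.max(np.abs(x[:, None, :] - x[None, :, :]), axis=2)
        # Step 3: C_i^m(r), fraction of vectors within r (self-match included)
        c = np.mean(d <= r, axis=1)
        # Step 4: average of log C_i^m(r)
        return np.mean(np.log(c))

    return float(phi(m) - phi(m + 1))
```

On a short test, a pure sine wave should give a clearly lower value than white noise of the same length, in line with the basic principle that higher regularity means lower complexity.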
2.3. Sample entropy
The calculation procedures of SampEn are as follows [31]:
Steps 1 and 2: Phase-space reconstruction and distance calculation are the same as Steps 1 and 2 in the calculation of ApEn.
Step 3: Given the tolerance $r$, count the number of ${d}_{ij}^{m}<r$ with $j\ne i$ as ${B}_{i}$, where ${d}_{ij}^{m}$ denotes the distance between the $m$-dimensional vectors $x(i)$ and $x(j)$, and define ${B}_{i}^{m}\left(r\right)={(N-m-1)}^{-1}{B}_{i}$, where $i=1,2,\cdots ,N-m$. Calculate the mean value as ${B}^{m}\left(r\right)={(N-m)}^{-1}{\sum}_{i=1}^{N-m}{B}_{i}^{m}\left(r\right)$.
Step 4: Calculate ${B}^{m+1}\left(r\right)$ in the same way, and define $SampEn(m,r)=\underset{N\to \infty}{\lim}\left\{-\log \left[{B}^{m+1}(r)/{B}^{m}(r)\right]\right\}$. For a finite time series, SampEn is estimated by the statistic $SampEn(m,r,N)=-\log \left[{B}^{m+1}(r)/{B}^{m}(r)\right]$.
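The SampEn procedure can be sketched in the same spirit. The function name `sample_entropy` is hypothetical; the sketch assumes that the same $N-m$ template vectors are used for both lengths $m$ and $m+1$ and that self-matches are excluded. If no match of length $m+1$ is found, the logarithm diverges, which the sketch does not guard against.

```python
import numpy as np

def sample_entropy(u, m=2, k=0.2):
    u = np.asarray(u, dtype=float)
    n = len(u)
    r = k * np.std(u)                       # tolerance r = k * std(u)

    def match_fraction(mm):
        # N - m templates of length mm (mm = m or m + 1)
        x = np.array([u[i:i + mm] for i in range(n - m)])
        d = np.max(np.abs(x[:, None, :] - x[None, :, :]), axis=2)
        np.fill_diagonal(d, np.inf)         # exclude self-matches (i != j)
        return np.sum(d < r) / ((n - m) * (n - m - 1))

    b = match_fraction(m)                   # B^m(r)
    a = match_fraction(m + 1)               # B^{m+1}(r)
    return float(-np.log(a / b))
```

Because every match of length $m+1$ is also a match of length $m$, the ratio never exceeds one and the result is non-negative.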
2.4. Fuzzy entropy
FuzzyEn builds on ApEn by replacing the $\mathrm{\Theta}\left\{x\right\}$ used in ApEn’s third step. The Heaviside step function acts as a two-state classifier with a crisp boundary: a vector is judged either similar or not. FuzzyEn combines fuzzy theory with ApEn. It introduces a “membership degree” through a fuzzy function ${\mu}_{C}\left(x\right)$ that associates each point $x$ with a real number in the range [0, 1]; the higher ${\mu}_{C}\left(x\right)$ is, the higher the membership grade of $x$ in the relevant set. To calculate FuzzyEn, Chen et al. defined the fuzzy function $\mu ({d}_{ij}^{m},n,w)$ as $\exp (-{\left({d}_{ij}^{m}\right)}^{n}/w)$, where $w$ determines the width and the gradient of the boundary of the exponential function (it is generally set to 2), and used it in place of the Heaviside step function in ApEn [34]. The statistic FuzzyEn is then estimated by $FuzzyEn(m,r,w,N)={\varphi}^{m}(r,w)-{\varphi}^{m+1}(r,w)$. FuzzyEn also differs from ApEn in the distance calculation: each vector is generalized by removing its own baseline.
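A rough Python sketch of a fuzzy entropy calculation is given below. It rests on several assumptions that are not confirmed by the text above: the membership function is taken as $\exp (-{d}^{n}/r)$ with gradient $n=2$ and width $r=0.2$ std, self-comparisons are excluded, and the final value is taken as $\ln {\varphi}^{m}-\ln {\varphi}^{m+1}$ with ${\varphi}^{m}$ the average membership degree. The name `fuzzy_entropy` and all parameter names are hypothetical, and the mapping of the symbols $w$ and $r$ used in this paper onto these parameters may differ.

```python
import numpy as np

def fuzzy_entropy(u, m=2, k=0.2, n=2):
    u = np.asarray(u, dtype=float)
    N = len(u)
    r = k * np.std(u)                       # assumed width of the membership function

    def phi(mm):
        x = np.array([u[i:i + mm] for i in range(N - m)])
        x = x - x.mean(axis=1, keepdims=True)    # remove each vector's own baseline
        d = np.max(np.abs(x[:, None, :] - x[None, :, :]), axis=2)
        mu = np.exp(-(d ** n) / r)               # membership degree in [0, 1]
        np.fill_diagonal(mu, 0.0)                # exclude self-comparison
        return np.sum(mu) / ((N - m) * (N - m - 1))

    return float(np.log(phi(m)) - np.log(phi(m + 1)))
```

Unlike the Heaviside criterion, every pair of vectors contributes a graded similarity, so the estimate changes smoothly with the tolerance. A constant signal gives a zero width and is not handled by this sketch.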
2.5. Permutation entropy
The calculation steps of PermEn are as follows [33]:
Step 1: Phase-space reconstruction, the same as Step 1 in the calculation of ApEn.
Step 2: Arrange the elements of each reconstructed vector $X\left(i\right)=\left[u(i),\cdots ,u(i+m-1)\right]$ in increasing order. The $m$ values contained in each $X\left(i\right)$ can be arranged in increasing order as:
$$u(i+{j}_{1}-1)\le u(i+{j}_{2}-1)\le \cdots \le u(i+{j}_{m}-1).$$
Accordingly, any vector $X\left(i\right)$ can be mapped onto a sequence of symbols $S\left(l\right)=({j}_{1},{j}_{2},\cdots ,{j}_{m})$, where $l=1,2,\cdots ,k$ and $k\le m!$. $S\left(l\right)$ is one of the $m!$ permutations of the $m$ symbols $({j}_{1},{j}_{2},\cdots ,{j}_{m})$ in the $m$-dimensional embedding space. If ${P}_{1},{P}_{2},\cdots ,{P}_{k}$ are the probabilities of the distinct symbol sequences, with ${\sum}_{l=1}^{k}{P}_{l}=1$, then the PermEn of order $m$ of the time series is defined as the Shannon entropy of the $k$ symbol sequences:
$$PermEn(m)=-{\sum}_{l=1}^{k}{P}_{l}\log {P}_{l}.$$
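The two PermEn steps can be sketched as follows. The function name `permutation_entropy` is hypothetical; ties between equal values are broken by order of occurrence (a stable sort), which is an assumption of this sketch, and the defaults $m=6$, $\tau =1$ follow Table 1.

```python
from collections import Counter
import numpy as np

def permutation_entropy(u, m=6, tau=1):
    u = np.asarray(u, dtype=float)
    n_vec = len(u) - (m - 1) * tau
    # Map each reconstructed vector to its ordinal pattern (j_1, ..., j_m)
    patterns = Counter(
        tuple(np.argsort(u[i:i + m * tau:tau], kind="stable"))
        for i in range(n_vec)
    )
    # Relative frequency P_l of each pattern that actually occurs
    p = np.array(list(patterns.values()), dtype=float) / n_vec
    return float(-np.sum(p * np.log(p)))
```

A monotonically increasing ramp produces a single ordinal pattern and hence zero entropy, whereas white noise approaches the upper bound $\log (m!)$.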
2.6. Lempel-Ziv complexity
The calculation steps of LZC are as follows [29]:
Step 1: To calculate Lempel-Ziv complexity, the time series must first be coarse-grained. In general, the time series is converted into a sequence that contains only two symbols. The sequence ${S}_{N}$ is constructed by comparing each sample of the original series with the median value $m$ (or the mean value; we take the median by default). If the sample’s value is larger than $m$, it is mapped to 1, otherwise to 0.
Step 2: Given the two-symbol $N$-point sequence ${S}_{N}=\{{s}_{1}{s}_{2}\cdots {s}_{N}\}$, initialize ${S}_{v,0}=\{\,\}$, ${Q}_{0}=\{\,\}$, ${C}_{N}\left(0\right)=0$ and $r=1$, where ${S}_{v,r}=\{{s}_{1}{s}_{2}\cdots {s}_{r}\}$ denotes the symbols scanned so far. Set ${Q}_{r}=\left\{{Q}_{r-1}{s}_{r}\right\}$. Since ${Q}_{r}$ does not belong to ${S}_{v,r-1}$, set ${C}_{N}\left(r\right)={C}_{N}(r-1)+1$, ${Q}_{r}=\{\,\}$ and $r=r+1$.
Step 3: Set ${Q}_{r}=\left\{{Q}_{r-1}{s}_{r}\right\}$ and judge whether ${Q}_{r}$ belongs to ${S}_{v,r-1}$. If it does, set ${C}_{N}\left(r\right)={C}_{N}(r-1)$ and $r=r+1$; if not, set ${C}_{N}\left(r\right)={C}_{N}(r-1)+1$, ${Q}_{r}=\{\,\}$ and $r=r+1$. Repeat Step 3 until the end of the sequence, which yields ${C}_{N}\left(N\right)$.
Step 4: The length of the sequence has obvious influence on the ${C}_{N}\left(N\right)$. So, Lempel and Ziv gave a normalized LZC, which is defined as ${C}_{normalizedN}={C}_{N}\left(N\right)\times \mathrm{l}\mathrm{o}\mathrm{g}N/N$.
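The LZC steps can be sketched in Python as below. The sketch reads Step 3 as increasing the count only when the current word $Q$ is not found in the already scanned symbols, assumes a base-2 logarithm in the normalization, and does not count a word left unfinished at the end of the sequence (some implementations add one for it). The function names `binarize`, `lz_count` and `lz_complexity` are hypothetical.

```python
import numpy as np

def binarize(signal):
    """Step 1: coarse-grain with the median as threshold."""
    x = np.asarray(signal, dtype=float)
    med = np.median(x)
    return "".join("1" if v > med else "0" for v in x)

def lz_count(symbols):
    """Steps 2 and 3: count new words while scanning the symbol string."""
    count = 0
    q = ""
    for r in range(len(symbols)):
        q += symbols[r]                 # Q_r = {Q_{r-1} s_r}
        if q not in symbols[:r]:        # Q_r not found in S_{v,r-1}
            count += 1
            q = ""
    return count

def lz_complexity(signal):
    """Step 4: normalized count C_N(N) * log2(N) / N (base 2 assumed)."""
    s = binarize(signal)
    n = len(s)
    return lz_count(s) * np.log2(n) / n
```

For the 16-symbol string `0001101001000101`, the scan finds the words 0, 001, 10, 100 and 1000 and leaves 101 unfinished, so `lz_count` returns 5.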
2.7. ${\mathit{C}}_{0}$ complexity
The calculation steps of ${C}_{0}C$ are as follows [36]:
Step 1: Apply the fast Fourier transform (FFT) to the time series $s\left(t\right)$ to obtain the spectrum $x\left(k\right)=FFT\left(s(t)\right)$.
Step 2: Obtain ${G}_{N}$, the mean square value of the amplitude spectrum, and introduce a parameter $r$ ($r>1$). The part of the spectrum whose squared amplitude is larger than $r{G}_{N}$ is considered the regular component, and the remainder is considered the irregular part:
$$\tilde{x}(k)=\begin{cases}x(k), & {\left|x(k)\right|}^{2}>r{G}_{N},\\ 0, & {\left|x(k)\right|}^{2}\le r{G}_{N},\end{cases}\qquad {G}_{N}=\frac{1}{N}{\sum}_{k=0}^{N-1}{\left|x(k)\right|}^{2}.$$
Step 3: Transform the regular part $\tilde{x}\left(k\right)$ into $\tilde{s}\left(t\right)$ through the inverse FFT; $\tilde{s}\left(t\right)$ is the regular part of the original signal.
Step 4: Define ${C}_{0}C$ as the ratio of the energy of $\left|s(t)-\tilde{s}(t)\right|$ to that of $s\left(t\right)$, i.e. the ratio of the irregular part to the original signal:
$${C}_{0}C=\frac{{\sum}_{t=0}^{N-1}{\left|s(t)-\tilde{s}(t)\right|}^{2}}{{\sum}_{t=0}^{N-1}{\left|s(t)\right|}^{2}}.$$
Ref. [36] suggests that $r$ should be between 5 and 10; the choice of this parameter is discussed below.
The ${C}_{0}C$ of white noise computed with different values of $r$ is shown in Fig. 1. When $r>6$, the value of ${C}_{0}C$ is close to 1, which is expected because the ${C}_{0}C$ of a random time series should equal 1. Using the group of simulation signals from Section 3.1, we obtain four ${C}_{0}$ complexity curves with $r=$ 6, 10, 100 and 200, as shown in Fig. 2. If $r$ is too high, the ${C}_{0}$ complexity of a pseudo-periodic signal does not approach 0. In this paper, we set $r=\text{10}$ by default.
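The ${C}_{0}C$ steps can be sketched with NumPy’s FFT routines. This is a minimal illustration under stated assumptions: the threshold is applied to the squared amplitude spectrum, and both the regular and irregular parts are measured by their energy (sum of squares). The function name `c0_complexity` is hypothetical, and an all-zero signal is not handled.

```python
import numpy as np

def c0_complexity(s, r=10.0):
    s = np.asarray(s, dtype=float)
    x = np.fft.fft(s)                          # Step 1: spectrum x(k)
    power = np.abs(x) ** 2
    g = power.mean()                           # Step 2: G_N, mean square of the spectrum
    x_reg = np.where(power > r * g, x, 0)      # keep only the dominant spectral lines
    s_reg = np.fft.ifft(x_reg).real            # Step 3: regular part of the signal
    # Step 4: energy of the irregular part relative to the whole signal
    return float(np.sum((s - s_reg) ** 2) / np.sum(s ** 2))
```

With $r=10$, a sine wave containing a whole number of periods gives a value close to 0, and white noise gives a value close to 1, which agrees with the behavior described for Fig. 1.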
Fig. 1. The C0C value with different r
Fig. 2. The curves of the C0C with different SNRs and r
Table 1. Summary of the selected parameter values
Complexity  Parameter  Value 
ApEn  Embedding dimension $m$  2 
ApEn  Tolerance $r$  0.2 std 
ApEn  Delay time $\tau $  1 
SampEn  Embedding dimension $m$  2 
SampEn  Tolerance $r$  0.2 std 
SampEn  Delay time $\tau $  1 
FuzzyEn  Embedding dimension $m$  2 
FuzzyEn  Tolerance $r$  0.2 std 
FuzzyEn  Parameter $w$  2 
FuzzyEn  Delay time $\tau $  1 
PermEn  Embedding dimension $m$  6 
PermEn  Delay time $\tau $  1 
LZC  Parameter $m$  median value 
${C}_{0}C$  Parameter $r$  10 
From the procedures of the seven complexities, we can see that ApEn derives from the correlation dimension; the two are similar in form and calculation. SampEn and FuzzyEn are two modifications of ApEn in different aspects. Although PermEn is somewhat similar in form to ApEn, e.g. in its phase-space reconstruction, it calculates complexity in the ShEn form. Thus, we classify the complexities into three categories: the first comprises ApEn, SampEn and FuzzyEn; the second comprises PermEn and ShEn; and the third comprises LZC and ${C}_{0}C$, which have threshold parameters for coarse-graining or for the boundary between periodicity and randomness. To compare the performance of the seven complexities, the influence of the parameters needs to be reduced, so the parameters they share should be consistent. The parameter settings are within the values recommended in the original references. It is suggested that the tolerance of ApEn, SampEn and FuzzyEn be set to 0.1-0.25 std and the embedding dimension to 2 or 3; we set them to 0.2 std and 2. Note that although PermEn has a phase-space reconstruction procedure of the same form as that of ApEn, SampEn and FuzzyEn, it is essentially different. For PermEn, the vectors obtained from the phase-space reconstruction are not compared with one another; the comparisons take place within each vector. For the other three, comparisons are made between vectors to calculate the distance. So, the embedding dimension $m$ of PermEn differs from the others. A large $m$ greatly increases the calculation time, and we set $m=\text{6}$.
3. Performance of complexities in simulation signals
3.1. Performance in sinusoidal signals with additive noise
To assess the performance of the different complexities for rolling bearings, the first and most important test is on periodic signals with noise of different intensities, the sine wave being the simplest periodic signal. We define a class of simulation signals as $S\left(t\right)=X\left(t\right)+e\left(t\right)$, where $X\left(t\right)=\sin (2\pi \times 10t)$ and $e\left(t\right)$ represents the additive noise. The sampling frequency is 10000 Hz and the duration is 1 s. Fig. 3 shows the complexities at different SNRs. For ease of comparison, all the complexities have been normalized.
Fig. 3. The curves of the seven complexities versus SNR
All the complexities clearly tend to decrease as the additive noise decreases, which is the expected behavior. Among the seven, ShEn and PermEn perform worst, since they do not show a good monotonic tendency. The remaining complexities present good monotonicity. Among them, ApEn, the rightmost curve, is the most sensitive to the noise, and ${C}_{0}C$, the leftmost curve, is the least sensitive to the noise.
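The simulation signal of this subsection can be generated as in the sketch below. The type of noise and the SNR definition are not specified above, so the sketch assumes white Gaussian noise and an SNR defined as the ratio of signal power to noise power in decibels; the function name `noisy_sine` and its parameters are hypothetical.

```python
import numpy as np

def noisy_sine(snr_db, fs=10_000, duration=1.0, f0=10.0, seed=0):
    """Return t, X(t) = sin(2*pi*f0*t) and S(t) = X(t) + e(t)."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * duration)) / fs
    x = np.sin(2 * np.pi * f0 * t)
    # Noise power chosen so that 10*log10(P_signal / P_noise) = snr_db
    noise_power = np.mean(x ** 2) / (10 ** (snr_db / 10))
    e = rng.normal(0.0, np.sqrt(noise_power), size=t.size)
    return t, x, x + e
```

Sweeping `snr_db` over a range of values and evaluating each complexity on the returned signal yields curves of the kind compared in Fig. 3.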
3.2. Performance in logistic map
In Section 3.1, we discussed the performance of the seven complexities on a specific group of simulation signals. In this part, we use more general simulation signals. The logistic map is a convenient platform because it can generate both periodic and chaotic signals. Bearings are nonlinear components, and many studies have reported nonlinear phenomena such as chaos, bifurcations and quasi-periodicity in bearings [38-40]. The logistic map is a simple nonlinear system that can be written as ${x}_{n+1}=\mu {x}_{n}(1-{x}_{n})$. Fig. 4 shows the behavior of the logistic map for $2.5<\mu <4$, together with its largest Lyapunov exponent (LLE). $LLE<0$ means the system is periodic, $LLE=0$ means a bifurcation occurs, and $LLE>0$ means chaotic behavior occurs. In the logistic map, the period-2 bifurcation happens at $\mu =3$ and the period-4 bifurcation at $\mu =\text{3.449}$. After $\mu =\text{3.83}$, there is a short period-3 window.
Similarly, we calculated the seven complexities of the logistic map; the results are shown in Fig. 5. The LLE serves as a reference for the seven complexities, whose trend should be similar to that of the LLE when $LLE>0$. We define $LL{E}_{1}$ such that $LL{E}_{1}=LLE$ when $LLE>0$ and $LL{E}_{1}=0$ when $LLE\le 0$. It is observed that PermEn has quite a few fluctuations before $\mu =\text{3.449}$, and ShEn has some plateaus before $\mu =\text{3.449}$. FuzzyEn deviates around $\mu =\text{3.5}$ and before $\mu =\text{3}$; the reason presumably lies in the fuzzy membership function of FuzzyEn. LZC and ${C}_{0}C$ give wrong values around $\mu =\text{3.6}$, where chaotic behavior exists but both have values close to zero. All seven exhibit the short period-3 window after $\mu =\text{3.83}$. As the figure shows, ApEn and SampEn almost coincide with $LL{E}_{1}$; in this part, ApEn and SampEn have the best performance. It should be noted that the application scope of the LLE is narrow: it can be calculated only for deterministic chaotic systems and cannot be computed from an arbitrary time series.
Fig. 5. The seven complexities of the logistic map
a) ShEn, PermEn and SampEn versus $LL{E}_{1}$
b) ApEn, FuzzyEn, LZC and ${C}_{0}C$ versus $LL{E}_{1}$
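The logistic map and its LLE can be reproduced with a short sketch. For a one-dimensional map the LLE is the orbit average of $\log |f'({x}_{n})|$, and for the logistic map $f'(x)=\mu (1-2x)$. The function names, the initial value 0.4 and the number of discarded transient iterations are illustrative assumptions.

```python
import numpy as np

def logistic_series(mu, n=2000, x0=0.4, discard=500):
    """Iterate x_{n+1} = mu * x_n * (1 - x_n), dropping the transient."""
    x = x0
    out = np.empty(n)
    for i in range(discard + n):
        x = mu * x * (1.0 - x)
        if i >= discard:
            out[i - discard] = x
    return out

def largest_lyapunov(mu, n=2000, x0=0.4, discard=500):
    """Orbit average of log|f'(x)| with f'(x) = mu * (1 - 2x)."""
    x = logistic_series(mu, n, x0, discard)
    return float(np.mean(np.log(np.abs(mu * (1.0 - 2.0 * x)))))

def lle_1(mu):
    """LLE_1 as defined above: LLE where it is positive, 0 otherwise."""
    return max(largest_lyapunov(mu), 0.0)
```

For $\mu =2.8$ the orbit settles on a fixed point where $|f'|=0.8$, so the LLE equals $\log 0.8<0$, whereas in the chaotic regime, e.g. $\mu =3.9$, the LLE is positive.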
3.3. Performance in rate of convergence
The length of the data can affect the complexity values too. Results computed from data too short for the complexities to converge are inaccurate. In this part, we examine the rate of convergence of the seven complexities, which is a supplementary aspect of their performance. To study the influence of data length on the seven complexities, we use the simulation signal from Section 3.1 with SNR = -10 dB as an example. The length ranges from 100 to 4000 points, as illustrated in Fig. 6.
Fig. 6. The curves of the seven complexities versus data length
As we can see, PermEn and ShEn have a similar increasing convergence trend, and their values converge after 2000 data points. SampEn, FuzzyEn, LZC and ${C}_{0}C$ have a similar convergence trend: their values decrease with fluctuations. ApEn differs from the above and increases with fluctuations. From the comparison of convergence rates, ${C}_{0}C$ is the best, since it converges at about 1000 data points, while ShEn and PermEn are the worst. For accurate and interpretable results, the length of each data record should exceed 2000 points.
3.4. Brief summary of comparisons of simulation signals
From the three comparisons on simulation signals above, we can draw some brief conclusions. First, every complexity conforms to the basic principle, i.e. the higher the regularity, the lower the complexity value. Among them, the performances of ShEn and PermEn are unsatisfactory. Take ShEn as an example: the average filling of the histogram ($k/N$) must be set. Although this can improve the anti-noise performance to a certain extent, the complexity still has inherent problems; for instance, it is relatively insensitive to changes in the tails of the distribution and converges slowly. Similarly, LZC and ${C}_{0}C$ have inherent problems too. Coarse-graining may change the dynamic properties of the original time series. As for ${C}_{0}C$, defining a threshold as the boundary between periodicity and randomness lacks rigor. By comparison, ApEn and SampEn perform better than the others. Since SampEn is a modification of ApEn, we consider that SampEn has the best performance among them. Thus, we take SampEn as the benchmark of the seven complexities.
4. Comparison of complexities on bearings’ run-to-failure data
In Section 3, we used three tests to judge different aspects of the performance of the seven complexities on simulation signals. In this section, we discuss the seven complexities on real signals, i.e. two sets of bearing run-to-failure data. The failure in Example 1 is an inner race fault; the other is an outer race fault.
4.1. Example 1 (inner race fault)
Example 1 is taken from the IEEE PHM 2012 Prognostics Challenge data, which were provided by the FEMTO-ST Institute [41]. The purpose of the challenge was to estimate the remaining useful life of bearings. The FEMTO-ST Institute built an experimentation platform named PRONOSTIA, shown in Fig. 7. In this challenge, the data were recorded under 3 different loads. There were 6 complete run-to-failure data sets for training and 11 truncated data sets for predicting the remaining useful life. Tests were stopped when the vibration signal reached 20 g. However, the failure type of the bearings is not given. In this paper, the first data set, which contains 2803 files, is taken as Example 1. The parameters of the test and bearings are listed in Table 2.
Fig. 7. Overview of PRONOSTIA
Since the failure type of Example 1 is unknown, we study its envelope spectrum. The envelope spectrum of the last file of Example 1 is shown in Fig. 8. Calculating the characteristic frequencies of the test gives a ball pass frequency on the inner race of 221.66 Hz, a ball pass frequency on the outer race of 168.34 Hz and a fundamental train frequency of 12.95 Hz. In Fig. 8, there is a peak at 218.8 Hz, close to the ball pass frequency on the inner race. So, we can infer that the final failure type of Example 1 is an inner race fault.
Table 2. The parameters of the test and bearings
Pitch diameter ($D$)  Number of rolling elements ($Z$)  Bearing’s rolling element diameter ($d$)  Sample frequency (${f}_{s}$)  Sample length ($L$)  Record frequency  Operating condition of Example 1 
25.6 mm  13  3.5 mm  25.6 kHz  25600  10 s  1800 rpm and 4000 N 
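The characteristic frequencies quoted for Example 1 can be checked against the parameters in Table 2 with the standard kinematic formulas for a bearing with a stationary outer race. The sketch assumes a zero contact angle, which is not stated in Table 2; the function name `bearing_frequencies` is hypothetical.

```python
import math

def bearing_frequencies(shaft_hz, n_balls, ball_d, pitch_d, contact_angle_deg=0.0):
    """Return (BPFI, BPFO, FTF) in Hz for a stationary outer race."""
    ratio = (ball_d / pitch_d) * math.cos(math.radians(contact_angle_deg))
    bpfi = 0.5 * n_balls * shaft_hz * (1.0 + ratio)   # ball pass frequency, inner race
    bpfo = 0.5 * n_balls * shaft_hz * (1.0 - ratio)   # ball pass frequency, outer race
    ftf = 0.5 * shaft_hz * (1.0 - ratio)              # fundamental train frequency
    return bpfi, bpfo, ftf

# Table 2: Z = 13, d = 3.5 mm, D = 25.6 mm, 1800 rpm = 30 Hz
bpfi, bpfo, ftf = bearing_frequencies(1800 / 60, 13, 3.5, 25.6)
print(f"{bpfi:.2f} {bpfo:.2f} {ftf:.2f}")  # prints "221.66 168.34 12.95"
```

Under the zero contact angle assumption, the three values agree with the 221.66 Hz, 168.34 Hz and 12.95 Hz given in the text.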
Fig. 8. The envelope spectrum of the last file of Example 1
The seven complexities of Example 1 are shown in Fig. 9. All seven present a downward trend and give consistent results, although ShEn fluctuates more. Among them, ApEn, SampEn and LZC appear the most similar. Take SampEn as an example: before #500 (where # denotes the file number), the complexity increases to a peak and then decreases for a long time. As the figure shows, a local peak appears at about #2400 to #2600. The rise of the complexities before #500 occurs because the oil film is not yet fully formed. Once it has formed, the decrease of the complexities indicates that the fault of the bearing is deepening. The peak at about #2400 to #2600 probably arises because the pitting on the surface is smoothed by the rotation, so the signal becomes less periodic. New cracks then appear and make the signal more periodic, so the complexities drop again.
The complexities present a kind of similarity that can be explained with Fig. 3. Regard the curves in Fig. 3 as seven monotonically decreasing functions ${y}_{i}\left(x\right)={f}_{i}\left(x\right)$, $i=1,2,\cdots ,7$, which have a similar trend. Take each data file as $x$ ($x=1,2,\cdots ,2803$) and substitute it into ${y}_{i}\left(x\right)={f}_{i}\left(x\right)$. If the functions ${f}_{i}\left(x\right)$ are similar, then the results are similar too. To quantify the similarity, we propose a trend similarity index (TSI). Many similarity indexes are based on distance measures, e.g. Manhattan distance, Euclidean distance and Kullback-Leibler divergence, or on correlation analysis, e.g. the Pearson correlation coefficient. None of them measures the trend similarity of two sequences. To measure the trend of a time series, the first thing that comes to mind is to analyze its derivative. The complexity curves must first be smoothed (deburred), for which we use a smoothing spline. Fig. 10 shows the SampEn smoothed by a smoothing spline with the smoothing parameter $s$ equal to 1e6, 1e7 and 1e8. As we can see, when $s=$1e6, the fitted SampEn still fluctuates a little; when $s=$1e8, the smoothed SampEn does not fit well at the end of the data.
Fig. 9. The seven complexities of Example 1
a) ShEn and PermEn
b) ApEn
c) SampEn
d) FuzzyEn
e) LZC
f) ${C}_{0}C$
To define the trend similarity of two time series of the same length, a direct way is to consider their derivatives. Fig. 11(a) shows three functions, ${y}_{1}={x}^{2}$, ${y}_{2}=x$ and ${y}_{3}=\sqrt{x}$. We can consider that they have the same trend because all of their derivatives are positive. Fig. 11(b) shows two functions, ${y}_{4}={x}^{2}$ and ${y}_{5}={x}^{3}+1$. They have the same trend where $x>0$ and different trends where $x<0$. The TSI is defined as the ratio of the length over which the trends agree to the entire length, so the TSI of ${y}_{4}$ and ${y}_{5}$ is 50 %. In Section 3, we concluded that SampEn presents the best performance of the seven complexities on the simulation signals. We therefore calculate the TSI of each complexity relative to the SampEn deburred by the smoothing spline, as shown in Table 3. The results clearly show that ApEn has extremely high similarity to SampEn, while ShEn and PermEn have relatively low similarity to it. Although the fitting parameter affects the TSI values, this is no obstacle to judging which complexity is more similar to SampEn.
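The definition above can be sketched in Python as follows. This is a minimal illustration which assumes that the derivative is approximated by first-order differences of already deburred, equally sampled series, and that two series share a trend on a step when both differences have the same sign. The function name `trend_similarity_index` and the sampling grid are hypothetical and are not taken from the original implementation.

```python
def trend_similarity_index(series_a, series_b):
    """Share of steps on which two equal-length series move in the same direction."""
    if len(series_a) != len(series_b) or len(series_a) < 2:
        raise ValueError("need two series of equal length with at least two points")
    steps_a = [b - a for a, b in zip(series_a, series_a[1:])]
    steps_b = [b - a for a, b in zip(series_b, series_b[1:])]
    same = sum(1 for da, db in zip(steps_a, steps_b) if da * db > 0)
    return same / len(steps_a)


# Functions of Fig. 11(b) sampled on a symmetric grid; the grid is an assumption.
x = [k / 100 for k in range(-100, 101)]
y4 = [v ** 2 for v in x]
y5 = [v ** 3 + 1 for v in x]
tsi = trend_similarity_index(y4, y5)
print(f"TSI = {tsi:.0%}")
```

On this symmetric grid the two functions agree in trend on exactly half of the steps, which reproduces the 50 % stated for ${y}_{4}$ and ${y}_{5}$.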
Fig. 10. The SampEn of Example 1 deburred by smoothing spline with the three smoothing parameters
a) The smoothing parameter equal to 1e6
b) The smoothing parameter equal to 1e7
c) The smoothing parameter equal to 1e8
Fig. 11. The example of trend similarity
a) ${y}_{1}$, ${y}_{2}$ and ${y}_{3}$
b) ${y}_{4}$ and ${y}_{5}$
Table 3. The TSI of each complexity based on the SampEn denoised by smoothing spline with different smoothing parameters
Complexity  TSI ($s=$1e6)  TSI ($s=$1e7)  TSI ($s=$1e8) 
ApEn  98.39 %  99.50 %  98.64 % 
FuzzyEn  91.75 %  97.25 %  93.54 % 
ShEn  82.22 %  86.68 %  89.75 % 
PermEn  71.87 %  69.44 %  69.69 % 
LZC  85.58 %  91.75 %  91.57 % 
${C}_{0}C$  85.26 %  91.86 %  93.72 % 
Intuitively, the most direct way to reflect the degradation caused by impacts is the amplitude of the defect frequency (ADF). The peaks in the envelope spectrum of the impacts of Example 1 are calculated and make up the ADF shown in Fig. 12. As we can see, the ADF fluctuates only slightly before #2750 and then increases quickly. Strictly speaking, the ADF measures the energy of the impacts, which is only a portion of the whole energy of the vibration signal. The root mean square (RMS) is a commonly used feature that measures the overall energy of the signal. The RMS of Example 1 is shown in Fig. 13 together with SampEn. The increase of the RMS indicates that the deterioration is deepening. We can see that the RMS and SampEn have opposite trends. When pitting occurs on the rubbing surface, the energy becomes concentrated and the signal becomes more periodic, so the complexities go down. However, near failure the RMS increases quickly, while SampEn shows no sudden decrease. This can be explained as follows: when the bearing is close to failure, there are more pits on the rubbing surface. Each pit can cause a periodic signal, but the combined signal is not as periodic as the one caused by a single pit. So, the RMS and SampEn measure different properties of the signal.
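For reference, the RMS of one vibration record can be computed as in the following minimal Python sketch. The two records are hypothetical (a sine wave without and with added impacts) and only illustrate that added impact energy raises the RMS. They are not data from Example 1.

```python
import math


def rms(samples):
    """Root mean square of one vibration record."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


# Hypothetical records: the same sine wave, without and with added impacts.
healthy = [math.sin(2 * math.pi * k / 64) for k in range(1024)]
faulty = [v + (3.0 if k % 128 == 0 else 0.0) for k, v in enumerate(healthy)]

print(rms(healthy), rms(faulty))
```

The RMS responds to the added energy regardless of how periodic the signal is, which is consistent with the observation that the RMS and SampEn capture different properties.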
Fig. 12. The ADF of Example 1
Fig. 13. The RMS and SampEn of Example 1
4.2. Example 2 (outer race fault)
Another run-to-failure data set is used to verify the performance of the seven complexities. The data come from the Intelligent Maintenance System (IMS) center [42]. The test rig carries four bearings on one shaft, as shown in Fig. 14. The bearings are Rexnord ZA-2115 double row bearings. An accelerometer was installed on the test rig to monitor the vibration signal of the bearings. The parameters of the test and bearings are shown in Table 4. Bearing 2-1, the first bearing of Set No. 2, is used as Example 2. The failure type of Example 2 is an outer race defect, as shown in Fig. 15. Example 2 has 982 files.
Table 4. The parameters of the test and bearings
Pitch diameter ($D$)  Number of rolling elements ($Z$)  Rolling element diameter ($d$)  Sampling frequency (${f}_{s}$)  Sample length ($L$)  Recording interval  Operating condition of Example 2 
2.815 inches  16  0.331 inches  20 kHz  20480  10 min  2000 rpm and 6000 lbs 
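For reference, the outer race defect frequency whose amplitude the ADF tracks can be estimated from the quantities $D$, $d$ and $Z$ in Table 4 with the standard kinematic relation for a bearing with a stationary outer race. The shaft frequency ${f}_{r}$ and the contact angle $\alpha$ are symbols introduced here and are not listed in Table 4: ${f}_{r}$ follows from the 2000 rpm operating speed, and $\alpha$ is assumed to be taken from the bearing data sheet.

```latex
f_{\mathrm{BPFO}} = \frac{Z}{2}\, f_{r} \left( 1 - \frac{d}{D}\cos\alpha \right),
\qquad
f_{r} = \frac{2000\ \mathrm{rpm}}{60} \approx 33.3\ \mathrm{Hz}
```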
The complexities of Example 2 are shown in Fig. 16, and they again follow a similar trend. We use SampEn as an example. Before #520, it is stationary and the bearing is in normal condition. From #520 to #700, the complexity decreases linearly and the bearing is in a slight defect condition. From #700 to #850, there is a peak and the condition is severe. From #900 to the failure, SampEn increases as the bearing runs to failure. On the whole, SampEn decreases, which is the same conclusion as for Example 1. We can compare the RMS and ADF of Example 2 to explore why the complexities follow this trend.
Fig. 14. The test rig of Example 2
Fig. 15. The failure type of Example 2
Fig. 16. The seven complexities of Example 2
a) ShEn and PermEn
b) ApEn
c) SampEn
d) FuzzyEn
e) LZC
f) ${C}_{0}C$
Fig. 17 and Fig. 18 show the RMS and ADF of Example 2. The trends of the RMS and ADF are similar. Before #520, the RMS is stationary, and it then increases linearly up to #700. Between #700 and #850, the RMS first decreases and then increases. This is the so-called “healing” phenomenon, which has been described in detail in Refs. [7, 43, 44]. From #900 to the end, the operating environment of the bearing becomes violent, and the RMS rises quickly until failure.
Taking SampEn as the baseline, we can also obtain the TSI of each complexity, as shown in Table 5. Before the TSI is calculated, the SampEn must be deburred. Fig. 19 shows the SampEn deburred by the smoothing spline with the smoothing parameter $s$ equal to 1e3, 1e4 and 1e5. The results show that ApEn still has by far the highest similarity to SampEn.
Fig. 17. The RMS of Example 2
Fig. 18. The ADF of Example 2
Fig. 19. The SampEn of Example 2 deburred by smoothing spline with the three smoothing parameters
a) The smoothing parameter equal to 1e3
b) The smoothing parameter equal to 1e4
c) The smoothing parameter equal to 1e5
Table 5. The TSI of each complexity based on the SampEn denoised by smoothing spline with different smoothing parameters
Complexity  TSI ($s=$1e3)  TSI ($s=$1e4)  TSI ($s=$1e5) 
ApEn  89.69 %  92.76 %  90.92 % 
FuzzyEn  69.59 %  74.39 %  69.59 % 
ShEn  60.20 %  61.94 %  57.45 % 
PermEn  64.80 %  70.82 %  63.67 % 
LZC  67.14 %  69.90 %  67.86 % 
${C}_{0}C$  61.63 %  55.92 %  59.39 % 
5. Discussion
We have now completed the comparisons of the seven complexities. It can be seen that complexity is a reliable, robust feature of rolling bearings’ degradation that can reflect the stage of the degradation process. For Example 1, we can look further at the ADF and RMS curves. The ADF shows nearly no significant change for a long time until failure, which is a common phenomenon for inner race faults. The RMS seems to be better than the ADF because it increases slightly until failure. We have plotted the RMS curves of the seventeen bearings of the IEEE PHM 2012 Prognostics Challenge data. Most of them manifest an increasing trend, but the others show no regularity. The complexity feature, however, is bound to have a decreasing trend. As the degradation deepens, a defect must occur, whatever its type. The defect generates a defect frequency, which decreases the randomness. In addition, there are many defect types: a punctate (point-like) defect can trigger a characteristic frequency, whereas a flaky one hardly can. In Refs. [17-19], Yan et al. consistently concluded that the complexities increase as time elapses. By carefully studying these references, we found that the key reason lies in the experiment: Yan et al. cut a slot in the bearing beforehand, which changed the degradation process drastically.
In this paper, we have proposed a trend similarity index. The similarity can be seen in the charts of the two run-to-failure data sets. However, how to define the “trend” of a time series still needs further study. In our work, we tried every fitting method in the Curve Fitting Toolbox of MATLAB and found that the smoothing spline gives good results, but how to select the smoothing parameter $s$ remains unsolved. Furthermore, by finding that complexities are a good family feature of rolling bearings’ degradation, this paper gives a direction for the dimensionality reduction of multi-features: a family of features can be represented by one good member of it, which actively reduces the number of dimensions.
6. Conclusions
In this paper, we have discussed and compared seven commonly used randomness complexities on simulation signals and real signals. Through the comparisons, we have found the similarity of the complexities and explained it. In addition, we have defined a trend similarity index to measure the similarity of different complexities in run-to-failure bearing data. Finally, we conclude that randomness complexity is a family feature of rolling bearings’ degradation. The complexities are similar to one another. Among the seven, SampEn has the best performance and can serve as a representative of the complexity family.
References

[1] Zhang B., Georgoulas G., Orchard M., Saxena A. Rolling element bearing feature extraction and anomaly detection based on vibration monitoring. Mediterranean Conference on Control and Automation, 2008, p. 1792-1797.
[2] Boškoski P., Gašperin M., Petelin D., Juričić Đ. Bearing fault prognostics using Rényi entropy based features and Gaussian process models. Mechanical Systems and Signal Processing, Vol. 52-53, 2015, p. 327-337.
[3] Liao Z., Song L., Chen P., Zuo S. An automatic filtering method based on an improved genetic algorithm – with application to rolling bearing fault signal extraction. IEEE Sensors Journal, Vol. 17, 2017, p. 6340-6349.
[4] Zhang B., Sconyers C., Byington C., Patrick R., Orchard M. E., Vachtsevanos G. A probabilistic fault detection approach: application to bearing fault detection. IEEE Transactions on Industrial Electronics, Vol. 58, 2011, p. 2011-2018.
[5] Wang Y., Xiang J., Markert R., Liang M. Spectral kurtosis for fault detection, diagnosis and prognostics of rotating machines: a review with applications. Mechanical Systems and Signal Processing, Vol. 66-67, 2016, p. 679-698.
[6] Randall R. B., Antoni J. Rolling element bearing diagnostics – a tutorial. Mechanical Systems and Signal Processing, Vol. 25, 2011, p. 485-520.
[7] El Thalji I., Jantunen E. A summary of fault modelling and predictive health monitoring of rolling element bearings. Mechanical Systems and Signal Processing, Vol. 60-61, 2015, p. 252-272.
[8] Jardine A. K. S., Lin D., Banjevic D. A review on machinery diagnostics and prognostics implementing condition-based maintenance. Mechanical Systems and Signal Processing, Vol. 20, 2006, p. 1483-1510.
[9] Tandon N., Choudhury A. A review of vibration and acoustic measurement methods for the detection of defects in rolling element bearings. Tribology International, Vol. 32, 1999, p. 469-480.
[10] Borghesani P., Pennacchi P., Chatterton S. The relationship between kurtosis and envelope-based indexes for the diagnostic of rolling element bearings. Mechanical Systems and Signal Processing, Vol. 43, 2014, p. 25-43.
[11] Rapp P. E., Schmah T. Complexity measures in molecular psychiatry. Molecular Psychiatry, Vol. 1, 1996, p. 408-416.
[12] Rapp P. E., Schmah T. I. Dynamical Analysis in Clinical Practice. Proceedings of the Workshop Chaos in Brain, 2000, p. 52-62.
[13] Zhao S. F., Liang L., Xu G. H., Wang J., Zhang W. M. Quantitative diagnosis of a spall-like fault of a rolling element bearing by empirical mode decomposition and the approximate entropy method. Mechanical Systems and Signal Processing, Vol. 40, 2013, p. 154-177.
[14] Yang Y., Yu D., Cheng J. Roller bearing fault diagnosis method based on EMD energy entropy and ANN. Journal of Sound and Vibration, Vol. 294, 2006, p. 269-277.
[15] Zheng J., Cheng J., Yang Y. A rolling bearing fault diagnosis approach based on LCD and fuzzy entropy. Mechanism and Machine Theory, Vol. 70, 2013, p. 441-453.
[16] Pan Y. N., Chen J., Li X. L. Spectral entropy: a complementary index for rolling element bearing performance degradation assessment. Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science, Vol. 223, 2009, p. 1223-1231.
[17] Yan R., Gao R. X. Complexity as a measure for machine health evaluation. IEEE Transactions on Instrumentation and Measurement, Vol. 53, 2004, p. 1327-1334.
[18] Yan R., Gao R. X. Approximate entropy as a diagnostic tool for machine health monitoring. Mechanical Systems and Signal Processing, Vol. 21, 2007, p. 824-839.
[19] Yan R., Liu Y., Gao R. X. Permutation entropy: a nonlinear statistical measure for status characterization of rotary machines. Mechanical Systems and Signal Processing, Vol. 29, 2007, p. 474-484.
[20] Javed K., Gouriveau R., Zerhouni N., Nectoux P. Enabling health monitoring approach based on vibration data for accurate prognostics. IEEE Transactions on Industrial Electronics, Vol. 62, 2014, p. 647-656.
[21] Deng W., Zhang S., Zhao H., Yang X. A novel fault diagnosis method based on integrating empirical wavelet transform and fuzzy entropy for motor bearing. IEEE Access, Vol. 6, 2018, p. 35042-35056.
[22] Zhao H., Sun M., Deng W., Yang X. A new feature extraction method based on EEMD and multiscale fuzzy entropy for motor bearing. Entropy, Vol. 19, 2017, p. 1-14.
[23] Li H., Wang Y., Wang B., Sun J., Li Y. The application of a general mathematical morphological particle as a novel indicator for the performance degradation assessment of a bearing. Mechanical Systems and Signal Processing, Vol. 82, 2017, p. 490-502.
[24] Wang B., Hu X., Li H. Rolling bearing performance degradation condition recognition based on mathematical morphological fractal dimension and fuzzy C-means. Measurement, Vol. 109, 2017, p. 1-8.
[25] Shannon C. E. A mathematical theory of communication. Bell System Technical Journal, Vol. 27, 1948, p. 3-55.
[26] Renyi A. On Measures of Information and Entropy. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, 1978, p. 547-561.
[27] Kolmogorov A. N. On tables of random numbers. Theoretical Computer Science, Vol. 207, 1963, p. 369-376.
[28] Leung-Yan-Cheong S., Cover T. Some equivalences between Shannon entropy and Kolmogorov complexity. IEEE Transactions on Information Theory, Vol. 24, 1978, p. 331-338.
[29] Lempel A., Ziv J. On the complexity of finite sequences. IEEE Transactions on Information Theory, Vol. 22, 1976, p. 75-81.
[30] Pincus S. M. Approximate entropy as a measure of system complexity. Proceedings of the National Academy of Sciences of the United States of America, Vol. 88, 1991, p. 2297.
[31] Richman J. S., Moorman J. R. Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology Heart and Circulatory Physiology, Vol. 278, 2000, p. H2039.
[32] Costa M., Goldberger A. L., Peng C. K. Multiscale entropy analysis of biological signals. Physical Review E Statistical Nonlinear and Soft Matter Physics, Vol. 71, 2005, p. 021906.
[33] Bandt C., Pompe B. Permutation entropy: a natural complexity measure for time series. Physical Review Letters, Vol. 88, 2002, p. 174102.
[34] Chen W., Wang Z., Xie H., Yu W. Characterization of surface EMG signal based on fuzzy entropy. IEEE Transactions on Neural Systems and Rehabilitation Engineering, Vol. 15, 1998, p. 267-272.
[35] Fang C., Fanji G. A new measurement of complexity for studying EEG mutual information. Acta Biophysica Sinica, Vol. 14, 1998, p. 508-512, (in Chinese).
[36] Zhijie C., Jie S. Modified C0 complexity and applications. Journal of Fudan University (Natural Science), Vol. 47, 2008, p. 791-796.
[37] Zhijie C., Jie S. Convergence of C0 complexity. International Journal of Bifurcation and Chaos, Vol. 19, 2011, p. 977-992.
[38] Mevel B., Guyader J. L. Routes to chaos in ball bearings. Journal of Sound and Vibration, Vol. 162, 2007, p. 471-487.
[39] Tiwari M., Gupta K., Prakash O. Effect of radial internal clearance of a ball bearing on the dynamics of a balanced horizontal rotor. Journal of Sound and Vibration, Vol. 238, 2000, p. 723-756.
[40] Yuan W., Liu S. Numerical analysis of the dynamic behavior of a rotor-bearing-brush seal system with bristle interference. Journal of Mechanical Science and Technology, Vol. 33, Issue 8, 2019, p. 3895-3903.
[41] Nectoux P., Gouriveau R., Medjaher K., Ramasso E., Morello B., Zerhouni N., et al. Pronostia: an experimental platform for bearings accelerated life test. IEEE International Conference on Prognostics and Health Management, Denver, USA, 2012.
[42] Qiu H., Lee J., Lin J., Yu G. Wavelet filter-based weak signature detection method and its application on rolling element bearing prognostics. Journal of Sound and Vibration, Vol. 289, 2006, p. 1066-1090.
[43] Williams T., Ribadeneira X., Billington S., Kurfess T. Rolling element bearing diagnostics in run-to-failure lifetime testing. Mechanical Systems and Signal Processing, Vol. 15, 2001, p. 979-993.
[44] El Thalji I., Jantunen E. A descriptive model of wear evolution in rolling bearings. Engineering Failure Analysis, Vol. 45, 2014, p. 204-224.
Acknowledgements
We would like to acknowledge the support of the National Natural Science Foundation of China (Grant No. 51541506). We also thank the IEEE Reliability Society, the FEMTO-ST Institute and the Center for Intelligent Maintenance Systems, University of Cincinnati, for providing the experimental data.