Artificial Immune System via Euclidean Distance Minimization for Anomaly Detection in Bearings

In recent years, new diagnostic methodologies have emerged, with particular interest in machinery operating in non-stationary conditions. Continuous speed changes and variable loads make the application of frequency analysis techniques non-trivial. A clear example comes from the spectrum analysis of ball bearings supporting a rotating shaft: it is well known that damage in the bearing gives rise to a characteristic fault frequency in the vibration spectrum, and that this frequency is proportional to the rotational speed of the shaft. A variable shaft speed therefore means a variable characteristic fault frequency, which is no longer recognizable in the spectrum. To overcome this problem the scientific community has proposed different approaches, which fall into two main categories: model-based approaches and expert systems. Expert systems include, among others, Artificial Neural Networks, Support Vector Machines and Artificial Immune Systems. This paper focuses on condition monitoring by means of an expert system. In this context, the paper presents a simple method inspired by and derived from the mechanisms of the immune system, called Euclidean Distance Minimization, and its application to a real case of bearing fault recognition. The proposed method is a simplification of the original process, adapted from the class of Artificial Immune Systems algorithms, which have proven useful and promising in many different application fields. Comparative results are provided, with a complete explanation of the algorithm and its functioning.


Introduction
One of the main problems in industrial production is preventing machines from being stopped by component faults. Statistical evidence shows that the majority of unexpected stops (about 50-60%) are due to faulty bearings [1]. This makes bearing diagnostics a prime research field for improving the efficiency and durability of industrial mechanical systems. Ideal diagnostics should also provide the real-time condition of the components, in order to monitor them until the first signs of malfunction appear. This could ensure a longer lifetime of the parts than other maintenance techniques, e.g. preventive maintenance, where components are replaced at given time intervals independently of their actual condition, with a significant economic advantage [2].
So far the problem has been solved analytically, studying the bearing as a planetary gear in order to define how selected damage parameters depend on the working condition of the bearing. In particular, the presence of damage introduces a specific frequency in the vibration spectrum, called the fault frequency [3]. The proportionality between this frequency and the rotating frequency of the motor shaft has been proved [4,5]. Although this model-based approach has proved to be probably the best one for monitoring bearings in stationary conditions, its effectiveness diminishes for machine diagnostics in non-stationary conditions, for example under non-stationary loads [6][7][8] or at variable operating speed [9], e.g. in the diagnostics of wind turbines [10].
In the last decade several authors have proposed new methodologies to overcome the limits of the "stationary conditions" assumption. The first step was the introduction of order tracking (both hardware and software), which made it possible to pass from the time domain to the angular domain (orders) by means of a tacho signal acquired synchronously with the vibration signal [11,12]. The advantage of order tracking is that the angular domain is no longer speed-dependent, while any damage of the bearing is still related to the angular rotation of the shaft. Subsequently, more complex techniques were introduced to analyze the vibration signal in multiple domains: the time-frequency domain (STFT, wavelets, etc. [13,14]), the frequency-cycle domain [15,16], and others. From a practical point of view these approaches seem less immediate for the inexperienced user (e.g. the technical service of an industrial company), especially for motors operating with very short periodicities and/or with reversals in the motion of the shaft [17]. This problem is significant in modern highly automated machines, as in the packaging industry, where servomotors (or direct-drive motors) are broadly used for their flexibility and capability to perform complex speed profiles.
These two primary requirements, i.e. diagnostics of servomotors and simple-to-use tools, led to a significant development of so-called supervised learning systems for condition-based maintenance [18]. In particular, artificial neural networks (ANN) have produced good results in fault detection and in a great number of other applications. The idea behind these algorithms is to recreate, by means of computational operations, the ability of biological neural networks (e.g. the human brain) to perform complex tasks such as memorization, processing and learning from data [19]. Such methods, derived from the scientific observation of what actually happens in these biological systems, have proved to be widely applicable in a multitude of areas, and this success has increased the interest of scientists and engineers in modeling and mimicking other natural cognitive systems, as has been done for ANN [20]. These techniques have been applied successfully to the diagnostics of mechanical components such as bearings [21,22] and gears [23][24][25].
One of the most recent fields of research among cognitive systems is that of artificial immune systems (AIS) [26,27], a term that covers the totality of the algorithms and computational methods derived from the study of the human immune system (HIS). The human immune system is probably the most powerful diagnostic tool that can be found in daily experience. The main purpose of the HIS is to recognize in real time possible threats such as viruses and bacteria by means of specific elements (antigens and antibodies), and to initiate the defense process [28]. Dasgupta [29] compared biological neural networks and immune systems, revealing many interesting analogies in their global behavior despite the fundamental differences that characterize the two natural systems, and this seems to be reflected in the corresponding computational formulations. A detailed description of the HIS processes is presented in Section 2.
Among the several tasks that AIS can perform, fault detection and classification are certainly among the most interesting and useful [30-34]. Their application to real diagnostic cases is a very recent development, and the first results prove the efficiency and, above all, the simplicity of implementation and computation of such algorithms [31], which can make the diagnostic activity, as with many other artificial intelligence methods, a totally automatic routine [32].
This paper focuses on a specific AIS proposed by De Castro [35], called AbNet, which has recently been applied to bearing diagnostics [36,37]. Like other supervised learning systems, AbNet requires as input a vector of arbitrary features (called antigens) of the bearing vibration signal. The input vector must then be converted into binary form to start the specific AIS processes, such as the creation of the antibodies from the antigens. In this paper an alternative algorithm is proposed which does not require the binary conversion of the input data, reducing the complexity of the AIS. The core of this algorithm is the Euclidean Distance Minimization (EDM), which directly compares the input features of the testing data (i.e. an unknown bearing) with the historical data of the training phase (i.e. known bearings). Preliminary results have been reported in [38]. The EDM algorithm has been successfully verified on an industrial application, demonstrating the ability to correctly manage the anomaly detection task even when a very small set of data is given as input.
The paper is structured as follows: background on human and artificial immune systems is provided in Section 2, the state-of-the-art AIS algorithm (AbNet) is presented in Section 3, and the Euclidean Distance Minimization is introduced in Section 4. Section 5 focuses on the application of the EDM methodology to the anomaly detection of ball bearings in a real industrial application. Finally, the conclusions close the paper.

Background on the Human Immune System (HIS)
The term Artificial Immune Systems (AIS) covers a multitude of different algorithms and methods, whose description is beyond the scope of this paper; for further details refer to [26]. Only the main concepts necessary to understand one of them, the AbNet proposed by De Castro [35], will be provided. Concepts will first be explained from the perspective of the human body and then converted into data and algorithm functions.
The human immune system is a complex and fascinating biological framework that is able to react against specific environmental entities, i.e. to classify them as threats. Its activity is based on a continuous monitoring of the body by means of specific entities (antibodies), which are able to recognize specific strings of proteins that could damage the tissues (antigens). These two fundamental entities are defined as:

• Antigens: proteins produced by external pathogen agents that are able to cause damage to human body tissue if unstopped.
• Antibodies: proteins produced by immune cells of the human body that are able to neutralize dangerous foreign cells by recognizing their antigens.
In nature there are several antigens, each one characterized by a specific sequence of proteins, and also several antibodies. Each type of antibody is able to recognize a specific type of antigen, based on the complementarity between them.
The elimination of such dangerous cells is performed so that only the antibodies that couple with a specific antigen stimulate the immune response against it. This coupling takes place only when the complementarity between the antigen and the antibody exceeds a fixed threshold.
The interaction between antigens and antibodies is regulated by two important immune principles: the clonal selection principle and the negative selection principle. The human body continuously generates new antibodies to counter the enormous number of antigens that strike it every day.
A single antibody is not a cognitive entity and is not able to distinguish between antigens (non-self) and other antibodies (self), so the immune system could attack and destroy itself (the so-called autoimmune disease). Moreover, antibody generation through a purely random process could provide a weak and inefficient immune response.
In order to avoid these two problems the human body uses the two principles previously cited:
1. The clonal selection principle ensures that only the most suitable antibodies are cloned: the antibody that couples with the most antigens is the one cloned the most, reinforcing the immune response against that precise antigen.
2. The negative selection principle, on the other hand, prevents antibodies from attacking what is self, eliminating those which attack each other.
To better explain and evaluate the complementarity between antibodies and antigens, scientists [39] introduced the idea of shape space, i.e. a space with L dimensions, where L is the number of different features that influence the coupling process (e.g. chemical or geometrical features). The L-dimensional space is also known as generalized shape.
Using the shape space formalism it is possible to represent antigens and antibodies as points in the L-dimensional space, and their affinity can be evaluated through the distance that separates them. Although the ability of an antibody to couple with a given antigen is a sigmoid function of this distance, it is an acceptable approximation to take a threshold value that determines whether the antibody-antigen coupling occurs or not. In the shape space a given threshold defines the radius of the spherical recognition region of the antibody, which is able to recognize all the antigens whose perfect complements are located within that volume. It is important to note that the shape space formalism is applicable to any kind of space, not only the Euclidean one. Figure 1 represents the antibody in a two-dimensional Euclidean shape space, where the blue antibody recognizes only the red antigen within the threshold radius.

The Antibody Network (AbNet) algorithm
The first and best-known immune algorithm is AbNet, developed by De Castro and Von Zuben [35]. The idea at the base of AbNet is to transform antigens and antibodies into data and to use the previously explained principles to operate in the field of pattern recognition. AbNet has recently been used with success in machinery diagnostics [37], but it has some drawbacks that are detailed later. The new AIS method proposed in this paper starts from the AbNet approach and introduces new features. For that reason this section gives a brief description of the AbNet algorithm, as a prerequisite to the new methodology.

Main characteristics
AbNet works on binary data only, i.e. the binding between antigens and antibodies requires a metric suitable for binary input. The metric used is the Hamming distance (H), computed as the number of complementary bits between the input arrays: the higher the complementarity between the two arrays, the higher this distance. In this way it is possible to apply the shape-space formalism to a binary space identified by L-dimensional binary strings.
Again, the binding occurs when the distance H between them is high enough, i.e. when the number of non-complementary bits does not exceed a threshold value (ε). Once an arbitrary number of L-size binary arrays representing the antigens and a binding threshold ε are defined, AbNet generates the minimum number of antibodies that can bind every inserted antigen. If ε is set to 0, the number of antibodies generated is exactly the same as the number of antigens, and every antibody is the perfect complement of a given antigen. On the other hand, if ε is set to a different value, e.g. 3, fewer antibodies than antigens are generated, because some antibodies are able to bind more than one antigen and the complementarity is no longer perfect.
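The binding rule described above can be illustrated with a minimal sketch. It assumes that the threshold acts as a tolerance on the number of non-complementary bits, which is how the zero-threshold case (perfect complement) reads; the function names and the bit strings are hypothetical and are not taken from the AbNet implementation.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def binds(antibody: str, antigen: str, eps: int) -> bool:
    """True if at most `eps` bits fail to be complementary (assumed reading)."""
    non_complementary = len(antigen) - hamming_distance(antibody, antigen)
    return non_complementary <= eps


antigen = "1011"
perfect = "0100"   # bitwise complement of the antigen
partial = "0110"   # one bit is not complementary

print(hamming_distance(perfect, antigen), binds(perfect, antigen, 0))
print(binds(partial, antigen, 0), binds(partial, antigen, 1))
```

With a tolerance of zero only the perfect complement binds, while a tolerance of one lets the same antibody cover antigens that differ from its complement in one bit, which is why fewer antibodies are then needed.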

AbNet application for bearing diagnostics
The application of AbNet to bearing diagnostics is straightforward. AbNet requires binary inputs while experimental data are analog vibration signals, so a binary conversion of the data is necessary. A given bearing dataset represents the antigens, and the algorithm computes a set of antibodies which are able to recognize not only the antigens belonging to the dataset, but also new unknown antigens.
Following this method, [36] and [37] used AbNet as an expert system, obtaining the complete recognition of different kinds of faults in ball bearings. Like any other supervised learning system, this implementation of AbNet is composed of two phases: the training phase, where known data are used as input, and the testing phase, where unknown data are divided and classified. The AbNet method proved to work very well when multiple accelerometers are used to measure the vibration signals and, more generally, with a very large quantity of information in the training phase, while its detection efficiency decreases dramatically when only one accelerometer is used. This is a significant drawback in industrial maintenance, where only a limited number of accelerometers are used.
The limits observed in the application of AbNet with a small quantity of data, and its inability to perform real industrial diagnostics, are due to multiple causes:
1. The algorithm was not conceived for the fault detection task, but rather to simulate the main processes of the human immune system in order to obtain the maximal coverage of the inserted antigens through the generation of antibodies.
2. In the process of data conversion from real to binary form a large quantity of information is lost, so diagnosis with AbNet requires a large amount of data to compensate.
3. The application of AbNet to bearing diagnosis requires a complex and expensive acquisition set-up. In particular several accelerometers are required, which are not usually available, especially in an industrial environment.
To avoid these three undesirable aspects, the non-essential components of AbNet were eliminated, in an attempt to specialize the algorithm for the condition-monitoring task.

The Euclidean Distance Minimization: a simpler approach
In this paper an alternative method called Euclidean Distance Minimization (EDM) is proposed, based on the shape space formalism. The EDM method is derived from the AbNet concept and simplifies the diagnosis process.
Considered from the Euclidean shape space perspective, the application of AbNet to fault recognition is simple: an initial distribution of different kinds of antigens is used to generate a set of antibodies, which are divided according to the antigens they can couple with.
Once this training phase is completed, the antibodies generated and divided can be used to classify new unknown antigens. From these considerations it clearly emerges that antibodies are just tools that AbNet uses to classify unknown antigens. The human immune system works in the same way, trying to keep the surveyed antibodies in the system memory. Once this becomes clear, antibodies can be neglected, since they are not really needed at the computational level. It is then possible to find other tools to accomplish the task, since algorithms are not subject to the physical constraints which limit the human body and which forced it to develop such specific mechanisms (i.e. antibodies).

Main characteristics
The Euclidean Distance Minimization (EDM) approach makes the classification of unknown antigens simpler and easier to visualize by taking the Euclidean metric as the comparison parameter.
The Euclidean distance between two n-dimensional vectors x and y is defined as:
$$ d(\mathbf{x}, \mathbf{y}) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} $$
Unlike AbNet, which uses arrays of binary numbers, the proposed method works directly on real-valued arrays and reduces to a least squares problem. The shape space is then an ordinary Euclidean space with L dimensions. Moreover, the antibodies are neglected, avoiding the whole generation phase that was necessary in AbNet.
These simplifications lead to the creation of a new method, which is based on two types of basic entities only:
• Training antigens: antigens relative to bearings whose health state is known a priori. Let's refer to these as Ag-train.
• Testing antigens: antigens relative to bearings whose health state is to be determined. Let's refer to these as Ag-test.
The algorithm computes the Euclidean distance between a set of Ag-train and the Ag-test in order to classify each Ag-test according to its proximity to a group of Ag-train that represents a specific health condition of the observed bearing. The resulting algorithm is better suited to the fault recognition task. Since no binary conversion is required, the method needs much less input data to evaluate the health state of bearings. Moreover its execution is simpler, as can be seen in Fig. 3: the whole part of antigen conversion and antibody generation has been removed. The resulting training phase consists only of the integration of the training antigens. The testing phase is also more efficient, since the information provided by the testing antigens is no longer lost through the conversion to binary values.
The procedure can be summarized in the following steps:
1. Choice of an array of specific features as input (antigen) of the AIS.
2. Computation of the antigens (step 1) for the training data (all classes available).
3. Computation of the antigens (step 1) for the test data.
4. Classification of the test data (Euclidean distance between feature arrays).
The classification of test data, whose class is unknown a priori, is based on how close the input array is to the known data. The specific features array that identifies the dataset depends on the specific field of application of the AIS. The experience of the vibration analyst or a literature survey usually guides this choice. The next section shows an example of a features array for the diagnostics of ball bearings.
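The classification step can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' implementation: the function name `edm_classify` and the toy antigens are assumptions, and the antigens are taken to be already normalized.

```python
import numpy as np


def edm_classify(ag_train, labels, ag_test):
    """Label each test antigen with the class of its nearest training antigen."""
    ag_train = np.asarray(ag_train, dtype=float)   # shape (n_train, L)
    ag_test = np.asarray(ag_test, dtype=float)     # shape (n_test, L)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances, shape (n_test, n_train)
    diff = ag_test[:, None, :] - ag_train[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    nearest = dist.argmin(axis=1)
    return labels[nearest], dist.min(axis=1)


# Toy, made-up antigens with L = 2 (H = healthy, F = faulty)
ag_train = [[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.7]]
labels = ["H", "H", "F", "F"]
ag_test = [[0.15, 0.15], [0.85, 0.8]]

predicted, min_dist = edm_classify(ag_train, labels, ag_test)
print(predicted, min_dist)
```

No antibody generation or binary conversion appears anywhere: the training phase amounts to storing `ag_train` and `labels`.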

EDM application for bearing diagnostics
The algorithm has been tested on a real industrial application. In particular, the experimental activity concerned the condition monitoring of a Rockwell Automation AC brushless servomotor used in a packaging machine. The packaging industry, like many other highly automated sectors, prefers servomotors to asynchronous motors since they are fully controlled (so-called "electric cams"), avoiding mechanical cams and complex kinematics. The main drawback of servomotors is the condition-based maintenance of mechanical components such as bearings. As mentioned in the introduction, it is well known that variable-speed working conditions make the frequency analysis of the vibration signal far less useful, since fault-related frequencies are proportional to the continuously changing rotational frequency of the shaft. The following shows how to apply the EDM to overcome this limitation and perform a simple and effective anomaly detection of the ball bearings.
The experiment took into account 13 bearings: 7 were healthy and 6 were damaged at different levels. The faulted bearings mainly come from the field as "claimed motors" returned by the customers. The severity of the damage varies significantly: three bearings show distributed roughness on the inner race together with pitting on the outer race, and one has generalized roughness on both races and a broken cage. Finally, two bearings were artificially damaged in the laboratory with a small engraving, on the outer and on the inner race respectively. The limited number of bearings available forced the authors to use the EDM as an anomaly detector only. As a consequence, no distinction between outer and inner race faults is considered in the paper. The methodology remains useful, since the customer service of the company does not distinguish the type of fault but simply replaces the damaged motor.
The bearings (NSK 6309) were tested on a dedicated test bench that simulated the packaging machine. A piezoelectric accelerometer recorded the vibration signal for a period of 50 seconds at a sampling rate of 10 kHz. The bearings were tested at different hourly capacities of the packaging machine (i.e. the same motion profile performed at different speed levels): 5000, 7000 and 9000 packages per hour. Figure 4.d) shows an example of the angular rotation over time for the motor shaft.
A total of 35 acquisitions were taken into account. Of these, 29 were used in the training phase and 6 for testing. Note that the 6 test acquisitions all came from different bearings: three were faulted and three were healthy.
The choice of the features array was based on both the experience of the authors [21] and the literature [36]. In particular, the input antigen calculated from the vibration data x is an array of four elements:
• The Root Mean Square (RMS) of the vibration data, computed over the N samples x_i:
$$ \mathrm{RMS} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x_i^2} $$
• The kurtosis of the vibration data:
$$ K = \frac{E\left[(x-\mu)^4\right]}{\sigma^4} $$
where E( ) is the expected value operator, and µ and σ are the mean value and the standard deviation of x respectively.
• The peak of the jerk of the vibration data, i.e. the maximum absolute value of the time derivative of the signal:
$$ J = \max_t \left| \frac{dx}{dt} \right| $$
• The hourly capacity of the machine (C), expressed in packages per hour. It is a good indicator of the global speed of the machine and hence of the kinetic energy of the system.
The resulting antigen is a 4-element array:
$$ \mathrm{Ag} = [\mathrm{RMS},\ K,\ J,\ C] $$
To increase the number of data available, each acquisition has been split into single cycles of the machine. The vibration signal of each cycle has then been considered as a realization from a different bearing. This hypothesis is acceptable due to the non-stationary nature of the system. Moreover, the cyclic motion profile of the motor favors the choice of a single cycle as the fundamental time unit.
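A possible way to compute such a four-element antigen from one machine cycle is sketched below. The function name, the finite-difference estimate of the jerk and the synthetic sine signal are assumptions made for illustration; the accelerometer signal is assumed to be an acceleration, so that its time derivative is the jerk.

```python
import numpy as np


def antigen_from_cycle(x, fs, capacity):
    """Build [RMS, kurtosis, jerk peak, hourly capacity] from one machine cycle."""
    x = np.asarray(x, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))
    mu, sigma = x.mean(), x.std()
    kurt = np.mean((x - mu) ** 4) / sigma ** 4   # E[(x - mu)^4] / sigma^4
    jerk = np.diff(x) * fs                       # finite-difference derivative (assumed estimator)
    jerk_peak = np.max(np.abs(jerk))
    return np.array([rms, kurt, jerk_peak, capacity])


fs = 10_000                                  # sampling rate quoted in the text, Hz
t = np.arange(0, 1.0, 1.0 / fs)
x = np.sin(2 * np.pi * 50 * t)               # synthetic stand-in for a vibration signal
ag = antigen_from_cycle(x, fs, capacity=5000)
print(ag)
```

For the unit-amplitude sine used here the RMS is about 0.707 and the kurtosis is 1.5, which gives a quick sanity check of the formulas before applying them to measured cycles.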
Finally all the antigens have been normalized to make the shape space uniform.
The importance of the normalization process was highlighted by a first test done without it.
The results obtained were very poor, and after specific considerations on the shape space it was found that the problem was the anisotropy of the considered space.
For a given antigen, relative to a certain health state, the Euclidean distance represents the probability that another antigen corresponds to the same working condition. In other words, antigens at the same Euclidean distance have the same probability of being in the same working condition.
Substantially, the method requires the zones of equal probability to be spherical (or hyperspherical in the multidimensional case), and this is not possible if the components of the antigens have different ranges of variability. In the activity presented, for example, the features had different units of measurement and varied between totally different values: the maximum value of the speed was 0.7 rev/s while the kurtosis reached values as high as 100. In this case the condition of spherical probability cannot be satisfied, and a normalization process is required to align the maximum values. The normalization process is detailed in the next section. Once the antigens have been prepared, they are fed to the algorithm to perform the anomaly detection. This is achieved by classifying the Ag-test: all the Euclidean distances between each Ag-test and all the Ag-train are computed, and the antigen and category (F or H) associated with the minimum Euclidean distance are identified.
At the end of the process all Ag-test are classified into the faulty category (Ag-F) and/or the healthy category (Ag-H). The category with the larger population provides the final classification. The classification scheme is summarized in Fig. 5.
After the evaluation of the minimum distances, the software implementation of the algorithm outputs the percentage of Ag-test attributed to the two conditions F and H, to provide a description of the most probable health state.
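The final decision for one bearing can be sketched as a simple vote over the labels assigned to its test antigens. The helper name `summarize` and the tie-breaking choice (ties resolve to F) are assumptions for illustration, not details given in the paper.

```python
from collections import Counter


def summarize(predicted):
    """Percentage of test antigens per class, plus the majority class."""
    counts = Counter(predicted)
    total = len(predicted)
    percentages = {c: 100.0 * counts.get(c, 0) / total for c in ("F", "H")}
    # max() returns the first key on a tie, so a 50/50 split is reported as "F"
    verdict = max(percentages, key=percentages.get)
    return percentages, verdict


percentages, verdict = summarize(["F", "F", "H", "F"])
print(percentages, verdict)
```

Reporting the percentages alongside the verdict keeps visible how clear-cut the classification was for each bearing.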

Antigens normalization
Given a two-dimensional features array for an antigen (L = 2), if P_x and P_y are the probability distributions for each dimension, the probability P_0 that another antigen represents the same working condition is given by Eq. 6.
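To see why unequal feature ranges break the spherical condition, one can work through a hedged example. Assume that the two dimensions are statistically independent and normally distributed around the reference antigen (the normality assumption is the one stated for Fig. 6); the symbols Δx, Δy, σ_x and σ_y are introduced here for illustration only. The joint probability then takes the form:

```latex
P_0 = P_x \, P_y
    \propto \exp\!\left(-\frac{\Delta x^2}{2\sigma_x^2}\right)
            \exp\!\left(-\frac{\Delta y^2}{2\sigma_y^2}\right)
    = \exp\!\left[-\frac{1}{2}\left(\frac{\Delta x^2}{\sigma_x^2}
                                   + \frac{\Delta y^2}{\sigma_y^2}\right)\right]
```

Under these assumptions the iso-probability curves satisfy Δx²/σ_x² + Δy²/σ_y² = const, i.e. they are ellipses, and they become circles (equal Euclidean distance, equal probability) only when σ_x = σ_y. Rescaling the features so that their ranges are comparable moves the curves towards this circular case.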
The normalization process has been applied to the antigens by scaling each feature with respect to its maximum computed value. Figure 6 represents the variation of the iso-probability curve with and without the normalization process.
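A minimal sketch of a normalization with respect to the maximum value of each feature is given below. Whether the maxima are taken over the training antigens only or over all antigens is not specified in the text; using the training set, as here, is an assumption, and the numerical values are made up.

```python
import numpy as np


def normalize_by_max(ag_train, ag_test):
    """Scale every feature by its maximum absolute value over the training antigens."""
    ag_train = np.asarray(ag_train, dtype=float)
    ag_test = np.asarray(ag_test, dtype=float)
    scale = np.abs(ag_train).max(axis=0)
    scale[scale == 0] = 1.0            # avoid division by zero for all-zero features
    return ag_train / scale, ag_test / scale


# Made-up values: [RMS, kurtosis, jerk peak, hourly capacity]
ag_train = [[0.5, 3.0, 200.0, 5000.0], [2.0, 100.0, 800.0, 9000.0]]
ag_test = [[1.0, 50.0, 400.0, 7000.0]]
train_n, test_n = normalize_by_max(ag_train, ag_test)
print(train_n.max(axis=0), test_n)
```

After scaling, every feature of the training set has a maximum of 1, so no single feature (such as the kurtosis or the hourly capacity) dominates the Euclidean distance.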
After applying the normalization process the test was repeated with good results: the method recognized all six test bearings without error. 100% of the antigens derived from the three broken bearings were classified as faulty (F), and 100% of the antigens derived from the three healthy bearings were assigned to the healthy (H) category. The results are summarized in Fig. 7.
These results were quite unexpected, since one of the healthy bearings had not been subjected to a breaking-in process and presented very high values of RMS and jerk peak, almost comparable with those of broken bearings. AbNet was also implemented, to evaluate its efficiency in comparison with the proposed method. The original algorithm of De Castro was used to generate the antibodies, which were then divided according to the category of the Ag-train they coupled with.
Subsequently they were used to match new Ag-test, to determine the percentage of antigens belonging to the two classes F and H. During antibody generation AbNet requires a threshold value ε, which defines the limit in the Hamming distance that activates the antigen-antibody coupling. The higher this value, the less complementary two matched arrays can be. This threshold allows the algorithm to produce fewer antibodies than antigens, since a single antibody is able to couple with several antigens, even of different classes. For this reason the sum of the classification percentages of the two conditions can exceed 100%. The results of the AbNet application are shown in Fig. 8. In this experimental case AbNet did not perform well: even with the best threshold setting it was not able to classify two of the six bearings, attributing all their antigens to the healthy condition.
Probably the cause is the small variation of the features in the healthy bearings, which are described by only two antibodies that were also able to couple with the broken condition. This phenomenon occurs when the working state of a bearing is not sufficiently characterized by its antigens, which is very likely when the antigens are arrays with few components, as in this case. With 4 binary cells there are in fact only n = 2^4 = 16 possible combinations, too few to adequately describe such a complex problem. For vibration data acquired by one accelerometer only and a small number of features, EDM outperforms AbNet, providing an efficient and simple way of performing bearing diagnosis.

EDM training sensitivity
The good results encouraged further testing of the algorithm, to discover the minimum quantity of data needed to correctly classify the unknown bearings. The sensitivity analysis has been done by changing the input parameters and measuring the resulting efficiency.
In particular three input Ag-train quantities were decreased:
• Ag-train obtained from both healthy and faulted bearings (Ag) (Tab. 1)
• Ag-train obtained from healthy bearings (AgH) (Tab. 2)
• Ag-train obtained from faulted bearings (AgF) (Tab. 3)
For every variation the results were recorded with six different randomizations over the antigens, in order to take the average behavior of the algorithm as a representation of its global efficiency.
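A sensitivity test of this kind can be sketched as repeated random subsampling of the training antigens. The sketch below covers only the first case (healthy and faulted antigens reduced together); the function names, the synthetic clusters and the subset sizes are assumptions, while the six randomizations per size follow the text.

```python
import numpy as np


def nearest_label(ag_train, labels, ag_test):
    """Class of the nearest training antigen for every test antigen."""
    d = np.linalg.norm(ag_test[:, None, :] - ag_train[None, :, :], axis=2)
    return labels[d.argmin(axis=1)]


def sensitivity(ag_train, labels, ag_test, true_labels, sizes, n_repeats=6, seed=0):
    """Mean fraction of correctly classified test antigens for each training-set size."""
    rng = np.random.default_rng(seed)
    results = {}
    for n in sizes:
        scores = []
        for _ in range(n_repeats):
            keep = rng.choice(len(ag_train), size=n, replace=False)
            pred = nearest_label(ag_train[keep], labels[keep], ag_test)
            scores.append(np.mean(pred == true_labels))
        results[n] = float(np.mean(scores))
    return results


# Synthetic, well-separated clusters standing in for normalized antigens
rng = np.random.default_rng(1)
healthy = rng.normal(0.2, 0.02, size=(20, 2))
faulty = rng.normal(0.8, 0.02, size=(20, 2))
ag_train = np.vstack([healthy, faulty])
labels = np.array(["H"] * 20 + ["F"] * 20)
ag_test = np.array([[0.2, 0.2], [0.8, 0.8]])
true_labels = np.array(["H", "F"])

results = sensitivity(ag_train, labels, ag_test, true_labels, sizes=[40, 10, 2])
print(results)
```

With very small subsets a whole class can disappear from the training antigens, in which case every test antigen of that class is misclassified at once.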
The results of the sensitivity test (Fig. 9) suggest that the algorithm tends to attribute all the antigens to only one of the two possible conditions, and only in rare cases splits the percentages between them.
An advantage of the EDM algorithm is that one minute of data acquisition (i.e. about 40 machine cycles) provides enough data to perform an efficient classification. Although the first errors occur only at a very low quantity of antigens, the misclassification errors tend to be catastrophic (0% of Ag-test classified in the right health state). It is therefore suggested to take a substantial safety margin in the practical application of the method.
The rapid change of the efficiency can be explained with some considerations on the antigens.
In the diagnostics field antigens are point representations of the health state of the bearings, while the L-dimensional space in which they lie is the collection of all possible operating conditions. The antigens obtained from a given bearing are spread in the shape space with different concentrations depending on the zone considered. The areas where this concentration is highest are the ones most representative of the bearing analyzed. This leads to the concept of capability of representation, i.e. the ability of the antigens to describe the health condition of the bearings. According to this definition, the most representative antigens are the ones positioned in high-density antigenic subspaces.
The randomization process in the sensitivity analysis may eliminate the most descriptive antigens, leaving as input only some marginal antigens which are no longer able to properly classify the unknown antigens. This is the main cause of the rapid change of the EDM efficiency.
Based on these considerations a further simplification of the method can be proposed. If the quantity of input data is a constraint for the diagnostic activity, only the most descriptive antigens can be used, reducing the problem to a very small number of Euclidean distances to calculate. This is achievable by isolating only the best antigens of every class, although this data minimization is recommended mainly when the initial antigens form a very compact group. For the broken bearing condition, for example, the antigens are usually spread over a large subspace, and the identification of the ideal ones is almost impossible. In an ideal case, however, it would even be possible to work with just one antigen per class. As an example, Fig. 10 shows the identification of the most representative antigen in a hypothetical two-dimensional discrete distribution.
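One simple way to pick a single representative antigen per class is to take the medoid, i.e. the antigen with the smallest total distance to the others of its class. Using the medoid as a proxy for "the antigen in the highest-density zone" is an assumption of this sketch, not the selection rule of the paper, and the coordinates are made up.

```python
import numpy as np


def most_representative(antigens):
    """Index of the medoid: the antigen with the smallest total distance to the others."""
    antigens = np.asarray(antigens, dtype=float)
    d = np.linalg.norm(antigens[:, None, :] - antigens[None, :, :], axis=2)
    return int(d.sum(axis=1).argmin())


# Four antigens close together and one far outlier
cluster = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.4, 0.4], [3.0, 3.0]]
idx = most_representative(cluster)
print(idx, cluster[idx])
```

The outlier at (3, 3) is never selected, which matches the idea that marginal antigens describe the health state poorly; for widely spread classes, however, a single medoid may still be a weak summary.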

Conclusions
The paper presents a simple method inspired by and derived from the mechanisms of the immune system, and its application to a real case of bearing fault recognition. The proposed algorithm is a simplification of the original process, adapted to a particular case of a much bigger class of algorithms and methods grouped under the name of Artificial Immune Systems, which have proven to be useful and promising in many different application fields. The proposed algorithm is based on Euclidean distance minimization in the evaluation of the binding between antigens.
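As a rough illustration of the distance-minimization idea, the following Python sketch follows the classification scheme described in the caption of Fig. 5: each test antigen is assigned to the class of its nearest training antigen, and the bearing takes the class that collects more minimum distances. The function names, the toy data and the tie-breaking rule (ties go to the healthy class) are assumptions made for this example only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two antigens of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_bearing(test_antigens, healthy_train, faulty_train):
    """Classify a bearing from its test antigens by minimum-distance voting."""
    votes = {"healthy": 0, "faulty": 0}
    for antigen in test_antigens:
        # Minimum distance from this test antigen to each training class.
        d_healthy = min(euclidean(antigen, t) for t in healthy_train)
        d_faulty = min(euclidean(antigen, t) for t in faulty_train)
        # Assumption: an exact tie is counted as healthy.
        votes["healthy" if d_healthy <= d_faulty else "faulty"] += 1
    label = "healthy" if votes["healthy"] >= votes["faulty"] else "faulty"
    return label, votes


# Hypothetical normalized antigens with two features each.
healthy_train = [[1.0, 2.5], [1.5, 2.0]]
faulty_train = [[8.0, 9.0], [10.0, 5.0]]
test_antigens = [[1.2, 2.2], [2.0, 3.0], [9.0, 8.0]]

label, votes = classify_bearing(test_antigens, healthy_train, faulty_train)
```

The cost grows with the product of the number of test and training antigens, which is why reducing the training set to a few descriptive antigens lowers the number of distances to compute.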
Applied to bearing diagnostics, the generic antigen is created by collecting four features: kurtosis, jerk-peak and RMS, computed from the vibration signal, and the hourly capacity of the packaging machine the motor is mounted on. Normalization with respect to the maximum computed value of the features is required in order to compare the antigens developed in the training phase with the antigens under classification.
The experimental activity validated the suggested methodology, obtaining maximum results, but further tests are required for a complete evaluation of its performance and its limits. Moreover, the capability to differentiate the type of fault (e.g. inner or outer race) will be developed in future steps.

Figure captions
Fig. 1 Representation of a bi-dimensional Euclidean shape space [15], [19]. Red entities represent the perfect complements of the antigens considered.

Fig. 6 Example of the iso-probability curve variation with the normalization process (not to scale). The probability distribution for each dimension has been assumed normal, and decreases with the distance.

Fig. 7 Experimental results of the EDM algorithm.

Figure 2 graphically represents the process. The upper graph displays the generation (a) and separation (b) of the antibodies, while in the lower part the generated antibodies (c) are used to classify the unknown antigens (d).

Figure 4 shows vibration data for three stages of bearing damage: a) healthy bearing, b) early stage faulted bearing, c) late stage faulted bearing. Subplot d) shows the angular profile of the shaft corresponding to the early stage faulted bearing. Comparison among the three levels of damage highlights a large difference in the maximum peaks for the healthy case. Nevertheless, in two healthy bearing cases the noise level of the vibration data was comparable with that of the early stage faulted bearings, making anomaly detection based on statistical parameters alone non-trivial.

As an example consider three antigens defined by two features each: Ag1 = [0.01 25], Ag2 = [0.10 50], Ag3 = [0.05 100].
1. The user fixes a reference value Rv representing the upper range for the field of variability (e.g. 100 or 1, etc.). For example Rv = 10.
2. For each feature of the antigen array, calculate the maximum value among all the available dataset. In the example: Max = [0.10 100].
3. Determine the normalization coefficient C for each feature, as the ratio between Rv and the maximum value of each feature. In the example: C = [100 0.1].
4. Normalize all the antigens by multiplying each feature by the corresponding normalization coefficient C. In the example: Ag1n = [1 2.5], Ag2n = [10 5], Ag3n = [5 10].
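The normalization steps above can be reproduced with a short Python sketch, given here only as an illustration. The function name `normalize_antigens` is hypothetical; the input values are those of the example, and the sketch assumes that every feature has a non-zero maximum.

```python
def normalize_antigens(antigens, reference_value):
    """Scale each feature so that its dataset maximum maps to reference_value."""
    # Maximum of each feature over the whole dataset.
    maxima = [max(column) for column in zip(*antigens)]
    # Normalization coefficient C = Rv / max (assumes every maximum is non-zero).
    coefficients = [reference_value / m for m in maxima]
    # Multiply each feature by its coefficient.
    normalized = [
        [value * c for value, c in zip(antigen, coefficients)]
        for antigen in antigens
    ]
    return maxima, coefficients, normalized


# Antigens of the example, two features each, with Rv = 10.
antigens = [[0.01, 25.0], [0.10, 50.0], [0.05, 100.0]]
maxima, coefficients, normalized = normalize_antigens(antigens, 10.0)
```

Up to floating-point rounding, `coefficients` corresponds to C = [100 0.1] and `normalized` to the three normalized antigens of the example. The same coefficients, computed in the training phase, would have to be applied to the antigens under classification so that the two sets remain comparable.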
The different trends of the algorithm obtained by varying the input antigens are reported. No relevant differences occur among the three curves, except for a slightly better performance at low values of AgH. However, 50 antigens per class are more than sufficient for efficient anomaly detection of the bearings, while the first misclassification errors occur around 30 input antigens, independently of their category. In addition to this evaluation of the efficiency, Tab. 1, Tab. 2 and Tab. 3 provide the specific results obtained in each randomization (results are given in the range [0-1]). It can be observed that a decrease of the AgH only influences the recognition efficiency for the healthy bearings, while it has no effect on the broken ones. The same behavior is observed for the faulty class when decreasing the AgF. As expected, the bearing B1, which was the one with high values of RMS and jerk-peak, is the first to present signs of misclassification, which consistently begin under 3 AgH.

Fig. 2 Generation (a) and separation (b) of the antibodies. The generated antibodies (c) are used to classify the unknown antigens (d).

Fig. 4 Vibration data acquired on faulted bearings at different stages: a) healthy bearing, b) early stage fault on inner race, c) late stage fault (distributed roughness), d) angular rotation of the shaft corresponding to the early stage fault plot.

Fig. 5 Classification scheme: for every Ag-test the distances from every Ag-train are computed, then the numbers of minimum distances relative to the two classes are compared to obtain the final classification. The totality of Ag-test (1,2,3,…p) represents all the features extracted from the signal of the bearing to be classified.

Fig. 9 Graphical representation of the mean efficiency obtained from the different randomizations varying the three types of input antigens.

Fig. 10 Identification of the most representative antigen in a hypothetical bi-dimensional distribution.
