Abstract:
Background: identifying relevant data to support the automatic analysis of
electroencephalograms (EEG) has become a challenge. In the literature, there are many
proposals built to support the diagnosis of neurological pathologies. However, the current
challenge is to improve the reliability of the tools that classify or detect abnormalities.
Thus, the Ensemble Feature Selection approach allows the integration of the advantages
of several Feature Selection algorithms to improve the identification of features with high
discriminative power in the classification of normal and abnormal EEG signals. Feature
Selection has attracted the attention of many researchers in recent years due to the
increasing size of datasets. In many cases, the datasets contain hundreds or thousands
of columns. However, not all columns contain relevant information, and the irrelevant
ones degrade classifier performance. Several Feature Selection algorithms have
been proposed in the literature to analyze datasets and determine their subsets of
relevant features and remove irrelevant or redundant features from the classification
process. Those Feature Selection algorithms are typically classified according to their
design, which is related to how they find the subset of relevant features and the
complexity to calculate them. There are three main types of feature selection algorithms:
filters, wrappers, and embedded methods. Wrapper and embedded algorithms are complex
to implement because they require at least one classification algorithm to calculate the
relevance index of each feature; that relevance index can change when instances are
added to or removed from the dataset. In contrast, filter-based feature selection
algorithms can be computationally simpler than the other approaches (wrappers and
embedded methods).
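As an illustration of how several filter-based algorithms could be combined into one consensus, the following minimal sketch scores each feature with two simple filters and averages the resulting ranks. It is not the mechanism proposed in this thesis: the two filters (absolute Pearson correlation and a Fisher-style score), the mean-rank aggregation, the function names, and the toy data are all assumptions chosen for illustration.

```python
from statistics import mean, pvariance


def abs_correlation(column, labels):
    """Filter 1 (illustrative): absolute Pearson correlation with the label."""
    mx, my = mean(column), mean(labels)
    cov = sum((x - mx) * (y - my) for x, y in zip(column, labels))
    sx = sum((x - mx) ** 2 for x in column) ** 0.5
    sy = sum((y - my) ** 2 for y in labels) ** 0.5
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def fisher_score(column, labels):
    """Filter 2 (illustrative): squared gap between class means over class spread."""
    pos = [x for x, y in zip(column, labels) if y == 1]
    neg = [x for x, y in zip(column, labels) if y == 0]
    spread = pvariance(pos) + pvariance(neg)
    # A small constant avoids division by zero for constant-valued classes.
    return (mean(pos) - mean(neg)) ** 2 / (spread + 1e-12)


def ranks(scores):
    """Convert scores to ranks, where rank 0 is the highest score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, feature_index in enumerate(order):
        result[feature_index] = position
    return result


def ensemble_select(columns, labels, k, filters=(abs_correlation, fisher_score)):
    """Return the indices of the k features with the best mean rank across filters."""
    per_filter = [ranks([f(col, labels) for col in columns]) for f in filters]
    mean_rank = [mean(r[i] for r in per_filter) for i in range(len(columns))]
    return sorted(range(len(columns)), key=lambda i: mean_rank[i])[:k]


# Toy data (made up): 0 = normal segment, 1 = abnormal segment.
labels = [0, 0, 0, 1, 1, 1]
columns = [
    [0.10, 0.20, 0.15, 0.90, 1.00, 0.95],  # feature 0: separates the classes well
    [0.50, 0.40, 0.60, 0.50, 0.40, 0.60],  # feature 1: same values in both classes
    [0.30, 0.50, 0.40, 0.60, 0.50, 0.70],  # feature 2: partial separation
]
selected = ensemble_select(columns, labels, k=2)
print(selected)
```

Because each filter only looks at the feature column and the labels, no classifier has to be trained to obtain the ranking, which is the source of the lower computational cost attributed to filters above. Mean rank is only one possible consensus rule; voting or score normalization would be equally plausible choices.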
Objectives: the main objective of this thesis is to propose a mechanism for selecting
relevant features for the classification of electroencephalogram segments to support
the automatic detection of epileptiform events. To this end, a conceptual framework was
designed following a quantitative method to represent a structure that provides
an understanding of how to improve the performance of machine learning algorithms by
using the consensus of several feature selection algorithms.