Optimización multi-objetivo de arquitecturas de aprendizaje profundo para el procesamiento de señales EEG en plataformas de cómputo heterogéneas

  1. Aquino Brítez, Diego
Supervised by:
  1. Andres Ortiz García Co-director
  2. Juan José Escobar Co-director

Defence university: Universidad de Granada

Defence date: 27 May 2022

Committee:
  1. Héctor Pomares Cintas Chair
  2. Pablo García Sánchez Secretary
  3. Gracia Esther Martín Garzón Committee member
  4. Consolación Gil Montoya Committee member
  5. Antonio Jesús Rivera Rivas Committee member

Type: Thesis

Abstract

In recent years, deep learning models have revolutionized several areas of science, making it possible to solve a wide variety of complex problems, many of which were once considered unsolvable. Compared with classification approaches that require a priori knowledge of the descriptors with the greatest representational or discriminant power, deep neural networks have the advantage of extracting descriptors tailored to a specific problem through a learning process. In this way, feature extraction and classification are integrated in the same architecture. Although some of these architectures (such as convolutional networks) were presented in the 70s and 80s, training them on real problems involved a computational load that the computers of that time could not handle. Nowadays, the development of heterogeneous computing devices has made it possible to perform the computations required by deep neural networks efficiently (in some cases even inference is performed in real time) and to develop effective, specific optimization algorithms, and these networks have gained popularity as a result. However, the design of deep network architectures is a complex task that usually depends on the final application, and there are no general design rules beyond the experience of the designer. Therefore, although the designer plays an important role, obtaining an architecture that provides the required accuracy usually involves a trial-and-error process based on partial results or on the analysis of the evolution of the network parameters during training. Moreover, the efficiency and performance of a given architecture depend to a great extent on the selection of hyperparameters, which again requires a trial-and-error process. Consequently, the design and tuning of deep architectures is a complex and computationally time-consuming task even for experienced designers.
Specific tools to optimize the models are therefore essential to take full advantage of the capabilities of deep learning. With this in mind, this thesis proposes a fully configurable optimization framework for deep learning architectures based on evolutionary computation. The framework optimizes not only the hyperparameters but also the network architecture, including regularization parameters to reduce the generalization error. In addition, an architecture can be optimized with respect to multiple objectives that may conflict with one another. The solutions provided by the framework form a set of non-dominated solutions that offer a trade-off between the proposed objectives, so the most appropriate one can be selected from the Pareto front. It is also worth noting that both CPUs and GPUs are used to speed up the optimization procedure by taking advantage of the parallelism and heterogeneity present in current computing nodes. The developed tool has been applied to the optimization of deep learning architectures for electroencephalography (EEG) signal processing. The classification of EEG signals is a complex task that usually relies on statistical descriptors known a priori. However, there is no guarantee that the chosen descriptors provide the best classification rate. Moreover, previously proposed deep learning networks that classify EEG signals in the time domain contain a large number of hyperparameters. The experiments show how to use the framework to optimize architectures based on one-dimensional convolutional networks for descriptor extraction and classification.
These experiments do not only optimize hyperparameters: the framework is also configured so that the optimization process can make structural changes to the network or introduce regularization layers to improve its generalization capability. The experimental results corroborate that the deep architectures optimized by the method proposed in this thesis improve on the baseline networks not only in classification rate but also in computational efficiency and, therefore, in energy efficiency.
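
The notion of a non-dominated set used in the abstract can be illustrated with a minimal sketch. It is not the framework described in the thesis: the function names, the two objectives (classification error and energy cost, both minimized), and the candidate values are hypothetical and chosen only to show how a Pareto front is extracted from a set of evaluated architectures.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` dominates `b` when every objective is minimized:
    `a` is no worse in all objectives and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates (input order preserved)."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q is not p)
    ]


# Hypothetical (classification error, energy cost) pairs for five candidate networks.
candidates = [(0.30, 1.0), (0.25, 2.0), (0.25, 3.0), (0.40, 0.8), (0.35, 1.5)]
front = pareto_front(candidates)
print(front)
```

In this toy example, (0.25, 3.0) is discarded because (0.25, 2.0) achieves the same error at lower cost, and (0.35, 1.5) is discarded because (0.30, 1.0) is better in both objectives. The three remaining points are the trade-offs among which a designer would choose.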