Neuromorphic audio processing through real-time embedded spiking neural networks

Dominguez Morales, Juan Pedro

Supervised by:

Manuel Domínguez Morales (Director)

Defence university: Universidad de Sevilla

Defence date: 03 December 2018

Committee:

Antonio Abad Civit Balcells (Chair)
Alejandro Linares Barranco (Secretary)
Enrique Cabello Pardos (Committee member)
Arturo Morgado Estévez (Committee member)
Luis Plana Cabrera (Committee member)

Type: Thesis

Teseo: 570805

Abstract

In this work, novel speech recognition and audio processing systems based on a spiking artificial cochlea and neural networks are proposed and implemented. First, the biological behavior of the animal auditory system is studied, along with the classical mechanisms of audio signal processing for sound classification, including Deep Learning techniques. Based on these studies, novel audio processing and automatic audio signal recognition systems are proposed, using a bio-inspired auditory sensor as input. A desktop software tool called NAVIS (Neuromorphic Auditory VIsualizer) was implemented for post-processing the information obtained from spiking cochleae, allowing these data to be analyzed in further research. Next, using a 4-chip SpiNNaker hardware platform and Spiking Neural Networks, a system is proposed for classifying different time-independent audio signals, making use of a Neuromorphic Auditory Sensor and frequency studies obtained with NAVIS. To test the robustness of the system and analyze its limitations, the input audio signals were perturbed to simulate extremely noisy environments. Deep Learning mechanisms, particularly Convolutional Neural Networks, are then trained and used to differentiate between healthy subjects and pathological patients by detecting murmurs in heart recordings, after integrating the spike information that a neuromorphic auditory sensor produces from these signals. A similar approach is then used to train Spiking Convolutional Neural Networks (SCNNs) for speech recognition tasks. A novel SCNN architecture for classifying time-dependent signals is proposed, using a buffered layer that adapts the information from a real-time input domain to a static domain. The system was deployed on a 48-chip SpiNNaker platform. Finally, the performance and efficiency of these systems were evaluated, drawing conclusions and proposing improvements for future work.
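
The abstract does not detail how spike information is integrated or how the buffered layer works. As a rough illustration only, the Python sketch below shows one plausible reading: spikes are counted per frequency channel in fixed time bins, and a fixed-depth buffer turns the running stream of bins into a static snapshot that a conventional classifier could consume. The event format (timestamp in microseconds, channel index), the bin width, and the names integrate_spikes and FrameBuffer are assumptions made for this example; they are not taken from the thesis.

```python
from collections import deque


def integrate_spikes(events, num_channels, bin_width_us):
    """Count spikes per (time bin, channel).

    `events` is assumed to be a list of (timestamp_us, channel) tuples
    sorted by timestamp; this format is hypothetical.
    """
    if not events:
        return []
    t0 = events[0][0]
    num_bins = (events[-1][0] - t0) // bin_width_us + 1
    frames = [[0] * num_channels for _ in range(num_bins)]
    for timestamp_us, channel in events:
        frames[(timestamp_us - t0) // bin_width_us][channel] += 1
    return frames


class FrameBuffer:
    """Fixed-depth buffer holding the most recent spike-count frames."""

    def __init__(self, depth):
        self.frames = deque(maxlen=depth)

    def push(self, frame):
        # Oldest frame is dropped automatically once the buffer is full.
        self.frames.append(frame)
        return len(self.frames) == self.frames.maxlen

    def snapshot(self):
        # Static 2D view (time bins x channels) of the buffered stream.
        return [list(frame) for frame in self.frames]


# Hypothetical events from a 2-channel sensor, binned every 1000 us.
events = [(0, 0), (10, 1), (1200, 1), (2500, 0), (2600, 0)]
frames = integrate_spikes(events, num_channels=2, bin_width_us=1000)

buffer = FrameBuffer(depth=2)
ready = [buffer.push(frame) for frame in frames]
static_input = buffer.snapshot()
```

In this sketch the snapshot becomes available once the buffer holds `depth` frames, and each new frame slides the window forward by one bin; a real-time system would push frames as they are produced rather than from a precomputed list.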