Memristive computational architecture of an echo state network for real-time speech-emotion recognition


Echo state neural networks (ESNs) provide an efficient classification technique for spatiotemporal signals. The feedback connections in the ESN topology enable feature extraction of both spatial and temporal components in time-series data. This property has been used in several application domains, such as image and video analysis, anomaly detection, and speech recognition. In this work, we explore a hardware architecture for realizing ESNs efficiently in power-constrained devices.
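
To illustrate how feedback connections let an ESN retain temporal context, the following is a minimal software sketch of a reservoir state update. It assumes the commonly used tanh formulation with fixed random weights; it is not the memristive circuit explored in this work, and the names `make_reservoir` and `run_reservoir`, the reservoir size, and the weight ranges are hypothetical choices made only for illustration.

```python
import math
import random


def make_reservoir(n_inputs, n_units, seed=0):
    """Build fixed random input and recurrent weights (these are never trained)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
            for _ in range(n_units)]
    # Assumption: bound each recurrent weight so every row's absolute sum
    # stays at or below 0.9, a simple sufficient condition for the influence
    # of old inputs to fade over time with tanh units.
    bound = 0.9 / n_units
    w_res = [[rng.uniform(-bound, bound) for _ in range(n_units)]
             for _ in range(n_units)]
    return w_in, w_res


def run_reservoir(w_in, w_res, inputs):
    """Return the reservoir state after each input frame."""
    n_units = len(w_res)
    state = [0.0] * n_units
    states = []
    for frame in inputs:
        # Each unit combines the current frame with the previous state,
        # which is where the feedback (recurrent) connections enter.
        state = [
            math.tanh(
                sum(w * x for w, x in zip(w_in[i], frame))
                + sum(w * s for w, s in zip(w_res[i], state))
            )
            for i in range(n_units)
        ]
        states.append(state)
    return states


# Two toy sequences that end with the same frame but differ in their history.
w_in, w_res = make_reservoir(n_inputs=2, n_units=8)
seq_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
seq_b = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
final_a = run_reservoir(w_in, w_res, seq_a)[-1]
final_b = run_reservoir(w_in, w_res, seq_b)[-1]
```

In this sketch the two final states differ even though the last input frame is identical, because the recurrent term carries information about earlier frames. In a typical ESN, only a linear readout applied to such states would be trained for classification.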

Specifically, we propose a scalable computational architecture applied to speech-emotion recognition. Two different topologies with memristive synapses are explored. Simulation results are promising, with a classification accuracy of ≈ 96% for two distinct emotion states.