Speech Spectrogram Estimation from Intracranial Brain Activity Using a Quantization Approach

Miguel Angrick, Christian Herff, Garett Johnson, Jerry Shih, Dean Krusienski, Tanja Schultz


Direct synthesis from intracranial brain activity into acoustic speech might provide an intuitive and natural means of communication for speech-impaired users. In previous studies we have used logarithmic Mel-scaled speech spectrograms (logMels) as an intermediate representation in the decoding from electrocorticographic (ECoG) recordings to an audible waveform. Mel-scaled speech spectrograms have a long tradition in acoustic speech processing and speech synthesis applications. In the past, we relied on regression approaches to find a mapping from brain activity to logMel spectral coefficients, because the feature space is continuous. However, regression tasks are unbounded, and thus neuronal fluctuations in brain activity may result in abnormally high amplitudes in a synthesized acoustic speech signal. To mitigate these issues, we propose two methods for quantizing power values that discretize the feature space of logarithmic Mel-scaled spectral coefficients, using the median and the logistic formula, respectively, to reduce complexity and restrict the number of intervals. We evaluate the practicability in a proof-of-concept study with one participant through a simple classification based on linear discriminant analysis and compare the resulting waveform with the original speech. Reconstructed spectrograms achieve Pearson correlation coefficients with a mean of r = 0.5 ± 0.11 in a 5-fold cross-validation.
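
The abstract does not spell out the quantization procedure, so the following is only a minimal sketch of one plausible median-based variant, not the authors' implementation. It assumes that interval borders are placed at equally spaced quantiles of one spectral coefficient (which reduces to the median for two intervals), that each interval is represented by the median of the training values that fall into it, and that reconstruction quality is measured with the Pearson correlation. The function names, the number of intervals, and the synthetic data are all hypothetical; the logistic variant and the classifier are not shown.

```python
import numpy as np

def median_edges(values, n_levels):
    """Interior interval borders at equally spaced quantiles (assumption).

    For n_levels == 2 the single border is the median of `values`.
    """
    probs = np.arange(1, n_levels) / n_levels
    return np.quantile(values, probs)

def quantize(values, edges):
    """Map continuous values to integer interval labels 0 .. len(edges)."""
    return np.digitize(values, edges)

def dequantize(labels, train_values, train_labels, n_levels):
    """Replace each label by the median of the training values in that interval."""
    centers = np.array(
        [np.median(train_values[train_labels == k]) for k in range(n_levels)]
    )
    return centers[labels]

# Synthetic stand-in for one logMel coefficient over time (hypothetical data).
rng = np.random.default_rng(0)
logmel = rng.normal(size=1000)

n_levels = 8  # hypothetical number of intervals
edges = median_edges(logmel, n_levels)
labels = quantize(logmel, edges)          # discrete targets for a classifier
recon = dequantize(labels, logmel, labels, n_levels)

# Pearson correlation between the original and the quantized reconstruction.
r = np.corrcoef(logmel, recon)[0, 1]
print(f"Pearson r after quantization: {r:.3f}")
```

Because every reconstructed value is the median of observed training values, the output in this sketch cannot leave the range seen during training, which illustrates the bounded behavior that motivates replacing regression with classification over a small set of intervals.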


DOI: 10.21437/Interspeech.2020-2946

Cite as: Angrick, M., Herff, C., Johnson, G., Shih, J., Krusienski, D., Schultz, T. (2020) Speech Spectrogram Estimation from Intracranial Brain Activity Using a Quantization Approach. Proc. Interspeech 2020, 2777-2781, DOI: 10.21437/Interspeech.2020-2946.


@inproceedings{Angrick2020,
  author={Miguel Angrick and Christian Herff and Garett Johnson and Jerry Shih and Dean Krusienski and Tanja Schultz},
  title={{Speech Spectrogram Estimation from Intracranial Brain Activity Using a Quantization Approach}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={2777--2781},
  doi={10.21437/Interspeech.2020-2946},
  url={http://dx.doi.org/10.21437/Interspeech.2020-2946}
}