Neural Prosthesis: Using Brain Activity to Decode Speech

Topic: Electronics/Software
Author: National Research University Higher School of Economics
Published: 2023/01/19 - Updated: 2023/09/27
Publication Type: Product Release / Update - Peer-Reviewed: Yes
Contents: Summary - Definition - Introduction - Main Item - Related Topics

Synopsis: Speech decoding from a small set of spatially segregated minimally invasive intracranial EEG electrodes with a compact and interpretable neural network. Millions of people worldwide are affected by speech disorders limiting their ability to communicate. Causes of speech loss vary and include stroke and certain congenital conditions. Speech neuroprostheses - brain-computer interfaces capable of decoding speech based on brain activity - can provide an accessible and reliable solution for restoring communication to such patients.

Introduction

Researchers from HSE University and the Moscow State University of Medicine and Dentistry have developed a machine learning model that can predict the word about to be uttered by a subject based on their neural activity recorded with a small set of minimally invasive electrodes. The paper 'Speech decoding from a small group of spatially segregated minimally invasive intracranial EEG electrodes with a compact and interpretable neural network' has been published in the Journal of Neural Engineering. A grant from the Russian Government financed the research as part of the 'Science and Universities' National Project.

Main Item

Millions of people worldwide are affected by speech disorders limiting their communication ability. Causes of speech loss can vary and include stroke and certain congenital conditions.

Technology is available today to restore such patients' communication function, including 'silent speech' interfaces that recognize speech by tracking the movement of articulatory muscles as the person mouths words without making a sound. However, such devices cannot help everyone - for example, people with facial muscle paralysis.

Speech neuroprostheses - brain-computer interfaces capable of decoding speech based on brain activity - can provide an accessible and reliable solution for restoring communication to such patients.

Unlike personal computers, devices with a brain-computer interface (BCI) are controlled directly by the brain without needing a keyboard or a microphone.

A major barrier to the wider use of BCIs in speech prosthetics is that this technology requires highly invasive surgery to implant electrodes in the brain tissue.

The most accurate speech recognition is achieved by neuroprostheses with electrodes covering a large cortical surface area. However, these solutions for reading brain activity are not intended for long-term use and present significant risks to the patients.

Researchers of the HSE Centre for Bioelectric Interfaces and the Moscow State University of Medicine and Dentistry have studied the possibility of creating a functioning neuroprosthesis capable of decoding speech with acceptable accuracy by reading brain activity from a small set of electrodes implanted in a limited cortical area. The authors suggest that in the future, this minimally invasive procedure could even be performed under local anesthesia. In the present study, the researchers collected data from two patients with epilepsy who had already been implanted with intracranial electrodes for presurgical mapping to localize seizure onset zones.

The first patient was implanted bilaterally with five sEEG shafts with six contacts in each. The second patient was implanted with nine electrocorticographic (ECoG) strips with eight contacts in each. Unlike ECoG, electrodes for sEEG can be implanted without a full craniotomy, via a drill hole in the skull. In this study, only the six contacts of a single sEEG shaft in one patient and the eight contacts of one ECoG strip in the other were used to decode neural activity.

The subjects were asked to read aloud six sentences, each presented 30 to 60 times in a randomized order. The sentences varied in structure, and most words within a single sentence started with the same letter. The sentences contained a total of 26 different words. As the subjects were reading, the electrodes registered their brain activity.

This data was then aligned with the audio signals to form 27 classes, including 26 words and one silence class. The resulting training dataset (containing signals recorded in the first 40 minutes of the experiment) was fed into a machine-learning model with a neural network-based architecture. The learning task for the neural network was to predict the next uttered word (class) based on the neural activity data preceding its utterance.
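The alignment step described above can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the window length, sampling rate, and array shapes are assumptions chosen for the example.

```python
import numpy as np

# Illustrative parameters (assumptions, not the study's actual values)
N_CHANNELS = 6   # contacts on a single sEEG shaft
FS = 1000        # sampling rate, Hz
WINDOW_S = 1.0   # length of the pre-utterance window, seconds
N_CLASSES = 27   # 26 words + 1 silence class

def make_training_pairs(eeg, onsets, labels, fs=FS, window_s=WINDOW_S):
    """Cut one fixed-length window of neural activity ending at each word
    onset, so every prediction uses only data that precedes the utterance
    (the causality constraint the researchers describe)."""
    win = int(window_s * fs)
    X, y = [], []
    for onset, label in zip(onsets, labels):
        if onset >= win:  # skip onsets too close to the recording start
            X.append(eeg[:, onset - win:onset])
            y.append(label)
    return np.stack(X), np.array(y)

# Toy data: 60 s of 6-channel recording with three labeled word onsets
rng = np.random.default_rng(0)
eeg = rng.standard_normal((N_CHANNELS, 60 * FS))
onsets = np.array([2000, 15000, 40000])  # sample indices of word onsets
labels = np.array([3, 11, 26])           # word ids; 26 = the silence class

X, y = make_training_pairs(eeg, onsets, labels)
print(X.shape)  # (3, 6, 1000): one pre-onset window per labeled word
```

Each (window, label) pair then serves as one training example for the classifier.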

In designing the neural network's architecture, the researchers aimed to keep it simple, compact, and easily interpretable. They devised a two-stage architecture: the first stage extracts internal speech representations from the recorded brain activity, producing log-mel spectral coefficients, and the second stage predicts a specific class, i.e., a word or silence.
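The two-stage idea can be sketched in a few lines of numpy. This is a toy stand-in, not the authors' network: random untrained weight matrices substitute for the learned feature extractor and classifier, and the feature dimension is an arbitrary assumption.

```python
import numpy as np

rng = np.random.default_rng(1)

N_CHANNELS, N_SAMPLES = 6, 1000  # one window of multichannel neural activity
N_FEAT, N_CLASSES = 20, 27       # compact representation; 26 words + silence

# Stage 1 (sketch): map the window to a compact, non-negative,
# log-compressed representation - standing in for the learned extractor
# that produced log-mel spectral coefficients in the paper.
W_feat = rng.standard_normal((N_FEAT, N_CHANNELS * N_SAMPLES)) * 0.01

def extract_features(window):
    z = W_feat @ window.ravel()
    return np.log1p(np.abs(z))

# Stage 2 (sketch): a linear classifier over the 27 classes
W_clf = rng.standard_normal((N_CLASSES, N_FEAT)) * 0.1

def predict_class(window):
    logits = W_clf @ extract_features(window)
    return int(np.argmax(logits))  # predicted class id; 26 = silence

window = rng.standard_normal((N_CHANNELS, N_SAMPLES))
pred = predict_class(window)
```

Keeping the intermediate representation small and spectral-like is what makes the real model compact and lets the researchers inspect which inputs drive each prediction.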

Thus trained, the neural network achieved 55% accuracy using only six channels of data recorded by a single sEEG electrode in the first patient and 70% accuracy using only eight channels of data recorded by a single ECoG strip in the second patient. Such accuracy is comparable to that demonstrated in other studies using devices that required electrodes to be implanted over the entire cortical surface.

The resulting interpretable model makes it possible to explain, in neurophysiological terms, which neural information contributes most to predicting a word about to be uttered. The researchers examined signals from different neuronal populations to determine which were pivotal for the downstream task. Their findings were consistent with the speech mapping results, suggesting that the model relies on genuinely speech-related neural signals and could therefore be used to decode imagined speech.

Another advantage of this solution is that it does not require manual feature engineering. The model has learned to extract speech representations directly from the brain activity data. The interpretability of results also indicates that the network decodes signals from the brain rather than from any concomitant activity, such as electrical signals from the articulatory muscles arising due to a microphone effect.

The researchers emphasize that the prediction was always based on the neural activity data preceding the utterance. This, they argue, ensures that the decision rule did not rely on the auditory cortex's response to speech already uttered.

"Using such interfaces involves minimal risks for the patient. If everything works out, it could be possible to decode imaginary speech from neural activity recorded by a small number of minimally invasive electrodes implanted in an outpatient setting with local anesthesia" - Alexey Ossadtchi, leading author of the study, director of the Centre for Bioelectric Interfaces of the HSE Institute for Cognitive Neuroscience.

Related Information

Attribution/Source(s):

This peer reviewed publication was selected for publishing by the editors of Disabled World due to its significant relevance to the disability community. Originally authored by National Research University Higher School of Economics, and published on 2023/01/19 (Edit Update: 2023/09/27), the content may have been edited for style, clarity, or brevity. For further details or clarifications, National Research University Higher School of Economics can be contacted at hse.ru/en/. NOTE: Disabled World does not provide any warranties or endorsements related to this article.

Explore Related Topics

1 - Consistent virtual haptic technology for virtual reality (VR) and augmented reality (AR) users.

2 - HeardAI has advanced to Phase 2 of the National Science Foundation's Convergence Accelerator program to make voice-activated AI accessible and fair to people who stutter.

3 - A11yBoard for Google Slides is a browser extension and phone app that allows blind users to navigate through complex slide layouts and text.

4 - The neuromorphic invention is a single chip enabled by a sensing element, doped indium oxide, that's thousands of times thinner than a human hair and requires no external parts to operate.

5 - Speech decoding from a small set of spatially segregated minimally invasive intracranial EEG electrodes with a compact and interpretable neural network.

Complete List of Related Information

Page Information, Citing and Disclaimer

Disabled World is a comprehensive online resource that provides information and news related to disabilities, assistive technologies, and accessibility issues. Founded in 2004, our website covers a wide range of topics, including disability rights, healthcare, education, employment, and independent living, with the goal of supporting the disability community and their families.

Cite This Page (APA): National Research University Higher School of Economics. (2023, January 19 - Last revised: 2023, September 27). Neural Prosthesis: Using Brain Activity to Decode Speech. Disabled World. Retrieved October 10, 2024, from www.disabled-world.com/assistivedevices/computer/neural-prosthesis.php

Permalink: <a href="https://www.disabled-world.com/assistivedevices/computer/neural-prosthesis.php">Neural Prosthesis: Using Brain Activity to Decode Speech</a>: Speech decoding from a small set of spatially segregated minimally invasive intracranial EEG electrodes with a compact and interpretable neural network.

Disabled World provides general information only. Materials presented are never meant to substitute for qualified medical care. Any 3rd party offering or advertising does not constitute an endorsement.