November 8, 2023
Michael Merzenich, PhD

“He even makes the deaf hear and the mute speak.” [Mark 7:37]

I was privileged as a young scientist to lead a team that helped develop the modern cochlear implant, which (in several different forms) has restored hearing to approximately 800,000 formerly profoundly deaf individuals.

The crude information representing language sounds that cochlear implants deliver to the brain through stimulation of surviving auditory nerve fibers is sufficient for the remarkable plastic machinery of our brains to ultimately reinterpret it as normal-sounding speech.

But what about the large population of individuals who can hear speech, but are unable to verbally respond? Such mute individuals fall into two large classes. Many have endured brain injuries that have physically destroyed speech production abilities in their brains. In others, the cortical speech production “machinery” is intact, but because of physical injury to the vocal tract (from cancer, trauma, or brainstem injury), the tract is neurologically dysfunctional.

Nearly 20 years ago, I met a young child — let’s call her “Katy” — who was a delightful sprite in my extended family. Near her fourth birthday, while her nanny was focused on watching her baby sister, Katy was playing on a playground slide with her jump rope. In a freak accident, Katy managed to hang herself. By the time she was discovered, brainstem damage had resulted in general paralysis and a “permanent” loss of her ability to produce speech. Now a young wheelchair-bound adult, Katy has some understanding of what she hears, but almost no ability to convey her sentiments or thoughts. I imagine there must be some (silent) screaming involved.

Brain Computer Interface
In the same era, a brilliant young student showed up in my neuroscience lab motivated to learn more about the brain and its plasticity. Edward Chang, MD, PhD, as I predicted, evolved into an internationally distinguished brain surgeon and is now professor and chairman of the Department of Neurosurgery at the University of California, San Francisco (UCSF). For more than a decade, Chang and his research team have asked a simple question: Can we record information that accurately represents what a mute individual is trying to say, and use that information to restore their voice? As recently reported in Nature, the answer appears to be “yes.”

Conceptually, their strategy is relatively simple. By recording from densely distributed locations across the surface of the cortical region that controls speech production and related facial expression, they can analytically determine what the patient is trying to say, even though the disconnected outputs of the patient’s brain hit a dead end.

Chang’s team could have had this interpretation of the signals recorded from 253 brain locations written out as text on a screen, or spoken from a speaker mounted near the patient’s face or head. Instead, they chose a video avatar. Through that avatar, the patient can express their speech in sound with appropriate lip and mouth movements and, when called for, a smile, frown, sneer, or giggle that more fully conveys their intent. By recording facial expression dynamics, the system can also modulate the speech to further express the patient’s emotions.

This brain-AI interface worked in an initial patient, with a high speech production accuracy rate (≈95%). There are still important challenges to making it practical for wider use. The brain response recording side of the device needs to be engineered to be fully implantable, with rechargeable electronics that can wirelessly communicate with portable analytical, acoustic, and visual display hardware. Program development is now focused on improving the speed and accuracy of neurologic signal-to-avatar responding; currently, it runs at about half the normal speaking rate.

In parallel, Chang’s team is training patients to harness inherent brain plasticity to further improve the accuracy and speed of their speech production, facial expressions, and emotion-modulated voicing.

I believe that these challenges are addressable and that within a few years, a reliable, deliverable device shall be in hand.

Microelectrode Arrays Deeper in the Brain
An alternative approach to that of the UCSF team uses grids of microelectrodes introduced vertically into the middle layers or across the depths of the thin cerebral cortical mantle. This placement allows researchers to record more specific voluntary movement control information than can be recorded by the dense surface-mounted electrocorticography arrays used in the UCSF team’s devices.

In practice, this substantial advantage in the quality of information fed to the artificial intelligence analytics is offset by disadvantages in recording response stability, a shorter mean time between failures of these more invasive devices, and trauma associated with the surgical introduction and potential removal or replacement of the microelectrodes. In a recent Stanford trial applying this strategy, researchers succeeded in translating attempted speech to text by recording from the speech motor cortex. As in the UCSF experiment, speech production proceeded at about half the normal rate; accuracy was a little lower, at about 75%.

Because of limitations in recording stability, frequent recalibration was required to implement this approach. The shorter operable time before failure for the current forms of these devices represents a major unresolved question for their future use.

Chang’s UCSF team appears to have a clearer road ahead, at least in the near term, for restoring those silent voices. Such technology will eventually enable us to reanimate paralyzed arms, legs, and bodies.

For my part, I foresee a day, not too far in the future, when I visit Katy and can hear and see her avatar laugh as she tries to tell this old scientist the exciting things that are happening in our medical world.

It is, as described in Mark 7:37, the stuff of miracles.