How do our brains so effortlessly take all the sounds of speech, piece them together into larger units such as syllables, words, and ultimately into sentences and phrases that can be understood?
January 29, 2019
For some, walking into a crowded cocktail party can be exhilarating. But for the hearing impaired, it can be nightmarish. Parties are filled with sounds: voices, background music, the irregular clinking of glassware. For most of us, the brain’s natural ability to filter out background noise makes it relatively easy to focus only on what we want to hear.
But the hearing impaired do not have this luxury. Even the best hearing aids work only by amplifying all sounds at once. This produces a cacophony for device wearers, making it extraordinarily difficult to follow or participate in conversations. This difficulty has been dubbed the ‘cocktail party problem.’ For the 360 million people worldwide who are hearing impaired, it is a fact of life. But Nima Mesgarani, PhD, does not see this as an unsolvable problem; he sees an opportunity for innovation.
Though trained as an electrical engineer, Dr. Mesgarani, a principal investigator at Columbia’s Mortimer B. Zuckerman Mind Brain Behavior Institute, has long been intrigued by what he calls the “ultimate machine”: the human brain.
“The brain has an unparalleled ability to process complex sounds, but we understand so little about how it does so,” said Dr. Mesgarani, who is also associate professor of electrical engineering at Columbia’s Fu Foundation School of Engineering and Applied Science. “How do our brains so effortlessly take all the sounds of speech, piece them together into larger units such as syllables, words, and ultimately into sentences and phrases that can be understood? This is the question that I return to over and over again with my research.”
For example, several years ago Dr. Mesgarani devised a method to reconstruct which sounds the brain listens to — and which it ignores — by measuring a person’s brainwaves.
“We asked epilepsy patients, who were already undergoing brain surgery and who volunteered to participate in our research, to listen to a series of sentences, spoken by different people simultaneously, while we measured their brainwaves,” said Dr. Mesgarani. “By plugging those brainwaves into a computer algorithm that we developed, we could reproduce on a computer the words the patients were paying attention to — and with surprising accuracy.”
The results of this study, published in 2012, were hailed by the scientific and medical communities. By showing that the human brain is selective about the sounds it hears, Dr. Mesgarani and his team offered insight into our brains’ natural ability to filter out background noise.
And, because their algorithm could essentially translate an individual’s brainwaves into real words, it laid the foundation for brain-machine interfaces that could one day help people with limited speech abilities — such as those recovering from a stroke — communicate with the outside world.
Building on these findings, Dr. Mesgarani and his collaborators made another important discovery in 2014: that the brain has an uncanny ability to pick up on even the tiniest differences in speech. These differences, known to linguists as ‘features,’ are generated by minuscule changes in the movement and placement of the lips, tongue and vocal cords.
“Our brains’ fine-tuned ability to distinguish one feature from another appears critical to interpreting language,” said Dr. Mesgarani. “It may be what allows us to understand people who speak the same language, but with different accents, such as two native English speakers who grew up on opposite sides of the country.”
Dr. Mesgarani’s research has proven indispensable for our understanding of how the brain processes language. It also paved the way for a hearing device designed specifically to help solve the cocktail party problem, allowing the hearing impaired to communicate more easily in crowded, noisy environments. Dr. Mesgarani calls this device a cognitive hearing aid.
The cognitive hearing aid works by first automatically separating out the voices of multiple speakers in a group. Then, it compares the voice of each speaker to the brainwaves of the person wearing the hearing aid. The speaker whose voice pattern most closely matches the listener’s brainwaves (an indication that this is the person that the listener is most interested in) is amplified over the others. Dr. Mesgarani published details of this device in 2017.
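The separate, compare and amplify steps can be illustrated with a toy sketch. This is not Dr. Mesgarani’s published method. It assumes the voices have already been separated, that a loudness pattern has already been decoded from the listener’s brainwaves, and that “most closely matches” can be stood in for by a simple Pearson correlation. All names and numbers below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat signal carries no pattern to match
    return cov / math.sqrt(var_x * var_y)


def pick_attended_speaker(speaker_envelopes, neural_envelope):
    """Score each separated voice against the brain-derived pattern
    and return (index of the best match, all scores)."""
    scores = [pearson(env, neural_envelope) for env in speaker_envelopes]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


def remix(speaker_signals, attended, boost=4.0):
    """Sum all voices, scaling the attended one by `boost` (assumed value)."""
    gains = [boost if i == attended else 1.0
             for i in range(len(speaker_signals))]
    length = len(speaker_signals[0])
    return [sum(g * s[t] for g, s in zip(gains, speaker_signals))
            for t in range(length)]


# Made-up loudness patterns for two already-separated voices.
speaker_a = [0.1, 0.9, 0.2, 0.8, 0.1, 0.7]
speaker_b = [0.8, 0.1, 0.9, 0.2, 0.7, 0.1]
# Made-up pattern "decoded from brainwaves"; it rises and falls with speaker_a.
neural = [0.2, 0.8, 0.3, 0.7, 0.2, 0.6]

attended, scores = pick_attended_speaker([speaker_a, speaker_b], neural)
mixed = remix([speaker_a, speaker_b], attended)
print("attended speaker index:", attended)
```

In this toy data the brain-derived pattern tracks the first voice, so that voice is the one boosted in the final mix. A real device would have to perform the voice separation and the brainwave decoding itself, both of which are far harder than the comparison step shown here.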
Dr. Mesgarani is also immersed in the field of deep neural networks — machines that can decipher the meaning of human speech. A decade ago, neural networks were rudimentary. Now, they exist on the kitchen counters of tens of millions of homes around the world.
“Devices like Google Home and Amazon Echo represent some of the most advanced neural networks ever invented,” said Dr. Mesgarani, “in large part because they are becoming more and more like the human brain.”
By learning more about how information is encoded in these neural networks, Dr. Mesgarani hopes to apply that knowledge to studies of the brain. The end goal, he says, is to create technologies that help people, whether by letting the hearing impaired converse more easily at a crowded party or by helping stroke survivors speak to their loved ones.
A self-described ‘neural engineer,’ Dr. Mesgarani works at the intersection of engineering and neuroscience.
“Engineers are not so different than neuroscientists,” said Dr. Mesgarani. “We are all ultimately problem solvers. In this case, we are searching for answers to biology’s biggest questions.”