IBM Research: IBM 5 in 5 2012: Hearing

Imagine knowing the meaning behind your child’s cry, or maybe even your pet dog’s bark, through an app on your smartphone. In the next five years, you will be able to do just that thanks to algorithms embedded in cognitive systems that will understand any sound.

Each of a baby’s cries, whether from pain, hunger, or exhaustion, sounds different – even if the difference is difficult to hear. But some of my colleagues and I patented a way to use the data from typical baby sounds, collected at different ages by monitoring brain, heart and lung activity, to interpret how babies feel. Soon, a mother will be able to translate her baby’s cries in real time into meaningful phrases, via a baby monitor or smartphone.

Predicting the sound of weather

Sensors already help us with everything from easing traffic to conserving water. These same sensors can also be used to interpret sounds in these environments. What does a tree under stress during a storm sound like? Will it collapse into the road? A city datacenter receiving this information from the sensors would know, and could alert ground crews before the collapse.

Scientists at our Research lab in São Paulo are using IBM Deep Thunder to make these kinds of weather predictions in Brazil.

These improvements in auditory signal processing sensors can also apply to hearing aids or cochlear implants to better detect, extract, and transform sound information into codes the brain can comprehend – helping with focus, or with the cancellation of sounds.

Forget to hit “mute” while on that conference call at work? Your phone will know how to cancel out background noise – even if that “noise” is you carrying on a separate conversation with another colleague!

Ultrasonics to bridge the distance between sounds

Sound travels through air at about 340 meters per second, across thousands of frequencies. IBM Research also wants to translate the information from ultrasonic frequencies that we humans can’t hear into audio that we can. So, in theory, an ultrasonic device could allow us to understand animals such as dolphins or that pet dog.
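To make the idea of moving ultrasonic information into the audible range more concrete, here is a minimal sketch of one well-known technique, heterodyning, which is the principle behind many bat detectors. It is not described in this post as IBM's method; the function names, the 192 kHz sample rate, and the 40 kHz and 38 kHz tones are all assumptions chosen for illustration. Multiplying a 40 kHz tone by a 38 kHz reference produces a difference tone at 2 kHz, which we can hear, and a sum tone at 78 kHz, which we cannot (a real device would also filter it out).

```python
import math

def mix_down(samples, sample_rate, shift_hz):
    """Multiply the signal by a reference cosine at shift_hz (the mixing step)."""
    return [
        s * math.cos(2 * math.pi * shift_hz * n / sample_rate)
        for n, s in enumerate(samples)
    ]

def tone_amplitude(samples, sample_rate, freq_hz):
    """Estimate the amplitude of the component at freq_hz (single-bin DFT)."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * n / sample_rate)
             for n, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * n / sample_rate)
             for n, s in enumerate(samples))
    return 2 * math.hypot(re, im) / len(samples)

# Illustrative values only (assumptions, not from the post).
SAMPLE_RATE = 192_000      # samples per second, high enough to capture 78 kHz
ULTRASONIC_HZ = 40_000     # a tone above the range of human hearing
REFERENCE_HZ = 38_000      # reference tone used for mixing
NUM_SAMPLES = 19_200       # 0.1 second of signal

ultrasonic = [
    math.sin(2 * math.pi * ULTRASONIC_HZ * n / SAMPLE_RATE)
    for n in range(NUM_SAMPLES)
]
mixed = mix_down(ultrasonic, SAMPLE_RATE, REFERENCE_HZ)

# The difference frequency (40 kHz - 38 kHz = 2 kHz) is audible.
audible_hz = ULTRASONIC_HZ - REFERENCE_HZ
amp_before = tone_amplitude(ultrasonic, SAMPLE_RATE, audible_hz)
amp_after = tone_amplitude(mixed, SAMPLE_RATE, audible_hz)
print(f"Amplitude at {audible_hz} Hz before mixing: {amp_before:.3f}")
print(f"Amplitude at {audible_hz} Hz after mixing:  {amp_after:.3f}")
```

Mixing splits the energy between the difference and sum tones, so the audible tone comes out at half the original amplitude. Run in reverse, the same multiplication shifts an audible signal up to an ultrasonic frequency.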

And what if a sound you want or need to hear could cut through the noise? The same device that transforms and translates ultrasonics could work in reverse. So, imagine wanting to talk with someone who, while only a short distance away, is still too far away to reach without yelling (say, from across a crowded room). A smartphone, associated with an ultrasonic system, could turn the speaker’s voice into an ultrasonic frequency that cuts through sounds in the room to be delivered to, and re-translated for, only the recipient of the message (who will hear the message as if the speaker were standing close by – no receiving device needed).

This ultrasonic capability could also help a police officer warn a pedestrian not to cross a busy road, without shouting over the traffic noise. And parents could “call” their children to come in from playing in the neighborhood when it’s time for dinner – without worrying whether their children’s cellphones are on.

If you think cognitive systems will most likely gain the ability to hear before augmenting the other senses, vote for it here.

IBM thinks these cognitive systems will connect to all of our other senses. You can read more about sight, smell, taste, and touch technology in this year’s IBM 5 in 5.

Note: In February 2013, as part of earning the Tan Chin Tuan Exchange Fellowship in Engineering, Dr. Kanevsky lectured about and demonstrated his transcription technology at Nanyang Technological University - Singapore. You can watch his lecture, “Why I Care About Hessian-Free Optimization,” in its entirety here.