Editor’s note: This 2012 IBM 5-in-5 article is by IBM
Master Inventor Dimitri Kanevsky.
Imagine knowing the meaning behind your child’s cry, or
maybe even your pet dog’s bark, through an app on your smartphone. In the next
five years, you will be able to do just that thanks to algorithms embedded in
cognitive systems that will understand any sound.
A baby’s cries – from pain, to hunger, to exhaustion – each sound different, even if it’s difficult for a parent to tell them apart. Some of my colleagues and
I patented
a way to take data from typical baby sounds, collected at different ages while
monitoring brain, heart and lung activity, and interpret how babies feel. Soon,
a mother will be able to translate her baby’s cries in real time into
meaningful phrases, via a baby monitor or smartphone.
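To give a flavor of the kind of classification involved – purely an illustrative sketch, not our patented method – the snippet below summarizes a cry recording with spectral features and asks a pre-trained model for the most likely state. The feature choice (MFCCs), the label set, and the classifier are all my assumptions here.

```python
# Illustrative sketch only -- not the patented method described above.
# Assumes: librosa for audio features, scikit-learn for the classifier,
# and a model already trained on labeled cry recordings.
import librosa
import numpy as np
from sklearn.ensemble import RandomForestClassifier

STATES = ["hungry", "tired", "in pain"]  # hypothetical label set

def cry_features(wav_path: str) -> np.ndarray:
    """Summarize a cry recording as one fixed-length feature vector."""
    audio, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)
    # Mean and variance over time collapse the recording to one vector.
    return np.concatenate([mfcc.mean(axis=1), mfcc.var(axis=1)])

def interpret_cry(model: RandomForestClassifier, wav_path: str) -> str:
    """Map a recording to the most likely state (labels are integers 0..2)."""
    features = cry_features(wav_path).reshape(1, -1)
    return STATES[int(model.predict(features)[0])]
```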
Predicting the sound of weather
Sensors already help us with everything from easing
traffic to conserving
water. These same sensors can also be used to interpret sounds in those
environments. What does a tree under stress during a storm sound like? Will it
collapse into the road? Sensors feeding that sound to a city data center
could tell, and alert ground crews before the collapse.
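One plausible shape for such an alert – my assumption, not a description of any deployed system – is to track band-limited acoustic energy from a tree-mounted sensor and flag sustained excursions above a learned baseline:

```python
# Hypothetical sketch: flag unusual acoustic energy from a tree-mounted
# sensor against a running baseline. Band and threshold are invented.
import numpy as np

def band_energy(frame: np.ndarray, sr: int, lo: float, hi: float) -> float:
    """Energy of one audio frame within a frequency band."""
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    mask = (freqs >= lo) & (freqs <= hi)
    return float(np.sum(spectrum[mask] ** 2))

def stress_alerts(frames, sr=48000, lo=1000.0, hi=8000.0, k=5.0):
    """Yield True for frames whose band energy far exceeds the baseline."""
    mean, m2, n = 0.0, 0.0, 0
    for frame in frames:
        e = band_energy(frame, sr, lo, hi)
        n += 1
        delta = e - mean
        mean += delta / n           # running mean (Welford's method)
        m2 += delta * (e - mean)    # running sum of squared deviations
        std = (m2 / n) ** 0.5 if n > 1 else 0.0
        yield std > 0 and e > mean + k * std
```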
Scientists at our Research lab in São Paulo are using IBM
Deep Thunder to make these kinds of weather predictions in Brazil.
These improvements
in auditory signal processing can also apply to hearing aids and
cochlear implants, helping them better detect, extract, and transform sound information
into codes the brain can comprehend – aiding focus, or cancelling unwanted
sounds.
Forget to hit “mute” while on that conference call at work?
Your phone will know how to cancel out background noise – even if that “noise”
is you carrying on a separate conversation with another colleague!
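One classic, much-simplified approach to this kind of noise suppression is spectral subtraction: estimate the noise spectrum from a noise-only stretch of audio, then subtract it frame by frame. Real systems are far more sophisticated; the frame size and the lack of frame overlap below are simplifications.

```python
# Simplified spectral subtraction -- a classic noise-suppression idea,
# not the specific technology described in the article.
# Assumes float audio samples in [-1, 1].
import numpy as np

def spectral_subtract(noisy: np.ndarray, noise_sample: np.ndarray,
                      frame: int = 512) -> np.ndarray:
    """Suppress stationary background noise using a noise-only sample."""
    # Average magnitude spectrum of the noise-only audio.
    usable = noise_sample[:len(noise_sample) // frame * frame]
    noise_mag = np.abs(np.fft.rfft(usable.reshape(-1, frame), axis=1)).mean(axis=0)

    out = np.zeros_like(noisy)
    for start in range(0, len(noisy) - frame + 1, frame):
        spec = np.fft.rfft(noisy[start:start + frame])
        # Subtract the noise floor from the magnitude, keep the phase.
        mag = np.maximum(np.abs(spec) - noise_mag, 0.0)
        out[start:start + frame] = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), n=frame)
    return out
```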
Ultrasonics to bridge the distance between sounds
Sound travels at roughly 340
meters per second, across thousands of frequencies. IBM Research also wants to
translate information carried at ultrasonic frequencies, which we humans can’t hear,
into audio that we can. So, in theory, an ultrasonic device could allow us to
understand animals such as dolphins, or that pet dog.
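The standard trick for making ultrasound audible is heterodyning, which is how handheld bat detectors already work: mixing the signal with a local oscillator shifts the ultrasonic band down into hearing range. A sketch, with illustrative frequencies of my choosing:

```python
# Heterodyne sketch: shift an ultrasonic band down into audible range
# by mixing with a local oscillator, then low-pass filtering.
# All frequencies here are illustrative choices, not from the article.
import numpy as np
from scipy.signal import butter, filtfilt

def downshift(signal: np.ndarray, sr: float, oscillator_hz: float) -> np.ndarray:
    """Mix an ultrasonic signal down by oscillator_hz, keep the audible copy."""
    t = np.arange(len(signal)) / sr
    mixed = signal * np.cos(2 * np.pi * oscillator_hz * t)  # bands at f +/- oscillator_hz
    b, a = butter(4, 15000 / (sr / 2), btype="low")         # discard the high copy
    return filtfilt(b, a, mixed)

# A 42 kHz tone (dolphin- or bat-range) sampled at 192 kHz becomes
# an audible 2 kHz tone after mixing with a 40 kHz oscillator.
sr = 192000
t = np.arange(sr) / sr
audible = downshift(np.sin(2 * np.pi * 42000 * t), sr, 40000.0)
```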
And what if a sound
you want or need to hear could cut through the noise? The same device that
transforms and translates ultrasonics could work in reverse. Imagine
wanting to talk with someone who is only a short distance away, yet still
too far to reach without yelling (say, from across a crowded room). A smartphone paired with an ultrasonic
system could turn the speaker’s voice into an
ultrasonic frequency that cuts through the sounds in the room, to be delivered to,
and re-translated for, only the recipient of the message – who would hear it
as if the speaker were standing close by, no receiving device needed.
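The reverse direction resembles how parametric “directional” speakers already work: the voice is amplitude-modulated onto an ultrasonic carrier, which travels in a tight beam, and is demodulated near the listener. A toy version of the modulation step, with assumed carrier and depth:

```python
# Toy sketch of amplitude-modulating speech onto an ultrasonic carrier,
# the principle behind parametric speakers. Carrier frequency and
# modulation depth are assumptions; sr must exceed twice the carrier.
import numpy as np

def modulate_onto_ultrasound(voice: np.ndarray, sr: float,
                             carrier_hz: float = 40000.0,
                             depth: float = 0.8) -> np.ndarray:
    """Return an ultrasonic signal whose envelope carries the voice."""
    voice = voice / (np.max(np.abs(voice)) + 1e-12)  # normalize to [-1, 1]
    t = np.arange(len(voice)) / sr
    carrier = np.sin(2 * np.pi * carrier_hz * t)
    # Classic AM: the audible message rides on the inaudible carrier.
    return (1.0 + depth * voice) * carrier
```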
This ultrasonic
capability could also help a police officer warn a pedestrian not to cross a
busy road without shouting over the traffic noise. And parents could “call”
their children to come in from playing in the neighborhood when it’s time for
dinner – without worrying if their children’s cellphones were on or not.
If you think cognitive systems will most likely gain the ability to hear before augmenting the other senses, vote for it here.
IBM thinks these cognitive systems will connect to all of
our other senses. You can read more about sight, smell, taste, and touch technology
in this year’s IBM 5 in 5.
Note: In February 2013, as part of earning the Tan Chin Tuan Exchange Fellowship in Engineering, Dr. Kanevsky lectured about and demonstrated his transcription technology at Nanyang Technological University, Singapore. You can watch his lecture Why I Care About Hessian-Free Optimization in its entirety here.