Sound — The Next Frontier

Ideaspring Capital
5 min read · Apr 24, 2019


Speech is the most ancient form of communication humans know. Over time, it has remained the preferred, or at least the most efficient, medium of communication for a majority of the population.

Even today, speech is the easiest and most natural interface we have. As people look for more natural ways to interact with machines, voice is set to be the next big step in UX.

In addition to communication, humans have used sounds for a range of diagnostic purposes as well. Hunters listen to and use sounds to track and lure prey. Doctors listen to your heartbeat and breathing to evaluate your health. Mechanics (and even you) listen to your vehicle and say “this doesn’t sound right”.

Of late, there has been a lot of activity in the voice domain. In the last few years, we have seen a rapid rise in voice-activated interfaces and audio-based content such as podcasts, audiobooks and Alexa briefings.

Voice-based human-machine interfaces, more commonly in the form of voice assistants, are at the forefront of voice technology today. Hands-free, in-car voice assistants and home assistants have seen great adoption in the consumer market, and will continue to dominate.

However, moving beyond just voice assistants, sound-based solutions are being built and increasingly deployed in retail, healthcare, industrial IoT and various other areas. Advances in microphone technology, digital signal processing, machine learning, deep learning and the like are fueling these new innovations, just as they did for image- and video-based innovations.

Sound is ubiquitous: everything, human or machine, generates it. It characteristically carries more information (emotion, context, loudness, etc.) than text and is easy to collect. This rich information enables analyses that would not otherwise be possible.

Models trained to classify the sounds picked up by microphones could reshape a number of industries. For example, a very practical use-case is diagnosing tuberculosis from the sound of a cough. For context, tuberculosis can currently be confirmed only through a range of tests performed after an initial skin or blood test detects the presence of the bacteria that cause it.

Use-cases can even be as niche as detecting when a piglet is about to be crushed by a sow rolling over on a farm.

Here are a few hand-picked domains with interesting use-cases that go beyond just voice and focus on ‘sound’ — and they are just the tip of the iceberg.
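Most of these use-cases reduce to the same core task: classifying a short audio clip by its spectral signature. As a minimal sketch of the idea (the nearest-centroid classifier over FFT band energies below is an illustrative assumption; real systems use MFCC-style features and trained deep models):

```python
import numpy as np

def band_energies(signal, n_bands=8):
    """Average FFT magnitude in n_bands roughly equal frequency bands."""
    spectrum = np.abs(np.fft.rfft(signal))
    return np.array([band.mean() for band in np.array_split(spectrum, n_bands)])

def nearest_centroid(features, centroids):
    """Return the label whose reference feature vector is closest."""
    dists = {label: np.linalg.norm(features - c) for label, c in centroids.items()}
    return min(dists, key=dists.get)

# Toy example: a low-frequency "hum" vs a high-frequency "hiss".
sr = 8000
t = np.arange(sr) / sr
hum = np.sin(2 * np.pi * 200 * t)    # 200 Hz tone
hiss = np.sin(2 * np.pi * 3000 * t)  # 3 kHz tone

centroids = {"hum": band_energies(hum), "hiss": band_energies(hiss)}
test_clip = np.sin(2 * np.pi * 220 * t)  # unseen clip near 200 Hz
print(nearest_centroid(band_energies(test_clip), centroids))  # → hum
```

The same shape of pipeline (features from a spectrum, then a classifier) underlies cough screening, machine monitoring and the other use-cases below, just with far richer features and models.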

1. Industrial IoT

Even though a lot of sensors and IoT platforms are available, deployment of IoT solutions within manufacturing industries is not frictionless for the following reasons:

a. Sensors need to be mounted on, and sometimes wired to, existing machines.

b. Maintenance of the above systems is not easy, particularly if machines are moved around.

c. Legacy machines may not have any mechanism to collect data.

However, sound-based sensors provide a simple, contactless and non-intrusive alternative to the above problems. This would encourage industries to try out solutions without making drastic changes to their existing setup.
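As a hedged sketch of what such acoustic monitoring might look like: record a baseline from a healthy machine, then flag audio frames whose energy deviates sharply from it. The frame length, threshold factor and synthetic signals below are illustrative assumptions, not a production method:

```python
import numpy as np

def rms(frame):
    """Root-mean-square energy of an audio frame."""
    return np.sqrt(np.mean(frame ** 2))

def detect_anomalies(signal, baseline_rms, frame_len=256, factor=3.0):
    """Flag frames whose RMS energy exceeds factor × the healthy baseline."""
    n_frames = len(signal) // frame_len
    return [rms(signal[i * frame_len:(i + 1) * frame_len]) > factor * baseline_rms
            for i in range(n_frames)]

rng = np.random.default_rng(0)
healthy = 0.1 * rng.standard_normal(2048)   # quiet machine hum
baseline = rms(healthy)

faulty = healthy.copy()
faulty[1024:1280] += 2.0 * rng.standard_normal(256)  # loud transient, e.g. a bearing knock

flags = detect_anomalies(faulty, baseline)
print(flags.index(True))  # → 4, the frame containing the transient
```

Real deployments would compare spectral shape rather than raw energy, but the appeal is the same: a microphone placed near a machine, with no wiring into legacy equipment.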

2. Medical Diagnostics

Historically, audible sound has had two primary uses in healthcare: stethoscopes and, believe it or not, medical transcription. The stethoscope is perhaps the most recognizable of all medical diagnostic devices, used to listen to the heart, lungs, and even blood flow in arteries and veins.

However, it has always been an analog device. With the advent of electronic stethoscopes, sound quality has improved, resulting in better diagnoses, and recordings can now be stored for further analysis, for consultations with other doctors and, more importantly, for training interns, junior doctors and even machine learning models.

That last part presents an opportunity: large data sets accumulated over time can be used to train models that diagnose diseases early, without the significant human involvement currently required.

As an extension of this, sound-based diagnosis doesn’t have to be limited to stethoscopes alone; take, for example, the startup working on diagnosing TB by studying coughs.

Even medical transcription is getting a tech overhaul: startups are attempting to transcribe doctors’ voice notes into text in real time, avoiding the hassle and delay of having a team transcribe them manually.

3. Content creation tools

With the increase in consumption of voice-based content, especially podcasts and audiobooks, there is great demand for voice-based content creation and editing tools.

Since available tools are not very user-friendly, there is a lot of scope for innovation in this area, for example, editing audio files through their textual representation.
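One way such text-based editing could work, as a sketch: a transcription step (hypothetical here) yields word-level sample timestamps, and deleting a word from the text deletes the corresponding span from the waveform:

```python
import numpy as np

def cut_words(audio, words, to_delete):
    """Remove the audio spans of the listed words; keep everything else.

    words: list of (word, start_sample, end_sample) tuples, assumed to
    come from a word-aligned transcription.
    """
    keep = np.ones(len(audio), dtype=bool)
    for word, start, end in words:
        if word in to_delete:
            keep[start:end] = False
    return audio[keep]

audio = np.arange(1000)  # stand-in for a real waveform
words = [("hello", 0, 300), ("um", 300, 450), ("world", 450, 1000)]

edited = cut_words(audio, words, {"um"})
print(len(edited))  # → 850: the filler word's 150 samples are gone
```

A production editor would also crossfade across each cut to avoid clicks, but the mapping from text edits to waveform edits is the core idea.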

With voice synthesis technology maturing, another use-case is the creation of audio content with minimal effort from the speaker.

4. Sound-based ads personalization

The sound around you carries a lot of contextual information, which can be used to deliver a personalized ad experience.

An interesting solution for advertisers is serving relevant ads on your secondary device (for example, your mobile phone) based on the sounds generated by the content you are consuming on your TV or laptop.

Another use-case is localizing your position in a retail store by picking up signals from in-store speakers, which are already prevalent, and using that position to serve relevant ads on your mobile device in real time.

5. Customer call analysis

With enterprises increasingly investing in keeping their customers happy, it is imperative to analyze customer interactions in a detailed and methodical way rather than on a sampling basis. However, the volume of customer calls is simply too large for manual analysis.

Automating customer call analysis, by transcribing calls and then identifying the customer’s intent and mood, would go a long way towards improving customer happiness. Doing this in real time can also let customer support representatives be guided during the call itself.
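As an illustrative toy of the post-transcription step, a keyword-based analyzer (the lexicons below are made-up stand-ins for what would in practice be trained intent and sentiment models):

```python
# Hypothetical keyword lexicons standing in for trained models.
NEGATIVE = {"angry", "refund", "cancel", "terrible", "frustrated"}
INTENTS = {
    "billing": {"charge", "charged", "invoice", "refund"},
    "support": {"broken", "error", "crash"},
}

def analyze(transcript):
    """Return a (mood, intent) pair for a call transcript."""
    tokens = set(transcript.lower().split())
    mood = "negative" if tokens & NEGATIVE else "neutral"
    intent = next((name for name, kws in INTENTS.items() if tokens & kws), "other")
    return mood, intent

print(analyze("I was charged twice and I want a refund"))
# → ('negative', 'billing')
```

Running this per utterance during a live call is what would let a dashboard prompt the representative mid-conversation, e.g. flagging a negative-mood billing call for escalation.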

There are many challenges in developing solutions that are real-time, highly accurate and, more importantly, privacy-preserving. But with advances in technology, especially edge computing, we are bound to see solutions that address all these concerns.

At Ideaspring Capital, we look forward to discovering innovative startups in this space and are excited about the potential it holds.

This article was written by Suryaprakash Konaruru, CTO of Ideaspring Capital.

Ideaspring Capital is an early-stage VC fund investing in technology product companies in India.
