Nine months ago, SoundHound wowed the world when it took the wraps off Hound, a voice-enabled personal assistant that was not only faster than rival apps Siri and Google Now but could also understand and correctly answer questions of surprising complexity. This didn’t happen overnight, however. It required nine years of work and incredible foresight on the part of its engineers. In an interview with BGR, SoundHound founder and CEO Keyvan Mohajer explained how he conceived Hound and how he expects to compete against rival digital personal assistants created by tech giants Apple, Google and Microsoft.
Mohajer first came up with the idea for Hound when he was a PhD student at Stanford. He had a deep conviction that the next big evolution in personal computing would come in the form of voice-enabled personal assistants that would not only answer basic questions but would be capable of more or less conversing.
There was just one problem with his grand vision: He knew the technology required to make it a reality would take a long time to develop. As in, a decade. Pitching this idea to venture capitalists who want faster returns on their investments was going to be a challenge.
So he knew that he’d have to build a product that could be released sooner and would serve as a bridge to his grander ambitions. That product was SoundHound, the music identification app that lets people identify songs just by humming the tune. This gave him a preview of some of the challenges he and his team would face in building software capable of accurately interpreting human speech. After all, many people who hum a tune aren’t doing so in the proper key, which means that the app has to do a lot of filling in the blanks to properly interpret musical syntax.
“Most people are not good singers or hummers,” Mohajer explains, adding that there’s an additional challenge because “most times when people use the feature, they’re drunk.”
All the while, the crew at SoundHound kept their eye on the long-term prize and worked diligently on voice recognition software that didn’t have a lot of the drawbacks that we see with Siri and Google Now. Right from the start, they knew they wanted to develop software that eliminated unnecessary extra steps that would harm user experience.
“When you speak to Siri, first your voice becomes text then it gets [converted] to meaning,” he says. “It takes two steps, so it’s a little bit slower.”
In contrast, Hound converts your voice to text and interprets it simultaneously. This is why it delivers much faster responses than any other voice assistant on the market right now. They’ve also trained Hound to be much better at understanding and answering complex questions — for example, you can ask it to list all Asian restaurants within three miles of you that aren’t Chinese food and that are open between the hours of noon and 8 p.m. on Sundays… and Hound will be able to retrieve that information.
As impressive as Hound is, the app is still only a small piece of Mohajer’s grand plan to “Houndify” everything. Houndify is a platform that lets third-party app developers integrate Hound’s voice recognition technology into their apps. The company demonstrated how powerful this capability can be by bringing Hound support for both Yelp and Uber so that users can access Yelp’s business listings or request a ride right through the Hound app.
The eventual goal is to have Houndify embedded not only in smartphone apps but also in all the connected devices around your house. Or as Mohajer puts it, it’s “not just about an app on your phone, it’s about anything you interact with.”