The truth about Siri


Poor, poor Siri. Apple’s (AAPL) virtual personal assistant was welcomed with open arms when it was unveiled last year, and the new feature was also credited with being the driving force behind Apple’s record iPhone 4S sales. But in tech, new features quickly lose their luster and in recent months, complaints surrounding Siri have mounted. Several lawsuits have even been filed against Apple claiming that its Siri commercials are “misleading and deceptive” when they portray the virtual assistant as being easy to use and quick to respond. There is no question that Apple’s personal assistant needs work — I’ve had my fair share of disagreements with Siri (NSFW) — but an article titled “The Stupidity of Computers” published in a recent issue of n+1 magazine helps explain just how impressive Siri really is.

Media and analyst coverage of Siri has become increasingly critical these past few months. Articles along the lines of The New York Times’s recent piece “With Apple’s Siri, a Romance Gone Sour” are growing more common, and even notorious Apple fanalyst Gene Munster gave Siri a D grade following an exhaustive series of tests he and his team conducted last month.

When Apple responds to lawsuits with suggestions like “if you don’t like it, buy a different phone,” it’s easy to be rubbed the wrong way. Apple’s frustration is understandable, however, when you consider just how much work goes into products like Siri and the new voice search features introduced in Android 4.1 Jelly Bean.

Forget for a moment how spectacularly complicated it is for a device to translate the human voice into a language computers understand; taking those questions and commands and then acting on them is itself a feat of immense proportions.

From n+1’s article:

Computers are near-omnipotent cauldrons of processing power, but they’re also stupid. They are the undisputed chess champions of the world, but they can’t understand a simple English conversation. IBM’s Watson supercomputer defeated two top Jeopardy! players last year, but for the clue “What grasshoppers eat,” Watson answered: “Kosher.” For all the data he could access within a fraction of a second—one of the greatest corpuses ever assembled—Watson looked awfully dumb.

Author David Auerbach goes on to describe just how complicated personal computing is, and he focuses at great length on search and the complexities involved with what he describes as “the problem of logically representing language.”

The problem was twofold. First, a program had to resolve the ambiguity inherent in a sentence’s syntax and semantics. Take the fairly simple sentence “I will go to the store if you do.” For an English speaker, this sentence is unambiguous. It means, “I will go to the store only if you go with me (at the same time).” But to a computer it may be confusing: does it mean that I will go to the store (how many times? and which store do I mean?) if you ever, in general (or habitually), go to the store, or just if you go to the store right now, with me? This is partly a problem with the word if, which can be restrictive in different ways in different situations, and possibly with the concept of the store, but there are lots of words like if and lots of concepts like the store, and many situations of far greater ambiguity: uncertain referents, unclear contexts, bizarre idioms. Symbolic logic cannot admit such ambiguities: they must be spelled out explicitly in the translation from language to logic, and computers can’t figure out the complex, ornate, illogical rules of that translation on their own.

Second, a program analyzing natural language must determine what state of affairs that sentence represents in the world, or, to put it another way, its meaning. And this reveals the larger problem: what is the relation of language to the world? In everyday life, people finesse this issue. No one is too concerned with exactly how much hair a man has to lose before he is bald. If he looks bald, he is. Even the legal profession can address linguistic confusion on an ad hoc basis, if need be. But when reality must be represented in language with no ambiguity (or with precisely delineated ambiguity, which is even harder), we’re stuck with the messiest parts of the problem.
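To make the first half of that problem concrete, here is a minimal Python sketch of how a single sentence can map to several logical forms. Everything in it is an illustrative assumption rather than anything from Auerbach's article or from how Siri works: the `World` fields, the three readings, and the choice to model the "natural" reading as a biconditional are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class World:
    """A toy state of affairs (hypothetical fields, for illustration only)."""
    you_go_now: bool   # you go to the store right now
    you_go_ever: bool  # you go to the store at some point, in general
    i_go_now: bool     # I go to the store right now


def implies(p: bool, q: bool) -> bool:
    """Material implication: 'if p then q'."""
    return (not p) or q


# Three candidate translations of "I will go to the store if you do."
# Each one is an assumption about what the speaker might have meant.
READINGS = {
    "only if you go now": lambda w: w.i_go_now == w.you_go_now,
    "if you go now": lambda w: implies(w.you_go_now, w.i_go_now),
    "if you ever go": lambda w: implies(w.you_go_ever, w.i_go_now),
}


def evaluate(world: World) -> dict:
    """Return the truth value of every reading in the given world."""
    return {name: reading(world) for name, reading in READINGS.items()}


if __name__ == "__main__":
    # You are not going now but do go in general, and I go anyway.
    world = World(you_go_now=False, you_go_ever=True, i_go_now=True)
    for name, truth in evaluate(world).items():
        print(f"{name}: {truth}")
```

In that example world the "only if you go now" reading comes out false while the other two come out true, so the same English sentence is kept or broken depending on which translation a program picks. A person resolves this without noticing; a program has to be told which reading applies.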

Auerbach’s piece is a fantastic read, and is highly recommended to anyone looking for some insight into just how complicated a problem companies like Apple and Google (GOOG) are looking to solve. And, of course, Auerbach only scratches the surface.

Personally, I don’t use Siri very often, and when I do use it, I see mixed results. Sometimes Siri doesn’t understand my question or command, sometimes it jumbles words, and sometimes it gets stuck processing my query indefinitely until I cancel my request. Most times, though, Siri gives me exactly what I’m looking for. And the truth about Siri is that this is just the beginning.

Apple’s Siri personal assistant and Google’s voice search will continue to evolve and improve, and technologies like these — as well as motion control like the tech touted by Leap Motion — will change the way we interact with smartphones, computers and eventually household gadgets, cars and more. The technology is painfully complex and growing pains are inevitable, but the endgame is a digital revolution.
