James Hendler recently discussed what the arrival of Watson at RPI will mean for the growing technology. He writes, “The Watson program is already a breakthrough technology in AI. For many years it had been largely assumed that for a computer to go beyond search and really be able to perform complex human language tasks it needed to do one of two things: either it would ‘understand’ the texts using some kind of deep ‘knowledge representation,’ or it would have a complex statistical model based on millions of texts.”

He goes on, “Watson used very little of either of these. Rather, it uses a lot of memory and clever ways of pulling texts from that memory. Thus, Watson demonstrated what some in AI had conjectured, but to date been unable to prove: that intelligence is tied to an ability to appropriately find relevant information in a very large memory. (Watson also used a lot of specialized techniques designed for the peculiarities of the Jeopardy! game, such as producing questions from answers, but from a purely academic viewpoint that’s less important.)”
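
To make the idea of "pulling texts from memory" concrete, here is a minimal sketch in Python. It is not Watson's actual method, which Hendler does not detail; it assumes a toy in-memory list of passages and ranks them by how many rare words they share with the question. The function names (`tokenize`, `retrieve`) and the scoring formula are illustrative assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, memory, top_k=1):
    """Return the top_k passages from memory that best match the question.

    Hypothetical scoring: each word shared by the question and a passage
    adds log(1 + N / df), so words found in fewer passages count for more.
    """
    tokenized = [set(tokenize(passage)) for passage in memory]

    # Document frequency: in how many passages does each word appear?
    doc_freq = Counter()
    for terms in tokenized:
        doc_freq.update(terms)

    n = len(memory)
    question_terms = set(tokenize(question))

    scored = []
    for passage, terms in zip(memory, tokenized):
        shared = question_terms & terms
        score = sum(math.log(1 + n / doc_freq[t]) for t in shared)
        scored.append((score, passage))

    # Highest score first; sorting is stable, so ties keep memory order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for _, passage in scored[:top_k]]
```

Called with a question such as "What is a heart attack?" and a handful of passages, `retrieve` returns the passage that shares the most distinctive words with the question. A real system would hold far more text and use far richer evidence than word overlap.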

Hendler adds, “Right now, to take Watson into a new domain — for example, to be able to answer questions about health and medicine — Watson works by reading texts. First, it needs a lot of information to go into its memory, which is generally provided by giving it a million or more documents to process from any particular area or discipline. Second, it needs to have information about the specialized terms used – for example, to be told that the word ‘attack’ in ‘heart attack’ is a noun and not a verb. Technical terms, such as, say, ‘myocardial infarction’ also need to be identified. Finally, to hone its ability in the new area it needs a combination of questions and answers to train from.”
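
The second step Hendler describes, telling the system about specialized terms, can be illustrated with a small sketch. This is an assumption-laden toy, not how Watson is configured: it uses a hypothetical hand-written lexicon (`DOMAIN_TERMS`) that maps multi-word terms to a part of speech, and a hypothetical helper (`tag_terms`) that matches the longest known term first so that "attack" in "heart attack" is treated as part of a noun rather than as a verb.

```python
import re

# Hypothetical domain lexicon: multi-word terms and their part of speech.
DOMAIN_TERMS = {
    ("heart", "attack"): "NOUN",
    ("myocardial", "infarction"): "NOUN",
}


def tag_terms(text, lexicon):
    """Split text into words, merging any known domain term into one unit.

    Returns a list of (term, tag) pairs; words not in the lexicon get None.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    max_len = max((len(key) for key in lexicon), default=1)

    tagged = []
    i = 0
    while i < len(tokens):
        # Try the longest possible term starting at position i first.
        for size in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + size])
            if candidate in lexicon:
                tagged.append((" ".join(candidate), lexicon[candidate]))
                i += size
                break
        else:
            # No lexicon entry starts here; keep the single word untagged.
            tagged.append((tokens[i], None))
            i += 1
    return tagged
```

For the input "Patient had a heart attack", the last two words come back as the single term "heart attack" tagged as a noun, while the remaining words are left untagged.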

Image: Courtesy IBM