Industry veteran Bruce Wilcox is creating text chatbots for NPCs in the online world Blue Mars, and this technical article discusses his experience using an AI markup language to create effective human-text interaction.
After a historical overview of chatbots from Eliza to A.L.I.C.E., Wilcox delves into AIML, an XML-based AI markup language used for making chatbots. AIML is rule-based: it matches a sequence of words and generates either a completely canned response or a response that substitutes input words into an output template.
"In AIML, each stimulus/response pair is called a category (calling it a 'category' is confusing since it is not what that word means in English). An AIML category is a series of tags with data that describe how to react to a specific input string.
The minimal tags are Pattern and Template, which describe the input text stimulus and the output text response. Patterns consist of case insensitive words and the wildcard * which matches one or more words. The pattern must cover the entire input sequence with punctuation removed.
A simple pattern might be My name is * and the template It's good to meet you, * . Thus when it sees input matching the pattern, whatever follows my name is becomes the * and is filled in as part of the response. 'My name is Roger Rabbit' becomes 'It's good to meet you, Roger Rabbit.'
Of course if the user had said 'My name is Bob, because I was named after my father,' you'd get 'It's good to meet you, Bob, because I was named after my father.' After all, chatbots are inherently dumb."
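The matching behavior Wilcox describes can be sketched in a few lines of Python. This is only an illustration of the idea, not an AIML interpreter: the function names (normalize, match, respond) are hypothetical, and the sketch assumes that punctuation is stripped before matching and that * must consume at least one word, as the quoted description states.

```python
import re

def normalize(text):
    """Drop punctuation (keeping apostrophes) and split the input into words."""
    return re.sub(r"[^\w\s']", " ", text).split()

def match(pattern_words, input_words):
    """Return the words captured by each '*', or None if the pattern
    does not cover the entire input."""
    if not pattern_words:
        return [] if not input_words else None
    head, rest = pattern_words[0], pattern_words[1:]
    if head == "*":
        # A wildcard must consume at least one word; try the shortest span first.
        for n in range(1, len(input_words) + 1):
            tail = match(rest, input_words[n:])
            if tail is not None:
                return [input_words[:n]] + tail
        return None
    # Literal words are compared case-insensitively.
    if input_words and input_words[0].lower() == head.lower():
        return match(rest, input_words[1:])
    return None

def respond(pattern, template, text):
    """Fill each '*' in the template with the matching captured words."""
    captures = match(pattern.split(), normalize(text))
    if captures is None:
        return None
    reply = template
    for words in captures:
        reply = reply.replace("*", " ".join(words), 1)
    return reply

print(respond("My name is *", "It's good to meet you, *.",
              "My name is Roger Rabbit"))
```

Given 'My name is Bob, because I was named after my father,' this sketch also echoes the entire trailing clause back (minus the comma it stripped), which is the weakness the quote points out.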
He goes on to list complaints specific to AIML, pointing out that the markup language's biggest flaw is that it's too wordy and requires huge numbers of effectively redundant categories:
"Since the pattern matching of AIML is so primitive and generic, it takes a lot of category information to perform a single general task. If you want the system to respond to a keyword, either alone or as a prefix, infix, or suffix in a sentence, it takes you four categories to do so, one for each condition (with three of them remapping using SRAI to the fourth).
Had AIML used regular expressions for patterns, this could have been reduced to a single category statement. This leads to a critical point. Conciseness is good and having to have multiple flavors of the same rule is bad. The more you write, the harder it is to keep it organized, debug it, etc.
This was a frequent problem for large-scale expert systems (or any software for that matter). Of course regular expressions are devilishly hard to read and being able to easily understand your rules is also important. But I find even using XML like this is hard to read. It lacks conciseness. The intrusion of the xml keyword structure makes it slow to skim read what you do have. XML is not readable; it is barely legible."
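To make the four-categories complaint concrete, here is a rough Python comparison. It is a sketch under stated assumptions rather than real AIML: the keyword "pizza" and the function names are invented for illustration, and the four checks only mimic the shape of the patterns KEYWORD, KEYWORD *, * KEYWORD, and * KEYWORD * with a * that matches one or more words.

```python
import re

KEYWORD = "pizza"  # hypothetical keyword, chosen only for this example

def normalize(text):
    """Lowercase the input, drop punctuation, and split it into words."""
    return re.sub(r"[^\w\s']", " ", text).lower().split()

def four_rule_match(text):
    """Mimic the four wildcard patterns needed to catch one keyword."""
    words = normalize(text)
    n = len(words)
    alone = words == [KEYWORD]                 # KEYWORD
    prefix = n > 1 and words[0] == KEYWORD     # KEYWORD *
    suffix = n > 1 and words[-1] == KEYWORD    # * KEYWORD
    infix = n > 2 and KEYWORD in words[1:-1]   # * KEYWORD *
    return alone or prefix or suffix or infix

# One regular expression covers all four positions at once.
KEYWORD_RE = re.compile(rf"\b{re.escape(KEYWORD)}\b", re.IGNORECASE)

def regex_match(text):
    return KEYWORD_RE.search(text) is not None

for sample in ["Pizza", "Pizza is great", "I love pizza",
               "I love pizza a lot", "I love pasta"]:
    print(sample, four_rule_match(sample), regex_match(sample))
```

On these sample inputs the two approaches agree, but the first needs a separate rule per keyword position while the second needs a single expression, which is the conciseness argument made above.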
You can read the full feature article, which includes additional criticisms of AIML and an overview of CHAT-L, a new scripting language Wilcox is designing that avoids many of AIML's problems (no registration required, please feel free to link to this feature from external websites).