Parsing For Meaning

A new approach to extracting meaning from English sentences.
by Gary J. Shannon
July 9, 2006

What Is Meaning?

Before we can extract the meaning from a sentence we need to know what we mean by "meaning". For the purpose of this discussion we will assume that the parser we are trying to build is an interface to a computerized world simulation of some kind, and that sentences are meant to provide information to the simulation model, or to make queries about objects in that model. The goal is to make it possible for a human to have a normal conversation with a virtual character in that simulation. In this context "meaning" consists of whatever processing steps must be performed to bring about, within the simulation, whatever is described by the sentence. Thus we can say that a sentence is "understood" when it has been successfully simulated in the model world.

As an example, we might tell the simulation, "My friend John has a cat." The sentence is understood when the simulation engine creates an instance of a person object, assigns the name "John" to that instance, marks that instance as "friend of" the person object representing the speaker, creates an instance of a cat object, and assigns ownership of that cat object to the person object named "John". After executing these simulation commands, the simulated world contains a simulated John who owns a simulated cat, and the sentence has been understood and incorporated into the world model.

The Unimportance of Grammar

A formal grammar, and any parsing strategies based on that grammar, have as their primary purpose recognizing whether a given sentence is or is not a statement in the language defined by that grammar. The output may be some structural analysis of the sentence showing the syntactical relationships among the various components of the sentence. While this is an interesting endeavor, it is of little practical use in a conversational system. Such parsers operate on text which has been proof-read and edited and which can be assumed to have correct grammar and spelling. In casual conversation, including typed on line conversation, these assumptions cannot be made. Nor is a structural diagram of the sentence of any particular use in extracting the meaning, of the sentence, as defined above.

Formal grammars and parsing methods, therefore, fail on two counts where casual conversation is concerned. Any parsing method used in casual conversation must be tolerant of grammatical and spelling errors, and must grasp the meaning of the sentence, while being indifferent to its exact structure, and even the exact vocabulary used to express that meaning. While the formal parser must retain the identity of each word, informal parsing allows words to be replaced by suitable substitutes during the parsing process. Thus "go into" can be replaced by "enter", and "caught a glimpse of" by "saw briefly" without affecting the meaning of the sentence.

In order for the informal parser to be indifferent to grammatical correctness, grammatical constraints must be loosened considerably. These fuzzier rules, however, must still require that the sentence have meaning. Thus a sentence like "Them dogs is mean." should be recognized as meaningful while "My run else book and." should be recognized as not meaningful.

While a linguist might insist that nuances and shades of meaning be retained, the informal parser has no need to do so. The meaning extracted from the sentence only needs to be close enough to enable reasonably accurate modeling of the event described by the sentence. Thus "The ball was hit by John." is considered to be identical in meaning to "John hit the ball." Any nuance suggested by the passive is regarded as irrelevant, since both sentences result in the same event being modeled in the world simulation.

Link Grammar

Link Grammar is a relatively new method of describing the relationships between words in a sentence. An introductory web site by Davy Temperley, Daniel Sleator, and John Lafferty gives a good description of the approach. Like any formal grammar it is not suitable, as it presently stands, for informal parsing of conversational sentences. The basic principle, however, is adaptable to a kind of "fuzzy link grammar" that could form the basis of the kind of parsing we are exploring here.

Unlike Link Grammar we will not attempt to parse every word of the input sentence. Because casual speech is heavily loaded with slang and idiomatic phrases whose parse structures have nothing to do with their meanings, these phrases are captured by a pre-processing step which replaces them with conventionalized representations. Thus "chill out dude" might become "calm yourself", and "give you a heads up" might become "warn you". Likewise common multi-word names for single entities like "peanut butter and jelly sandwich", and "used car salesman", will be compressed into single-word internal representations (e.g. "pbnj-sandwich", "used-car-salesman") before parsing.

While such a dictionary of slang, idioms, and compound nouns could potentially be huge, disk space is hardly a concern on modern computer systems, and with the right access methods (e.g. a relational database to select only those patterns with words actually present in the sentence.) retrieving patterns from the dictionary would be very quick and could prevent a lot of pointless parsing while bringing the meaning of the sentence closer to the surface.

The Importance of Semantics

Having relaxed the rules of syntax in the informal parsing process, we need to emphasize the importance of semantics. By definition the word "semantics" refers to meaning in language, and parsing for meaning will have to put heavy emphasis on semantics. For example in a Link Grammar dictionary it suffices to know that "box" can be a noun, but in a semantic dictionary it is important to know that "box" is a kind of container, since some verbs and prepositions ("Put that in the box.") only make sense for containers. Other words might have a different meanings depending on whether the word they link to is an animate object, an event, or a machine. ("John is running." vs "John is running the election." vs "The generator stopped running.") These distinctions, while irrelevant for syntactical parsing, are crucial to accurate semantic parsing.

Each thing word in the semantic parsing dictionary must be identified with whatever class properties are appropriate to the named object, and each action word must include whatever information is necessary to know how that word is used with members of various classes of entities. In other words, a certain amount of common sense world knowledge is encoding directly into the dictionary and used during parsing to narrow down the possibilities.

Left to Right Parsing

Unlike most parsers, which are recursive and generate many alternate candidate parse trees before arriving at an acceptable one, we will attempt to create a parser that proceeds in a strictly left-to-right fashion, carrying alternative interpretations as a sort of "superposition of states", to borrow a term from Quantum Physics. By the time the end of the sentence is reached it should be possible to "collapse the wave function" as they say in physics, and have all the alternate possibilities cancel out leaving one correct interpretation.

Also unlike conventional parsers, this parser may, at various times during the parsing process, remove and replace sections of the sentence and emit "units of meaning" representing the meaning of the removed sections. For example, while parsing "The cat I saw was black but this cat is white." there could come a time when the unit of meaning "cat[1] that I saw" is emitted in some form and replaced with "cat[1]" in the sentence. The sentence then becomes "cat[1] was black, but this cat[2] is white." Later in the process the unit "cat[1] was black" is emitted leaving only "but this cat[2] is white." to be further parsed. This is analogous to factoring out terms in a mathematical expression, and we will use the term "factoring" to describe it. It is obvious that this process of eager factoring alters the structure of the sentence while it is being parsed, something that is never done by conventional parsers.

The Next Step

This web page is not here to present the results of building the kind of parser needed for informal conversational parsing, but to chronicle my experiments as I attempt to do so. The project has only just begun so no conclusions about the feasibility of the parser can be offered. Everything here should be treated as preliminary and provisional.


On Representing Semantic Classes

Every object noun in the dictionary is a member of one or more semantic classes. Rather than try to devise a strictly hierarchical class structure, each word will inherit the properties of one or more classes. Some of these classes will be hierarchical (e.g. "Enterable" is a kind of "Container", and "Room" is a kind of "Enterable".) A class may directly descend from only one other class. In addition to hierarchical classes, there will be auxiliary classes, referred to as "types" which can be applied in a non-hierarchical manner. A single class may inherit many different types, and each type may be inherited by objects which are not hierarchically related. (e.g. "Door" is a "Barrier" of type "Openable" and type "Lockable", while "Box" is a "Container" of type "Openable" and "Handcuffs" could be a "Restraint" of type "Lockable", and type "Wearable".)

For ease of editing the dictionary long class names and type names will be used. But for economy of disk space and speed of processing these long-word dictionary entries will be compressed into an internal form using short machine-generated abbreviations for the classes and types. Thus a dictionary entry that looks like: jail-cell cell cage = Container( Enterable, Closable, Lockable, Transparent ) might be compressed into the form: jail-cell cell cage=C27(T19,T16,T42,T6). All of this is irrelevant to the parsing processes and can be ignored without loss of meaning. For all practical purposes we can think only about long names in the dictionary entries and leave the compression as something magical that happens behind the scenes.

Many words, if not most of them, have more than one syntactical category and more than one semantic class. "Cell" in the example above can also be a unit of biological life, a transparent sheet used by animators, a room in a monastery, a small group in a covert organization, or a part of an electrical battery, among others. For each such meaning a separate dictionary entry is made.

Like nouns, verbs also have complementary semantic classes. For example, the verb "lock" wants to have a direct object of type "Lockable", while "open", "close", "slam", "shut", "seal", "break into" are all content to associate with direct objects of type "Closable". These preferences are enforced by using "links" between words. These links are inspired by the Link Grammar mentioned above, but are more semantic in nature.

Semantic Links

In a link grammar each word is given a set of links which connect with complementary links on other words in the sentence. These links may either reach out for words to the left, or words to the right. For example, a typical transitive verb will have a left-pointing link that expects a subject (an S- link in the terminology of Link Grammar) and a right-pointing link that expects a direct object (an O+ link). In the present system each link also requires a list of semantic classes that will satisfy the link. "Lock" will require not just a direct object, but specifically a direct object of the type "Lockable". "Run" will require a subject which is either of type "Animate", or "Machine". If it is warranted by an individual verb a type might be created specifically for that verb, so that "run" expects a subject that is of type "Runner", and possibly a direct object of type "Event", which might also be of type "Runnable", as in "Jack ran the meeting smoothly." Some nouns might even be of both those types as in "The drill press was running." and "Fred was running the drill press." And of course "water" needs to be of both types "Runner" and "Runnable" as in "Will you run the water for me?" and "The water ran down the side of the mountain."

The biggest problem with semantic classes is that their number could easily grow to become enormous. Essentially, we would end up pairing every possible verb with every possible noun and deciding whether that pair works or not. This is a good reason to institute some kind of vocabulary compression scheme before the parsing itself begins. Then we would not have to consider a large number of synonyms in the dictionary, but only one representative word that would stand in for all those synonyms. Along the lines of Basic English our dictionary would contain a reduced inventory of English words, and a pre-processing step would insure that every word in the sentence was either in the basic dictionary, or could be replaced by a word or phrase that is in the basic dictionary. We can dispense with "foal" by using "baby horse", while substituting "female horse" for "mare". Likewise, specific place names could be replaced by a generic category name qualified by a wildcard specifier, so that "New York" would become "city[new-york]". Then only "city[*]" needs to appear in the link dictionary, and "new-york" gets brought along for the ride without ever having to match a dictionary entry.