Building a Context Network from a Narrative
An Approach to Chatbot Question Answering
Page Two: Details of Parsing
by Gary J. Shannon
Started Nov. 4, 2008
Last Updated Nov. 7, 2008
Page One: Overview
Page Two: Details of Parsing
Page Three: World Models and Reasoning by Analogy
Page Four: Exploring Some Complications
Page Five: Nuts and Bolts 1: Part of Speech Tagging
Page Six: Nuts and Bolts 2: From Tags to Nodes
Page Seven: Chunking and Knowledge Units
Noun Phrases and Head Nouns
To begin with, we do not want to parse out complete noun phrases. Complete noun phrases often contain prepositional phrases and possessive pronouns or possessive suffixes on nouns, as in "John's box of tea from India". Since we intend for the possessive and prepositional relationships between nouns to be represented by edges in the directed graph, each primitive noun must occupy its own node in the graph. Even the phrase "a beautiful boat" needs to be represented by two separate nodes in the graph, one for "beautiful" and one for "boat". The articles can be ignored.
For example, "John's box of tea from India," should parse out like this:
box (head noun)
of -> tea
belongs to -> John
from -> India
So we have four nodes and three edges.
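As a rough illustration, the parse above can be held in a very small graph structure. The class name ContextGraph and the (head, label, dependent) triple layout are assumptions made for this sketch, not something defined in these pages.

```python
class ContextGraph:
    """Directed graph with labeled edges (hypothetical structure for illustration)."""

    def __init__(self):
        self.nodes = set()
        self.edges = []  # list of (head, label, dependent) triples

    def add_edge(self, head, label, dependent):
        # Adding an edge also registers both endpoints as nodes.
        self.nodes.update((head, dependent))
        self.edges.append((head, label, dependent))


# "John's box of tea from India": every edge starts at the head noun "box".
graph = ContextGraph()
graph.add_edge("box", "of", "tea")
graph.add_edge("box", "belongs to", "John")
graph.add_edge("box", "from", "India")

print(len(graph.nodes), "nodes,", len(graph.edges), "edges")  # 4 nodes, 3 edges
```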
Both simple and compound nouns are looked up in the dictionary. Thus nouns like "cupboard door" and "dining room table" are simply found, as entire units, in the dictionary, and each is placed into a node as a single unit.
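One plausible way to find compound nouns as whole units is to try the longest candidate first at each position. The text does not say how the lookup is ordered, so the longest-match strategy, the tiny NOUNS set, and the max_len limit below are all assumptions of this sketch.

```python
# A toy dictionary; a real one would be far larger.
NOUNS = {"cupboard door", "dining room table", "door", "room", "table"}


def find_nouns(words, nouns, max_len=3):
    """Return dictionary nouns found in words, preferring the longest match."""
    found = []
    i = 0
    while i < len(words):
        # Try the longest span first so "dining room table" beats "room".
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in nouns:
                found.append(candidate)
                i += n
                break
        else:
            i += 1  # no noun starts at this word
    return found


print(find_nouns("the dining room table".split(), NOUNS))
```

With this ordering the three-word entry is found as one unit, and "room" and "table" are never looked at separately.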
Ambiguous words are collected as if they are nouns, but also left in the sentence. Thus, for a sentence like "They decided to table the motion," both a noun node and a verb node are built for "table", and only later, when the noun node finds itself without any connection to the rest of the graph, is it trimmed away and discarded. Likewise, "I put it on the table." would also build a noun node and a verb node for "table", but later, when the verb is unable to find arguments for its unsatisfied links, it is left an orphan and discarded, leaving only the noun node connected.
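The trimming of orphans can be sketched as a simple filter that keeps only nodes touched by at least one edge. The (word, kind) node layout and the "subject" and "object" edge labels are assumptions for illustration only.

```python
def prune_orphans(nodes, edges):
    """Keep only the nodes that appear at one end of some edge."""
    connected = set()
    for head, _label, dependent in edges:
        connected.add(head)
        connected.add(dependent)
    return {node for node in nodes if node in connected}


# "I put it on the table.": "table" is built both as a noun and as a verb.
nodes = {("I", "N"), ("put", "V"), ("it", "N"), ("table", "N"), ("table", "V")}
edges = [
    (("put", "V"), "subject", ("I", "N")),
    (("put", "V"), "object", ("it", "N")),
    (("put", "V"), "on", ("table", "N")),
]

kept = prune_orphans(nodes, edges)
print(("table", "V") in kept)  # False: the verb reading found no arguments
```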
The head noun of a noun phrase is the noun that connects to the innermost node of the graph. Thus, when "John's big, old book of poetry" is reached from inside the graph, the "book" node is the first one encountered, with the remaining words forming a sub-graph (or sub-tree) rooted on "book".
As for determining which noun is the head noun, that is the work of the prepositional rules that install the edges connecting the noun nodes to each other. The preposition "of" knows, for example, that its head noun is the head of its left argument:
[book] of [poetry]: head is "book"
The possessive suffix "'s" knows that its right argument is the head:
[[John] 's [[book] of [poetry]]]: head is "book"
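The two rules above can be written as a small lookup table plus a recursive function. The nested-tuple phrase layout and the names HEAD_SIDE and head_of are assumptions made for this sketch.

```python
# Which argument supplies the head, per connector (from the two rules above).
HEAD_SIDE = {"of": "left", "'s": "right"}


def head_of(phrase):
    """phrase is a bare noun (str) or a (left, connector, right) tuple."""
    if isinstance(phrase, str):
        return phrase
    left, connector, right = phrase
    chosen = left if HEAD_SIDE[connector] == "left" else right
    return head_of(chosen)  # the chosen side may itself be a phrase


print(head_of(("book", "of", "poetry")))                  # book
print(head_of(("John", "'s", ("book", "of", "poetry"))))  # book
```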
Parsing, Step by Step
We will begin with the sentence from page one:
He looked so silly in his neon pink running shorts and shapeless tee shirt printed with the name of his hospital's softball team that Jessica was tempted.
Before parsing begins, certain preprocessing steps take place. Possessive pronouns are replaced with the nominative of the pronoun and the possessive suffix: "your" -> "you 's", "our" -> "we 's", "his" -> "he 's", etc. This is so that possessive pronouns and regular nouns marked with the possessive suffix "'s" will all present a uniform pattern to the parser: "John 's book" having the same form as "you 's book". While conventional parsing accepts the existence of possessive pronouns, there can be no such thing in a graph parse, since there must be a node for the owner of a thing, and an edge expressing the owner relationship.
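A minimal sketch of this preprocessing step, using regular expressions. The mappings for "your", "our" and "his" come from the text; "my" and "their" are added by assumption, and "her" is left out because it is ambiguous between a possessive and an object pronoun. The sketch also makes no attempt to tell the possessive suffix from contractions such as "it's".

```python
import re

NOMINATIVE = {"my": "I", "your": "you", "his": "he", "our": "we", "their": "they"}

PRONOUN_RE = re.compile(r"\b(" + "|".join(NOMINATIVE) + r")\b", re.IGNORECASE)
SUFFIX_RE = re.compile(r"(\w)'s\b")


def preprocess(sentence):
    # Split the suffix off nouns: "hospital's" -> "hospital 's".
    sentence = SUFFIX_RE.sub(r"\1 's", sentence)
    # Replace each possessive pronoun with nominative + suffix: "his" -> "he 's".
    return PRONOUN_RE.sub(
        lambda m: NOMINATIVE[m.group(1).lower()] + " 's", sentence
    )


print(preprocess("the name of his hospital's softball team"))
# the name of he 's hospital 's softball team
```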
The next step is to mark the nouns (and pronouns), verbs, and adjectives. Since there are no ambiguous noun/verb words in this sentence, the additional step of marking those words both ways is not needed. The word "running" cannot be confused for a verb since the compound noun "running shorts" is in the dictionary as a whole unit, although a sentence could exist where that compound is not valid: "While he was running shorts in his electrical system started a fire." Such complications will be explored a little later.
The dictionary entries for "running shorts", "tee shirt" and "softball team" have nodes and edges in their entries, which we can copy into the graph we are going to build. The bare nouns and adjectives we can just place into the graph as bare nodes.
[he] (looked) so <silly> in [he] 's <neon pink> [running shorts] and <shapeless> [tee shirt] printed with [the name] of [he] 's [hospital] 's [softball team] that [Jessica] (was tempted).

Now we can replace the nouns and adjectives with tokens in the original sentence:
[NP1] (V1) so <ADJ1> in [NP1] 's <ADJ2> [NP2] and <ADJ3> [NP3] printed with [NP4] of [NP1] 's [NP5] 's [NP6] that [NP7] (V2)
Notice that we've assumed that all instances of "he" refer to the same person. In general this is probably not a safe assumption, but for now, and for illustrative purposes only, we'll go ahead with it.
Adding Edges
The next step is to insert edges between adjectives and the nouns they refer to. First, we will look for all occurrences of "<ADJ> [NP]" and condense each down to its [NP] while placing an appropriate edge in the graph. The edges are labeled with whatever label is found in the dictionary for the adjective. Thus, it is up to "pink" to know that it is a color, so that the parser itself need not be burdened with such details. (In the examples that follow, we will stick with the original words rather than the tokens so that it will be easier to understand what is happening. Just keep in mind that, internally, these are tokens.)
[he] (looked) so <silly> in [he] 's [shorts] and [shirt] printed with [name] of [he] 's [hospital] 's [team] that [Jessica] (was tempted)
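The adjective step can be sketched over a list of (kind, text) tokens. The kind tags "NP", "ADJ", "V" and "W", the ADJECTIVE_LABELS table, and the fallback label "attribute" are assumptions for illustration; in the scheme described above the label would come from the adjective's dictionary entry.

```python
ADJECTIVE_LABELS = {"neon pink": "color"}  # toy stand-in for the dictionary


def attach_adjectives(tokens, labels):
    """Condense each '<ADJ> [NP]' to '[NP]' and record a labeled edge."""
    tokens = list(tokens)
    edges = []
    i = 0
    while i < len(tokens) - 1:
        (kind, text), (next_kind, next_text) = tokens[i], tokens[i + 1]
        if kind == "ADJ" and next_kind == "NP":
            edges.append((next_text, labels.get(text, "attribute"), text))
            del tokens[i]
            i = max(i - 1, 0)  # step back so an earlier adjective can attach too
        else:
            i += 1
    return tokens, edges


sentence = [("NP", "he"), ("V", "looked"), ("W", "so"), ("ADJ", "silly"),
            ("W", "in"), ("NP", "he"), ("W", "'s"), ("ADJ", "neon pink"),
            ("NP", "shorts"), ("W", "and"), ("ADJ", "shapeless"), ("NP", "shirt")]

condensed, edges = attach_adjectives(sentence, ADJECTIVE_LABELS)
print(edges)
```

Note that "silly" stays in the sentence because it is not directly followed by a noun.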
The next step is to combine, left to right, groups of the form "[NP] 's [NP]", inserting the correct "belongs to" edges into the graph, and discarding the left [NP] from the sentence while retaining the right [NP] as head of the phrase.
[he] (looked) so <silly> in [shorts] and [shirt] printed with [name] of [team] that [Jessica] (was tempted)
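The possessive step can be sketched the same way, again over hypothetical (kind, text) tokens. Working left to right means that "[he] 's [hospital] 's [team]" is reduced in two passes over the same position.

```python
def attach_possessives(tokens):
    """Reduce each '[NP] 's [NP]' to the right-hand [NP], left to right."""
    tokens = list(tokens)
    edges = []
    i = 0
    while i + 2 < len(tokens):
        owner, marker, owned = tokens[i:i + 3]
        if owner[0] == "NP" and marker == ("W", "'s") and owned[0] == "NP":
            edges.append((owned[1], "belongs to", owner[1]))
            del tokens[i:i + 2]  # drop owner and suffix; owned is now at i
        else:
            i += 1
    return tokens, edges


phrase = [("NP", "name"), ("W", "of"), ("NP", "he"), ("W", "'s"),
          ("NP", "hospital"), ("W", "'s"), ("NP", "team")]

reduced, edges = attach_possessives(phrase)
print([text for _, text in reduced])  # ['name', 'of', 'team']
print(edges)
```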
The graph now looks like this:

Turning Prepositions into Edges
We now have three prepositions left, "in", "of", and "printed with". Note that for the purpose of this example we've treated "printed with" as a dictionary preposition, whereas the real application would apply some parsing rule to extract such compound prepositions.
Of the three, the preposition "of" has the highest precedence, so we will do that one first. The pattern "[NP] of [NP]" has the rule that the left member is retained as the head. We build an edge labeled "of" and remove the non-head noun and the preposition from the sentence.
[he] (looked) so <silly> in [shorts] and [shirt] printed with [name] that [Jessica] (was tempted)
Next in precedence order, the "printed with" preposition also retains the left member as the head, and after drawing the edge and removing the non-head noun, we have:
[he] (looked) so <silly> in [shorts] and [shirt] that [Jessica] (was tempted)
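The two reductions just shown can be sketched as one loop that handles prepositions in precedence order. The token tags and the PRECEDENCE list are assumptions for this sketch, and it covers only prepositions whose left member is the head; "in" is not handled here.

```python
PRECEDENCE = ["of", "printed with"]  # highest precedence first


def attach_prepositions(tokens, precedence):
    """Reduce '[NP] prep [NP]' to the left [NP], one preposition at a time."""
    tokens = list(tokens)
    edges = []
    for prep in precedence:
        i = 0
        while i + 2 < len(tokens):
            left, middle, right = tokens[i:i + 3]
            if left[0] == "NP" and middle == ("P", prep) and right[0] == "NP":
                edges.append((left[1], prep, right[1]))
                del tokens[i + 1:i + 3]  # drop preposition and non-head noun
            else:
                i += 1
    return tokens, edges


phrase = [("NP", "shorts"), ("W", "and"), ("NP", "shirt"),
          ("P", "printed with"), ("NP", "name"), ("P", "of"),
          ("NP", "team"), ("W", "that")]

reduced, edges = attach_prepositions(phrase, PRECEDENCE)
print([text for _, text in reduced])  # ['shorts', 'and', 'shirt', 'that']
print(edges)
```

Doing "of" first matters: if "printed with" ran first, "shirt" would still attach to "name", but the order would break down for phrases where the right-hand noun has not yet been reduced to its head.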
Finally, we have the preposition "in". Since there is the conjunction "and" between the two following [NP] nodes, we have the equivalent of the algebraic expression: "in(NP2 + NP3)". By applying the Distributive Law we get "in [NP2] and in [NP3]" meaning that both NP2 and NP3 are treated as if they stand alone with the other constituents of the sentence. There are, in effect, two sentences:
[he] (looked) so <silly> in [shorts] that [Jessica] (was tempted)
[he] (looked) so <silly> in [shirt] that [Jessica] (was tempted)
Edges are constructed for "<ADJ1> in [NP2]" and for "<ADJ1> in [NP3]". The head of this phrase is not the noun, but the adjective, leaving the two sentences:
[he] (looked) so <silly> that [Jessica] (was tempted)
[he] (looked) so <silly> that [Jessica] (was tempted)
Since the sentences are now identical, one of them is discarded.
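The distribution over "and", the adjective-headed "in" edges, and the discarding of the duplicate sentence can be sketched together. The function names and token tags are hypothetical, and the sketch handles only a single conjunction of two nouns.

```python
def distribute(tokens):
    """Split '[NP] and [NP]' into one token list per conjunct."""
    for i in range(1, len(tokens) - 1):
        if (tokens[i] == ("W", "and")
                and tokens[i - 1][0] == "NP" and tokens[i + 1][0] == "NP"):
            keep_left = tokens[:i] + tokens[i + 2:]
            keep_right = tokens[:i - 1] + tokens[i + 1:]
            return [keep_left, keep_right]
    return [tokens]


def attach_to_adjective(tokens, prep):
    """Reduce '<ADJ> prep [NP]' to '<ADJ>'; the adjective is the head."""
    for i in range(len(tokens) - 2):
        adj, middle, noun = tokens[i:i + 3]
        if adj[0] == "ADJ" and middle == ("P", prep) and noun[0] == "NP":
            return tokens[:i + 1] + tokens[i + 3:], [(adj[1], prep, noun[1])]
    return tokens, []


sentence = [("NP", "he"), ("V", "looked"), ("W", "so"), ("ADJ", "silly"),
            ("P", "in"), ("NP", "shorts"), ("W", "and"), ("NP", "shirt"),
            ("W", "that"), ("NP", "Jessica"), ("V", "was tempted")]

edges, remaining = [], []
for variant in distribute(sentence):
    reduced, new_edges = attach_to_adjective(variant, "in")
    edges.extend(new_edges)
    if reduced not in remaining:  # identical sentences are kept only once
        remaining.append(reduced)

print(edges)           # [('silly', 'in', 'shorts'), ('silly', 'in', 'shirt')]
print(len(remaining))  # 1
```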
The pattern "so <ADJ> that" is a special case of a split preposition. It is rearranged to read "<ADJ> so that". This leaves the arguments for V1 adjacent to the verb, and leaves the preposition "so that" in position to connect the two verbs.
[he] (looked) <silly> so that [Jessica] (was tempted)
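The rearrangement itself is a three-token rewrite. A minimal sketch, assuming the same hypothetical (kind, text) tokens and treating "so that" as a single preposition token once it is formed:

```python
def rearrange_so_that(tokens):
    """Rewrite 'so <ADJ> that' as '<ADJ> so that'."""
    tokens = list(tokens)
    for i in range(len(tokens) - 2):
        if (tokens[i] == ("W", "so") and tokens[i + 1][0] == "ADJ"
                and tokens[i + 2] == ("W", "that")):
            # The adjective moves next to the verb; the two halves of the
            # split preposition are joined into one token.
            tokens[i:i + 3] = [tokens[i + 1], ("P", "so that")]
            break
    return tokens


sentence = [("NP", "he"), ("V", "looked"), ("W", "so"), ("ADJ", "silly"),
            ("W", "that"), ("NP", "Jessica"), ("V", "was tempted")]

rearranged = rearrange_so_that(sentence)
print([text for _, text in rearranged])
# ['he', 'looked', 'silly', 'so that', 'Jessica', 'was tempted']
```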
The verb V1 finds its arguments, and they are connected on the graph and removed from the sentence. (The verb is always the head of its pattern.) Then the arguments of V2 are found, connected, and removed from the sentence, leaving:
(looked) so that (was tempted)
This final preposition connects V1 to V2, and leaves the head of the entire sentence, V1, as all that is left. This head is then connected to the head of the previous sentence, if any, in the dialog or discourse.
The final graph then looks like this:

Notice the grayed node "game" under "softball". Many dictionary entries might include links to nodes in the pre-built general world knowledge graph, so that the present sentence could be given a context within the world at large and related to other things that the bot knows about softball teams ("What position did Father play?"), hospitals ("Is Father a doctor?"), and famous people named "Jessica" ("Are we talking about Jessica Simpson?").
Ready access to such world knowledge allows the bot to exhibit curiosity, and make relevant tangential comments and observations.
Training the Bot
Because the bot is able to connect new information with its existing understanding of the world, conversations with the bot could be a way of training the bot with new information about the world. Current conversations could be considered as temporary data to be discarded at the end of a session, or as long term memory to be retained indefinitely. Perhaps a certain administrative password would enable long term memory, or perhaps recent conversations could be held in intermediate term memory until the information in those conversations is reviewed by a bot "coach" or administrator, who could then tell the bot which items to save in long term memory and which to discard.
Or the bot could retain all information it is given, but by comparing new information to existing trusted information, assign a probability of truth to a given bit of information. Thus the bot could be told, by any non-administrative person, that "The moon is made of green cheese." and come away from the experience not believing that "the moon is made of green cheese," but knowing that "I've been told that the moon is made of green cheese, but I know better". That seems like a much more life-like thing for the bot to know.
As long as only those people with the proper administrative passwords are taken to be completely trustworthy coaches, the bot could be trained with all sorts of information that it might classify as "possibly true", "probably true", "unlikely" or "impossible", depending on the extent to which these new facts conflict with existing trusted knowledge. In this way, the bot could acquire information continuously from conversations with strangers on the web, and not fall into the trap of believing everything it is told. It might even be able to develop a model of trustworthiness for each frequent correspondent.
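One very rough way to sketch this kind of classification is to compare a new claim against trusted facts stored as (subject, relation, object) triples. Everything here is an assumption for illustration: the triple layout, the function name, the rule that a relation has a single trusted value, and the choice of only three of the labels mentioned above. Deciding when a claim is outright "impossible" would need more world knowledge than this sketch has.

```python
def classify_claim(claim, trusted, told_by_admin=False):
    """Label a (subject, relation, object) claim against trusted facts."""
    subject, relation, obj = claim
    if told_by_admin:
        return "trusted"
    known = {o for (s, r, o) in trusted if s == subject and r == relation}
    if not known:
        return "possibly true"   # nothing trusted to compare against
    if obj in known:
        return "probably true"   # agrees with trusted knowledge
    return "unlikely"            # contradicts trusted knowledge


trusted = {("moon", "made of", "rock")}
claim = ("moon", "made of", "green cheese")

# The claim is remembered as something the bot was told, with its label,
# rather than being merged into the trusted facts.
hearsay = [(claim, classify_claim(claim, trusted))]
print(hearsay[0][1])  # unlikely
```

Keeping the label alongside the claim is what lets the bot say "I've been told that, but I know better" instead of either believing the claim or forgetting it.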