
Building a Context Network from a Narrative
An Approach to Chatbot Question Answering
Page Three: Reasoning by Analogy

by Gary J. Shannon

Started Nov. 7, 2008
Last Updated Nov 8, 2008
Page One: Overview
Page Two: Details of Parsing
Page Three: World Models and Reasoning by Analogy
Page Four: Exploring Some Complications
Page Five: Nuts and Bolts 1: Part of Speech Tagging
Page Six: Nuts and Bolts 2: From Tags to Nodes
Page Seven: Chunking and Knowledge Units

Prototypes vs Hard Nodes

If a node is a permanent part of the world model (i.e., it is in long-term memory), then we can call that node a hard node. Such nodes would include world model nouns like "Albert Einstein" and "San Francisco". There can only ever be one instance of each of those nodes. Nodes that might be instantiated several times within a given conversation, like "person", are built from prototype nodes. Prototype nodes have existing links into the world model, but they do not really exist as part of the world model. For example, the generic "person" prototype might link into world knowledge common to all humans. These links only matter when a new instance of that generic node is copied into short-term memory and given a specific identity, such as "Marsha".
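
The distinction between hard nodes and prototypes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name Node, the "is_a" label, and the instantiate helper are hypothetical and do not come from the original design.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # compare nodes by identity, not by content
class Node:
    kind: str
    name: str = ""
    links: dict = field(default_factory=dict)  # label -> list of Node

# Long-term memory: hard nodes exist exactly once.
humans = Node("concept", "human")
hard_nodes = {"Albert Einstein": Node("person", "Albert Einstein")}

# Prototypes carry links into the world model but are never used directly.
# The "is_a" label is an assumption made for this sketch.
prototypes = {"person": Node("person", links={"is_a": [humans]})}

short_term = {}  # name -> instance created during this conversation

def instantiate(kind, name):
    """Copy a prototype into short-term memory and give it an identity."""
    proto = prototypes[kind]
    copied = {label: list(targets) for label, targets in proto.links.items()}
    node = Node(proto.kind, name, copied)
    short_term[name] = node
    return node

marsha = instantiate("person", "Marsha")
john = instantiate("person", "John")
print(marsha.name, marsha.links["is_a"][0].name)  # Marsha human
```

Each instance gets its own copy of the link lists, so adding a link to Marsha never changes the prototype or any other person.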

Associated with each prototype are various "definitions". A definition is a pathway to a related node, which may or may not exist. For example, consider these statements:

1. David is John and Marsha's father.
2. Albert is David's father.
3. Who is Albert's granddaughter?

Three copies of the generic "person" node are copied into short-term memory, and then the names of those three people are inserted into their respective nodes. Each statement creates new arcs between nodes. For example, statement 1 creates a "father_is" arc from John to David, and another from Marsha to David. One important thing about these arcs is that they are two-way paths, and each such path may have several alternative names, depending on which node it is viewed from. For example, the "father_is" arc from John points, quite naturally, to John's father, David. The same arc, viewed from David's node, might be labeled "father_of", pointing to a person that David is the father of. (A dictionary of given names tells the bot whether a particular name is most likely that of a female or male person.)

Even when viewed from one particular side, an arc may have additional labels, such as "child_is" and "son_is", showing that the David to John relationship can be called "David is father of John", or "David's son is John", or "David's child is John", or "David's offspring is John". Some of these labels are dependent on gender ("son_is" vs "daughter_is"), and some are not ("child_is").
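
One possible way to model such a two-way arc is to store it once under every label it carries on each side. The sketch below is an assumption about representation, not the author's implementation. The names Person, add_link and tell_father are hypothetical, and "parent_is" is one of the alternative labels mentioned later on this page.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # identity comparison avoids recursing through cyclic links
class Person:
    name: str
    gender: str
    links: dict = field(default_factory=dict)  # label -> list of Person

def add_link(src, label, dst):
    """Add dst under label in src, creating the label on first use."""
    targets = src.links.setdefault(label, [])
    if dst not in targets:
        targets.append(dst)

def tell_father(child, father):
    """Record one father/child arc under its labels on both sides."""
    # Labels seen from the child's side.
    for label in ("father_is", "parent_is"):
        add_link(child, label, father)
    # Labels seen from the father's side; one of them depends on gender.
    gendered = "son_is" if child.gender == "male" else "daughter_is"
    for label in ("child_is", gendered):
        add_link(father, label, child)

david = Person("David", "male")
john = Person("John", "male")
marsha = Person("Marsha", "female")
tell_father(john, david)
tell_father(marsha, david)
print([p.name for p in david.links["child_is"]])  # ['John', 'Marsha']
```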

Links that cannot be filled from the information the bot has been given will not exist. Thus, if we have not been told anything about "John's uncle", then the John node will not have an "uncle_is" link. But if we are later told that David has a brother, then we will have enough information to decide that John does have an uncle, and who that uncle is. Consider these sentences:

1. David is John and Marsha's father.
2. Albert is David's father.
3. Simon is Albert's son.
4. Timmy and Susan are Simon's children.
5. Does John have any cousins?
6. Who is David's brother?
7. Whose grandfather is Albert?

As difficult as such questions might seem to the average bot, with link definitions they are trivial, provided that the bot has been given the appropriate definitions of such terms as "cousin" and "grandfather". We can now take a closer look at exactly what these definitions look like, and how they work.

How Prototypes Work

Keep the following chart in mind while reviewing the definitions below.

A definition of a relationship is a symbolic link consisting of the chain of link names that must be traversed to arrive at the node with the defined relationship. This may sound complicated, but when we consider a basic example, we can see just how obvious it is.

We say that "the father of my father is my grandfather". Simple. Defining "grandfather" symbolically, then, amounts to giving the chain of links from me to my grandfather: "grandfather_is = father_is -> father_is". In other words, from the "me" node follow my "father_is" link to the "my father" node, and then from his node follow his "father_is" link to his father's (my grandfather's) node. Thus, for example:

grandfather_is = father_is -> father_is
grandfather_is = mother_is -> father_is
grandmother_is = father_is -> mother_is
grandmother_is = mother_is -> mother_is
uncle_is = father_is -> brother_is
uncle_is = father_is -> sister_is -> husband_is
uncle_is = mother_is -> brother_is
uncle_is = mother_is -> sister_is -> husband_is
cousin_is = uncle_is -> child_is
cousin_is = aunt_is -> child_is
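
The definitions above translate almost directly into data. The following Python sketch stores each definition as a list of alternative chains and follows them recursively, so that "cousin_is" can be expressed in terms of "uncle_is". The dictionary layout and the resolve function are assumptions made for illustration, and the sample graph assumes that David's "brother_is" link has already been stored.

```python
# Each relationship maps to a list of alternative chains of link labels.
DEFINITIONS = {
    "grandfather_is": [["father_is", "father_is"], ["mother_is", "father_is"]],
    "grandmother_is": [["father_is", "mother_is"], ["mother_is", "mother_is"]],
    "uncle_is": [
        ["father_is", "brother_is"],
        ["father_is", "sister_is", "husband_is"],
        ["mother_is", "brother_is"],
        ["mother_is", "sister_is", "husband_is"],
    ],
    "cousin_is": [["uncle_is", "child_is"], ["aunt_is", "child_is"]],
}

def resolve(graph, start, label):
    """Return names reachable from start via label, stored or defined."""
    found = list(graph.get(start, {}).get(label, []))
    for chain in DEFINITIONS.get(label, []):
        frontier = [start]
        for step in chain:
            # A step may itself be a defined relationship, hence the recursion.
            frontier = [n for node in frontier for n in resolve(graph, node, step)]
        for name in frontier:
            if name not in found:
                found.append(name)
    return found

# Stored links only; everything else is derived on demand.
graph = {
    "John": {"father_is": ["David"]},
    "David": {"father_is": ["Albert"], "brother_is": ["Simon"]},
    "Simon": {"child_is": ["Timmy", "Susan"]},
}
print(resolve(graph, "John", "grandfather_is"))  # ['Albert']
print(resolve(graph, "John", "cousin_is"))       # ['Timmy', 'Susan']
```

This sketch does not guard against definitions that refer back to themselves; a real implementation would need some form of cycle detection.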

When a person node is instantiated, it receives only those links that the bot has been told about. Later, when a question is asked and the answer is not found in the network, the definition is used to attempt to trace out a path to the answer. If the bot is asked "Who are John's cousins?" and the definition leads to a null result because no aunt or uncle can be found, then the bot simply replies "I don't know if John has any cousins." But if the bot has been told about David's brother's children, then it has no difficulty locating and enumerating the cousins of John.

Let's look at those questions from above again, and see how they are processed, step by step.

1. David is John and Marsha's father.
2. Albert is David's father.
3. Simon is Albert's son.
4. Timmy and Susan are Simon's children.
5. Does John have any cousins?
6. Who is David's brother?
7. Whose grandfather is Albert?

1. David is John and Marsha's father.

	1a. Instantiate node of type "person" for "David".
	1b. Instantiate node of type "person" for "John".
	1c. Instantiate node of type "person" for "Marsha".
	1d. Populate these three nodes with known information from sentence and names table.

N1 person: David
	gender -> male
	father_of -> N2:John
	father_of -> N3:Marsha
N2 person: John
	gender -> male
	son_of -> N1:David
N3 person: Marsha
	gender -> female
	daughter_of -> N1:David
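
Steps 1a through 1d can be sketched as follows. This is a minimal illustration, not the author's parser: it assumes the sentence has already been reduced to a father and a list of children, and the NAMES table, get_node, tell_father and dump are hypothetical names.

```python
# Hypothetical names table: given name -> most likely gender.
NAMES = {"David": "male", "John": "male", "Marsha": "female"}

nodes = {}  # name -> {"id", "gender", "links": [(label, target name), ...]}

def get_node(name):
    """Return the node for name, instantiating it on first mention."""
    if name not in nodes:
        nodes[name] = {
            "id": f"N{len(nodes) + 1}",
            "gender": NAMES.get(name, "unknown"),
            "links": [],
        }
    return nodes[name]

def tell_father(father, children):
    """Store "<father> is <children>'s father" using the listing's labels."""
    dad = get_node(father)
    for child in children:
        kid = get_node(child)
        dad["links"].append(("father_of", child))
        # Fall back to a gender-neutral label when the name is not in the table.
        label = {"male": "son_of", "female": "daughter_of"}.get(kid["gender"], "child_of")
        kid["links"].append((label, father))

def dump():
    """Render short-term memory in the same layout as the listings here."""
    lines = []
    for name, node in nodes.items():
        lines.append(f"{node['id']} person: {name}")
        lines.append(f"\tgender -> {node['gender']}")
        for label, target in node["links"]:
            lines.append(f"\t{label} -> {nodes[target]['id']}:{target}")
    return "\n".join(lines)

tell_father("David", ["John", "Marsha"])
print(dump())
```

Run as written, this prints the same ten lines as the listing for sentence 1.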
      

2. Albert is David's father.

	2a. Instantiate node "Albert".
	2b. Populate with facts from sentence. (i.e., "father_is" link with additional 
	labels like "son_of", etc.)

N1 person: David
	gender -> male
	father_of -> N2:John
	father_of -> N3:Marsha
	son_of -> N4:Albert
N2 person: John
	gender -> male
	son_of -> N1:David
N3 person: Marsha
	gender -> female
	daughter_of -> N1:David
N4 person: Albert
	gender -> male
	father_of -> N1:David
      

3. Simon is Albert's son.

	3a. Instantiate node "Simon"
	3b. Populate from sentence.

N1 person: David
	gender -> male
	father_of -> N2:John
	father_of -> N3:Marsha
	son_of -> N4:Albert
N2 person: John
	gender -> male
	son_of -> N1:David
N3 person: Marsha
	gender -> female
	daughter_of -> N1:David
N4 person: Albert
	gender -> male
	father_of -> N1:David
	father_of -> N5:Simon
N5 person: Simon
	gender -> male
	son_of -> N4:Albert
      

4. Timmy and Susan are Simon's children.

	4a. Instantiate nodes for "Susan" and "Timmy"
	4b. Populate with data from sentence.

N1 person: David
	gender -> male
	father_of -> N2:John
	father_of -> N3:Marsha
	son_of -> N4:Albert
N2 person: John
	gender -> male
	son_of -> N1:David
N3 person: Marsha
	gender -> female
	daughter_of -> N1:David
N4 person: Albert
	gender -> male
	father_of -> N1:David
	father_of -> N5:Simon
N5 person: Simon
	gender -> male
	son_of -> N4:Albert
N6 person: Susan
	gender -> female
	daughter_of -> N5:Simon
N7 person: Timmy
	gender -> male
	son_of -> N5:Simon
      

5. Does John have any cousins?

	5a. Check short term memory for "cousin". Word not found.
	5b. Definition of cousin_of =  
			cousin_of = father_is -> brother_is -> child_is
			cousin_of = father_is -> sister_is -> child_is
			cousin_of = mother_is -> brother_is -> child_is
			cousin_of = mother_is -> sister_is -> child_is
	5c. Evaluate and apply both instances of: 
		cousin_of = father_is -> brother_is -> child_is to John
	5d. Answer question: "Yes, Timmy and Susan."
	
N1 person: David
	gender -> male
	father_of -> N2:John
	father_of -> N3:Marsha
	son_of -> N4:Albert
N2 person: John
	gender -> male
	son_of -> N1:David
	cousin_of -> N6:Susan
	cousin_of -> N7:Timmy
N3 person: Marsha
	gender -> female
	daughter_of -> N1:David
N4 person: Albert
	gender -> male
	father_of -> N1:David
	father_of -> N5:Simon
N5 person: Simon
	gender -> male
	son_of -> N4:Albert
N6 person: Susan
	gender -> female
	daughter_of -> N5:Simon
	cousin_of -> N2:John
N7 person: Timmy
	gender -> male
	son_of -> N5:Simon
	cousin_of -> N2:John
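
Step 5 can be sketched in Python as below. The four chains are the ones listed in 5b. The rest is assumption: the dictionary layout, the helper names, and the fact that David's "brother_is" link is taken as already stored. The reply strings follow the wording used on this page.

```python
COUSIN_CHAINS = [
    ["father_is", "brother_is", "child_is"],
    ["father_is", "sister_is", "child_is"],
    ["mother_is", "brother_is", "child_is"],
    ["mother_is", "sister_is", "child_is"],
]

def walk(links, start, chain):
    """Follow a chain of stored labels and return every node reached."""
    frontier = [start]
    for label in chain:
        frontier = [t for node in frontier for t in links.get(node, {}).get(label, [])]
    return frontier

def add_link(links, source, label, target):
    targets = links.setdefault(source, {}).setdefault(label, [])
    if target not in targets:
        targets.append(target)

def cousins_reply(links, name):
    """Answer "Does <name> have any cousins?" and record what was found."""
    found = list(links.get(name, {}).get("cousin_of", []))
    for chain in COUSIN_CHAINS:
        for target in walk(links, name, chain):
            if target not in found:
                found.append(target)
    if not found:
        return f"I don't know if {name} has any cousins."
    for target in found:
        # The new arc is two-way, so it is stored on both nodes.
        add_link(links, name, "cousin_of", target)
        add_link(links, target, "cousin_of", name)
    return "Yes, " + " and ".join(found) + "."

links = {
    "John": {"father_is": ["David"]},
    "David": {"brother_is": ["Simon"]},
    "Simon": {"child_is": ["Timmy", "Susan"]},
}
print(cousins_reply(links, "John"))    # Yes, Timmy and Susan.
print(cousins_reply(links, "Albert"))  # I don't know if Albert has any cousins.
```

Nothing is written to the graph when the search comes back empty, which keeps to the rule that links which cannot be filled do not exist.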
      

6. Who is David's brother?

	6a. David has no "brother_of" link.
	6b. Definition of brother_of =
			brother_of = father_is -> son_is
			brother_of = mother_is -> son_is
	6c. Given that nodes cannot point to themselves, evaluate and add to graph
		the one definition that applies:
			David father_is -> (Albert); Albert son_is -> Simon
	6d. Answer question: "Simon"
	6d. Answer question: "Simon"

N1 person: David
	gender -> male
	father_of -> N2:John
	father_of -> N3:Marsha
	son_of -> N4:Albert
	brother_of -> N5:Simon
N2 person: John
	gender -> male
	son_of -> N1:David
	cousin_of -> N6:Susan
	cousin_of -> N7:Timmy
N3 person: Marsha
	gender -> female
	daughter_of -> N1:David
N4 person: Albert
	gender -> male
	father_of -> N1:David
	father_of -> N5:Simon
N5 person: Simon
	gender -> male
	son_of -> N4:Albert
	brother_of -> N1:David
N6 person: Susan
	gender -> female
	daughter_of -> N5:Simon
	cousin_of -> N2:John
N7 person: Timmy
	gender -> male
	son_of -> N5:Simon
	cousin_of -> N2:John
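
The rule in step 6c, that a node cannot point to itself, matters because David is himself one of Albert's sons. A minimal Python sketch of the brother lookup with that exclusion follows. The flat people table and the function names are assumptions made for this example.

```python
# Only the facts the bot was told: gender from the names table, plus fathers.
people = {
    "David": {"gender": "male", "father_is": "Albert"},
    "Simon": {"gender": "male", "father_is": "Albert"},
    "Albert": {"gender": "male"},
    "Susan": {"gender": "female", "father_is": "Simon"},
    "Timmy": {"gender": "male", "father_is": "Simon"},
}

def children_of(parent):
    """Everyone whose father or mother is the given parent."""
    return [
        name for name, facts in people.items()
        if facts.get("father_is") == parent or facts.get("mother_is") == parent
    ]

def brothers_of(name):
    """Go up to each known parent, then down to that parent's sons."""
    found = []
    for parent_label in ("father_is", "mother_is"):
        parent = people[name].get(parent_label)
        if parent is None:
            continue  # this definition does not apply
        for child in children_of(parent):
            # Skip the starting node itself: nobody is their own brother.
            if child != name and people[child]["gender"] == "male" and child not in found:
                found.append(child)
    return found

print(brothers_of("David"))  # ['Simon']
print(brothers_of("Susan"))  # ['Timmy']
```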
      

7. Whose grandfather is Albert? 

	7a. No node has a "grandfather_is" link.
	7b. Definition of grandfather_of =
			grandfather_of = father_of -> father_of
			grandfather_of = father_of -> mother_of
	7c. Evaluate and apply definition: 
		grandfather_of = father_of -> father_of to each node that works
	7d. Enumerate list of links in Albert for 
		"grandfather_of": "John, Marsha, Susan, and Timmy"

N1 person: David
	gender -> male
	father_of -> N2:John
	father_of -> N3:Marsha
	son_of -> N4:Albert
	brother_of -> N5:Simon
N2 person: John
	gender -> male
	son_of -> N1:David
	brother_of -> N3:Marsha
	cousin_of -> N6:Susan
	cousin_of -> N7:Timmy
	grandson_of -> N4:Albert
N3 person: Marsha
	gender -> female
	daughter_of -> N1:David
	sister_of -> N2:John
	granddaughter_of -> N4:Albert
N4 person: Albert
	gender -> male
	father_of -> N1:David
	father_of -> N5:Simon
	grandfather_of -> N2:John
	grandfather_of -> N3:Marsha
	grandfather_of -> N6:Susan
	grandfather_of -> N7:Timmy
N5 person: Simon
	gender -> male
	son_of -> N4:Albert
	brother_of -> N1:David
N6 person: Susan
	gender -> female
	daughter_of -> N5:Simon
	sister_of -> N7:Timmy
	cousin_of -> N2:John
	granddaughter_of -> N4:Albert
N7 person: Timmy
	gender -> male
	son_of -> N5:Simon
	brother_of -> N6:Susan
	cousin_of -> N2:John
	grandson_of -> N4:Albert
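
Question 7 runs the definition in the downward direction, starting from Albert. The sketch below assumes that "grandfather_of" is read as a "father_of" step followed by either a "father_of" or a "mother_of" step. The two lookup tables and the helper names are hypothetical.

```python
# Stored downward links after sentences 1 to 4.
father_of = {
    "Albert": ["David", "Simon"],
    "David": ["John", "Marsha"],
    "Simon": ["Susan", "Timmy"],
}
mother_of = {}  # the bot has not been told about any mothers

def grandfather_of(name):
    """Follow father_of, then father_of or mother_of, from the given node."""
    result = []
    for child in father_of.get(name, []):
        for table in (father_of, mother_of):
            for grandchild in table.get(child, []):
                if grandchild not in result:
                    result.append(grandchild)
    return result

def enumerate_names(names):
    """Join names the way the answer in step 7d is phrased."""
    if len(names) <= 2:
        return " and ".join(names)
    return ", ".join(names[:-1]) + ", and " + names[-1]

print(enumerate_names(grandfather_of("Albert")))  # John, Marsha, Susan, and Timmy
```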
      

Notice that links are not explored and completed until they are needed. If they are never needed, then there is no reason to waste precious "thinking" time on filling them out. The graph now looks like this: (Again, only one of several alternative arc labels is used in the image. Keep in mind that the arc labeled "son_of" is also labeled internally as "child_of", "father_is", "parent_is", and so on.)

We cannot call this the "final" graph, because as more questions are asked, more links may be explored and completed. We have not, for example, asked who Susan's uncle is. If that question were asked, then a new "uncle_is" link would be added in Susan to point to David, and a new "uncle_of" link added in David to point to Susan.

The basic world model always includes nodes for the bot itself, and for its virtual "relatives", so that when the bot is asked, for example, to define "grandfather", it can do so by reference to its own node and say something like: "My grandfathers are my parents' fathers." This is done by invoking the link names themselves as knowledge. Using a simpler approach, the bot might be taught to identify itself as, say, an eight-year-old child, and might answer the question more literally by saying something like "My grandpas are grandpa Todd, and grandpa Bill. Grandpa Bill lives in Alaska."

Implementation Notes

Internally, links said to be "in" a node are actually a linked list of named pointers, pointed to by the node. In this way the list of links in a node can be expanded without limit as new links are established for a given node. No space needs to be reserved in advance for links that may or may not become necessary, and no preconceived notions exist as to which links a given node should have. It is all open ended so that as the bot is told of new relationships, new types of links can be created on the fly.
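
A linked list of named pointers might look like the following in Python. This is a sketch of the general technique only. The class and attribute names are hypothetical, and prepending new links is a choice made here for simplicity, not something the description above specifies.

```python
class Link:
    """One named pointer in a node's singly linked list of links."""
    def __init__(self, label, target, next_link=None):
        self.label = label
        self.target = target
        self.next = next_link

class Node:
    def __init__(self, name):
        self.name = name
        self.first_link = None  # no space reserved until a link exists

    def add_link(self, label, target):
        """Prepend a link; any label is accepted, including new ones."""
        self.first_link = Link(label, target, self.first_link)

    def targets(self, label):
        """Walk the list and collect every target stored under label."""
        found = []
        link = self.first_link
        while link is not None:
            if link.label == label:
                found.append(link.target)
            link = link.next
        return found

david, john, marsha = Node("David"), Node("John"), Node("Marsha")
david.add_link("father_of", john)
david.add_link("father_of", marsha)
print([n.name for n in david.targets("father_of")])  # ['Marsha', 'John']
```

Because links are prepended, the most recently added target comes back first.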

Thus if the bot has never dealt with, nor been "programmed" to recognize the phrase "second cousin twice removed", it can be told, conversationally, how that relationship is defined, and it will then be capable of creating such a link in the future.
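
As a rough illustration of learning a definition at run time, the sketch below accepts a definition written in the arrow notation used on this page and adds it to a table that starts out empty. How the bot would extract such a definition from ordinary conversation is not covered here, and the learn and lookup functions are hypothetical.

```python
definitions = {}  # relationship name -> list of chains of link labels

def learn(statement):
    """Add a definition given as "name = label -> label -> ..."."""
    name, sep, chain = statement.partition("=")
    name = name.strip()
    steps = [step.strip() for step in chain.split("->")]
    if not sep or not name or not all(steps):
        raise ValueError(f"cannot read definition: {statement!r}")
    definitions.setdefault(name, []).append(steps)

def lookup(links, start, label):
    """Return stored targets plus any reached through learned definitions."""
    found = list(links.get(start, {}).get(label, []))
    for chain in definitions.get(label, []):
        frontier = [start]
        for step in chain:
            frontier = [t for node in frontier for t in lookup(links, node, step)]
        for target in frontier:
            if target not in found:
                found.append(target)
    return found

links = {"John": {"father_is": ["David"]}, "David": {"father_is": ["Albert"]}}
before = lookup(links, "John", "grandfather_is")  # nothing known yet
learn("grandfather_is = father_is -> father_is")
after = lookup(links, "John", "grandfather_is")
print(before, after)  # [] ['Albert']
```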

Note: Programmers familiar with the implementation of linked lists will realize that what we call "named arcs" are, in reality, nodes in their own right. But they remain as hidden nodes, and are not to be confused with "node" as we've defined it above.



